Sunday, May 27, 2007

using dd to create a 4GB file

I'm in the process of learning how to use dar (disk archive) to archive my videos in files that will span multiple DVDs. First, though, I want to find out exactly how large a file I can have dar create in order to maximize the utilized space on the DVD. Since I will use growisofs to copy the file to the DVD, I will tell dd to create a large file of around 4.7GB and see if growisofs can write that file.

Here is the syntax of the dd command to create the file. I will fill the file with null bytes read from /dev/zero.
dd if=/dev/zero of=zerofile.tst bs=1k count=4700000

if = input file
of = output file
bs = block size (1k = 1024 bytes)
count = number of blocks to copy (with bs=1k, this is the file size in KB)
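
With bs=1k (1024 bytes) and count=4700000, the file should come out to 4700000 x 1024 = 4,812,800,000 bytes. A quick sanity check on the size of the file created above:
ls -l zerofile.tst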


Whoops... the file is too big for growisofs!
[root@computer ~]# growisofs -Z /dev/hda -R -J /mnt/videos/zerofile.tst
Executing 'mkisofs -R -J /mnt/videos/zerofile.tst builtin_dd of=/dev/hda obs=32k seek=0'
INFO: UTF-8 character encoding detected by locale settings.
Assuming UTF-8 encoded filenames on source filesystem,
use -input-charset to override.
mkisofs: Value too large for defined data type. File /mnt/videos/zerofile.tst is too large - ignoring
Total translation table size: 0
Total rockridge attributes bytes: 169
Total directory bytes: 0
Path table size(bytes): 10
Max brk space used 0
181 extents written (0 MB)
/dev/hda: "Current Write Speed" is 8.2x1352KBps.
builtin_dd: 192*2KB out @ average 0.0x1352KBps
/dev/hda: flushing cache
/dev/hda: closing track
/dev/hda: closing session


I'd better do some research. According to these two articles:
Cheetah Burner FAQ
http://en.wikipedia.org/wiki/ISO_9660

the maximum file size for a data DVD with ISO9660 and Joliet extensions is about 4.2GB (file sizes are stored in a 32-bit field, so the hard ceiling is 4,294,967,296 bytes). dar creates files in chunks of megabytes (1024 x 1024 x the value you specify). So if I use a value of 4000 in dar, dar should create archive files that are 4,194,304,000 bytes. This is right under the limit and well within the usable space on a DVD.
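
A quick check on that math using shell arithmetic:
echo $((4000 * 1024 * 1024))
4194304000

That stays under the 4,294,967,296-byte ceiling mentioned above.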

Here's the second test I performed using dd:
[root@computer ~]# dd if=/dev/zero of=/mnt/videos/zerofile.tst bs=1k count=4100000
4100000+0 records in
4100000+0 records out
4198400000 bytes (4.3 GB) copied, 50.7164 seconds, 84.8 MB/s


Success! dar was able to correctly archive this 4,198,400,000 byte file into slices on disk:
dar -m 256 -y -s 4000M -D -R /mnt/videos/zerofile.tst -c `date -I`_data

--------------------------------------------
1 inode(s) saved
with 0 hard link(s) recorded
0 inode(s) changed at the moment of the backup
0 inode(s) not saved (no file change)
0 inode(s) failed to save (filesystem error)
0 files(s) ignored (excluded by filters)
0 files(s) recorded as deleted from reference backup
--------------------------------------------
Total number of file considered: 1
--------------------------------------------
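
The dar command above should leave one slice file per 4000MB chunk on disk. To see them (using the same date-based basename as the command above):
ls -lh `date -I`_data.*.dar

dar names each slice basename.N.dar, where N is the slice number.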


I did try a third test using a value of 4100 in dar, but this yielded a file of 4,299,161,600 bytes, which exceeds the current 32-bit limitation of ISO9660. From what I've read, this part of the specification is under review and the limit may be raised in the near future.
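
The same shell arithmetic shows why 4100 fails:
echo $((4100 * 1024 * 1024))
4299161600
echo $((2**32))
4294967296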

Side Note:
Instead of null bytes, dd can fill a file with random data from /dev/random, or with somewhat less random data from /dev/urandom. According to the Wikipedia entry on dd, the difference between the two is that urandom is "less cryptographically secure" but much faster, because it doesn't block while the kernel gathers fresh entropy:
http://en.wikipedia.org/wiki/Dd_(Unix)
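
For example, this fills a 100MB file with pseudo-random data (the filename is arbitrary):
dd if=/dev/urandom of=randomfile.tst bs=1M count=100

Doing the same with if=/dev/random could take a very long time, since it stalls whenever the entropy pool runs dry.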

dd has been around since the mid-70s and is used for copying and converting data at a very low level. Be careful using this program: its nickname is "destroy data", and you can easily wipe out the data on your hard drive with the wrong syntax.
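
For example, pointing of= at a device node instead of a file turns dd into a disk wiper. Do NOT run the following; it would overwrite the partition table and everything after it on the first IDE disk:
dd if=/dev/zero of=/dev/hda bs=1k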

Here's the man page on dd:
http://www.die.net/doc/linux/man/man1/dd.1.html

UPDATE: Here is how I've decided to backup my HDV material using a combination of dar (disk archive) and growisofs:
http://crazedmuleproductions.blogspot.com/2007/05/problem-of-archivingand-solution.html

UPDATE: Here is another way to create a large file using dd. Try making the block size larger and reducing the number (count) of blocks for the test file! The following command creates a file of zeros roughly one gigabyte in length using a block size of 1000 megabytes:
dd if=/dev/zero of=zerofile.tst bs=1000M count=1
1+0 records in
1+0 records out
1048576000 bytes (1.0 GB) copied, 7.812 s, 134 MB/s
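
A related trick: if you only need a file that reports a large size, dd can create a sparse file almost instantly by seeking past the data instead of writing it. No real data hits the disk, though, so this would not be a fair test for an actual burn:
dd if=/dev/zero of=sparsefile.tst bs=1 count=0 seek=1G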
