There’s but one way to determine the optimal block size, and that’s a benchmark. I’ve just made a quick benchmark. The test machine is a PC running Debian GNU/Linux, with kernel 2.6.32 and coreutils 8.5. Both filesystems involved are ext3 on LVM volumes on a hard disk partition. The source file is 2GB (2040000kB to be precise). Caching and buffering are enabled. Before each run, I emptied the cache with `sync; echo 1 >|/proc/sys/vm/drop_caches`. The run times do not include a final `sync` to flush the buffers; the final `sync` takes on the order of 1 second.

The `same` runs were copies on the same filesystem; the `diff` runs were copies to a filesystem on a different hard disk. For consistency, the times reported are the wall-clock times obtained with the `time` utility, in seconds. I only ran each command once, so I don’t know how much variance there is in the timing.
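The procedure above can be sketched roughly as follows. This is only an illustrative sketch: the `/tmp` paths and the tiny 8MB source file are placeholders I made up, not the original 2GB setup, and the cache drop only happens when run as root.

```shell
# Rough sketch of the benchmark loop described above.
# Paths and the 8 MB source file are placeholders, not the original setup.
src=/tmp/ddbench-src
dst=/tmp/ddbench-dst
dd if=/dev/zero of="$src" bs=1M count=8 2>/dev/null   # small stand-in source

for bs in 512 4k 1M 64M; do
    # Drop the page cache before each run (needs root; skipped otherwise).
    if [ "$(id -u)" -eq 0 ]; then
        sync
        echo 1 > /proc/sys/vm/drop_caches
    fi
    # Portable wall-clock timing in whole seconds.
    t0=$(date +%s)
    dd if="$src" of="$dst" bs="$bs" 2>/dev/null
    t1=$(date +%s)
    echo "bs=$bs: $((t1 - t0))s"
done
sync   # flush the last run's buffers
```

On a file this small everything stays in cache, so the printed times are near zero; the shape of the loop, not the numbers, is the point.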
|            | same, t (s) | diff, t (s) |
|------------|-------------|-------------|
| dd bs=64M  | 71.1        | 51.3        |
| dd bs=1M   | 73.9        | 41.8        |
| dd bs=4k   | 79.6        | 48.5        |
| dd bs=512  | 85.3        | 48.9        |
| cat        | 76.2        | 41.7        |
| cp         | 77.8        | 45.3        |
Conclusion: A large block size (several megabytes) helps, but not dramatically (a lot less than I expected for same-drive copies). And `cat` and `cp` don’t perform so badly. With these numbers, I don’t find `dd` worth bothering with. Go with `cat`!

---

`dd` dates from back when it was needed to translate old IBM mainframe tapes, and the block size had to match the one used to write the tape, or data blocks would be skipped or truncated. (9-track tapes were finicky. Be glad they’re long dead.) These days, the block size should be a multiple of the device sector size (usually 4KB, though on very recent disks it may be much larger, and on very small thumb drives it may be smaller; 4KB is a reasonable middle ground regardless), and the larger the block, the better the performance. I often use 1MB block sizes with hard drives. (We have a lot more memory to throw around these days too.)
---

I agree with geekosaur’s answer that the size should be a multiple of the device’s block size, which is often 4K.

If you want to find the block size of a file, `stat -c "%o" filename` is probably the easiest option.
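As a concrete illustration, the reported size can be fed straight back into `dd`. The file names here are made up for the demo:

```shell
# Create a demo file, ask the filesystem for its preferred I/O size
# (%o), and use that as dd's block size. Paths are placeholders.
f=/tmp/blocksize-demo
: > "$f"
bs=$(stat -c "%o" "$f")        # preferred I/O block size, e.g. 4096
echo "preferred block size: $bs"
dd if=/dev/zero of=/tmp/blocksize-out bs="$bs" count=4 2>/dev/null
```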
But say you do `dd bs=4K`; that means it does `read(4096); write(4096); read(4096); write(4096)…`
Each system call involves a context switch, which involves some overhead, and depending on the I/O scheduler, reads with interspersed writes could cause the disk to do lots of seeks. (Probably not a major issue with the Linux scheduler, but nonetheless something to think about.)
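You can see the syscall count scale inversely with the block size from `dd`’s own record counts, since it reports one record per `read()`/`write()` pair. The 1MiB copy size and `/tmp` path here are arbitrary choices for the demo:

```shell
# The same 1 MiB copied with two block sizes; dd reports one record
# per read()/write() pair, so bigger blocks mean fewer syscalls.
dd if=/dev/zero of=/tmp/records-demo bs=4k count=256 2>&1 | grep 'records'
dd if=/dev/zero of=/tmp/records-demo bs=64k count=16 2>&1 | grep 'records'
```

The first command reports 256 records in and out, the second only 16, for the exact same amount of data.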
So if you do `bs=8K`, you allow the disk to read two blocks at a time, which are probably close together on the disk, before seeking somewhere else to do the write (or to service I/O for another process).

By that logic, `bs=16K` is even better, and so on.
So what I’d like to know is if there is an upper limit where performance starts to get worse, or if it’s only bounded by memory.
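One way to probe for such a limit yourself is a crude sweep. This is only a sketch: the sizes and paths are placeholders, and a meaningful measurement would need a source file much larger than RAM, caches dropped between runs, and several repetitions per block size.

```shell
# Crude block-size sweep: copy the same file with increasing block
# sizes and print the elapsed wall-clock seconds for each.
src=/tmp/sweep-src
dd if=/dev/zero of="$src" bs=1M count=16 2>/dev/null   # placeholder source

for bs in 4k 16k 64k 256k 1M 4M; do
    t0=$(date +%s)
    dd if="$src" of=/tmp/sweep-dst bs="$bs" 2>/dev/null
    t1=$(date +%s)
    printf '%-6s %ds\n' "$bs" "$((t1 - t0))"
done
```

If there is a point where ever-larger blocks stop helping (or start hurting), it should show up as a flattening or reversal in the printed times.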