Saving time and bandwidth by creating a DVD image from CD ISO files

A little while back I did something stupid and deleted a lot of files from a hard drive array, including a few hundred Linux and BSD CD and DVD ISO images. I had a large number of these archived to disc and have been able to re-download most of the other stuff I need easily enough, but there’s always the challenge to download the replacements as quickly as possible and using as little bandwidth as possible (bandwidth is still pretty expensive here in New Zealand).

In the case of a distribution like CentOS, each release comes on a number of CDs or alternatively a single DVD. The contents of each are fairly similar, and there are a number of mirrors around the world which support downloading using rsync, as well as the more standard ftp and http methods.

Update November 5th 2008: Thanks to this forum thread which mentions this post, I have found out that you can download the mkdvdiso.sh script at http://isoredirect.centos.org/centos/build/ to create a DVD from the CD images. I have not tried this myself, but presumably the DVD ISO image it generates matches the published md5 checksum. I will try this out in the next few days and update again.

Update ends; original post resumes…

It suddenly struck me one day that I should be able to simply concatenate the CD images into one great big file, and then rsync that file against an rsync server. The file will contain a fair amount of data from the start and end of each CD image that is not on the DVD, but assuming the contents are in a relatively similar order on the discs, the amount that has to be downloaded to turn it into the correct DVD image should be less than downloading the whole thing.

I already had the CD ISO images but not the DVD one. If it didn’t work, the worst case would be downloading the entire thing. I expected to save maybe a few hundred megabytes, and was impressed to discover that, at least with the x86_64 version of CentOS 5.0, I only had to download 131MB to make the image valid.

This is how I did it…

First of all, here’s the list of CD ISO images:

$ ls -l
-rw-r--r--  1 1000 users 655493120 Jan 18 12:27 CentOS-5.0-x86_64-bin-1of7.iso
-rw-r--r--  1 root root  665100288 Apr 11  2007 CentOS-5.0-x86_64-bin-2of7.iso
-rw-r--r--  1 root root  666744832 Apr 11  2007 CentOS-5.0-x86_64-bin-3of7.iso
-rw-r--r--  1 root root  617988096 Apr 11  2007 CentOS-5.0-x86_64-bin-4of7.iso
-rw-r--r--  1 root root  645744640 Apr 11  2007 CentOS-5.0-x86_64-bin-5of7.iso
-rw-r--r--  1 root root  664485888 Apr 11  2007 CentOS-5.0-x86_64-bin-6of7.iso
-rw-r--r--  1 root root  374009856 Apr 11  2007 CentOS-5.0-x86_64-bin-7of7.iso

Now I concatenate them together into a file named after the DVD version:

$ cat CentOS-5.0-x86_64-bin-?of7.iso > CentOS-5.0-x86_64-bin-DVD.iso
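
If you do this for every release, the concatenation step could be wrapped in a small function. This is only a minimal sketch: concat_isos is a name made up for illustration, and it assumes the CD images sort into disc order when expanded by a shell glob, as the file names listed above do.

```shell
# Concatenate CD images into a seed file for rsync to correct.
# Usage: concat_isos OUTPUT INPUT...
concat_isos() {
    out=$1
    shift
    # Refuse to clobber an existing file (for example a finished DVD image).
    if [ -e "$out" ]; then
        echo "refusing to overwrite $out" >&2
        return 1
    fi
    # cat writes the inputs in the order given; the shell expands a glob
    # such as *-?of7.iso in sorted order, so disc 1 comes first.
    cat "$@" > "$out"
}

# Example, using the file names from the listing above:
# concat_isos CentOS-5.0-x86_64-bin-DVD.iso CentOS-5.0-x86_64-bin-?of7.iso
```

The overwrite check is there because a second run would otherwise replace an already corrected DVD image with the raw concatenation.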

Then rsync against a CentOS rsync mirror which also allows direct DVD downloads:

$ rsync -az --progress --stats rsync://ftp.jaist.ac.jp/pub/Linux/CentOS/5.0/isos-dvd/CentOS-5.0-x86_64-bin-DVD.iso .

And here’s the output from the above command:

receiving file list ...
 1 file to consider
 CentOS-5.0-x86_64-bin-DVD.iso
   4287268864 100%    3.40MB/s    0:20:01  (1, 100.0% of 1)
 
 Number of files: 1
 Number of files transferred: 1
 Total file size: 4287268864 bytes
 Total transferred file size: 4287268864 bytes
 Literal data: 137032352 bytes
 Matched data: 4150236512 bytes
 File list size: 94
 Total bytes sent: 524156
 Total bytes received: 137091839
 
 sent 524156 bytes  received 137091839 bytes  91348.15 bytes/sec
 total size is 4287268864  speedup is 31.15

So it took only 20 minutes, and only 131MB of actual data had to be downloaded to turn the concatenated file into the valid DVD image. The “Literal data” line in the output above shows the actual data transferred.
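
To put the numbers from the rsync statistics in perspective, the literal data can be expressed in megabytes and as a share of the whole image. The helper below is just a sketch for that arithmetic (rsync_savings is a made-up name); the two figures passed to it are copied from the output above.

```shell
# Summarise an rsync --stats result.
# Usage: rsync_savings LITERAL_BYTES TOTAL_BYTES
rsync_savings() {
    awk -v literal="$1" -v total="$2" 'BEGIN {
        printf "%.1f MiB literal, %.1f%% of total\n",
               literal / 1048576, literal * 100 / total
    }'
}

rsync_savings 137032352 4287268864
# prints: 130.7 MiB literal, 3.2% of total
```

The literal and matched figures in the output also add up to the total file size (137032352 + 4150236512 = 4287268864), so roughly 97% of the DVD image was already present in the concatenated CDs.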

Next I checked the image against the published md5sum, just to be sure, first by creating the md5 checksum file:

$ echo "246f5740f70abd020048d87becf8af24  CentOS-5.0-x86_64-bin-DVD.iso" > CentOS-5.0-x86_64-bin-DVD.iso.md5
 

and then running the checksum:

$ md5sum -c CentOS-5.0-x86_64-bin-DVD.iso.md5
 CentOS-5.0-x86_64-bin-DVD.iso: OK

Excellent, it’s perfectly valid. The great thing about using this method is that when a new release comes out, I only need to get the CDs and can then make a DVD image from them, finally rsyncing the DVD against an rsync server to fix the image.
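
The checksum step can also be done without writing the .md5 file first. This is a minimal sketch under the assumption that the md5sum in use accepts - as the checksum list and reads it from standard input, which the GNU coreutils version does; verify_md5 is a hypothetical helper name.

```shell
# Check a file against a published MD5 value without a separate .md5 file.
# Usage: verify_md5 EXPECTED_HEX FILE
verify_md5() {
    # md5sum -c expects lines of the form "HASH  FILENAME";
    # "-" tells it to read those lines from standard input.
    printf '%s  %s\n' "$1" "$2" | md5sum -c -
}

# Example, using the checksum quoted above:
# verify_md5 246f5740f70abd020048d87becf8af24 CentOS-5.0-x86_64-bin-DVD.iso
```

The function returns md5sum’s exit status, so it can be chained with && in a script.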

md5sum: only one argument may be specified when using --check

I burn CDs and DVDs for my Linux CD Mall website on a CentOS 4 machine and use md5sum checksums to ensure that the ISO images have downloaded correctly from the http or ftp server or via bittorrent. I store each ISO file’s md5sum in a separate .md5 file and use the md5sum command’s -c flag to check that the contents are valid, like so:

md5sum -c SimplyMEPIS-CD_7.0-rel_32.iso.md5
 

If the file checksums correctly, then the output will look like this:

SimplyMEPIS-CD_7.0-rel_32.iso: OK

In order to check multiple files at once, it would be nice to do this:

md5sum -c *.md5

but the md5sum command in CentOS 4 complains with this error message:

md5sum: only one argument may be specified when using --check
 Try `md5sum --help' for more information.
 

This method of checking multiple files at once does work with more recent versions of the md5sum command, but unfortunately not with the version shipped with CentOS 4. However, with a little Bash magic it’s still possible to issue just one command (instead of manually entering them one after the other as each checksum completes) like so:

for filename in *.md5; do md5sum -c "$filename"; done
 

Using the Simply Mepis example above, where there is a 32 bit and 64 bit version of the ISO image, the output from the above command would look like this:

SimplyMEPIS-CD_7.0-rel_32.iso: OK
 SimplyMEPIS-CD_7.0-rel_64.iso: OK
 

So not quite as easy as “md5sum -c *.md5”, but easier than having to checksum each one of them one by one.
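
With a lot of images it is easy to miss a single FAILED line scrolling past, so the loop could also count failures and report them at the end. This is a rough sketch rather than what I actually run; check_all_md5 is a made-up name, and it still calls md5sum once per .md5 file so it should work with the older md5sum too.

```shell
# Verify every .md5 file in a directory, one md5sum call per file.
# Usage: check_all_md5 [DIR]   (defaults to the current directory)
check_all_md5() (
    # The function body is a subshell, so this cd does not affect the caller.
    cd "${1:-.}" || exit 1
    failed=0
    for md5file in *.md5; do
        # With no matches the glob stays literal, so skip it.
        [ -e "$md5file" ] || continue
        md5sum -c "$md5file" || failed=$((failed + 1))
    done
    echo "$failed checksum file(s) failed"
    [ "$failed" -eq 0 ]
)
```

The exit status is zero only when every file checks out, which makes it usable in a script before burning anything.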

Mount reiserfs partitions on CentOS 4

I needed to copy some files from an old hard drive to a machine I have running CentOS 4, but the partition on the hard drive I needed to access was formatted with the reiserfs file system. The Linux kernel in CentOS 4 does not include support for reiserfs, so you need to install a new kernel from the CentOS Plus repository. By doing this, the kernel is no longer the same as the one provided by Red Hat Enterprise Linux, but you will be able to mount reiserfs volumes.

If you try to mount a reiserfs filesystem but do not have support for reiserfs in the Linux kernel, you’ll get an error message like this:

mount: fs type reiserfs not supported by kernel
 

It’s possible to see what filesystems are supported by running this command:

cat /proc/filesystems
 

On my fairly default CentOS 4 install, the output of the above command was this:

nodev   sysfs
 nodev   rootfs
 nodev   bdev
 nodev   proc
 nodev   sockfs
 nodev   binfmt_misc
 nodev   usbfs
 nodev   usbdevfs
 nodev   futexfs
 nodev   tmpfs
 nodev   pipefs
 nodev   eventpollfs
 nodev   devpts
         ext2
 nodev   ramfs
 nodev   hugetlbfs
         iso9660
 nodev   relayfs
 nodev   mqueue
         ext3
 nodev   rpc_pipefs
 nodev   autofs
 nodev   nfs
 nodev   nfs4
 

So clearly reiserfs was not available for mounting a filesystem. To enable it, you need to edit the /etc/yum.repos.d/CentOS-Base.repo file and then install a special CentOS Plus kernel which includes reiserfs support. Using your favourite text editor (nano in the example below), run the following command to edit the file, either as the root user or using sudo:

nano /etc/yum.repos.d/CentOS-Base.repo
 

Look for the [centosplus] section and change the enabled flag from 0 to 1 and add the includepkgs line as shown below:

[centosplus]
 ...
 enabled=1
 ...
 includepkgs=kernel* reiserfs-utils
 

Then add the following to the [base] and [updates] sections:

exclude=kernel kernel-devel kernel-smp-* kernel-hugemem* kernel-largesmp*
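
Before running yum it can be worth checking that the edit took effect. The function below is only a sketch: repo_section_enabled is a hypothetical helper, and it assumes the repo file uses plain [section] headers with enabled=1 lines as shown above.

```shell
# Succeed if the named section of a yum repo file contains enabled=1.
# Usage: repo_section_enabled REPO_FILE SECTION
repo_section_enabled() {
    awk -v section="[$2]" '
        # A line starting with "[" opens a new section.
        /^\[/ { in_section = ($0 == section) }
        in_section && /^enabled[ \t]*=[ \t]*1[ \t]*$/ { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Example:
# repo_section_enabled /etc/yum.repos.d/CentOS-Base.repo centosplus && echo "centosplus is enabled"
```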
 

Now run the following command either as the root user or using sudo:

yum install reiserfs-utils kernel
 

The output will look something like this:

Setting up Install Process
 Setting up repositories
 Reading repository metadata in from local files
 Excluding Packages from CentOS-4 - Updates
 Finished
 Excluding Packages from CentOS-4 - Base
 Finished
 Reducing CentOS-4 - Contrib to included packages only
 Finished
 Parsing package install arguments
 Resolving Dependencies
 --> Populating transaction set with selected packages. Please wait.
 ---> Package kernel.i686 0:2.6.9-67.0.1.EL.plus.c4 set to be installed
 ---> Package reiserfs-utils.i386 2:3.6.19-2.4.1 set to be updated
 --> Running transaction check
 
 Dependencies Resolved
 
 =============================================================================
  Package                 Arch       Version          Repository        Size
 =============================================================================
 Installing:
  kernel                  i686       2.6.9-67.0.1.EL.plus.c4  centosplus         14 M
  reiserfs-utils          i386       2:3.6.19-2.4.1   centosplus        434 k
 
 Transaction Summary
 =============================================================================
 Install      2 Package(s)
 Update       0 Package(s)
 Remove       0 Package(s)
 Total download size: 15 M
 Is this ok [y/N]:   
 

Type “y” and press Enter, and yum will then download and install the packages:

Downloading Packages:
 (1/2): reiserfs-utils-3.6 100% |=========================| 434 kB    00:04
 (2/2): kernel-2.6.9-67.0. 100% |=========================|  14 MB    02:47
 Running Transaction Test
 Finished Transaction Test
 Transaction Test Succeeded
 Running Transaction
   Installing: reiserfs-utils               ######################### [1/2]
   Installing: kernel                       ######################### [2/2]
 
 Installed: kernel.i686 0:2.6.9-67.0.1.EL.plus.c4 reiserfs-utils.i386 2:3.6.19-2.4.1
 Complete!
 

Once the installation has completed you need to reboot the system to load the new kernel and get access to the reiserfs partition.

After rebooting, cat /proc/filesystems will now show this:

nodev   sysfs
 nodev   rootfs
 nodev   bdev
 nodev   proc
 nodev   sockfs
 nodev   binfmt_misc
 nodev   usbfs
 nodev   usbdevfs
 nodev   futexfs
 nodev   tmpfs
 nodev   pipefs
 nodev   eventpollfs
 nodev   devpts
         ext2
 nodev   ramfs
 nodev   hugetlbfs
         iso9660
 nodev   relayfs
 nodev   mqueue
         ext3
 nodev   rpc_pipefs
 nodev   autofs
 nodev   nfs
 nodev   nfs4
         reiserfs
 

Note that reiserfs is now listed at the bottom, and you will now be able to mount the filesystem formatted with reiserfs.
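
Rather than reading through the list by eye, the check can be scripted. This is a minimal sketch (fs_supported is a made-up name) that looks for a filesystem type in /proc/filesystems; the optional second argument exists only so the function can be pointed at a saved copy of the list.

```shell
# Report whether the running kernel lists a filesystem type.
# Usage: fs_supported NAME [LIST_FILE]   (LIST_FILE defaults to /proc/filesystems)
fs_supported() {
    # The type name is always the last field; "nodev" lines have two fields.
    awk -v fs="$1" '$NF == fs { found = 1 } END { exit found ? 0 : 1 }' \
        "${2:-/proc/filesystems}"
}

if fs_supported reiserfs; then
    echo "reiserfs is available"
else
    echo "reiserfs is not listed"
fi
```

Once reiserfs is listed, the partition can be mounted as root with something like mount -t reiserfs /dev/hdb1 /mnt/old, where both the device and the mount point are placeholders for whatever your system uses.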