Version 5 (modified by jdreed, 12 years ago)

--

Dissecting A Package

Given a .deb package, the dpkg-deb command will tell us a bit about it:

$ dpkg-deb -I rpncalc_0.1_amd64.deb 
 new debian package, version 2.0.
 size 1854 bytes: control archive= 515 bytes.
     304 bytes,    11 lines      control              
     248 bytes,     4 lines      md5sums              
 Package: rpncalc
 Version: 0.1
 Architecture: amd64
 Maintainer: Jonathan Reed <jdreed@mit.edu>
 Installed-Size: 30
 Section: utils
 Priority: extra
 Description: Simple RPN calculator in bash
  A simple RPN calculator, implemented in bash.
  Not legal for trade.
  This was taken from the O'Reilly bash cookbook.

Here we can see some basic information about the size, name, version, and other metadata.
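
When a script needs only one of these fields, dpkg-deb's -f (--field) option prints just the requested control field. A minimal sketch; deb_field is a hypothetical helper name chosen for this example, not part of dpkg:

```shell
# Print the value of a single control field from a .deb file.
# Usage: deb_field PACKAGE.deb FIELD
deb_field() {
    # With exactly one field name, dpkg-deb -f prints only its value.
    dpkg-deb -f "$1" "$2"
}

# Example, using the package from the session above:
#   deb_field rpncalc_0.1_amd64.deb Version
```

Wrapping the call in a function keeps the field name and the file name as separate, quoted arguments, so paths containing spaces are handled correctly.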

We can also extract the contents of the package into a temporary directory ("scratch"):

$ mkdir scratch
$ dpkg-deb -x rpncalc_0.1_amd64.deb scratch
$ cd scratch/
$ ls
usr
$ tree
.
└── usr
    ├── bin
    │   └── rpncalc
    └── share
        └── doc
            └── rpncalc
                ├── changelog.gz
                ├── copyright
                └── README

5 directories, 4 files

So now we have the contents in a temporary directory, where we can explore them and perhaps try to run the binaries.

Now let's dig a bit deeper into how the package is put together:

A Debian package is simply a specialized use of an older archive format known as ar. The ar command is used to manipulate this type of archive. Let's extract a .deb archive:

$ ar xv rpncalc_0.1_amd64.deb 
x - debian-binary
x - control.tar.gz
x - data.tar.gz

Here, we told the ar command to extract (x) the .deb file in verbose mode (v). It successfully extracted three members. Let's take a look at them:

  • debian-binary: The presence of this file identifies this as a binary Debian package, and its contents (in this case, "2.0") identify it as version 2.0 of the format.
  • control.tar.gz and data.tar.gz: These are yet more archives, in tar ("TApe aRchive") format, compressed with gzip ("gz"). We can take a look at what's inside them with these commands:
$ tar tzvf control.tar.gz 
drwxr-xr-x root/root         0 2013-01-23 15:37 ./
-rw-r--r-- root/root       248 2013-01-23 15:37 ./md5sums
-rw-r--r-- root/root       304 2013-01-23 15:37 ./control
$ tar tzvf data.tar.gz 
drwxr-xr-x root/root         0 2013-01-23 15:37 ./
drwxr-xr-x root/root         0 2013-01-23 15:37 ./usr/
drwxr-xr-x root/root         0 2013-01-23 15:37 ./usr/share/
drwxr-xr-x root/root         0 2013-01-23 15:37 ./usr/share/doc/
drwxr-xr-x root/root         0 2013-01-23 15:37 ./usr/share/doc/rpncalc/
-rw-r--r-- root/root       163 2013-01-23 15:30 ./usr/share/doc/rpncalc/copyright
-rw-r--r-- root/root       136 2013-01-23 15:33 ./usr/share/doc/rpncalc/changelog.gz
-rw-r--r-- root/root       310 2013-01-23 15:37 ./usr/share/doc/rpncalc/README
drwxr-xr-x root/root         0 2013-01-23 15:37 ./usr/bin/
-rwxr-xr-x root/root       578 2013-01-23 15:37 ./usr/bin/rpncalc
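
The two steps above (extracting with ar, then listing with tar) can also be combined without writing any intermediate files, because ar can list members (t) or print a single member to standard output (p). A rough sketch, assuming the members are gzip-compressed and named as in this package (other packages may use a different compression, and therefore different member names); the function names are hypothetical:

```shell
# List the members of a .deb without extracting anything.
deb_members() {
    ar t "$1"
}

# List the files inside data.tar.gz by streaming it from ar into tar.
# Assumes the data member is named data.tar.gz, as in this package.
deb_data_listing() {
    ar p "$1" data.tar.gz | tar tzf -
}

# Example, using the package from the session above:
#   deb_members rpncalc_0.1_amd64.deb
#   deb_data_listing rpncalc_0.1_amd64.deb
```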

control.tar.gz

This archive contains the package's "control" information, or metadata. In this case, it contains two files: md5sums and control. Let's extract them and take a look:

$ tar xzvf control.tar.gz 
./
./md5sums
./control

md5sums

md5sums contains the MD5 checksums of the files in the package. These let you check the integrity of the package's files, and can be used later by the system administrator to verify that the files haven't been changed since the package was installed.

$ cat md5sums 
84ab9c5699930934aa485a7e1e977ad7  usr/bin/rpncalc
43ae450d42636f929f4c8d6fcaf3b77f  usr/share/doc/rpncalc/README
dd9cb407332a29aae18ee26cc6c5346c  usr/share/doc/rpncalc/changelog.gz
76a03e152d5ca56ba30707781d20d458  usr/share/doc/rpncalc/copyright

You can verify individual checksums yourself with the md5sum command.
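
To check every file at once, md5sum -c reads a checksum list and verifies each file named in it. Because the paths in md5sums are relative (usr/bin/rpncalc, not /usr/bin/rpncalc), the check has to run from the directory the package contents were extracted into. A minimal sketch, assuming GNU md5sum; verify_md5sums is a hypothetical helper name:

```shell
# Verify extracted files against a package's md5sums list.
# Usage: verify_md5sums EXTRACT_DIR MD5SUMS_FILE
verify_md5sums() {
    # Feed the list on stdin so its path is resolved before the cd.
    # --quiet suppresses the "OK" line for each file that matches.
    ( cd "$1" && md5sum -c --quiet - ) < "$2"
}

# Example, assuming the contents were extracted into scratch and
# md5sums was unpacked into the current directory:
#   verify_md5sums scratch md5sums
```

The function returns a non-zero exit status if any file is missing or does not match its recorded checksum, so it can be used directly in an if statement.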

control

$ cat control
Package: rpncalc
Version: 0.1
Architecture: amd64
Maintainer: Jonathan Reed <jdreed@mit.edu>
Installed-Size: 30
Section: utils
Priority: extra
Description: Simple RPN calculator in bash
 A simple RPN calculator, implemented in bash.
 Not legal for trade.
 This was taken from the O'Reilly bash cookbook.

The control file contains metadata about the package. The fields are largely self-explanatory; we'll visit them in more detail later.

Metapackages

Metapackages are a special case of binary packages: they don't include any software, only control information. This is useful for defining a set of packages that together provide a certain functionality. For example, the debathena-standard metapackage is designed to provide basic Athena functionality, such as Kerberos, AFS, etc. You don't need to know the many individual packages that provide this functionality, because debathena-standard either Depends on or Recommends those packages. This way, if we suddenly change filesystems, or add some new functionality, we can update the debathena-standard metapackage and let the package manager take care of making the individual changes.
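
As a rough illustration, the control file of a metapackage might look like the one written below. Everything in it is hypothetical: the package name example-standard, the maintainer, and the packages it lists are invented for this sketch and are not the actual contents of debathena-standard.

```shell
# Write a hypothetical metapackage control file (all names are invented).
cat > control.example <<'EOF'
Package: example-standard
Version: 1.0
Architecture: all
Maintainer: Example Maintainer <maintainer@example.com>
Depends: example-kerberos-config, example-afs-config
Recommends: example-extras
Description: Hypothetical metapackage
 Installs nothing itself; it only pulls in other packages.
EOF

# Show the relationship fields the package manager would act on.
grep -E '^(Depends|Recommends):' control.example
```

The only parts that do real work here are the Depends and Recommends fields; changing what the metapackage provides means editing those lists and releasing a new version.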

Useful Commands

  • dpkg -I foo.deb - View package control information.
  • dpkg -c foo.deb - List the files in the package.
  • dpkg -x foo.deb dir - Extract the package into directory dir.
  • dpkg -e foo.deb dir - Extract Debian metadata (control file, maintainer scripts, md5sums, etc.) to directory dir.
  • apt-cache show package - Display control information for package.

Next: SourcePackages