Introduction to packaging step-by-step example
Instaloader is a Python 3 application with a single dependency (Python’s requests). This makes it a relatively simple package, though not as straightforward as packaging up a shell script would be. Its simplicity and the learning opportunities it offers make it a good introductory package.
Instaloader Code Overview
The first thing we do is look at the application’s GitHub page. A few things stand out that will come in handy later:
The tool contains a setup.py script
It has a release
The license is MIT
We’ll dig into each of these later; for now, it is just useful to know.
Setting Up The Environment
We will assume that we have already followed our documentation on setting up a packaging environment.
Let’s set up our directories now for this package:
Everything that relates to us building a package will live under ~/kali/. In there will be two subfolders:
packages/ will hold the source code of the package we are going to create
upstream/ will hold a compressed file of the source code of the application (ideally from a tagged release, which we saw before)
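A minimal sketch of that layout, assuming the package directory is named after the tool (instaloader). The KALI_DIR variable is a hypothetical convenience for this example, not something the tooling requires:

```shell
# Base directory for all packaging work (assumed default: ~/kali).
KALI_DIR="${KALI_DIR:-$HOME/kali}"

# packages/<name>/ holds the package source; upstream/ holds the tarballs.
mkdir -p "$KALI_DIR/packages/instaloader" "$KALI_DIR/upstream"

# Show the resulting layout.
ls -d "$KALI_DIR"/*/
```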
Downloading Tag Releases
Because we are making a new package from scratch, we’ll manually download the version of the tool we want to package up. If we were updating a package (and it was packaged correctly), there is a process to help speed it up. However, this will be covered in another guide.
Going to the GitHub releases page, we can see the latest version (which at the time of writing is 4.4.4). There are options to download instaloader-v4.4.4-windows-standalone.zip, as well as Source Code (zip) and Source Code (tar.gz). We are interested in the tar.gz option.
We will use wget and make sure to name the downloaded file according to Debian’s standards for source packages (take note of .orig.tar.gz):
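As a rough sketch, the naming convention and the download could look like the following. The helper names (orig_name, fetch_upstream) are hypothetical, and the URL pattern is an assumption based on how GitHub usually serves tag archives, so check it against the release page before relying on it:

```shell
# Build the Debian-style name: <package>_<version>.orig.tar.gz
orig_name() {
    printf '%s_%s.orig.tar.gz\n' "$1" "$2"
}

# Download a tagged release and save it under the Debian-style name.
# Assumption: the tag is "v<version>" and GitHub serves it at this URL.
fetch_upstream() {
    pkg=$1
    ver=$2
    url="https://github.com/instaloader/instaloader/archive/v${ver}.tar.gz"
    wget "$url" -O "$HOME/kali/upstream/$(orig_name "$pkg" "$ver")"
}

# Show the file name the download would be saved as.
orig_name instaloader 4.4.4
```

Calling fetch_upstream instaloader 4.4.4 would then perform the actual download into ~/kali/upstream/.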
If there isn’t a tag release for the software (or it hasn’t had a release in some time), we can use the latest git commit. This is covered in another guide. However, it is preferred to use a tag release when available.
Creating Package Source Code
We need to switch paths to the working location of the package:
We are now going to create a new blank git repository:
If we want to, we can confirm this by looking at “status” and “log”:
Great. Everything is empty; we have a clean working area.
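The steps above (switching directory, creating the repository, checking its state) can be sketched as follows. The PKG_DIR variable is a hypothetical convenience, and its default is an assumption that the directory layout described earlier is in use:

```shell
# Assumed location of the package working directory.
PKG_DIR="${PKG_DIR:-$HOME/kali/packages/instaloader}"
mkdir -p "$PKG_DIR"
cd "$PKG_DIR"

# Create the empty repository.
git init

# Confirm there is nothing staged and no history yet.
git status
# "git log" exits non-zero in a repository without commits, so tolerate that.
git log || true
```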
We can now import the upstream version into our packaging source code by using the file downloaded with wget before. Because of the filename format, gbp is able to detect instaloader as the package name and 4.4.4 as the version. We just press enter to accept the default values:
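To see why the filename format matters, here is an illustration of how the package name and version can be read out of it. This only mirrors the naming convention using shell parameter expansion; it is not how gbp is actually implemented:

```shell
# Illustration only: split <package>_<version>.orig.tar.gz into its parts.
tarball="instaloader_4.4.4.orig.tar.gz"
base="${tarball%.orig.tar.gz}"   # strip the suffix
pkg="${base%%_*}"                # text before the first underscore
ver="${base#*_}"                 # text after the first underscore
echo "package=$pkg version=$ver"

# The import itself (requires git-buildpackage; run inside the new repository):
#   gbp import-orig ~/kali/upstream/instaloader_4.4.4.orig.tar.gz
```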
If we wanted to check everything is okay, once again, we can use git to do so:
So there is now an automatic commit created in the master branch (which is the current active branch, shown by the *), as well as two other branches:
pristine-tar which is metadata from the import
upstream which is the source code of the application, without any of our package modifications
We are creating a Kali package, and we don’t use the master branch, but rather kali/master. So let’s switch:
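A minimal sketch of the switch, run in a scratch repository so it is safe to try anywhere. The placeholder identity and the empty commit are assumptions that stand in for the real gbp import:

```shell
# Scratch repository standing in for the package repository.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet --allow-empty -m "Placeholder for upstream import"

# List the existing branches, then create and switch to kali/master.
git branch
git checkout -b kali/master

# Confirm which branch is now active.
current="$(git rev-parse --abbrev-ref HEAD)"
echo "$current"
```

In the real package repository, only the git checkout -b kali/master line is needed.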
Now we can generate the necessary files required to build a Debian-based package and also remove any example files created. During the process, we will be asked whether the package is:
Single binary
Arch-Independent
Library
Python
We are going to keep it simple, and go with “Single”. Then accept what’s on the screen with Y. If you would like more information about when to use which option, please see the manpage for dh_make:
We use --file to say where the orig.tar.gz file is. If the file was one directory up (../), this would not be needed; however, as we have created a separate location for the file, it is.
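The generation and clean-up steps could be sketched as two small helpers. The function names are hypothetical, and the *.ex / *.EX patterns are an assumption about how the example templates are named, so check the contents of debian/ before deleting anything:

```shell
# Generate the debian/ skeleton as a single-binary package.
# Run from the package source directory; requires dh_make to be installed.
make_skeleton() {
    dh_make --file "$HOME/kali/upstream/instaloader_4.4.4.orig.tar.gz" \
            -p instaloader_4.4.4 --single
}

# Remove the example templates left in the given directory (default: debian).
remove_examples() {
    rm -f "${1:-debian}"/*.ex "${1:-debian}"/*.EX
}
```

Running make_skeleton followed by remove_examples from the package directory would leave only the core packaging files in debian/.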
If you would like to see what got generated when using dh_make, we can use git:
A quick overview of each of those files:
changelog - tracks when the package gets an update (including why and by whom). This is responsible for the package version
control - is the metadata for the package (often seen with apt)
copyright - what is under what license. The package can be under something different to the work we have put in to create the package
rules - how to install the package
source/format - is the source package format
At this point, we have the base packaging files in place, and it feels like a good idea to commit before starting some real work:
We now need to edit most of these to make sure the information is accurate. We can use what we found on GitHub to supply the correct info into the debian/ files:
License
Dependencies
Maintainers
Description
Collecting Information
License/Maintainers
For this package, it’s straightforward. GitHub has given us a helping hand, and detected the license as MIT. We can also see there is a license file:
Reading the license, we can see there are two authors who are given credit: Alexander Graf and André Koch-Kramer. However, we don’t have a method of contact for them. We continue to explore the rest of the git repository, looking for anything that may list more authors so we can give credit to them. There isn’t a fixed structure in place; however, there are some things to check and look out for:
README* - authors may put contact information here
A few examples could be: README, README.txt, README.MD, README.MKDOCS, or Readme.txt
AUTHOR* - They may have a dedicated file for author information
CREDIT* - They may have a dedicated file listing who they give credit to
LICENSE* - As mentioned above, the license file may give author information
docs/ - They may place all their documentation in a separate folder
The “main” starting point of the application may have comments at the top of the file - in this case, instaloader.py
Git commits - git --no-pager log -s --format="%ae" | sort -u
For our package, we can see:
As it turns out, there are AUTHORS.md, docs/, instaloader.py, and README.rst, so we have a few places to look at. Starting with AUTHORS.md, we can see the authors’ names and their method of contact:
So rather than an email address, it appears to be a username (could be just for GitHub, or a generic Internet handle). This is enough for us to go forward (even though it’s not ideal).
Another trick we could try is looking to see if they used a “legit” email address with git:
It doesn’t appear so. Was worth a try!
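The git check above can be wrapped in a small helper so it works against any local clone. The function name is hypothetical; the git command itself is the one listed earlier:

```shell
# Print the unique author e-mail addresses recorded in a repository's history.
# $1 is the path to a local clone (defaults to the current directory).
list_author_emails() {
    git -C "${1:-.}" --no-pager log -s --format="%ae" | sort -u
}
```

For example, list_author_emails ~/src/instaloader, where the path is a hypothetical clone of the upstream repository.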
Dependencies/Maintainers
We need to see what is required to be installed on the machine in order for the application to work. These requirements may either be pre-installed or be installed along with the application.
Some starting places to look at for this information:
README*
SETUP*
INSTALL*
docs/
There is a README for this application, but it just says how to install the application, rather than how to build it/compile from source:
Exploring the pip option is something we could do, but it is out of scope for this guide.
Next we spot setup.py, which contains a lot of useful information:
We managed to get the following information from this:
From the shebang, we can see it’s Python 3 (#!/usr/bin/env python3).
We can see it wants Python 3.5 or higher
We can see it wants requests, and for it to be v2.4 or higher
We can see that on Windows it requires another dependency, but we are on Linux, so that is not the case here
We can see the program’s home URL
We can see the license (MIT)
We can see the authors and their email addresses
We can get a description of the program
We can get a description of the program
Handy!
When packaging, we are building a standalone package, which needs to be installable offline. This means we cannot rely on other package management systems, such as Python’s pip or Ruby’s gems: any dependencies they would provide also need to be available in the main OS package management. In our case, we need Python’s requests.
We have two ways of searching for it. We can use either:
apt-cache
But we also need to know what we are searching for. There is a naming convention, but if you are unsure, doing multiple searches may help:
requests
python-requests
python3-requests
We will stick with the command line option for the time being.
Searching for just requests gives too many results:
So we need to narrow the list by searching just the short version of the description (we will cover this more later, but it’s the visible part of the output):
After removing the documentation packages from the results, nothing is left. So on with the next search:
The first result, python3-requests, looks exactly right! We can look closer:
And we can see its version is 2.23.0, which is higher than 2.4, so we don’t need to update the package. This will be covered in another guide when required.
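Version numbers are compared component by component rather than as decimals, which is why 2.23.0 counts as higher than 2.4. A minimal sketch of that comparison, using GNU sort’s version ordering (the version_ge helper is hypothetical):

```shell
# Return success when version $1 is greater than or equal to version $2.
# On Debian-based systems, "dpkg --compare-versions 2.23.0 ge 2.4" is the
# native alternative to this helper.
version_ge() {
    [ "$(printf '%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

if version_ge 2.23.0 2.4; then
    echo "packaged python3-requests is new enough"
fi

# The lookups themselves (results depend on the configured repositories):
#   apt-cache search --names-only python3-requests
#   apt-cache show python3-requests
```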
Maintainers
While doing the other parts, we have discovered the authors and maintainers of the software, so we don’t need to do anything extra for this.
Description
There are two descriptions that we need to supply, a long description and a short description. When we look at the GitHub page we can see an about section that we can use for the short description. For the long description, we can use the description in the README.
We also have a value from the setup.py.
Editing Package Source Code
Now that we have that information copied down, we can start to populate the files in the debian/ folder we created with dh_make.
More information on the subject can be found on the Debian documentation.
Changelog
If we followed the documentation on setting up a packaging environment, the only values we will need to alter are the distribution (from UNRELEASED to kali-dev), the version (from 4.4.4-1 to 4.4.4-0kali1) and the log entry:
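As an illustration of where those three values sit, the sketch below writes a sample changelog into a scratch directory. The maintainer name, email address, date, log text and urgency value are all placeholders assumed for this example:

```shell
# Write a sample changelog to a scratch directory (placeholder details).
work="$(mktemp -d)"
mkdir -p "$work/debian"
cat > "$work/debian/changelog" <<'EOF'
instaloader (4.4.4-0kali1) kali-dev; urgency=medium

  * Initial release

 -- Your Name <you@example.com>  Mon, 01 Jun 2020 12:00:00 +0000
EOF

# The first line carries the package name, version and distribution.
head -n 1 "$work/debian/changelog"
```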
Control
This file is the metadata for the package, and contains a lot of information.
More information on the subject can be found on the Debian documentation.
Out of the box, it will look a little like this:
So we can see a few things that need updating:
Section - we set this to misc, or, if we know for sure that another section from Debian testing fits, we set it to that section
Maintainer - we switch this to the Kali team, rather than an individual
Uploaders - this is the individual(s) who are responsible for packaging up the application
Build-Depends - what packages are required to BUILD the package
Homepage - where the tool is located on the Internet
Vcs-Browser - package source code to view online
Vcs-Git - package source code location
Architecture - what machines this can work on
Depends - what other packages are required for this package to work
Description - short and long description
Most of this we have now figured out from before, so it should make it easier to fill in. We went ahead and created a remote empty git repository on our GitLab account. In our example, this is the end result:
NOTE: The Build-Depends & Depends continuation lines are indented with one space (and end with commas). The Description is also indented with one space.
There is a lot going on here, so let’s point out a few things.
Something to keep in mind with the formatting of the long descriptions, at about every 70 characters in (to the nearest whole word), we would put a new line, to help keep the formatting under control.
Now onto the dependencies, of which we have: Build & Package. For the build-dependencies of Python 3 we will have to have four things:
debhelper-compat
dh-python
python3-all
python3-setuptools
In a separate guide there will be an explanation as to why these are included, however only the first two are going to be a staple of Python 3 packaging as the latter two are for more specific cases.
In our application, we have another one, python3-requests, which we got from setup.py and which is a requirement of the application. Typically, if there was not a setup.py file, we would not need to include python3-requests in our “Build-Depends”. However, due to the setup.py file, we will need to include python3-requests in both the “Build-Depends” as well as the package “Depends”. This ensures these packages are always on the system when we install our package (especially handy when using “sbuild”).
The debhelper-compat level determines how the package will be built. The higher the compat level, the newer the version. Newer versions have certain menial tasks done automatically, so this should not be lowered.
The package dependencies are relatively straightforward. We get rid of the ${shlibs:Depends} as we are packaging up a Python tool, and instead replace it with the python3 depends version ${python3:Depends}. We also ensure that python3-requests is included as the tool requires this. No other dependencies are needed by this tool, so we are done.
The final thing we need to ensure we change is the architecture from any to all, as this tool can be installed on all architectures.
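Pulling the points above together, a trimmed control file could look roughly like the sketch below, written to a scratch directory. The maintainer and uploader details, compat level, homepage and description text are assumptions for illustration, and fields such as Vcs-Browser and Vcs-Git are left out because they depend on where the package repository is hosted:

```shell
work="$(mktemp -d)"
mkdir -p "$work/debian"
# The quoted 'EOF' keeps the ${...} substitution variables literal, so the
# build tools (not the shell) expand them later.
cat > "$work/debian/control" <<'EOF'
Source: instaloader
Section: misc
Priority: optional
Maintainer: Kali Developers <devel@kali.org>
Uploaders: Your Name <you@example.com>
Build-Depends: debhelper-compat (= 12),
 dh-python,
 python3-all,
 python3-requests,
 python3-setuptools,
Homepage: https://instaloader.github.io/

Package: instaloader
Architecture: all
Depends: ${misc:Depends},
 ${python3:Depends},
 python3-requests,
Description: Short one-line description goes here
 Longer description, wrapped at roughly 70 characters and
 indented with one space.
EOF

# python3-requests should appear in both Build-Depends and Depends.
grep -n 'python3-requests' "$work/debian/control"
```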
Copyright
Everything that gets created has an original author. They control what happens with it, and that needs to be respected. We can call this out in the copyright file.
More information on the subject can be found on the Debian documentation and here.
Below is the skeleton template output (with comments removed):
The original tool’s author has ownership on their work, and the work we have put into creating the package belongs to us. After updating it, it looks like the following:
We altered the following:
We removed an optional parameter (Upstream-Contact), as the contact details are already covered elsewhere in the copyright file.
We put the homepage of the application in Source
We put in the two authors’ names and addresses from setup.py. The dates came from the LICENSE file
Rather than putting the whole block of MIT license text directly after, we placed it towards the end of the file, and gave a header to it.
We replaced the GPL-2+ used as default for the packaging section with the same MIT license, which is used in the application. This is the standard for Debian packages (packaging work should match the application’s license).
Rules
This file is a Makefile, for building the Debian package.
More information on the subject can be found on the Debian documentation.
The output of the template looks like the following:
So there are a lot of items which are commented out by default, and these may be handy for debugging & troubleshooting. Other than the shebang (#!/usr/bin/make -f), there are only two other lines currently in use: a wildcard target (%), and a recipe that feeds all the arguments into dh.
What needs to go here now starts to depend on the program and how complex it is. As our program is a Python application, we are going to have to tell dh to build with python3. We also need to tell it to use pybuild to build, as we have a setup.py file included in the source of the application. If there was not a setup.py file, we would not add this flag. We also need to tell PyBuild the name of the application. This looks like:
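A minimal sketch of such a rules file, written to a scratch directory. It assumes PYBUILD_NAME is how the application name is passed to pybuild, and that no other overrides are needed for this package:

```shell
work="$(mktemp -d)"
mkdir -p "$work/debian"
# printf is used so the recipe line starts with a real tab, as make requires.
printf '%s\n' \
    '#!/usr/bin/make -f' \
    '' \
    'export PYBUILD_NAME=instaloader' \
    '' \
    '%:' \
    "$(printf '\tdh $@ --with python3 --buildsystem=pybuild')" \
    > "$work/debian/rules"
chmod +x "$work/debian/rules"
cat "$work/debian/rules"
```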