503 Service Unavailable

2012-06-11

Things to know about leap seconds and your Unix system

Filed under: Software — rg3 @ 21:16

Let me start with a small apology for not writing anything new in several months; it’s amazing it’s been so long since my last post. Anyway, June 30th is approaching quickly, and this past week I’ve been in charge of reviewing the code in our ATC system regarding the possible operational impact the next leap second may have on controllers. Fortunately, some changes had been introduced for the previous leap second in 2008 to fix a few problems that appeared during the 2005 one, which means the impact should be minimal and it will still be safe to fly over the Iberian Peninsula during the leap second midnight. ;)

Seeing me work on that, a few colleagues and friends have asked me what the leap second thing is about and if there’s anything they need to know or do on their home systems to handle it properly. The short answer is “No”. Most people won’t need to do a single thing but I’m sure some of them would enjoy a longer explanation, so here it goes.

What is a leap second?

For humans, and regarding this topic, there are basically two ways to count time. On the one hand you have systems based on the International System of Units. In these systems, a second is defined as a very specific, fixed amount of time: the duration of a given number of periods of a specific radiation associated with caesium-133 atoms. No more, no less. Some time scales, like International Atomic Time or GPS Time, follow that definition, and time in them advances forward at a fixed pace. On the other hand, we humans like to give our everyday time scales an astronomical meaning. The Earth spins on its axis and completes one turn with respect to the Sun in a certain period of time, and we define the duration of one day, and from it one second, based on that movement. The UTC time scale, on which our time zones are based, works like that.

The problem with giving time that astronomical meaning is that the time the Earth takes to complete one rotation is not constant, so the definition of one second in those two kinds of scales doesn’t match exactly. Being human, we decided to simplify things and define UTC in terms of a simpler, non-astronomical scale: International Atomic Time. To preserve the astronomical meaning of UTC, then, once in a blue moon we are forced to have a minute with 59 or 61 seconds (until now, only the latter), just like from time to time the year needs to have 366 days. The moment these leap seconds are introduced is decided on the go by the International Earth Rotation and Reference Systems Service according to observations, but the event always takes place right after (or, if a second were ever removed, instead of) 23:59:59 on June 30th or December 31st of a given year in the UTC time “zone”.

Since they were introduced, International Atomic Time and UTC time have fallen 34 seconds apart and, after June 30th, the difference will be 35 seconds.

How is a leap second introduced in theory?

In theory it’s very simple if we represent time as text or as a structure with separate fields. If we’re adding a second, on the night of June 30th we’d have:

June 30th, 2012 23:59:57
June 30th, 2012 23:59:58
June 30th, 2012 23:59:59
June 30th, 2012 23:59:60 -- leap second here
July 1st,  2012 00:00:00
July 1st,  2012 00:00:01
July 1st,  2012 00:00:02

How is a leap second introduced in practice, in a computer?

In Unix systems (and other types of systems as well) there are a lot of system calls and different APIs to handle time in different and flexible ways. Unfortunately, most of them were not designed with leap seconds in mind, especially the most interesting ones from a computational point of view, because they are mainly based on Unix time. Since we cannot easily represent going from 23:59:59 to 23:59:60 and then to 00:00:00, in most systems the leap second is introduced by advancing or putting back the clock one second at 23:59:59, like you’d do on a digital watch: for the usual added second, you reach the end of 23:59:59 and then jump back to 23:59:59 again. Some other systems prefer to “freeze” the clock for one second, so an application requesting the system time doesn’t see a jump back in time between two consecutive system calls.
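To see the problem, note that Unix time is simply a count of seconds since 1970 and has no slot for 23:59:60. A quick check with GNU date (a sketch; the “-d @timestamp” syntax assumes GNU coreutils):

# Unix timestamps around the 2012 leap second
date -u -d @1341100799   # Sat Jun 30 23:59:59 UTC 2012
date -u -d @1341100800   # Sun Jul  1 00:00:00 UTC 2012
# There is no timestamp for 23:59:60, so the extra second has to be
# absorbed by repeating or freezing 23:59:59.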

Anyway, it’s a bit easier to make the applications that need it leap-second aware than to introduce a new standard API that takes leap seconds into account and is used by everyone. Because, honestly, most applications don’t care about leap seconds and, for those that do, it’s not that hard to detect that a leap second happened and handle the special case properly with a few lines of code.

Enter NTP

NTP, or Network Time Protocol, is a tool that can be used by computers to stay in sync and properly on time. Specifically, it allows two computers to exchange information about what time it is, letting one of them, the client, closely follow the time of the other one, the server, which is considered a trustworthy reference. This server may in turn be in sync with another, more trustworthy time source, and so on. Obviously, this chain cannot go on indefinitely, and at some point someone has to trust a time source as being infallible.

NTP is needed because, even though computers have a fairly precise internal hardware oscillator and clock, just like a normal watch it runs slightly too fast or too slow. The NTP daemon running on the computer is able to measure this drift and inform the operating system kernel about it, so that system time advances at the proper pace and your clock stays on time.

The thing about NTP and leap seconds is that the protocol is leap-second aware. If a time source knows a leap second is about to take place, it can inform its clients about it. These clients may in turn be servers to other clients, and they will propagate the leap second warning to them. The NTP daemon may inform the kernel about the upcoming leap second so the kernel applies the jump, or it may apply the jump itself if its configuration allows it to behave that way and it’s running with enough privileges. In any case, you probably won’t have to decide that, as the operating system vendor will ship the NTP daemon with a working configuration in that regard; you only have to customize the reference server(s), if any, and perhaps some security-related parameters about who can query or modify the daemon state.

So, hopefully, if you have a Unix system synced with NTP, your computer will most likely receive a leap second warning before it happens and will make the clock jump for you at the proper moment at midnight, without you having to do anything.
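If you are curious, you can ask your local NTP daemon whether it has already been warned by querying its system variables (a sketch; the exact output format may vary between ntpd versions):

ntpq -c "rv 0 leap"
# leap=01 means a second will be inserted at the end of the current UTC day,
# leap=00 means no leap second is pending.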

A few remaining questions

Who provides the original leap second warning? Many infallible time sources in NTP are connected to a GPS clock. This clock receives the current time from the satellites, together with the known offset between GPS time and UTC time, which will change after a leap second happens. It will also receive a leap second warning some time before it happens. Moreover, a properly configured NTP server (either for public use or for internal corporate or home use) should have its software or firmware updated so it knows in advance that the leap second is going to take place, without having to wait for the modified UTC offset message from the satellites or the leap second warning. For example, if you use Linux and the NTP server from ntp.org, you can download a file from the NIST FTP server with an up-to-date list of leap seconds, save it to your hard drive and use a “leapfile” directive in the configuration file to point the daemon to that list of known leap seconds. You can even do this if you don’t trust that your upstream servers will properly propagate the leap second warning. More information can be found in the ntp.org wiki.
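As a sketch, the relevant fragment of ntp.conf could look like this (the server and the file path are just examples; point “leapfile” at wherever you saved the NIST file):

# /etc/ntp.conf (fragment)
server 0.pool.ntp.org iburst
leapfile /etc/ntp/leap-seconds.list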

What if my computer is turned off before the leap second warning and turned on after it has happened? Nothing to worry about. The moment the NTP daemon detects you’re off by a second or so, it will step the clock to the proper time without affecting the drift calculations. For example, the manpage for ntpd on my system states the step will happen if it detects you’re off by more than 128 milliseconds.

Experiment for the future: be playing a video or a song on your computer at the moment the leap second happens and see if it stalls for one second or some other weird thing happens, to test whether your movie or audio player is able to handle the leap second. File a bug if it isn’t. :)

As you can see, keeping your computers in sync is really easy even in the event of a leap second. It becomes a bit trickier when the system has several different devices of different brands and ages and each one of them takes the time reference from a different source, but most home computer owners don’t need to do anything special, and most people simply won’t care about the leap second anyway.


2010-11-20

Disabling antialiasing for a specific font with freetype

Filed under: Software — rg3 @ 19:39

In the following paragraphs I’ll describe how to disable antialiasing for a specific font with freetype. The individual pieces that need to be put together to achieve this are well documented, but a Google search didn’t turn up many relevant results on this specific topic, so I hope anyone else searching for quick instructions will find the following text useful, and hopefully on the first page of a web search.

As you may know, font rendering with freetype is normally configured through fontconfig, by creating files in /etc/fonts/conf.avail and creating symlinks to those files in /etc/fonts/conf.d. Separating each configuration parameter or parameter group into individual files lets you easily enable and disable specific font-rendering features by creating and removing symlinks. One of these configurable features, usually enabled in any distribution, is parsing the file ~/.fonts.conf, which lets every user set their own font rendering parameters. For example, when KDE configures the font rendering features from the “System Settings” panel, it overwrites your ~/.fonts.conf. If you want to disable antialiasing for a specific font, you can either create a new config file in /etc/fonts/conf.avail and link to it from /etc/fonts/conf.d, setting it for every user, or add the setting to your own ~/.fonts.conf. If you do the latter, be sure to back the file up somewhere, because fiddling with the font settings in your desktop environment may overwrite it.

Getting down to the specifics, I recently installed the Tahoma font from my Windows installation and wanted to use it with the bytecode interpreter and without antialiasing in the GUI, so it would look like this:

[Screenshot: KDE System Settings window showing Tahoma rendered without antialiasing]

However, the rest of the fonts look ugly with those settings, so I wanted to disable antialiasing for the Tahoma font only, and only in sizes of 10 points or less. For bigger sizes, antialiasing would be enabled. Long story short, here are the settings that need to be integrated into your personal ~/.fonts.conf or put in an individual file in /etc/fonts/conf.{avail,d}. I’ll explain the contents next.

<?xml version='1.0'?>
<!DOCTYPE fontconfig SYSTEM 'fonts.dtd'>
<fontconfig>
  <match target="font">
    <test qual="any" name="family">
      <string>Tahoma</string>
    </test>
    <!-- pixelsize or size -->
    <test compare="more_eq" name="size" qual="any">
      <double>1</double>
    </test>
    <test compare="less_eq" name="size" qual="any">
      <double>10</double>
    </test>
    <edit mode="assign" name="antialias">
      <bool>false</bool>
    </edit>
    <edit name="autohint" mode="assign"><bool>false</bool></edit>
  </match>
  <match target="font">
    <test qual="any" name="family">
      <string>Tahoma</string>
    </test>
    <!-- pixelsize or size -->
    <test compare="more_eq" name="pixelsize" qual="any">
      <double>1</double>
    </test>
    <test compare="less_eq" name="pixelsize" qual="any">
      <double>14</double>
    </test>
    <edit mode="assign" name="antialias">
      <bool>false</bool>
    </edit>
    <edit name="autohint" mode="assign"><bool>false</bool></edit>
  </match>
</fontconfig>

I don’t want to go into every detail about the rules above. There is an XML header that needs to be present in any configuration file, and it contains a “fontconfig” section. Inside that section you can put any number of “match” sections, among other things, and we need two: one specifies the rules in terms of point size and the other in terms of pixel size. Both are needed because, depending on the application, the requested font size may be expressed in points or directly in pixels, and we want the rule to apply in both cases.

The matches look for fonts named Tahoma and disable antialiasing and autohinting for them in some specific sizes. The exact point and pixel sizes depend on your X server and/or Xft settings. Most people set the DPI value to 75, 96 or 100. In KDE, you can override the current setting from the style configuration window. DPI stands for “Dots Per Inch”; in this case, pixels per inch. Normally it should really match your monitor. That is, if you have a 22″ screen with a specific resolution in pixels, you’d specify a DPI setting that matched the real DPI. However, as I said, most people use 75, 96 or 100 (I set it to 96 myself) and it DOES NOT match the real DPI. Depending on the DPI setting, your fonts will look bigger or smaller at the same size in points. In my case, I was interested in sizes of 10 points or less, hence the point-size match you can read above.
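If you’re not sure which DPI value your X server is currently using, you can check from a terminal (a quick sketch; xdpyinfo is part of the standard X.Org utilities, and the Xft.dpi resource only shows up if it has been set):

xdpyinfo | grep resolution
#   resolution:    96x96 dots per inch
xrdb -query | grep -i dpi
# Xft.dpi:        96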

To write the pixel size match you need to calculate the equivalent of those point values in pixels. This is easy to do knowing two things: the DPI value you’re currently using and the fact that an inch has exactly 72 points. So the equivalent in pixels of a 10-point size on a 96 DPI screen is the following:

10 points, in inches: 10 / 72 = 0.1388

With 96 pixels per inch, those are: 0.1388 * 96 = 13.33, or 14 pixels rounding the number up, which is what you see in the config file I pasted above.
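The same conversion as a one-liner, in case you want to adapt the rules to another size or DPI value (just a sketch):

# pixels = points / 72 * DPI
awk 'BEGIN { points = 10; dpi = 96; printf "%.2f pixels\n", points / 72 * dpi }'
# 13.33 pixels, rounded up to 14 in the rules above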

2010-11-06

youtube-dl has moved to github.com

Filed under: Programming,Software — rg3 @ 12:13

Some days ago, youtube-dl, my most popular project, moved from being managed using Mercurial at bitbucket.org to being managed using Git at github.com. Since the move, I’ve been wanting to write something about it. I’ve also been wanting to rewrite the program partly or completely every time I look at its source code, but that’s a different matter. Back to the main topic.

I should start by apologizing to anyone who thinks this is a bad move, either because they have to rebase all their work onto a new repository, bringing all their changes back, or because they registered at bitbucket.org just to follow the project. It currently has 17 forks and 100 followers, and I’m pretty sure some of them registered there just to follow youtube-dl, so the move to github.com is, if anything, a problem for them: they would have to create an account somewhere else to continue following the project. Again, apologies to anyone for whom the move brings nothing but practical inconveniences.

That said, I’d like to explain why I made the move. You may recall I wrote an article some time ago about Mercurial vs. Git. Apart from explaining what I considered to be the main differences between the two, I also wanted to express my indecision about which one was better. While I think Mercurial is and was great, the balance has been leaning towards Git for some time now, and I tend to use Git for all my personal projects. Many of the reasons, if not all of them, have been expressed by other people in the past. It’s a good moment to quote a very well-known blog post from Theodore Ts’o, written in 2007 when he was still planning to migrate e2fsprogs from Mercurial to Git:

The main reason why I’ve come out in favor of git is that I see its potential as being greater than hg, and so while it definitely has some ease-of-use and documentation shortcomings, in the long run I think it has “more legs” than hg, […]

I think that paragraph describes what I think with great accuracy. In the medium and long run, Git’s problems almost vanish. Its documentation was a bit poor back then, but people have been writing more and more about Git and there are a few very good resources to learn its internals and basic features. Furthermore, once you have a basic idea of its internals and use it daily, you no longer need that much documentation. If you’re not sure how to do something, chances are a simple web search will tell you how to achieve it.

Also, as many people know, Mercurial was and is mostly about not modifying the project’s history, while Git has a lot of commands that directly modify it. With time, I’ve come to realize that modifying the project’s history is simply more practical in many cases and, in a range of situations, it leads to less confusion. In my day job, we are slowly moving from CVS to Subversion to manage the sources of a very old and important project, which has existed since about 1984. At the same time, we are modifying our work flow here and there to take advantage of Subversion, and we’re heavily using branching and merging despite the fact that’s not one of Subversion’s greatest strengths, as you may know. That’s giving us some problems, and it’s amazing how many times I’ve caught myself thinking “this would be much easier if we were using git, because we would simply do this and that and the job would be done”. Many of those actions would modify the project’s history and clean it up. I repeat: in real situations, with a lot of people working on something and not doing everything exactly as it should be done, it’s only a matter of time before you miss a Git feature.

The only thing I don’t like about Git is its staging area. From a technical perspective, the staging area makes a lot of sense, and you can build many neat features on top of it. However, having a staging area is one thing; exposing it to end users is another. I think you could have a staging area and all the features it provides while hiding it from users in their most common work flows. Still, it’s something you get used to, and everybody knows that, when your project is a bit mature, you spend way more time browsing the source code, debugging, running it and testing it than actually committing changes to the source tree. The staging area is not a big issue and “git commit -a” covers the most common cases.
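For reference, this is the difference in daily use (a minimal sketch; the file name and commit messages are made up):

git add youtube-dl                    # stage this file's changes explicitly
git commit -m 'Fix the extractor'     # commit only what was staged
git commit -a -m 'Fix the extractor'  # or stage and commit all modified tracked files in one step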

Apart from Git itself, the move was partly motivated by the site, github.com. When I started using bitbucket.org I liked it a bit more than github.com, but things have slowly changed. github.com fixed a rendering bug that hid part of the project top bar, got rid of its Flash-based file uploader and got an internal issue tracker with a web interface that works really, really well. The site is very nice and the “pages” feature, which lets you set up a simple web page for the project, is still not provided by bitbucket.org as far as I know. In addition, with the arrival of Firesheep, it quickly moved to using SSL for everything. It’s fantastic.

bitbucket.org was recently bought by Atlassian and its pricing plans are indeed better. For me, however, the number of private repositories and private collaborators is not an issue, because all the projects I host on github.com are public. Still, it’s fair to mention their plans because they could be a deciding factor for some people.

I wouldn’t like to close this article without mentioning the big improvement that both sites bring to the typical free and open source software developer. I still host a few projects on sourceforge.net, and I can tell you I’m not going back to it despite the great service they have provided for years for which I thank them sincerely.

It’s been months since I last used it, so I apologize if things have changed without me noticing, but back then it was very hard to get your code on sourceforge.net. You didn’t perceive it as hard because there was no github.com to compare it with. Once you try github.com or bitbucket.org, you realize how much the process can be simplified. There are two key aspects to note. First, the project name doesn’t have to be globally unique. It only needs to be unique among your own projects, which simplifies choosing a project name a lot. Second, once the project is created and has a basic description, without filling in any forms and without having to wait for anything, you are only a few commands away from uploading your code to the Internet. It can literally take less than 5 minutes to create a project and have your code publicly available, and that’s fantastic and motivating. You don’t need to find time to upload your code or think about whether the process is worth it for the size of the project. You simply do it. That’s good news for everyone.

Let me finish by apologizing again to anyone for the inconveniences created by the move. I sincerely hope this will remain the project location for many years to come.

2010-04-01

iptables rules for desktop computers

Filed under: Software — rg3 @ 13:30

Today I will show you the iptables rules I set on my main personal computer, with detailed comments about why I came to use them after several years of Linux desktop usage. The rules I use now have been simplified as much as I could, and are based on common rules and advice that can be found online, as well as on input from experienced network administrators. I’ve been using them unmodified for a few years. They are designed for desktop users either directly connected to the Internet or behind a router. They are a bit restrictive in some aspects, but, as we’ll see, you can easily punch a few holes in them for specific purposes. So here they are:

# iptables -v -L
Chain INPUT (policy DROP 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
 663K  905M ACCEPT     all  --  any    any     anywhere             anywhere            state RELATED,ESTABLISHED
  105  6300 ACCEPT     all  --  lo     any     anywhere             anywhere
    0     0 ACCEPT     icmp --  any    any     anywhere             anywhere            icmp destination-unreachable
    0     0 ACCEPT     icmp --  any    any     anywhere             anywhere            icmp time-exceeded
    0     0 ACCEPT     icmp --  any    any     anywhere             anywhere            icmp source-quench
    0     0 ACCEPT     icmp --  any    any     anywhere             anywhere            icmp parameter-problem
    0     0 DROP       tcp  --  any    any     anywhere             anywhere            tcp flags:!FIN,SYN,RST,ACK/SYN state NEW

Chain FORWARD (policy DROP 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

We’ll start with the most obvious rules. The FORWARD chain has a policy of “DROP” and no specific rules. A desktop computer isn’t usually employed as a router or to share an Internet connection, so there’s no reason to allow forwarding.

The OUTPUT chain has a policy of “ACCEPT” and no rules. Basically, we are allowing everything going out of our computer. While this is far from the most secure policy, it’s usually enough for a desktop computer. More paranoid people would not let everything out. For example, to prevent their computers from being used to send spam due to a compromise somewhere else, some people block outgoing SMTP traffic (destination port 25), or in general traffic originating from source ports below 1024, where most common services listen. We could do that, but I don’t think it’s really needed for a desktop computer. We’ll put more effort into blocking incoming traffic and keep a relaxed policy on outgoing traffic.
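For example, a stricter OUTPUT policy along those lines could include rules like these (not part of my rules, just a sketch of what such people do):

# Reject outgoing SMTP so a compromised desktop cannot deliver spam directly
iptables -A OUTPUT -p tcp --dport 25 -j REJECT --reject-with tcp-reset
# Drop outgoing packets that claim a privileged source port (i.e. local
# services replying to the outside), except on the loopback interface
iptables -A OUTPUT ! -o lo -p tcp --sport 0:1023 -j DROP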

Finally, the guts of the rules. The INPUT chain has a policy of DROP. That is, everything not explicitly allowed will be forbidden. Any packet that gets past all the rules without being accepted is silently discarded.

The rules in the INPUT chain are sorted according to the typical frequency of hits: “popular” and frequent traffic will be accepted quickly instead of having to be checked against many rules first. That’s why the first rule is to allow RELATED and ESTABLISHED traffic, for any protocol. The “any protocol” part is important. This is the rule that, basically, allows us to receive replies and normal traffic for connections we start ourselves. For example, when we open a web page with our web browser, we’ll send traffic one way and, when we receive the reply, the connection will be ESTABLISHED and we’ll see the reply. This first rule is the most important one because, thanks to it alone, we can use the computer “normally”.

The stateful packet firewall in Linux is quite clever and understands established connections even when the underlying protocol has no notion of connections. For example, that first rule allows us to receive DNS replies from queries we made ourselves, using the UDP protocol, or allows receiving ICMP echo replies from our own requests. In other words, we can ping other computers thanks to that rule.

On to the second rule, it looks like it would accept any traffic from anywhere, but the keyword here is lo:

  105  6300 ACCEPT     all  --  lo     any     anywhere             anywhere

This rule accepts all incoming traffic from interface “lo”, which is the loopback interface. This rule allows us to connect to services on our own machine by pointing to 127.0.0.1, or ::1 in IPv6. This rule would allow connecting to the CUPS printing service, for example, if we had a printer connected to our computer. A variant of this rule that can be frequently found on the Internet is to include a further check to verify the destination IP is 127.0.0.1, just to be more paranoid and forbid strange traffic. While this can increase security, I don’t think you need that further check generally. Just to clarify, browsing unsafe web pages with Javascript and/or Flash is more dangerous than not checking if traffic coming through “lo” is really directed to 127.0.0.1, so it’s not a priority.
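That more paranoid variant would look something like this, replacing the single “lo” rule above (a sketch; IPv6 would need the equivalent ip6tables rules with ::1):

# Accept loopback traffic only if it is really addressed to the loopback network
iptables -A INPUT -i lo -d 127.0.0.0/8 -j ACCEPT
iptables -A INPUT -i lo -j DROP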

Then, you can see I allow some specific types of ICMP packets that usually signal network problems. None of them requires a reply to be sent, so we accept them and let the system interpret what they mean if they ever come in. I don’t think it’s possible to get anything worse than a DoS attack through those rules, but comments are welcome. And, of course, you can be DoS’ed just by someone saturating your link with incoming traffic. Again, this is a matter of getting your priorities sorted. If you feel paranoid, just drop those rules.

Finally, at the end of the chain we have the famous rule that blocks incoming TCP traffic in state “NEW” without the SYN flag set. This rule is quite specific and an explanation for it can be found in many iptables manuals, FAQs and tutorials. I put it at the end because none of the previous rules is affected by it: not the first one, not the second one (we are allowing ALL traffic coming from “lo”, after all), and not the ICMP rules either.

However, we still keep it there, even though the traffic would be dropped anyway by the chain policy, because when we want to create a hole in these rules we do it by adding more rules at the end of the INPUT chain. For example, sometimes I want to allow incoming traffic to a specific port where I have configured a server that is supposed to be reached from other machines, to serve some content at a specific point in time. For that, I have created a couple of scripts called “service-open” and “service-close”, which take a list of service names or port numbers. For example, when I start a web server to let someone in my home network get a file from my computer, I usually run “service-open 8080” (the server would be listening on that port). Once the file is served, I run “service-close 8080” and shut the server down. Those commands add and remove rules at the end of the INPUT chain, which is why that last DROP rule goes where it does: it stays in place before any holes I punch through my firewall in those special cases. If you frequently run a P2P application on your computer, you may want to open a hole to some port permanently and save it as part of your usual rules. I don’t, so I keep everything closed.

The contents of my scripts are:

# cat /usr/local/sbin/service-open 
#!/bin/sh
if test $# -eq 0; then
        echo usage: $( basename $0 ) service ... 1>&2
        exit 1
fi
while test $# -ne 0; do
        /usr/sbin/iptables -A INPUT -p tcp --dport "$1" -j ACCEPT
        /usr/sbin/iptables -A INPUT -p udp --dport "$1" -j ACCEPT
        shift
done
# cat /usr/local/sbin/service-close
#!/bin/sh
if test $# -eq 0; then
        echo usage: $( basename $0 ) service ... 1>&2
        exit 1
fi
while test $# -ne 0; do
        /usr/sbin/iptables -D INPUT -p tcp --dport "$1" -j ACCEPT
        /usr/sbin/iptables -D INPUT -p udp --dport "$1" -j ACCEPT
        shift
done

Those scripts play nicely with my set of rules because they are designed with my rules in mind. Also, you can see they are dead simple.

With the set of rules I have described, you can use your computer normally, you can easily let more traffic through in specific cases and, more importantly, you’ll be “invisible” on the network. Nobody will know your computer is really there unless you send them traffic or they find out by other means. Also, it’s a very small set of rules: it’s easy to remember and understand, and easy to write scripts that modify it.

Edit: The commands needed to create those rules:

iptables -P FORWARD DROP
iptables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT 
iptables -A INPUT -i lo -j ACCEPT 
iptables -A INPUT -p icmp -m icmp --icmp-type 3 -j ACCEPT 
iptables -A INPUT -p icmp -m icmp --icmp-type 11 -j ACCEPT 
iptables -A INPUT -p icmp -m icmp --icmp-type 4 -j ACCEPT 
iptables -A INPUT -p icmp -m icmp --icmp-type 12 -j ACCEPT 
iptables -A INPUT -p tcp -m tcp ! --tcp-flags FIN,SYN,RST,ACK SYN -m state --state NEW -j DROP 
iptables -P INPUT DROP

2010-01-20

Managing Linux kernel sources using Git

Filed under: Software — rg3 @ 21:53

This will be a short and easy tutorial on how to use Git to manage your kernel sources.

Before Git, the easiest way to manage your kernel sources was to download the kernel using the tarballs provided at kernel.org and update it by applying the patches provided between releases, which kept the download size small compared to downloading a complete tarball each time. Also, by applying patches, you only needed to rebuild the things that changed between releases instead of the full kernel once more. This is a good method that can still be applied today and will probably never disappear. Simple HTTP and FTP downloads are very convenient in many situations.

However, with the arrival of kernel 2.6, its stable branches (e.g. the 2.6.32.y branch) and Git, there have been some changes. First of all, the patch-based process is now a bit more complicated. Stable patches are applied against the base release. If you have the kernel sources for version 2.6.32.1 and want to jump to version 2.6.32.2, you first have to revert the changes of release 2.6.32.1 (patch --reverse) and then apply the 2.6.32.2 patch. That is slightly less convenient and, furthermore, it touches every file changed by any patch up to that moment, which affects the build that follows. In other words, if patch 2.6.32.1 meant (hypothetically speaking) a long build because it changed things that affected a lot of subsystems, so will the build for every subsequent release in the 2.6.32.y branch. It was this small glitch that prompted me to manage my kernel sources the way I’m going to describe. Also, using Git is fun. :)
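For reference, the patch dance looks something like this (a sketch using the bzip2-compressed patches distributed at kernel.org; the file names are just an example):

cd linux-2.6.32
# Step back from 2.6.32.1 to the plain 2.6.32 base...
bzip2 -dc ../patch-2.6.32.1.bz2 | patch -p1 --reverse
# ...then forward again to 2.6.32.2
bzip2 -dc ../patch-2.6.32.2.bz2 | patch -p1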

We will try to achieve the following:

-------------------------------------------------> Linus Torvalds' master branch
           \                   \
            \                   \
             A stable release    Another stable release

We will have a master branch that will follow Torvalds’ master branch and will be updated from time to time, or when he releases a new stable version of the Linux kernel (e.g. 2.6.32).

We will have other local branches that follow the stable releases by Greg K-H (e.g. 2.6.30.y, 2.6.31.y, 2.6.32.y, etc).

Git is very flexible and simple, and allows more than one way to do things. I will try to explain why I do things this way and why they make sense to me, and will try to avoid shortcuts, i.e. I will use one command for each action even if two actions could be compressed into a single command.

First, we will create a directory to hold the kernel sources. Let’s name it /path/to/kernel. In it we’ll have a directory named “src” that will hold the unmodified kernel sources and a second directory named “build” that we’ll use to build the kernel and keep the sources intact, for clarity. We start by cloning Torvalds’ branch:

cd /path/to/kernel
mkdir build
git clone 'git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git' src

This will create a directory named “src” with the sources. Take into account you’ll be downloading the full repository with a lot of revision history. It’s a relatively long download that requires a lot of patience or a good broadband connection. Whatever you have at hand. At the moment I’m writing this, it’s several hundred MBs but less than 1 GB, if I recall correctly.

If you issue a “git branch” command you’ll see you only have a local branch named “master”. This local branch follows Torvalds’ master branch. You can update your kernel sources when you are in this branch issuing a simple “git pull” command.

Now, we will add a second local branch to follow the stable 2.6.32.y kernel. In other words, our master branch follows Torvalds’ master branch and our “branch_2.6.32.y” (let’s call it that way) will have to follow the master branch in the stable 2.6.32.y repository.

First, we create a shortcut to the 2.6.32.y repository for convenience:

git remote add remote_2.6.32.y \
    'git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.32.y.git'

The name “remote_2.6.32.y” is arbitrary. At this moment, the name is little more than an alias for that long URL. The next step is important: it downloads data into your repository under that name, so that the following git commands understand what you mean when you use it.

git fetch remote_2.6.32.y

After you run that, which will take considerably less time than the full repository clone we did previously, remote_2.6.32.y will have real meaning on your hard drive. You can then use the following command:

git branch --track branch_2.6.32.y remote_2.6.32.y/master

This will create a new branch in your local repository that tracks the master branch of the 2.6.32.y repository. If you issue a “git branch” command you’ll now see you have two branches. Being a “tracking branch” means several things: you can switch between the master branch and the new branch using “git checkout <branch name>” and, in each branch, you can perform a simple “git pull” to retrieve changes for that branch from its remote repository. From this point on you’re on your own using Git to manage the sources and perform more operations if you need them, but the many tutorials available on the web will teach you the basics of Git, and that’s all you need if your only purpose is to ease the downloading and building process.
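From then on, updating either branch is just a matter of switching to it and pulling. A quick sketch:

cd /path/to/kernel/src
git checkout branch_2.6.32.y
git pull    # downloads and merges only the new 2.6.32.y changes
git checkout master
git pull    # same thing for Torvalds' master branch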

Note that, between release 2.6.32.1 and 2.6.32.2, for example, you will only download the changes between those releases and a painful build for 2.6.32.1 does not have to mean a painful build for 2.6.32.2 if you update your sources this way.

Finally, we had previously created a “build” directory in parallel to the “src” directory, in order to keep the sources directory clean. Using it is easy: when we are in the “src” directory, any “make” command we use can (and should) be replaced by “make O=../build”. To avoid mistakes, I have created a global alias on my system called “kmake”, aliased precisely to “make O=../build”. It covers the regular user account that I use to compile the kernel sources and the root account that I use in the installation step, to perform the “modules_install”, “firmware_install” and “install” operations.
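The alias itself is trivial. For example, with bash it could be something like this (a sketch; put it wherever your shell reads aliases, both for your regular user and for root):

# In ~/.bashrc (and in root's shell initialization as well)
alias kmake='make O=../build'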

As a regular user account:

  • kmake menuconfig
  • kmake
  • kmake oldconfig
  • etc

As the root account:

  • kmake modules_install
  • kmake firmware_install
  • kmake install

These aliases could be tuned further to install the kernel image, modules, firmware, etc to a sandbox directory if you intend to create packages with them, for example. The README file in the kernel sources directory has more information about this topic.

