I love infomercial gifs

In a recent 5 minute lightning talk on docker, I got a little carried away with the gifs. So good.

 

 

docker: devicemapper fix for “device or resource busy” (EBUSY)

Audience:
This article is intended for folks familiar with docker and looking to fix particular issues encountered when using devicemapper storage/graph driver.

Overview:
While this is issue not exclusive to devicemapper, the mechanics currently involved in this driver cause it to be affected by this.

A couple of the more commons issues seen when using the ‘devicemapper’ storage driver is when trying to stop and/or remove a contianer.
In the docker daemon logs, you may see output like:

[error] deviceset.go:792 Warning: error waiting for device ac05cffda663a01cbc37879bc146fcd68d0f95b5b141f60da2b64579add1f4ef to close: Timeout while waiting for device ac05cffda663a01cbc37879bc146fcd68d0f95b5b141f60da2b64579add1f4ef to close

or more likely:

Cannot destroy container ac05cffda663: Driver devicemapper failed to remove root filesystem ac05cffda663a01cbc37879bc146fcd68d0f95b5b141f60da2b64579add1f4ef: Device is Busy
[8ad069f7] -job rm(ac05cffda663) = ERR (1)
[error] server.go:1207 Handler for DELETE /containers/{name:.*} returned error: Cannot destroy container ac05cffda663: Driver devicemapper failed to remove root filesystem ac05cffda663a01cbc37879bc146fcd68d0f95b5b141f60da2b64579add1f4ef: Device is Busy
[error] server.go:110 HTTP Error: statusCode=500 Cannot destroy container ac05cffda663: Driver devicemapper failed to remove root filesystem ac05cffda663a01cbc37879bc146fcd68d0f95b5b141f60da2b64579add1f4ef: Device is Busy

Diagnosis:
What’s happening behind the scenes is that devicemapper has established a new thin snapshot device to mount then container on.
Sometime during the life of that container another PID on the host, unrelated to docker, has likely started and unshared some namespaces from the root namespace, namely the mount namespace (CLONE_NEWNS). In the mounts referenced in this unshared host PID, it includes the thin snapshot device and its mount for the container runtime.

When the container goes to stop and unmount, while it may unmount the device from the root namespace, that umount does not propogate to the unshared host PID.
When the container is removed, devicemapper attempts to remove the thin snapshot device, but since the unshared host PID includes a reference to the device in its mountinfo, the kernel sees the device as still busy (EBUSY). Despite the fact the mounts of the root mount namespace may no longer show this device and mount.

Reproduction:
1) start the docker daemon (with debugging and forced devicemapper): `sudo docker -d -D -g devicemapper`
2) start a container: `sudo docker run -it busybox top`
3) run a pid with unshared mount namespace:
3.a) compile a simple C application: http://pastebin.com/HfSn8udJ
3.b) use the unshare(1) utility: `sudo unshare -m top`
4) stop or kill the container from step #2
5) watch the docker daemon logs

If you kill/quit the unshared application from #3 while the docker daemon is attempting to remove the container, then the daemon can clean up the container nicely. Otherwise there is cruft left around (in /var/lib/docker) and the daemon will not have record of the container to be removed.

Investigating:
Tooling to visualize this inheritance does not really exist yet, so deriving the culprit can take a little effort.
Sometimes the `perf` tool can quickly point out the culprit. Something like:

perf probe -a clone_mnt
perf record -g -e probe:clone_mnt -aR docker -d
[...]
perf report

Another option, is that while the container is running, get the mountinfo of the container from only the docker daemon’s pid:

> sudo grep $(docker ps -q | head -1) /proc/$(cat /var/run/docker.pid)/mountinfo
173 169 253:5 / /var/lib/docker/devicemapper/mnt/7c62e78ca18f88e152debfb0b40847c1486bcef14d40300154bf0c9e9800d824 rw,relatime - ext4 /dev/mapper/docker-253:2-4980739-7c62e78ca18f88e152debfb0b40847c1486bcef14d40300154bf0c9e9800d824 rw,seclabel,discard,stripe=16,data=ordered

Kill/stop the container (`docker kill ...` or just ‘q’ out of `top`).
Now that the container and device should be unmounted everywhere, lets grep for any PIDs still holding a reference:

sudo grep -l 7c62e78ca1 /proc/*/mountinfo
/proc/5132/mountinfo

We have our culprit. Find out the command:

ps -f 5132
UID PID PPID C STIME TTY STAT TIME CMD
root 5132 5131 0 17:47 pts/6 S+ 0:01 top

This is the unshared PID we started earlier.

Notice also, we can determine for sure they are on different mount namespaces by checking the following:

sudo ls -l /proc/$(cat /var/run/docker.pid)/ns/mnt /proc/5132/ns/mnt
lrwxrwxrwx. 1 root root 0 Nov 4 18:13 /proc/10388/ns/mnt -> mnt:[4026532737]
lrwxrwxrwx. 1 root root 0 Nov 4 18:13 /proc/5132/ns/mnt -> mnt:[4026531840]

Notice they are referencing different numbers here.

Now to cleanup, we can stop/kill this unshared PID, and `docker rm ...` our test container.

Solution:
The solution to this issue is to have the docker docker itself run in an unshared mount namespace. Unfortunately, due to the threading model of the golang runtime, this can not be encapsulated inside the docker daemon itself, but will have to be done for the invocation of the docker daemon.

The two ways to go about this is either with the unshare(1) utility or editing the systemd unit file for the docker.server (where ever this applies).

For systemd unit file (likely /usr/lib/systemd/system/docker.service, but may also be /etc/systemd/system/docker.service), include “MountFlags=private” in the [service] section. e.g.:

[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.com
After=network.target docker.socket
Requires=docker.socket

[Service]
ExecStart=/usr/bin/docker -d -H fd:// -H unix://var/run/docker.sock
MountFlags=private
LimitNOFILE=1048576
LimitNPROC=1048576

[Install]
WantedBy=multi-user.target

For the unshare(1) approach, you’ll have to find and edit the init script for your system. But the layout is as follows (and covers manually calling docker as well):

sudo unshare -m docker -- -d

The important piece being the “-m” flag for cloning the mount namespace, and the “--” for the separation before arguments to the docker executable itself.

What this will accomplish is at the launch of the docker daemon, it will get its own private mount namespace that is a clone of the root namespace at that point. The docker daemon is free to create devices, mount them, unmount them and remove the device, because none of that mount information will be in the root mount namespace nor subject to being cloned into other PID’s mount namespace.

Links:

Linking golang statically

If you are not familiar with Golang, do take the go tour or read some of the docs first.

There are a number of reasons that folks are in love with golang. One the most mentioned is the static linking.

As long as the source being compiled is native go, the go compiler will statically link the executable. Though when you need to use cgo, then the compiler has to use its external linker.

Pure go

// code-pure.go
package main

import "fmt"

func main() {
        fmt.Println("hello, world!")
}

Straight forward example. Let’s compile it.

$> go build ./code-pure.go
$> ldd ./code-pure
        not a dynamic executable
$> file ./code-pure
./code-pure: ELF 64-bit LSB  executable, x86-64, version 1 (SYSV), statically linked, not stripped

cgo

Using a contrived, but that passes through the C barrier:

// code-cgo.go
package main

/*
char* foo(void) { return "hello, world!"; }
*/
import "C"

import "fmt"

func main() {
  fmt.Println(C.GoString(C.foo()))
}

Seems simple enough. Let’s compile it

$> go build ./code-cgo.go
$> file ./code-cgo
./code-cgo: ELF 64-bit LSB  executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), not stripped
$> ldd ./code-cgo
        linux-vdso.so.1 (0x00007fff07339000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f5e62737000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f5e6236e000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f5e62996000)
$> ./code-cgo
hello, world!

wait, what?

That code that is using cgo is not statically linked. Why not?

The compile for this does not wholly use golang’s internal linker, and has to use the external linker. So, this is not surprising, since this is not unlike simple `gcc -o hello-world.c`, which is default to dynamically linked.

// hello-world.c

int main() {
        puts("hello, world!");
}

$> gcc -o ./hello-world ./hello-world.c                                            
$> ./hello-world                                                                   
hello, world!                                                                      
$> file ./he                                                                       
hello-world    hello-world.c  hex.go                                               
$> file ./hello-world                                                              
./hello-world: ELF 64-bit LSB  executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), not stripped 
$> ldd ./hello-world
        linux-vdso.so.1 (0x00007fff5f109000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f0906e53000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f090725e000)

But for that example, we just have to add the '-static' flag to gcc (and ensure that glibc-static package is available).

$> gcc -o ./hello-world -static ./hello-world.c
$ file ./hello-world
./hello-world: ELF 64-bit LSB  executable, x86-64, version 1 (GNU/Linux), statically linked, not stripped
$ ldd ./hello-world
        not a dynamic executable

Let's apply that same logic to our `go build`

static cgo

Using same code-cgo.go source, let's apply that gcc flag, but using the go build command.

$> go build --ldflags '-extldflags "-static"' ./code-cgo.go
$> file ./code-cgo
./code-cgo: ELF 64-bit LSB  executable, x86-64, version 1 (GNU/Linux), statically linked, not stripped
$> ldd ./code-cgo
        not a dynamic executable
$> ./code-cgo
hello, world!

Cool! Here we've let the go compiler use the external linker, and that linker linked statically from libc.

An explanation of the flags here.

--ldflags is passed to the go linker. It takes a string of arguments.
On my linux-x86_64 machine, that is the 6l tool (5l for arm and 8l for ix86). To see the tools available on your host, call go tool, and then get help on that tool with go tool 6l --help

'-extldflags ...' is a flag for the 6l linker, to pass additional flags to the external linker (in my situation, that is gcc).

"-static" is the argument to gcc (also to ld) to link statically.

gccgo love

Say you have a use case for using/needing the gccgo compiler, instead of the go compiler.
Again, using our same code-cgo.go source, let's compile the code statically using gccgo.

$> go build -compiler gccgo --gccgoflags "-static" ./code-cgo.go
$> file ./code-cgo
./code-cgo: ELF 64-bit LSB  executable, x86-64, version 1 (GNU/Linux), statically linked, not stripped
$> ldd ./code-cgo
        not a dynamic executable
$> ./code-cgo
hello, world!

Huzzah! Still a static binary, using cgo and giving gccgo a whirl. A quick run-down of these flags.

-compiler gccgo instructs the build to use gccgo instead of the go compiler (gc).

--gccgoflags ... are additional arguments passed to the gccgo command. See also the gccgo man page.

"-static" similar to gcc, this is the same flag the instructs gccgo to link the executable statically.

more info

If you're curious to what is happening behind the scenes of your compile, the go build has an -x flag that prints out the commands that it is running. This is often helpful if you are following what is linked in, or to see the commands used during compile such that you can find where and how to insert arguments needed.

ruby 1.9.3 is released, and slackware packages are here

Ruby 1.9.3 is *finally* released. Their NEWS and ChangeLog are available.


And I have packages built for slackware-current and slackware64-current here:

ruby-1.9.3_p0-i486-1_vb.txz [MD5] [ASC]

ruby-1.9.3_p0-x86_64-1_vb.txz [MD5] [ASC]


Again, these are built on the -current development branch, so use at your own risk. :)


Take care,
vb

Tastes in Languages: an internal dialogue

This has been a point of contention for me, ever since I have had a desire to learn a “language” other than shell scripting. Which language would be most practical, valuable, powerful, reach the most receptive audience, designed with the ideals you share, what pays the bills, etc. Some, or all, of these aspects can conflict at any given point.

As for being practical or powerful, these measurements are based on your use of the language, which vary more than there are languages, so it is a continually refining target. As I continue to learn more languages, and deepen my knowledge of others, it is evident that no one language can be the end-all. Learning their benefits and drawbacks is a large part of efficient implementation.

As for valuable, this can be in a business sense, or marketability since. For me, this is the first real point of indecision. Fortunately AND unfortunately, this measurement has a callback to another measurement, “What are the ideals and motives of the language’s design?”. Business sense can be a fuzzy option sometimes. If you are a business unit that is exclusively a Java shop, it does not mean that all glue and helpers must also be java. Inversely, just because Perl is flexible and powerful, does not mean it needs to be built into a megalithic solution. The marketability of a language though, can relate more to the individual developer.

The evolution of all the available languages comes due to varying factors and motivation, and that motivation assists greatly in the type of community and reputation that a language accrues. These are the ideals that a language is evolved with. This is the perception that can be reflected back on a developer. Like how PHP was formed by a loose group of code hackers, perl was an overly flexible parsing and report language, ruby was started by a single humble person seeking to please others. Thus is the seed that grows the community. As soon as you state what language you write in, the other person immediately conjures stereotypes to cast over you. Regardless of good or bad, it is another expression of the eight mundane concerns. If it is blame or disapproval that you seek to avoid, then you find yourself not associating with a broader audience. And seeking praise and approval, you turn in towards like minded folks to reaffirm your expressed position. This may work out just fine for many, but it does put you at a great chance of clinging to an us-and-them mentality, which can be poisonous for you and the community at large. I’m not going to expand on that greatly, you can find more insight on that, starting by reading on ingroups and outgroups

And for many, they may not be an advocate of the language they know best, but it is what has been taught abd/or what they are paid to write in. That is okay, because you can learn other languages, despite the stereotypes others cast on you from what you’ve written in. There are soon likely to be a multitude of .Net folks, regardless whether loyalist or indifferent, that need to branch out to new languages. Thankfully for the open source communities at large, these folks have fodder to learn from, and become involved with to broaden their own experience.

While I have languages that I am more comfortable with, than others, ultimately I am an advocate for being a language generalist. Determining the best tool for the job requires having a broad idea of the pros/cons of many options. Forcing a project into the box you are most familiar with, is not always the most efficient or effective approach.

Hacking on KDE: part 1

In a personal brainstorm of how to better facilitate folks that want to have an ideal hacking environment, for X. The primitives of the idea need to be figured out, and often nothing happens because you can determine “If the person has enough desire, and know how, they’ll figure it out,” and most of the time this is a fine explanation.

When it comes to a project collection like KDE, compiling it and fiddling around can be supremely easy without the bounds of package dependencies, but still manageable packages. Although, there are now a lot a projects in the KDE Software Collection, and they are all modular. Even within the Slackware community, alienBob has already updated the build process for his KDE SC 4.7 packages to be modular. With the migration from SVN to git, it has clarified some aspects of the build, but it has now made a segregated landscape of projects. If someone wants to check out a new project, the latest development in it, it seems like a hidden process.

Being agnostic of which Linux distribution I am running, if, as a developer, I want to be able to easily manage working from the git repositories of the projects, installing them, and possibly even packaging the artifacts, this is a intricate process. But thankfully, it is a describable process, therefore we can automate it.

Last night I began working on a project, tentatively named ‘kappy’. With the K App, in mind, and it is written in Python, so having a py in there seems suitable.
The source code is currently living at http://hashbangbash.com/kappy.git, or browser viewable at http://hashbangbash.com/git/?p=kappy.git;a=summary

Initially it is just doing XML parsing of the projects available. Next plans are:

  • making a caching ground for cloning/updating git repositories
  • building the software
  • having user definable configurations for flags and install paths
  • making recipes, of a suite of software to build
  • having a manager to package, for a respective distribution

Update: I’ve added a bug tracker for this project http://redmine.hashbangbash.com/projects/kappy

If you have feedback, feel free to send it in.

Take care,
vb

Good Times

This year has been a nice for socializing with nerds. Attending conferences is something that can get tiresome, if the content is something you do not find interesting or stimulating. Thankfully, I have no required conferences, so I can be choosy (schedules permitting). Naturally open source and languages would top the list of places to attend.

First off, was hanging out with the KDE folks in San Francisco, CA for CampKDE – April 4,5 2011. There were a number of good talks, and a great opportunity to shake the hands of names that I have seen around, as well as meeting many new folks. The kde-promo team has a YouTube channel, that they have published all the talks and interviews to. http://www.youtube.com/user/kdepromo
The talk I gave there is titled “Slackware: Quickly and Easily Manage Your KDE SC Hacking.” You can get the slides in [PDF] or [ODP], plus the videos posted on the kde-promo channel have the full talk (youtube.com/watch?v=Qs7vR3POHeo), as well as an interview afterwards by Wade Olson (youtube.com/watch?v=YIpUmPul1i4).

Next, was down to Spartanburg, SC for the SouthEast Linux Fest (SELF) – June 10-12, 2011.
This is the third year that I have attended SELF, and second time to speak, but what differentiated this year from any other speaking engagement (in the past, or distant future), was that it was a talk title “Slackware Demystified”, and none other than the founder of Slackware Linux, Patrick Volkerding was not only in attendance, but sitting on the front row! The slides from this presentation are available in [HTML] or [PDF]. Unfortunately, the videos have not been published yet. Hopefully they will actually get them published, unlike the previous two years…

Lastly … so far, was a local conference, that I did not speak at, only attended. JrubyConf – August 3-5, 2011. While I use MRI Ruby much more than JRuby, this conference was a great way to be around and hear from many brilliant folks (Like Wayne Seguin, Charles Nutter (headius), Nick Sieger, and Jim Weirich, to name a few), plus I felt that I needed to make up for missing out on RubyConf taking place in Baltimore, MD.

The KDE folks strongly encouraged making it to the DesktopSummit, which was hosted in Berling, Germany this year. While it was surely appetising to think of attending, it did not work out this time. It would have been nice to shake hands with some fellow contributors, like Eric Hameleers (alienBob). Better luck next year.

All good times, I look forward to next year, or event the rest of 2011.

Feel free to leave feedback on the talks.

Take care,
vb

More 1337 for your Slackware-13.37 release

If you are one that is not afraid to recompile your kernel, then here is a little treat for you.
http://slackware.com/~vbatts/things/linux-2.6.37.6-logo_slk.patch.gz

This patch applies to the Linux kernel source stored in /usr/src/linux of your Slackware Linux 13.37 install. If you do not have a ‘.config’ file present, then do

#> zcat /proc/config.gz > /usr/src/linux/.config
#> cd /usr/src/linux
#> zcat $PATH_TO_DOWNLOADS/linux-2.6.37.6-logo_slk.patch.gz | patch --backup -p2

Assuming there are no errors, then you can proceed along with

#> make bzImage
#> cp arch/x86/boot/bzImage /boot/vmlinuz-1337-2.6.37.6

If you are running the ‘-smp’ kernel, then append ‘-smp’ to the kernel version. To determine if you are, then run
$> uname -r

The assumption with this option, is that this new vmlinuz will be using the kernel modules provided with the package
‘kernel-modules-2.6.37.6*’.


IF you want to differentiate this kernel from the stock installed kernel, open the ‘.config’ file, search for the CONFIG_LOCALVERSION line, and set the value to something brilliant, like


CONFIG_LOCALVERSION="_1337"

This however, will require to (re)compile all the modules and the bzImage kernel. So then it may be just as easy to do


#> make tar-pkg
#> cd tar-install
#> rm -rf boot/vmlinux* lib/firmware/
#> ts=$(date +%s)
#> makepkg -l y -c y ../linux_2.6.37.6_1337-${ts}-$(uname -m)-1_mine.tgz
#> installpkg ../linux_2.6.37.6_1337-${ts}-$(uname -m)-1_mine.tgz

Explanation of events here:

  • the vmlinux-* is usually around 18mb, and primarily used when debugging.
  • the lib/firmware/ would clobber the stock kernel-firmware package. no need in doing that.
  • the ${ts} is just a timestamp of epoch time. nothing fancy.
  • makepkg will create a Slackware package with a distinguished name and version, so as to not interfere with the stock packages
  • change the ‘_mine’ in the package name, if you have your own tagging name for packages you’ve created for your own system.


In Either Event, you will need to:

  • create an initrd, with mkinitrd, if you use one
  • adjust your /etc/lilo.conf accordingly
  • re-run `lilo`
  • reboot
  • enjoy the l33t penguins


If You Want to fetch the kernel source, rather than use /usr/src/linux, download it here http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.37.6.tar.bz2


Also, so that you are not suspect of malicious kernel patches. The patch adds ‘drivers/video/logo/logo_slk_clut224.ppm’, enables this logo image to be chosen with the config option of ‘CONFIG_LOGO_SLK_CLUT224′.

Then BEHOLD!


Take care,

vb

More Red-Black Tree, but a little Ruby Benchmarking

After hearing a bit about the improvements in the Rubinius ruby interpreter, I decided to test a handful of the interpreters against the Red-Black tree insertion script I last posted about.


As a forward, this is completely out of idle curiosity, and with no deep intentions. The results had distinct winners and losers, and for more rich benchmark, this would need to include multiple iterations and some other aspects as well. All that to say, this is not an all encompassing benchmark.
The results are the length (in seconds) of time for the execution.


So using the rbtree.rb, that I last blogged about, using rvm, I have just run it through the following ruby interpreters:

  • ruby 1.9.1p430 (2010-08-16 revision 28998) [x86_64-linux]
  • rubinius 1.3.0dev (1.8.7 5bd938b5 xxxx-xx-xx JI) [x86_64-unknown-linux-gnu]
  • rubinius 1.2.0 (1.8.7 release 2010-12-21 JI) [x86_64-unknown-linux-gnu]
  • jruby 1.5.3 (ruby 1.8.7 patchlevel 249) (2010-09-28 7ca06d7) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_23) [amd64-java]
  • ruby 1.8.7 (2010-12-23 patchlevel 330) [x86_64-linux], MBARI 0x6770, Ruby Enterprise Edition 2011.01
  • ruby 1.9.3dev (2011-02-14 trunk 30873) [x86_64-linux]



From an early look at it, the Rubinius interpreter is on to something good. For this example at least, their current release branch (1.2) is smoking, compared to anything else.
To look at the values, that make up the graph above, here is the spreadsheet


Take care,
vb

Kick off of playing with trees

As I have stated before, I enjoy the speed at which I can prototype an idea in Ruby. Even if ultimately that idea runs better on another language. For the sake of my academics, it provides a super learning playground.

Earlier, this past week, I began reading up on papers and documentation of trees. All in an effort to differentiate the various implementations of trees, including their benefits and drawbacks. Finding AVL trees, B trees, 2-3 node, 2-3-4 node and RB trees. I decided to kick off the events with a Red-Black tree. Some of the papers I found are the generally available by Google searching.

After a lot of learning, and a little massage, I have the first of a an “academic” implementation of an RB tree (http://hashbangbash.com/~vbatts/rbtree.rb). I gravitated towards this tree first because of the type searches I am needing. While the population of this slows as it grows, the searches are ridiculously fast. I’ll get around to generating some instrumentation to graph it. For now I have the run_loop() to return @results of the populating trees. The sets are [ , , , ]

The insertion graphed was an O(N) climb. For doubling the number of nodes to be inserted (of random unique numbers), the time increase was just a tad more than double. The @results for run_loop(1,000,000) was
=> [[1000000, 43.4912700653076, 24, 16], [500000, 20.5859882831573, 20, 15], [250000, 9.5372850894928, 20, 14], [125000, 4.50037479400635, 21, 13], [62500, 2.01573848724365, 16, 12], [31250, 0.941669225692749, 14, 11], [15625, 0.435366868972778, 14, 11], [7812, 0.252951860427856, 13, 10], [3906, 0.0940215587615967, 11, 9], [1953, 0.043536901473999, 13, 8], [976, 0.0200107097625732, 9, 8], [488, 0.00921034812927246, 9, 6], [244, 0.00417494773864746, 10, 5], [122, 0.00190210342407227, 6, 5], [61, 0.000821590423583984, 5, 4], [30, 0.000351667404174805, 4, 3], [15, 0.000170707702636719, 4, 2], [7, 6.03199005126953e-05, 2, 1], [3, 2.57492065429688e-05, 1, 1]]

which resulted in a time climb of




Another interesting observation was the logarithmic oscillations of the depth of the left most leg.


At any rate, enjoy and feel free to send your thoughts.

Take care,

vb