
Cast some magic, prevent the blame

What is this about?

I know, I know - strange title for a blog post.
When dealing with YOCTO you mostly spend your time in a single recipe while developing (at least I do), so everything is in one place - not a big deal.
But an image consists of many such packages, and they should all play well together.

Sometimes there are "global" components (like a web server) which are used by many other components at the same time.

And here comes the issue. I had the pleasure of putting an nginx-based web server together, configured by a bunch of configuration fragments produced by a lot of different packages.
In theory this works quite well, as you simply adjust things where needed and everything else should be handled by bitbake - well, in theory...

The issue was that a single configuration fragment contained a typo, which didn't get noticed until the whole construct was started on the target device - I personally think this is way too late.

There has to be a chance of catching such issues as part of the build. The example I've chosen might not look that bad, as "it's just the web server", but think of any other service which is started automatically (maybe based on strange conditions) and keeps crashing due to a single typo - maybe an authentication server that won't start, ultimately leading to either everybody being able to authenticate (bad design) or nobody being able to log in (bad user experience).

Finding a loophole

These issues only become visible when the whole image is put together, so there is no real chance to catch them at the package level. Also, most of those configuration systems rely on more or less hard-coded paths, which makes it even trickier to catch those little fellows.
So I had a look at where to insert such a check after the root filesystem (task: do_rootfs) has been created. What I found was pseudo, a tool which tries to mimic a chroot via LD_PRELOAD by adjusting all paths of the processes forked from the pseudo process, so they appear to run on the "real" filesystem instead of just a subdirectory somewhere on the system.
This is an integral step within the rootfs creation of OpenEmbedded.

What it doesn't address is that all binaries within that directory are compiled for a different architecture (at least mostly) - so the result of a command like

pseudo /path/to/my/rootfs/bin/bash

is "cannot execute binary file: Exec format error", so this isn't going to work.

Breaking the cross-compile barrier

Years ago I read about something called qemu user mode - in this mode qemu emulates not a whole system but only a single process. This mechanism is also used in standard YOCTO, for example to run PGO (profile-guided optimization) as part of the python build.
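To illustrate what user mode does (a generic sketch, not taken from my setup - the sysroot path and the binary are placeholders):

# run a single foreign-architecture binary directly on the build host;
# -L tells qemu where to find the target's dynamic loader and libraries
qemu-i386 -L /path/to/target/sysroot /path/to/target/sysroot/bin/ls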

So with this in mind, why not combine those two ideas and run a qemu-user-mode binary in a chroot?

Well, what shall I say - if it had been that easy, it clearly wouldn't have needed a dedicated blog post, agreed?

Things that didn't work here as expected
  • Normally the standard YOCTO qemu is linked dynamically, which means the required libraries can't be found when being "trapped" in a chroot (as they are located somewhere outside of the chroot).
    What you could do is copy all the required files into the rootfs - but you need to identify them first (see the quick check right after this list) and keep your fingers crossed that no file with the same path/name already exists in your rootfs.
  • The LD_PRELOAD trick doesn't work well for the processes forked from qemu itself, and that's basically what I was trying to achieve.
    Often you get warnings that libpseudo.so can't be loaded, as it is located outside of the chroot.
  • The usage of pseudo is based on a lot of environment variables which aren't that well documented.
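A quick way to see what you would be up against with a dynamically linked qemu (just an illustration, run on the build host and assuming the qemu binary is in PATH):

# shows whether the qemu user-mode binary is dynamically linked
# and which libraries it would have to find inside the chroot
file $(which qemu-x86_64)
ldd $(which qemu-x86_64)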
At this point I just took a break from the challenge, as I had no idea how to proceed any further...

Then suddenly...

I was reading an article in the ArchLinux wiki about something completely different, but at the end a link to a tool called proot popped up.
Proot is implemented differently and has built-in support for calling a qemu user mode emulator.
After launching a devshell on my image recipe and typing

proot -r /path/to/my/rootfs/ -q qemu-x86_64 /bin/bash

I got a console from the cross-compiled "bash" of my image - a truly amazing moment after all the work I had sunk into this.

As the devshell approach was working, I went on to "hack" it into a recipe task.
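The wiring looks roughly like this (just a sketch - the task name, the ordering and the check command are illustrative, and proot plus the qemu user-mode binary are assumed to be reachable by the task):

do_check_rootfs() {
    # run a check command inside the freshly built rootfs
    proot -r ${IMAGE_ROOTFS} -q qemu-x86_64 /bin/sh -c "nginx -t"
}
addtask check_rootfs after do_rootfs before do_image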

Blame your environment

I ran the freshly coded task and was ready to get a coffee, but the task only returned an error - what happened?
Hours of debugging through the code didn't unravel any obscure mistake on my side - the devshell approach kept working like a charm the whole time.

Finally (as the last possible option) I had a look into the "run.my_task" file in the log folder of my image recipe. And there it was:

unset SHELL
unset PATH

Now it was clear to me what the difference between my devshell approach and the "fully automated" approach was - missing environment variables.
There was no standard SHELL and a non-standard PATH set. Proot and qemu user mode initially fork a standard shell to run the code.

I quickly hacked

export SHELL=/bin/sh 
export PATH=$PATH:/bin:/usr/bin

into my recipe and voila it was working!

More pitfalls to come

Normally I do an x86-64 build on an x86-64 machine, and everything was fine.
To get everything right I thought it might be a good idea to test the stuff I had coded for a different architecture too - I picked i586.

I ran the code and guess what (you expected this, right?) - it didn't work.
The invocation of qemu-i386 (which does the emulation for i386/i586) crashed with a strange message like

/lib/i386-gnu-linux/libc.so: file not found

So I took a look at the issue tracker of proot and found an active issue saying that emulating x32 on x64 through this tool doesn't work at the moment. I thought: damn!

At this point I remembered an article I had noticed years ago, which basically said the following steps are needed to run a cross-compiled tool in a chroot:
  • statically compile qemu
  • copy the statically compiled qemu into the rootfs folder
  • chroot into the rootfs
  • run the static qemu from there
So as a kind of last resort I tried exactly that:
  • statically compile qemu with a forked recipe of the original YOCTO recipe
  • copy the statically compiled qemu into the rootfs folder (now called qemu-<arch>-static)
  • chroot into the rootfs (with the help of proot)
  • run the static qemu from there
Now I was getting somewhere.

Polishing

As I needed to copy files into the rootfs, I would have changed the outcome that gets shipped to the customer, so I decided to just create a local copy of the rootfs folder and run everything else there.
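Something along these lines (just a sketch - the variable names assume an image recipe context):

# work on a throwaway copy so the shipped rootfs stays untouched
cp -a ${IMAGE_ROOTFS} ${WORKDIR}/rootfs_copy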

After ~1.5h I had everything together. If I now run 

export SHELL=/bin/sh 
export PATH=$PATH:/bin:/usr/bin 
proot -r /path/to/my/rootfs_copy/ -q qemu-i386 /bin/sh -c "nginx -t"

the return value of the last step will tell me at build time if the resulting nginx config can be applied on the real target or not.
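Wrapped into the task sketched earlier, a non-zero return value can then abort the build (again just a sketch - bbfatal and the variable names are illustrative, not the exact recipe code):

do_check_rootfs() {
    export SHELL=/bin/sh
    export PATH=$PATH:/bin:/usr/bin
    # nginx -t exits non-zero on a broken configuration, which fails the task
    if ! proot -r ${WORKDIR}/rootfs_copy/ -q qemu-i386 /bin/sh -c "nginx -t"; then
        bbfatal "nginx configuration check failed in the image rootfs"
    fi
}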

Finally I did achieve my goal: I cast some magic (proot+qemu) to prevent the blame (of a non-working image).

Further usage

I took advantage of this setup and also let tools like lynis run on my rootfs, so I can get insights about the overall security of my resulting image without much effort - a real help when one has to decide whether an image is worth shipping or not.
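Kicking that off is just another command through the same mechanism (just a sketch - it assumes lynis is installed in the image or copied into the rootfs copy):

# run a lynis security audit against the cross-compiled rootfs
proot -r /path/to/my/rootfs_copy/ -q qemu-x86_64 /bin/sh -c "lynis audit system"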
Check out my other use cases of proot+qemu over at my static code analysis layer.

If you liked this post, have anything to add or simply want to have an argument, feel free to drop a comment!
