Securing firmware images with ansible

When using YOCTO for building your software, there always comes the point where everything is woven together into something bigger - usually all your recipes are combined into an image, which can be used directly for booting an embedded device.
As this is the final step, extra care should be taken when it comes to checking security.

On a recipe level you can make your component as fortified as you can imagine - but if another component doesn't care about security at all, your whole concept will fall to pieces.
  • So how do I check on security when an image is created? 
  • What do I need to check for?
  • Are there any helpful tools around, which can support me?
To answer the last question first - YES, there are some very useful tools around, for instance ansible.
Ansible is a very flexible, extensible batch processor, which can automate most administrative tasks quite conveniently.

As the result of a YOCTO build is mostly a more or less fully featured Linux system, you can stick to the many admin guides about security on the web, e.g. #1 or #2.

Now for my answer to the first question: By running ansible as part of static code analysis.

Quick example

Objective: ensure that all files under /etc are NOT writable by an average user

Step 1: Create a playbook

Ansible can be used to run a "playbook" (which is some sort of more intelligent shell script, if you want to break it down to the core) - more on playbooks can be found for instance here.
A playbook can consist of multiple tasks, which run commands and so on on the target.
A sample could look like this:

- name: checks
  gather_facts: false
  hosts: 127.0.0.1
  connection: local
  tasks:
    - name: "/etc exists as a dir"
      file:
        path: "/etc"
        state: directory
If you now ran this playbook with
ansible-playbook <filename>
it would check whether the folder "/etc" exists on the local system.
So we need to tell ansible that we want to check the code inside the YOCTO build tree.
You could change the path "/etc" to your image-rootfs path and it would work, but that seems neither very useful nor generic.
This is where the so-called inventory comes into play.

Step 2: Create an inventory 

In very simple words, the inventory is a storage for global information. So we could create a global variable which points to the actual path inside the rootfs.
This could look like this:
all:
  vars:
    sysconfdir: $IMAGE_ROOTFS/$sysconfdir
Now the placeholders "$IMAGE_ROOTFS" and "$sysconfdir" can be replaced by bitbake with the real values, with the help of some shell or python magic inside of bitbake. You can have a look at my bbclass to see how this could be achieved.
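As a minimal sketch of that substitution step (the rootfs path below is a made-up example; in a real bbclass the values would come from d.getVar("IMAGE_ROOTFS") and d.getVar("sysconfdir")):

```python
from string import Template

# Inventory template with the same placeholders as above; string.Template
# uses the ${NAME} syntax, which maps nicely onto bitbake variable names.
INVENTORY_TEMPLATE = """\
all:
  vars:
    sysconfdir: ${IMAGE_ROOTFS}${sysconfdir}
"""

def render_inventory(image_rootfs, sysconfdir):
    """Replace the placeholders with the values bitbake knows."""
    return Template(INVENTORY_TEMPLATE).substitute(
        IMAGE_ROOTFS=image_rootfs, sysconfdir=sysconfdir)

# Example: render_inventory("/build/tmp/work/myimage/rootfs", "/etc")
```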

Now we need to tell the playbook to use this variable:
- name: checks
  gather_facts: false
  hosts: 127.0.0.1
  connection: local
  tasks:
    - name: "/etc exists as a dir"
      file:
        path: "{{ sysconfdir }}"
        state: directory
When you now run the playbook with
ansible-playbook -i <inventory-file> <filename>
you are performing the same check on the correct paths.

Step 3: Making the output machine readable

As you may have noticed while trying these examples, the standard output of ansible is intended more for humans. As I always want a solution that can run on a headless system, a more machine-readable output should be used.
More recent versions of ansible can output the results as JSON - so for me this comes in handy.
Just try
ANSIBLE_STDOUT_CALLBACK=json ansible-playbook -i <inventory-file> <filename>
and you will see a lot of information being presented as JSON.

Step 4: Only inform when things are not the way they should be

In general, ansible does not only check but also changes the things described in the playbook.
But this would (maybe) change something in the image without curing the source of the problem, so IMHO it's better NOT to change the image but to inform the user that an issue has been found.
This can be achieved by running ansible in "check mode" by issuing ANSIBLE_STDOUT_CALLBACK=json ansible-playbook --check -i <inventory-file> <filename>
Now all changes are reported but not actually performed.
With this in mind I wrote a little parser which simply scans the JSON tree for elements with the attribute "changed": true. These are extracted and turned into a reasonable message, which can be used to signal that an issue has been found.
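A sketch of such a parser - the nesting assumed here (plays -> tasks -> hosts) is the layout of ansible's json stdout callback, and the message format is just an example:

```python
import json

def collect_findings(report):
    """Walk ansible's JSON output and collect every task result that
    reports "changed": true - in --check mode these are the violations."""
    findings = []
    for play in report.get("plays", []):
        for task in play.get("tasks", []):
            name = task.get("task", {}).get("name", "<unnamed>")
            for host, result in task.get("hosts", {}).items():
                if result.get("changed"):
                    findings.append("%s: task '%s' flagged an issue" % (host, name))
    return findings

# Example: collect_findings(json.load(open("ansible-output.json")))
```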

Step 5: The original issue

With what we have learned so far it's fairly easy to set up a playbook which checks that all files under "/etc" are NOT writable by an average user.
The playbook looks like this:
- name: /etc-checks
  gather_facts: false
  hosts: 127.0.0.1
  connection: local
  tasks:
    - name: "Files in /etc are not writable by user"
      file:
        path: "{{ sysconfdir }}/{{ item.path }}"
        mode: "o-wx"
      with_filetree:
        - "{{ sysconfdir }}"
      when:
        - item.state == "file" 
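For illustration, here is the same check expressed directly in python - the playbook above is the mechanism actually used, this is just a sketch of what it effectively tests (flagging regular files with the write or execute bit set for "others", mirroring mode "o-wx"):

```python
import os
import stat

def writable_by_others(root):
    """Return all regular files under root that have the others-write
    or others-execute permission bit set."""
    offenders = []
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            path = os.path.join(dirpath, fname)
            mode = os.lstat(path).st_mode
            # skip symlinks and other non-regular files
            if stat.S_ISREG(mode) and mode & (stat.S_IWOTH | stat.S_IXOTH):
                offenders.append(path)
    return offenders
```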

Integration into SCA 

I've prepared a fully featured implementation to run ansible as part of the image creation in YOCTO - just have a look at my layer. As a result, every "violation" found during the ansible run will be reported as an error to static code analysis.

Further development

I'll add further playbooks in the future to cover more aspects of security. If you like, you can join in by creating pull requests or issues over on GitHub.
