Anyone who has dealt with container engines has come across Go - a wonderful language that was built to do the right way what C++ intended to do.
The language itself is pretty straightforward, and upstream poky has supported it for ages...
In the go world one would just run
```
go get github.com/foo/bar
go build github.com/foo/bar
```
and magically the go ecosystem would pull all the needed sources and build them into an executable.
This is where the issues start...
In the OpenEmbedded world, one would have
- one provider (aka recipe) for each dependency
- each recipe comes with a (remote) artifact (e.g. a tarball, a git repository, and so on) which can be archived (so one can build the same software at a later point in time without any online connectivity)
- dedicated license information
All this information is pretty useful when working in an environment (aka company) that has requirements such as
- reproducible builds
- license compliance
- security compliance (for instance no unpatched CVE)
But when using Go, all of that is only available for the very top-level repository.
But what is so different about Go?
Internally, Go uses a file called go.mod - a typical example looks like this
```
module cloud.google.com/go/firestore

go 1.11

require (
	cloud.google.com/go v0.81.0
	github.com/golang/protobuf v1.5.2
	github.com/google/go-cmp v0.5.5
	github.com/googleapis/gax-go/v2 v2.0.5
	google.golang.org/api v0.44.0
	google.golang.org/genproto v0.0.0-20210415145412-64678f1ae2d5
	google.golang.org/grpc v1.37.0
)
```
First comes the module name, then the minimum required Go version, and then a larger block of dependencies. A line such as
```
cloud.google.com/go v0.81.0
```
basically says that we need the code of the repository cloud.google.com/go, tagged as version v0.81.0.
So the Go toolchain will pull these sources and unpack them into the build workspace without any further action required from you.
Which also implies that this operation can pull whatever sources, from whatever host in whatever version :-(
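To make the structure of go.mod a bit more tangible, here is a minimal sketch of a parser for it. It is not the tooling Go itself uses; the function name `parse_go_mod` is made up for this illustration, and it only understands the `module`, `go` and `require` directives (`replace`, `exclude` and `retract` are simply ignored).

```python
import re


def parse_go_mod(text):
    """Return (module_path, go_version, requirements) from go.mod text.

    Hypothetical helper: only `module`, `go` and `require` are handled.
    """
    module = None
    go_version = None
    requires = []
    in_block = False
    for raw in text.splitlines():
        line = raw.split("//", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        if in_block:
            if line == ")":
                in_block = False
            else:
                path, version = line.split()[:2]
                requires.append((path, version))
        elif line.startswith("module "):
            module = line.split()[1]
        elif line.startswith("go "):
            go_version = line.split()[1]
        elif re.match(r"require\s*\($", line):
            in_block = True  # start of a multi-line require ( ... ) block
        elif line.startswith("require "):
            path, version = line.split()[1:3]  # single-line form
            requires.append((path, version))
    return module, go_version, requires


EXAMPLE = """\
module cloud.google.com/go/firestore

go 1.11

require (
	cloud.google.com/go v0.81.0
	github.com/golang/protobuf v1.5.2
)
"""

module, go_version, requires = parse_go_mod(EXAMPLE)
```

Note that this only yields the direct requirements of one module; each of those modules has a go.mod of its own.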
Go and unknown license information
As you can probably imagine, dependencies have dependencies of their own, so a lot of code gets pulled in just to make your top-level repository compile.
All the metadata of these dependencies and the dependencies of the dependencies is not known to bitbake at all.
Just imagine the following (and yes that is a slightly adapted real world example)
- you are not allowed to use GPL-v3.0 in your product
- you are using a go module which is licensed MIT
so everything looks good at first glance - but if you look into the dependency chain
github.com/foo/bar [MIT], pulls
github.com/bar/baz [Apache-2.0], which pulls
github.com/some/other [GPL-v3.0]
As there is no real way to determine whether code from github.com/some/other actually lands in the executable of github.com/foo/bar, one has to consider the license information of both github.com/foo/bar and github.com/bar/baz to be wrong - both have to be treated as GPL-v3.0, which is exactly what you can't use...
Without in-depth analysis that would remain completely hidden from you.
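The kind of analysis meant here boils down to walking the whole dependency chain and collecting every license on the way. A minimal sketch, using the placeholder module names from the example above (the function name `effective_licenses` and the two dictionaries are assumptions made for this illustration, not part of any tool):

```python
def effective_licenses(module, depends, licenses):
    """Collect the licenses of `module` and of everything reachable from it."""
    seen = set()
    found = set()
    stack = [module]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already handled; also protects against circular chains
        seen.add(current)
        found.add(licenses[current])
        stack.extend(depends.get(current, []))
    return found


# The dependency chain from the example above.
DEPENDS = {
    "github.com/foo/bar": ["github.com/bar/baz"],
    "github.com/bar/baz": ["github.com/some/other"],
}
LICENSES = {
    "github.com/foo/bar": "MIT",
    "github.com/bar/baz": "Apache-2.0",
    "github.com/some/other": "GPL-v3.0",
}

result = effective_licenses("github.com/foo/bar", DEPENDS, LICENSES)
violations = result & {"GPL-v3.0"}  # licenses we are not allowed to ship
```

With only the top-level MIT declaration visible, `violations` would look empty; with the full chain it is not.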
Go and non-reproducible builds
As the dependencies are stored only in the go.mod file of the repository you're trying to build, there is no way to have the complete set of needed sources before starting the actual compile run.
One could manually create recipes for all dependencies and inject them with the help of DEPENDS, **but** this is not how the go community likes to play this game...
Another real world example
- golang.org/x/tools requires golang.org/x/net
- golang.org/x/net requires golang.org/x/text
- golang.org/x/text requires golang.org/x/tools
So creating one recipe per dependency is off the table: bitbake cannot resolve DEPENDS that form a circle.
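Such a circle is easy to spot once the requirements are available as a graph. The following is a small depth-first search sketch over the three modules listed above; `find_cycle` is a made-up helper name and the graph contains only the edges mentioned in this post, not the full go.mod files.

```python
def find_cycle(graph, start):
    """Return one dependency cycle reachable from `start`, or None."""
    path = []        # modules on the current search path, in order
    on_path = set()  # same content as `path`, for fast lookups
    done = set()     # modules that are fully explored and cycle-free

    def visit(node):
        if node in on_path:
            # we walked back into our own path: that is the circle
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        on_path.add(node)
        for dep in graph.get(node, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        done.add(node)
        return None

    return visit(start)


REQUIRES = {
    "golang.org/x/tools": ["golang.org/x/net"],
    "golang.org/x/net": ["golang.org/x/text"],
    "golang.org/x/text": ["golang.org/x/tools"],
}

cycle = find_cycle(REQUIRES, "golang.org/x/tools")
```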
Go and the "missing" security
As we all know, it is usually best to use the latest and greatest open source releases when it comes to bugfixes and patched security issues - again something that the Go community decided to hide behind some magic super daemon that does all the magic for you.
If we have a look at https://golang.org/ref/mod#minimal-version-selection it is in the end somehow nondeterministic which version of a module is actually pulled for compilation, as
- a module might have withdrawn a version in the meantime
- a module replaces an interface with a forked version
Which leaves us with the only possible conclusion: the Go ecosystem isn't very well suited to be used with OpenEmbedded.
But what if I told you it could...
First of all we need to have archivable sources - this is where I came across https://proxy.golang.org, which does provide you with an API to query versions, artifacts and much more.
For instance we can get the available versions
```
# wget https://proxy.golang.org/cloud.google.com/go/@v/list
v0.26.0
v0.36.0
v0.15.0
v0.69.1
v0.45.0
v0.55.0
v0.46.3
v0.68.0
v0.50.0
v0.7.0
v0.37.4
v0.40.0
v0.37.2
v0.3.0
v0.37.0
v0.33.1
v0.52.0
v0.8.0
v0.43.0
v0.6.0
v0.33.0
v0.10.0
...
```
or ask for the latest version
```
# wget https://proxy.golang.org/cloud.google.com/go/@latest
{"Version":"v0.81.0","Time":"2021-04-02T19:10:02Z"}
```
and we can even download a zip file containing the corresponding sources
```
# wget https://proxy.golang.org/cloud.google.com/go/@v/v0.81.0.zip
```
So we can put a check mark on "archivable sources" and even on "using the latest and greatest" (as all the needed information is covered by the API).
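The three endpoints used above are regular enough to be generated. A small sketch (the helper names are made up for this illustration; nothing is downloaded here, only the URLs are built). One detail that is easy to miss: the proxy protocol expects upper-case letters in module paths and versions to be written as `!` followed by the lower-case letter.

```python
PROXY = "https://proxy.golang.org"


def escape_path(path):
    """Case-encode a module path or version for the proxy:
    every upper-case letter becomes '!' plus its lower-case form."""
    return "".join("!" + c.lower() if c.isupper() else c for c in path)


def list_url(module, proxy=PROXY):
    """URL returning all known versions of a module, one per line."""
    return f"{proxy}/{escape_path(module)}/@v/list"


def latest_url(module, proxy=PROXY):
    """URL returning JSON metadata about the latest version."""
    return f"{proxy}/{escape_path(module)}/@latest"


def zip_url(module, version, proxy=PROXY):
    """URL of the source archive of one specific version."""
    return f"{proxy}/{escape_path(module)}/@v/{escape_path(version)}.zip"


example = zip_url("cloud.google.com/go", "v0.81.0")
```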
That leaves us with avoiding any circular dependencies - that was the trickiest part, but I managed to write a script which
- analyses any go.mod file
- double-checks the needed dependencies by running
  ```
  go list -f '{{ join .Imports " " }}' ./...
  ```
- extracts the found license information (special thanks to the wonderful scancode-toolkit)
- dumps all this information into a bitbake recipe (including the needed dependencies)
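The last step of that list, dumping the collected information into bitbake syntax, could look roughly like the sketch below. This is not the actual script: `inc_name` and `render_sources_inc` are hypothetical helpers, the naming scheme is simply copied from the golang.org/x/tools example in this post, and LIC_FILES_CHKSUM is left out to keep it short.

```python
import hashlib


def inc_name(module):
    """golang.org/x/tools -> golang.org-x-tools-sources.inc"""
    return module.replace("/", "-") + "-sources.inc"


def render_sources_inc(module, version, zip_bytes, licenses, requires):
    """Render the text of a per-module include file (hypothetical helper)."""
    name = module.replace("/", "-").replace(".", "-")
    plain = version.lstrip("v")
    sha256 = hashlib.sha256(zip_bytes).hexdigest()  # checksum of the archive
    lines = [
        f'SRC_URI += "https://proxy.golang.org/{module}/@v/{version}.zip;'
        f"srcoutput={module};srcinput={module}@{version};"
        f'downloadfilename={name}-{plain}.zip;name={name}"',
        f'SRC_URI[{name}.sha256sum] = "{sha256}"',
        "",
        'GOSRC_LICENSE += "' + " ".join(licenses) + '"',
        "",
        f'GOSRC_INCLUDEGUARD += "{inc_name(module)}"',
        "",
    ]
    for dep in requires:
        dep_inc = inc_name(dep)
        # only require the dependency if nobody has included it yet
        lines.append(
            "require ${@bb.utils.contains('GOSRC_INCLUDEGUARD', "
            f"'{dep_inc}', '', '{dep_inc}', d)}}"
        )
    return "\n".join(lines) + "\n"


text = render_sources_inc(
    "golang.org/x/tools",
    "v0.1.0",
    b"placeholder for the downloaded zip",
    ["BSD-3-Clause"],
    ["golang.org/x/net"],
)
```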
But wait, didn't you write earlier that this causes circular dependencies...
You're absolutely right - I had to use a little trick to make it work.
Every Go module consists of a bb file, for instance
```
SUMMARY = "go.mod: golang.org/x/tools"
HOMEPAGE = "https://pkg.go.dev/golang.org/x/tools"

# License is determined by the modules included and will be therefore computed
LICENSE = "${@' & '.join(sorted(set(x for x in (d.getVar('GOSRC_LICENSE') or '').split(' ') if x)))}"

# inject the needed sources
require golang.org-x-tools-sources.inc

GO_IMPORT = "golang.org/x/tools"

inherit gosrc
```
which carries just some very generic metadata, but no actual artifact reference - that is provided by a corresponding include file
```
SRC_URI += "https://proxy.golang.org/golang.org/x/tools/@v/v0.1.0.zip;srcoutput=golang.org/x/tools;srcinput=golang.org/x/tools@v0.1.0;downloadfilename=golang-org-x-tools-0.1.0.zip;name=golang-org-x-tools"
SRC_URI[golang-org-x-tools.sha256sum] = "bb7d50a844ccfbe67a8d51ce04404bddc8cdc46eaf3fe82d84806d61fffc22dd"

GOSRC_LICENSE += "\
    BSD-3-Clause \
"
LIC_FILES_CHKSUM += "\
    file://src/golang.org/x/tools/LICENSE;md5=5d4950ecb7b26d2c5e4e7b4e0dd74707 \
    file://src/golang.org/x/tools/cmd/getgo/LICENSE;md5=4ac66f7dea41d8d116cb7fb28aeff2ab \
"

GOSRC_INCLUDEGUARD += "golang.org-x-tools-sources.inc"

require ${@bb.utils.contains('GOSRC_INCLUDEGUARD', 'github.com-yuin-goldmark-sources.inc', '', 'github.com-yuin-goldmark-sources.inc', d)}
require ${@bb.utils.contains('GOSRC_INCLUDEGUARD', 'golang.org-x-mod-sources.inc', '', 'golang.org-x-mod-sources.inc', d)}
require ${@bb.utils.contains('GOSRC_INCLUDEGUARD', 'golang.org-x-net-sources.inc', '', 'golang.org-x-net-sources.inc', d)}
require ${@bb.utils.contains('GOSRC_INCLUDEGUARD', 'golang.org-x-sync-sources.inc', '', 'golang.org-x-sync-sources.inc', d)}
require ${@bb.utils.contains('GOSRC_INCLUDEGUARD', 'golang.org-x-sys-sources.inc', '', 'golang.org-x-sys-sources.inc', d)}
require ${@bb.utils.contains('GOSRC_INCLUDEGUARD', 'golang.org-x-xerrors-sources.inc', '', 'golang.org-x-xerrors-sources.inc', d)}
```
That one provides the actual sources being pulled and the correct license information.
The include file also sets the dependencies required to build (which are include files of their own).
To avoid pulling the same source more than once into a workspace, I used something very common in the C world: header guard macros - so each source is only included once (which breaks the vicious circle of circular dependencies).
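The effect of the guard can be simulated outside of bitbake. The sketch below is plain Python, not bitbake code: `process` mimics what parsing one include file does, the three file names and their licenses are invented placeholders, and the last line repeats the join used in the LICENSE expression of the bb file.

```python
def process(inc, files, guard, licenses):
    """Mimic parsing one guarded include file."""
    guard.append(inc)                        # GOSRC_INCLUDEGUARD += "<inc>"
    licenses.extend(files[inc]["license"])   # GOSRC_LICENSE += "..."
    for dep in files[inc]["requires"]:
        if dep not in guard:                 # the bb.utils.contains(...) check
            process(dep, files, guard, licenses)


# Three include files that require each other in a circle.
FILES = {
    "a-sources.inc": {"license": ["MIT"], "requires": ["b-sources.inc"]},
    "b-sources.inc": {"license": ["Apache-2.0"], "requires": ["c-sources.inc"]},
    "c-sources.inc": {"license": ["MIT"], "requires": ["a-sources.inc"]},
}

guard, licenses = [], []
process("a-sources.inc", FILES, guard, licenses)

# Same idea as the LICENSE line in the bb file: unique, sorted, joined by " & ".
license_expr = " & ".join(sorted(set(licenses)))
```

Because the guard entry is set before the dependencies are processed, the walk ends as soon as it would come back to a file it has already seen.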
Assembling
What happens now if any of the bb recipes is built...
- the top-level bb file, let's call it foo_1.0.bb, includes foo.inc
- foo.inc sets the main sources fetched as a zip file from proxy.golang.org
- furthermore it includes bar.inc and baz.inc
- bar.inc and baz.inc are not versioned, so they will pull the latest available sources defined via a recipe in the workspace
- after all source zip files have been fetched, the internal bbclass (gosrc) extracts every source zip file into the right place
- the Go compiler finds all of the required files and builds an executable which can be used without any of the dependencies (more or less a statically linked executable, ready to be shipped)
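The "right place" step can be pictured like this. It is only a guess at what such a class has to do, not the gosrc code itself: `unpack_module` is a made-up helper, and the `src/<import path>` layout is inferred from the LIC_FILES_CHKSUM paths and the srcinput/srcoutput parameters shown earlier. Module zips from the proxy keep all files below a `<module>@<version>/` prefix, which has to be stripped.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def unpack_module(zip_file, srcinput, srcoutput, workdir):
    """Extract `<srcinput>/...` from a module zip to `<workdir>/src/<srcoutput>/...`."""
    prefix = srcinput + "/"
    target = Path(workdir) / "src" / srcoutput
    with zipfile.ZipFile(zip_file) as archive:
        for info in archive.infolist():
            if info.is_dir() or not info.filename.startswith(prefix):
                continue
            relative = Path(info.filename[len(prefix):])
            if ".." in relative.parts:
                continue  # never write outside of the target directory
            destination = target / relative
            destination.parent.mkdir(parents=True, exist_ok=True)
            destination.write_bytes(archive.read(info))
    return target


# Usage with a tiny in-memory archive standing in for a downloaded zip.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("golang.org/x/tools@v0.1.0/LICENSE", "license text")
buffer.seek(0)

with tempfile.TemporaryDirectory() as workdir:
    root = unpack_module(
        buffer, "golang.org/x/tools@v0.1.0", "golang.org/x/tools", workdir
    )
    unpacked = sorted(
        p.relative_to(workdir).as_posix() for p in root.rglob("*") if p.is_file()
    )
```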
In the end you have an executable which is compiled from reproducible sources, uses the latest and greatest, and additionally comes with all the needed compliance information... mission accomplished!
The actual script can be found here - feel free to use and adjust it to your needs