Reproducible builds of statusgo #1185
Comments
Before implementing a Docker-based solution, I decided to evaluate other options. Essentially, we just have to find a way to spoof the current directory path, and using Docker for that seems a bit too heavy.

Faking current dir path

I naturally started by asking the question "How can we fake the directory path?". I.e. go build process will be referring to

One obvious solution is to use a chrooted jail: it comes with every POSIX system, is cross-platform, straightforward to use and familiar. One downside: it needs

There are also some lightweight container/cgroup options, but they all seem to require root access as well, and they are limited to Linux only.

LD_PRELOAD and DYLD_INSERT_LIBRARIES

It's possible to write a small C library that will reimplement

It doesn't require root access. The downsides are the following: it will probably require users to compile C code once, bringing in a dependency on a C compiler toolchain (which is probably installed, but anyway). Plus, it's really hacky, probably has a lot of corner cases (especially on MacOS X) and may break the Go build process logic. So I decided to explore other options.

Rewriting BuildID in a binary

The next approach could be rewriting the BuildID inside the binary itself, if we can guarantee that the binaries are essentially equal. Let me explain that. Go uses the concept of build identifiers (

In a bit simplified form, the buildID value consists of two hashes:
where
More information here: https://github.com/golang/go/blob/master/src/cmd/go/internal/work/buildid.go#L24

As I mentioned in a previous comment, the binaries built in different directories differ only by the value of this stamp.

We can check the buildID value with
where
Sample
Now, the interesting part. Following the build instructions in a previous comment, we always get the same binaries, and the

But we don't really care about it. The rest of the binary is the same at the byte level. We can simply overwrite this part manually, saying "we don't care about inputs, as long as outputs are correct". I made a proof-of-concept solution for ELF files. Here are the steps:

First, we build the release version in a controlled environment (like CI), extract the buildID from the binary and store it somewhere (maybe in git itself, under

Then, in the Makefile, when the user runs
And if
After this step, we can compare SHA1 hashes of the binaries, and they should be the same for the same OS/ARCH no matter where this process is executed.

Notes

This obviously looks like a hack, because it is a hack. But it exploits properties of the Go build system, which was designed with future reproducibility in mind.

One thing that I don't like is that the buildID hashes are actually truncated versions of the original hash value (the first 96 bits of the original hash encoded in base64, in fact); see the hashToString code. This increases the probability of collisions. It works fine for the task "detect if the binary should be rebuilt", and it decreases the size of the buildID from 259 bytes to 67, which is more readable. But it may not be enough to decide whether the produced binary is what we expect. Worth more analysis, I guess.

I would love to hear thoughts and questions on this. I think I can make this approach work for both MacOS X and Linux transparently (using otool).
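The overwrite step described in this comment can be sketched roughly as follows. This is only an illustration, not the proof of concept mentioned above: it assumes the build ID appears in the file as a literal byte string and that the stored ID has exactly the same length, so no offsets in the binary shift. The function name `replaceBuildID` and the toy input are hypothetical.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// replaceBuildID returns a copy of bin in which every occurrence of oldID
// is overwritten with newID. Both IDs must have the same length so that
// the rest of the binary keeps its layout. (Hypothetical helper.)
func replaceBuildID(bin, oldID, newID []byte) ([]byte, error) {
	if len(oldID) == 0 || len(oldID) != len(newID) {
		return nil, errors.New("build IDs must be non-empty and of equal length")
	}
	if !bytes.Contains(bin, oldID) {
		return nil, errors.New("old build ID not found in binary")
	}
	return bytes.ReplaceAll(bin, oldID, newID), nil
}

func main() {
	// Toy stand-in for a binary; a real file would be read with os.ReadFile.
	bin := []byte("header...aaaa/bbbb...code...")
	patched, err := replaceBuildID(bin, []byte("aaaa/bbbb"), []byte("cccc/dddd"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The size is unchanged; only the ID bytes differ.
	fmt.Println(len(patched) == len(bin), string(patched))
}
```

Rejecting IDs of different length is a deliberate choice here: a longer or shorter replacement would move everything after it and corrupt the file.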
Brad Fitzpatrick mentioned the approach used at Google:
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed. Please re-open if this issue is important to you.
Problem
We want to ensure reproducible (deterministic) builds for statusd and the status-go libraries. This is an important security aspect of modern open-source software. It is also required by F-Droid, and non-reproducible builds seem to be a blocker. See explanation here: status-im/status-mobile#5587
Any user must be able to build status-go herself and get a binary identical to the one distributed over release channels.
Implementation
Essentially a reproducible build is a way to guarantee that for the same git commit anyone can build absolutely the same – byte-by-byte – binary.
For Go code, there are a few inputs that may render the build non-deterministic:
Most of these values can be seen as requirements for the build (e.g. the Go version), but some require special flags or tricks to overcome. Note that Go 1.12 will probably have a build flag like go build -release that enables those flags. Or, perhaps, Go programs will be almost 100% reproducible by default. See ongoing issue here.
GOOS and GOARCH seem not to have an effect, so the good news is that Go binaries are cross-platform reproducible. (I.e. if all other requirements are met, cross-compiling and native compiling will yield an identical binary.)
I experimented with getting the same binary just by running go build, and the best result I got was using this command:
I wasn't able to get past the buildid identifier, which is generated from various inputs (including the absolute path) and written into the binary. So builds in different dirs/GOPATHs result in slightly different binaries. That will hopefully be fixed in Go 1.12.
So the current solution is to use a Docker container for a reproducible build, where we can guarantee the same GOPATH, the same absolute path and other variables. That will also remove the need to strip out debug information, which might be useful (for more verbose stack traces on panic or profiling info).
If the Docker approach is sufficient for the moment, the implementation should be fairly straightforward.
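To check how two builds from different directories actually differ, one option is to list the byte ranges where they diverge. The sketch below is an assumption about how such a check could look, not a tool from this repository; `diffRanges` is a hypothetical name, and it expects both inputs to have the same length.

```go
package main

import (
	"errors"
	"fmt"
)

// diffRanges returns the half-open [start, end) byte ranges in which
// a and b differ. It refuses inputs of different length, because then
// offsets can no longer be compared one to one.
func diffRanges(a, b []byte) ([][2]int, error) {
	if len(a) != len(b) {
		return nil, errors.New("inputs differ in length")
	}
	var ranges [][2]int
	start := -1 // -1 means we are not inside a differing run
	for i := range a {
		switch {
		case a[i] != b[i] && start < 0:
			start = i // a differing run begins here
		case a[i] == b[i] && start >= 0:
			ranges = append(ranges, [2]int{start, i})
			start = -1
		}
	}
	if start >= 0 {
		ranges = append(ranges, [2]int{start, len(a)})
	}
	return ranges, nil
}

func main() {
	// Toy stand-ins for two binaries that differ only in a short stamp.
	r, err := diffRanges([]byte("abcdef12gh"), []byte("abcdefXYgh"))
	fmt.Println(r, err)
}
```

If the only reported range falls inside the stored identifier, that supports the observation that the rest of the binary is identical.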