cmd/go: support reproducible builds regardless of build path? #16860

Description

@infinity0

It would be good if Go were able to generate bit-for-bit identical binaries, even if the build environment changes in unimportant ways. (For example, so users can reproduce binaries without requiring root.)

Until recently, we were able to reproduce Go binaries while varying many parts of the build environment, as long as we kept the build path constant. Recently, however, we started to vary the build path as well. (Why "recently" is irrelevant to this report, but I can go into it if you ask.)

Anyway, now we can see that .gopclntab embeds the build path into the resulting binary. This patch will get rid of this environment-specific information, but I know it's not perfect:

--- src/cmd/link/internal/ld/pcln.go
+++ src/cmd/link/internal/ld/pcln.go
@@ -41,6 +41,8 @@

 // iteration over encoded pcdata tables.

+var cwd, _ = os.Getwd()
+
 func getvarint(pp *[]byte) uint32 {
    v := uint32(0)
    p := *pp
@@ -152,7 +154,8 @@
            f.Value = int64(ctxt.Nhistfile)
            f.Type = obj.SFILEPATH
            f.Next = ctxt.Filesyms
-           f.Name = expandGoroot(f.Name)
+           //f.Name = expandGoroot(f.Name)
+           f.Name, _ = filepath.Rel(cwd, f.Name)
            ctxt.Filesyms = f
        }
    }

In particular, I'm not sure how this will interfere with readers of this information. I know that src/debug/gosym/pclntab.go carries an API for this, but I'm not sure what contracts you have published for it that people rely on.

Also, the part where I comment out expandGoroot is not strictly necessary for reproducibility, but it would allow, e.g., a user to try to reproduce a binary with their own Go toolchain, which gives some extra assurance that the copies behave the same way.

The call to Getwd during static initialisation is also a bit dirty; I could defer it until later, but that may or may not require locks, or pass it in from a parent/ancestor caller, but that would require adding extra arguments to some functions.

However, if you give me some guidelines on how to make this patch acceptable, I'll be happy to do this work and submit a PR.
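The concerns above (an unchecked error from Getwd, and locking if the lookup is delayed) can be illustrated with a minimal standalone sketch. It is not linker code: `workingDir` and `relativeTo` are hypothetical helper names, and falling back to the original name when `filepath.Rel` fails is an assumed policy, not something the patch does.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// Hypothetical state: the working directory is resolved lazily, once.
// sync.Once makes the first call safe even if several goroutines race.
var (
	buildDirOnce sync.Once
	buildDir     string
	buildDirErr  error
)

// workingDir returns the cached working directory and any error from
// os.Getwd, instead of discarding the error at package initialisation.
func workingDir() (string, error) {
	buildDirOnce.Do(func() {
		buildDir, buildDirErr = os.Getwd()
	})
	return buildDir, buildDirErr
}

// relativeTo returns name relative to base. If filepath.Rel reports an
// error (for example, base is absolute but name is not), name is returned
// unchanged rather than an empty string.
func relativeTo(base, name string) string {
	rel, err := filepath.Rel(base, name)
	if err != nil {
		return name
	}
	return rel
}

func main() {
	wd, err := workingDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine working directory:", err)
		os.Exit(1)
	}
	fmt.Println(relativeTo(wd, filepath.Join(wd, "src", "main.go")))
}
```

Note that `filepath.Rel` does not fail for paths outside `base`; it returns a `../`-style path, so files under GOROOT would still need separate handling.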

Activity

ianlancetaylor commented on Aug 24, 2016

Contributor

When you say that you are varying the build path, what do you mean precisely?

This seems like a restatement of #9206.

bradfitz commented on Aug 24, 2016

Contributor

Yeah, let's merge conversation in #9206.

@infinity0, instructions for sending a change are at https://golang.org/doc/contribute.html#Code_review which looks like a wall of text, but it's not many actual steps. Be sure to check errors, not have commented-out code, and include tests if possible.

bradfitz commented on Aug 24, 2016

Contributor

Reopening, since this is slightly different from #9206.

changed the title from "Bit-for-bit deterministic / reproducible builds" to "cmd/go: support reproducible builds regardless of build path?" on Aug 24, 2016

ianlancetaylor commented on Aug 24, 2016

Contributor

Thanks for the explanation (over on #9206). It's not clear to me that this should be a goal. I agree that reproducible builds are essential. However, it's not clear to me that reproducible builds when the sources are in different directories are essential.

For example, exactly the same thing happens when using GCC with the -g option. The source directory is included in the debug info, in the DW_AT_comp_dir attribute.

Not including the source directory will make debugging more difficult.

infinity0 commented on Aug 24, 2016

Author

For GCC, we are setting -fdebug-prefix-map=$SRC_ROOT=. dynamically for most builds, which removes it from DW_AT_comp_dir. Recently we landed a patch in GCC to make it not embed debug-prefix-map into DW_AT_producer, so this now works to achieve buildpath-independent reproducibility.

Yes, we're being quite strict with ourselves in trying to make things independent of the build path, but we think it's a good goal that would allow more people in practice to perform build verification.

Ideally, we'd want as many people to rebuild the same hash as possible, so that the rest of the world (who we assume don't want to do these rebuilds) can see that 20 people built the same hash, rather than 5 people built 4 different hashes. It is indeed unclear at the moment what the "best tradeoff" is - we're just scratching the surface of this topic ourselves - but this particular issue seemed fairly easy to me to fix.
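The comparison described above can be sketched in a few lines. This is only an illustration: the rebuilder names and artifact contents are made up, and SHA-256 is an assumed choice of hash.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digest returns the hex-encoded SHA-256 of a build artifact's bytes.
func digest(artifact []byte) string {
	sum := sha256.Sum256(artifact)
	return hex.EncodeToString(sum[:])
}

// distinctDigests maps each digest to the number of rebuilders that
// produced it. One entry means everyone reproduced the same binary.
func distinctDigests(artifacts map[string][]byte) map[string]int {
	counts := make(map[string]int)
	for _, a := range artifacts {
		counts[digest(a)]++
	}
	return counts
}

func main() {
	// Hypothetical artifacts: the embedded build path is the only difference.
	builds := map[string][]byte{
		"alice": []byte("binary built in /home/alice/src"),
		"bob":   []byte("binary built in /home/bob/src"),
		"carol": []byte("binary built in /home/alice/src"),
	}
	counts := distinctDigests(builds)
	fmt.Printf("%d rebuilders, %d distinct digests\n", len(builds), len(counts))
}
```

With a build-path-independent binary, the three artifacts would be byte-identical and the map would hold a single digest.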

(edit: more accurate to say SRC_ROOT instead of PWD; we set it once at the start of the build when they happen to be equal)
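To make the effect of the prefix mapping concrete, here is a simplified sketch of what an `old=new` mapping does to a recorded path. `remapPrefix` is a hypothetical helper, not GCC's implementation; in particular, matching only on a directory boundary is an assumption made here for clarity.

```go
package main

import (
	"fmt"
	"strings"
)

// remapPrefix rewrites path as if files under oldPrefix lived under
// newPrefix. Paths outside oldPrefix are returned unchanged.
// Simplification: the prefix must end on a directory boundary.
func remapPrefix(path, oldPrefix, newPrefix string) string {
	if path == oldPrefix {
		return newPrefix
	}
	if strings.HasPrefix(path, oldPrefix+"/") {
		return newPrefix + path[len(oldPrefix):]
	}
	return path
}

func main() {
	// Two builders with different source roots end up recording the same path.
	fmt.Println(remapPrefix("/home/alice/proj/src/a.c", "/home/alice/proj", "."))
	fmt.Println(remapPrefix("/tmp/build-1234/src/a.c", "/tmp/build-1234", "."))
	// System headers are outside the source root and are left alone.
	fmt.Println(remapPrefix("/usr/include/stdio.h", "/home/alice/proj", "."))
}
```

Because both source roots map to `.`, the recorded path no longer depends on where the tree was checked out.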

infinity0 commented on Aug 24, 2016

Author

Yes, "making debugging more difficult" was also my concern. FWIW, our experiments with GCC did not make things harder to debug (this was by chance, we had to try it out to see it). I can appreciate that Go is different, but if you can explain the details I could also try to think of solutions.

added this to the Unplanned milestone on Aug 24, 2016

jmikedupont2 commented on Oct 23, 2016


I think this is important. Debugging is also very important. From what I understand this is the issue of two users with different goroots set. Renaming of a module/forking it to a new repo name would not be supported. So, is the question how to share debug information? How to relate debug information created in one root to another root? If you are in the same root as the debug info was created it would be nice if we could strip out all the stuff outside the root.

whyrusleeping commented on Mar 31, 2017


Couldn't the paths stored in the binary be somehow prefixed with $GOPATH, so that when I'm debugging a binary it uses the code in my GOPATH correctly?
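One way to read this suggestion is as a two-step scheme: replace the builder's GOPATH with a placeholder when the path is recorded, and expand the placeholder against the local GOPATH when debugging. The sketch below is purely illustrative; `abstractPath`, `resolvePath`, and the literal `$GOPATH` placeholder are hypothetical, and it assumes GOPATH holds a single directory.

```go
package main

import (
	"fmt"
	"strings"
)

// gopathPlaceholder is a hypothetical marker stored in place of the
// builder's real GOPATH directory.
const gopathPlaceholder = "$GOPATH"

// abstractPath replaces a leading gopath directory with the placeholder,
// as a build tool might do before recording the path.
func abstractPath(path, gopath string) string {
	if strings.HasPrefix(path, gopath+"/") {
		return gopathPlaceholder + path[len(gopath):]
	}
	return path
}

// resolvePath expands the placeholder against the local gopath,
// as a debugger might do when looking up source files.
func resolvePath(stored, gopath string) string {
	if strings.HasPrefix(stored, gopathPlaceholder+"/") {
		return gopath + stored[len(gopathPlaceholder):]
	}
	return stored
}

func main() {
	stored := abstractPath("/home/alice/go/src/example.com/app/main.go", "/home/alice/go")
	fmt.Println(stored)
	fmt.Println(resolvePath(stored, "/home/bob/go"))
}
```

The stored form is identical for every builder, while each debugger session resolves it to a path that exists locally.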

Metadata

Assignees: No one assigned
Labels: FrozenDueToAge; GoCommand (cmd/go); NeedsFix (The path to resolution is known, but the work has not been done.); early-in-cycle (A change that should be done early in the 3 month dev cycle.)
Type: No type
Projects: No projects
Relationships: None yet
Development: No branches or pull requests
Participants: @bradfitz, @zx2c4, @neild, @stapelberg, @josharian