rash build result varies from parallelism #52
Comments
longer diff: http://rb.zq1.de/compare.factory-20190405/rash-compare.out (showing first 200 lines of each file's diff)
Thanks! This looks like a problem more generally with Racket's build system, so I've added an issue to the Racket repository as well: racket/racket#2601
Or perhaps I'm wrong! I'll have to look through the implementation and see if/where I'm using some non-deterministic hash-table traversal or gensym...
I've removed some code that was likely the culprit (gensyms in macros that I was using for some reason). So if this were re-done using the master branch it would perhaps be reproducible now. I can't think of any other macros that could be doing anything non-reproducible.
@willghatch usually you can test this by compiling to zo on a few different machines (or with |
@samth Thanks. After trying to think of a more convenient way and failing, I've been doing exactly that. Which leads me to the bad news that it's still not reproducible. I think I've isolated it to a single file (git-info.rkt). But I haven't figured out what causes it yet. The only macro used there should be deterministic. It uses gensym at runtime to get a singleton value, but that shouldn't affect reproducible compilation (and removing it doesn't fix it). Anyway, I'll look at this some more after I eat some lunch.
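For anyone repeating this kind of check, here is a minimal sketch of comparing two build trees by content digest. It assumes both trees have the same layout and only reports files present in the first tree; the names `digest-table` and `differing-files` are made up for illustration and are not part of rash or Racket's build tooling.

```racket
#lang racket/base
(require file/sha1 racket/file racket/path)

;; Map each .zo/.dep file under `dir` (as a relative path string) to the
;; SHA-1 digest of its contents.
(define (digest-table dir)
  (define root (simple-form-path dir))
  (for/hash ([p (in-list (find-files
                          (lambda (f)
                            (regexp-match? #rx"[.](zo|dep)$" (path->string f)))
                          dir))])
    (values (path->string (find-relative-path root (simple-form-path p)))
            (call-with-input-file p sha1))))

;; Relative paths from `dir-a` whose digest is missing or different in
;; `dir-b`, sorted so the report itself is deterministic.
(define (differing-files dir-a dir-b)
  (define a (digest-table dir-a))
  (define b (digest-table dir-b))
  (sort (for/list ([(k v) (in-hash a)]
                   #:unless (equal? v (hash-ref b k #f)))
          k)
        string<?))

;; Usage (hypothetical file name): racket compare-builds.rkt build-a build-b
(module+ main
  (define args (current-command-line-arguments))
  (when (= (vector-length args) 2)
    (for-each displayln
              (differing-files (vector-ref args 0) (vector-ref args 1)))))
```

Running it over a 1-core build and a 4-core build of the same commit should narrow the search to the files that actually differ.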
So my last statement of isolation was way off, that was just an artifact of the way I first started re-kicking builds. At this point it seems that: Rash's |
@samth Thanks for your help on this, by the way. Aside from gensym and hash-table traversals, are there any other features that come to mind? As far as I know, |
Both gensym and hash traversal order are affected by what order things happen in globally (for example, memory addresses or how many gensyms have been created). So lots of things can affect them.
Looking at the expansion of a simple rash module, here are a few things that might be an issue.
I also noticed that a trivial |
When hash iteration order is non-deterministic in other languages, we usually sort the keys before iterating over them.
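As a rough sketch of that approach in Racket, the snippet below sorts a hash's entries by key before iterating. The `options` table and the `hash->sorted-list` helper are hypothetical and not taken from rash's source.

```racket
#lang racket/base

;; Hypothetical options table; keys and values are made up for illustration.
(define options (hash 'verbose #t 'jobs 4 'target "zo"))

;; Return the table's entries sorted by key, so the order no longer
;; depends on the hash table's internal layout.
(define (hash->sorted-list h)
  (sort (hash->list h) symbol<? #:key car))

;; Iterate in the stable, sorted order.
(for ([entry (in-list (hash->sorted-list options))])
  (printf "~a = ~s\n" (car entry) (cdr entry)))
```

This only works when the keys have a total order (symbols here); mixed key types would need a different comparison.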
> 1. It looks like there's an options hash that gets put into the module -- is the order that those keys appear deterministic?

That's actually a run-time hash. But I've removed it anyway because it was stupid.

> 2. It looks like there are some gensyms in something about `current-paramter-environment`

That's it. Today I traced it down to my use of splicing-syntax-parameterize. This program does not build reproducibly:
```
#lang racket/base
(require racket/stxparam racket/splicing (for-syntax racket/base))
(define-syntax-parameter param #'something)
(splicing-syntax-parameterize ([param (make-rename-transformer #'some-identifier)])
  (begin-for-syntax (void)))
```
Looking into the implementation of `splicing-syntax-parameterize`, it uses a `do-local` function that uses `gensym`. I'll look into that more tomorrow or early next week.
> I also noticed that a trivial `#lang rash` program expands to 360+ lines of code. Could some of that be moved into functions where you expand to a reference to that? That makes it less likely that the expansion could change non-deterministically while also making zo sizes smaller.

I did a little bit of that today while poking around it. I should certainly do some more.
@samth I really appreciate your help on this. Thanks!
Hi, were there patches done that I can test?
This is still a problem with +++ new//usr/share/racket/pkgs/rash/rash/compiled/experimental_rkt.dep 2022-12-27 00:00:00.000000000 +0000
@@ -1 +1 @@
-("8.11.1" ta6le ("c766a8edc6824eb84469005591ebecfd6a74207e" . "010b0d6788990d0e951970461b12bdfa90d34464") (collects #"racket" #"base.rkt") (collects #"racket" #"runtime-config.rkt") (collects #"rash" #"private" #"escapable-template.rkt") (collects #"rash" #"private" #"lang-funcs.rkt"))
+("8.11.1" ta6le ("c766a8edc6824eb84469005591ebecfd6a74207e" . "0d7d0978458e495d931d91cc1a50fb926f7dd956") (collects #"racket" #"base.rkt") (collects #"racket" #"runtime-config.rkt") (collects #"rash" #"private" #"escapable-template.rkt") (collects #"rash" #"private" #"lang-funcs.rkt"))
While working on reproducible builds for openSUSE, I found that our rash package varies when comparing builds from a 1-core VM and a 4-core VM. Two builds from 1-core VMs are identical though, so there is some race going on during the build.
Here is an extract from the lengthy total diff:
Unfortunately I know too little about racket or rash to debug this without some help.