-
Notifications
You must be signed in to change notification settings - Fork 147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Holmake to rebuild based on theory hashes rather than timestamps #381
Comments
In my vision for this, it would also obviate a rebuild in the following situation. |
This seems unlikely. Presumably foo's hash has just changed. In that situation, bar has to rebuild. However, its hash won't change (I suppose). If baz only depends on bar, it won't need to be rebuilt. On the other hand, suppose the change to foo has redefined an important constant that bar uses, but the theorems that bar states have the same statements so that when bar is rebuilt it doesn't change its external nature. (Say the tactics proving bar's theorems are robust in the face of whatever change was made.) Now theory baz is only explicitly a descendent of bar (so the graph records a link form bar to baz, but not from foo to baz), but it does have a theorem in it that depends on foo. Maybe that theorem is now false... So, we actually have to rebuild everything in baz too (though there'd be lots of similar seeming situations when we'd prefer not to). |
If this is implemented, it would be better to move the old theory and then compare the new and the old file byte for byte, file instead of overwriting it with the new theory file and then comparing hashes. Although probably theory hashes will never collide in practice, this is not provable, and thus if one were to model the build process, including this optimization, one could not prove that it is correct, not even probabilistically (in cryptography it is common to assume that the hash is a function chosen at random; this approach is used under the name of “pseudorandom function model” but note that this assumption does not match reality). |
I started writing this issue again 🤦♀️ ... This issue is a proposal to replace the timestamps (based on modification time of the script file that built the theory) currently used to identify theories (these are recorded in the The point of doing this is to enable reuse of theories even when their script files are modified as long as the exported logical content is the same. This includes even changing a proof, for example, since One of my motivations for wanting this is that I believe it will make module management (as explored in https://gist.github.com/xrchz/11d664a58448b680e3ea0e7364c72814) easier, and may even be a prerequisite for some of the ideas there. Luckily, it can be carved out as as an independent upgrade to HOL4. The issue that @mn200 mentioned above prevents the thing I asked for above - not having to rebuild a theory when one of its ancestors adds irrelevant logical content - but now I'm not asking for that. With recursive content hashes I think soundness should be straightforward, right? Even if more rebuilding is required than I initially thought, it is still less than status quo. |
I think we might store the |
What lives in |
Currently, users can put SML code of arbitrary flavours into By scanning the output of Finally, there is also the ability to generate code requiring that a certain SML module be loaded when a Luckily, I think it's trivial to make these changes get reflected in |
Yeah, the main impetus for adjoin_to_theory was the theory of pairs
because, for example, new_definition had to be re-educated from the bare
curried form to handle definitions taking tupled arguments. If that can be
moved to pairLib, I am happy to see adjoin_to_theory go away. With that,
will theories be "essentially declarative"?
…On Mon, Jan 6, 2025 at 5:47 PM Michael Norrish ***@***.***> wrote:
Currently, users can put SML code of arbitrary flavours into fooTheory.sml
(and make entry points visible as well by also changing fooTheory.sig).
The entry-points are adjoin_to_... There aren't that many existing uses,
and they could probably all be removed. (Perhaps just make that a
precondition to this.)
By scanning the output of grep -r I think the only interesting uses of
this facility are in src/coretypes/pairScript.sml, where all sorts of
changes are made to make pairs usable in places like new_definition. I
think this code should move to pairLib, and just assume that that will
always be loaded (as bossLib already does).
Finally, there is also the ability to generate code requiring that a
certain SML module be loaded when a Theory is loaded (
Theory.add_ML_dependency). To date, this facility has mostly been used
for ensuring that custom pretty-printing code is loaded when a theory is
loaded, but it might get used in important ways.
Luckily, I think it's trivial to make these changes get reflected in
fooTheory.dat: simply record the data in the .dat file even if that file
does nothing with the information. (Could do this with the general-purpose
adjoin_to_ entry points too, but they're so gross (and work-around able)
that I'd much prefer not to.)
—
Reply to this email directly, view it on GitHub
<#381 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAIOAD7G5KBJQ4B5DBCV4PD2JMIZ5AVCNFSM6AAAAABUVDJWLKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDKNZUGEZDMMJRGY>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
|
This would for example avoid cascading rebuilds if you just add a comment to a script file (and it produces exactly the same theory).
The text was updated successfully, but these errors were encountered: