Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to handle meta-programmed Sail code in the golden model. #133

Open
martinberger opened this issue Nov 27, 2021 · 39 comments
Open

How to handle meta-programmed Sail code in the golden model. #133

martinberger opened this issue Nov 27, 2021 · 39 comments

Comments

@martinberger
Copy link
Collaborator

Some Sail code isn't hand-written, but generated by other programs. We's started an interesting debate about how to deal with this in the comments on the P Extension PR #132

I think it should be a separate thread. Please continue the discussion here.

@martinberger
Copy link
Collaborator Author

martinberger commented Nov 27, 2021

I think we should minimise the programming languages used in the golden model. All code should be considered as if it was written by an especially diligent human. It's fine to auto-generate Sail. But I don't think we should have the generating programs in the model. Submitted PRs are free to include the generators for inspection.

Opinions?

@jrtc27
Copy link
Collaborator

jrtc27 commented Nov 27, 2021

To the best of my knowledge, all of the current code was hand-written (or possibly with tiny throwaway scripts or editor macros; the line is blurry), with the exception of prelude_mapping.sail which is working around a language limitation and has the Python script used to generate it the monomorphised versions in a comment.

@scottj97
Copy link
Contributor

For years I have been disappointed with the violations of the DRY principle throughout this code (e.g. CSR addresses). It makes maintenance unnecessarily difficult and error-prone.

If the Sail language does not and will not enable improvement in this area, then I support generation of the Sail code from a DRYer specification. The riscv-config changes (#128) are a step in this direction.

But writing the initial Sail code is not the hard part. Maintaining it over (we hope) decades of industrial use is the far greater concern. If you commit the generated code to the Git repo and discard the generator, you gain the benefits only for the (easier) initial writing of the code, and lose all the benefits for the (more difficult and more expensive) maintenance of that code.

(From #132:)

I think the two main reasons why we use small scripts to generate Sail is not so much to save keystrokes, b/c we don't have that much code anyway. I think it's:

to avoid errors.
to feed other targets of our ISA ideas, like compilers, test suites, documentation etc that we build in parallel while developing a RISCV extension.

I agree -- but then what happens when those specs change? If the code generation is not part of this repo, you lose all those benefits after the initial coding. The code in this repo, which will be hand-edited from thereon, is no longer DRY, and therefore is unnecessarily error-prone. The primary purpose of DRY is to reduce the risk and cost of ongoing maintenance.

Even if the generator is maintained elsewhere, it tends to diverge. See for example this Spike PR where I had to do quite a bit of work to fix a (generated, then committed) file that had diverged from the original source/generator.

Incidentally, LLVM uses the TableGen DSL to help structure almost all information about processors.

Yes, and the source files are committed to Git and the generator runs as part of the LLVM build. They do not commit the generated files and then throw out the TableGen source code!

@scottj97
Copy link
Contributor

I understand that Arm uses Sail extensively. Surely they have many different configurations of processors they must support. What have they done to solve this problem?

@jrtc27
Copy link
Collaborator

jrtc27 commented Nov 28, 2021

For years I have been disappointed with the violations of the DRY principle throughout this code (e.g. CSR addresses). It makes maintenance unnecessarily difficult and error-prone.

I don't see any issues filed by you suggesting improvements. I also don't see what the problem with CSRs is, you add one line to give its name, one line to implement its read behaviour, one line to implement its write behaviour and one line to say it exists (which you could get rid of at the expense of more code elsewhere, so whilst it's not the cleanest conceptually it's probably the one that has the least code). If you wanted to you could introduce an intermediate representation for them, but overall it'd be at least as much code, everything currently there is necessary, and additional indirection to make it look nicer would add extra code to map the encoding to the more abstract representation.

If the Sail language does not and will not enable improvement in this area, then I support generation of the Sail code from a DRYer specification. The riscv-config changes (#128) are a step in this direction.

Improvement for what? Are there concrete language features you can think of?

But writing the initial Sail code is not the hard part. Maintaining it over (we hope) decades of industrial use is the far greater concern. If you commit the generated code to the Git repo and discard the generator, you gain the benefits only for the (easier) initial writing of the code, and lose all the benefits for the (more difficult and more expensive) maintenance of that code.

It depends how exactly you're using a generator.

(From #132:)

I think the two main reasons why we use small scripts to generate Sail is not so much to save keystrokes, b/c we don't have that much code anyway. I think it's:
to avoid errors.
to feed other targets of our ISA ideas, like compilers, test suites, documentation etc that we build in parallel while developing a RISCV extension.

I agree -- but then what happens when those specs change? If the code generation is not part of this repo, you lose all those benefits after the initial coding. The code in this repo, which will be hand-edited from thereon, is no longer DRY, and therefore is unnecessarily error-prone. The primary purpose of DRY is to reduce the risk and cost of ongoing maintenance.

Even if the generator is maintained elsewhere, it tends to diverge. See for example this Spike PR where I had to do quite a bit of work to fix a (generated, then committed) file that had diverged from the original source/generator.

I agree that, if generated code where the generator and input continue to exist is present in the model, the generator and input must be considered the preferred source form and the Sail not be hand-edited, if committed at all (depends on how clunky the generator is to run during a build, and what its dependencies are). I am not convinced that is the right direction for the model though.

Incidentally, LLVM uses the TableGen DSL to help structure almost all information about processors.

Yes, and the source files are committed to Git and the generator runs as part of the LLVM build. They do not commit the generated files and then throw out the TableGen source code!

Indeed. It's a related but different problem, though, that it solves, since it isn't trying to specify the exact architectural semantics of every instruction, and ultimately is all about generating tables, not an executable formal model written in an imperative language. (Yes, those tables can emit bytecode that gets interpreted for various purposes, but that's still a long way away.)

@martinberger
Copy link
Collaborator Author

martinberger commented Nov 28, 2021

makes maintenance unnecessarily difficult and error-prone.

I agree, but there is a cost of abstraction and code-generating programs have their own learning and maintainance cost ... So, as usual, it's about balancing trade-offs.

Yes, and the source files are committed to Git and the generator runs as part of the LLVM build. They do not commit the generated files and then throw out the TableGen source code!

A core difference is that tablegen is an official LLVM metaprogramming tool. There is currently no equivalent in the Sail ecosystem. (Another difference is that RISCV is much smaller than LLVM.) I think we should be conservative here. That said, I do generated some TableGen code using a Python script ... I doubt LLVM wants that script as part of their code base.

@martinberger
Copy link
Collaborator Author

not an executable formal model written in an imperative language.

I'd say the semantics of instructions is an alternative, and executable, (part of a) model of the ISA. This model might not be complete, since one does not need all of a processor's instructions to implement a viable compiler.

@allenjbaum
Copy link
Collaborator

allenjbaum commented Nov 28, 2021 via email

@PeterSewell
Copy link
Collaborator

PeterSewell commented Nov 28, 2021 via email

@jrtc27
Copy link
Collaborator

jrtc27 commented Nov 28, 2021

Specifically for CSR behavior: there is no precise architectural definition of what most CSRs do when written. Almost all have WARL fields where the definition of what is legal is implementation specific, and the mapping of illegal->legal is also implementation specific.

Not really. A lot of the implementation specificity comes from "which features do you implement, so it's not "I choose to only make these CSR values legal", it's "I implement these extensions with these parameters as defined by the ISA and thus the legal values for this CSR are known". The latter should be written in Sail as part of the architectural model, keyed on the higher-level configurability about what features you have, with only the former being arbitrarily defined. Otherwise you end up with the model not expressing how the architecture actually is defined by the prose "spec", and provide far more flexibility than is legal.

@martinberger
Copy link
Collaborator Author

martinberger commented Nov 28, 2021

@PeterSewell I agree that there are many problems with allowing to add generators to the model, and I dread having to look at 17 different generator languages .... I think we can agree that the conservative position, which we should stick to by default, is this:

  1. The model directory must be free from generators, and contains only Sail. The Sail in model should contain the full RISCV semantics. All PRs for model are reviewed w.r.t how they affect model only. It must be possible to understand the full RISCV spec from just reading the code in model, and be able to compile from model to RTL and get a compliant processor.

We could simply stick with (1) and refuse to accept generators. I think a compromise might be possible, which is to keep (1), and to add (for now experimentally) a generated_model directory (similar to prover_snapshots) which contains generators (I'm skimming over a lot of details). For prover_snapshots we impose the additional invariant:

  1. When we run the generators in prover_snapshots, the generated Sail code must be bit-for-bit identical with the Sail in (the relevant git commit in) model. The responsibility for keeping the generators up-to-date is not with the golden-model maintainers.

A PR for prover_snapshots is only accepted if it maintains those invariants. PRs for model are accepted independent from what's going on in prover_snapshots.

If we insist on (1, 2) together, formal verification can continue to work as it does today, with the code in model. I imagine that the core problem with the above is that the maintainers of the generators will run out of energy in keeping up with the code in model.

@PeterSewell
Copy link
Collaborator

PeterSewell commented Nov 28, 2021 via email

@martinberger
Copy link
Collaborator Author

@PeterSewell @allenjbaum CSR and WARL are an interesting edge case.

CSRs and WARL are implementation defined, so it's not 100% clear this should be in the ISA model. One could arguye that implementation-defined stuff has no place in the golden model. Instead, I think it should be derived (mechanically) from the ISA model. We could simply say that it is reasonable to assume that the test writers keep up with the ISA golden model, and how they do it is their problem.

However, currently, this makes it difficult for them, because the widely used scripting languages like Python, don't have canonical importers for Sail, so it's not easy to ingest the golden model and transform it to a required shape. In contrast, it's much easier simply to output Sail from some small parameter set.

An additional difficulty is that the current ISA spec requires WARL to be deterministic [1] which is not a property of single runs, although it can be defined logically by quantification over multiple runs. (I've raised a question about this in the Sail repo [2])

  1. RISC-V ISA Specification, Volume 2, Privileged. Section 2.3 CSR Field Specifications. https://riscv.org/technical/specifications

  2. Sail best practises for specifying hyperproperties. rems-project/sail#161

@PeterSewell
Copy link
Collaborator

PeterSewell commented Nov 29, 2021 via email

@martinberger
Copy link
Collaborator Author

the ISA model should be paramaterised on those impl-defined choices

That should be doable, for example by linking in the choices in the C model, or by synthesising Sail functions that encode the choices and then compiling to C.

@PeterSewell
Copy link
Collaborator

PeterSewell commented Nov 29, 2021 via email

@rmn30
Copy link
Collaborator

rmn30 commented Nov 29, 2021

So you had to rebuild the C or OCaml emulator with that to run it, but you
could generate a single parameterised prover definition that you could
instantiate to a specific set of implementation choices, but you didn't
have to - you could work with the generic definition.

Just to clarify the 'generic definition' was actually just a default set of choices that lived in the repo. Although sail does have some support for 'undef' I didn't attempt to use it to describe implementation defined behaviour.

@PeterSewell
Copy link
Collaborator

PeterSewell commented Nov 29, 2021 via email

@allenjbaum
Copy link
Collaborator

allenjbaum commented Nov 29, 2021 via email

@scottj97
Copy link
Contributor

I also don't see what the problem with CSRs is, you add one line to give its name, one line to implement its read behaviour, one line to implement its write behaviour and one line to say it exists (which you could get rid of at the expense of more code elsewhere, so whilst it's not the cleanest conceptually it's probably the one that has the least code).

Which requires each CSR's address to appear 4 times in three separate files. That's not DRY!

Look at pmpaddr:

(0x3B0, _) => { pmpaddr0 = pmpWriteAddr(pmpLocked(pmp0cfg), pmpTORLocked(pmp1cfg), pmpaddr0, value); Some(pmpaddr0) },
(0x3B1, _) => { pmpaddr1 = pmpWriteAddr(pmpLocked(pmp1cfg), pmpTORLocked(pmp2cfg), pmpaddr1, value); Some(pmpaddr1) },
(0x3B2, _) => { pmpaddr2 = pmpWriteAddr(pmpLocked(pmp2cfg), pmpTORLocked(pmp3cfg), pmpaddr2, value); Some(pmpaddr2) },
(0x3B3, _) => { pmpaddr3 = pmpWriteAddr(pmpLocked(pmp3cfg), pmpTORLocked(pmp4cfg), pmpaddr3, value); Some(pmpaddr3) },
(0x3B4, _) => { pmpaddr4 = pmpWriteAddr(pmpLocked(pmp4cfg), pmpTORLocked(pmp5cfg), pmpaddr4, value); Some(pmpaddr4) },
(0x3B5, _) => { pmpaddr5 = pmpWriteAddr(pmpLocked(pmp5cfg), pmpTORLocked(pmp6cfg), pmpaddr5, value); Some(pmpaddr5) },
(0x3B6, _) => { pmpaddr6 = pmpWriteAddr(pmpLocked(pmp6cfg), pmpTORLocked(pmp7cfg), pmpaddr6, value); Some(pmpaddr6) },
(0x3B7, _) => { pmpaddr7 = pmpWriteAddr(pmpLocked(pmp7cfg), pmpTORLocked(pmp8cfg), pmpaddr7, value); Some(pmpaddr7) },
(0x3B8, _) => { pmpaddr8 = pmpWriteAddr(pmpLocked(pmp8cfg), pmpTORLocked(pmp9cfg), pmpaddr8, value); Some(pmpaddr8) },
(0x3B9, _) => { pmpaddr9 = pmpWriteAddr(pmpLocked(pmp9cfg), pmpTORLocked(pmp10cfg), pmpaddr9, value); Some(pmpaddr9) },
(0x3BA, _) => { pmpaddr10 = pmpWriteAddr(pmpLocked(pmp10cfg), pmpTORLocked(pmp11cfg), pmpaddr10, value); Some(pmpaddr10) },
(0x3BB, _) => { pmpaddr11 = pmpWriteAddr(pmpLocked(pmp11cfg), pmpTORLocked(pmp12cfg), pmpaddr11, value); Some(pmpaddr11) },
(0x3BC, _) => { pmpaddr12 = pmpWriteAddr(pmpLocked(pmp12cfg), pmpTORLocked(pmp13cfg), pmpaddr12, value); Some(pmpaddr12) },
(0x3BD, _) => { pmpaddr13 = pmpWriteAddr(pmpLocked(pmp13cfg), pmpTORLocked(pmp14cfg), pmpaddr13, value); Some(pmpaddr13) },
(0x3BE, _) => { pmpaddr14 = pmpWriteAddr(pmpLocked(pmp14cfg), pmpTORLocked(pmp15cfg), pmpaddr14, value); Some(pmpaddr14) },
(0x3BF, _) => { pmpaddr15 = pmpWriteAddr(pmpLocked(pmp15cfg), false, pmpaddr15, value); Some(pmpaddr15) },

Could this repetition be reduced with an array and a loop somehow?

Today if an implementation wants to change how many PMP registers are implemented from 16 to 8, there are 4 places in this file and a total of 20 lines that need to be edited. That should be one integer parameter somewhere.

Improvement for what? Are there concrete language features you can think of?

Reducing repetition and necessary copypasta. I'm no language designer and I don't know Sail well enough to make intelligent recommendations, but the above examples illustrate the issues.

I agree that, if generated code where the generator and input continue to exist is present in the model, the generator and input must be considered the preferred source form and the Sail not be hand-edited, if committed at all (depends on how clunky the generator is to run during a build, and what its dependencies are). I am not convinced that is the right direction for the model though.

I'm not convinced either.

@jrtc27
Copy link
Collaborator

jrtc27 commented Nov 29, 2021

I also don't see what the problem with CSRs is, you add one line to give its name, one line to implement its read behaviour, one line to implement its write behaviour and one line to say it exists (which you could get rid of at the expense of more code elsewhere, so whilst it's not the cleanest conceptually it's probably the one that has the least code).

Which requires each CSR's address to appear 4 times in three separate files. That's not DRY!

As opposed to what? Defining a named constant that gets repeated 4 times? Not really any different. DRY is not the be all and end all, sometimes a bit of repetition is better than taking it to the extreme. Whether or not that applies here is a different matter.

Look at pmpaddr:

(0x3B0, _) => { pmpaddr0 = pmpWriteAddr(pmpLocked(pmp0cfg), pmpTORLocked(pmp1cfg), pmpaddr0, value); Some(pmpaddr0) },
(0x3B1, _) => { pmpaddr1 = pmpWriteAddr(pmpLocked(pmp1cfg), pmpTORLocked(pmp2cfg), pmpaddr1, value); Some(pmpaddr1) },
(0x3B2, _) => { pmpaddr2 = pmpWriteAddr(pmpLocked(pmp2cfg), pmpTORLocked(pmp3cfg), pmpaddr2, value); Some(pmpaddr2) },
(0x3B3, _) => { pmpaddr3 = pmpWriteAddr(pmpLocked(pmp3cfg), pmpTORLocked(pmp4cfg), pmpaddr3, value); Some(pmpaddr3) },
(0x3B4, _) => { pmpaddr4 = pmpWriteAddr(pmpLocked(pmp4cfg), pmpTORLocked(pmp5cfg), pmpaddr4, value); Some(pmpaddr4) },
(0x3B5, _) => { pmpaddr5 = pmpWriteAddr(pmpLocked(pmp5cfg), pmpTORLocked(pmp6cfg), pmpaddr5, value); Some(pmpaddr5) },
(0x3B6, _) => { pmpaddr6 = pmpWriteAddr(pmpLocked(pmp6cfg), pmpTORLocked(pmp7cfg), pmpaddr6, value); Some(pmpaddr6) },
(0x3B7, _) => { pmpaddr7 = pmpWriteAddr(pmpLocked(pmp7cfg), pmpTORLocked(pmp8cfg), pmpaddr7, value); Some(pmpaddr7) },
(0x3B8, _) => { pmpaddr8 = pmpWriteAddr(pmpLocked(pmp8cfg), pmpTORLocked(pmp9cfg), pmpaddr8, value); Some(pmpaddr8) },
(0x3B9, _) => { pmpaddr9 = pmpWriteAddr(pmpLocked(pmp9cfg), pmpTORLocked(pmp10cfg), pmpaddr9, value); Some(pmpaddr9) },
(0x3BA, _) => { pmpaddr10 = pmpWriteAddr(pmpLocked(pmp10cfg), pmpTORLocked(pmp11cfg), pmpaddr10, value); Some(pmpaddr10) },
(0x3BB, _) => { pmpaddr11 = pmpWriteAddr(pmpLocked(pmp11cfg), pmpTORLocked(pmp12cfg), pmpaddr11, value); Some(pmpaddr11) },
(0x3BC, _) => { pmpaddr12 = pmpWriteAddr(pmpLocked(pmp12cfg), pmpTORLocked(pmp13cfg), pmpaddr12, value); Some(pmpaddr12) },
(0x3BD, _) => { pmpaddr13 = pmpWriteAddr(pmpLocked(pmp13cfg), pmpTORLocked(pmp14cfg), pmpaddr13, value); Some(pmpaddr13) },
(0x3BE, _) => { pmpaddr14 = pmpWriteAddr(pmpLocked(pmp14cfg), pmpTORLocked(pmp15cfg), pmpaddr14, value); Some(pmpaddr14) },
(0x3BF, _) => { pmpaddr15 = pmpWriteAddr(pmpLocked(pmp15cfg), false, pmpaddr15, value); Some(pmpaddr15) },

Could this repetition be reduced with an array and a loop somehow?

That's just an example of bad code, not a reason to tear up the entire model's CSR handling. But yes, the PMP implementation (don't look at riscv_pmp_control, there's more like that) should be improved, but you don't need a generator for that.

Today if an implementation wants to change how many PMP registers are implemented from 16 to 8, there are 4 places in this file and a total of 20 lines that need to be edited. That should be one integer parameter somewhere.

You can leave the read/write/name implementations there, you just need to change ext_is_CSR_defined.

Improvement for what? Are there concrete language features you can think of?

Reducing repetition and necessary copypasta. I'm no language designer and I don't know Sail well enough to make intelligent recommendations, but the above examples illustrate the issues.

I guess arrays of registers could help; currently you'd have to emulate that by packing the entire array into a single register, which gets awkward for formal reasoning.

I agree that, if generated code where the generator and input continue to exist is present in the model, the generator and input must be considered the preferred source form and the Sail not be hand-edited, if committed at all (depends on how clunky the generator is to run during a build, and what its dependencies are). I am not convinced that is the right direction for the model though.

I'm not convinced either.

@scottj97
Copy link
Contributor

As opposed to what? Defining a named constant that gets repeated 4 times? Not really any different.

As opposed to defining each CSR's complete behavior in one place instead of 4, as #128 tries to do.

That's just an example of bad code, not a reason to tear up the entire model's CSR handling. But yes, the PMP implementation (don't look at riscv_pmp_control, there's more like that) should be improved, but you don't need a generator for that.

That's great to hear! See, I thought this ugly code was due to limitations of the language, and that's why a generator would be helpful.

You can leave the read/write/name implementations there, you just need to change ext_is_CSR_defined.

I don't think that works, because the PMP spec says if the registers are not implemented then they become read-only 0. At the very least wouldn't you have to change the write? And pmpcfg also changes with this number. But this is a nitpicky distraction. The question is: can we implement this entirely in Sail having a single integer parameter to specify the number of PMP registers? If so, great! No benefit to having this code generated.

@jrtc27
Copy link
Collaborator

jrtc27 commented Nov 29, 2021

As opposed to what? Defining a named constant that gets repeated 4 times? Not really any different.

As opposed to defining each CSR's complete behavior in one place instead of 4, as #128 tries to do.

That's just an example of bad code, not a reason to tear up the entire model's CSR handling. But yes, the PMP implementation (don't look at riscv_pmp_control, there's more like that) should be improved, but you don't need a generator for that.

That's great to hear! See, I thought this ugly code was due to limitations of the language, and that's why a generator would be helpful.

FWIW the other instance of ugly code I'm aware of is the page table walker, which has copy/pasted implementations for Sv39 and Sv48 (and Sv32, though that's at least more different than just "number of levels"). The language does limit how you can write that somewhat (in C++ you'd just shove templates left, right and centre) but there are still ways to unify them (scattered unions and functions should provide a path to making it work, but failing that there's always the preprocessor including a Sail template multiple times with different $defines).

You can leave the read/write/name implementations there, you just need to change ext_is_CSR_defined.

I don't think that works, because the PMP spec says if the registers are not implemented then they become read-only 0. At the very least wouldn't you have to change the write? And pmpcfg also changes with this number.

If you change is_CSR_defined then read and write will give an illegal instruction exception before attempting to do anything else. But, true, for PMP registers it gets more complicated, so that particular approach doesn't quite work, because it's not that the CSR doesn't exist, it's that it exists with a hard-wired value (architecturally; whether or not a 0-bit register really exists or not is a separate philosophical question).

But this is a nitpicky distraction. The question is: can we implement this entirely in Sail having a single integer parameter to specify the number of PMP registers? If so, great! No benefit to having this code generated.

Almost. You need to manually define the 16 register variables, and read and write functions, just like we do for X registers (which is really just emulating arrays of registers, hence why I could see that being a useful language feature), and because I'm a lazy programmer I'd probably write a single shell for loop to spit out that code, that or use a vim macro. But everything else can be suitably abstracted such that you set N somewhere between 0 and 16 and it just works (with some of the defined registers just not being used if N < 16); that could be static or dynamic, both would work.

@allenjbaum
Copy link
Collaborator

allenjbaum commented Nov 29, 2021 via email

@PeterSewell
Copy link
Collaborator

PeterSewell commented Nov 30, 2021 via email

@martinberger
Copy link
Collaborator Author

"if option57=read-only then ....".

Which can be auto-generated, too.

We have a lot of new extensions coming to be ratified. And we want to improve the testing process. What's the best way forward? Maybe we can try and bullet-point the advantages and disadvantages of the proposed approaches, so they are in one easily accessible place?

@PeterSewell
Copy link
Collaborator

PeterSewell commented Nov 30, 2021 via email

@martinberger
Copy link
Collaborator Author

martinberger commented Nov 30, 2021

Sure, one can - but something has to be the actual source; the question is what, and what affordances it offers.

I think the actual source is not that important, as long as it generates the correct output. So I think we should be clear what the requirements are. The uncontroversial requirements are:

  • Generates full Sail that can remain the (human readable) source (for the forseeable future, some/most Sail will remain handwritten anyway)
  • Full automation, including scriptability from GH actions

Things become more tricky with the additional requirements. Like what other backends are we targetting (tests, Spike/ QEMU glue etc)

@PeterSewell
Copy link
Collaborator

PeterSewell commented Nov 30, 2021 via email

@martinberger
Copy link
Collaborator Author

generating a single parameterised definition.

I mean a single parameterised definition. What I'm not 100% clear about is, what shape this could have.

@allenjbaum
Copy link
Collaborator

allenjbaum commented Dec 8, 2021 via email

@martinberger
Copy link
Collaborator Author

A meta question: how much to we want do delay deployment of Sail code to meet the coding guidlelines, vs. fixing them later?

This is an great question. From a technical POV, we can phase in changes gradually, requiring that new code is (at least in principle) equivalent to old code. I would be interested to hear from industry about their preferences; what's more important:

  • getting extensions as Sail as early as possible at the cost of that Sail changing, or, alternatively,
  • stability of the Sail code base?

@allenjbaum
Copy link
Collaborator

There are a couple of discussion point that I wanted to comment on that I lost track of. A couple rare:

The shape of the options could be relevant, e.g. if some options have to
be chosen from a space so complex that one want to describe each possible
value using a more-or-less general-purpose algorithmic language. If that
is the case, then, well, we have such a language to hand...

The problem isn't that Sail isn't capable; it must be if we have a generator that produce Sail code from YAML.
The problem is that every implementation would need to write the Sail code for their particular WARL definition for every CSR WARL field.
We can't control that. We can't even easily determine if it is only handling a CSR. So we don't allow that.
We can control the syntax of what we will test for in YAML though, and construct a generator to generate whatever it takes to compile that into a specialized Sail function.
The YAML descriptions are reasonably terse and understandable.
A generic function that can parse that would be a rather difficult exercise in Sail, and not understandable (by mortals) because it is generic. The generic parameterized function would be huge, while the actions it produces are much smaller.
The specialized generated code for those could be understandable if it's written by someone that knows how to write Sail properly.
As we've seen, that's a rather select few that I don't want to overload, and I'd like to reduce that load by removing the burden and having a generator do it

You seem to be focussing exclusively on the scenario of using the model to
test a specific implementation, but that is far from the only purpose of
the architecture ..

Well yes; but I'm nominally in charge for the ACTs, and this is what we need for that.
I don't know how we prove something about the architecture if relevant bits of it are user-definable.
But if we can prove something with per-CSR-field handwritten custom Sail, we can prove the same thing if its' per-CSR field generated custom Sail. You would still need to re-prove some things for every implementation if the proof depended on that CSR R/W behavior. We do recompile the Sail model for every implementation

Specifically for PMP registers: a loop would require a bitmap of writable registers, among other feature bits (grain size, physical address size, 16 vs. 64 registers, a bitmap (possibly per register) as to which bits are read/write vs readonly0 vs readonly1, or the allowed address range, or arbitrary combinations of both, for both pmpaddr and pmpcfg. In practice, those will (usually - not always) be identical for each entry - but that isn't required by the spec. WARL is pretty ugly to deal with.

@PeterSewell
Copy link
Collaborator

PeterSewell commented Dec 22, 2021 via email

@allenjbaum
Copy link
Collaborator

allenjbaum commented Dec 22, 2021 via email

@PeterSewell
Copy link
Collaborator

PeterSewell commented Dec 22, 2021 via email

@allenjbaum
Copy link
Collaborator

allenjbaum commented Dec 22, 2021 via email

@martinberger
Copy link
Collaborator Author

can be completely arbitrary,

@allenjbaum By "completely arbitrary", you mean Turing-complete (which means having recursion or some equivalent mechanism)? The example you link to

  • [dependency-vals] -> field-name[index-hi:index-lo] in [legal-values]
  • [dependency-vals] -> field-name[index-hi:index-lo] bitmask [mask, fixedval]

look very specific to me.

@allenjbaum
Copy link
Collaborator

allenjbaum commented Dec 22, 2021 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants