Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New dataflow summary format #152

Merged
merged 2 commits into from
Mar 31, 2025
Merged

Conversation

victornicolet
Copy link
Contributor

Implements the new format for dataflow summaries described in #131 .
The new format is easier to read and more high-level than the previous format. This PR also adds a lot of validation and error reporting when using the dataflow summaries, making it easier to get them right.

  • Added some standard library packages to support go1.24


// IContract is a contract for an interface: it applies to each implmentation of the
// interface's methods
type IContract struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the name IContract makes this seem like an interface type (at least from a dotNET naming convention perspective). I think spelling out InterfaceContract or IfaceContract isn't too long and will make the name more clear

"golang.org/x/tools/go/ssa"
)

// RawContract is the type of contracts for serialization/deserialization that gets compiled
// into either an interface or a function contract
type RawContract struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can this be private somehow so the implementation details of parsing a contract don't get leaked to the caller?

}

// FContract is a function contract, it applies only to that set of function implementations
type FContract struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think spelling out FunctionContract isn't that long and will help with readability. Alternatively Fn or FuncContract could work

data, err := loadRawIndexedFormat(fileName)
if err == nil {
// Loaded successfully using the legacy (with indices) format; return
return data, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we print a warning here that the user should upgrade to the new string-based format?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but we'll need to add something in the documentation before doing that. Let's wait a little until we have a stable new format.

return nil, fmt.Errorf("receiver %s is set, but method is empty", s.Receiver)
}
// Can't have a function and a method
if s.Function != "" && s.Method != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to distinguish between functions and methods in the summaries file? I think we should just stick with methods because we treat functions as methods when parsing a config.CodeIdentifier

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the spirit of pushing for more constrained formats for specifications like you started doing, I think differentiating is better.
In this case, it allows checking that receiver are only used when methods are used. The function string that is internally used is also generated differently depending on whether it's a method or a function.

if err == nil {
t.Error("Expected error")
}
expectedSubstrs := []string{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: testing error strings is error-prone and will be painful to maintain if we change the error messages in the future. Instead, you can return a public error type like ErrEmptyPackage from ParseSummariesFile and then use errors.Is in the test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case I'm testing the user interface, I want to avoid someone to inadvertently change the error messages that should be surfaced to the user. I can add a comment in the tests. I don't want to add the error types just for testing; maybe if we need more error handling in the non-testing code then yes.

// See the License for the specific language governing permissions and
// limitations under the License.

package summaries
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: having a separate test package summaries_test is nicer because it forces you to only test the package's public API

@victornicolet victornicolet force-pushed the new-dataflow-summary-format branch from 5d83ca8 to 9bb1810 Compare March 31, 2025 15:48
@victornicolet victornicolet merged commit 636aa19 into mainline Mar 31, 2025
2 checks passed
@samarth-aws samarth-aws deleted the new-dataflow-summary-format branch April 3, 2025 14:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants