-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove AST-node dependency from FunctionType
and ClassType
#14087
Conversation
ScopeId::node
to Scope
to avoid depending on AST nodeType
817ee80
to
7908514
Compare
|
079e46a
to
1fc3dc4
Compare
CodSpeed Performance ReportMerging #14087 will improve performances by 12.36%Comparing Summary
Benchmarks breakdown
|
Lol what |
1fc3dc4
to
a1a709f
Compare
Type
FunctionType
and ClassType
a1a709f
to
6d5f492
Compare
/// would depend on the function's AST and rerun for every change in that file. | ||
pub fn return_ty(self, db: &'db dyn Db) -> Type<'db> { | ||
let scope = self.body_scope(db); | ||
let function_stmt_node = scope.node(db).expect_function(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue without this being a salsa query is that the calling-query suddenly depends on the function's ast (and, therefore, the file).
In the case of our benchmark: _parser.py
calls match_to_datetime
which is defined in _re.py
. Now, it shouldn't be necessary to recheck _parser.py
because no type in _re.py
changed. However, Salsa had to recheck every callsite because the return_type
query accessed the definition.kind(db)
which is the AST node of the match_to_datetime
and its AST has changed (not really, but we use no_eq
...)
I verified that making return_ty
is the reason for the perf improvement by removing the #[salsa::tracked]
.
0b80226
to
6d5f492
Compare
f93cea2
to
4bb6209
Compare
4bb6209
to
a4f8f95
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
crates/red_knot_python_semantic/src/semantic_index/definition.rs
Outdated
Show resolved
Hide resolved
crates/red_knot_python_semantic/src/semantic_index/expression.rs
Outdated
Show resolved
Hide resolved
a4f8f95
to
51a9983
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM other than my remaining docs nits and one further optional comment:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code changes here look great, thanks for sorting this out!
My main concern is that I think the guidance here is going to prove quite difficult to remember and follow manually and solely via code review; it's just very easy for these things to leak. It seems like it should be possible to write tests that are similar to what happen in our benchmark, just verifying that if you make a no-op change to one file, it doesn't cause re-execution of queries in another file that shouldn't need to re-execute. I think a few tests like that could go a really long way in helping us avoid regression here.
crates/red_knot_python_semantic/src/semantic_index/definition.rs
Outdated
Show resolved
Hide resolved
# Conflicts: # crates/red_knot_python_semantic/src/types.rs
Co-authored-by: Alex Waygood <[email protected]>
@carljm I agree that a test would be nice. I added one for call inference although I'm not sure how valuable it is because it is rather fragile (a slight change to inference will break the test itself). I don't know how to write the ideal test that:
The problem here is that salsa doesn't provide the necessary reflection API to answer the second question. We could log all events but that test is likely going to break with every inference test AND it's kind of impossible to tell from a I don't mind adding more tests if you have suggestions on how to test this, but I'll leave it at what I have for now. I do think that our benchmarks do a decent job at this anyway. I would hope that we would investigate a 12% perf regression ;) I don't think there's a way to test deserialization in a meaningful before salsa adds persistent caching support. |
f3bec2d
to
26cd7ed
Compare
26cd7ed
to
969461f
Compare
Co-authored-by: Alex Waygood <[email protected]>
Summary
This PR improves Red Knot's incremental computation by:
FunctionType::return_type
andClassType::explicit_bases
salsa queries. I suspect this is where the main performance improvement comes from. Making these two functions salsa queries prevents that the class or function's node (accessed throughdefinition.node
) is marked as a dependency at the callee side. Instead, the callee only depends on whatever the return type or bases are.FunctionType::definition
andClassType::definition
and instead usesemantic_index.definition(..)
to lookup the type's definition lazily. This ensures that deserializing a type doesn't require deserializing the AST as well.ScopeId::node
and move it toScope::node
. Same as forFunctionType::definition
. Moving theNode
from theScopeId
to the definition allows deserializing theScopeId
without deserializing the AST node. This is important because bothFunctionType
andClassType
have aScopeId
field.Fixes #14054
Test Plan
cargo test