Needed changes for census profiling #2074
Conversation
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## main #2074 +/- ##
==========================================
- Coverage 77.68% 67.72% -9.97%
==========================================
Files 137 51 -86
Lines 10680 2528 -8152
Branches 214 215 +1
==========================================
- Hits 8297 1712 -6585
+ Misses 2297 718 -1579
- Partials 86 98 +12
Flags with carried forward coverage won't be shown.
profiler/data.py
Outdated
key2 = data.timestamp

filename = f"{self.path}/{key}/{key2}.json"
if os.path.exists(filename):
    os.remove(filename)
The following open() will overwrite the file, so is it necessary to remove it first?
I see your point. Unfortunately, the API that mount-s3 provides is not fully POSIX-compatible and only allows writing new files, unless an additional flag is supplied both at installation time and in the code. To avoid all that and keep the code general, I decided to do it this way.
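A minimal sketch of that write path under the mount-s3 constraint, based on the snippet above (the function name and the makedirs call are additions for self-containment, not the PR's code):

```python
import json
import os


def write_record(base_path: str, key: str, key2: str, record: dict) -> None:
    """Write one profiling record as JSON under <base_path>/<key>/<key2>.json.

    mount-s3 exposes a non-POSIX write path that only supports creating
    new objects, so an existing file is removed rather than overwritten
    in place.
    """
    filename = f"{base_path}/{key}/{key2}.json"
    # Added for a self-contained sketch; the PR assumes the directory exists.
    os.makedirs(os.path.dirname(filename), exist_ok=True)
    if os.path.exists(filename):
        os.remove(filename)  # required for mount-s3 compatibility
    with open(filename, "w") as f:
        json.dump(record, f)
```

The remove-before-write keeps a single code path that works both on a local filesystem and on an S3 mount that forbids in-place overwrites.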
profiler/__main__.py
Outdated
@@ -1,4 +1,4 @@
-from .profiler import main
+import profiler
Did the relative imports cause trouble, or did you want to specify the module prefix for the imported names?
No, the relative imports caused trouble, actually. I could not figure out the exact issue but managed to fix it this way.
Try running the profiler as a module: python -m profiler. If that works, I would reinstate the relative imports.
Done and all working great now with relative imports. Thanks.
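Why `python -m` makes the relative import work can be demonstrated with a throwaway package; this sketch builds a dummy profiler package on disk (not the real module) and executes it as a module, which sets the package context that `from .profiler import main` needs:

```python
import os
import subprocess
import sys
import tempfile

# Build a minimal package: profiler/{__init__.py, profiler.py, __main__.py}.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "profiler")
os.makedirs(pkg)
open(os.path.join(pkg, "__init__.py"), "w").close()
with open(os.path.join(pkg, "profiler.py"), "w") as f:
    f.write("def main():\n    print('profiler main')\n")
with open(os.path.join(pkg, "__main__.py"), "w") as f:
    # The relative import resolves because -m runs this with
    # __package__ set to "profiler".
    f.write("from .profiler import main\n\nmain()\n")

# Running the package as a module (python -m profiler) from its parent dir.
result = subprocess.run(
    [sys.executable, "-m", "profiler"], cwd=root, capture_output=True, text=True
)
print(result.stdout.strip())
```

Running the same `__main__.py` as a plain script (`python profiler/__main__.py`) would fail with "attempted relative import with no known parent package", which is consistent with the trouble described above.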
profile_report.py
Outdated
exec_time[1] = last_two[1].user_time_sec

if threshold * float(exec_time[1]) < float(exec_time[0]) or float(exec_time[1]) > threshold * float(exec_time[0]):
    raise SystemExit(f"Potential performance degradation detected {exec_time[0]} vs {exec_time[1]}")
If we allow for 10% degradation every time, we'll soon have unacceptable performance. Might we instead alert when a 10% degradation is detected compared to the first benchmark? If that makes sense, perhaps a larger degradation threshold is allowable.
Also, are the timings always for the same input data? If the data changes, we can't meaningfully compare the performance.
I see your point about the 10% and also believe 10% compared to the first performance point makes more sense.
That's very true. I am not really sure how to handle changes in the data; I think that needs some brainstorming. One option is to treat new data as a new benchmark, or to reset the benchmark data when the data changes significantly. However, neither of these can be automated. What do you think?
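The baseline idea discussed above can be sketched as follows; the function name and the 1.25 default are illustrative, not the actual script (the discussion only suggests "a larger threshold" when comparing to the first benchmark):

```python
def check_against_baseline(user_times: list[float], threshold: float = 1.25) -> None:
    """Compare the latest run to the first (baseline) run.

    `threshold` is a ratio: 1.25 allows up to 25% degradation relative
    to the baseline before failing, so repeated small regressions
    cannot compound silently the way run-over-run comparison allows.
    """
    baseline, latest = user_times[0], user_times[-1]
    if latest > threshold * baseline:
        raise SystemExit(
            f"Performance degradation vs baseline: {baseline:.3f}s -> {latest:.3f}s"
        )
```

With run-over-run comparison, a sequence like 1.0s, 1.09s, 1.19s, 1.30s passes every 10% check yet is 30% slower overall; the baseline comparison above would catch it at the third run.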
profile_report.py
Outdated
# Processes the set of previously written logs

threshold = 1.10  # Percent difference
Suggested change:
-threshold = 1.10  # Percent difference
+# The threshold (ratio) of allowable performance degradation between profiling runs
+threshold = 1.10
profile_report.py
Outdated
exec_time[0] = last_two[0].user_time_sec
exec_time[1] = last_two[1].user_time_sec

if threshold * float(exec_time[1]) < float(exec_time[0]) or float(exec_time[1]) > threshold * float(exec_time[0]):
Would alert on these two conditions separately, with specific messages. The first condition is not a "degradation", of course.
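Splitting the combined condition into two separate checks might look like this (a sketch; the function name is illustrative, and the variable roles follow the snippet above):

```python
def check_runs(prev_time: float, curr_time: float, threshold: float = 1.10) -> None:
    # Degradation: the current run is more than `threshold` times slower.
    if curr_time > threshold * prev_time:
        raise SystemExit(
            f"Potential performance degradation detected: {prev_time} vs {curr_time}"
        )
    # Suspicious speedup: the current run is much faster than the previous
    # one, which may point at a measurement or input-data problem rather
    # than a real improvement.
    if threshold * curr_time < prev_time:
        raise SystemExit(
            f"Unexpectedly large speedup detected: {prev_time} vs {curr_time}"
        )
```

Separate messages let the reader tell at a glance which direction the timing moved, instead of decoding one combined inequality.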
profile_report.py
Outdated
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("benchmark", type=str)
Would rename benchmark to command, if only to keep the terminology consistent with the rest of the Profiler code. Consider the signature: FileBaseProfileDB.find(command: str)
Sounds good
profile_report.py
Outdated
parser = argparse.ArgumentParser()
parser.add_argument("benchmark", type=str)
parser.add_argument("path", type=str)
Maybe db-path?
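With both renames applied, the argument parsing might read as below (a sketch; the description, help strings, and example values are illustrative). Using metavar keeps the CLI name db-path while the parsed attribute stays the valid identifier args.db_path:

```python
import argparse

# Sketch of the renamed arguments: `benchmark` -> `command` (matching
# FileBaseProfileDB.find(command: str)) and `path` -> `db-path`.
parser = argparse.ArgumentParser(
    description="Check stored profiling runs for performance regressions."
)
parser.add_argument("command", type=str, help="profiled command to look up")
parser.add_argument(
    "db_path", metavar="db-path", type=str, help="path to the profile database"
)

# Parse an illustrative argument list instead of sys.argv.
args = parser.parse_args(["python -m some_benchmark", "/tmp/profile-db"])
print(args.command, args.db_path)
```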
profile_report.py
Outdated
@@ -0,0 +1,31 @@
from profiler import data
Add a comment explaining the purpose and usage of the script.
profile_report.py
Outdated
Move this into the profiler dir?
Also, there is already a report.py module. Since this script is actually about detecting performance regressions, perhaps a name like perf_checker.py.
Also, would this be worth moving to the tools/perf_checker dir in the cellxgene-census repo? The decision over what constitutes a performance problem is a policy that matters to the Census (in particular the 10% threshold); this repo doesn't care about that. Potentially, you could wrap this into a Python method in profiler.py like compare_runs(previous_run_index: int = -2, current_run_index: int = -1), and then just call that from the Census "profile checker" code.
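The suggested wrapper could be sketched as follows; everything here is illustrative rather than the actual profiler API, and the timings list stands in for what a profile DB lookup for one command would return:

```python
def compare_runs(
    user_times: list[float],
    previous_run_index: int = -2,
    current_run_index: int = -1,
    threshold: float = 1.10,
) -> bool:
    """Return True if the selected current run is within `threshold`
    (a ratio) of the selected previous run.

    `user_times` stands in for per-run user times; the real method
    would fetch them from the profile DB by command. The policy value
    (the threshold) is passed in by the caller, keeping the policy
    decision out of this library.
    """
    previous = user_times[previous_run_index]
    current = user_times[current_run_index]
    return current <= threshold * previous
```

The Census-side "profile checker" would then own the threshold and the exit behavior, while this repo only provides the mechanism.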
As you suggested, this script has been moved to the cellxgene-census repo.
profile_report.py
Outdated
exec_time[1] = last_two[1].user_time_sec

if threshold * float(exec_time[1]) < float(exec_time[0]) or float(exec_time[1]) > threshold * float(exec_time[0]):
    raise SystemExit(f"Potential performance degradation detected {exec_time[0]} vs {exec_time[1]}")
When a performance problem is detected, is there more information that could be output which would be useful to the reader? Maybe just print out both of the ProfileData objects, since that's probably the very next thing the reader would want to see.
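A sketch of that richer output, with an illustrative ProfileData stand-in (the real class and its fields may differ):

```python
from dataclasses import dataclass


@dataclass
class ProfileData:
    # Illustrative fields only; the actual ProfileData may differ.
    command: str
    timestamp: int
    user_time_sec: float


def check_degradation(prev: ProfileData, curr: ProfileData, threshold: float = 1.10) -> None:
    if float(curr.user_time_sec) > threshold * float(prev.user_time_sec):
        # Print both full records so the reader immediately sees the
        # commands, timestamps, and timings involved, not just two numbers.
        print(f"previous run: {prev}")
        print(f"current run:  {curr}")
        raise SystemExit(
            f"Potential performance degradation detected "
            f"{prev.user_time_sec} vs {curr.user_time_sec}"
        )
```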
Great suggestion
profiler/src/profiler/profiler.py
Outdated
@@ -183,12 +191,13 @@ def main():
print(
    f"Third profiler {args.prof2} missing output flamegraph file location"
)

args.command.split(" ")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line looks like a no-op.
profiler/src/profiler/profiler.py
Outdated
p_stdout, p_stderr = p.communicate()
if p1 is not None:
    p1.wait()
if p2 is not None:
    p2.wait()
print("Done profiler run", file=stderr)
Suggested change:
-print("Done profiler run", file=stderr)
+print("Profiler run done", file=stderr)
Needed changes for census profiling
* Including reporting script for detecting performance violations
* An S3 based Database implementation
Co-authored-by: beroy <[email protected]>
Issue and/or context:
Currently, the census code does not go through performance checks.