generated from NOAA-OWP/owp-open-source-project-template
-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update Formulation_Manager.hpp to use boost regex #843
Open
JoshCu
wants to merge
2
commits into
NOAA-OWP:master
Choose a base branch
from
JoshCu:boost_regex
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about something like this?
I just copied the code that stats the file after the regex match.
I tested it just using std and it matched the performance of adding {{id}} to the path variable, 27s serial 3.2s parallel 56x
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correction* 3.7s in 56x parallel. So not as fast as changing
path
to accept {{id}}, but probably only because I didn't stat the file when I did that, I just accepted the path without verifying if it existed. the speed difference serially is ~27.2 vs ~27.8 but my testing it just manually running it 5 times so I haven't been including the decimal place in the longer timingsThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just noticed an apparent bug in that code. We probably shouldn't be checking
S_ISREG
, since that would (I think) exclude symbolic links, which should be legitimate, as long as they point at an existent file.All of this code kinda suffers from the syndrome of "asking permission, rather than forgiveness" - we should ideally just be trying to directly open the files in question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I misunderstood the original code a bit too, it's looping over all the files and checking matches, then returning at the first match it finds right? I mistook it to return all matching files. If it's only returning the first match I can have a go at reworking it to
The line between performance optimization and hack is blurry but I was wondering if it would be faster to get a string of all filenames, separated by some illegal filename character like hash or pipe, then run regex once against that string to pull out the matching files. Rather than running regex once per file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hopefully this is more clear than my fumbling with suggested changes master...JoshCu:ngen:open_files_first_ask_questions_later
How about this? I modified it to just try and open the file immediately like you suggested, then if that fails fall back to the regex matching, which also just tries to open the file rather than using stat. The bit at the top of the diff is just inverting the
if directory != nullptr
logic to unindent the code one layer.In terms of performance, it's fractionally slower (~3% serial, ~10% Parallelx55 with a lot of deviation) than modifying path to accept {{id}} and bypassing the entire directory opening and regex section #843 (comment)
I'm not a fan of littering file and directory closes around before every possible exit path so if there's some nicer way to consolidate the cleanup I'd love to hear it.