nfa: improve construction times for small automatons #126

Merged 3 commits on Aug 29, 2023

Conversation

BurntSushi
Owner

It turns out that #121 introduced a relatively sizeable performance regression when building very small automatons. Namely, several of the steps in the construction process took worst case O(n^2) time, where n corresponds to the alphabet size (255 in this case). This ends up not being too awful when the automaton is big (a lot of patterns), but it adds fairly sizeable overhead in the case of small automatons.

We fix this by making these methods take linear time instead. This makes things a little more complicated, and perhaps there is a better abstraction to make this simpler.

This was found by Ruff's benchmark suite:
astral-sh/ruff#6964
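
To illustrate the kind of quadratic-versus-linear difference described above, here is a minimal sketch. It is not the crate's actual code: `SparseState`, `set_one`, `fill_all_slow` and `fill_all_fast` are hypothetical names, and the sketch assumes a state whose transitions are kept in a byte-sorted list that must be scanned from the front on every insert, with all 256 byte values as the alphabet.

```rust
/// Hypothetical sparse state: transitions kept sorted by byte.
/// Illustrative only; this is an assumption, not the crate's real layout.
#[derive(Debug, Default)]
struct SparseState {
    /// (byte, next state id) pairs, sorted by byte.
    transitions: Vec<(u8, u32)>,
}

impl SparseState {
    /// Insert or overwrite a single transition by scanning from the
    /// front of the sorted list. Each call is O(len).
    fn set_one(&mut self, byte: u8, next: u32) {
        for i in 0..self.transitions.len() {
            let (b, _) = self.transitions[i];
            if b == byte {
                self.transitions[i].1 = next;
                return;
            }
            if b > byte {
                self.transitions.insert(i, (byte, next));
                return;
            }
        }
        self.transitions.push((byte, next));
    }

    /// Worst case O(n^2) in the alphabet size: one scan per byte.
    fn fill_all_slow(&mut self, next: u32) {
        for byte in 0..=255u8 {
            self.set_one(byte, next);
        }
    }

    /// O(n): rebuild the whole sorted list in a single pass.
    fn fill_all_fast(&mut self, next: u32) {
        self.transitions.clear();
        self.transitions.extend((0..=255u8).map(|b| (b, next)));
    }
}

fn main() {
    let mut slow = SparseState::default();
    slow.fill_all_slow(1);

    let mut fast = SparseState::default();
    fast.fill_all_fast(1);

    // Both approaches produce the same transitions; only the cost differs.
    assert_eq!(slow.transitions, fast.transitions);
    println!("{} transitions per state", fast.transitions.len());
}
```

The per-state cost is fixed by the alphabet size, so it is paid even when there are only a handful of states, which is consistent with the overhead showing up mostly for small automatons.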

This mirrors the same change made for the regex crate.[1]

[1]: rust-lang/regex@c788378

I added this while hacking around in the non-contiguous NFA for the
1.0.4 release and forgot to remove it. So we do a little clean-up.

@BurntSushi BurntSushi merged commit 3ae537c into master Aug 29, 2023
12 checks passed
@BurntSushi
Owner Author

This PR is on crates.io in aho-corasick 1.0.5.
