Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of excluded files filter #5157

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

SimplyDanny
Copy link
Collaborator

The current algorithm is like "collect all included files and subtract all excluded files". Collecting all included and all excluded files relies on the file system. This can become slow when the patterns used to exclude files resolve to a large number of files.

The new approach only collects all lintable files and checks them against the exclude patterns. This can be done by in-memory string-regex-match and does therefore not require file system accesses.

The most critical part is the conversion of glob patterns to regular expressions. I might have missed cases.

Fixes #5018.

@SimplyDanny SimplyDanny requested a review from keith August 5, 2023 20:04
@SwiftLintBot
Copy link

SwiftLintBot commented Aug 5, 2023

17 Messages
📖 Linting Aerial with this PR took 1.12s vs 1.13s on main (0% faster)
📖 Linting Alamofire with this PR took 1.61s vs 1.64s on main (1% faster)
📖 Linting Brave with this PR took 9.35s vs 9.37s on main (0% faster)
📖 Linting DuckDuckGo with this PR took 4.74s vs 4.99s on main (5% faster)
📖 Linting Firefox with this PR took 11.72s vs 11.82s on main (0% faster)
📖 Linting Kickstarter with this PR took 11.37s vs 11.35s on main (0% slower)
📖 Linting Moya with this PR took 0.63s vs 0.64s on main (1% faster)
📖 Linting NetNewsWire with this PR took 3.34s vs 3.31s on main (0% slower)
📖 Linting Nimble with this PR took 0.92s vs 0.92s on main (0% slower)
📖 Linting PocketCasts with this PR took 9.17s vs 9.17s on main (0% slower)
📖 Linting Quick with this PR took 0.42s vs 0.42s on main (0% slower)
📖 Linting Realm with this PR took 5.84s vs 5.73s on main (1% slower)
📖 Linting Sourcery with this PR took 2.84s vs 2.86s on main (0% faster)
📖 Linting Swift with this PR took 5.61s vs 5.54s on main (1% slower)
📖 Linting VLC with this PR took 1.56s vs 1.54s on main (1% slower)
📖 Linting Wire with this PR took 20.87s vs 20.84s on main (0% slower)
📖 Linting WordPress with this PR took 13.73s vs 13.65s on main (0% slower)

Generated by 🚫 Danger

@ileitch
Copy link

ileitch commented Nov 2, 2023

Periphery had a similar performance issue not long ago, and I noticed there wasn't a solid glob to regex implementation, so I ported Python's fnmatch to Swift: https://github.com/ileitch/swift-filename-matcher. It might be useful here too.

@jpsim
Copy link
Collaborator

jpsim commented Nov 2, 2023

Last time we tried to speed this up, it caused some slight differences in the result of what was matched vs not, so please be super careful here.

@SimplyDanny
Copy link
Collaborator Author

At the moment, I'm rather concerned here that normal runs without any excludes and includes seem to become much slower sometimes.

Periphery had a similar performance issue not long ago, and I noticed there wasn't a solid glob to regex implementation, so I ported Python's fnmatch to Swift: https://github.com/ileitch/swift-filename-matcher. It might be useful here too.

This is a very helpful tip. I don't want to invent a half-backed version myself. Thanks!

@JaviSoto
Copy link
Contributor

Any chance this can land? 🙏

@SimplyDanny
Copy link
Collaborator Author

Any chance this can land? 🙏

This is a critical change that needs thorough testing. Unfortunately, I'm lacking own projects with nifty included and excluded specifications.

@JaviSoto: In case this change took effect in your projects, I'd appreciate your feedback.

The current algorithm is like "collect all included files and subtract all excluded files".
Collecting all included and all excluded files relies on the file system. This can become slow
when the patterns used to exclude files resolve to a large number of files.

The new approach only collects all lintable files and checks them against the exclude patterns.
This can be done by in-memory string-regex-match and does therefore not require file system accesses.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Excluded files impact the performance of swiftlint
5 participants