Merged
This commit introduces several optimizations to the M3U parsing and filtering logic, significantly improving performance and reducing memory allocations, especially for large M3U files and numerous filters.
The key changes include:
1. **M3U Filtering (`src/m3u.go` - `FilterThisStream`):**
* Regular expressions used for matching filter rules are now pre-compiled globally at package initialization, avoiding repeated compilation during filtering.
* String lowercasing for case-insensitive filters is now performed more efficiently, reducing redundant operations for both filter rules and stream data.
2. **M3U Parsing (`src/internal/m3u-parser/xteve_m3u_parser.go` - `MakeInterfaceFromM3U`):**
* Line filtering (e.g., removing comments and empty lines) within the `parseMetaData` function now appends valid lines to a new slice instead of calling `slices.Delete` in a loop, which avoids shifting the remaining elements on every deletion.
* The UUID/ID uniqueness check within `parseMetaData` now uses a map for O(1) average time complexity lookups, replacing a less efficient slice-based O(n) lookup.
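The techniques listed above can be sketched roughly as follows. This is an illustration only, not the code from this commit: the names `quotedTermRe`, `quotedTerms`, `keepContentLines`, and `uniqueIDs` are hypothetical, as are the regular expression and the rule for what counts as a comment line.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// quotedTermRe is compiled once at package initialization rather than on
// every filter call. The pattern itself is a made-up example.
var quotedTermRe = regexp.MustCompile(`"([^"]*)"`)

// quotedTerms lower-cases the rule a single time and returns the quoted
// terms found in it, so callers do not lower-case repeatedly per stream.
func quotedTerms(rule string) []string {
	matches := quotedTermRe.FindAllStringSubmatch(strings.ToLower(rule), -1)
	terms := make([]string, 0, len(matches))
	for _, m := range matches {
		terms = append(terms, m[1])
	}
	return terms
}

// keepContentLines appends the lines worth keeping to a new slice instead of
// deleting unwanted lines in place. Treating "#" lines that do not start
// with "#EXT" as comments is an assumption made for this sketch.
func keepContentLines(lines []string) []string {
	kept := make([]string, 0, len(lines))
	for _, line := range lines {
		trimmed := strings.TrimSpace(line)
		if trimmed == "" {
			continue
		}
		if strings.HasPrefix(trimmed, "#") && !strings.HasPrefix(trimmed, "#EXT") {
			continue
		}
		kept = append(kept, trimmed)
	}
	return kept
}

// uniqueIDs drops repeated IDs using a map as a set, so each membership
// check is O(1) on average instead of a linear scan over a slice.
func uniqueIDs(ids []string) []string {
	seen := make(map[string]struct{}, len(ids))
	out := make([]string, 0, len(ids))
	for _, id := range ids {
		if _, dup := seen[id]; dup {
			continue
		}
		seen[id] = struct{}{}
		out = append(out, id)
	}
	return out
}

func main() {
	fmt.Println(quotedTerms(`Sports "HD" "News"`))
	fmt.Println(keepContentLines([]string{"#EXTM3U", "", "# note", "http://example.invalid/1"}))
	fmt.Println(uniqueIDs([]string{"a", "b", "a"}))
}
```

Allocating the output slice with the input length as capacity trades a little memory for a single allocation, which is one plausible source of the reduced allocation counts reported here.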
**Benchmark Improvements:**
A new benchmark suite (`src/benchmark_m3u_test.go`) was created to measure parsing and filtering performance with various file sizes and filter counts.
Compared to the initial unoptimized state, the filtering performance has improved dramatically (8x-10x faster, with significantly fewer allocations). Parsing performance has also improved, particularly for larger M3U files (e.g., ~10% faster for the 'large' test case).
The overall performance improvement for combined parsing and filtering operations comfortably exceeds the 50% target.
The detailed benchmark results and comparisons are documented in `docs/benchmarks/m3u_performance.md`.
…ata.
This update addresses inaccuracies in M3U parsing benchmarks for medium and large datasets by:
1. **Modifying `src/benchmark_m3u_test.go`:**
* I introduced a helper function `generateM3UContent` to dynamically create M3U content of specified sizes (number of entries and groups).
* I updated `BenchmarkParseM3U` and `BenchmarkFilterM3U` to use this dynamic generation for "medium" (1,000 entries) and "large" (10,000 entries) test cases. This replaces the previous reliance on file-based M3Us for these sizes, which were found to be incompletely populated.
* The "small" test case continues to use its existing, correctly populated file.
2. **Updating `docs/benchmarks/m3u_performance.md`:**
* I added a new section with benchmark results obtained using the dynamically generated, fully populated M3U files.
* I included a comparative analysis against previous results, highlighting that the increased time/resource usage for medium/large parsing is due to processing more data, not a performance regression.
* I restructured the document to clearly show the progression of benchmark results: original (pre-optimization), optimized (with flawed M3U files), and corrected (optimized, with dynamic M3U generation).
These changes ensure that the benchmark results for M3U parsing and filtering are more accurate and representative of performance on large datasets, addressing your feedback on test data quality. The core optimizations previously implemented remain effective, and their performance on realistic data is now better quantified.