Feat/optimize m3u parsing filtering by ted-gould · Pull Request #8 · ted-gould/xTeVe

ted-gould · 2025-05-24T18:06:03Z

No description provided.

This commit introduces several optimizations to the M3U parsing and filtering logic, significantly improving performance and reducing memory allocations, especially for large M3U files and numerous filters. The key changes include:

1. **M3U Filtering (`src/m3u.go` - `FilterThisStream`):**
   * Regular expressions used for matching filter rules are now pre-compiled globally at package initialization, avoiding repeated compilation during filtering.
   * String lowercasing for case-insensitive filters is now performed more efficiently, reducing redundant operations for both filter rules and stream data.
2. **M3U Parsing (`src/internal/m3u-parser/xteve_m3u_parser.go` - `MakeInterfaceFromM3U`):**
   * Line filtering (e.g., removing comments and empty lines) within the `parseMetaData` function now appends valid lines to a new slice instead of using `slices.Delete` in a loop, which can be more efficient.
   * The UUID/ID uniqueness check within `parseMetaData` now uses a map for O(1) average time complexity lookups, replacing a less efficient slice-based O(n) lookup.

**Benchmark Improvements:** A new benchmark suite (`src/benchmark_m3u_test.go`) was created to measure parsing and filtering performance with various file sizes and filter counts. Compared to the initial unoptimized state, the filtering performance has improved dramatically (8x-10x faster, with significantly fewer allocations). Parsing performance has also improved, particularly for larger M3U files (e.g., ~10% faster for the 'large' test case). The overall performance improvement for combined parsing and filtering operations comfortably exceeds the 50% target.

The detailed benchmark results and comparisons are documented in `docs/benchmarks/m3u_performance.md`.
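The three techniques named in this commit message (a package-level pre-compiled regular expression, append-based line filtering, and a map-backed uniqueness check) can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the names `groupTitleRe`, `groupTitle`, `keepContentLines`, and `markSeen`, the regular expression itself, and the rule for which lines count as comments are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// groupTitleRe is compiled once when the package is initialized, so callers
// never pay for compilation inside a hot loop. The pattern is hypothetical.
var groupTitleRe = regexp.MustCompile(`group-title="([^"]*)"`)

// groupTitle extracts the group-title attribute from an #EXTINF line,
// returning "" when the attribute is absent.
func groupTitle(extinf string) string {
	m := groupTitleRe.FindStringSubmatch(extinf)
	if m == nil {
		return ""
	}
	return m[1]
}

// keepContentLines appends the lines worth keeping to a new slice instead of
// deleting unwanted ones in place, so each line is visited exactly once.
// Assumption for this sketch: blank lines and "#" lines other than #EXTINF
// are dropped.
func keepContentLines(lines []string) []string {
	kept := make([]string, 0, len(lines))
	for _, line := range lines {
		trimmed := strings.TrimSpace(line)
		if trimmed == "" {
			continue
		}
		if strings.HasPrefix(trimmed, "#") && !strings.HasPrefix(trimmed, "#EXTINF") {
			continue
		}
		kept = append(kept, trimmed)
	}
	return kept
}

// markSeen records id in the set and reports whether it was new. A map gives
// O(1) average lookups, where scanning a slice would be O(n) per check.
func markSeen(seen map[string]struct{}, id string) bool {
	if _, dup := seen[id]; dup {
		return false
	}
	seen[id] = struct{}{}
	return true
}

func main() {
	raw := []string{
		"#EXTM3U",
		"",
		`#EXTINF:-1 tvg-id="one" group-title="News",Channel One`,
		"http://example.invalid/stream/1",
	}
	seen := make(map[string]struct{})
	for _, line := range keepContentLines(raw) {
		fmt.Println(line, "| group:", groupTitle(line), "| new:", markSeen(seen, line))
	}
}
```

Deleting from a slice inside a loop shifts every later element on each removal, which is why building a fresh slice with `append` tends to scale better on large playlists.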

…ata. This update addresses inaccuracies in M3U parsing benchmarks for medium and large datasets by:

1. **Modifying `src/benchmark_m3u_test.go`:**
   * I introduced a helper function `generateM3UContent` to dynamically create M3U content of specified sizes (number of entries and groups).
   * I updated `BenchmarkParseM3U` and `BenchmarkFilterM3U` to use this dynamic generation for "medium" (1,000 entries) and "large" (10,000 entries) test cases. This replaces the previous reliance on file-based M3Us for these sizes, which were found to be incompletely populated.
   * The "small" test case continues to use its existing, correctly populated file.
2. **Updating `docs/benchmarks/m3u_performance.md`:**
   * I added a new section with benchmark results obtained using the dynamically generated, fully populated M3U files.
   * I included a comparative analysis against previous results, highlighting that the increased time/resource usage for medium/large parsing is due to processing more data, not a performance regression.
   * I restructured the document to clearly show the progression of benchmark results: original (pre-optimization), optimized (with flawed M3U files), and corrected (optimized, with dynamic M3U generation).

These changes ensure that the benchmark results for M3U parsing and filtering are more accurate and representative of performance on large datasets, fulfilling your feedback on test data quality. The core optimizations previously implemented remain effective and their performance on realistic data is now better quantified.
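A generator of the kind described above might look roughly like the sketch below. The actual signature and output format of `generateM3UContent` are not shown in this PR text, so the function here is given a different name, `generateSampleM3U`, and its attribute layout, URL scheme, and the `countEntries` helper are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// generateSampleM3U builds a playlist with numEntries channels spread
// round-robin across numGroups groups. Hypothetical stand-in for the
// helper described in the commit message.
func generateSampleM3U(numEntries, numGroups int) string {
	if numGroups < 1 {
		numGroups = 1 // avoid a modulo-by-zero below
	}
	var sb strings.Builder
	sb.WriteString("#EXTM3U\n")
	for i := 0; i < numEntries; i++ {
		group := i % numGroups
		fmt.Fprintf(&sb,
			"#EXTINF:-1 tvg-id=\"id-%d\" tvg-name=\"Channel %d\" group-title=\"Group %d\",Channel %d\n",
			i, i, group, i)
		fmt.Fprintf(&sb, "http://example.invalid/stream/%d\n", i)
	}
	return sb.String()
}

// countEntries reports how many #EXTINF records the content holds, which is
// a cheap sanity check that generated data is fully populated.
func countEntries(content string) int {
	return strings.Count(content, "#EXTINF:")
}

func main() {
	medium := generateSampleM3U(1000, 10)
	large := generateSampleM3U(10000, 50)
	fmt.Println("medium entries:", countEntries(medium))
	fmt.Println("large entries:", countEntries(large))
}
```

In a benchmark, content like this would normally be generated before calling `b.ResetTimer()` so that generation time is not counted in the measured result.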

google-labs-jules bot added 2 commits May 24, 2025 17:25

ted-gould merged commit 3905ddc into main May 24, 2025
2 of 3 checks passed

ted-gould merged 2 commits into main from feat/optimize-m3u-parsing-filtering