Add option to exclude huge files #177

pyhedgehog · 2024-12-16T10:34:18Z

Background

Sometimes repositories contain source files too big to be parsed by pygount (I got stuck on a 40 MB text file that seems to be the source for an ML model). I propose the simplest solution: an option to exclude huge files from processing.

Goals

When the option --larger-to-skip is passed, every file larger than the specified size is skipped instead of being parsed.
Support parsing size specifiers like 10k, 20m and so on. Maybe borrow code from test.support._parse_memlimit()...
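
A minimal sketch of how such size specifiers could be parsed. The name `parse_size`, the accepted suffixes, and the choice of 1024-based multipliers are all assumptions for illustration; none of this is existing pygount code, and it is not copied from `test.support`.

```python
import re

# Hypothetical helper, not part of pygount.
# Assumption: suffixes are case-insensitive and use 1024-based multipliers.
_SIZE_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}
_SIZE_REGEX = re.compile(r"^\s*(\d+)\s*([kmg]?)b?\s*$", re.IGNORECASE)


def parse_size(text: str) -> int:
    """Convert a specifier such as '10k' or '20m' to a number of bytes."""
    match = _SIZE_REGEX.match(text)
    if match is None:
        raise ValueError(f"invalid size specifier: {text!r}")
    number, unit = match.groups()
    return int(number) * _SIZE_UNITS[unit.lower()]
```

A function like this could probably be passed as `type=parse_size` to `argparse.ArgumentParser.add_argument()`, since argparse reports a `ValueError` raised by the converter as a usage error.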


roskakori · 2024-12-16T10:44:31Z

Yes, this makes sense.

Not a high priority, but I will probably work on pygount during the coming holidays, so we will see.

Actually, in analysis.py you can already find this in SourceState:

    generated = 6
    # TODO: 'huge' = auto()  # source code exceeds size limit
    #: pygments does not offer any lexer to analyze the source
    unknown = 7
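
For illustration only, here is a rough sketch of how a size check might map to such a state before any lexing happens. `SketchState` and `state_for_size` are hypothetical names made up for this example; they are a stand-in, not the actual `SourceState` enum or any function in analysis.py.

```python
import os
from enum import Enum
from typing import Optional


class SketchState(Enum):
    """Hypothetical stand-in for a subset of source states."""

    analyzed = 1
    huge = 2  # source code exceeds size limit


def state_for_size(path: str, max_size: Optional[int]) -> SketchState:
    """Return 'huge' if the file at path is larger than max_size bytes.

    Assumption: max_size=None means no limit is set.
    """
    if max_size is not None and os.path.getsize(path) > max_size:
        return SketchState.huge
    return SketchState.analyzed
```

Checking the size through `os.path.getsize()` avoids reading the file at all, which is the point of skipping a 40 MB input.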

roskakori added this to Open source projects Dec 16, 2024

github-project-automation bot moved this to 🆕 New in Open source projects Dec 16, 2024

roskakori added the enhancement label Dec 16, 2024

roskakori moved this from 🆕 New to 📋 Backlog in Open source projects Dec 16, 2024
