New fnmatch.py #28

Draft: wants to merge 4 commits into base: master

@rakus rakus commented May 6, 2019

This pull request is opened as a draft because it is not mergeable yet: it contains a switch that selects between different implementations of number ranges.
Once a decision is made on the issue editorconfig/editorconfig#371, the code needs to be cleaned up to support only one implementation. (If it is of interest at all.)


This pull request proposes a new implementation for the translation of editorconfig glob expressions to python regular expressions.

IMO the following points are important:

This was initially implemented in VimScript for my Vim plugin and was then ported to Python.

Numerical Ranges

This implementation translates numerical ranges into regular expressions.

E.g.

  • {3..10} becomes (?:\+?(?:[3-9]|10))
  • {10..3} also becomes (?:\+?(?:[3-9]|10)), so the order of numbers is irrelevant
  • {-3..+3} becomes (?:-(?:[0-3])|\+?(?:[0-3]))
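
As a quick sanity check, the regular expressions quoted above can be tried with Python's `re` module. This is only a sketch: the patterns are copied verbatim from the examples, while the helper name `matches` is made up for illustration and is not part of fnmatch.py.

```python
import re

# Patterns copied verbatim from the examples above.
RANGE_3_10 = re.compile(r"(?:\+?(?:[3-9]|10))")            # {3..10} and {10..3}
RANGE_M3_P3 = re.compile(r"(?:-(?:[0-3])|\+?(?:[0-3]))")   # {-3..+3}


def matches(pattern, text):
    """Hypothetical helper: True if the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None


for candidate in ["3", "+3", "10", "2", "11", "03"]:
    print("{3..10}", repr(candidate), matches(RANGE_3_10, candidate))

for candidate in ["-3", "0", "+3", "4", "-4"]:
    print("{-3..+3}", repr(candidate), matches(RANGE_M3_P3, candidate))
```

Using `fullmatch` matters here: with `search` or `match`, a text like `100` would be accepted because it starts with `10`.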

What is special about the numeric ranges is that their implementation can be
switched between different modes. See the top-level variable NUMBER_MODE in fnmatch.py.

Mode AS_IS

This implementation should work like the current implementation.

Mode ZEROS

This implementation allows any number of leading zeros, as proposed by @cxw42 in Py core: numeric ranges don't handle zero correctly.

So: {3..10} becomes (?:\+?0*(?:[3-9]|10)) and would match

  • 3
  • +3
  • 0000003
  • +0003
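
To see what the extra `0*` changes, the ZEROS pattern can be compared with the pattern from the first example list. A small sketch: both patterns are copied from this description, while the names `PLAIN`, `ZEROS` and `compare` are hypothetical and only used here.

```python
import re

PLAIN = re.compile(r"(?:\+?(?:[3-9]|10))")     # {3..10} from the first example list
ZEROS = re.compile(r"(?:\+?0*(?:[3-9]|10))")   # {3..10} in mode ZEROS


def compare(text):
    """Return (matches PLAIN, matches ZEROS) for the complete text."""
    return (PLAIN.fullmatch(text) is not None,
            ZEROS.fullmatch(text) is not None)


for text in ["3", "+3", "0000003", "+0003", "010", "030"]:
    print(repr(text), compare(text))
```

Only the zero-padded spellings differ between the two patterns; a value outside the range such as `030` is rejected by both.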

Mode JUSTIFIED

This implementation handles numerical ranges the way bash does. I proposed this in a comment on @cxw42's issue here.

Now

  • {3..10} becomes (?:[3-9]|10), so leading + is not matched anymore.
  • {03..10} becomes (?:0[3-9]|10), so all numbers are formatted to equal width. In this case single-digit numbers need one leading zero.
  • {03..120} becomes (?:00[3-9]|0[1-9][0-9]|1[0-1][0-9]|120). Again the numbers are formatted to equal width, here three digits. So single-digit numbers need two leading zeros, double-digit numbers one.
  • For negative numbers, the leading minus sign is part of the width calculation. So {-3..03} matches -3 and 03, but not -03.
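
The width rule can be illustrated with a deliberately naive sketch. It is not the code from this pull request: instead of building compact digit classes it simply enumerates every number in the range, and the function name `justified_range_regex` is hypothetical. It also assumes the bash behaviour that padding only applies when one endpoint is written with a leading zero.

```python
import re


def justified_range_regex(start_text, end_text):
    """Naive sketch: one regex alternative per number in the range."""
    low, high = sorted((int(start_text), int(end_text)))

    def has_leading_zero(text):
        digits = text.lstrip("+-")
        return len(digits) > 1 and digits.startswith("0")

    # Assumption: pad only if an endpoint is written with a leading zero.
    padded = has_leading_zero(start_text) or has_leading_zero(end_text)
    width = max(len(start_text), len(end_text)) if padded else 0

    # str.zfill counts a leading minus sign as part of the width,
    # so "-3" already fills a width of 2 while "3" becomes "03".
    alternatives = [str(n).zfill(width) for n in range(low, high + 1)]
    return "(?:" + "|".join(re.escape(a) for a in alternatives) + ")"


pattern = re.compile(justified_range_regex("-3", "03"))
for text in ["-3", "03", "00", "-03", "3"]:
    print(repr(text), pattern.fullmatch(text) is not None)
```

Enumeration is only practical for small ranges, but it is handy as a reference when testing a compact pattern such as the one shown for {03..120}.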

Status

I didn't change anything outside fnmatch.py. So the tests failing with the master branch still fail with this branch. The only one that is fixed is brackets_slash_inside4.

The implementation passes all current tests related to globbing in modes AS_IS and JUSTIFIED. For ZEROS one test fails, because that test requires leading zeros not to be matched.

Locally I added some tests for mode JUSTIFIED, which I could provide as well.

There is one function (unescapeBrackets) where I'm unsure whether it is really correct.

rakus added 4 commits May 22, 2019 20:02
- improved handling of escaped characters (e.g. a\[.abc)
- improved finding matching brackets and braces
- creates regular expressions to match numerical ranges
- unit-tests to test glob to regex translation

The regex for numerical ranges can be switched between three modes:

- AS_IS: Equivalent to the previous implementation
- ZEROS: Allow any number of leading zeros
- JUSTIFIED: Implement ranges similar to bash

The final mode depends on the outcome of
editorconfig/editorconfig#371

Currently active: JUSTIFIED (the decision I would prefer :-))

On braces and brackets:

'{alpha,[a,]beta}' should become '%(alpha\|\[a\|\]beta\)' as described
in the bash man page.
So when we scan brackets that are inside braces, a comma should be
handled as a separator for the braces. To prevent this, escape the
comma with a backslash:
'{alpha,[a\,]beta}' becomes '%(alpha\|[a,]beta\)'.
'?' now translated to '[^/]'
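
The comma handling described in the braces and brackets commit message can be sketched as follows. This is an illustration under simplifying assumptions, not the code in fnmatch.py: the function name `split_alternatives` is hypothetical, it receives only the text between `{` and `}`, and nested braces are not handled.

```python
def split_alternatives(body):
    """Split brace contents on commas that are not escaped by a backslash."""
    parts, current = [], []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            # Keep the escape sequence intact for later translation stages.
            current.append(body[i:i + 2])
            i += 2
            continue
        if ch == ",":
            # An unescaped comma always separates alternatives,
            # even when it sits between '[' and ']'.
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


print(split_alternatives("alpha,[a,]beta"))
print(split_alternatives("alpha,[a\\,]beta"))
```

The first call yields three alternatives, so the brackets end up split across two of them. The second keeps `[a\,]beta` together as a single alternative.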

cxw42 (Member) commented Jan 15, 2023

If you're still working on this, would you be willing to check if the new fnmatch handles editorconfig/editorconfig-vim#205 ?
