Support minLength and maxLength for stringMatching #5562
Comments
So far the only option I have for it is to play with |
Hmm, interesting problem. Regular expressions are closed under intersection. So one idea might be to "just" compute the intersection of One way to do it might be:
Any route involving DFAs seems to come with the danger of exponential blow-up. For the regular expressions from formal language theory, this is maybe not too complicated, but I can imagine you need a ton of case distinctions for the full JavaScript regular expressions. So I'm not sure this is the best approach. |
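For readers unfamiliar with the closure property mentioned above: it is usually shown with the product construction, which runs two DFAs in lockstep and accepts only when both accept. The following is a minimal sketch of that textbook construction, not the approach of any library discussed in this thread. The `Dfa` type and the names `intersectDfa`, `accepts`, `lettersPlus`, and `lengthRange` are all hypothetical, and the automata are written by hand instead of being compiled from JavaScript regular expressions.

```typescript
// A deterministic finite automaton over single characters.
// A missing transition means the input is rejected.
interface Dfa {
  start: number;
  accepting: number[];
  // transitions[state][char] is the next state
  transitions: Record<string, number>[];
}

// Product construction: each state of the result is a pair (p, q) of
// states from `a` and `b`. Only pairs reachable from the start are built.
function intersectDfa(a: Dfa, b: Dfa): Dfa {
  const ids: Record<string, number> = {};
  const pairs: [number, number][] = [];
  const idOf = (p: number, q: number): number => {
    const key = p + "," + q;
    if (ids[key] === undefined) {
      ids[key] = pairs.length;
      pairs.push([p, q]);
    }
    return ids[key];
  };

  const start = idOf(a.start, b.start);
  const transitions: Record<string, number>[] = [];
  const accepting: number[] = [];

  // `pairs` grows while we iterate, so this loop doubles as a worklist.
  for (let id = 0; id < pairs.length; id++) {
    const [p, q] = pairs[id];
    if (a.accepting.indexOf(p) !== -1 && b.accepting.indexOf(q) !== -1) {
      accepting.push(id);
    }
    const out: Record<string, number> = {};
    for (const ch of Object.keys(a.transitions[p])) {
      const qNext: number | undefined = b.transitions[q][ch];
      // Keep the transition only if both automata can take it.
      if (qNext !== undefined) out[ch] = idOf(a.transitions[p][ch], qNext);
    }
    transitions.push(out);
  }
  return { start, accepting, transitions };
}

function accepts(dfa: Dfa, input: string): boolean {
  let state = dfa.start;
  for (let i = 0; i < input.length; i++) {
    const next: number | undefined = dfa.transitions[state][input[i]];
    if (next === undefined) return false;
    state = next;
  }
  return dfa.accepting.indexOf(state) !== -1;
}

// Hand-written stand-in for /^[ab]+$/: state 1 means "at least one letter read".
const lettersPlus: Dfa = {
  start: 0,
  accepting: [1],
  transitions: [{ a: 1, b: 1 }, { a: 1, b: 1 }],
};

// Accepts every string over `alphabet` with length in [min, max];
// state i means "i characters read so far".
function lengthRange(alphabet: string[], min: number, max: number): Dfa {
  const transitions: Record<string, number>[] = [];
  const accepting: number[] = [];
  for (let i = 0; i <= max; i++) {
    const out: Record<string, number> = {};
    if (i < max) for (const ch of alphabet) out[ch] = i + 1;
    transitions.push(out);
    if (i >= min) accepting.push(i);
  }
  return { start: 0, accepting, transitions };
}

const bounded = intersectDfa(lettersPlus, lengthRange(["a", "b", "c"], 2, 3));
console.log(accepts(bounded, "ab"), accepts(bounded, "a"), accepts(bounded, "abab"));
```

The product itself has at most |A| × |B| states. The exponential blow-up usually enters one step earlier, when a regular expression is turned into an NFA and that NFA is determinized.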
This certainly sounds quite tricky! Thanks for all the research here :) By the way @dubzzz, in case it changes priority, in addition to needing this for We were trying to automatically generate example values for OpenAPI spec schemas like the following (this is just an example schema):

    type: string
    pattern: [a-z]+
    minLength: 3
    maxLength: 10

I ended doing It's still very quick, but I only needed to generate one value from |
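As an illustration of what a schema like this asks for, here is a minimal sketch of a predicate that checks a candidate string against the three keywords. It is not the commenter's code. `StringSchema`, `codePointLength`, and `satisfiesSchema` are hypothetical names. The sketch assumes JSON Schema semantics, where `pattern` is not implicitly anchored and lengths are counted in Unicode code points, not UTF-16 code units.

```typescript
// Hypothetical shape covering only the keywords from the example schema.
interface StringSchema {
  pattern?: string;
  minLength?: number;
  maxLength?: number;
}

// Counts Unicode code points: a surrogate pair counts as one character.
function codePointLength(s: string): number {
  let count = 0;
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff && i + 1 < s.length) {
      const next = s.charCodeAt(i + 1);
      // Skip the low surrogate that completes this code point.
      if (next >= 0xdc00 && next <= 0xdfff) i++;
    }
    count++;
  }
  return count;
}

function satisfiesSchema(schema: StringSchema, value: string): boolean {
  const length = codePointLength(value);
  if (schema.minLength !== undefined && length < schema.minLength) return false;
  if (schema.maxLength !== undefined && length > schema.maxLength) return false;
  // Unanchored search: the pattern only has to match somewhere in the value.
  if (schema.pattern !== undefined && !new RegExp(schema.pattern).test(value)) {
    return false;
  }
  return true;
}

const exampleSchema: StringSchema = { pattern: "[a-z]+", minLength: 3, maxLength: 10 };
console.log(satisfiesSchema(exampleSchema, "abc"));
console.log(satisfiesSchema(exampleSchema, "ab"));
```

A predicate like this could be handed to an arbitrary's `.filter`. Because the schema's `pattern` is unanchored, a generator that should only produce lowercase strings would need an anchored regex such as `/^[a-z]+$/`.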
@TomerAberbach It's been a few months so I'm not sure if you still care 😅 but you nerd sniped me with this problem. I really wanted to see if this regex intersection approach could work. Turns out there are almost no implementations for this (in any programming language). I thought this is standard computer science blabla? So I started hacking on a library myself. The API looks like this:

    import { intersection } from '@gruhn/regex-utils'
    const combinedRegex = intersection(/^[a-z]+$/, /^.{3,10}$/)

Preliminary results are not too bad. I created this benchmark that generates random email addresses with 3-10 characters. Once by going the generate+filter route:

    fc.sample(
      fc.stringMatching(emailRegex).filter(
        str => 3 <= str.length && str.length <= 10
      ),
      sampleCount
    )

And once by computing the intersection with the regex:

    fc.sample(
      fc.stringMatching(intersection(/^.{3,10}$/, emailRegex)),
      sampleCount
    )

Predictably, filtering is faster for smaller sample sizes because computing the intersection has quite some overhead. But for larger sample sizes the intersection approach is clearly much faster:
Even bigger sample sizes only with intersection:
Although this is exciting, I would enjoy this with caution. It's still very easy to find inputs where this |
🚀 Feature Request
stringMatching generates strings based on the regular expression, but doesn't provide a way to set a min or max length on the generated strings.
Motivation
Sometimes you want to test specific lengths of a string matching a regex, but modifying a regular expression to have a specific min and/or max length can be pretty hard and confusing. It would be a lot easier to just be able to pass minLength and maxLength, like with the string arbitrary.
Example