Add python/cython benchmark for the lark parser #1

Open
wants to merge 2 commits into
base: master
from


Erotemic
Collaborator

@erezsh I appreciate you pointing me to this repo earlier.

I was able to demonstrate about a 2x speedup for my specific DSL, but I was curious what that speedup was more generally, and I was also looking for a nice way to quantify whether any changes to the Cython code would provide further speedups.

I took a peek at the Cython code and I think it is ripe for optimization. Python types are being used everywhere, and if we were able to refactor those with pure C types, then I think there could be a large speed gain. Unfortunately, I'm not the greatest Cython coder, so I wasn't able to make any simple changes that did anything.

However, I did write a reproducible benchmark to quantify the speedup of Cython-lark over CPython-lark across a range of input sizes.

The timing module I use is timerit (which I wrote). It works similarly to timeit, but can work inline in existing code. I also use pandas, matplotlib, and seaborn to generate a nice figure showing the speedup over different input sizes.
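For readers without timerit installed, the measurement idea can be sketched with the standard library alone. This is not timerit's API; `time_callable` and `fake_parse` are hypothetical names, and `fake_parse` is only a stand-in for a real parser call. It produces the same kind of best/mean/std summary per input size:

```python
import statistics
import time


def time_callable(func, num=20):
    """Call func `num` times and return (best, mean, std) in seconds."""
    samples = []
    for _ in range(num):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return min(samples), statistics.mean(samples), statistics.pstdev(samples)


def fake_parse(text):
    # Hypothetical stand-in for a parser call; it only splits on whitespace.
    return text.split()


results = {}
for size in (16, 87, 158):
    # Grow the input by repeating a small snippet `size` times.
    text = "rule: TOKEN\n" * size
    best, mean, std = time_callable(lambda: fake_parse(text))
    results[size] = (best, mean, std)
    print(f"Timed best={best * 1e3:.3f} ms, "
          f"mean={mean * 1e3:.3f} ± {std * 1e3:.1f} ms for size={size}")
```

Reporting the best time alongside the mean is useful because the minimum is less sensitive to noise from other processes.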

image

The stdout of the script is:

Timed best=1.144 ms, mean=1.704 ± 0.4 ms for method=parse_python,size=16
Timed best=5.974 ms, mean=9.279 ± 1.5 ms for method=parse_python,size=87
Timed best=10.813 ms, mean=17.676 ± 2.1 ms for method=parse_python,size=158
Timed best=16.715 ms, mean=25.435 ± 3.1 ms for method=parse_python,size=229
Timed best=20.364 ms, mean=31.777 ± 4.4 ms for method=parse_python,size=299
Timed best=19.721 ms, mean=37.681 ± 8.5 ms for method=parse_python,size=370
Timed best=23.847 ms, mean=47.875 ± 7.2 ms for method=parse_python,size=441
Timed best=29.769 ms, mean=47.659 ± 11.1 ms for method=parse_python,size=512
Timed best=0.751 ms, mean=1.226 ± 0.2 ms for method=parse_cython,size=16
Timed best=4.176 ms, mean=6.391 ± 0.7 ms for method=parse_cython,size=87
Timed best=7.068 ms, mean=11.063 ± 1.5 ms for method=parse_cython,size=158
Timed best=8.398 ms, mean=13.830 ± 3.0 ms for method=parse_cython,size=229
Timed best=11.037 ms, mean=19.378 ± 4.7 ms for method=parse_cython,size=299
Timed best=13.332 ms, mean=25.704 ± 4.5 ms for method=parse_cython,size=370
Timed best=14.691 ms, mean=25.771 ± 6.9 ms for method=parse_cython,size=441
Timed best=20.890 ms, mean=31.757 ± 6.4 ms for method=parse_cython,size=512

Statistics:
                                   min style_key size_key              hue_key        method  size      mean
key                                                                                                         
method=parse_cython,size=16   0.000751        {}       {}  method=parse_cython  parse_cython    16  0.001226
method=parse_python,size=16   0.001144        {}       {}  method=parse_python  parse_python    16  0.001704
method=parse_cython,size=87   0.004176        {}       {}  method=parse_cython  parse_cython    87  0.006391
method=parse_python,size=87   0.005974        {}       {}  method=parse_python  parse_python    87  0.009279
method=parse_cython,size=158  0.007068        {}       {}  method=parse_cython  parse_cython   158  0.011063
method=parse_cython,size=229  0.008398        {}       {}  method=parse_cython  parse_cython   229  0.013830
method=parse_python,size=158  0.010813        {}       {}  method=parse_python  parse_python   158  0.017676
method=parse_cython,size=299  0.011037        {}       {}  method=parse_cython  parse_cython   299  0.019378
method=parse_cython,size=370  0.013332        {}       {}  method=parse_cython  parse_cython   370  0.025704
method=parse_cython,size=441  0.014691        {}       {}  method=parse_cython  parse_cython   441  0.025771
method=parse_python,size=229  0.016715        {}       {}  method=parse_python  parse_python   229  0.025435
method=parse_python,size=370  0.019721        {}       {}  method=parse_python  parse_python   370  0.037681
method=parse_python,size=299  0.020364        {}       {}  method=parse_python  parse_python   299  0.031777
method=parse_cython,size=512  0.020890        {}       {}  method=parse_cython  parse_cython   512  0.031757
method=parse_python,size=441  0.023847        {}       {}  method=parse_python  parse_python   441  0.047875
method=parse_python,size=512  0.029769        {}       {}  method=parse_python  parse_python   512  0.047659
Speedup:
           min style_key size_key              hue_key        method      mean  speedup_mean  speedup_min
size                                                                                                     
16    0.000751        {}       {}  method=parse_cython  parse_cython  0.001226      1.390526     1.523632
87    0.004176        {}       {}  method=parse_cython  parse_cython  0.006391      1.451871     1.430450
158   0.007068        {}       {}  method=parse_cython  parse_cython  0.011063      1.597770     1.529840
229   0.008398        {}       {}  method=parse_cython  parse_cython  0.013830      1.839145     1.990309
299   0.011037        {}       {}  method=parse_cython  parse_cython  0.019378      1.639875     1.845177
370   0.013332        {}       {}  method=parse_cython  parse_cython  0.025704      1.465973     1.479169
441   0.014691        {}       {}  method=parse_cython  parse_cython  0.025771      1.857683     1.623239
512   0.020890        {}       {}  method=parse_cython  parse_cython  0.031757      1.500740     1.425010
Average speedup
                  mean       std       min       25%       50%       75%       max
speedup_mean  1.592948  0.176645  1.390526  1.462448  1.549255  1.689692  1.857683
speedup_min   1.605853  0.206135  1.425010  1.466989  1.526736  1.678723  1.990309
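
The speedup_min column appears to be the CPython best time divided by the Cython best time at each size. A minimal sketch that recomputes it from the rounded `best=` values printed above (so the trailing digits differ slightly from the table, which presumably used unrounded timings):

```python
# Best times in milliseconds, copied from the "Timed best=..." lines.
python_best = {16: 1.144, 87: 5.974, 158: 10.813, 229: 16.715,
               299: 20.364, 370: 19.721, 441: 23.847, 512: 29.769}
cython_best = {16: 0.751, 87: 4.176, 158: 7.068, 229: 8.398,
               299: 11.037, 370: 13.332, 441: 14.691, 512: 20.890}

# A ratio above 1 means the Cython build was faster at that input size.
speedup_min = {size: python_best[size] / cython_best[size]
               for size in python_best}
average = sum(speedup_min.values()) / len(speedup_min)

for size, ratio in speedup_min.items():
    print(f"size={size:3d}  speedup_min={ratio:.3f}")
print(f"average speedup_min={average:.3f}")
```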

This is benchmarked against the lark.lark grammar that ships with lark. I have a helper function to generate a "random" (not really random, but it is simple) lark file to pass to the parser. One question I had was: is there a way to use lark to generate a string from a grammar? If so, that would make benchmarking more complex grammars much easier.

Anyways, I hope this is helpful. Thanks again for this library!

@erezsh
Member

erezsh commented Mar 19, 2022

Nice graph! It could be a real improvement to what https://github.com/goodmami/python-parsing-benchmarks has right now.

> I think it is ripe for optimization

I think the same. If I had time to work on it, I would probably focus on one of two areas:

  • Use a C lexer instead of the python re module.
  • In the parser, rewrite the state-machine loop.
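
To make the first bullet concrete, here is a generic sketch of a lexer built on the `re` module. It is not Lark's actual lexer; the token names and patterns are made up for illustration. The point is that every token costs a Python-level regex match plus tuple and string allocations, which is the kind of loop a C lexer could replace:

```python
import re

# Hypothetical token set, loosely shaped like a grammar-file syntax.
TOKEN_SPEC = [
    ("NAME", r"[a-z_][a-z_0-9]*"),
    ("COLON", r":"),
    ("STRING", r'"[^"]*"'),
    ("WS", r"[ \t\n]+"),
]
# One combined pattern; the named group that matched identifies the token type.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def lex(text):
    """Return a list of (type, value) tokens, skipping whitespace."""
    pos = 0
    tokens = []
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = lex('start: "a" rule')
print(tokens)
```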

> Is there a way to use lark to generate a string from a grammar?

There is this solution: https://hypothesis.readthedocs.io/en/latest/extras.html?highlight=lark#hypothesis-lark

I haven't tried it myself yet.

@erezsh
Member

erezsh commented Mar 19, 2022

P.S. I think the "gold standard" might be to benchmark parsing Python. It's pretty complicated. The Lark grammar for it already exists. And, there's plenty of "test data" for it.

@Erotemic
Collaborator Author

I attempted to use hypothesis to generate random strings for the grammar.

I got it to "somewhat" work, but it was unwieldy and unclear how to generate random strings with different specified sizes (or how to request that a string should be larger or shorter --- I haven't generated from a CFG before, so I'm not familiar with what the exact process is).

I saved the patch that adds this, but did not point it here, as I think it makes the PR worse overall:
Erotemic#1

It's at least a proof of concept.

The graphs it generates are much uglier though, and I can't figure out how to get hypothesis to generate long strings. I'm stuck with the set of 23 examples it generates for me.

image

@erezsh
Member

erezsh commented Mar 21, 2022

Generating strings from a grammar is not very difficult conceptually. You can generate every variation by recursing into rules and generating every variation of the terminals. (It does require generating text for regexps, but I think there are libraries that do that already.) It's a long process, but if you do it with BFS, and maybe add a little randomness, you can probably generate reasonable samples.
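
A minimal sketch of the BFS idea, assuming a toy grammar with literal terminals only (no regexps) written as a plain dict rather than loaded through Lark. `GRAMMAR` and `generate_bfs` are hypothetical names for illustration:

```python
from collections import deque

# Toy grammar: each nonterminal maps to a list of alternatives, and each
# alternative is a sequence of symbols. Symbols absent from the dict are
# treated as literal terminals.
GRAMMAR = {
    "start": [["item"], ["item", ",", "start"]],
    "item": [["a"], ["b"]],
}


def generate_bfs(grammar, start, limit):
    """Yield up to `limit` distinct strings, shortest derivations first."""
    queue = deque([(start,)])
    seen = set()
    produced = 0
    while queue and produced < limit:
        form = queue.popleft()
        # Index of the leftmost nonterminal, or None if the form is all terminals.
        idx = next((i for i, sym in enumerate(form) if sym in grammar), None)
        if idx is None:
            text = "".join(form)
            if text not in seen:
                seen.add(text)
                produced += 1
                yield text
            continue
        # Expand that nonterminal with every alternative.
        for alternative in grammar[form[idx]]:
            queue.append(form[:idx] + tuple(alternative) + form[idx + 1:])


samples = list(generate_bfs(GRAMMAR, "start", limit=6))
print(samples)
```

Shuffling the alternatives before enqueueing them, or randomly dropping some queue entries, would be one way to add the randomness mentioned above. The queue grows quickly for real grammars, so a cap on its size would likely be needed.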

Re the benchmarks, it's a little suspicious that short inputs take much longer than longer ones, no?

@Erotemic
Collaborator Author

The x axis is slightly different in this plot. Previously it was the number of iterations I expanded the input for. In this one it is just the number of characters hypothesis generated.

The inputs it generated were:

valid_candidates = [
    '',
    '\n\x85\t  \t',
    '\n\r\n\r\n\u2029\u2006',
    '\n\n\r\n\n\u2008\u2008\u2000\x85',
    '\n\r\n\r\n\n\u2005\x1f\xa0',
    '\r\n\n\n\n\u202f\u2004\x1c\x85\u2007',
    '%ignore\t->?p \t',
    '\r\n \x85 \t\u2009\x0b//\x97 \x1e·',
    '%ignore\t->?p  \t ',
    '\r\n\n\r\n\r\n\n\t\u2002\u2029\u3000\u2009\u2003\u2006//\U000c3046ÍYÜ',
    '\r\n\t \t \t\t\n\r\n\n  \t\t_l1lhzbn:',
    '%import._B6547//4\r\n\r\n\n\x1e\r\n \x1c\x1d\t',
    '%import.T2ZGVX65PWP6CA.JD2VJ7BG',
    '%import.?b2yfuw65ovo6b_.JD2VJ7BG',
    '\n\r\n\r\n\n\r\n\u2000\u2005\u2004\u2008\n\u2003\u2005\x85\u2008\u2009\u2003\x1e\u2007\u2007\x1f\u2029\u2005\x1d\u2001\u2009\r\u2003//\U0007eccdÞfÆ',
    '\r\n\n\u1680\u202f\x0b\u202f\u2002\u205f\u2005\u2008\x1c\u2028\u2007\u3000\xa0\n\u2000\u2000\u2028\t\u2008\u2007\u200a\u2007\t\n\u3000\u200a\x1e//´%\\Z²×\U000c2c8aÿE\x97𭺐',
    '\r\n\n\r\n\n\n\n\r\n\n\n\n\n\n\r\n\r\n\r\n\n\n\n\n\r\n\n\r\n\r\n\r\n\n\r\n\r\n\r\n\xa0\u2000',
    '\r\n \x85\x85  \x85\x85 \x85!_b11c8{!_c}\t\t\t    :->  !?b17b8j9',
    '%importGJO7->ed6u3fvea\n\n\r\n\r\n\n\n\n\r\n\n\u2009\n\xa0\x1c\u2004\u2008\u2001\u2003\u2002\u2029\u2003\u205f',
    'SQ79{_K\t\t ,_L57C,_BTZSUN556M9\t \t\t\t}:\u2006\u2008//->!_wo8i ',
    '%import.PDMXSB85YFFPQ_94_CR2M1H4VQGZI\u1680\u2006\x85\u2005\u2006//\x95À\x12ÍZ¢',
    '?ewt_81v2gppir7n{r4qam2dc\t,c8k}:   \t \t\t\t\t ->!x\t\t\t\t ',
    '%import._I5AJ3X8HC\x1e\x1c\x1d\x0c//龝\x8eĀ\x08ÃSf\x8c\U0010abab9r]¡\U000d5620\r//è\n\r\n\n\u1680\u2000\r\n\n\u2005\t\t\u202f\u2029\x1e\xa0\t \t\t   \t\t\t \t\t ',
    '\r\n\r\n\n\n\r\n\r\n\n\n\r\n\n\u2007\n\n\n\u200a\u2000\x0b%import.d1qe_t0_e \u2008 \u1680\x0b\u2007\n//zþOw\x01!ù7㌜ë\x92->\x0b\u2009\u2009\u2009\r\u2007\x85\r\u2000\u2005\r\u2008//\x83\U000f22e1\U000c10b1!mcjyky1k\t\t',
    '%override!?p59mkiv_ahzr{?y01urq22b\t} \t  \t \t.-661:\u2003\u2002\u2000\x1d\u2028//\x928\x1a->?u_0h53deo \t \t\t    \t\t\u2006\x0b \u1680\x1e\x0b\u202f \u205f//LÞ?',
    '%declare\t\t\t\t \t \t \t \t \t  \t\t  \t\t \t\t\t Z_8VMNK7DN2PE2E\t\t\t\t\t\t  \t  \t\t   \t\t\t//´&O\x13ibõÏâx\x85~𮛵å³S:mݦ-×:\r\n\n\r\n \u2005\x1f\x0c\x1c\t',
    '%overridegic5{!bcvqs    }.-6109646: \t\t\t \t \t\t \t\t\t\u2000\u2028\u2000\u3000//p\x06ø\r\n\u2000\x1e\x0c\u2028\xa0\u2007\x1c\u3000//¼]\U0009b005É\U000ada1aLQ2Z3{_UV,_WEXWY5JCF_NBCK0F6CU9O\t  \t\t\t    \t }:->x',
    '\n\n\r\n\r\n\n\n\n\r\n\n\r\n\n\n\n\r\n\r\n\r\n\r\n\n\r\n\x1f\u2001\u2008\u205f\u2008\u2006\n\n\r\n\r\n\n\n\n\r\n\n\r\n\u2004\u2005\x85\n\u2007\x1c\u2001\u3000\u2002\u1680\xa0\x1c//Î\x06æ𱇍\x10\U000fc60eó\U0004b07b%import.i.\t  \t\t \t\t\t \t HK\r\u2008\x0c\u200a\u2006\u2003\r\r\x1c\x85\x0c\x1d//|ñ\x01èq\x1fÌW\x0c\x1d\x0b//E$6Þ',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b\x08\x1bþ"i.."𠈂134\x83~¹ò\U000eb616\xad\x00"~-20001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b411\\\\\\\\\\\\"i.."𠈂134\x83~¹ò\U000eb616\xad\x00"~-20001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½1\\\\\\\\\\\\"i.."𠈂134\x83~¹ò\U000eb616\xad\x00"~-20001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b\x08\x1b\\\\\\\\\\\\"i.."𠈂134\x83~¹ò\U000eb616\xad\x00"~-20001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b\x08\x1bþ\\\\\\\\\\\\"i.."𐄂34\x83~¹ò\U000eb616\xad\x00"~-20001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e︁1Ȃ\\\\\\\\\\\\\\\\\\\\"i.."4\x83~¹ò\U000eb616\xad\x00"~-20001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b\x081\\\\\\\\\\\\"i.."𠈂134\x83~¹ò\U000eb616\xad\x00"~-20001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½01Ȃ\\\\\\\\\\\\\\\\\\\\"i.."4\x83~¹ò\U000eb616\xad\x00"~-20001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b1Ȃ\\\\\\\\\\\\\\\\\\\\"i.."4\x83~¹ò\U000eb616\xad\x00"~-20001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b︁1Ȃ\\\\\\\\\\\\\\\\\\\\"i.."4\x83~¹ò\U000eb616\xad\x00"~-20001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b\x08\x1bþ\\\\\\\\\\\\\\\\\\\\"i.."𠈂134\x83~¹ò\U000eb616\xad\x00"~-20001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b\x08\x1bþ\\\\\\\\\\\\\\\\\\\\".."111134\x83~¹ò\U000eb616\xad\x00"~-20001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b\x08\x1bþ\\\\\\\\\\\\\\\\\\\\\\\\"i.."1134\x83~¹ò\U000eb616\xad\x00"~-20001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b\x08\x1bþ\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"i.."𐄂34\x83~¹ò\U000eb616\xad\x00"~-20001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b\x08\x1bþ\\\\\\\\\\\\\\\\\\\\\\\\"i.."\\\\\\\\\\\\"i//\x83~¹ò\U000eb616\xad\x00~02020001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b\x08\x1bþ\\\\\\\\\\\\\\\\\\\\\\\\"..\x85\x85\x85\x85\x1f\x1e //Ă𐄂34\x83~¹ò\U000eb616\xad\x00""~20001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b\x08\x1bþ\\\\\\\\\\\\\\\\\\\\\\\\"..\x85\x85\x85\x85\x1f\x1e //"1134\x83~¹ò\U000eb616\xad\x00"~-20001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b\x08\x1bþ\\\\\\\\\\\\\\\\\\\\\\\\"i.."\\\\\\\\\\\\\\\\"i//\x83~¹ò\U000eb616\xad\x00~02020001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b\x08\x1bþ\\\\\\\\\\\\\\\\\\\\\\\\"..\x85\x85\x85\x85\x1f\x1e //"Ȃ\\\\"i//\x83~¹ò\U000eb616\xad\x00~02020001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b\x08\x1bþ\\\\\\\\\\\\\\\\\\\\\\\\".. //3ĄȂ"\\\\\\\\\\\\\\\\"i//\x83~¹ò\U000eb616\xad\x00~02020001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b\x08\x1bþ\\\\\\\\\\\\\\\\\\\\\\\\"..\x85\x85\x85\x85\x85//Ȃ"\\\\\\\\\\\\\\\\"i//\x83~¹ò\U000eb616\xad\x00~02020001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b\x08\x1bþ\\\\\\\\\\\\\\\\\\\\\\\\"..\x85\x85\x85\x85\x1f//Ȃ"\\\\\\\\\\\\\\\\"i//\x83~¹ò\U000eb616\xad\x00~02020001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b\x08\x1bþ\\\\\\\\\\\\\\\\\\\\\\\\"..\x85\x85\x85 \x85//Ȃ"\\\\\\\\\\\\\\\\"i//\x83~¹ò\U000eb616\xad\x00~02020001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
    '\r\n\n\r\n\u2028\x85\u1680\r\u2006\u200a\u2028//Ö\x19\U0010318b_B1AC9DD{_ZK5Y1}\u3000\u2002\x1d\x1f\u1680\u202f\u2005//JLÏ\x93éfä. \u2002\n\u2004\u2006\r\n\u2006\x1c//\x14\x95566898709757  :/\\\\\\//x\u205f\x1e\xa0\t\u2008\u2029\n//Å\x18\x80~51\u1680\u2007\x0c\x0b\xa0\x1d\u2006\x1d//"\U000c7b41\x1e½\x9b\x08\x1bþ\\\\\\\\\\\\\\\\\\\\\\\\"..\x85\x85\x85\x85\x1f\x1e //"\\\\\\\\\\\\\\\\"i//\x83~¹ò\U000eb616\xad\x00~02020001//𐌈\U00070502034oZ\U000aead9xX  \t /\\/\\\\/i\u2005\u200a//¿N*/\\\\\\\\\\\\\\\\\\\\\\/\\//llm?//\n\n\n\r\n\r\n\x1e_A1101111{_B011}.1:\t \t   \x1c\u2002//\u2000\x1d\u2000\u2008\u2001\u3000\x1e\u3000\x1c\u2006\u2003\u202f//\x01Ú\x1b\u2009\u2009\u2028\x1c\u2008\u205f\x1e\x0b\u205f\u2000 \u2005\u2005\x1f\r \u2008\xa0\u2004\x1e//\r\n\x85\x85 \x85\x85\x85\u2002\u2006\u1680\x1f\x85\x1e\u2004\u3000\x85\u2028//\xadë\x1e_A{_B173F_MC2Y0LE_767WDGI6M26}\x1e//\U000d501b\U000a4046\x86:->\xa0\u3000\u200a\n\u1680\u2004\x85\u2007\u2007\u200a//ȁ!a\u202f\u2005\n\xa0\u2007\u2008\u2003\x0c\u2005\u2007\u2006\r\u202f\u2028\u2005\u1680\u2006\u3000//\n       \x85|()~+110200120314152//01\x1c\u2008\u2008\u205f\u2004\x0b\x0b\u2005\u2004\u2029\t//\'GP¦q\x97꣥²´òs\x15\x88C\x8f\U000b94bc\U000a9289\x90\U0008e406\U000cc0c9\x16\x8c%GÖj\x0b𛊠öd\x9b\U000c2f4a\U0007091a7鎸Q\U000de8ceöD\x9e\x9ba÷\t \t\t\t\t',
]

so I imagine fixed overhead and differences in the types of tokens being generated dominate for the smallest inputs, and the differences don't become apparent until you get to the larger inputs (which are still reasonably small).

Increasing the number of timerit iterations and plotting on a log scale yields this:

image

So overall, it's not that suspicious given the input data it's working with. And the most structured example in this PR does cleanly show the expected increasing behavior.
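For anyone who wants to reproduce the "more iterations per size" idea without the full benchmark script, here is a minimal sketch. It uses the standard-library `timeit` rather than timerit, and `sorted` on a reversed list as a stand-in workload rather than a lark parser; `time_over_sizes` and `make_input` are hypothetical names used only for illustration.

```python
import statistics
import timeit


def time_over_sizes(func, make_input, sizes, repeats=20):
    """Time func(make_input(size)) `repeats` times for each size.

    Returns one dict per size with the best and mean wall time in seconds.
    """
    rows = []
    for size in sizes:
        data = make_input(size)  # build the input once, outside the timed call
        samples = timeit.repeat(lambda: func(data), number=1, repeat=repeats)
        rows.append({
            "size": size,
            "best": min(samples),
            "mean": statistics.mean(samples),
        })
    return rows


# Stand-in workload (an assumption, not the parser): sort a reversed list.
rows = time_over_sizes(sorted, lambda n: list(range(n, 0, -1)), [10, 100, 1000])
for row in rows:
    print(f"size={row['size']:5d} best={row['best']:.3e}s mean={row['mean']:.3e}s")
```

With more repeats the `best` column tends to be less noisy than the mean, and since the times span orders of magnitude across sizes, a log-scaled axis makes the small-input region readable.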

I might take a shot at writing a better generator for a grammar, as it would be useful in my own testing. But it's going to be lower on my priority list.

@erezsh
Member

erezsh commented Mar 21, 2022

If you don't mind me bringing it up again, why did you choose not to benchmark parsing Python?

@Erotemic
Collaborator Author

Benchmarking Python was what I originally tried, but I encountered an error. Taking a quick look again, I see it is because I specified the wrong rule as the start. At the time I just moved to something simpler to get the initial test working.

My original goal was to write a benchmark where the grammar can be swapped in/out interchangeably. The benchmark file derives from a benchmark template I've been working on, so most of the code in the PR is actually boilerplate with just a little bit specifying how to generate inputs, what the parameter grid is, and how to run the code to benchmark.

Also, I did try to pass the parser for the Python grammar into hypothesis but got:

InvalidArgument: Undefined terminal '_INDENT'. Generation does not currently support use of %declare unless you pass `explicit`, a dict of names-to-strategies, such as `{'_INDENT': st.just("")}`

So work might need to be done to get that up to speed. I'm not a big fan of the hypothesis API so far, so I wonder how much of that is my unfamiliarity with the topic versus the API design itself. Like you said, I would have thought generating from a grammar would be more straightforward, but either there is a complexity I'm unaware of, or there could be a nicer API to handle it (hypothesis seems like it is built to handle much more than just generating from grammars).

@erezsh
Member

erezsh commented Mar 21, 2022

My sense is that it would be very easy to find a big bunch of Python snippets (probably there are existing collections), and then just sort them by size. I don't think you need to generate anything yourself.

Anyway, I admit I haven't had a chance to use Hypothesis yet. And as for the Lark plugin, even its author admits it is not as good as it could be.
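A rough sketch of the "collect existing snippets and sort them by size" approach, assuming the snippets are `.py` files under some directory. The helper names `snippets_by_size` and `pick_evenly` are hypothetical and not part of the PR.

```python
from pathlib import Path


def snippets_by_size(root):
    """Return (size_in_chars, path) pairs for every .py file under root,
    sorted from smallest to largest."""
    sized = []
    for path in Path(root).rglob("*.py"):
        try:
            text = path.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError):
            continue  # skip unreadable or non-UTF-8 files
        sized.append((len(text), str(path)))
    return sorted(sized)


def pick_evenly(sized, count):
    """Pick `count` items spread evenly across an already sorted list,
    always including the smallest and the largest when count >= 2."""
    if count <= 0 or not sized:
        return []
    if count >= len(sized):
        return list(sized)
    if count == 1:
        return [sized[0]]
    step = (len(sized) - 1) / (count - 1)
    return [sized[round(i * step)] for i in range(count)]
```

Pointing `snippets_by_size` at any directory of Python sources (for example the standard library directory of the local install) and passing the result through `pick_evenly` would give a small set of real inputs covering a range of sizes, which could then replace the generated strings in the benchmark.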

@erezsh
Member

erezsh commented Apr 7, 2022

Did you make any progress on this? Just curious

@Erotemic
Collaborator Author

Erotemic commented Apr 7, 2022

Probably won't have time to do any more work on this anytime soon.
