@@ -258,8 +258,8 @@ There are two ways you can use this project:
 1. as a scanner generator for C++, similar to Flex;
 2. as a flexible regex library API for C++.
 
-For the first option, simply build the **reflex** tool and run it on the
-command line on a lexer specification:
+For the first use case, use the **reflex** tool on the command line on a lexer
+specification:
 
     $ reflex --flex --bison --graphs-file lexspec.l
 
@@ -275,13 +275,15 @@ visualized with the [Graphviz dot][dot-url] tool:
 Several examples are included to get you started. See the [manual][manual-url]
 for more details.
 
-For the second option, simply use the RE/flex matcher API classes to start
-pattern search, matching, splitting and scanning on strings, wide strings,
-files, and streams.
+For the second use case, use the RE/flex matcher API classes to start pattern
+search, matching, splitting and scanning on strings, wide strings, files, and
+streams.
 
 You can select matchers that are based on different regex engines:
 
 - RE/flex regex: `#include <reflex/matcher.h>` and use `reflex::Matcher`;
+- RE/flex fuzzy regex for approximate matching:
+  `#include <reflex/fuzzymatcher.h>` and use `reflex::FuzzyMatcher`
 - PCRE2: `#include <reflex/pcre2matcher.h>` and use `reflex::PCRE2Matcher` or
   `reflex::PCRE2UTFMatcher`.
 - Boost.Regex: `#include <reflex/boostmatcher.h>` and use
@@ -292,13 +294,19 @@ You can select matchers that are based on different regex engines:
 Each matcher may differ in regex syntax features (see the full documentation),
 but they all share the same methods and iterators, such as:
 
-- `matches()` returns nonzero if the input matches the specified pattern;
-- `find()` search input and returns nonzero if a match was found;
-- `scan()` scan input and returns nonzero if input at current position matches;
-- `split()` returns nonzero for a split of the input at the next match;
-- `find.begin()`...`find.end()` filter iterator;
-- `scan.begin()`...`scan.end()` tokenizer iterator;
-- `split.begin()`...`split.end()` splitter iterator.
+- `matches()` returns nonzero if the whole input from start to end matches the specified pattern;
+- `find()` search input and returns nonzero if a match was found, can be repeated;
+- `scan()` scan input and returns nonzero if input at current position matches, can be repeated;
+- `split()` returns nonzero for a split of the input at the next match, can be repeated;
+- `find.begin()`...`find.end()` a filter iterator, iterates with `find()`;
+- `scan.begin()`...`scan.end()` a tokenizer iterator, iterates with `scan()`;
+- `split.begin()`...`split.end()` a splitter iterator, iterates with `split()`.
+
+The input matched and searched may be a string, a wide string, a file, or a
+stream. Searching is incremental, meaning that the input is not buffered as a
+whole in memory, but rather buffered in parts in a sliding window of a few KB.
+The window size may grow to fit a pattern match. UTF-16/32 file input with a
+UTF BOM is automatically normalized and matched as UTF-8.
 
 For example, using Boost.Regex (alternatively use PCRE2 `reflex::PCRE2Matcher`):
 
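The README's own Boost.Regex example lies outside this hunk and is not reproduced here. As a rough stand-in, the sketch below uses only the C++ standard library (`std::regex`), not the RE/flex API, to illustrate three of the operations the list above distinguishes: matching the whole input, searching repeatedly, and splitting at matches. The helper names `whole_match`, `find_all`, and `split_all` are hypothetical and exist only for this illustration. The RE/flex matchers may differ in return values and edge cases.

```cpp
#include <regex>
#include <string>
#include <vector>

// Illustration only: std::regex analogs, NOT the RE/flex matcher API.
// All three helper names are hypothetical.

// Whole-input match: true only if the pattern covers the input from start to end.
bool whole_match(const std::string& input, const std::regex& re) {
  return std::regex_match(input, re);
}

// Repeated search: collect every non-overlapping match, left to right.
std::vector<std::string> find_all(const std::string& input, const std::regex& re) {
  std::vector<std::string> out;
  for (std::sregex_iterator it(input.begin(), input.end(), re), end; it != end; ++it)
    out.push_back(it->str());
  return out;
}

// Split: collect the pieces of text that lie between successive matches.
std::vector<std::string> split_all(const std::string& input, const std::regex& re) {
  std::vector<std::string> out;
  // Submatch index -1 selects the text between matches instead of the matches.
  for (std::sregex_token_iterator it(input.begin(), input.end(), re, -1), end; it != end; ++it)
    out.push_back(it->str());
  return out;
}
```

For instance, `whole_match("2024-01-31", std::regex("\\d{4}-\\d{2}-\\d{2}"))` accepts the date, but the same pattern rejects `"born 2024-01-31"` because the match must span the entire input. By contrast, `find_all` with `\w+` picks out each word of a sentence in turn.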