Commit 872a646

Enhance performance of synthesize_collect expressions.
- Implemented `synthesize_collect_expression` to optimize the combination of synthesis and collection operations, improving performance for complex data structures.
- Updated `README.md` to clarify the functionality of bracket expressions.
- Refined `synthesize_expression` error messages for better clarity.
- Removed redundant `build_collection` method in favor of the new `build_container` function, streamlining the codebase.
- Adjusted `CHANGELOG.md` to reflect these significant changes and improvements.
1 parent 697f75d commit 872a646

3 files changed: +54 -30 lines changed

CHANGELOG.md

Lines changed: 1 addition & 1 deletion
@@ -2,10 +2,10 @@
 
 ## Release v0.6.0 (Under Development)
 
+* Implemented optimized `synthesize_collect_expression` that combines synthesize and collect operations into a single semantic action, eliminating redundant attribute stack operations and significantly improving performance for complex data structure construction.
 * Replaced the basic regular expression (BRE) implementation with bracket expressions (`_bx`). This change improves performance and reduces code complexity as the previous implementation was only partially complete and primarily used for character sets or ranges. The new bracket expressions provide the same functionality with better optimization opportunities in the expression tree, resulting in smaller code size, faster parsing and faster compilation times.
 * Completely redesigned repeat mechanism with a highly optimized implementation that dramatically reduces bytecode size and significantly improves parsing performance for repetitive patterns. Introduced more expressive control directives: `repeat<min,max>[e]` for bounded repetition, `at_least<min>[e]` for minimum repetition, `at_most<max>[e]` for maximum repetition, and `exactly<count>[e]` for fixed repetition.
 * Added specialized repeat opcodes that eliminate redundant stack operations and reduce instruction count for common repetition patterns.
-* Implemented tail-call optimization for repeat operations to prevent stack overflow on deeply nested repetitions.
 * Implemented an optimized whitespace skipping mechanism, replacing the previous implementation for better performance.
 * Fixed critical issues with `eol` and `eoi` combinators that were incorrectly interacting with whitespace skipping logic.
 * Added specialized fast paths for common whitespace patterns to improve parsing speed in typical scenarios.

README.md

Lines changed: 2 additions & 2 deletions
@@ -202,7 +202,7 @@ Quick Reference
 | `chr(c)` | Matches the UTF-8, UTF-16, or UTF-32 character *c*. |
 | `chr(c1, c2)` | Matches characters in the UTF-8, UTF-16, or UTF-32 interval \[*c1*-*c2*\]. |
 | `str(s)` | Matches the sequence of characters in the string *s*. |
-| `bkt(s)` | Bracket expression matching a set of characters or character ranges, optionally negated with a `^` prefix. Can be referred to informally as a character bucket expression. |
+| `bkt(s)` | Bracket expression matching a set of characters and/or character intervals in string *s*, optionally negated with a *^* prefix. |
 | `any` | Matches any single character. |
 | `any(flags)` | Matches a character exhibiting any of the character properties. |
 | `all(flags)` | Matches a character with all of the character properties. |
@@ -236,7 +236,7 @@ Quick Reference
 | `_bx` | Bracket Expression | Bracket expression containing multiple caracters and character ranges. |
 | `_icx` | Case Insensitive Character Expression | Same as `_cx` but case insensitive |
 | `_isx` | Case Insensitive String Expression | Same as `_sx` but case insensitive |
-| `_ibx` | Case Insensitive Regular Expression | Same as `_bx` but case insensitive |
+| `_ibx` | Case Insensitive Bracket Expression | Same as `_bx` but case insensitive |
 | `_scx` | Case Sensitive Character Expression | Same as `_cx` but case sensitive |
 | `_ssx` | Case Sensitive String Expression | Same as `_sx` but case sensitive |
 | `_sbx` | Case Sensitive Bracket Expression | Same as `_bx` but case sensitive |
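The reworded `bkt(s)` row describes a set of characters and/or intervals with an optional `^` negation prefix. As a rough illustration of that matching rule only, here is a minimal standalone sketch. It is not lug's implementation: `bracket_matches` is a hypothetical helper, it handles single-byte characters only, and its treatment of a leading or trailing `-` as a literal is an assumption, since this diff does not specify the exact syntax lug accepts.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of lug): returns true when character c is
// matched by the bracket specification spec, e.g. "a-z0-9_" or "^0-9".
[[nodiscard]] inline bool bracket_matches(std::string_view spec, char c)
{
    // An initial '^' negates the whole set.
    bool negated = false;
    if (!spec.empty() && spec.front() == '^') {
        negated = true;
        spec.remove_prefix(1);
    }
    auto const uc = static_cast<unsigned char>(c);
    bool found = false;
    for (std::size_t i = 0; i < spec.size() && !found; ) {
        auto const lo = static_cast<unsigned char>(spec[i]);
        if (i + 2 < spec.size() && spec[i + 1] == '-') {
            // "lo-hi" denotes an inclusive interval.
            auto const hi = static_cast<unsigned char>(spec[i + 2]);
            found = (lo <= uc) && (uc <= hi);
            i += 3;
        }
        else {
            // Any other character stands for itself (assumed: so does a stray '-').
            found = (lo == uc);
            i += 1;
        }
    }
    return found != negated;
}

// Small usage examples for the sketch above.
inline void check_bracket_examples()
{
    assert(bracket_matches("a-z0-9_", 'q'));
    assert(bracket_matches("a-z0-9_", '_'));
    assert(!bracket_matches("a-z0-9_", 'Q'));
    assert(bracket_matches("^0-9", 'x'));
    assert(!bracket_matches("^0-9", '5'));
}
```

A case-insensitive variant in the spirit of `_ibx` could, under the same assumptions, fold both the specification and the input character to one case before comparing.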

include/lug/lug.hpp

Lines changed: 51 additions & 27 deletions
@@ -1846,6 +1846,32 @@ template <class X1> symbol_block_expression(X1&&) -> symbol_block_expression<std
 template <class X1> local_block_expression(X1&&) -> local_block_expression<std::decay_t<X1>>;
 template <class X1> local_to_block_expression(X1&&, std::string_view) -> local_to_block_expression<std::decay_t<X1>>;
 
+template <class Container, class... As, std::size_t... Is>
+[[nodiscard]] Container build_container(environment& envr, std::index_sequence<Is...> const& seq)
+{
+    Container container;
+    if (auto attributes = envr.finish_attribute_collection(seq.size()); !attributes.empty()) {
+        if constexpr (detail::container_has_reserve_v<Container>)
+            container.reserve(attributes.size() / seq.size());
+        if constexpr (detail::container_has_emplace_back_v<Container, As...>) {
+            for ( ; !attributes.empty(); attributes.consume_front(seq.size()))
+                (void)container.emplace_back(attributes.template read_front<As, Is>()...);
+        }
+        else if constexpr (detail::container_has_emplace_after_v<Container, As...>) {
+            for (auto last = container.cbegin(); !attributes.empty(); attributes.consume_front(seq.size()))
+                last = container.emplace_after(last, attributes.template read_front<As, Is>()...);
+        }
+        else if constexpr (detail::container_has_emplace_v<Container, As...>) {
+            for ( ; !attributes.empty(); attributes.consume_back(seq.size()))
+                (void)container.emplace(attributes.template read_back<As, Is, sizeof...(Is)>()...);
+        }
+        else {
+            static_assert(detail::always_false_v<Container>, "container type does not support attribute collection");
+        }
+    }
+    return container;
+}
+
 template <class E1, class Container, class... ElementArgs>
 struct collect_expression : unary_encoder_expression_interface<collect_expression<E1, Container, ElementArgs...>, E1>
 {
@@ -1859,32 +1885,9 @@ struct collect_expression : unary_encoder_expression_interface<collect_expressio
     {
         d.encode(opcode::action, semantic_action{[](environment& envr) { envr.start_attribute_collection(); }});
         auto m2 = this->e1.evaluate(d, m);
-        d.encode(opcode::action, semantic_action{[](environment& envr) { collect_expression::build_collection<ElementArgs...>(envr, std::index_sequence_for<ElementArgs...>{}); }});
+        d.encode(opcode::action, semantic_action{[](environment& envr) { envr.push_attribute(lug::build_container<Container, ElementArgs...>(envr, std::index_sequence_for<ElementArgs...>{})); }});
         return m2;
     }
-
-    template <class... As, std::size_t... Is>
-    static void build_collection(environment& envr, std::index_sequence<Is...> const& seq)
-    {
-        Container container;
-        if (auto attributes = envr.finish_attribute_collection(seq.size()); !attributes.empty()) {
-            if constexpr (detail::container_has_reserve_v<Container>)
-                container.reserve(attributes.size() / seq.size());
-            if constexpr (detail::container_has_emplace_back_v<Container, As...>) {
-                for ( ; !attributes.empty(); attributes.consume_front(seq.size()))
-                    (void)container.emplace_back(attributes.template read_front<As, Is>()...);
-            } else if constexpr (detail::container_has_emplace_after_v<Container, As...>) {
-                for ( auto last = container.cbegin(); !attributes.empty(); attributes.consume_front(seq.size()))
-                    last = container.emplace_after(last, attributes.template read_front<As, Is>()...);
-            } else if constexpr (detail::container_has_emplace_v<Container, As...>) {
-                for ( ; !attributes.empty(); attributes.consume_back(seq.size()))
-                    (void)container.emplace(attributes.template read_back<As, Is, sizeof...(Is)>()...);
-            } else {
-                static_assert(detail::always_false_v<Container>, "container type does not support attribute collection");
-            }
-        }
-        envr.push_attribute(std::move(container));
-    }
 };
 
 template <class X1, class C, class... As> collect_expression(X1&&, std::in_place_type_t<C>, std::in_place_type_t<As>...) -> collect_expression<std::decay_t<X1>, C, As...>;
@@ -1906,7 +1909,7 @@ template <class E1, class Factory, class T, class... Args>
 struct synthesize_expression : unary_encoder_expression_interface<synthesize_expression<E1, Factory, T, Args...>, E1>
 {
     static_assert(sizeof...(Args) > 0, "no arguments types provided to synthesize expression" );
-    static_assert(std::is_constructible_v<T, std::decay_t<Args>...>, "synthesized type does not support the provided constructor arguments" );
+    static_assert(std::is_constructible_v<T, std::decay_t<Args>...>, "synthesized type T does not support the provided constructor arguments" );
     using base_type = unary_encoder_expression_interface<synthesize_expression<E1, Factory, T, Args...>, E1>;
     template <class X1, class F, class U, class... As> constexpr synthesize_expression(X1&& x1, std::in_place_type_t<F> /*f*/, std::in_place_type_t<U> /*u*/, std::in_place_type_t<As>... /*a*/) noexcept : base_type{std::forward<X1>(x1)} {}
 
@@ -1943,16 +1946,37 @@ struct synthesize_combinator
     }
 };
 
+template <class E1, class Factory, class T, class Container, class... ElementArgs>
+struct synthesize_collect_expression : unary_encoder_expression_interface<synthesize_collect_expression<E1, Factory, T, Container, ElementArgs...>, E1>
+{
+    static_assert(sizeof...(ElementArgs) > 0, "no element types provided to collect expression");
+    static_assert(std::is_constructible_v<typename Container::value_type, std::decay_t<ElementArgs>...>, "synthesized element type does not support the provided constructor argument types");
+    static_assert(std::is_constructible_v<T, Container>, "synthesized type T not constructible from Container type argument");
+    using base_type = unary_encoder_expression_interface<synthesize_collect_expression<E1, Factory, T, Container, ElementArgs...>, E1>;
+    template <class X1, class F, class V, class C, class... As> constexpr synthesize_collect_expression(X1&& x1, std::in_place_type_t<F> /*f*/, std::in_place_type_t<V> /*v*/, std::in_place_type_t<C> /*c*/, std::in_place_type_t<As>... /*a*/) noexcept : base_type{std::forward<X1>(x1)} {}
+
+    template <class M>
+    [[nodiscard]] constexpr decltype(auto) evaluate(encoder& d, M const& m) const
+    {
+        d.encode(opcode::action, semantic_action{[](environment& envr) { envr.start_attribute_collection(); }});
+        auto m2 = this->e1.evaluate(d, m);
+        d.encode(opcode::action, semantic_action{[](environment& envr) { envr.push_attribute(Factory{}(std::in_place_type<T>, lug::build_container<Container, ElementArgs...>(envr, std::index_sequence_for<ElementArgs...>{}))); }});
+        return m2;
+    }
+};
+
+template <class X1, class F, class V, class C, class... As> synthesize_collect_expression(X1&&, std::in_place_type_t<F>, std::in_place_type_t<V>, std::in_place_type_t<C>, std::in_place_type_t<As>...) -> synthesize_collect_expression<std::decay_t<X1>, F, V, C, As...>;
+
 template <class Factory, class T, class Container, class... ElementArgs>
 struct synthesize_collect_combinator
 {
     template <class E, class = std::enable_if_t<is_expression_v<E>>>
     [[nodiscard]] constexpr auto operator[](E const& e) const noexcept
     {
         if constexpr (sizeof...(ElementArgs) == 0)
-            return synthesize_expression{collect_expression{make_expression(e), std::in_place_type<Container>, std::in_place_type<typename Container::value_type>}, std::in_place_type<Factory>, std::in_place_type<T>, std::in_place_type<Container>};
+            return synthesize_collect_expression{make_expression(e), std::in_place_type<Factory>, std::in_place_type<T>, std::in_place_type<Container>, std::in_place_type<typename Container::value_type>};
         else
-            return synthesize_expression{collect_expression{make_expression(e), std::in_place_type<Container>, std::in_place_type<ElementArgs>...}, std::in_place_type<Factory>, std::in_place_type<T>, std::in_place_type<Container>};
+            return synthesize_collect_expression{make_expression(e), std::in_place_type<Factory>, std::in_place_type<T>, std::in_place_type<Container>, std::in_place_type<ElementArgs>...};
     }
 };
 
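The two ideas in this change, choosing an insertion member at compile time and handing the finished container straight to the factory, can be tried outside of lug with a small standalone sketch. Everything in it is an assumption made for illustration. The `has_*` traits stand in for lug's `detail::container_has_*_v` helpers, whose definitions are not part of this diff. A plain `std::vector` of values replaces the attribute stack. `build_from`, `value_factory`, `int_list`, and `synthesize_collect` are hypothetical names. The sketch also handles one value per element, whereas the diff supports several constructor arguments per element.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <iterator>
#include <set>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical detection traits standing in for lug's detail::container_has_*_v helpers.
template <class C, class = void>
struct has_reserve : std::false_type {};
template <class C>
struct has_reserve<C, std::void_t<decltype(std::declval<C&>().reserve(std::size_t{}))>> : std::true_type {};

template <class C, class V, class = void>
struct has_emplace_back : std::false_type {};
template <class C, class V>
struct has_emplace_back<C, V, std::void_t<decltype(std::declval<C&>().emplace_back(std::declval<V>()))>> : std::true_type {};

template <class C, class V, class = void>
struct has_emplace_after : std::false_type {};
template <class C, class V>
struct has_emplace_after<C, V, std::void_t<decltype(std::declval<C&>().emplace_after(std::declval<C&>().cbefore_begin(), std::declval<V>()))>> : std::true_type {};

template <class C, class V, class = void>
struct has_emplace : std::false_type {};
template <class C, class V>
struct has_emplace<C, V, std::void_t<decltype(std::declval<C&>().emplace(std::declval<V>()))>> : std::true_type {};

template <class> inline constexpr bool always_false_v = false;

// Builds a Container from already-collected values, trying emplace_back,
// then emplace_after, then emplace, in the same priority order as the diff.
template <class Container, class V>
[[nodiscard]] Container build_from(std::vector<V> const& values)
{
    Container container;
    if constexpr (has_reserve<Container>::value)
        container.reserve(values.size());
    if constexpr (has_emplace_back<Container, V const&>::value) {
        for (V const& v : values)
            (void)container.emplace_back(v);
    }
    else if constexpr (has_emplace_after<Container, V const&>::value) {
        // std::forward_list needs before_begin() to insert into an empty list.
        auto last = container.cbefore_begin();
        for (V const& v : values)
            last = container.emplace_after(last, v);
    }
    else if constexpr (has_emplace<Container, V const&>::value) {
        for (V const& v : values)
            (void)container.emplace(v);
    }
    else {
        static_assert(always_false_v<Container>, "container type does not support collection");
    }
    return container;
}

// Hypothetical factory mirroring the call shape Factory{}(std::in_place_type<T>, container).
struct value_factory
{
    template <class T, class... Args>
    [[nodiscard]] T operator()(std::in_place_type_t<T>, Args&&... args) const
    {
        return T(std::forward<Args>(args)...);
    }
};

// Hypothetical synthesized type that takes ownership of the collected container.
struct int_list
{
    std::vector<int> items;
    explicit int_list(std::vector<int> v) : items(std::move(v)) {}
};

// Fused step: build the container and pass it directly to the factory,
// with no intermediate push and pop of the container itself.
template <class Factory, class T, class Container, class V>
[[nodiscard]] T synthesize_collect(std::vector<V> const& values)
{
    static_assert(std::is_constructible_v<T, Container>, "T not constructible from Container");
    return Factory{}(std::in_place_type<T>, build_from<Container>(values));
}

// Small usage examples for the sketch above.
inline void check_examples()
{
    std::vector<int> const collected{3, 1, 3};
    assert((build_from<std::vector<int>>(collected) == std::vector<int>{3, 1, 3}));
    auto const list = build_from<std::forward_list<int>>(collected);
    assert(std::distance(list.begin(), list.end()) == 3 && list.front() == 3);
    assert(build_from<std::set<int>>(collected).size() == 2); // duplicates collapse in a set
    auto const result = synthesize_collect<value_factory, int_list, std::vector<int>>(collected);
    assert(result.items.size() == 3);
}
```

In the previous combinator, `collect_expression` pushed the container as an attribute and `synthesize_expression` then consumed it in a second semantic action. The sketch's `synthesize_collect` corresponds to the single action that replaces that pair. The `emplace_after` branch starts from `cbefore_begin()` because `std::forward_list::emplace_after` requires a dereferenceable position or the before-begin iterator.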