Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unbounded mutual recursion in Parser impl #347

Open
tari opened this issue May 10, 2022 · 3 comments
Open

Unbounded mutual recursion in Parser impl #347

tari opened this issue May 10, 2022 · 3 comments

Comments

@tari
Copy link

tari commented May 10, 2022

I've implemented a recursive descent parser via the precedence climbing method using Combine, but it seems prone to unbounded mutual recursion somewhere in undocumented methods of the Parser impl.

The parser itself looks something like this:

struct ExpressionParser(u8);

impl<Input> Parser<Input> for ExpressionParser<Input>
where Input: Stream<Token = u8>
{
    type Output = EvalResult;
    type PartialState = ();

    fn parse_stream(&mut self, input: &mut Input) -> ParseResult<Self::Output, <Input as StreamOnce>::Error> {
        // calls parse_terminal() and may also recurse explicitly through parse_stream()
    }
}

impl<Input> ExpressionParser<Input>
where Input: Stream<Token = u8>
{
    fn parse_terminal(&self, input: &mut Input) -> ParseResult<EvalResult, <Input as StreamOnce>::Error> {
        // sometimes recurses through the Parser impl on Self
    }
}

I find that in some formulations this gets stuck calling parse_first, parse_partial and parse_lazy on the Parser without end (until the stack overflows) as illustrated by this stack trace (most recent call first) taken from where parse is called before things go wrong:

#74777 0x000055555555df80 in combine::parser::Parser::parse_first<ExpressionParser<&[u8]>, &[u8]> (
    self=0x7fffffffc459, input=0x7fffffffd558, state=0x7fffffffbb18)
    at /home/tari/.cargo/registry/src/github.com-1ecc6299db9ec823/combine-4.6.3/src/parser/mod.rs:245                                       
#74778 0x000055555555df21 in combine::parser::Parser::parse_lazy<ExpressionParser<&[u8]>, &[u8]> (    
    self=0x7fffffffc459, input=0x7fffffffd558)
    at /home/tari/.cargo/registry/src/github.com-1ecc6299db9ec823/combine-4.6.3/src/parser/mod.rs:196                                       
#74779 0x000055555555dfb0 in combine::parser::Parser::parse_partial<ExpressionParser<&[u8]>, &[u8]> (         
    self=0x7fffffffc459, input=0x7fffffffd558, state=0x7fffffffc361)
    at /home/tari/.cargo/registry/src/github.com-1ecc6299db9ec823/combine-4.6.3/src/parser/mod.rs:261                                       
#74780 0x000055555555df80 in combine::parser::Parser::parse_first<ExpressionParser<&[u8]>, &[u8]> (                
    self=0x7fffffffc459, input=0x7fffffffd558, state=0x7fffffffc361)
    at /home/tari/.cargo/registry/src/github.com-1ecc6299db9ec823/combine-4.6.3/src/parser/mod.rs:245
#74781 0x000055555555e063 in combine::parser::Parser::parse_mode_impl<ExpressionParser<&[u8]>, &[u8], combine::parser::FirstMode> (
    self=0x7fffffffc459, mode=..., input=0x7fffffffd558, state=0x7fffffffc361)
    at /home/tari/.cargo/registry/src/github.com-1ecc6299db9ec823/combine-4.6.3/src/parser/mod.rs:294                                      
#74782 0x000055555556274f in combine::parser::{impl#3}::parse<ExpressionParser<&[u8]>, &[u8]> (
    self=..., parser=0x7fffffffc459, input=0x7fffffffd558, state=0x7fffffffc361)                                                                      
    at /home/tari/.cargo/registry/src/github.com-1ecc6299db9ec823/combine-4.6.3/src/parser/mod.rs:1170

According to the documentation it seems like implementing parse_stream alone is fine, and parse_partial/parse_first are not documented so I don't understand what they're meant to be doing or why they end up mutually recursing.


When I previously had this problem in parse_terminal I was able to work around it (for reasons I don't understand) by wrapping the recursive call in a wrapper function. Whereas this sort of thing exhibited the same problem described here:

Self::parse(instance, input)

wrapping it in parser and explicitly calling parse_stream did not:

combine::parser(|i| Self::parse_stream(instance, i))
@Marwes
Copy link
Owner

Marwes commented May 10, 2022

I'd generally avoid implementing Parser manually as the addition of partial parsing complicates manual implementations quite a bit. Instead use combine::parser(|i| ...) if you need a manual function implementation.

If you still need a manual implementation I would copy paste from an implementation in the combine repository as it seems that just implementing parse_stream does not work, but parse_lazy does work.

impl<Input, P> Parser<Input> for Satisfy<Input, P>
where
    Input: Stream,
    P: FnMut(Input::Token) -> bool,
{
    type Output = Input::Token;
    type PartialState = ();

    #[inline]
    fn parse_lazy(&mut self, input: &mut Input) -> ParseResult<Self::Output, Input::Error> {
        satisfy_impl(input, |c| {
            if (self.predicate)(c.clone()) {
                Some(c)
            } else {
                None
            }
        })
    }
}

@tari
Copy link
Author

tari commented May 11, 2022

I can confirm that simply changing my impl to implement parse_lazy instead of parse_stream fixes the issue I was seeing.

That suggests to me that the Parser documentation should be changed to suggest that implementers implement parse_lazy only, rather than either of that or parse_stream. Would that be reasonable, or is there something else that means sometimes users could want to implement parse_stream?

@shaobo-he-aws
Copy link

shaobo-he-aws commented Jul 31, 2023

I encountered a similar issue. We should at least update the document.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants