Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in logic for dropping parentheses #674

Closed
magbak opened this issue Aug 11, 2023 · 18 comments · Fixed by #675
Closed

Bug in logic for dropping parentheses #674

magbak opened this issue Aug 11, 2023 · 18 comments · Fixed by #675

Comments

@magbak
Copy link
Contributor

magbak commented Aug 11, 2023

Description

From https://github.com/SeaQL/sea-query/blob/master/src/expr.rs:

pub(crate) fn need_parentheses(&self) -> bool {
        match self {
            Self::Binary(left, oper, _) => !matches!(
                (left.as_ref(), oper),
                (Self::Binary(_, BinOper::And, _), BinOper::And)
                    | (Self::Binary(_, BinOper::Or, _), BinOper::Or)
            ),
            _ => false,
        }
    }

This logic appears flawed, since the expression
(("a" OR "b") OR "c") is determined to not need a parenthesis.
But it clearly does some times:
((("a" OR "b") OR "c") AND "d")

In binary_expr() (a method of QueryBuilder), need_parenthesis()=false is sufficient to drop them, but SimpleExpr really does not have enough information to be able to decide that.

Steps to Reproduce

mod tests {
    use crate::backend::CommonSqlQueryBuilder;
    use crate::{Alias, BinOper, ColumnRef, DynIden, IntoIden, QueryBuilder, SimpleExpr};

    #[test]
    fn test_parentheses() {
        let qb = CommonSqlQueryBuilder {};
        let mut sql_string = "".to_string();
        let a = SimpleExpr::Column(ColumnRef::Column(Alias::new("a").into_iden()));
        let b = SimpleExpr::Column(ColumnRef::Column(Alias::new("b").into_iden()));
        let c = SimpleExpr::Column(ColumnRef::Column(Alias::new("c").into_iden()));
        let d = SimpleExpr::Column(ColumnRef::Column(Alias::new("d").into_iden()));

        let x_op_y = |x,op,y| SimpleExpr::Binary(Box::new(x), op, Box::new(y));
        let a_or_b = x_op_y(a,BinOper::Or, b);
        let a_or_b_or_c = x_op_y(a_or_b,BinOper::Or, c);
        let a_or_b_or_c_and_d = x_op_y(a_or_b_or_c.clone(),BinOper::And, d);
        println!("Needs parentheses: {}", a_or_b_or_c.need_parentheses());
        qb.prepare_simple_expr(&a_or_b_or_c_and_d, &mut sql_string);

        println!("Sql: {}", sql_string);
    }
}

Outputs:
Needs parentheses: false
Sql: "a" OR "b" OR "c" AND "d"
But should be ("a" OR "b" OR "c") AND "d" or equivalent.

Expected Behavior

Expect parentheses to preserve semantics of expressions.

Actual Behavior

Parentheses are eliminated in a way that changes the semantics of the expression.

Reproduces How Often

Deterministic.

Versions

v0.30.0
DB / OS not relevant.

Additional Information

See test case above.

@magbak magbak changed the title Logic for dropping parentheses is flawed Bug in logic for dropping parentheses Aug 11, 2023
@tyt2y3
Copy link
Member

tyt2y3 commented Aug 11, 2023

This was the reason the Condition API was introduced, and is the recommended way to build complex / mixed conditional expressions.

The reason () was (attempted to be) omitted is we wanted to make it more readable, i.e. to not grow into ((((((a or b) or c) ...

However as you pointed out, it may be non-trivial to do it properly.

@magbak
Copy link
Contributor Author

magbak commented Aug 12, 2023

Thanks for getting back to me quickly!
Yes, I think it might be a bit tricky to solve generally.
The nested or / and are however pretty much fixed by the conditions for the left hand side parenthesis in binary_expr():

&& left.is_binary()
&& op != left.get_bin_oper().unwrap()

Which appears sound - most builtin binary operations are parsed in a left associative way in PostgreSQL.
https://www.postgresql.org/docs/15/sql-syntax-lexical.html
But custom operators, which appear to be supported by SeaQL might be right associative. (Edit: this was a bit of confusion on my part, we are looking here at the case where the placing left-associative parens vs. not doing so, and postgresql docs states that it will parse custom operators left associative aswell. So not a matter of mathematical associativity.)

Also, the comparison operators are nonassociative, so you have to keep the parentheses. See the additional bug below.
I think the simple straight and narrow approach is best here, initially requiring parentheses and only introducing rules that are known to be sound based on the left- and right associativity of the operator in question.

#[test]
    fn test_parentheses2() {
        let qb = CommonSqlQueryBuilder {};
        let mut sql_string = "".to_string();
        let a = SimpleExpr::Column(ColumnRef::Column(Alias::new("a").into_iden()));
        let b = SimpleExpr::Column(ColumnRef::Column(Alias::new("b").into_iden()));
        let c = SimpleExpr::Column(ColumnRef::Column(Alias::new("c").into_iden()));

        let x_op_y = |x,op,y| SimpleExpr::Binary(Box::new(x), op, Box::new(y));
        let a_smaller_b = x_op_y(a,BinOper::SmallerThan, b);
        let a_smaller_b_smaller_c= x_op_y(a_smaller_b, BinOper::SmallerThan, c);
        qb.prepare_simple_expr(&a_smaller_b_smaller_c, &mut sql_string);

        println!("Sql: {}", sql_string);
    }

Produces:
Sql: "a" < "b" < "c"
But this is something like
("a" < "b") and ("b" < "c")

@magbak
Copy link
Contributor Author

magbak commented Aug 12, 2023

I could make a pull request if you like? Think I have an easy solution.

@magbak
Copy link
Contributor Author

magbak commented Aug 12, 2023

The handling of unary expressions also has a serious bug:

#[test]
fn test_parentheses3() {
    let qb = CommonSqlQueryBuilder {};
    let mut sql_string = "".to_string();
    let a = SimpleExpr::Column(ColumnRef::Column(Alias::new("a").into_iden()));
    let b = SimpleExpr::Column(ColumnRef::Column(Alias::new("b").into_iden()));
    let c = SimpleExpr::Column(ColumnRef::Column(Alias::new("c").into_iden()));

    let op_x = |op,x| SimpleExpr::Unary(op, Box::new(x));
    let x_op_y = |x,op,y| SimpleExpr::Binary(Box::new(x), op, Box::new(y));
    let a_and_b = x_op_y(a, BinOper::And, b);
    let not_a_and_b = op_x(UnOper::Not, a_and_b);
    qb.prepare_simple_expr(&not_a_and_b, &mut sql_string);

    println!("Sql 3: {}", sql_string);
}

#[test]
fn test_parentheses4() {
    let qb = CommonSqlQueryBuilder {};
    let mut sql_string = "".to_string();
    let a = SimpleExpr::Column(ColumnRef::Column(Alias::new("a").into_iden()));
    let b = SimpleExpr::Column(ColumnRef::Column(Alias::new("b").into_iden()));
    let c = SimpleExpr::Column(ColumnRef::Column(Alias::new("c").into_iden()));

    let op_x = |op,x| SimpleExpr::Unary(op, Box::new(x));
    let x_op_y = |x,op,y| SimpleExpr::Binary(Box::new(x), op, Box::new(y));
    let not_a = op_x(UnOper::Not, a);
    let not_a_and_b = x_op_y(not_a, BinOper::And, b);
    qb.prepare_simple_expr(&not_a_and_b, &mut sql_string);

    println!("Sql 4: {}", sql_string);
}
Both of these yield the same result:
Sql 3: NOT "a" AND "b"
Sql 4: NOT "a" AND "b"

@magbak
Copy link
Contributor Author

magbak commented Aug 13, 2023

Well, there is no easy comprehensive solution to these problems, which must be fixed by considering how the associativity and precedence rules for parsing expressions in the respective backends.

The idea was to in binary expressions:
Drop left parenthesis if either:

  • Left binary and left.op = op and left_associative(op)
  • precedence(left) > precedence(op)

Drop right parenthesis if:

  • precedence(right) > precedence(op)

And similarly for Unary expressions:

  • Drop expr parenthesis if expr does not contain an op or precedence(expr.op) > precedence(op)

However, the respective backends supported have differing operator precedence rules:

What can be done is to order some of those pairs of operators where all backends agree, and have precedence be unknown otherwise. Adding the simple things.. like "*/%+-" greater than the comparison operators and so on.

Otherwise SQL construction should know about the backend, which it does not appear to do now, and integer precedence values consistent with the precedence table of the backend can be assigned.

@tyt2y3
Copy link
Member

tyt2y3 commented Aug 13, 2023

Thank you for your investigation and suggestion. It seemed very doable! I was wondering, can we further simplify the logic to "we add a paren if the operators are different", without regarding precedence? It might be more verbose, but it would not be wrong. I hope it'd not break too many of our existing test cases though.

Btw, the SQL building stage is database specific, so we can potentially have different logic for each backend. Though it is preferable if we can align them.

@magbak
Copy link
Contributor Author

magbak commented Aug 13, 2023

I think I changed my mind about needing DB specific information, since for the human SQL reader, you would essentially need to know the particularities of precedence rules for your DB to decide what the expression means. Better to go safe and readable and use some parentheses.

So I did as you suggested above and added paren if operators are different by default, but I also added some cases where parens can be dropped to cover exceptional situations and some broad precedence cases - otherwise one third of the tests broke, and the SQL was quite ugly.

Had a go at a pull request, let me know what you think.
Possibly the tests are a bit misplaced, or need to be duplicated across DBs?

#675

magbak added a commit to magbak/sea-query that referenced this issue Aug 16, 2023
magbak added a commit to magbak/sea-query that referenced this issue Aug 16, 2023
magbak added a commit to magbak/sea-query that referenced this issue Aug 16, 2023
@prakol16
Copy link

prakol16 commented Aug 17, 2023

This code causes a panic I believe:

fn test_sea_query() {
        let e = SimpleExpr::from(true).and(SimpleExpr::from(true).and(true.into()).and(true.into()));
        println!("Expr: {}",  Query::select().and_where(e).to_string(SqliteQueryBuilder));  // panics
}

Please fix this soon; this is a serious bug. We are building queries dynamically and cannot simply "switch" to using Condition::all/Condition::any. I would much prefer sea query use unnecessary parentheses and preserve soundness rather than panic on only moderately complicated nested binary operations.

Edit: Sorry, forgot to post the location of the panic. This is the stack trace:

panicked at 'called `Option::unwrap()` on a `None` value', ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sea-query-0.30.0/src/backend/query_builder.rs:1405:63
stack backtrace:
   0: rust_begin_unwind
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:578:5
   1: core::panicking::panic_fmt
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/panicking.rs:67:14
   2: core::panicking::panic
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/panicking.rs:117:5
   3: core::option::Option<T>::unwrap
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/option.rs:950:21
   4: sea_query::backend::query_builder::QueryBuilder::binary_expr
             at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sea-query-0.30.0/src/backend/query_builder.rs:1405:43
   5: sea_query::backend::query_builder::QueryBuilder::prepare_simple_expr_common
             at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sea-query-0.30.0/src/backend/query_builder.rs:312:22
   6: sea_query::backend::query_builder::QueryBuilder::prepare_simple_expr
             at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sea-query-0.30.0/src/backend/query_builder.rs:285:9
   7: sea_query::backend::query_builder::QueryBuilder::prepare_condition_where::{{closure}}
             at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sea-query-0.30.0/src/backend/query_builder.rs:1304:21
   8: core::iter::traits::iterator::Iterator::fold
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/iter/traits/iterator.rs:2482:21
   9: sea_query::backend::query_builder::QueryBuilder::prepare_condition_where
             at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sea-query-0.30.0/src/backend/query_builder.rs:1283:9
  10: sea_query::backend::query_builder::QueryBuilder::prepare_condition
             at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sea-query-0.30.0/src/backend/query_builder.rs:1260:17
  11: sea_query::backend::query_builder::QueryBuilder::prepare_select_statement
             at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sea-query-0.30.0/src/backend/query_builder.rs:127:9
  12: <sea_query::query::select::SelectStatement as sea_query::query::traits::QueryStatementBuilder>::build_collect_any_into
             at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sea-query-0.30.0/src/query/

And the relevant lines are:

        let right_paren = (right.need_parentheses()
            || right.is_binary() && op != left.get_bin_oper().unwrap())
        ---------------------------------------------------   ^ unwrap called on None
            && !no_right_paren
            && !no_paren;

@tyt2y3
Copy link
Member

tyt2y3 commented Aug 17, 2023

@prakol16 can you post the location and reason for the panic?

@prakol16
Copy link

@prakol16 can you post the location and reason for the panic?

I edited my original comment.

@magbak
Copy link
Contributor Author

magbak commented Aug 17, 2023

This code causes a panic I believe:

fn test_sea_query() {
        let e = SimpleExpr::from(true).and(SimpleExpr::from(true).and(true.into()).and(true.into()));
        println!("Expr: {}",  Query::select().and_where(e).to_string(SqliteQueryBuilder));  // panics
}

Please fix this soon; this is a serious bug. We are building queries dynamically and cannot simply "switch" to using Condition::all/Condition::any. I would much prefer sea query use unnecessary parentheses and preserve soundness rather than panic on only moderately complicated nested binary operations.

Edit: Sorry, forgot to post the location of the panic. This is the stack trace:

panicked at 'called `Option::unwrap()` on a `None` value', ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sea-query-0.30.0/src/backend/query_builder.rs:1405:63
stack backtrace:
   0: rust_begin_unwind
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/std/src/panicking.rs:578:5
   1: core::panicking::panic_fmt
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/panicking.rs:67:14
   2: core::panicking::panic
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/panicking.rs:117:5
   3: core::option::Option<T>::unwrap
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/option.rs:950:21
   4: sea_query::backend::query_builder::QueryBuilder::binary_expr
             at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sea-query-0.30.0/src/backend/query_builder.rs:1405:43
   5: sea_query::backend::query_builder::QueryBuilder::prepare_simple_expr_common
             at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sea-query-0.30.0/src/backend/query_builder.rs:312:22
   6: sea_query::backend::query_builder::QueryBuilder::prepare_simple_expr
             at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sea-query-0.30.0/src/backend/query_builder.rs:285:9
   7: sea_query::backend::query_builder::QueryBuilder::prepare_condition_where::{{closure}}
             at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sea-query-0.30.0/src/backend/query_builder.rs:1304:21
   8: core::iter::traits::iterator::Iterator::fold
             at /rustc/90c541806f23a127002de5b4038be731ba1458ca/library/core/src/iter/traits/iterator.rs:2482:21
   9: sea_query::backend::query_builder::QueryBuilder::prepare_condition_where
             at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sea-query-0.30.0/src/backend/query_builder.rs:1283:9
  10: sea_query::backend::query_builder::QueryBuilder::prepare_condition
             at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sea-query-0.30.0/src/backend/query_builder.rs:1260:17
  11: sea_query::backend::query_builder::QueryBuilder::prepare_select_statement
             at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sea-query-0.30.0/src/backend/query_builder.rs:127:9
  12: <sea_query::query::select::SelectStatement as sea_query::query::traits::QueryStatementBuilder>::build_collect_any_into
             at ~/.cargo/registry/src/index.crates.io-6f17d22bba15001f/sea-query-0.30.0/src/query/

And the relevant lines are:

        let right_paren = (right.need_parentheses()
            || right.is_binary() && op != left.get_bin_oper().unwrap())
        ---------------------------------------------------   ^ unwrap called on None
            && !no_right_paren
            && !no_paren;

Edit:I think this is fixed as a consequence of by bugfix in #675 (edited with correct PR)
Are you able to provide a test case?

@prakol16
Copy link

Issue #657 seems to be about derive_attr which seems unrelated to this issue. Did you mean #675 ?

I agree #675 likely solves this issue. I just wanted to point out that this isn't just a soundness issue -- it can cause crashes/panics as well. I think this puts a little pressure to get the PR merged and released sooner.

Test case:

#[test]
fn test_issue_674_nested_logical_panic() {
    let e = SimpleExpr::from(true).and(SimpleExpr::from(true).and(true.into()).and(true.into()));

    assert_eq!(
        Query::select()
            .columns([Char::Character])
            .from(Char::Table)
            .and_where(e)
            .to_string(PostgresQueryBuilder),
        r#"SELECT "character" FROM "character" WHERE TRUE AND (TRUE AND TRUE AND TRUE)"#
    );
}

I believe this is correct. Another correct option (with more parentheses) includes r#"SELECT "character" FROM "character" WHERE TRUE AND ((TRUE AND TRUE) AND TRUE)"#. I might have done the "manual query building" here wrong though -- the important thing is that it does not panic.

magbak added a commit to magbak/sea-query that referenced this issue Aug 17, 2023
@magbak
Copy link
Contributor Author

magbak commented Aug 17, 2023

Issue #657 seems to be about derive_attr which seems unrelated to this issue. Did you mean #675 ?

I agree #675 likely solves this issue. I just wanted to point out that this isn't just a soundness issue -- it can cause crashes/panics as well. I think this puts a little pressure to get the PR merged and released sooner.

Test case:

#[test]
fn test_issue_674_nested_logical_panic() {
    let e = SimpleExpr::from(true).and(SimpleExpr::from(true).and(true.into()).and(true.into()));

    assert_eq!(
        Query::select()
            .columns([Char::Character])
            .from(Char::Table)
            .and_where(e)
            .to_string(PostgresQueryBuilder),
        r#"SELECT "character" FROM "character" WHERE TRUE AND (TRUE AND TRUE AND TRUE)"#
    );
}

I believe this is correct. Another correct option (with more parentheses) includes r#"SELECT "character" FROM "character" WHERE TRUE AND ((TRUE AND TRUE) AND TRUE)"#. I might have done the "manual query building" here wrong though -- the important thing is that it does not panic.

Thanks for the test case! Put it into the PR. Strictly we can drop the parentheses around the right hand expression here.
This requires rules on the mathematical associativity of AND. Let's leave that for another time :-)

magbak added a commit to magbak/sea-query that referenced this issue Aug 18, 2023
tyt2y3 pushed a commit that referenced this issue Aug 23, 2023
* Issue #674: Fix issues with semantics of parenthesis removal

* Fix comment

* #674 Revamp how conditions are written to SQL - now uniformly handled as SimpleExpr:Binary

* #674 Fix clippy

* Issue #674 Move methods for deciding associativity and precedence to trait

* Added extra testcase from Issue #674

* Issue #674: Follow up on comments in code review
@Sytten
Copy link
Contributor

Sytten commented Aug 24, 2023

Can we get a release for this @tyt2y3 I think we hit a bug in our codebase due to that too...

@tyt2y3
Copy link
Member

tyt2y3 commented Aug 24, 2023

Sure. I am planning to get on it tmr.

@tyt2y3
Copy link
Member

tyt2y3 commented Aug 25, 2023

Backported the fixes to 0.30.1

@tyt2y3 tyt2y3 closed this as completed Aug 25, 2023
@Sytten
Copy link
Contributor

Sytten commented Dec 5, 2023

@tyt2y3 Could we have a "give me all the parentheses" mode (either feature flag or runtime).
Not that that I don't trust the implicit priority of operations but I don't hahaha (is it even standard between databases?), so I would sleep better if I could force it to add parentheses.

@tyt2y3
Copy link
Member

tyt2y3 commented Dec 8, 2023

Yeah, that's fair. I think it is better to be a runtime option instead of a feature flag.
I remember serde_json having a arbitrary_precision feature flag which gives different runtime behaviour. Which from my experience was pretty difficult to manage.
I am not sure whether using lazy_static for global options would be a better idea though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants