
finalizeDt on both sides of the join makes redundant enumeration #536

@pashandor789

Description

Example

auto logicalPlan =
      lp::PlanBuilder(context)
          .tableScan("nation")
          .limit(0, 10)
          .orderBy({"n_nationkey"})
          .as("n1")
          .join(
              lp::PlanBuilder(context)
                  .tableScan("region")
                  .limit(0, 5)
                  .orderBy({"r_regionkey"})
                  .as("r1"),
              "n1.n_regionkey = r1.r_regionkey",
              lp::JoinType::kInner)
          .build();

This query fails with the error below.

E20251021 17:04:08.954496 902249 Exceptions.h:53] Line: /home/ivanovp/projects/verax/axiom/optimizer/Optimization.cpp:1297, Function:crossJoin, Expression:  No cross joins, Source: RUNTIME, ErrorCode: NOT_IMPLEMENTED
unknown file: Failure
C++ exception with description "Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: NOT_IMPLEMENTED
Reason: No cross joins

Why is it happening?

translateJoin calls makeQueryGraph(*joinLeft) / makeQueryGraph(*joinRight)

Suppose the current DT is dt1 when we enter translateJoin.
On the left side, makeQueryGraph sees a limit followed by a sort, as in the example, and reaches this code:

case lp::NodeKind::kSort:
      // Multiple orderBys are allowed before a limit. Last one wins. Previous
      // are dropped. If arrives after limit, then starts a new DT.

      makeQueryGraph(*node.onlyInput(), allowedInDt);

      if (currentDt_->hasLimit()) {
        finalizeDt(*node.onlyInput());
      }

      return addOrderBy(*node.asUnchecked<lp::SortNode>());

So makeQueryGraph processes the input, and currentDt_ (call it dt1) now has this state:

dt1.tables = [table1]

Then we call finalizeDt (code below), which creates a new DT and adds the previous DT to the new DT's tables,
so the state becomes:

dt1.tables = [table1]
dt2.tables = [dt1]

void ToGraph::finalizeDt(
    const lp::LogicalPlanNode& node,
    DerivedTableP outerDt) {
  DerivedTableP dt = currentDt_;
  setDtUsedOutput(dt, node);

  currentDt_ = outerDt != nullptr ? outerDt : newDt();
  currentDt_->addTable(dt);

  dt->makeInitialPlan();
}
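
The nesting performed by finalizeDt can be replayed with a toy model. This is a minimal sketch, not the optimizer's code: `ToyDt`, `ToyGraph`, and `replayLeftSide` are hypothetical stand-ins, and `setDtUsedOutput` and `makeInitialPlan` are left out entirely.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a derived table: it only records the names of
// the tables (or nested DTs) it contains.
struct ToyDt {
  std::string name;
  std::vector<std::string> tables;
};

// Hypothetical stand-in for ToGraph, keeping only the currentDt_ bookkeeping.
struct ToyGraph {
  std::vector<std::unique_ptr<ToyDt>> dts;
  ToyDt* currentDt = nullptr;

  // Creates dt1, dt2, ... in order of creation.
  ToyDt* newDt() {
    dts.push_back(std::make_unique<ToyDt>());
    dts.back()->name = "dt" + std::to_string(dts.size());
    return dts.back().get();
  }

  // Mirrors the shape of finalizeDt: the current DT becomes a table of a
  // new (or supplied) outer DT, which then becomes the current DT.
  void finalizeDt(ToyDt* outerDt = nullptr) {
    ToyDt* dt = currentDt;
    currentDt = outerDt != nullptr ? outerDt : newDt();
    currentDt->tables.push_back(dt->name);
  }
};

// Replays the left side of the join from the example: scan, limit, sort.
ToyGraph replayLeftSide() {
  ToyGraph g;
  g.currentDt = g.newDt();                  // dt1
  g.currentDt->tables.push_back("table1");  // the scan lands in dt1
  g.finalizeDt();                           // sort after limit wraps dt1 in dt2
  return g;
}
```

After `replayLeftSide()`, the toy `currentDt` is dt2 and its only table is dt1, which matches the state listed above.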

So after leaving the left makeQueryGraph, currentDt_ = dt2.
The right side then does the same:
we call makeQueryGraph and then finalizeDt in the sort case.

makeQueryGraph comes across table2 and adds it to dt2, so the state is:

dt1.tables = [table1]
dt2.tables = [dt1, table2]

Then we call finalizeDt, which starts enumeration in makeInitialPlan.
dt2.tables = [dt1, table2] causes a cross join to be enumerated, which fails with "No cross joins".
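
Why a DT holding `[dt1, table2]` ends in a cross join can be illustrated with a small connectivity check. This is a hedged sketch, not the enumeration in Optimization.cpp: `JoinEdge` and `needsCrossJoin` are hypothetical names, and the sketch assumes the edge for `n1.n_regionkey = r1.r_regionkey` is not yet attached to dt2 when the right side's finalizeDt runs, which is what the error suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical join edge: an equality condition between two named tables.
using JoinEdge = std::pair<std::string, std::string>;

// Returns true if the tables cannot all be reached from the first one by
// following join edges, i.e. some pair can only be combined by a cross join.
bool needsCrossJoin(
    const std::vector<std::string>& tables,
    const std::vector<JoinEdge>& edges) {
  if (tables.size() < 2) {
    return false;
  }
  std::vector<bool> reached(tables.size(), false);
  std::vector<std::size_t> pending{0};
  reached[0] = true;
  while (!pending.empty()) {
    const std::size_t current = pending.back();
    pending.pop_back();
    for (std::size_t i = 0; i < tables.size(); ++i) {
      if (reached[i]) {
        continue;
      }
      for (const auto& edge : edges) {
        const bool linked =
            (edge.first == tables[current] && edge.second == tables[i]) ||
            (edge.first == tables[i] && edge.second == tables[current]);
        if (linked) {
          reached[i] = true;
          pending.push_back(i);
          break;
        }
      }
    }
  }
  for (const bool r : reached) {
    if (!r) {
      return true;
    }
  }
  return false;
}
```

With tables `{"dt1", "table2"}` and no edges the check returns true. Once an edge between dt1 and table2 is present it returns false, which is the state the join presumably expects to plan from.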
