[CALCITE-6274] Two Elasticsearch index join return empty result #3696

Open
wants to merge 1 commit into main

Conversation

@zoov-w zoov-w commented Feb 20, 2024

CALCITE-6274
A join between two Elasticsearch indexes returns an empty result even when the data in both indexes matches.

Create index test_01:
PUT /test_01/_doc/1 { "doc_id" : 1, "doc_desc" : "doc01" }

Create index test_02:
PUT /test_02/_doc/1 { "doc_id" : 1, "doc_score" : 90 }

Execute the SQL:
select * from es.test_01 t1 join es.test_02 t2 on cast(t1._MAP['doc_id'] as bigint) = cast(t2._MAP['doc_id'] as bigint)

The code generated by ElasticsearchToEnumerableConverter looks like this.

Sub-query on index test_01:
{
  return ((org.apache.calcite.adapter.elasticsearch.ElasticsearchTable.ElasticsearchQueryable)
      org.apache.calcite.schema.Schemas.queryable(root,
          root.getRootSchema().getSubSchema("es"), java.lang.Object[].class, "test_01"))
      .find(
          java.util.Collections.EMPTY_LIST,
          java.util.Arrays.asList(new org.apache.calcite.util.Pair[] {
              new org.apache.calcite.util.Pair("_MAP", java.util.Map.class),
              new org.apache.calcite.util.Pair("_1", java.lang.Long.class)}),
          java.util.Arrays.asList(new org.apache.calcite.util.Pair[] {
              new org.apache.calcite.util.Pair("doc_id", org.apache.calcite.rel.RelFieldCollation.Direction.ASCENDING)}),
          java.util.Collections.EMPTY_LIST,
          java.util.Arrays.asList(new org.apache.calcite.util.Pair[] {}),
          com.google.common.collect.ImmutableMap.of("$f1", "doc_id"),
          null,
          null);
}
Projected field names: "_MAP", "_1"

Sub-query on index test_02:
{
  return ((org.apache.calcite.adapter.elasticsearch.ElasticsearchTable.ElasticsearchQueryable)
      org.apache.calcite.schema.Schemas.queryable(root,
          root.getRootSchema().getSubSchema("es"), java.lang.Object[].class, "test_02"))
      .find(
          java.util.Collections.EMPTY_LIST,
          java.util.Arrays.asList(new org.apache.calcite.util.Pair[] {
              new org.apache.calcite.util.Pair("_MAP", java.util.Map.class),
              new org.apache.calcite.util.Pair("_1", java.lang.Long.class)}),
          java.util.Arrays.asList(new org.apache.calcite.util.Pair[] {
              new org.apache.calcite.util.Pair("doc_id", org.apache.calcite.rel.RelFieldCollation.Direction.ASCENDING)}),
          java.util.Collections.EMPTY_LIST,
          java.util.Arrays.asList(new org.apache.calcite.util.Pair[] {}),
          com.google.common.collect.ImmutableMap.of("$f1", "doc_id"),
          null,
          null);
}
Projected field names: "_MAP", "_1"

The org.apache.calcite.adapter.elasticsearch.ElasticsearchTable.ElasticsearchQueryable#find function is what actually executes the request, and the sub-query result is projected according to its second parameter, fields. The field "_1" cannot be found in the sub-query result because "_1" is not a key in the mappings {"$f1":"doc_id"}. As a result, the join-condition value is null for both sub-queries, so the SQL returns an empty result.

This PR fixes the problem: the projected field names of each sub-query become "_MAP", "$f1". The join-condition field $f1 can now be found in the mappings, and the SQL result matches expectations.
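To make the failure mode concrete, here is a minimal, self-contained sketch. It is not the adapter's code: the class ProjectionLookupSketch and the method resolve are hypothetical, and the lookup is a simplified model that assumes a projected name is resolved through the mappings to a document key. Only the names "_MAP", "_1", "$f1" and "doc_id" come from the generated code above.

```java
import java.util.HashMap;
import java.util.Map;

public class ProjectionLookupSketch {
  /**
   * Hypothetical stand-in for resolving one projected field against a hit.
   * Assumption: "_MAP" yields the whole document; any other name must be a
   * key of the mappings, which points at the real document field.
   */
  static Object resolve(String field, Map<String, String> mapping,
      Map<String, Object> source) {
    if ("_MAP".equals(field)) {
      return source;
    }
    String sourceKey = mapping.get(field); // null when the name is not a mapping key
    return sourceKey == null ? null : source.get(sourceKey);
  }

  public static void main(String[] args) {
    Map<String, String> mapping = new HashMap<>();
    mapping.put("$f1", "doc_id");

    Map<String, Object> doc = new HashMap<>();
    doc.put("doc_id", 1L);
    doc.put("doc_desc", "doc01");

    // Name projected before this PR: not a mapping key, so the join key is null.
    System.out.println(resolve("_1", mapping, doc));  // prints "null"
    // Name projected with this PR: resolves to the value of doc_id.
    System.out.println(resolve("$f1", mapping, doc)); // prints "1"
  }
}
```

With a null key on both sides, the equality in the join condition never holds, which is consistent with the empty result reported above.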

@mihaibudiu
Contributor

The tools don't like the way you formatted your code; you can run ./gradlew build to find out what's wrong.

@zoov-w force-pushed the elasticsearch-adapter-join-empty-result branch from 48b67a1 to ae5ee86 on February 21, 2024 02:27
@zoov-w force-pushed the elasticsearch-adapter-join-empty-result branch from ae5ee86 to 544b4b0 on February 22, 2024 12:08

sonarcloud bot commented Feb 22, 2024

@@ -113,8 +113,7 @@ static List<String> elasticsearchFieldNames(final RelDataType rowType) {
     return SqlValidatorUtil.uniquify(
         new AbstractList<String>() {
           @Override public String get(int index) {
-            final String name = rowType.getFieldList().get(index).getName();
-            return name.startsWith("$") ? "_" + name.substring(2) : name;
+            return rowType.getFieldList().get(index).getName();
Contributor

Before tearing down this Chesterton's fence... do we know what the logic was behind the piece of code that you propose to remove?
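For reference while discussing this, the behavior of the two removed lines can be reproduced in isolation. The sketch below does not explain why the rewrite was introduced; it only shows what it does to a few names. The class FieldNameSketch and the methods oldName and newName are hypothetical wrappers, and the sample name "$f12" is an assumed input in the same "$f<N>" form as "$f1".

```java
import java.util.Arrays;
import java.util.List;

public class FieldNameSketch {
  /** Same expression as the removed line in the diff. */
  static String oldName(String name) {
    return name.startsWith("$") ? "_" + name.substring(2) : name;
  }

  /** Behavior proposed by this PR: the row-type field name is kept as-is. */
  static String newName(String name) {
    return name;
  }

  public static void main(String[] args) {
    List<String> names = Arrays.asList("_MAP", "$f1", "$f12");
    for (String name : names) {
      System.out.println(name + " -> old: " + oldName(name)
          + ", new: " + newName(name));
    }
  }
}
```

The old expression drops the first two characters of any name starting with "$" and prefixes "_", so "$f1" becomes "_1", which is the name that no longer matches the "$f1" key in the mappings.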

3 participants