You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
While querying by joining two tables with multiple join conditions(more than 2 in my case), I observed that one of the conditions was being ignored hence giving incorrect results
To Reproduce
Setup an AWS aggregator connection
For example When joining the tables aws_dax_parameter_group and aws_dax_cluster on parameter_group_name, account_id and region. The p.account_id = c.account_id condition is being ignored. Thus returning results from multiple accounts
>selectp.parameter_group_nameas id,
p.titleas title,
p.account_idas pg_account_id,
c.account_idas dax_account_id,
p.regionas pg_region,
c.regionas dax_region
from
aws_dax_parameter_group as p
join aws_dax_cluster as c
onc.parameter_group->>'ParameterGroupName'=p.parameter_group_nameandp.account_id=c.account_idandp.region=c.regionwherec.arn='arn:aws:dax:us-east-1:111122223333:cache/graph-test-dax'+----------------+----------------+---------------+----------------+-----------+------------+
| id | title | pg_account_id | dax_account_id | pg_region | dax_region |
+----------------+----------------+---------------+----------------+-----------+------------+
| default.dax1.0 | default.dax1.0 | 123456789012 | 111122223333 | us-east-1 | us-east-1 |
| default.dax1.0 | default.dax1.0 | 111122223333 | 111122223333 | us-east-1 | us-east-1 |
+----------------+----------------+---------------+----------------+-----------+------------++------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
+------------------------------------------------------------------------------------------------------------------------------------------+
| Nested Loop (cost=0.00..40040001000000.00 rows=125000 width=192) (actual time=12.747..12.866 rows=2 loops=1) |
| -> Foreign Scan on aws_dax_cluster c (cost=0.00..40000000000000.00 rows=1000000 width=400) (actual time=7.364..7.380 rows=1 loops=1) |
| Filter: (arn ='arn:aws:dax:us-east-1:111122223333:cache/graph-test-dax'::text) |
| -> Foreign Scan on aws_dax_parameter_group p (cost=0.00..40000.00 rows=100 width=400) (actual time=4.604..4.707 rows=2 loops=1) |
| Filter: ((region =c.region) AND ((c.parameter_group->>'ParameterGroupName'::text) = parameter_group_name)) |
| Rows Removed by Filter: 3 |
| Planning Time: 8.682 ms |
| Execution Time: 14.922 ms |
+------------------------------------------------------------------------------------------------------------------------------------------+
The query plan above shows that the filter for p.account_id = c.account_id is ignored.
On removing one of the join conditions from the same query, for eg. p.region = c.region below, the result displayed are correct
>selectp.parameter_group_nameas id,
p.titleas title,
p.account_idas pg_account_id,
c.account_idas dax_account_id,
p.regionas pg_region,
c.regionas dax_region
from
aws_dax_parameter_group as p
join aws_dax_cluster as c
onc.parameter_group->>'ParameterGroupName'=p.parameter_group_nameandp.account_id=c.account_idwherec.arn='arn:aws:dax:us-east-1:111122223333:cache/graph-test-dax'+----------------+----------------+---------------+----------------+----------------+------------+
| id | title | pg_account_id | dax_account_id | pg_region | dax_region |
+----------------+----------------+---------------+----------------+----------------+------------+
| default.dax1.0 | default.dax1.0 | 111122223333 | 111122223333 | ap-south-1 | us-east-1 |
| default.dax1.0 | default.dax1.0 | 111122223333 | 111122223333 | us-west-2 | us-east-1 |
| default.dax1.0 | default.dax1.0 | 111122223333 | 111122223333 | us-east-1 | us-east-1 |
+----------------+----------------+---------------+----------------+----------------+------------++------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
+------------------------------------------------------------------------------------------------------------------------------------------+
| Nested Loop (cost=0.00..40040001000000.00 rows=25000000 width=192) (actual time=12.474..12.939 rows=13 loops=1) |
| -> Foreign Scan on aws_dax_cluster c (cost=0.00..40000000000000.00 rows=1000000 width=400) (actual time=4.827..4.840 rows=1 loops=1) |
| Filter: (arn ='arn:aws:dax:us-east-1:111122223333:cache/graph-test-dax'::text) |
| -> Foreign Scan on aws_dax_parameter_group p (cost=0.00..40000.00 rows=100 width=400) (actual time=7.004..7.454 rows=13 loops=1) |
| Filter: ((account_id =c.account_id) AND ((c.parameter_group->>'ParameterGroupName'::text) = parameter_group_name)) |
| Rows Removed by Filter: 16 |
| Planning Time: 10.968 ms |
| Execution Time: 16.336 ms |
+------------------------------------------------------------------------------------------------------------------------------------------+
So does the plan show that all the join conditions were considered.
Now on removing a different condition from the same query, for eg. c.parameter_group ->> 'ParameterGroupName' = p.parameter_group_name below, the result displayed are correct again
>selectp.parameter_group_nameas id,
p.titleas title,
p.account_idas pg_account_id,
c.account_idas dax_account_id,
p.regionas pg_region,
c.regionas dax_region
from
aws_dax_parameter_group as p
join aws_dax_cluster as c
onp.account_id=c.account_idandp.region=c.regionwherec.arn='arn:aws:dax:us-east-1:111122223333:cache/graph-test-dax'+----------------+----------------+---------------+----------------+-----------+------------+
| id | title | pg_account_id | dax_account_id | pg_region | dax_region |
+----------------+----------------+---------------+----------------+-----------+------------+
| default.dax1.0 | default.dax1.0 | 111122223333 | 111122223333 | us-east-1 | us-east-1 |
+----------------+----------------+---------------+----------------+-----------+------------++---------------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
+---------------------------------------------------------------------------------------------------------------------------------------------------+
| Nested Loop (cost=0.00..70057540002500.00 rows=25000000 width=192) (actual time=17.646..17.691 rows=1 loops=1) |
| Join Filter: ((p.account_id=c.account_id) AND (p.region=c.region)) |
| Rows Removed by Join Filter: 28 |
| -> Foreign Scan on aws_dax_parameter_group p (cost=0.00..40000000000000.00 rows=1000000 width=400) (actual time=3.382..4.065 rows=29 loops=1) |
| -> Materialize (cost=0.00..30000000045040.00 rows=1000000 width=300) (actual time=0.468..0.469 rows=1 loops=29) |
| -> Foreign Scan on aws_dax_cluster c (cost=0.00..30000000000000.00 rows=1000000 width=300) (actual time=13.549..13.587 rows=1 loops=1) |
| Filter: (arn ='arn:aws:dax:us-east-1:111122223333:cache/graph-test-dax'::text) |
| Planning Time: 11.774 ms |
| Execution Time: 21.489 ms |
+---------------------------------------------------------------------------------------------------------------------------------------------------+
So does the plan show that all the join conditions were considered.
Same query with cross-join and the join conditions in the where clause, generates a similar plan
> explain analyse selectp.parameter_group_nameas id,
p.titleas title,
p.account_idas pg_account_id,
c.account_idas dax_account_id,
p.regionas pg_region,
c.regionas dax_region
from
aws_dax_parameter_group as p,
aws_dax_cluster as c
wherec.parameter_group->>'ParameterGroupName'=p.parameter_group_nameandp.account_id=c.account_idandp.region=c.regionandc.arn='arn:aws:dax:us-east-1:111122223333:cache/graph-test-dax'+------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
+------------------------------------------------------------------------------------------------------------------------------------------+
| Nested Loop (cost=0.00..40040001000000.00 rows=125000 width=192) (actual time=11.617..11.679 rows=2 loops=1) |
| -> Foreign Scan on aws_dax_cluster c (cost=0.00..40000000000000.00 rows=1000000 width=400) (actual time=6.915..6.929 rows=1 loops=1) |
| Filter: (arn ='arn:aws:dax:us-east-1:111122223333:cache/graph-test-dax'::text) |
| -> Foreign Scan on aws_dax_parameter_group p (cost=0.00..40000.00 rows=100 width=400) (actual time=4.235..4.283 rows=2 loops=1) |
| Filter: ((region =c.region) AND ((c.parameter_group->>'ParameterGroupName'::text) = parameter_group_name)) |
| Rows Removed by Filter: 3 |
| Planning Time: 13.727 ms |
| Execution Time: 13.760 ms |
+------------------------------------------------------------------------------------------------------------------------------------------+
Similar example
> explain analyse selectb.arnas eventbridge_bus_arn
from
aws_eventbridge_rule r
join aws_eventbridge_bus b onr.event_bus_name=b.nameandr.region=b.regionandr.account_id=b.account_idwherer.arn='arn:aws:events:ap-south-1:111122223333:rule/test'+------------------------------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
+------------------------------------------------------------------------------------------------------------------------------------------------+
| Nested Loop (cost=0.00..40040001000000.00 rows=125000 width=32) (actual time=15.128..25.383 rows=2 loops=1) |
| -> Foreign Scan on aws_eventbridge_rule r (cost=0.00..40000000000000.00 rows=1000000 width=400) (actual time=7.076..17.280 rows=1 loops=1) |
| Filter: (arn ='arn:aws:events:ap-south-1:111122223333:rule/test'::text) |
| Rows Removed by Filter: 437 |
| -> Foreign Scan on aws_eventbridge_bus b (cost=0.00..40000.00 rows=100 width=400) (actual time=7.371..7.419 rows=2 loops=1) |
| Filter: ((r.region= region) AND (r.event_bus_name= name)) |
| Planning Time: 11.155 ms |
| Execution Time: 28.488 ms |
+------------------------------------------------------------------------------------------------------------------------------------------------+
Steampipe version
Tested with both 0.18.0 and 0.17.4
The text was updated successfully, but these errors were encountered:
While querying by joining two tables with multiple join conditions(more than 2 in my case), I observed that one of the conditions was being ignored hence giving incorrect results
To Reproduce
aws_dax_parameter_group
andaws_dax_cluster
onparameter_group_name
,account_id
andregion
. Thep.account_id = c.account_id
condition is being ignored. Thus returning results from multiple accountsp.account_id = c.account_id
is ignored.p.region = c.region
below, the result displayed are correctc.parameter_group ->> 'ParameterGroupName' = p.parameter_group_name
below, the result displayed are correct againwhere
clause, generates a similar planSteampipe version
Tested with both
0.18.0
and0.17.4
The text was updated successfully, but these errors were encountered: