Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect count result when querying a View over a Distributed table #142

Open
dvgorbunkov opened this issue Feb 15, 2023 · 11 comments
Open
Assignees
Labels
bug Something isn't working

Comments

@dvgorbunkov
Copy link

HI!
I'm using metabase version 0.44.1 with version clickhouse driver 0.8.3.
Found a problem that a simple query

select count(distinct(orderId)) from AttributedPartnerOrders 
where partnerId = '...'
and createdDate >= '2023-01-01'
and createdDate <= '2023-01-31'
and clickedMinutesAgo < 1440

produces a result 6046, but if you repeat it, it gives 5971.

The numbers are constantly changing.

Through the standard clickhouse-client, the data is given correctly and the same.
In the metabase logs we see only regular lines for this request

DEBUG middleware.log :: POST /api/dataset 202 [ASYNC: completed] 244.3 ms (6 DB calls) App DB connections: 1/15 Jetty threads: 2/50 (5 idle, 0 queued) (100 total active threads) Queries in flight: 1 (0 queued); clickhouse DB 2 connections: 0/1 (0 threads blocked)
@slvrtrn slvrtrn added the clarification Waiting for clarification label Feb 15, 2023
@slvrtrn
Copy link
Collaborator

slvrtrn commented Feb 15, 2023

Hi, thanks for the report.

Having the sample cell towers dataset, I tried to do something similar using Metabase 0.45.2 and plugin version 1.0.3:

SELECT count(distinct(area))
FROM `default`.`cell_towers`
WHERE radio = 'CDMA' 
AND updated >= '2011-07-01'
AND updated <= '2011-07-31';

It produces the same result in the Metabase interface and the standard clickhouse-client.

Can you please try upgrading the driver (to at least 0.9.0 as the underlying JDBC driver has been changed there) and then check if the issue persists?

Cheers.

@dvgorbunkov
Copy link
Author

@slvrtrn
In the previous changelog in the driver, I didn't find a mention of my case, that's why I wrote here about the problem.
Thanks for the advice. I'll definitely update and come back here with the result.

@dvgorbunkov
Copy link
Author

Unfortunately the update didn't help. I installed the latest metabase and the latest clickhouse driver (1.0.3). The problem is reproduced.

{
  "browser-info": {
    "language": "ru-RU",
    "platform": "Linux x86_64",
    "userAgent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36",
    "vendor": "Google Inc."
  },
  "system-info": {
    "file.encoding": "UTF-8",
    "java.runtime.name": "OpenJDK Runtime Environment",
    "java.runtime.version": "11.0.18+10",
    "java.vendor": "Eclipse Adoptium",
    "java.vendor.url": "https://adoptium.net/",
    "java.version": "11.0.18",
    "java.vm.name": "OpenJDK 64-Bit Server VM",
    "java.vm.version": "11.0.18+10",
    "os.name": "Linux",
    "os.version": "5.10.0-13-amd64",
    "user.language": "en",
    "user.timezone": "GMT"
  },
  "metabase-info": {
    "databases": [
      "h2",
      "clickhouse"
    ],
    "hosting-env": "unknown",
    "application-database": "h2",
    "application-database-details": {
      "database": {
        "name": "H2",
        "version": "1.4.197 (2018-03-18)"
      },
      "jdbc-driver": {
        "name": "H2 JDBC Driver",
        "version": "1.4.197 (2018-03-18)"
      }
    },
    "run-mode": "prod",
    "version": {
      "date": "2023-01-27",
      "tag": "v0.45.2.1",
      "branch": "release-x.45.2.x",
      "hash": "1a59de7"
    },
    "settings": {
      "report-timezone": null
    }
  }
}

P.S. This is very strange, but if you make a request like

select 
  count(*) 
from 
  (
    select 
      distinct(orderId) 
    from 
      AttributedPartnerOrders 
    where 
      partnerId = '...' 
      and createdDate >= '2023-01-01' 
      and createdDate <= '2023-01-31' 
      and clickedMinutesAgo < 1440
  )

then the result is correct and does not change if I repeat it again.

@slvrtrn
Copy link
Collaborator

slvrtrn commented Feb 15, 2023

Thanks for trying it out.

That is odd indeed!

Could you provide the table structure, maybe with a limited number of fields (the ones used in the query at least), but with exact ClickHouse types?

I will generate some dummy data and try to reproduce it on my end.

@slvrtrn slvrtrn added bug Something isn't working and removed clarification Waiting for clarification labels Feb 15, 2023
@dvgorbunkov
Copy link
Author

dvgorbunkov commented Feb 15, 2023

Thank you for taking such good care of my question.

CREATE VIEW AttributedPartnerOrders
(
    `partnerId` FixedString(24),
    `visitorId` Nullable(FixedString(24)),
    `createdAt` DateTime,
    `createdDate` Date,
    `itemId` Nullable(Int64),
    `orderId` Nullable(String),
    `qnt` Nullable(Float64),
    `amount` Nullable(Float64),
    `clickedAt` DateTime,
    `clickedMinutesAgo` Float64,
    `blockId` FixedString(32),
    `method` Nullable(String),
    `email` Nullable(String),
    `updatedAt` DateTime
) AS
SELECT *
FROM AttributedPartnerOrdersMaterialized

@slvrtrn
Copy link
Collaborator

slvrtrn commented Feb 15, 2023

Thanks. Will have a look.

@slvrtrn slvrtrn self-assigned this Feb 16, 2023
@slvrtrn
Copy link
Collaborator

slvrtrn commented Feb 17, 2023

I still cannot reproduce this.

Here's what I did:

CREATE TABLE AttributedPartnerOrdersMaterialized
(
    `partnerId`         FixedString(24),
    `createdDate`       Date,
    `orderId`           Nullable(String),
    `clickedMinutesAgo` Float64
) ENGINE MergeTree ORDER BY partnerId;

CREATE VIEW AttributedPartnerOrders
(
    `partnerId`         FixedString(24),
    `createdDate`       Date,
    `orderId`           Nullable(String),
    `clickedMinutesAgo` Float64
)
AS
SELECT *
FROM AttributedPartnerOrdersMaterialized;

so I'd have some minimal schema setup for this test.

I generated 1 million random rows, such as

image

10 unique partner ids, 100k unique orderIds there

running several queries such as

select count(distinct(orderId)) from AttributedPartnerOrders 
where partnerId = 'c359c87522234e459890fd88'
and createdDate >= '2023-01-01'
and createdDate <= '2023-01-31'
and clickedMinutesAgo < 1440

yielded the same result in both Metabase and ClickHouse.

I think we are missing something else here...

To help with the investigation, could you provide more details:

  • what is your ClickHouse server version
  • what is the table engine you are using?
  • is it a single node or a cluster setup?

to speed up the process, maybe you could ping me in the ClickHouse Community Slack

@dvgorbunkov
Copy link
Author

I have 4 servers with clickhouse version 22.8.4.7.
Two sharded servers, each with one additional server with a replica.
I have the following table schema

CREATE TABLE AttributedPartnerOrdersMaterialized_local ON CLUSTER my_cluster (
    `partnerId` FixedString(24),
    `createdDate` Date,
    `itemId` Nullable(Int64),
    `clickedMinutesAgo` Float64,
...
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/AttributedPartnerOrdersMaterialized_local', '{replica}')
PARTITION BY toYYYYMMDD(createdDate) ORDER BY (partnerId, blockId) SETTINGS index_granularity = 8192;

CREATE TABLE AttributedPartnerOrdersMaterialized ON CLUSTER my_cluster (
    `partnerId` FixedString(24),
    `createdDate` Date,
    `itemId` Nullable(Int64),
    `clickedMinutesAgo` Float64,
...
) ENGINE = Distributed('my_cluster', 'mydb', 'AttributedPartnerOrdersMaterialized_local', rand());


CREATE VIEW AttributedPartnerOrders ON CLUSTER my_cluster (
    `partnerId` FixedString(24),
    `createdDate` Date,
    `itemId` Nullable(Int64),
    `clickedMinutesAgo` Float64,
...
) AS SELECT * FROM AttributedPartnerOrdersMaterialized;

As a result, if we make a query on the Distributed table (AttributedPartnerOrdersMaterialized), we get the correct data that does not change if we repeat the query.
However, if we make query view table (AttributedPartnerOrders), then we get a random number.
This is still strange, but we will be satisfied with the option of transferring queries to another table.

Thanks for pointing out the data model. We have an obvious error.

@slvrtrn
Copy link
Collaborator

slvrtrn commented Feb 20, 2023

However, if we make query view table (AttributedPartnerOrders), then we get a random number.

is it the case for clickhouse-client as well or only Metabase?

@dvgorbunkov
Copy link
Author

is it the case for clickhouse-client as well or only Metabase?

only Metabase

@slvrtrn slvrtrn changed the title Same query returns different result Incorrect count result when querying a View over a Distributed table Feb 20, 2023
@slvrtrn
Copy link
Collaborator

slvrtrn commented Feb 20, 2023

Glad to hear that it works for you now.
I will leave this issue open as we still need to investigate why it behaves awkwardly when using a View over a Distributed table.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants