Monolith OSS - Limit query consumes high CPU usage for look backs 12+hours #25573

Open
MaduMitha-Ravi opened this issue Nov 21, 2024 · 2 comments

Comments


MaduMitha-Ravi commented Nov 21, 2024

Problem 1

The SQL LIMIT query consumes very high CPU even for a single series. On an m6dg.large machine (2 cores, 8 GiB RAM), we observe the query using more than 75% CPU, and in many cases nearly 95%.

Problem 2

As the LIMIT increases, CPU usage rises drastically, e.g. going from LIMIT 1 to LIMIT 5.

Note: This happens only in OSS. It affects latency as well, and these experiments were run without concurrency.

Configuration

The problem was found with the basic setup below:

  • 2 cores, 8 GiB RAM
  • Light ingest, about 17 rows/second
  • Executed as a stand-alone query (also tried as part of a suite, with the same result)
  • Query look back of 12+ hours (approx. 734,400 rows scanned; see the note below the list)
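
For reference, the ~734,400-row estimate is just the ingest rate multiplied by the window (assuming ingest stayed roughly constant): 17 rows/s × 12 h × 3,600 s/h = 734,400 rows.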

Queries

SELECT * FROM senec_data WHERE device_id = '10' AND time > (now() - interval '{{.QueryRange}} minute')::timestamp ORDER BY time DESC LIMIT 1;

SELECT * FROM senec_data WHERE device_id = '10' AND time > (now() - interval '{{.QueryRange}} minute')::timestamp ORDER BY time DESC LIMIT 5;
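
For clarity, here is roughly what the first query would render to for a 12-hour look back, assuming the `{{.QueryRange}}` template variable is set to 720 minutes (the exact values used in these runs may differ):

```sql
-- Hypothetical rendering of the templated query for a 12-hour look back
-- ({{.QueryRange}} assumed to be 720; substitute the actual value used)
SELECT *
FROM senec_data
WHERE device_id = '10'
  AND time > (now() - interval '720 minute')::timestamp
ORDER BY time DESC
LIMIT 1;
```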

Metrics/Evidence

  • Q5 query - LIMIT 5 reaching 93.5% CPU (screenshot attached)
  • F1 query - LIMIT 1 reaching 74.4% CPU (screenshot attached)
  • Latency over 100 runs (screenshot attached)

MaduMitha-Ravi changed the title from "Monolith - Limit query consumes high CPU usage for look backs 12+hours" to "Monolith OSS - Limit query consumes high CPU usage for look backs 12+hours" on Nov 21, 2024
pauldix (Member) commented Nov 21, 2024

Can you get some EXPLAIN plans from some of these queries? I'm curious whether deduplication is running; that's something we generally shouldn't need except for data in the WriteBuffer chunks.

MaduMitha-Ravi (Author)

@pauldix Here are the explains.
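
For anyone reproducing this, plans like the ones below can be generated by prefixing the query with EXPLAIN; a minimal sketch, assuming a 720-minute (12-hour) look back and the device_id seen in the Q5 run:

```sql
-- Sketch: produce the logical and physical plans for the Q5 query
-- (720-minute look back and device_id '136' assumed from this run; adjust as needed)
EXPLAIN SELECT * FROM senec_data
WHERE device_id = '136'
  AND time > (now() - interval '720 minute')::timestamp
ORDER BY time DESC
LIMIT 5;
```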

Explains for Q5 query (LIMIT 5 reaching CPU usage of 93%)

```
plan: Sort: senec_data.time DESC NULLS FIRST, fetch=5
  Filter: senec_data.device_id = Dictionary(Int32, Utf8("136")) AND senec_data.time > TimestampNanosecond(1732094449936786175, None)
    TableScan: senec_data projection=[device_id, field1, field2, field3, field4, field5, field6, field7, field8, field9, time], partial_filters=[senec_data.device_id = Dictionary(Int32, Utf8("136")), senec_data.time > TimestampNanosecond(1732094449936786175, None)]
 plan_type: logical_plan
plan: SortPreservingMergeExec: [time@10 DESC], fetch=5
  SortExec: TopK(fetch=5), expr=[time@10 DESC], preserve_partitioning=[true]
    CoalesceBatchesExec: target_batch_size=8192
      FilterExec: device_id@0 = 136 AND time@10 > 1732094449936786175
        UnionExec
          CoalesceBatchesExec: target_batch_size=8192
            FilterExec: device_id@0 = 136 AND time@10 > 1732094449936786175
              ParquetExec: file_groups={2 groups: [[mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-20/21-00/0000084566.parquet:0..740319, mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-20/20-40/0000083390.parquet:0..739549, mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-20/20-20/0000082213.parquet:0..739327, mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-20/20-00/0000081037.parquet:0..739762, mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-20/19-40/0000079860.parquet:0..739837, ...], [mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-19/21-20/0000000900.parquet:481478..738991, mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-20/20-50/0000083978.parquet:0..740778, mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-20/20-30/0000082801.parquet:0..740188, mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-20/20-10/0000081625.parquet:0..739828, mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-20/19-50/0000080449.parquet:0..739798, ...]]}, projection=[device_id, field1, field2, field3, field4, field5, field6, field7, field8, field9, time], predicate=device_id@0 = 136 AND time@10 > 1732094449936786175, pruning_predicate=CASE WHEN device_id_null_count@2 = device_id_row_count@3 THEN false ELSE device_id_min@0 <= 136 AND 136 <= device_id_max@1 END AND CASE WHEN time_null_count@5 = time_row_count@6 THEN false ELSE time_max@4 > 1732094449936786175 END, required_guarantees=[device_id in (136)]
          ProjectionExec: expr=[device_id@0 as device_id, field1@1 as field1, field2@2 as field2, field3@3 as field3, field4@4 as field4, field5@5 as field5, field6@6 as field6, field7@7 as field7, field8@8 as field8, field9@9 as field9, time@10 as time]
            DeduplicateExec: [device_id@0 ASC,time@10 ASC]
              SortPreservingMergeExec: [time@10 ASC,__chunk_order@11 ASC]
                SortExec: expr=[time@10 ASC,__chunk_order@11 ASC], preserve_partitioning=[true]
                  CoalesceBatchesExec: target_batch_size=8192
                    FilterExec: device_id@0 = 136 AND time@10 > 1732094449936786175
                      RepartitionExec: partitioning=RoundRobinBatch(2), input_partitions=1
                        RecordBatchesExec: chunks=1, projection=[device_id, field1, field2, field3, field4, field5, field6, field7, field8, field9, time, __chunk_order]
          ProjectionExec: expr=[device_id@0 as device_id, field1@1 as field1, field2@2 as field2, field3@3 as field3, field4@4 as field4, field5@5 as field5, field6@6 as field6, field7@7 as field7, field8@8 as field8, field9@9 as field9, time@10 as time]
            DeduplicateExec: [device_id@0 ASC,time@10 ASC]
              SortExec: expr=[time@10 ASC,__chunk_order@11 ASC], preserve_partitioning=[false]
                CoalesceBatchesExec: target_batch_size=8192
                  FilterExec: device_id@0 = 136 AND time@10 > 1732094449936786175
                    RecordBatchesExec: chunks=1, projection=[device_id, field1, field2, field3, field4, field5, field6, field7, field8, field9, time, __chunk_order]
 plan_type: physical_plan
```
Explains for F1 query (LIMIT 1 reaching CPU usage of 74%)

```
plan: Sort: senec_data.time DESC NULLS FIRST, fetch=1
  Filter: senec_data.device_id = Dictionary(Int32, Utf8("767")) AND senec_data.time > TimestampNanosecond(1732095057911717855, None)
    TableScan: senec_data projection=[device_id, field1, field2, field3, field4, field5, field6, field7, field8, field9, time], partial_filters=[senec_data.device_id = Dictionary(Int32, Utf8("767")), senec_data.time > TimestampNanosecond(1732095057911717855, None)]
 plan_type: logical_plan
plan: SortPreservingMergeExec: [time@10 DESC], fetch=1
  SortExec: TopK(fetch=1), expr=[time@10 DESC], preserve_partitioning=[true]
    CoalesceBatchesExec: target_batch_size=8192
      FilterExec: device_id@0 = 767 AND time@10 > 1732095057911717855
        UnionExec
          CoalesceBatchesExec: target_batch_size=8192
            FilterExec: device_id@0 = 767 AND time@10 > 1732095057911717855
              ParquetExec: file_groups={2 groups: [[mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-20/21-10/0000085154.parquet:0..739065, mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-20/20-50/0000083978.parquet:0..740778, mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-20/20-30/0000082801.parquet:0..740188, mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-20/20-10/0000081625.parquet:0..739828, mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-20/19-50/0000080449.parquet:0..739798, ...], [mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-19/21-10/0000000900.parquet:112427..224446, mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-20/21-00/0000084566.parquet:0..740319, mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-20/20-40/0000083390.parquet:0..739549, mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-20/20-20/0000082213.parquet:0..739327, mravi-m-oss-1k-monolithoss-worker-00/dbs/thebucket-0/senec_data-0/2024-11-20/20-00/0000081037.parquet:0..739762, ...]]}, projection=[device_id, field1, field2, field3, field4, field5, field6, field7, field8, field9, time], predicate=device_id@0 = 767 AND time@10 > 1732095057911717855, pruning_predicate=CASE WHEN device_id_null_count@2 = device_id_row_count@3 THEN false ELSE device_id_min@0 <= 767 AND 767 <= device_id_max@1 END AND CASE WHEN time_null_count@5 = time_row_count@6 THEN false ELSE time_max@4 > 1732095057911717855 END, required_guarantees=[device_id in (767)]
          ProjectionExec: expr=[device_id@0 as device_id, field1@1 as field1, field2@2 as field2, field3@3 as field3, field4@4 as field4, field5@5 as field5, field6@6 as field6, field7@7 as field7, field8@8 as field8, field9@9 as field9, time@10 as time]
            DeduplicateExec: [device_id@0 ASC,time@10 ASC]
              SortPreservingMergeExec: [time@10 ASC,__chunk_order@11 ASC]
                SortExec: expr=[time@10 ASC,__chunk_order@11 ASC], preserve_partitioning=[true]
                  CoalesceBatchesExec: target_batch_size=8192
                    FilterExec: device_id@0 = 767 AND time@10 > 1732095057911717855
                      RepartitionExec: partitioning=RoundRobinBatch(2), input_partitions=1
                        RecordBatchesExec: chunks=1, projection=[device_id, field1, field2, field3, field4, field5, field6, field7, field8, field9, time, __chunk_order]
          ProjectionExec: expr=[device_id@0 as device_id, field1@1 as field1, field2@2 as field2, field3@3 as field3, field4@4 as field4, field5@5 as field5, field6@6 as field6, field7@7 as field7, field8@8 as field8, field9@9 as field9, time@10 as time]
            DeduplicateExec: [device_id@0 ASC,time@10 ASC]
              SortExec: expr=[time@10 ASC,__chunk_order@11 ASC], preserve_partitioning=[false]
                CoalesceBatchesExec: target_batch_size=8192
                  FilterExec: device_id@0 = 767 AND time@10 > 1732095057911717855
                    RecordBatchesExec: chunks=1, projection=[device_id, field1, field2, field3, field4, field5, field6, field7, field8, field9, time, __chunk_order]
 plan_type: physical_plan
```
