Concurrency issues investigation #25615

Open
hiltontj opened this issue Dec 4, 2024 · 10 comments

@hiltontj
Contributor

hiltontj commented Dec 4, 2024

Problem statement

The performance team has found that the influxdb3 process does not handle concurrency well. In #25604 we made a change very similar to what was done in IOx to move query planning off of the main IO thread pool, but this did not improve things and instead seems to have led to a regression (see #25562 (comment)).

This issue is for investigating why the system is breaking down under concurrent loads.

Ideas

Not configuring the main tokio runtime

Here we initialize the tokio runtime using the defaults:

let tokio_runtime = get_runtime(None)?;

This means that the tokio runtime is initialized with as many worker threads as there are CPU cores, so it could be competing with the DataFusion runtime for those cores as a result.
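
For comparison, here is a minimal sketch of an explicitly capped IO runtime using the standard tokio builder; the worker thread count and thread name are illustrative assumptions, not our actual configuration:

use tokio::runtime::{Builder, Runtime};

// Hypothetical sketch: cap the main IO runtime rather than taking the default
// of one worker thread per core, leaving the remaining cores to DataFusion.
fn build_io_runtime() -> std::io::Result<Runtime> {
    Builder::new_multi_thread()
        .worker_threads(2) // illustrative value, not tuned
        .thread_name("influxdb3-io")
        .enable_all()
        .build()
}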

Hard-coding the query concurrency semaphore to 10

Here we hard-code the limit used for the query concurrency semaphore:

concurrent_query_limit: 10,

This sets the semaphore limit here:

let query_execution_semaphore =
    Arc::new(semaphore_metrics.new_semaphore(concurrent_query_limit));

Which is supposedly used by the flight service here:

async fn acquire_semaphore(&self, span: Option<Span>) -> InstrumentedAsyncOwnedSemaphorePermit {
    Arc::clone(&self.query_execution_semaphore)
        .acquire_owned(span)
        .await
        .expect("Semaphore should not be closed by anyone")
}
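
For reference, the gating behavior boils down to the plain tokio semaphore pattern sketched below. This is only an illustration of why a limit of 10 caps concurrent queries (and why a query blocked on IO still holds a permit), not the actual instrumented implementation:

use std::sync::Arc;
use tokio::sync::Semaphore;

// Illustration only: with 10 permits, at most 10 queries are in flight at
// once; an 11th request parks here until an earlier query drops its permit,
// even if that query is merely waiting on object store IO.
async fn run_query_gated(semaphore: Arc<Semaphore>) {
    let _permit = Arc::clone(&semaphore)
        .acquire_owned()
        .await
        .expect("semaphore should not be closed");
    // ... plan and execute the query while holding the permit ...
} // permit dropped here, releasing a slot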

@hiltontj hiltontj added the v3 label Dec 4, 2024
@hiltontj hiltontj changed the title Concurrency investigation Concurrency issues investigation Dec 4, 2024
@pauldix
Member

pauldix commented Dec 4, 2024

Should we do away with the query concurrency semaphore? I worry that it unnecessarily slows things down when a query is waiting on IO (i.e., another query could be executing, but won't because of the semaphore).

There's definitely a point where we'll want to park incoming requests because of maxed-out resource utilization; I just doubt that a semaphore is the best primitive for this. I'd rather just remove it for now and address query throttling later.

@hiltontj
Contributor Author

hiltontj commented Dec 4, 2024

Should we do away with the query concurrency semaphore?

I think so. It looks like IOx does this in the querier by setting the limit to u16::MAX

@mgattozzi
Contributor

This might be relevant: I think we synced this change over from Pro, but we did enable IO on the dedicated executor because without it we had crashes; the default of not having it enabled comes from IOx. This was part of their split into separate executors. I wonder if it makes sense to have just one executor, in Monolith at least, to avoid contention.

https://github.com/influxdata/influxdb_pro/pull/174
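
For context, the split being described is roughly the two-runtime pattern sketched below (the names and thread split are assumptions for illustration, not the actual DedicatedExecutor API); the contention question is whether the two pools oversubscribe the same cores:

use std::thread::available_parallelism;
use tokio::runtime::{Builder, Runtime};

// Hypothetical sketch of a dedicated CPU pool for DataFusion work, separate
// from the main IO runtime; if both pools default to one thread per core,
// they contend for the same CPUs.
fn build_datafusion_runtime() -> std::io::Result<Runtime> {
    let cores = available_parallelism().map(|n| n.get()).unwrap_or(4);
    Builder::new_multi_thread()
        // Reserve a couple of cores for the IO runtime (illustrative split).
        .worker_threads(cores.saturating_sub(2).max(1))
        .thread_name("influxdb3-datafusion")
        .enable_all()
        .build()
}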

@pauldix
Member

pauldix commented Dec 4, 2024

Let's remove the semaphore completely. One less thing to confound the investigation.

@pauldix
Member

pauldix commented Dec 4, 2024

@mgattozzi it would be an interesting experiment to put everything on a single pool. Maybe we should give that a try too?

@hiltontj
Contributor Author

hiltontj commented Dec 4, 2024

Let's remove the semaphore completely. One less thing to confound the investigation.

We can try to remove it completely, but I'm not sure it can be, because it is relied on by core traits used in the Flight service. But I think setting it to u16::MAX should have the desired effect.
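
A minimal sketch of that workaround, reusing the new_semaphore call shown above (assuming the limit is just a permit count; the constant comes from the IOx maximum mentioned earlier):

// Keep the semaphore, since the Flight service traits expect one, but make
// the limit large enough that it effectively never throttles.
let concurrent_query_limit = usize::from(u16::MAX); // 65_535 permits
let query_execution_semaphore =
    Arc::new(semaphore_metrics.new_semaphore(concurrent_query_limit));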

@praveen-influx
Contributor

Can we run it through a profiler to see where it's spending the time (maybe on the box that the perf team is using)?

@hiltontj
Contributor Author

hiltontj commented Dec 6, 2024

It looks like IOx does this in the querier by setting the limit to u16::MAX

That was incorrect: IOx doesn't set u16::MAX as the limit; that is just the maximum limit that it imposes. IOx uses a default of 10; however, their scenario is different, given that the default applies per querier. For us, it may make sense to remove the limit by default but still allow it to be configurable. I will open an issue to deal with that specifically.

@hiltontj
Contributor Author

hiltontj commented Dec 6, 2024

I opened #25627

@MaduMitha-Ravi

@hiltontj fyi only - the concurrency issue is global across the product and not just for the Last Value Cache. The details I have observed are:

  • Even with a concurrent query load of 2, the CPU cores are utilized at 95-99% on a Large machine (2 cores and 8 GiB RAM) and an increase in latency is observed
  • To ensure this is not a limitation of the machine used, I tried the same experiment on a 2XLarge machine (8 cores and 32 GiB RAM) and still observed CPU usage of 95-99% with an increase in latency

As the next step, I am taking the latest build of Pro and retrying the concurrency experiments with Large, 2XL, and 8XL machine configurations.
