Commit 89736e0

HeidiHan0000 authored and facebook-github-bot committed
Add spill related configs to system configs (#24726)
Summary: Adds the spill configs to the system configs so that they can be set there and then overridden by session properties.

Differential Revision: D71002997
1 parent d260ff5 commit 89736e0

5 files changed: +163 -31 lines changed

presto-docs/src/main/sphinx/presto_cpp/properties.rst

Lines changed: 73 additions & 28 deletions
@@ -12,8 +12,8 @@ For information on catalog configuration properties, see :doc:`Connectors </conn
1212

1313
For information on Presto C++ session properties, see :doc:`properties-session`.
1414

15-
NOTE: While some of the configuration properties below with "-gb" in their names
16-
show gigabytes (gB; 1 gB equals 1000000000 B), it is actually
15+
NOTE: While some of the configuration properties below with "-gb" in their names
16+
show gigabytes (gB; 1 gB equals 1000000000 B), it is actually
1717
gibibytes (GiB; 1 GiB equals 1073741824 B).
1818

1919
.. contents::
@@ -137,8 +137,8 @@ The configuration properties of Presto C++ workers are described here, in alphab
137137
1) Memory used by the queries as specified in ``query-memory-gb``; 2) Memory used by the
138138
system, such as disk spilling and cache prefetch.
139139

140-
Set ``system-memory-gb`` to about 90% of available machine memory of the deployment.
141-
This allows some buffer room to handle unaccounted memory in order to prevent out-of-memory conditions.
140+
Set ``system-memory-gb`` to about 90% of available machine memory of the deployment.
141+
This allows some buffer room to handle unaccounted memory in order to prevent out-of-memory conditions.
142142
The default value of 57 gb is calculated based on available machine memory of 64 gb.
143143

144144

@@ -162,6 +162,51 @@ The configuration properties of Presto C++ workers are described here, in alphab
162162
storage used for spilling. If it is zero, then there is no limit and spilling
163163
might exhaust the storage or takes too long to run.
164164

165+
166+
``spill-enabled``
167+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
168+
169+
* **Type:** ``boolean``
170+
* **Default value:** ``false``
171+
172+
Try spilling memory to disk to avoid exceeding memory limits for the query.
173+
174+
Spilling works by offloading memory to disk. This process can allow a query with a large memory
175+
footprint to pass at the cost of slower execution times. Currently, spilling is supported only for
176+
aggregations and joins (inner and outer), so this property will not reduce memory usage required for
177+
window functions, sorting and other join types.
178+
179+
180+
``join-spill-enabled``
181+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
182+
183+
* **Type:** ``boolean``
184+
* **Default value:** ``true``
185+
186+
When ``spill_enabled`` is ``true``, this determines whether Presto will try spilling memory to disk for joins to
187+
avoid exceeding memory limits for the query.
188+
189+
190+
``aggregation-spill-enabled``
191+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
192+
193+
* **Type:** ``boolean``
194+
* **Default value:** ``true``
195+
196+
When ``spill_enabled`` is ``true``, this determines whether Presto will try spilling memory to disk for aggregations to
197+
avoid exceeding memory limits for the query.
198+
199+
200+
``order-by-spill-enabled``
201+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
202+
203+
* **Type:** ``boolean``
204+
* **Default value:** ``true``
205+
206+
When ``spill_enabled`` is ``true``, this determines whether Presto will try spilling memory to disk for order by to
207+
avoid exceeding memory limits for the query.
208+
209+
165210
``shared-arbitrator.reserved-capacity``
166211
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
167212

@@ -321,32 +366,32 @@ The configuration properties of AsyncDataCache and SSD cache are described here.
321366
^^^^^^^^^^^^^^^^^^^^^^^^
322367
* **Type:** ``string``
323368
* **Default value:** ``/mnt/flash/async_cache.``
324-
369+
325370
The path of the directory that is mounted onto the SSD.
326371

327372
``async-cache-max-ssd-write-ratio``
328373
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
329374
* **Type:** ``double``
330375
* **Default value:** ``0.7``
331-
332-
The maximum ratio of the number of in-memory cache entries written to the SSD cache
333-
over the total number of cache entries. Use this to control SSD cache write rate,
376+
377+
The maximum ratio of the number of in-memory cache entries written to the SSD cache
378+
over the total number of cache entries. Use this to control SSD cache write rate,
334379
once the ratio exceeds this threshold then we stop writing to the SSD cache.
335380

336381
``async-cache-ssd-savable-ratio``
337382
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
338383
* **Type:** ``double``
339384
* **Default value:** ``0.125``
340-
385+
341386
The min ratio of SSD savable (in-memory) cache space over the total cache space.
342-
Once the ratio exceeds this limit, we start writing SSD savable cache entries
387+
Once the ratio exceeds this limit, we start writing SSD savable cache entries
343388
into SSD cache.
344389

345390
``async-cache-min-ssd-savable-bytes``
346391
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
347392
* **Type:** ``integer``
348393
* **Default value:** ``16777216``
349-
394+
350395
Min SSD savable (in-memory) cache space to start writing SSD savable cache entries into SSD cache.
351396

352397
The default value ``16777216`` is 16 MB.
@@ -358,61 +403,61 @@ The configuration properties of AsyncDataCache and SSD cache are described here.
358403
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
359404
* **Type:** ``string``
360405
* **Default value:** ``0s``
361-
406+
362407
The interval for persisting in-memory cache to SSD. Set this configuration to a non-zero value to
363408
activate periodic cache persistence.
364-
365-
The following time units are supported:
366-
409+
410+
The following time units are supported:
411+
367412
ns, us, ms, s, m, h, d
368413

369414
``async-cache-ssd-disable-file-cow``
370415
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
371416
* **Type:** ``bool``
372417
* **Default value:** ``false``
373-
418+
374419
In file systems such as btrfs that support cow (copy on write), the SSD cache can use all of the SSD
375420
space and stop working. To prevent that, use this option to disable cow for cache files.
376421

377422
``ssd-cache-checksum-enabled``
378423
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
379424
* **Type:** ``bool``
380425
* **Default value:** ``false``
381-
382-
When enabled, a CRC-based checksum is calculated for each cache entry written to SSD.
426+
427+
When enabled, a CRC-based checksum is calculated for each cache entry written to SSD.
383428
The checksum is stored in the next checkpoint file.
384429

385430
``ssd-cache-read-verification-enabled``
386431
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
387432
* **Type:** ``bool``
388433
* **Default value:** ``false``
389-
390-
When enabled, the checksum is recalculated and verified against the stored value when
434+
435+
When enabled, the checksum is recalculated and verified against the stored value when
391436
cache data is loaded from the SSD.
392437

393438
``cache.velox.ttl-enabled``
394439
^^^^^^^^^^^^^^^^^^^^^^^^^^^
395440
* **Type:** ``bool``
396441
* **Default value:** ``false``
397-
442+
398443
Enable TTL for AsyncDataCache and SSD cache.
399444

400445
``cache.velox.ttl-threshold``
401446
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
402447
* **Type:** ``string``
403448
* **Default value:** ``2d``
404-
449+
405450
TTL duration for AsyncDataCache and SSD cache entries.
406-
451+
407452
The following time units are supported:
408-
453+
409454
ns, us, ms, s, m, h, d
410455

411456
``cache.velox.ttl-check-interval``
412457
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
413458
* **Type:** ``string``
414459
* **Default value:** ``1h``
415-
460+
416461
The periodic duration to apply cache TTL and evict AsyncDataCache and SSD cache entries.
417462

418463
Memory Checker Properties
@@ -439,9 +484,9 @@ server is under low memory pressure.
439484

440485
Specifies the system memory limit that triggers the memory pushback or heap dump if
441486
the server memory usage is beyond this limit. A value of zero means no limit is set.
442-
This only applies if ``system-mem-pushback-enabled`` is ``true``.
443-
Set ``system-mem-limit-gb`` to be greater than or equal to system-memory-gb but not
444-
higher than the available machine memory of the deployment.
487+
This only applies if ``system-mem-pushback-enabled`` is ``true``.
488+
Set ``system-mem-limit-gb`` to be greater than or equal to system-memory-gb but not
489+
higher than the available machine memory of the deployment.
445490
The default value of 60 gb is calculated based on available machine memory of 64 gb.
446491

447492
``system-mem-shrink-gb``

presto-native-execution/presto_cpp/main/QueryContextManager.cpp

Lines changed: 9 additions & 1 deletion
@@ -45,7 +45,15 @@ void updateFromSystemConfigs(
4545
{core::QueryConfig::kQueryMaxMemoryPerNode,
4646
std::string(SystemConfig::kQueryMaxMemoryPerNode)},
4747
{core::QueryConfig::kSpillFileCreateConfig,
48-
std::string(SystemConfig::kSpillerFileCreateConfig)}};
48+
std::string(SystemConfig::kSpillerFileCreateConfig)},
49+
{core::QueryConfig::kSpillEnabled,
50+
std::string(SystemConfig::kSpillEnabled)},
51+
{core::QueryConfig::kJoinSpillEnabled,
52+
std::string(SystemConfig::kJoinSpillEnabled)},
53+
{core::QueryConfig::kOrderBySpillEnabled,
54+
std::string(SystemConfig::kOrderBySpillEnabled)},
55+
{core::QueryConfig::kAggregationSpillEnabled,
56+
std::string(SystemConfig::kAggregationSpillEnabled)}};
4957

5058
for (const auto& configNameEntry : sessionSystemConfigMapping) {
5159
const auto& sessionName = configNameEntry.first;

presto-native-execution/presto_cpp/main/common/Configs.cpp

Lines changed: 20 additions & 0 deletions
@@ -250,6 +250,10 @@ SystemConfig::SystemConfig() {
250250
BOOL_PROP(kPlanValidatorFailOnNestedLoopJoin, false),
251251
STR_PROP(kPrestoDefaultNamespacePrefix, "presto.default"),
252252
STR_PROP(kPoolType, "DEFAULT"),
253+
BOOL_PROP(kSpillEnabled, false),
254+
BOOL_PROP(kJoinSpillEnabled, true),
255+
BOOL_PROP(kAggregationSpillEnabled, true),
256+
BOOL_PROP(kOrderBySpillEnabled, true),
253257
};
254258
}
255259

@@ -313,6 +317,22 @@ std::string SystemConfig::poolType() const {
313317
return value;
314318
}
315319

320+
bool SystemConfig::spillEnabled() const {
321+
return optionalProperty<bool>(kSpillEnabled).value();
322+
}
323+
324+
bool SystemConfig::joinSpillEnabled() const {
325+
return optionalProperty<bool>(kJoinSpillEnabled).value();
326+
}
327+
328+
bool SystemConfig::aggregationSpillEnabled() const {
329+
return optionalProperty<bool>(kAggregationSpillEnabled).value();
330+
}
331+
332+
bool SystemConfig::orderBySpillEnabled() const {
333+
return optionalProperty<bool>(kOrderBySpillEnabled).value();
334+
}
335+
316336
bool SystemConfig::mutableConfig() const {
317337
return optionalProperty<bool>(kMutableConfig).value();
318338
}

presto-native-execution/presto_cpp/main/common/Configs.h

Lines changed: 17 additions & 0 deletions
@@ -662,6 +662,14 @@ class SystemConfig : public ConfigBase {
662662

663663
// Specifies the type of worker pool
664664
static constexpr std::string_view kPoolType{"pool-type"};
665+
666+
// Spill related configs
667+
static constexpr std::string_view kSpillEnabled{"spill-enabled"};
668+
static constexpr std::string_view kJoinSpillEnabled{"join-spill-enabled"};
669+
static constexpr std::string_view kAggregationSpillEnabled{
670+
"aggregation-spill-enabled"};
671+
static constexpr std::string_view kOrderBySpillEnabled{
672+
"order-by-spill-enabled"};
665673

666674
SystemConfig();
667675

@@ -901,9 +909,18 @@ class SystemConfig : public ConfigBase {
901909
bool enableRuntimeMetricsCollection() const;
902910

903911
bool prestoNativeSidecar() const;
912+
904913
std::string prestoDefaultNamespacePrefix() const;
905914

906915
std::string poolType() const;
916+
917+
bool spillEnabled() const;
918+
919+
bool joinSpillEnabled() const;
920+
921+
bool aggregationSpillEnabled() const;
922+
923+
bool orderBySpillEnabled() const;
907924
};
908925

909926
/// Provides access to node properties defined in node.properties file.

presto-native-execution/presto_cpp/main/tests/QueryContextManagerTest.cpp

Lines changed: 44 additions & 2 deletions
@@ -108,34 +108,54 @@ TEST_F(QueryContextManagerTest, defaultSessionProperties) {
108108
EXPECT_EQ(queryConfig.maxSpillLevel(), defaultQC->maxSpillLevel());
109109
EXPECT_EQ(
110110
queryConfig.spillCompressionKind(), defaultQC->spillCompressionKind());
111+
EXPECT_EQ(queryConfig.spillEnabled(), defaultQC->spillEnabled());
112+
EXPECT_EQ(queryConfig.aggregationSpillEnabled(), defaultQC->aggregationSpillEnabled());
111113
EXPECT_EQ(queryConfig.joinSpillEnabled(), defaultQC->joinSpillEnabled());
114+
EXPECT_EQ(queryConfig.orderBySpillEnabled(), defaultQC->orderBySpillEnabled());
112115
EXPECT_EQ(
113116
queryConfig.validateOutputFromOperators(),
114117
defaultQC->validateOutputFromOperators());
115118
EXPECT_EQ(
116119
queryConfig.spillWriteBufferSize(), defaultQC->spillWriteBufferSize());
117120
}
118121

119-
TEST_F(QueryContextManagerTest, overrdingSessionProperties) {
122+
TEST_F(QueryContextManagerTest, overridingSessionProperties) {
120123
protocol::TaskId taskId = "scan.0.0.1.0";
121124
const auto& systemConfig = SystemConfig::instance();
122125
{
123126
protocol::SessionRepresentation session{.systemProperties = {}};
124127
auto queryCtx =
125128
taskManager_->getQueryContextManager()->findOrCreateQueryCtx(
126129
taskId, session);
130+
// When session properties are not explicitly set, they should be set to
131+
// system config values.
127132
EXPECT_EQ(
128133
queryCtx->queryConfig().queryMaxMemoryPerNode(),
129134
systemConfig->queryMaxMemoryPerNode());
130135
EXPECT_EQ(
131136
queryCtx->queryConfig().spillFileCreateConfig(),
132137
systemConfig->spillerFileCreateConfig());
138+
EXPECT_EQ(
139+
queryCtx->queryConfig().spillEnabled(),
140+
systemConfig->spillEnabled());
141+
EXPECT_EQ(
142+
queryCtx->queryConfig().aggregationSpillEnabled(),
143+
systemConfig->aggregationSpillEnabled());
144+
EXPECT_EQ(
145+
queryCtx->queryConfig().joinSpillEnabled(),
146+
systemConfig->joinSpillEnabled());
147+
EXPECT_EQ(
148+
queryCtx->queryConfig().orderBySpillEnabled(),
149+
systemConfig->orderBySpillEnabled());
133150
}
134151
{
135152
protocol::SessionRepresentation session{
136153
.systemProperties = {
137154
{"query_max_memory_per_node", "1GB"},
138-
{"spill_file_create_config", "encoding:replica_2"}}};
155+
{"spill_file_create_config", "encoding:replica_2"},
156+
{"spill_enabled", "true"},
157+
{"aggregation_spill_enabled", "false"},
158+
{"join_spill_enabled", "true"}}};
139159
auto queryCtx =
140160
taskManager_->getQueryContextManager()->findOrCreateQueryCtx(
141161
taskId, session);
@@ -144,6 +164,28 @@ TEST_F(QueryContextManagerTest, overrdingSessionProperties) {
144164
1UL * 1024 * 1024 * 1024);
145165
EXPECT_EQ(
146166
queryCtx->queryConfig().spillFileCreateConfig(), "encoding:replica_2");
167+
// Override with different value
168+
EXPECT_EQ(
169+
queryCtx->queryConfig().spillEnabled(), true);
170+
EXPECT_NE(
171+
queryCtx->queryConfig().spillEnabled(),
172+
systemConfig->spillEnabled());
173+
// Override with different value
174+
EXPECT_EQ(
175+
queryCtx->queryConfig().aggregationSpillEnabled(), false);
176+
EXPECT_NE(
177+
queryCtx->queryConfig().aggregationSpillEnabled(),
178+
systemConfig->aggregationSpillEnabled());
179+
// Override with same value
180+
EXPECT_EQ(
181+
queryCtx->queryConfig().joinSpillEnabled(), true);
182+
EXPECT_EQ(
183+
queryCtx->queryConfig().joinSpillEnabled(),
184+
systemConfig->joinSpillEnabled());
185+
// No override
186+
EXPECT_EQ(
187+
queryCtx->queryConfig().orderBySpillEnabled(),
188+
systemConfig->orderBySpillEnabled());
147189
}
148190
}
149191
