Commit 20914e1

docs: enhance README and user task guide
Added detailed examples for agent group targeting in the README, demonstrating how to submit agents with specific group names and target them in the topology. Updated the user task guide to include inline environment scripts for the SSH plugin, providing syntax and usage examples for embedding bash scripts directly in the configuration file.

* Improved clarity on resource management in heterogeneous environments.
* Provided practical examples for better user understanding.
1 parent 1f6cdc1 commit 20914e1

3 files changed: +166 -9 lines changed

dds-topology-lib/README.md

Lines changed: 67 additions & 0 deletions
@@ -281,6 +281,73 @@ Requirements specify constraints for task placement on computing nodes. They hel

 For `hostname` and `wnname`, the value can be a full name or regular expression.

+### Agent Group Targeting Example
+
+You can submit agents with specific group names and then target them in your topology, enabling fine-grained control over task placement across heterogeneous resources:
+
+```bash
+# Submit GPU-capable agents with a group tag
+dds-submit -r slurm -n 10 --slots 8 --group-name="gpu_workers"
+
+# Submit CPU-only agents with a different tag
+dds-submit -r slurm -n 20 --slots 16 --group-name="cpu_workers"
+
+# Submit high-memory agents
+dds-submit -r slurm -n 5 --slots 32 --group-name="highmem_workers"
+```
+
+Then target specific agent groups in your topology:
+
+```xml
+<topology name="heterogeneous_workflow">
+    <!-- Define requirements for different agent groups -->
+    <declrequirement name="gpu_req" type="groupname" value="gpu_workers"/>
+    <declrequirement name="cpu_req" type="groupname" value="cpu_workers"/>
+    <declrequirement name="highmem_req" type="groupname" value="highmem_workers"/>
+
+    <!-- GPU-intensive task -->
+    <decltask name="gpu_task">
+        <exe>cuda_app --device=gpu</exe>
+        <requirements>
+            <name>gpu_req</name> <!-- Will run only on gpu_workers agents -->
+        </requirements>
+    </decltask>
+
+    <!-- Standard CPU task -->
+    <decltask name="cpu_task">
+        <exe>standard_app</exe>
+        <requirements>
+            <name>cpu_req</name> <!-- Will run only on cpu_workers agents -->
+        </requirements>
+    </decltask>
+
+    <!-- Memory-intensive task -->
+    <decltask name="memory_task">
+        <exe>bigdata_app --memory=large</exe>
+        <requirements>
+            <name>highmem_req</name> <!-- Will run only on highmem_workers agents -->
+        </requirements>
+    </decltask>
+
+    <main name="main">
+        <task>gpu_task</task>
+        <group name="cpu_workers" n="10">
+            <task>cpu_task</task>
+        </group>
+        <group name="memory_workers" n="3">
+            <task>memory_task</task>
+        </group>
+    </main>
+</topology>
+```
+
+This pattern is particularly useful for:
+
+- **Heterogeneous clusters**: Mix GPU and CPU nodes in the same workflow
+- **Resource optimization**: Ensure memory-intensive tasks get high-memory nodes
+- **Cost management**: Separate expensive GPU resources from cheaper CPU resources
+- **Multi-tenant environments**: Isolate different user groups or projects
+
 ### Using Requirements

 Requirements can be applied to tasks or collections:
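
Since tasks are matched to agents by group name, a typo in either the `--group-name` option or a `groupname` requirement is easy to miss. The sketch below is an illustration only and not part of DDS: `required_groups` and `missing_groups` are hypothetical helper names, and the text match assumes each `<declrequirement>` sits on one line with `type=` before `value=`, as in the README example.

```bash
#!/usr/bin/env bash

# Print the agent group names that a topology file requires.
# Assumption: plain text matching, not a real XML parser.
required_groups() {
    grep -o 'type="groupname"[^>]*value="[^"]*"' "$1" \
        | sed 's/.*value="\([^"]*\)".*/\1/' \
        | sort -u
}

# Print required groups that are absent from the submitted group names.
# Usage: missing_groups topology.xml gpu_workers cpu_workers ...
missing_groups() {
    local topo="$1"
    shift
    local group
    required_groups "$topo" | while IFS= read -r group; do
        # Exact whole-line match against the list of submitted names
        printf '%s\n' "$@" | grep -Fxq -- "$group" || printf '%s\n' "$group"
    done
}
```

With the example topology saved under a hypothetical name such as `topology.xml`, calling `missing_groups topology.xml gpu_workers cpu_workers` would list `highmem_workers` as the group that still needs agents.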

docs/user-task-guide.md

Lines changed: 21 additions & 9 deletions
@@ -412,17 +412,29 @@ dds-submit --rms ssh --config ssh_config_with_inline_env.cfg

 #### Method 2: SSH Plugin Inline Configuration

-```ini
-# ssh_config.cfg
-[ssh_worker_1]
-host=worker1.example.com
-env=export DATA_PATH=/local/data; module load python/3.8
-
-[ssh_worker_2]
-host=worker2.example.com
-env=export DATA_PATH=/shared/data; source /opt/env.sh
+The SSH plugin supports embedding bash scripts directly in the configuration file using special tags:
+
+```properties
+@bash_begin@
+# Custom environment for SSH workers
+export DATA_PATH=/local/data
+module load python/3.8
+
+# Task-specific customization based on DDS variables
+if [ "$DDS_TASK_NAME" = "master_task" ]; then
+    export ROLE="master"
+else
+    export ROLE="worker"
+fi
+@bash_end@
+
+# SSH worker definitions
+wn1, [email protected], -p22, /home/user/dds_work, 4
+wn2, [email protected], -p22, /home/user/dds_work, 4
 ```

+For more details, see the [SSH Plugin Documentation](../plugins/dds-submit-ssh/README.md#inline-environment-scripts).
+
 #### Method 3: Global User Environment

 Create `~/.DDS/user_worker_env.sh` for automatic inclusion:
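
The task-specific branch in Method 2 can be factored into a small function, which makes the role selection testable outside DDS. This is a sketch only: `role_for_task` is a hypothetical name, and it takes the task name as an argument rather than reading `$DDS_TASK_NAME` itself.

```bash
#!/usr/bin/env bash

# Map a task name to a role, mirroring the if/else in the inline block.
# "master_task" is the task name used in the guide's example.
role_for_task() {
    case "${1:-}" in
        master_task) printf 'master\n' ;;
        *)           printf 'worker\n' ;;
    esac
}
```

Inside the `@bash_begin@` block it could then be used as `export ROLE="$(role_for_task "$DDS_TASK_NAME")"`.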

plugins/dds-submit-ssh/README.md

Lines changed: 78 additions & 0 deletions
@@ -26,6 +26,84 @@ r2, [email protected],,/home/user/dds,10
 125, user2@host, , /tmp/test,
 ```

+## Inline Environment Scripts
+
+The SSH plug-in supports embedding custom bash scripts directly in the configuration file. These scripts are executed on each worker node before DDS agent startup, allowing you to set up custom environments, load modules, or perform initialization tasks.
+
+### Syntax
+
+Use `@bash_begin@` and `@bash_end@` tags to delimit your bash script:
+
+```properties
+@bash_begin@
+# Your custom bash commands here
+export MY_VAR=value
+module load gcc/9.3.0
+@bash_end@
+
+# Regular SSH configuration entries follow
+r1, [email protected], -p22, /tmp/dds, 4
+r2, [email protected], , /tmp/dds, 4
+```
+
+### Inline Script Example
+
+```properties
+@bash_begin@
+# Load required environment modules
+module purge
+module load gcc/9.3.0
+module load openmpi/4.0.3
+module load python/3.8.5
+
+# Set custom environment variables
+export MY_APP_CONFIG="/shared/config"
+export DATA_ROOT="/shared/data"
+export OMP_NUM_THREADS=4
+
+# Source site-specific setup
+if [ -f "/etc/site-setup.sh" ]; then
+    source /etc/site-setup.sh
+fi
+
+echo "Custom environment loaded for DDS worker"
+@bash_end@
+
+wn1, [email protected], -p22, /home/user/dds_work, 8
+wn2, [email protected], -p22, /home/user/dds_work, 8
+wn3, [email protected], -p22, /home/user/dds_work, 8
+```
+
+### Notes
+
+* The inline script is executed once per worker node during agent initialization
+* The script applies to all agents spawned from the configuration file
+* Comments are allowed within the `@bash_begin@/@bash_end@` block
+* Comments outside the bash block start with `#` as usual
+* The closing `@bash_end@` tag must be present or a syntax error will occur
+* If you don't need an inline script, you can omit the tags entirely or use empty tags:
+
+```properties
+@bash_begin@
+@bash_end@
+
+wn, localhost, , ~/tmp/dds_wn_test, 6
+```
+
+### Alternative: Using --env-config
+
+Instead of inline scripts in the configuration file, you can also use the `--env-config` flag with `dds-submit`:
+
+```shell
+dds-submit -r ssh -c ssh_config.cfg --env-config /path/to/env_script.sh
+```
+
+This method is useful when you want to:
+
+* Reuse the same environment script across different configurations
+* Keep configuration files simple and separate from environment setup
+* Manage environment scripts in version control separately
+
 ## Usage example

 Call using a given configuration file:
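
Because a missing `@bash_end@` tag is reported as a syntax error, it can be convenient to check a configuration file locally before submitting. The following sketch is not part of DDS: `check_inline_tags` and `extract_inline_script` are hypothetical helpers, and they assume each tag sits alone at the start of its line and that a file holds at most one inline block.

```bash
#!/usr/bin/env bash

# Succeed if the file has no tags, or exactly one @bash_begin@
# and one @bash_end@. Tag order is not checked.
check_inline_tags() {
    awk '
        /^@bash_begin@/ { begins++ }
        /^@bash_end@/   { ends++ }
        END { exit ((begins + 0 == ends + 0 && begins + 0 <= 1) ? 0 : 1) }
    ' "$1"
}

# Print the lines between the tags, without the tags themselves.
extract_inline_script() {
    awk '
        /^@bash_begin@/ { inside = 1; next }
        /^@bash_end@/   { inside = 0; next }
        inside
    ' "$1"
}
```

A possible use is `check_inline_tags ssh_config.cfg && extract_inline_script ssh_config.cfg | bash -n`, which runs a parse-only pass over the inline script without executing it.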
