feat(skills): add eks-genai skill + Day 1 workflow by jalawala · Pull Request #47 · aws-samples/sample-apex-skills

jalawala · 2026-06-02T06:56:41Z

Add an opinionated GenAI-on-EKS skill, a matching Day 1 steering workflow and command shim, and the full repo fan-out (catalogues, hub routing, sibling-map eval updates).

Skill — skills/eks-genai/:

SKILL.md (132 lines) + 12 references teaching the AWS-canonical 6-layer GenAI-on-EKS stack: compute (NVIDIA GPU vs AWS Neuron), cluster/scheduler (Karpenter, device plugins, EFA, Capacity Blocks), frameworks (JARK + vLLM + Ray Serve, Triton/Dynamo/KServe), storage (FSx Lustre, Mountpoint S3 CSI, EFS, S3 Vectors), observability (DCGM/Neuron Monitor + Prometheus/Grafana + AMP), and the LiteLLM AI gateway; plus distributed training, KV-cache tiering (LMCache), cost levers, agentic/RAG, a non-negotiable security baseline, the concrete validated stack (versions), and 5 worked use cases.
Grounded in the EKS AI/ML Best Practices guide and awslabs/ai-on-eks, and checked for currency against the GenAI-on-EKS NVIDIA workshop.

Steering:

steering/workflows/eks-genai.md (Day 1 - Build, advisory, 4 phases, STOP gates) - passes quick_validate clean (0/0).
/apex:eks-genai command shim.
Wired into steering/eks.md routing table + workflow index.

Evals + sibling fan-out — misc/evals/eks-genai/:

triggering.json (8 positives, 8 attributed negatives), evals.json (2 task prompts, 5 grader-checkable expectations each).
Added eks-genai to the SIBLING_MAP + a routing negative in the 4 neighbours (best-practices, design, build, platform-engineering).
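
The sibling fan-out described above can be sketched roughly as follows. This is an illustration only, not the repo's update_sibling_map.py: the JSON shape (a `positives` list and a `negatives` list of `{"prompt", "attributed_to"}` objects), the function names, and the sample prompt are all assumptions, and the neighbour names are taken as written in this description.

```python
from copy import deepcopy

# Neighbour names as listed in the PR description; real directory names may differ.
NEIGHBOURS = ["best-practices", "design", "build", "platform-engineering"]


def add_routing_negative(triggering, prompt, owner):
    """Return a copy of a neighbour's triggering data with one attributed negative.

    Hypothetical schema (not confirmed by the repo):
        {"positives": [str, ...],
         "negatives": [{"prompt": str, "attributed_to": str}, ...]}
    """
    updated = deepcopy(triggering)
    negatives = updated.setdefault("negatives", [])
    entry = {"prompt": prompt, "attributed_to": owner}
    if entry not in negatives:  # idempotent: re-running the fan-out adds nothing
        negatives.append(entry)
    return updated


def fan_out(neighbour_data, prompt, owner):
    """Apply the same routing negative to every neighbour's triggering data."""
    return {
        name: add_routing_negative(data, prompt, owner)
        for name, data in neighbour_data.items()
    }
```

Working on copies and skipping entries that already exist keeps the step safe to re-run, which matters if the fan-out is applied by script and then touched up by hand.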

Catalogues: README.md (skills, steering, slash-command tables) and skills/README.md detail block.

Gates: quick_validate PASS (0/0); make hygiene + check-evals-coverage PASS for all 10 skills.

Summary

If this PR adds or changes a skill

Ran /apex:new-skill (or walked the equivalent manual steps in CONTRIBUTING.md)
skills/<skill>/SKILL.md present and passes make validate-<skill> (run from misc/evals/)
misc/evals/<skill>/triggering.json authored (≥16 prompts; balanced positives and near-miss negatives)
misc/evals/<skill>/evals.json authored (≥2 realistic task prompts with ≥3 expectations each; every assertion tagged TODO: human review until tuned)
misc/evals/<skill>/README.md filled in — including the SIBLING_MAP block (or explicitly empty with rationale if the skill has no siblings)
For each neighbour: misc/evals/<neighbour>/SIBLING_MAP gained a bullet and its triggering.json gained the matching negatives (via update_sibling_map.py or hand-edit)
make init-evals-finalize SKILL=<skill> exits 0
make check-evals-coverage exits 0
Ran the update-docs skill and committed any resulting changes (regenerated wrappers/manifest, marker-block updates, prose edits)

See misc/evals/README.md for the capability catalogue and CONTRIBUTING.md for the full new-skill workflow.
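
As a rough illustration of the thresholds in the checklist above (at least 16 triggering prompts, at least 2 task prompts with at least 3 expectations each), a coverage check might look like the sketch below. It is not the repo's check-evals-coverage target: the JSON shapes, the function name, and the reading of "balanced" as equal counts are assumptions made for the example.

```python
MIN_TRIGGER_PROMPTS = 16  # from the checklist: >=16 prompts
MIN_TASK_PROMPTS = 2      # from the checklist: >=2 realistic task prompts
MIN_EXPECTATIONS = 3      # from the checklist: >=3 expectations each


def check_eval_coverage(triggering, evals):
    """Return a list of human-readable problems; an empty list means the check passes.

    Hypothetical shapes (not confirmed by the repo):
        triggering: {"positives": [...], "negatives": [...]}
        evals:      [{"prompt": str, "expectations": [str, ...]}, ...]
    """
    problems = []
    positives = triggering.get("positives", [])
    negatives = triggering.get("negatives", [])

    total = len(positives) + len(negatives)
    if total < MIN_TRIGGER_PROMPTS:
        problems.append(
            f"triggering: {total} prompts, need at least {MIN_TRIGGER_PROMPTS}"
        )
    # Assumption: "balanced" is interpreted here as equal counts.
    if len(positives) != len(negatives):
        problems.append(
            f"triggering: {len(positives)} positives vs {len(negatives)} negatives"
        )

    if len(evals) < MIN_TASK_PROMPTS:
        problems.append(
            f"evals: {len(evals)} task prompts, need at least {MIN_TASK_PROMPTS}"
        )
    for index, task in enumerate(evals):
        count = len(task.get("expectations", []))
        if count < MIN_EXPECTATIONS:
            problems.append(
                f"evals[{index}]: {count} expectations, need at least {MIN_EXPECTATIONS}"
            )
    return problems
```

Under these assumptions, the counts reported in this PR (8 positives, 8 negatives, 2 task prompts with 5 expectations each) would produce no problems.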

utkarpun · 2026-06-02T10:05:57Z

Linked: closes #6 (AI/ML | GenAI Reference). This PR delivers the genai skill that addresses the original ask.

devfloor9 self-requested a review June 2, 2026 07:51

jalawala wants to merge 1 commit into aws-samples:main from jalawala:feat/eks-genai-skill
