Docs: non-production cost optimization + Terraform/YAML guidance #347

Open
justin808 wants to merge 1 commit into main from jg/non-prod-cost-optimization-docs

@justin808 justin808 commented May 30, 2026

Member

Summary

Adds user-facing documentation on two related topics that currently have no home in the guide:

  1. Reducing Control Plane cost for long-lived non-production apps (staging, demos) — the existing Minimizing Review App Costs section only covers ephemeral PRs.
  2. How Terraform complements the cpflow YAML workflow — the existing Terraform overview reads as "Terraform over YAML," so this clarifies when to use each.

Changes

docs/tips.md — new section "Right-Sizing Non-Production Workloads":

  • Enable Capacity AI on idle workloads (with the stateful-workload tradeoff)
  • Disable CPU-utilization autoscaling for idle apps (it fights Capacity AI)
  • Right-size reserved CPU/memory — the common mistake of pinning Postgres at a full core
  • Drop unused workers — run Rails 8 Solid Queue inside Puma (SOLID_QUEUE_IN_PUMA=true) instead of a full-time worker
  • Share one Postgres across non-production apps (database-per-app), with the single-port and scale-event caveats
  • Keep templates as the source of truth — console edits get overwritten by cpflow deploy; bridges to Terraform drift detection
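
As a rough illustration of the right-sizing bullet above, a non-production Postgres workload might reserve a fraction of a core instead of a full one. This is a minimal sketch, not an excerpt from the PR: the workload name, image tag, and the specific `cpu` and `memory` values are hypothetical, and it assumes the container-level `cpu` and `memory` fields of a Control Plane workload template. Measure actual usage before choosing numbers.

```yaml
# Hypothetical template excerpt; the values are placeholders, not recommendations.
kind: workload
name: postgres
spec:
  containers:
    - name: postgres
      image: postgres:16   # assumed image tag
      cpu: 100m            # a tenth of a core instead of a full core
      memory: 256Mi        # size to observed usage, with some headroom
```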

docs/terraform/overview.md — new section "Terraform and cpflow Are Complementary":

  • cpflow deploy for ephemeral review apps; generated Terraform for long-lived staging/production
  • Notes that cpflow terraform generate keeps the YAML templates as the single source of truth, and that terraform plan surfaces drift

Notes

  • Docs-only. No source-code changes, so no CHANGELOG entry (the changelog is scoped to source code).
  • All anchor links and cross-links between the two docs were checked.
  • Will trigger the docs-site dispatch to controlplaneflow-com on merge (per CONTRIBUTING.md).

🤖 Generated with Claude Code


Note

Low Risk
Documentation-only changes with no runtime, auth, or infrastructure code impact.

Overview
Docs-only: expands the user guide with non-production cost levers and clarifies how Terraform fits alongside cpflow deploy.

docs/tips.md adds Right-Sizing Non-Production Workloads for long-lived staging/demo apps (beyond the existing review-app section): Capacity AI, turning off CPU autoscaling on idle apps, lowering reserved CPU/memory (especially Postgres), dropping unused workers (e.g. Solid Queue in Puma), sharing one Postgres across apps, and keeping .controlplane/templates/ as the source of truth—with a link to Terraform for drift on stable envs.

docs/terraform/overview.md adds Terraform and cpflow Are Complementary: review apps stay on cpflow deploy; staging/production use generated Terraform from the same templates, with terraform plan for drift—cross-linked to the new tips section.

Reviewed by Cursor Bugbot for commit 599cb21. Bugbot is set up for automated code reviews on this repo. Configure here.

Summary by CodeRabbit

  • Documentation
    • Enhanced Terraform documentation clarifying workflow separation: cpflow deploy for ephemeral environments, generated Terraform for stable long-lived environments with state tracking and drift detection.
    • Added cost-reduction guidance for non-production workloads, covering resource optimization, autoscaling configuration, and consolidation strategies.

coderabbitai Bot commented May 30, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e1688352-2c4c-419d-8b84-64624ef7d3dc

📥 Commits

Reviewing files that changed from the base of the PR and between 037b74f and 599cb21.

📒 Files selected for processing (2)
  • docs/terraform/overview.md
  • docs/tips.md

Walkthrough

This PR adds documentation covering two related areas of deployment and cost management. It clarifies the complementary roles of cpflow deploy and generated Terraform, recommending cpflow for ephemeral review apps and Terraform for stable staging/production with state tracking. It also introduces comprehensive guidance for cost optimization in non-production environments, covering resource sizing, workload consolidation, and shared infrastructure patterns.

Changes

Documentation Additions

| Layer / File(s) | Summary |
| --- | --- |
| Terraform and cpflow complementarity (docs/terraform/overview.md) | New section explaining the workflow split: cpflow with YAML templates for ephemeral review apps, and generated Terraform for long-lived environments with state management and drift detection. Templates remain the source of truth via regeneration. |
| Right-sizing non-production workloads (docs/tips.md) | Table of contents updated and new section added covering cost reduction strategies: enabling Capacity AI, disabling CPU autoscaling on idle apps, right-sizing reserved resources with example YAML, consolidating workloads using Rails Solid Queue in Puma, sharing Postgres across non-production apps with routing guidance, and maintaining templates as the authoritative configuration source. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • shakacode/control-plane-flow#346: Both PRs modify docs/tips.md with updates to the table of contents and non-production/review-app guidance, overlapping in cost and scaling clarifications.

Suggested labels

documentation, approved for merge

Poem

🐰 The docs now bloom with wisdom bright,
Terraform paths and savings' light,
Templates guide the ephemeral way,
While steady stages terraform apply all day!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main changes: documentation updates about non-production cost optimization and Terraform/YAML guidance, matching both file summaries. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Comment thread docs/tips.md
it — acceptable for non-production.

A managed alternative is a single small RDS instance hosting many databases; see
[migrating from Heroku Postgres to RDS](https://pelle.io/posts/hetzner-rds-postgres).

Contributor

The link text says "Heroku Postgres" but the URL slug is hetzner-rds-postgres — this appears to be about migrating from Hetzner-hosted Postgres (self-managed VPS) to RDS, not from Heroku Postgres. The same mismatch exists in the pre-existing Useful Links entry at the bottom of the file (line 456).

Suggested change
[migrating from Heroku Postgres to RDS](https://pelle.io/posts/hetzner-rds-postgres).
[migrating from Hetzner Postgres to RDS](https://pelle.io/posts/hetzner-rds-postgres).

Worth verifying the article title before merging, and fixing the Useful Links entry at the same time for consistency.

greptile-apps Bot commented May 30, 2026

Greptile Summary

This PR adds two docs-only sections: a "Right-Sizing Non-Production Workloads" guide in docs/tips.md and a "Terraform and cpflow Are Complementary" section in docs/terraform/overview.md. No source code is changed.

  • docs/tips.md gains six subsections covering Capacity AI, CPU autoscaling, reserved resource right-sizing, dropping unused workloads, shared Postgres, and template-as-source-of-truth; cross-links to existing anchors and the new Terraform section are correct.
  • docs/terraform/overview.md gains a short section clarifying the cpflow-vs-Terraform split (ephemeral review apps vs. long-lived environments) and links back to the new tips anchor.

Confidence Score: 4/5

Docs-only change with no source code modifications; safe to merge after addressing the link-text mismatch.

The new content is accurate and well-structured. The only issues are a link whose text reads "migrating from Heroku Postgres to RDS" while the URL slug is hetzner-rds-postgres (which could send readers to a tutorial they did not expect) and a YAML snippet missing its workload-level kind/name fields, which makes it inconsistent with the other snippets.

docs/tips.md — review the RDS link text and the incomplete autoscaling YAML snippet.

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/tips.md | Adds 115-line "Right-Sizing Non-Production Workloads" section with 6 subsections; one autoscaling YAML snippet is missing its workload-level kind/name fields (inconsistent with all other snippets), and one link has a text/URL mismatch: the text refers to Heroku Postgres and RDS while the URL slug is hetzner-rds-postgres. |
| docs/terraform/overview.md | Adds 11-line "Terraform and cpflow Are Complementary" section before the Usage heading; cross-links to tips.md anchors are accurate and the prose is consistent with the rest of the document. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[New environment needed?] --> B{Environment type}
    B --> |Ephemeral / PR review| C[cpflow YAML + cpflow deploy]
    B --> |Long-lived: staging/production| D[cpflow terraform generate]
    C --> E[Dynamic per-PR app names]
    C --> F[cleanup-stale-apps / ps:stop / ps:start]
    C --> G[No Terraform state to manage]
    D --> H[terraform apply]
    H --> I[Tracked state file + drift detection]
    H --> J[terraform plan surfaces config drift]
    D --> K[YAML templates remain source of truth]
    K --> |Templates change| D

Reviews (1): Last reviewed commit: "Add non-production cost-optimization tip..." | Re-trigger Greptile

claude Bot commented May 30, 2026

Contributor

Review: Docs — Non-Production Cost Optimization + Terraform/YAML Guidance

Overall: Approve with one factual fix needed. This is a well-written, genuinely useful addition. The content is accurate, the YAML examples are correct, and the cross-linking between the two files is clean.


What the PR does

  • docs/tips.md: Adds a new §10 "Right-Sizing Non-Production Workloads" covering Capacity AI, disabling CPU autoscaling, right-sizing reserved resources (especially Postgres), dropping unused workers (Solid Queue-in-Puma), sharing one Postgres across non-prod apps, and keeping templates as the source of truth.
  • docs/terraform/overview.md: Adds a "Terraform and cpflow Are Complementary" section that clarifies the cpflow deploy vs generated-Terraform split.

Issues

Bug — one factual error (inline comment posted):
Line 440 links [migrating from Heroku Postgres to RDS] to pelle.io/posts/hetzner-rds-postgres. The URL slug says Hetzner (a self-hosted VPS provider), not Heroku. The same wrong text exists in the pre-existing Useful Links entry at line 456 — both should be fixed together.


Minor observations (non-blocking)

  • metric: disabled — the YAML snippet in "Don't Autoscale Idle Workloads on CPU" uses metric: disabled. Worth a quick check against the Control Plane API docs to confirm disabled is a valid enum value rather than, say, none or omitting the key.

  • Shared Postgres — data isolation note — the "Share One Postgres" section helpfully lists networking caveats but doesn't mention that each app should connect as a distinct database user scoped to its own database. A brief note would make the advice safer even in non-production setups, because it prevents a misconfigured DATABASE_URL from touching another app's data.

  • terraform/overview.md heading tension — the section just above the new one is titled "Benefits of Using Terraform Over YAML Configs". The new "Complementary" section immediately softens that framing, which is the right message, but a reader skimming headings gets a slightly mixed signal. Consider renaming the existing heading to "Benefits of Terraform for Long-Lived Environments" or similar — but this is cosmetic and out of scope for this PR.

  • Duplicate link — the new inline reference to pelle.io at line 440 duplicates the existing Useful Links entry at line 456. Once the Heroku→Hetzner text is fixed in both places, you may want to remove the Useful Links entry to avoid future divergence (or keep it for discoverability — either is fine).
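
To make the data-isolation suggestion above concrete, one app's template could point `DATABASE_URL` at the shared server with its own user and database. This is only a sketch under stated assumptions: the host, user, and database names are hypothetical placeholders, the env list shape is assumed to follow the usual workload container format, and the role is assumed to have been granted privileges on its own database only.

```yaml
# Hypothetical excerpt for one of several apps sharing a Postgres server.
# Host, user, and database names are placeholders, not values from the PR.
kind: workload
name: rails
spec:
  containers:
    - name: rails
      env:
        - name: DATABASE_URL
          # app_one is assumed to be a role with privileges on app_one_staging only.
          # In practice the credentials would come from a secret, not plain text.
          value: postgres://app_one:CHANGE_ME@shared-postgres.example.internal:5432/app_one_staging
```

A second app would use a different role and database name against the same host, so a copy-pasted URL fails authentication instead of silently reading another app's data.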


Verdict

The documentation is accurate, well-structured, and fills a real gap. One factual fix (Heroku → Hetzner in the link text) is required; the rest are suggestions.

Comment thread docs/tips.md
Comment on lines +439 to +441
A managed alternative is a single small RDS instance hosting many databases; see
[migrating from Heroku Postgres to RDS](https://pelle.io/posts/hetzner-rds-postgres).

P2 Link text vs. URL mismatch for RDS reference

The link text says "migrating from Heroku Postgres to RDS", but the URL is https://pelle.io/posts/hetzner-rds-postgres, whose slug names Hetzner rather than Heroku. A reader expecting a Heroku migration guide would land on an article that appears to cover Hetzner, which could cause confusion when the infrastructure details don't match.

Comment thread docs/tips.md
Comment on lines +379 to +387
```yaml
spec:
  defaultOptions:
    capacityAI: true
    autoscaling:
      metric: disabled
      minScale: 1
      maxScale: 1
```

P2 The autoscaling snippet is missing the kind: and name: top-level fields that are present in every other YAML snippet in this section (e.g. the Capacity AI snippet directly above and the Postgres snippet below). Without those fields the snippet looks like an incomplete fragment and readers can't tell which workload it applies to.

Suggested change
```yaml
spec:
  defaultOptions:
    capacityAI: true
    autoscaling:
      metric: disabled
      minScale: 1
      maxScale: 1
```
```yaml
kind: workload
name: rails
spec:
  defaultOptions:
    capacityAI: true
    autoscaling:
      metric: disabled
      minScale: 1
      maxScale: 1
```

Document how to reduce Control Plane spend on long-lived non-production
environments (staging, demos) and clarify that Terraform complements the
cpflow YAML workflow rather than replacing it.

docs/tips.md: new "Right-Sizing Non-Production Workloads" section --
enable Capacity AI on idle workloads, disable CPU autoscaling for idle
apps, right-size reserved CPU/memory (e.g. avoid pinning Postgres at a
full core), drop unused workers (run Solid Queue in Puma), share one
Postgres across non-production apps, and keep templates as the source of
truth to avoid console drift.

docs/terraform/overview.md: new "Terraform and cpflow Are Complementary"
section -- cpflow deploy for ephemeral review apps, generated Terraform
for long-lived staging/production with state tracking and drift detection.

Docs-only; no source-code changes, so no CHANGELOG entry.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
justin808 force-pushed the jg/non-prod-cost-optimization-docs branch from a670c5b to 599cb21 May 31, 2026 07:58
coderabbitai Bot added the documentation and approved for merge labels May 31, 2026

claude Bot commented May 31, 2026

Contributor

Code Review: Docs — Non-Production Cost Optimization + Terraform/YAML Guidance

Overall: This is a well-written, practical addition to the guide. The content is accurate, the tradeoffs are honestly stated, and the new section slots cleanly into the existing tips structure. All cross-link anchors verified. A few minor issues worth addressing before merge:


Correctness

  • metric: disabled should be quoted. The spec files consistently use metric: "disabled" (string-quoted), while the new YAML snippet writes it unquoted. In YAML, disabled parses as a plain string either way, but quoting it matches the codebase convention. See spec/core/terraform_config/generator_spec.rb:522 and workload_spec.rb:172.

  • templates/app.yml is not a standard path. The "Share One Postgres" section tells readers to edit DATABASE_HOST "in templates/app.yml", but the actual directory is .controlplane/templates/ and the filename varies per app (e.g. rails.yml). The path should be corrected or made more generic.
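
The quoting point above can be illustrated with a small standalone YAML document. It is a sketch for comparison only: the top-level key names are hypothetical, and `off` is not suggested as a valid `metric` value. It appears solely to show a word that YAML 1.1 parsers resolve as a boolean when left unquoted.

```yaml
# Plain and quoted scalars: both of these load as the string "disabled".
unquoted:
  metric: disabled
quoted:
  metric: "disabled"

# By contrast, a YAML 1.1 parser resolves an unquoted off as boolean false,
# which is one reason to quote enum-like values by habit.
yaml11_boolean:
  metric: off
yaml11_string:
  metric: "off"
```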

Style / Consistency

  • YAML snippets are inconsistent. The first snippet (Capacity AI) includes the kind:/name: header; the second (autoscaling) drops straight to spec:. Readers copy-pasting the second snippet would produce a headless fragment. Either add the header or label it as a partial excerpt.

  • Duplicate external link. pelle.io/posts/hetzner-rds-postgres appears both inline in the new "Share One Postgres" section and in the existing "Useful Links" footer. The duplication is harmless but the inline use is more contextual.

Clarity

  • Org-vs-GVC port exposure is under-explained. "Expose the database port at exactly one level (org or GVC, never both) to avoid Control Plane routing conflicts" is the most operationally risky advice in the new section but the least explained. New users won't know what "expose at org level" means vs "GVC level." A brief clarifier or link to Control Plane docs would help.

  • Solid Queue advice is Rails 8-specific. The section frames dropping the worker workload as a general tip, then qualifies it with "On Rails 8, Solid Queue can run inside Puma...". Users on Rails 6/7 (likely using Sidekiq) may miss that this specific workaround doesn't apply to them.
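
For apps that are on Rails 8, the Solid Queue tip could look roughly like the excerpt below. It is a sketch, assuming the app keeps the `plugin :solid_queue if ENV["SOLID_QUEUE_IN_PUMA"]` line from the Rails 8 default `config/puma.rb` and that the workload is named `rails`; both are assumptions, not details confirmed by the PR. Apps on Sidekiq or other job backends still need their worker workload.

```yaml
# Hypothetical excerpt: run Solid Queue inside the web process (Rails 8 only),
# so the separate worker workload can be removed from non-production apps.
kind: workload
name: rails
spec:
  containers:
    - name: rails
      env:
        - name: SOLID_QUEUE_IN_PUMA
          value: "true"   # quoted so it stays a string, not a YAML boolean
```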


The metric: "disabled" quoting and the .controlplane/templates/ path are the two items most worth fixing. Everything else is minor polish.

Comment thread docs/tips.md
  defaultOptions:
    capacityAI: true
    autoscaling:
      metric: disabled

Contributor

The spec files in this repo quote this value as "disabled" — e.g. generator_spec.rb:522 and workload_spec.rb:172. YAML parses an unquoted disabled as a string too, but quoting it matches the established convention and avoids any surprises with strict YAML validators.

Suggested change
metric: disabled
metric: "disabled"

Comment thread docs/tips.md
app multiplies standing cost. For non-production, several apps can share a single Postgres
server, each using its own database:

- Point each app's `DATABASE_HOST` (in `templates/app.yml`) at the shared instance's

Contributor

templates/app.yml isn't a standard cpflow filename. The canonical location is .controlplane/templates/, and the actual file name varies per app (e.g. rails.yml or app.yml, depending on what the user named it). A more accurate phrasing:

Suggested change
- Point each app's `DATABASE_HOST` (in `templates/app.yml`) at the shared instance's
- Point each app's `DATABASE_HOST` environment variable (in `.controlplane/templates/`) at the shared instance's

Comment thread docs/tips.md
Comment on lines +435 to +443
```yaml
spec:
  defaultOptions:
    capacityAI: true
    autoscaling:
      metric: disabled
      minScale: 1
      maxScale: 1
```

Contributor

This snippet starts at spec: while the Capacity AI snippet above begins with kind:/name:. Readers copy-pasting this fragment directly into a template would get a headless YAML block. Consider either adding the header to match the first snippet, or adding a note like # (partial — add to your workload's spec.defaultOptions) to signal it's an excerpt.

Comment thread docs/tips.md
Comment on lines +490 to +491
- Expose the database port at **exactly one** level (org *or* GVC, never both) to avoid
Control Plane routing conflicts.

Contributor

This is the most operationally risky advice in the section but the least explained. Readers unfamiliar with Control Plane's networking model won't know what "expose at org level" vs "GVC level" means, or what the routing conflict looks like in practice. A one-sentence clarifier (or a link to the relevant Control Plane docs on port/service exposure) would significantly reduce the chance of misconfiguration.

Labels

approved for merge, documentation
