@brian-abo

Contribution Details

PR Description

This PR adds two significant enhancements to Flagger:

  • Namespace Scoping Support - Allows Flagger to watch specific namespaces instead of all namespaces #1859
  • New Relic NerdGraph Provider - Adds support for New Relic's GraphQL API for custom metrics #1860

What This PR Does

Namespace Scoping Feature

Problem: Previously, Flagger could only watch all namespaces in a cluster, which wasn't ideal for multi-tenant environments or when you only want to manage canaries in specific namespaces.

Multi-Tenancy Issues:

Security Isolation: Watching all namespaces meant Flagger had cluster-wide permissions and could potentially access sensitive deployment information across tenant boundaries.

Resource Conflicts: Multiple teams using Flagger in the same cluster could experience naming conflicts or unintended interactions between their canary configurations.

Compliance Requirements: Organizations may have strict compliance requirements that mandate workload isolation between different business units, environments, or customers.

Operational Challenges:

Noise and Complexity: Operators had to sift through logs and metrics from all namespaces, making troubleshooting and monitoring more difficult when they only cared about a specific namespace.

RBAC Complexity: Organizations couldn't implement least-privilege access patterns where Flagger instances should only have permissions for specific namespaces they're responsible for managing.

Environment Separation:

Development vs Production: Teams often want separate Flagger instances for different environments, but the all-namespace approach made it impossible to cleanly separate development canaries from production ones.

Staged Rollouts: Organizations implementing progressive delivery across multiple environments need better control over which Flagger instance manages which namespace.

This limitation forced many organizations to either:

  • Deploy multiple Flagger instances in separate clusters (expensive and complex)
  • Accept the security and performance trade-offs of cluster-wide access

The namespace scoping feature solves these problems by letting administrators control precisely which namespaces each Flagger instance monitors, enabling multi-tenant deployments with proper security isolation and improved operational efficiency.

Solution: The --namespace flag now accepts a comma-separated list of namespaces to watch.
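
For reviewers, here is a minimal sketch of how a comma-separated flag value could be turned into a list of namespaces. The helper name parseNamespaces and its trimming and de-duplication rules are assumptions made for illustration; they are not taken from the code in this PR.

```go
package main

import "strings"

// parseNamespaces splits a comma-separated flag value into a de-duplicated
// list of namespace names, preserving first-seen order.
// Hypothetical helper for illustration; an empty result is assumed to mean
// "watch all namespaces".
func parseNamespaces(flagValue string) []string {
	seen := make(map[string]struct{})
	var namespaces []string
	for _, part := range strings.Split(flagValue, ",") {
		ns := strings.TrimSpace(part)
		if ns == "" {
			continue // ignore empty entries, e.g. from a trailing comma
		}
		if _, dup := seen[ns]; dup {
			continue // keep only the first occurrence of each namespace
		}
		seen[ns] = struct{}{}
		namespaces = append(namespaces, ns)
	}
	return namespaces
}
```

Under these assumptions, parseNamespaces("production, staging,,production") yields the two entries production and staging, and an empty flag value yields an empty list.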

Key Changes:

  • Modified cmd/flagger/main.go to parse the namespace flag and create namespace-specific informers
  • Updated controller logic in pkg/controller/ to handle both namespace-specific and all-namespace listers
  • Added fallback logic: if a resource isn't found in a namespace-specific lister, the controller checks the "all-namespaces" lister
  • Updated installation documentation with examples for a single namespace, multiple namespaces, and all namespaces (default)

# Watch specific namespace
--set namespace=production

# Watch multiple namespaces (commas escaped because Helm's --set treats a bare comma as a key separator)
--set namespace=production\,staging\,testing

# Watch all namespaces (default behavior)
# No namespace flag needed
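
The fallback described in Key Changes can be sketched roughly as follows. This is not the controller code from this PR: canaryGetter, mapGetter, errNotFound, and lookup are hypothetical stand-ins that replace the real client-go listers with an in-memory map so the example stays self-contained.

```go
package main

import "errors"

// errNotFound is a hypothetical sentinel standing in for a "not found" API error.
var errNotFound = errors.New("canary not found")

// canaryGetter is a hypothetical, simplified stand-in for a lister.
type canaryGetter interface {
	Get(namespace, name string) (string, error)
}

// mapGetter is an in-memory canaryGetter keyed by "namespace/name".
type mapGetter map[string]string

func (m mapGetter) Get(namespace, name string) (string, error) {
	if v, ok := m[namespace+"/"+name]; ok {
		return v, nil
	}
	return "", errNotFound
}

// lookup asks the lister registered for the namespace first. If that lister
// is missing or reports "not found", it falls back to the all-namespaces
// lister. Any other error is returned immediately.
func lookup(scoped map[string]canaryGetter, all canaryGetter, namespace, name string) (string, error) {
	if l, ok := scoped[namespace]; ok {
		v, err := l.Get(namespace, name)
		if err == nil {
			return v, nil
		}
		if !errors.Is(err, errNotFound) {
			return "", err
		}
	}
	if all == nil {
		return "", errNotFound
	}
	return all.Get(namespace, name)
}
```

In this sketch a resource known only to the all-namespaces lister is still resolved, while a lookup with no fallback lister configured returns errNotFound.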

New Relic NerdGraph Provider

Problem: The existing New Relic provider only used the Insights API. New Relic currently recommends using the NerdGraph (GraphQL) API. This is also the API we use at Capital One.

Solution: Implemented a NerdGraph provider that wraps NRQL queries in GraphQL and supports template variables.

Key Features:

  • GraphQL API Integration: Uses New Relic's NerdGraph API for more flexible querying
  • Template Variable Support: Supports all Flagger template variables ({{ target }}, {{ namespace }}, etc.)
  • Automatic Query Wrapping: Automatically wraps NRQL queries with proper GraphQL structure and time windows
  • Robust Error Handling: Comprehensive error handling for API responses and data parsing
  • Credential Management: Secure handling of API keys and account IDs via Kubernetes secrets

Implementation Details:

  • Added pkg/metrics/providers/newrelic-nerdgraph.go with the full provider implementation
  • Added comprehensive tests in pkg/metrics/providers/newrelic-nerdgraph_test.go
  • Updated the factory to register the new provider
  • Added detailed documentation with examples

apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: nerdgraph-success-rate
spec:
  provider:
    type: newrelic-nerdgraph
    secretRef:
      name: newrelic-secret
  query: |
    SELECT percentage(count(*), WHERE response.status != 'error')
    FROM Transaction
    WHERE appName = '{{ target }}'
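
As a rough illustration of how template variables such as {{ target }} and {{ namespace }} in the query above could be expanded before the query is sent, here is a minimal sketch based on Go's text/template with zero-argument functions. The renderQuery helper and its parameters are hypothetical, and the sketch covers only two of the variables.

```go
package main

import (
	"bytes"
	"text/template"
)

// renderQuery expands {{ target }} and {{ namespace }} in a metric query.
// Each variable is registered as a zero-argument template function, which is
// why the placeholders are written without a leading dot.
// Hypothetical helper for illustration only.
func renderQuery(query, target, namespace string) (string, error) {
	funcs := template.FuncMap{
		"target":    func() string { return target },
		"namespace": func() string { return namespace },
	}
	tmpl, err := template.New("query").Funcs(funcs).Parse(query)
	if err != nil {
		return "", err // e.g. the query references an unknown variable
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, nil); err != nil {
		return "", err
	}
	return buf.String(), nil
}
```

With target set to podinfo, the WHERE clause of the template above would render as WHERE appName = 'podinfo'.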

@brian-abo
Author

Sorry, I had to close and recreate this PR in order to fix the DCO check. The company I work for has strict controls around open source contributions and I was unable to force push to the old branch.
