ImageBuilder should consume a configuration file for build/publishing operations #1747

@lbussell

The problem

Configuring build, publishing, and pipeline settings for ImageBuilder is needlessly complicated.

The build/publishing process has many inputs, such as which pipeline image to run on and which registry to publish to.

In our system, pipeline inputs and build/publishing inputs are intertwined. Everything is configured through Azure Pipelines parameters and variables. This has a number of downsides:

  • It is impossible to separate the publishing from the pipeline environment that it runs in
  • It's difficult to track down where/how/why a particular pipeline variable was assigned
  • We spend many pipeline steps setting up variables inside PowerShell scripts
  • Pipeline variables are inherently insecure: any compromised step can override them by printing a logging command to stdout

For example, look at eng/common/templates/variables/common.yml. There is no logical separation between pipeline considerations (e.g. which pool we are running on) and publishing considerations (which ACR should we publish to? under which repo path?). Furthermore, Azure Pipelines does not give us an easy way to separate those concerns.

Here's an example of a build command that ran recently:

build \
    --manifest dotnet-dotnet-docker/manifest.json \
    --os-version alpine3.21 \
    --os-type linux \
    --architecture amd64 \
    --retry \
    --digests-out-var 'builtImages' \
    --acr-subscription '00000000-0000-0000-0000-000000000000' \
    --acr-resource-group 'DotnetContainers' \
    --image-info-output-path /artifacts/imageInfo/linuxamd64src-runtime-deps-10.0-alpine3.21-graph-image-info.json \
    --source-repo https://github.com/dotnet/dotnet-docker \
    --source-repo-prefix mirror/ \
    --registry-override somerandomacr.azurecr.io \
    --repo-prefix build-staging/9999999/ \
    --push \
    --image-info-source-path versions/build-info/docker/image-info.dotnet-dotnet-docker-nightly.json

Hopefully it's obvious that many of those arguments are a function of which publishing environment we're targeting (main, nightly, or testing). Ideally there would be one place for all of that configuration to live.

Config file

By creating a publishing configuration file for different environments/pipelines, we can separate concerns about what we're doing (the publishing config) from how we're doing it (the pipeline code).

Such a config file might look something like this:

{
  "buildAcr": {
    "name": "someStagingAcr",
    "repoPrefix": "build/$(BUILD_ID)/",
    "authentication": {
      "serviceConnection": {
        "id": "00000000-0000-0000-0000-000000000000",
        "clientId": "00000000-0000-0000-0000-000000000000",
        "tenantId": "00000000-0000-0000-0000-000000000000",
      }
    },
  },
  "publishAcr": {
    "name": "someProdAcr",
    "repoPrefix": "public/",
    "authentication": {
      // ...
    },
  },
  "publishEolAnnotations": true,
  "publishImageInfo": true,
  "versionsRepo": {
    "location": "github",
    "org": "dotnet",
    "repo": "versions",
    "branch": "main",
    "imageInfoPath": "build-info/docker"
  },
  "publishReadmes": true,
  "docsRepo": {
    "location": "github",
    "org": "microsoft",
    // ...
  },
  "enablePublishNotifications": true,
  "notificationsRepo": {
    "location": "github",
    // ...
  },
  "enableBuildTelemetry": true,
  "buildTelemetry": {
    "kustoCluster": "somekustocluster",
    "database": "Database",
    "imageTable": "Images",
    "layerTable": "Layers",
    "authentication": {
      //...
    }
  },
  // ...
}

This obviously isn't a complete example, but hopefully it makes sense. We should extract all of the immutable characteristics of the various publishing configurations out of the pipelines and into declarative config files like this. Each repo/branch publishing setup would have its own config (dotnet-docker main, dotnet-docker nightly, dotnet-buildtools-prereqs, docker-tools, etc.). This is something we could do incrementally, for example on a command-by-command or even option-by-option basis.
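
To illustrate the idea, here is a minimal Python sketch of how a tool could derive some of the build arguments from such a config. It is only an illustration, not how ImageBuilder works today. The `--registry-override` and `--repo-prefix` flags come from the build command earlier in this issue. The function name `build_args_from_config`, the mapping from config keys to flags, the `.azurecr.io` suffix, and the handling of `$(BUILD_ID)` are all assumptions.

```python
import json

# Hypothetical subset of the config file, written as strict JSON
# (no comments or trailing commas) so that json.loads accepts it.
CONFIG_TEXT = """
{
  "buildAcr": {
    "name": "someStagingAcr",
    "repoPrefix": "build/$(BUILD_ID)/"
  }
}
"""


def build_args_from_config(config: dict, build_id: str) -> list[str]:
    """Translate the buildAcr section into CLI arguments.

    The key-to-flag mapping here is an assumption made for illustration.
    """
    acr = config["buildAcr"]
    # Assumption: the registry host is the lowercased ACR name plus ".azurecr.io".
    registry = f"{acr['name'].lower()}.azurecr.io"
    # Assumption: $(BUILD_ID) is the only placeholder and is replaced literally.
    repo_prefix = acr["repoPrefix"].replace("$(BUILD_ID)", build_id)
    return [
        "build",
        "--registry-override", registry,
        "--repo-prefix", repo_prefix,
    ]


config = json.loads(CONFIG_TEXT)
args = build_args_from_config(config, build_id="9999999")
print(" ".join(args))
# build --registry-override somestagingacr.azurecr.io --repo-prefix build/9999999/
```

With something like this, the pipeline would only need to pass the path to the config file and the build ID, and the environment-specific values would live in one place.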

Some things (like waiting for image ingestion, timeouts, etc.) should remain configurable via CLI args so they are more easily accessible and overridable via pipelines.
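
One possible way to keep those operational settings overridable is to give them built-in defaults and let CLI arguments win when present. The sketch below uses Python's `argparse`. The flag names (`--ingestion-timeout-minutes`, `--no-wait-for-ingestion`) and the default values are hypothetical and are not existing ImageBuilder options.

```python
import argparse

# Hypothetical defaults for operational settings; not real ImageBuilder values.
DEFAULTS = {"wait_for_ingestion": True, "ingestion_timeout_minutes": 30}


def parse_overrides(argv):
    """Parse optional overrides; options that were not passed stay None."""
    parser = argparse.ArgumentParser(prog="sketch")
    parser.add_argument("--ingestion-timeout-minutes", type=int, default=None)
    parser.add_argument(
        "--no-wait-for-ingestion",
        dest="wait_for_ingestion",
        action="store_const",
        const=False,
        default=None,
    )
    return vars(parser.parse_args(argv))


def resolve(defaults, overrides):
    """Start from the defaults and apply only the overrides that were set."""
    resolved = dict(defaults)
    resolved.update({k: v for k, v in overrides.items() if v is not None})
    return resolved


print(resolve(DEFAULTS, parse_overrides([])))
print(resolve(DEFAULTS, parse_overrides(["--ingestion-timeout-minutes", "5"])))
```

The immutable publishing values would still come from the config file. Only these per-run knobs would be read from the command line.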

Unknowns

  • Some values in the config don't need to be public. It would be ideal to have a way of injecting some values into the config from pipeline variable groups. This could be via a script step or something like that.
  • Could we define the config directly in our YAML pipelines and convert it to JSON to be passed in?
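
For the first unknown, a script step could fill placeholders in the config before it is handed to ImageBuilder. Here is a rough Python sketch, assuming the non-public values are exposed to the step as environment variables and that the config uses the same `$(NAME)` placeholder style as `repoPrefix` in the example above. The function name `inject` and the variable names are hypothetical.

```python
import re

# Matches placeholders such as $(BUILD_ID).
PLACEHOLDER = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_]*)\)")


def inject(value, variables):
    """Recursively replace $(NAME) placeholders in every string of a config.

    Raises KeyError if a placeholder has no matching variable, so a missing
    value fails the step instead of being published silently.
    """
    if isinstance(value, dict):
        return {key: inject(item, variables) for key, item in value.items()}
    if isinstance(value, list):
        return [inject(item, variables) for item in value]
    if isinstance(value, str):
        return PLACEHOLDER.sub(lambda match: variables[match.group(1)], value)
    return value  # numbers, booleans, and null pass through unchanged


template = {
    "buildAcr": {"name": "$(STAGING_ACR)", "repoPrefix": "build/$(BUILD_ID)/"},
    "publishImageInfo": True,
}
# In a pipeline step, os.environ could be passed instead of this dict.
filled = inject(template, {"STAGING_ACR": "someStagingAcr", "BUILD_ID": "9999999"})
print(filled)
```

The checked-in config would then contain only placeholders for the values that shouldn't be public.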
