@knqyf263
Collaborator

@knqyf263 knqyf263 commented Oct 20, 2025

Description

This PR implements a new Artifact ID calculation for container images that includes registry and repository information, as described in #9678.

Changes Summary

  1. New Artifact ID calculation: Changed from using Image ID directly to hash(ImageID + Registry + Repository)
  2. New image.Reference type: Type-safe wrapper for container image references with JSON marshaling support
  3. Image reference matching: Added logic to match artifact names with RepoTags/RepoDigests
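A type-safe reference wrapper with JSON support, as in item 2, might look roughly like the sketch below. It is not the PR's `image.Reference`: the field names, `ParseReference`, and the choice to serialize as a plain string are assumptions made for illustration, and the parser is a simplification that applies no Docker Hub defaults.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// Reference is a hypothetical, simplified stand-in for a typed image reference.
type Reference struct {
	Registry   string
	Repository string
	Suffix     string // ":tag" or "@digest"; empty if neither was given
}

// ParseReference splits a reference string into its parts.
// Assumption: no default registry or "library/" prefix is applied.
func ParseReference(s string) (Reference, error) {
	if s == "" {
		return Reference{}, errors.New("empty image reference")
	}
	var ref Reference
	name := s
	if i := strings.Index(name, "@"); i >= 0 {
		name, ref.Suffix = name[:i], name[i:]
	} else if i := strings.LastIndex(name, ":"); i > strings.LastIndex(name, "/") {
		// A colon after the last slash separates the tag, not a registry port.
		name, ref.Suffix = name[:i], name[i:]
	}
	// Treat the first path component as a registry only if it looks like a host.
	if host, rest, ok := strings.Cut(name, "/"); ok && strings.ContainsAny(host, ".:") {
		ref.Registry, name = host, rest
	}
	ref.Repository = name
	return ref, nil
}

// String reassembles the reference in its textual form.
func (r Reference) String() string {
	s := r.Repository + r.Suffix
	if r.Registry != "" {
		s = r.Registry + "/" + s
	}
	return s
}

// MarshalJSON encodes the reference as a JSON string.
func (r Reference) MarshalJSON() ([]byte, error) {
	return json.Marshal(r.String())
}

// UnmarshalJSON decodes a JSON string and re-parses it.
func (r *Reference) UnmarshalJSON(b []byte) error {
	var s string
	if err := json.Unmarshal(b, &s); err != nil {
		return err
	}
	parsed, err := ParseReference(s)
	if err != nil {
		return err
	}
	*r = parsed
	return nil
}

func main() {
	ref, err := ParseReference("ghcr.io/example/app:1.2.3")
	if err != nil {
		panic(err)
	}
	out, err := json.Marshal(map[string]Reference{"reference": ref})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Keeping the parsed parts behind one type means callers cannot accidentally pass an unparsed string where a registry or repository is expected.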

Implementation Details

The new Artifact ID ensures:

  • Same image with different tags → Same Artifact ID
  • Different repositories → Different Artifact IDs
  • Different registries → Different Artifact IDs

Because the registry and repository are now part of the ID, vulnerabilities can be tracked and deduplicated per deployment context instead of being keyed on the image ID alone.
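The three properties above can be illustrated with a minimal sketch. The description only specifies hash(ImageID + Registry + Repository); SHA-256, the JSON encoding of the inputs, and the names `artifactKey` and `calcArtifactID` are assumptions for illustration, not the PR's implementation.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// artifactKey is a hypothetical struct holding the hash inputs.
// The tag is deliberately absent, so retagging does not change the ID.
type artifactKey struct {
	ImageID    string `json:"image_id"`
	Registry   string `json:"registry"`
	Repository string `json:"repository"`
}

// calcArtifactID hashes the three inputs. SHA-256 over JSON is an assumption.
func calcArtifactID(imageID, registry, repository string) (string, error) {
	b, err := json.Marshal(artifactKey{
		ImageID:    imageID,
		Registry:   registry,
		Repository: repository,
	})
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(b)
	return "sha256:" + hex.EncodeToString(sum[:]), nil
}

func main() {
	const imageID = "sha256:1111" // placeholder image ID

	a, _ := calcArtifactID(imageID, "index.docker.io", "library/alpine")
	b, _ := calcArtifactID(imageID, "index.docker.io", "library/alpine") // other tag, same repo
	c, _ := calcArtifactID(imageID, "ghcr.io", "library/alpine")         // other registry

	fmt.Println(a == b) // true: the tag is not an input
	fmt.Println(a == c) // false: the registry differs
}
```

Encoding the inputs as a struct rather than concatenating raw strings avoids ambiguous boundaries, where two different (registry, repository) pairs would otherwise join into the same byte sequence.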

Related issues

Related PRs

Checklist

  • I've read the guidelines for contributing to this repository.
  • I've followed the conventions in the PR title.
  • I've added tests that prove my fix is effective or that my feature works.
  • I've updated the documentation with the relevant information (if needed).
  • I've added usage information (if the PR introduces new options)
  • I've included a "before" and "after" example to the description (if the PR is a user interface change).

@knqyf263 knqyf263 self-assigned this Oct 21, 2025
knqyf263 and others added 11 commits October 27, 2025 12:22
- Calculate Artifact ID as hash(ImageID + Registry + Repository)
- Ensures same images in different repos/registries have different IDs
- Same image with different tags in same repo gets same Artifact ID
- Falls back to image ID for unparseable image references
- Add comprehensive tests for the new implementation

Implements specification from gist 0031d4e835388ec1d4c11e50e74da3a4
…calculation

- Parse RepoTags and RepoDigests upfront using name.NewTag/NewDigest
- Use switch statement for type-safe Digest/Tag handling
- Use lo.Find for matching reference lookup
- Use lo.FirstOrEmpty for fallback selection
- Add test case for same image with different digest
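
The lookup described in this commit could be sketched as below. This is a standard-library approximation, not the PR's code: `slices.IndexFunc` stands in for `lo.Find`, a length check stands in for `lo.FirstOrEmpty`, references are compared as plain strings instead of going through `name.NewTag`/`name.NewDigest`, and matching on the repository part is an assumption. `repoOf` and `pickReference` are hypothetical names.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// repoOf strips a trailing "@digest" or ":tag", leaving "registry/repository".
func repoOf(ref string) string {
	if i := strings.Index(ref, "@"); i >= 0 {
		return ref[:i]
	}
	if i := strings.LastIndex(ref, ":"); i > strings.LastIndex(ref, "/") {
		return ref[:i]
	}
	return ref
}

// pickReference returns the RepoTag or RepoDigest that shares a repository
// with the artifact name, falling back to the first candidate, then to "".
func pickReference(artifactName string, repoTags, repoDigests []string) string {
	candidates := append(append([]string{}, repoTags...), repoDigests...)
	want := repoOf(artifactName)

	match := func(c string) bool { return repoOf(c) == want }
	if i := slices.IndexFunc(candidates, match); i >= 0 {
		return candidates[i]
	}
	if len(candidates) > 0 {
		return candidates[0] // fallback when nothing matches
	}
	return ""
}

func main() {
	tags := []string{"example.com/mirror/alpine:3.10", "ghcr.io/example/alpine:3.10"}
	digests := []string{"ghcr.io/example/alpine@sha256:aaaa"}

	fmt.Println(pickReference("ghcr.io/example/alpine:latest", tags, digests))
	fmt.Println(pickReference("unknown/image:1", tags, digests))
}
```

Without the normalization a real reference parser performs, a short name such as `alpine` would not match `index.docker.io/library/alpine` here, which is one reason to parse references before comparing them.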
Add comprehensive documentation for ArtifactID field explaining:
- Container images: hash(ImageID + Registry + Repository)
- Repositories: hash(RepoURL + Commit) or hash(Path + Commit)
- Filesystems and other types: empty string

Extract gzip-aware file opening logic from pkg/fanal/image/docker.go
into a reusable utility package pkg/fanal/utils/gzip.

This improves code reusability and makes it easier to handle
gzip-compressed files consistently across the codebase.
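
A gzip-aware reader of the kind this commit describes can be sketched with the standard library alone. This is not the code in pkg/fanal/utils/gzip: `maybeGunzip` and `roundTrip` are hypothetical names, and the sketch works on any `io.Reader` (an `*os.File` would be passed the same way) by peeking at the two-byte gzip magic number.

```go
package main

import (
	"bufio"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// gzipMagic is the two-byte header that starts every gzip stream (RFC 1952).
var gzipMagic = []byte{0x1f, 0x8b}

// maybeGunzip returns a reader that decompresses r if it starts with the
// gzip magic number, and otherwise passes the bytes through unchanged.
func maybeGunzip(r io.Reader) (io.Reader, error) {
	br := bufio.NewReader(r)
	// Peek does not consume bytes; io.EOF only means the input is very short.
	head, err := br.Peek(len(gzipMagic))
	if err != nil && err != io.EOF {
		return nil, err
	}
	if !bytes.Equal(head, gzipMagic) {
		return br, nil
	}
	zr, err := gzip.NewReader(br)
	if err != nil {
		return nil, err
	}
	return zr, nil
}

// roundTrip is a demo helper: it optionally gzips payload, then reads it back
// through maybeGunzip.
func roundTrip(payload string, compress bool) (string, error) {
	var buf bytes.Buffer
	if compress {
		zw := gzip.NewWriter(&buf)
		if _, err := zw.Write([]byte(payload)); err != nil {
			return "", err
		}
		if err := zw.Close(); err != nil {
			return "", err
		}
	} else {
		buf.WriteString(payload)
	}
	r, err := maybeGunzip(&buf)
	if err != nil {
		return "", err
	}
	out, err := io.ReadAll(r)
	return string(out), err
}

func main() {
	for _, compress := range []bool{true, false} {
		got, err := roundTrip("layer data", compress)
		if err != nil {
			panic(err)
		}
		fmt.Printf("compressed=%v -> %q\n", compress, got)
	}
}
```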
… conflicts

When loading Docker images from tar archives in integration tests,
existing images with the same RepoTags can cause conflicts. This results
in the loaded image not having the expected RepoTags from the archive.

This commit:
- Adds ExtractRepoTagsFromArchive helper to extract RepoTags from
  tar archives before loading
- Removes existing images with matching RepoTags before docker load
- Updates override function to match ArtifactName and Target with
  golden file expectations

This ensures that the loaded image has the correct RepoTags from
the archive's manifest.json, which is necessary for proper Artifact ID
calculation.

Add ImageCleanLoad helper to internal/testutil/docker.go that removes
existing images with conflicting RepoTags before loading a new image
from archive. This prevents RepoTag conflicts in Docker Engine tests
that would affect Artifact ID calculation.

The helper extracts RepoTags from the archive manifest and removes any
existing images with the same tags before performing the load operation.
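
Reading RepoTags from an archive before loading it can be sketched as follows. It assumes an uncompressed archive in the `docker save` layout, where `manifest.json` is a JSON array whose entries carry a `RepoTags` field. `extractRepoTags` and `sampleArchive` are hypothetical names, not the helpers added in this PR, and the removal of conflicting images is not shown.

```go
package main

import (
	"archive/tar"
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// manifestEntry holds the only manifest.json field this sketch needs.
type manifestEntry struct {
	RepoTags []string `json:"RepoTags"`
}

// extractRepoTags scans a tar stream for manifest.json and collects RepoTags.
func extractRepoTags(r io.Reader) ([]string, error) {
	tr := tar.NewReader(r)
	for {
		hdr, err := tr.Next()
		if errors.Is(err, io.EOF) {
			return nil, errors.New("manifest.json not found in archive")
		}
		if err != nil {
			return nil, err
		}
		if hdr.Name != "manifest.json" {
			continue
		}
		var entries []manifestEntry
		if err := json.NewDecoder(tr).Decode(&entries); err != nil {
			return nil, err
		}
		var tags []string
		for _, e := range entries {
			tags = append(tags, e.RepoTags...)
		}
		return tags, nil
	}
}

// sampleArchive builds an in-memory tar holding only a manifest.json.
// It exists purely to make the example self-contained.
func sampleArchive(tags ...string) (io.Reader, error) {
	manifest, err := json.Marshal([]manifestEntry{{RepoTags: tags}})
	if err != nil {
		return nil, err
	}
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	hdr := &tar.Header{
		Name:     "manifest.json",
		Typeflag: tar.TypeReg,
		Mode:     0o644,
		Size:     int64(len(manifest)),
	}
	if err := tw.WriteHeader(hdr); err != nil {
		return nil, err
	}
	if _, err := tw.Write(manifest); err != nil {
		return nil, err
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return &buf, nil
}

func main() {
	archive, err := sampleArchive("alpine:3.10")
	if err != nil {
		panic(err)
	}
	tags, err := extractRepoTags(archive)
	if err != nil {
		panic(err)
	}
	fmt.Println(tags) // tags to remove from the daemon before loading
}
```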
Update integration test golden files to reflect the new Artifact ID
calculation that includes registry and repository information in
addition to the image ID.

Update unit test expectations to reflect the new Artifact ID calculation
that includes registry and repository information. Add Reference field
to test data where needed to ensure proper Artifact ID generation.

Update spring4shell module test golden files to reflect the new
Artifact ID calculation.

Exclude ArtifactID from comparison in TestRegistry because registry tests
use random ports (e.g., localhost:54321/alpine:3.10), which causes RepoTags
and the calculated Artifact ID to vary on each test run.
@knqyf263 knqyf263 force-pushed the feat/update-artifact-id-calculation branch from e13fd05 to 2f5bf2a Compare October 27, 2025 08:22
Development

Successfully merging this pull request may close these issues.

feat: include registry and repository in artifact ID calculation