
@wjordan wjordan commented Aug 14, 2025

While working on #977, I found that newly-pushed images were not immediately advertising their associated content, and learned that for containerd versions not supporting native content-create events, the logic to create OCIEvent from images was explicitly skipped for tag-referenced images:

// Pull by tag creates an event only for the tag. We dont get content to avoid advertising twice.
if img.Digest == "" {
	return []OCIEvent{{Type: CreateEvent, Key: e.GetName()}}, nil
}

This PR fixes the issue by resolving the image digest from the tag, which allows the existing logic to walk the image and create the associated content events.

(note: this fix works for my case, but I'm not 100% confident it will work in the general case since I didn't fully understand the existing comment about advertising twice. In my own tests, I only saw a single /images/create event when pulling an image by tag. My tests were on containerd v1.7.20 in case version-specific behavior is related.)

Resolve the image digest and walk image for all content.
@phillebaba
Member

phillebaba commented Aug 18, 2025

The code would benefit from more comments and explicit tests that show the expected behavior. When an image is pulled with Containerd, two image create events are created: one for the image tag reference and one for the digest. In Containerd 1.7 you would only see the image create events and none of the content create events.

2025-08-18 11:30:30.188671182 +0000 UTC k8s.io /content/create {"digest":"sha256:8feb4d8ca5354def3d8fce243717141ce31e2c428701f6682bd2fafe15388214","size":6688}
2025-08-18 11:30:30.385033783 +0000 UTC k8s.io /content/create {"digest":"sha256:c664f8f86ed5a386b0a340d981b8f81714e21a8b9c73f658c4bea56aa179d54a","size":424}
2025-08-18 11:30:31.104545898 +0000 UTC k8s.io /content/create {"digest":"sha256:b7bab04fd9aa0c771e5720bf0cc7cbf993fd6946645983d9096126e5af45d713","size":2297}
2025-08-18 11:30:31.108390749 +0000 UTC k8s.io /snapshot/prepare {"key":"extract-105403198-nNxn sha256:470b66ea5123c93b0d5606e4213bf9e47d3d426b640d32472e4ac213186c4bb6","snapshotter":"overlayfs"}
2025-08-18 11:30:35.12343038 +0000 UTC k8s.io /content/create {"digest":"sha256:13b7e930469f6d3575a320709035c6acf6f5485a76abcf03d1b92a64c09c2476","size":27510394}
2025-08-18 11:30:35.862241173 +0000 UTC k8s.io /snapshot/commit {"key":"extract-105403198-nNxn sha256:470b66ea5123c93b0d5606e4213bf9e47d3d426b640d32472e4ac213186c4bb6","name":"sha256:470b66ea5123c93b0d5606e4213bf9e47d3d426b640d32472e4ac213186c4bb6","snapshotter":"overlayfs"}
2025-08-18 11:30:35.863816772 +0000 UTC k8s.io /images/create {"name":"docker.io/library/ubuntu:20.04","labels":{"io.cri-containerd.image":"managed"}}
2025-08-18 11:30:35.866443706 +0000 UTC k8s.io /images/create {"name":"sha256:b7bab04fd9aa0c771e5720bf0cc7cbf993fd6946645983d9096126e5af45d713","labels":{"io.cri-containerd.image":"managed"}}
2025-08-18 11:30:35.868908736 +0000 UTC k8s.io /images/create {"name":"docker.io/library/ubuntu@sha256:8feb4d8ca5354def3d8fce243717141ce31e2c428701f6682bd2fafe15388214","labels":{"io.cri-containerd.image":"managed"}}

What we want to avoid is creating duplicate events for the same content, as advertising content has a cost associated with it. So when the image tag event arrives, we return immediately. For the image create event with the digest, if content create events are not supported, we walk the image and create events for all the layers. If we also did this for the image tag event, we would create duplicate events. With Containerd 2.1, image create events are only needed for the tag, as all of the digests already produce content create events.
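The rule described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Spegel's actual implementation: the `Image` type, the `convert` function, and the `walk` callback are hypothetical names introduced only for this example, and the layer key in `main` is a placeholder rather than a real digest.

```go
package main

import "fmt"

// Image is a hypothetical, simplified view of an image reference.
// Digest is empty when the reference is a tag.
type Image struct {
	Name   string
	Digest string
}

// convert returns the keys to advertise for one image create event.
// walk is a hypothetical callback that lists every content digest of the image.
func convert(img Image, contentEventsSupported bool, walk func(Image) []string) []string {
	// Tag event: advertise only the tag and return immediately.
	if img.Digest == "" {
		return []string{img.Name}
	}
	// Digest event on a runtime with content create events:
	// the content is already advertised, so nothing is added here.
	if contentEventsSupported {
		return nil
	}
	// Digest event without content create events: walk the image once.
	return walk(img)
}

func main() {
	// Assumed walker for illustration: the image digest plus a placeholder layer key.
	walk := func(img Image) []string {
		return []string{img.Digest, "layer-digest-placeholder"}
	}
	tag := Image{Name: "docker.io/library/ubuntu:20.04"}
	byDigest := Image{
		Name:   "docker.io/library/ubuntu@sha256:8feb4d8ca5354def3d8fce243717141ce31e2c428701f6682bd2fafe15388214",
		Digest: "sha256:8feb4d8ca5354def3d8fce243717141ce31e2c428701f6682bd2fafe15388214",
	}
	fmt.Println(convert(tag, false, walk))
	fmt.Println(convert(byDigest, false, walk))
	fmt.Println(convert(byDigest, true, walk))
}
```

Because only the digest event triggers the walk, each content digest is produced at most once per pull in this sketch, provided the runtime emits exactly one tag event and one digest event.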

How are you observing that newly created images are not being advertised?

@wjordan
Author

wjordan commented Aug 18, 2025

What happens when an image is pulled with Containerd is that two image create events are created. One event for the image tag reference and one for the digest.

How are you observing that newly created images are not being advertised?

I'm observing different behavior using containerd 1.7 and the ctr utility:

$ ctr --version
ctr containerd.io 1.7.27

Run ctr -n test events in one terminal, and ctr -n test content fetch docker.io/library/ubuntu:latest in another.
The only event observed is a single /images/create event for the tag reference:

2025-08-18 16:01:27.889154697 +0000 UTC test /images/create {"name":"docker.io/library/ubuntu:latest"}

Using ctr -n test images pull docker.io/library/ubuntu:latest adds /snapshot/prepare and /snapshot/commit events, but no additional /images events:

2025-08-18 16:08:33.556847584 +0000 UTC test /images/create {"name":"docker.io/library/ubuntu:latest"}
2025-08-18 16:08:33.607475503 +0000 UTC test /snapshot/prepare {"key":"extract-592093429-2vv9 sha256:cd9664b1462ea111a41bdadf65ce077582cdc77e28683a4f6996dd03afcc56f5","snapshotter":"overlayfs"}
2025-08-18 16:08:34.721013012 +0000 UTC test /snapshot/commit {"key":"extract-592093429-2vv9 sha256:cd9664b1462ea111a41bdadf65ce077582cdc77e28683a4f6996dd03afcc56f5","name":"sha256:cd9664b1462ea111a41bdadf65ce077582cdc77e28683a4f6996dd03afcc56f5","snapshotter":"overlayfs"}

Judging from the {"io.cri-containerd.image":"managed"} label in your output, the behavior you described appears to come from the CRI plugin. I can indeed reproduce the two additional /images/create events by using crictl pull instead of ctr:

2025-08-18 16:20:32.264168435 +0000 UTC k8s.io /snapshot/prepare {"key":"extract-260132239-AOPW sha256:cd9664b1462ea111a41bdadf65ce077582cdc77e28683a4f6996dd03afcc56f5","snapshotter":"overlayfs"}
2025-08-18 16:20:34.766342022 +0000 UTC k8s.io /snapshot/commit {"key":"extract-260132239-AOPW sha256:cd9664b1462ea111a41bdadf65ce077582cdc77e28683a4f6996dd03afcc56f5","name":"sha256:cd9664b1462ea111a41bdadf65ce077582cdc77e28683a4f6996dd03afcc56f5","snapshotter":"overlayfs"}
2025-08-18 16:20:34.76992965 +0000 UTC k8s.io /images/create {"name":"docker.io/library/ubuntu:latest","labels":{"io.cri-containerd.image":"managed"}}
2025-08-18 16:20:34.77216531 +0000 UTC k8s.io /images/create {"name":"sha256:e0f16e6366fef4e695b9f8788819849d265cde40eb84300c0147a6e5261d2750","labels":{"io.cri-containerd.image":"managed"}}
2025-08-18 16:20:34.775000554 +0000 UTC k8s.io /images/create {"name":"docker.io/library/ubuntu@sha256:7c06e91f61fa88c08cc74f7e1b7c69ae24910d745357e0dfe1d2c0322aaf20f9","labels":{"io.cri-containerd.image":"managed"}}

Indeed, the CRI implementation of PullImage specifically creates an image for each of the {imageID, repoTag, repoDigest} references, which explains that particular behavior.
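To make the three reference kinds concrete, here is a small sketch that classifies the `name` field of an /images/create event. The `RefKind` type and `classify` function are hypothetical names for this example only, and the check assumes sha256 digests, as seen in the logs above; it is not how containerd or Spegel parse references.

```go
package main

import (
	"fmt"
	"strings"
)

// RefKind is a hypothetical label for the three names the CRI plugin creates.
type RefKind string

const (
	ImageID    RefKind = "imageID"
	RepoTag    RefKind = "repoTag"
	RepoDigest RefKind = "repoDigest"
)

// classify guesses the reference kind from the event name.
// Assumption: digests always use the sha256 algorithm.
func classify(name string) RefKind {
	switch {
	case strings.HasPrefix(name, "sha256:"):
		return ImageID
	case strings.Contains(name, "@sha256:"):
		return RepoDigest
	default:
		return RepoTag
	}
}

func main() {
	// Names taken from the crictl pull log above.
	names := []string{
		"docker.io/library/ubuntu:latest",
		"sha256:e0f16e6366fef4e695b9f8788819849d265cde40eb84300c0147a6e5261d2750",
		"docker.io/library/ubuntu@sha256:7c06e91f61fa88c08cc74f7e1b7c69ae24910d745357e0dfe1d2c0322aaf20f9",
	}
	for _, n := range names {
		fmt.Println(classify(n), n)
	}
}
```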

@phillebaba
Member

Spegel is specifically developed to work with CRI and Containerd, as that is what the kubelet uses to pull images. In what use case do you need to use ctr to pull images together with Spegel?

@wjordan
Author

wjordan commented Aug 19, 2025

Spegel is specifically developed to work with CRI and Containerd, as that is what the kubelet uses to pull images.

Based on #829, I was under the impression that Kubernetes/CRI was not a fixed requirement, and I assumed that improving Spegel's compatibility for a use-case involving Containerd directly would be in scope of this project. If I was mistaken on this and using Spegel via Containerd's Kubernetes/CRI plugin will continue to be a requirement moving forward, feel free to close this compatibility-fix PR as being out of scope.

In what use case do you need to use ctr to pull images together with Spegel?

My use-case is a Go program that interacts directly with Containerd on the host to pull images from remote registries, with Spegel running directly on the host (as a systemd service), configured as a registry mirror. This program doesn't invoke ctr directly (I mentioned it only as a minimal reproducible example); it uses its own containerd.Client to pull and unpack/snapshot images.

@phillebaba
Member

#829 would still assume that interactions with Containerd would be done through CRI. I am not directly opposed to fixing the events to support pulling images through both ctr and CRI. This would however require expanding the unit tests to cover the following cases. The easiest method would be to take the raw JSON events and replay them through the event conversion function.

Each case should be tested twice: once when content events are supported and once when they are not.

  • CRI pull by tag
  • CRI pull by digest
  • CTR pull by tag
  • CTR pull by digest

The events produced should be assumed to be the same for both methods, without any duplicate digests. The main difference is that when pulling by tag, the tag is also included in the events.
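A replay-style check along these lines might look like the sketch below. It is only an outline under assumptions: `imageCreate`, `replay`, and `duplicates` are hypothetical helpers, and the `convert` argument stands in for whatever event conversion function is under test. The payloads in `main` are copied from the crictl log earlier in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// imageCreate mirrors the "name" field of an /images/create payload.
type imageCreate struct {
	Name string `json:"name"`
}

// replay decodes raw JSON payloads and feeds each name to convert,
// a stand-in for the conversion function under test.
func replay(raw []string, convert func(name string) []string) ([]string, error) {
	var keys []string
	for _, r := range raw {
		var e imageCreate
		if err := json.Unmarshal([]byte(r), &e); err != nil {
			return nil, fmt.Errorf("decode event: %w", err)
		}
		keys = append(keys, convert(e.Name)...)
	}
	return keys, nil
}

// duplicates returns each key that appears more than once, in first-repeat order.
func duplicates(keys []string) []string {
	seen := map[string]int{}
	var dups []string
	for _, k := range keys {
		seen[k]++
		if seen[k] == 2 {
			dups = append(dups, k)
		}
	}
	return dups
}

func main() {
	raw := []string{
		`{"name":"docker.io/library/ubuntu:latest","labels":{"io.cri-containerd.image":"managed"}}`,
		`{"name":"docker.io/library/ubuntu@sha256:7c06e91f61fa88c08cc74f7e1b7c69ae24910d745357e0dfe1d2c0322aaf20f9","labels":{"io.cri-containerd.image":"managed"}}`,
	}
	// Placeholder conversion: advertise each name unchanged.
	identity := func(name string) []string { return []string{name} }
	keys, err := replay(raw, identity)
	if err != nil {
		fmt.Println("replay failed:", err)
		return
	}
	fmt.Println("keys:", len(keys), "duplicates:", len(duplicates(keys)))
}
```

In a real test, each of the four cases listed above would supply its own recorded payloads and run once per content-event mode, asserting that `duplicates` returns nothing.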
