Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for storing build images in gitlab registries #966

Closed
wants to merge 1 commit into from

Conversation

rochaporto
Copy link
Contributor

@rochaporto rochaporto commented Sep 27, 2019

Add support for gitlab token_url requests and accept gitlab registry structures, where any number of levels in the image name is possible.

Example:
binder/images/build-:tag

Where binder is the group, images the project and the rest the image name and tag in the 'images' registry. Subgroups can be added creating additional path levels. The previous logic assumed two levels only, which matches the docker.io behavior.

This change implies the image_prefix will only contain the image name, not a repo dns name. So 'rochaporto/myimage' is ok, but not 'docker.io/rochaporto/myimage'. If needed i could look into some additional logic for this, but given urllib.parse expects a scheme it won't be direct.

Fixes #965

@betatim
Copy link
Member

betatim commented Sep 28, 2019

Thanks for digging into this. Every time I end up looking at this code I think we should refactor it a bit and add some tests because I find it hard to "run it in my head" to check the various scenarios of different registries (and then which cases we already handle via what kind of workaround...). So below some code that I think reproduces the values we'd see in a setup like mybinder.org:

In [1]: image_prefix = "gcr.io/binder-prod/r2d-f18835fd-"

In [3]: '{prefix}{build_slug}:{ref}'.format(prefix=image_prefix, build_slug="some-escaped-url", ref="1234")
Out[3]: 'gcr.io/binder-prod/r2d-f18835fd-some-escaped-url:1234'

In [5]: image_name = '{prefix}{build_slug}:{ref}'.format(prefix=image_prefix, build_slug="some-escaped-url", ref="1234")

In [6]: '/'.join(image_name.split('/')[-2:]).split(':', 1)
Out[6]: ['binder-prod/r2d-f18835fd-some-escaped-url', '1234']

In [7]: image_name.split(':', 1)
Out[7]: ['gcr.io/binder-prod/r2d-f18835fd-some-escaped-url', '1234']

This means your hunch that what is passed to get_image_manifest() is indeed just the name like binder-prod/r2d-f18835fd-some-escaped-url. But the new code doesn't produce the same result. I'm trying to remember why we aren't using image_name.split('/', 1)[1].split(':', 1).

I think this would work for the gcr.io case and also for the GitLab registry case.

binderhub/registry.py Outdated Show resolved Hide resolved
Accept any number of elements in the image prefix passed to
get_image_manifest. Add support for gitlab registries which are
per project and can have things like: <group>/<project>/<image>:<tag>.

Or even deeper structures if subgroups are used.

It implies only the real image part is being set in image_prefix, not
the host (which i assume is the case). So something like
rochaporto/myimage:mytag is ok, but not docker.io/rochaporto/myimage:mytag.
@rochaporto
Copy link
Contributor Author

Thanks for digging into this. Every time I end up looking at this code I think we should refactor it a bit and add some tests because I find it hard to "run it in my head" to check the various scenarios of different registries (and then which cases we already handle via what kind of workaround...). So below some code that I think reproduces the values we'd see in a setup like mybinder.org:

In [1]: image_prefix = "gcr.io/binder-prod/r2d-f18835fd-"

In [3]: '{prefix}{build_slug}:{ref}'.format(prefix=image_prefix, build_slug="some-escaped-url", ref="1234")
Out[3]: 'gcr.io/binder-prod/r2d-f18835fd-some-escaped-url:1234'

In [5]: image_name = '{prefix}{build_slug}:{ref}'.format(prefix=image_prefix, build_slug="some-escaped-url", ref="1234")

In [6]: '/'.join(image_name.split('/')[-2:]).split(':', 1)
Out[6]: ['binder-prod/r2d-f18835fd-some-escaped-url', '1234']

In [7]: image_name.split(':', 1)
Out[7]: ['gcr.io/binder-prod/r2d-f18835fd-some-escaped-url', '1234']

This means your hunch that what is passed to get_image_manifest() is indeed just the name like binder-prod/r2d-f18835fd-some-escaped-url. But the new code doesn't produce the same result. I'm trying to remember why we aren't using image_name.split('/', 1)[1].split(':', 1).

I think this would work for the gcr.io case and also for the GitLab registry case.

Do you have a set of inputs->outputs to be used for testing? I'm happy to extract this logic to a separate function and add a few tests.

@betatim
Copy link
Member

betatim commented Oct 6, 2019

Not really :( The best I can think of is configs of existing deployments. For them we could feed in the current config, capture the output and then create a test based on this not changing.

The one posted above is what we use for the GKE cluster of mybinder.org.

For the OVH cluster we use this config https://github.com/jupyterhub/mybinder.org-deploy/blob/c7197ee387048d20838de3c2ab70135f4a83107c/config/ovh.yaml#L9-L14

This is the config of the GESIS cluster https://github.com/gesiscss/orc/blob/aa81bd432330547f76ee8ee8c5fcaf3c8735051e/gesisbinder/config.yaml#L32 (I think it uses DockerHub as storage)

The Pangeo BinderHub uses gcr.io as well https://github.com/pangeo-data/pangeo-binder/blob/7276aae9bf6077761255ef1f41243e6d391bf827/deploy/prod.yaml#L10 (so config wise a copy&paste from mybinder.org)

@bdrian

This comment has been minimized.

@betatim
Copy link
Member

betatim commented Oct 22, 2019

@bdrian do you want to make a new PR with your suggested patch and some of the test cases? This PR has been quiet for a while so I think it would be fine to start a new one with fresh momentum (and a slightly different approach).

@bdrian
Copy link
Contributor

bdrian commented Oct 23, 2019

@betatim I'll submit a similar PR soon, just have to finish the local testing. The patch above is dumb on second thought.

@rochaporto
Copy link
Contributor Author

Closing this one in favor of #986

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for gitlab as build registry
3 participants