Discussion: Should we support vendor-specific cloud API libraries in BinderHub? #1623
Comments
The philosophy of BinderHub so far has been to be "vendor agnostic". I think most often this leads to/is interpreted as "lowest common denominator", use the stuff that works equally well (or badly) everywhere.
I'm not familiar with "ECR container repository". I quickly googled it and it suggested "container registry" to me. Setting up a container registry sounds like a one time/setup task, not an ongoing thing that BinderHub does while it is running. Could you explain a bit what you had in mind? For one time setup stuff I think we should describe it in the guide(s). The vendor specific guides are a good example of how they are valuable but also often out of date (which I think is the relevant thing for deciding about "vendor specific" code as well).
Over the last year or so I've become more and more convinced of (and attracted to) the idea that having a plugin system is a great idea. In this case BinderHub would allow plugins to change/augment/extend parts of its behaviour. The advantage of having a plugin system is that anyone (including core maintainers) can extend BinderHub without needing to consider all the things/permissions/consensus of including it in the core. I think it also allows for a lot of creativity, and some kind of "combinatorial explosion" of things your software can do (think iPhone w/o app store (no plugin system) vs iPhone with app store (plugin system)). Maybe something like you have in mind would be a good use case of a plugin system?
Of course creating the "host side" of a plugin system is work and the quality of plugins rises and falls with how well it is done. JupyterHub already has a kinda plugin system for spawners and authenticators, so there is precedent for this working well.
I think a plugin system would imply that you need to make your own binderhub image?! |
I think you're right that a lot of the great interface-defining work @manics and others have done is getting BinderHub to a level of maturity where it defines the interfaces, and implementations of non-default providers start moving to their own packages. But once you start breaking things up like that, it also starts to make sense to be doing more versioned releases to better communicate changes and compatibility at the API level.
Yes and no - we see this in z2jh: z2jh's default image ships with a common set of plugins (then are they really plugins?), but you can always add more / select versions in a custom image. We still have to decide what's in this default set and what's not, which is a pretty difficult line to draw as everyone asks for their Authenticator to be added so they don't need a custom image. I know a lot of supply chain folks bristle at the idea of install-at-runtime as a pattern, but I honestly think for plugin purposes that |
@betatim ECR (and some other container registries) don't support pushing to I've had a go at implementing the microservice model with Oracle Cloud's registry:
Example binderhub config stanza:
import json

from tornado import httpclient
from traitlets import Unicode

from binderhub.registry import DockerRegistry


class ExternalRegistryHelper(DockerRegistry):
    service_url = Unicode(
        "http://oracle-container-repositories-svc:8080",
        allow_none=False,
        help="The URL of the registry helper micro-service.",
        config=True,
    )

    auth_token = Unicode(
        "secret-token",
        help="The auth token to use when accessing the registry helper micro-service.",
        config=True,
    )

    async def get_image_manifest(self, image, tag):
        """
        If the container repository exists use the standard Docker Registry API
        to check for the image tag.
        Otherwise create the container repository.

        The full registry image URL has the form:
        CONTAINER_REGISTRY/OCIR_NAMESPACE/OCIR_IMAGE_NAME:TAG
        but the BinderHub image is OCIR_NAMESPACE/OCIR_IMAGE_NAME
        so we need to remove the OCIR_NAMESPACE component
        """
        client = httpclient.AsyncHTTPClient()
        # Drop the leading OCIR_NAMESPACE component
        image = image.split("/", 1)[1]
        repo_url = f"{self.service_url}/repo/{image}"
        headers = {"Authorization": f"Bearer {self.auth_token}"}

        self.log.debug(f"Checking whether repository exists: {repo_url}")
        try:
            repo = await client.fetch(repo_url, headers=headers)
            repo_exists = True
        except httpclient.HTTPError as e:
            # A 404 means the repository is missing; anything else is a real error
            if e.code == 404:
                repo_exists = False
            else:
                raise

        if repo_exists:
            repo_json = json.loads(repo.body.decode("utf-8"))
            self.log.debug(f"Repository exists: {repo_json}")
            return await super().get_image_manifest(image, tag)
        else:
            self.log.debug(f"Creating repository: {repo_url}")
            await client.fetch(repo_url, headers=headers, method="POST", body="")
            # A newly created repository cannot contain the image yet
            return None


c.BinderHub.registry_class = ExternalRegistryHelper
This only requires standard HTTP GET/POST calls and headers; the complex Oracle Cloud auth and API calls are hidden in the microservice. |
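For anyone wondering what the other side of that contract could look like, here is a rough stdlib-only sketch of a helper service that answers the same GET/POST `/repo/<name>` calls. It is not the real Oracle Cloud helper: the in-memory `REPOSITORIES` set stands in for the vendor API calls, and the `AUTH_TOKEN` value, the 201/401 status codes and the JSON payload shape are all assumptions made for illustration.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

AUTH_TOKEN = "secret-token"  # assumption: must match the client's auth_token
REPOSITORIES = set()  # in-memory stand-in for the vendor's registry API
_LOCK = threading.Lock()


class RepoHandler(BaseHTTPRequestHandler):
    def _reply(self, code, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def _repo_name(self):
        """Return <name> from /repo/<name>, or None after sending an error."""
        if self.headers.get("Authorization") != f"Bearer {AUTH_TOKEN}":
            self._reply(401, {"error": "unauthorized"})
            return None
        prefix = "/repo/"
        if not self.path.startswith(prefix) or len(self.path) == len(prefix):
            self._reply(404, {"error": "not found"})
            return None
        return self.path[len(prefix):]

    def do_GET(self):
        name = self._repo_name()
        if name is None:
            return
        with _LOCK:
            exists = name in REPOSITORIES
        if exists:
            self._reply(200, {"name": name})
        else:
            # The BinderHub-side class treats 404 as "repository missing"
            self._reply(404, {"error": "repository does not exist"})

    def do_POST(self):
        name = self._repo_name()
        if name is None:
            return
        # Consume any request body so the connection is left in a clean state
        length = int(self.headers.get("Content-Length") or 0)
        if length:
            self.rfile.read(length)
        # A real helper would call the cloud vendor's API here
        with _LOCK:
            REPOSITORIES.add(name)
        self._reply(201, {"name": name})

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(host="127.0.0.1", port=8080):
    return ThreadingHTTPServer((host, port), RepoHandler)
```

Calling `make_server().serve_forever()` starts it locally; swapping the set operations for vendor SDK calls is where the cloud-specific code would live, which keeps that dependency out of the BinderHub image.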
That looks nice. All the vendor specific stuff is in one place, and the way to extend BinderHub is also not too ugly. A downside is that creating what you created requires quite a bit of knowledge of how BinderHub works, so it is probably beyond the average user's skills. Can/should we bundle the microservice in BinderHub's repo to make it (and others like it) more discoverable? Have a repo tag that is used to create a list in the docs? |
I'm going to see if I can get ECR actually working. If it does then I think incorporating some of the work into BinderHub will be helpful to admins:
|
There's a few use-cases that benefit from using a cloud specific library to make API calls, e.g. using AWS boto3 to create an ECR container repository and to obtain a temporary read/write token #1055
Other public cloud registries may benefit from similar e.g. Oracle Cloud Infrastructure Registry (there's an autocreate option when pushing new images, but creating the repository in advance allows more control of things like auto-deletion), which requires the oci library.
There's probably others, either related to registries, or for other things like hooking into cloud notifications.
It's easy to have extras_require in setup.py, or to put the new Registry (for example) implementation in a separate Python package since it's configurable with Traitlets, but what should we include in the container image? Just the ones used by mybinder.org and encourage everyone else to re-build their BinderHub container? Or should we include all of them? Do we take a completely different route and make those vendor specific API calls via a separate container (going down the microservices route)?
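As a rough illustration of the ECR case mentioned above, the two boto3 operations involved could look something like the sketch below. The function names are made up for this example and nothing here is existing BinderHub code. boto3 is imported lazily, so the helpers can be exercised with a stand-in client when the library isn't installed.

```python
import base64


def ensure_ecr_repository(ecr_client, name):
    """Create the ECR repository `name` if missing; return True if it was created."""
    try:
        ecr_client.describe_repositories(repositoryNames=[name])
        return False
    except ecr_client.exceptions.RepositoryNotFoundException:
        ecr_client.create_repository(repositoryName=name)
        return True


def get_ecr_credentials(ecr_client):
    """Return (username, password, registry_url) from a temporary ECR token."""
    auth = ecr_client.get_authorization_token()["authorizationData"][0]
    # The token is base64("username:password")
    decoded = base64.b64decode(auth["authorizationToken"]).decode("utf-8")
    username, password = decoded.split(":", 1)
    return username, password, auth["proxyEndpoint"]


def make_ecr_client(region_name):
    """Build a real ECR client; requires boto3 and AWS credentials."""
    import boto3  # imported here so the module loads without boto3 installed

    return boto3.client("ecr", region_name=region_name)
```

Whether code like this lives in BinderHub itself, in an optional extra, or behind a separate service is exactly the packaging question raised here; the calls themselves are small, it's the dependency that has to go somewhere.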