Definition of the HostClaim resource #408

File changed: design/hostclaim-multitenancy-and-hybrid-clusters.md (307 additions, 0 deletions)
<!--
This work is licensed under a Creative Commons Attribution 3.0
Unported License.

http://creativecommons.org/licenses/by/3.0/legalcode
-->

# HostClaim: multi-tenancy and hybrid clusters

## Status

provisional

## Summary

We introduce a new Custom Resource (named HostClaim) which will facilitate
the creation of multiple clusters for different tenants. It also provides
a framework for building clusters with different kinds of compute resources:
bare-metal servers, but also virtual machines hosted in private or public clouds.

A HostClaim decouples the client's needs from the actual implementation of the
compute resource: it establishes a security boundary and provides a way to
migrate nodes between different kinds of compute resources.

A HostClaim expresses that one wants to start a given
OS image with an initial configuration (typically cloud-init or ignition
configuration files) on a compute resource that meets a set of requirements
(host selectors). These requirements can be interpreted either as labels
for a recyclable resource (such as a bare-metal server) or as characteristics
> **Comment (Member):** The scope of Metal3 is bare-metal servers; that makes all our infrastructure recyclable. We can have some disposable resources to simulate bare-metal, but we cannot treat them as disposable from Metal3 (BMO or CAPM3): all resources seen by CAPM3 are recyclable. If we want hybrid clusters, then this might be solved in a higher abstraction layer, or we manage the infrastructure in a lower layer and encapsulate the disposable resources under a layer that hides the disposable features from Metal3 and virtualizes them as recyclable (baremetal3).

> **Reply (Author):** The idea is not to change how BareMetalHost works, and the proposal does not modify the BMH controller. The idea is to split CAPM3 between a part that generates the workload to execute (with all the templating required to create a cloud-init) and the association part that binds the machine, by introducing an object in the middle. After what amounts to mostly code shuffling, you have a new resource, HostClaim, which is disposable and abstract enough. This is essentially code shuffling that introduces nice new capabilities, and the alternative approaches section states the limitations of the approach you mention.

for a disposable resource (such as a virtual machine created for the workload).
The status and metadata of the HostClaim provide the necessary information
for the end user to define and manage their workload on the compute resource,
but they do not grant full control over the resource (typically, BMC
credentials of servers are not exposed to the tenant).

## Motivation

So far, the primary use case of cluster-api-provider-metal3 is the creation of a
single target cluster from a temporary management cluster. The pivot process
transfers the resources describing the target cluster from the management
cluster to the target cluster. Once the pivot process is complete, the target
cluster takes over all the servers. It can scale based on its workload but it
cannot share its servers with other clusters.
> **Comment (Member):** I'm not sure we can say this is the primary use case. It is just the bootstrap process to get a "self-managed" management cluster. Once you have a management cluster (no matter whether it is self-managed or not), it can be used to create multiple other clusters. These can be created from BareMetalHosts in the same or other namespaces.
>
> What I think we should bring up here instead is the following: the Metal3 infrastructure provider for Cluster API, as of today, does not implement the multi-tenancy contract. While some hacks and workarounds exist (e.g. running multiple namespaced instances), they are far from ideal and do not address all use cases.

> **Reply (Author):** I am not sure the reference to the multi-tenancy contract is the right one, because this contract has been designed for disposable compute resources, not recyclable ones. With disposable compute resources, you have an API to create them, usually protected by credentials, and the problem is just to isolate each set of credentials and make it usable, for example, in a single namespace. With recyclable resources, the set of credentials is attached to a long-living object.
>
> You could trade tenant credentials to gain access to the real object. This is what was done in Kanod with the baremetalpool approach. It is powerful but also very complex, and we depended on Redfish.
>
> Here we rather hide the credentials and use the HostClaim controller to exchange control and information between the BareMetalHost in a protected namespace and a Host in the tenant namespace. It is true that we need limits on the binding performed:
>
> * the BareMetalHost should be able to limit the namespaces of the HostClaims that can be associated;
> * at some point, a mechanism to limit the number of compute resources a namespace can acquire is needed.
>
> I think the second point can be addressed later. It can be implemented through a quota mechanism (see https://gitlab.com/Orange-OpenSource/kanod/host-quota in the scope of Kanod).
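
For illustration, such a binding restriction could be expressed on the BareMetalHost side as sketched below; the ``hostClaimNamespaces`` field is hypothetical and not part of the current BareMetalHost API:

```yaml
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: server-42
  namespace: infra
spec:
  online: false
  bmc:
    address: redfish://10.0.0.42/redfish/v1/Systems/1
    credentialsName: server-42-bmc-secret
  # Hypothetical field: only HostClaims from these namespaces may be bound
  # to this host.
  hostClaimNamespaces:
    - tenant-a
    - tenant-b
```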

> **Comment (Author):** I have pushed a new version of the motivations. It also solves the previous remark, as I have removed the explicit mentions of Sylva and Das Schiff.

> **Reply (Member):**
>
> > I am not sure the reference to the multi-tenancy contract is the right one, because this contract has been designed for disposable compute resources, not recyclable ones. With disposable compute resources, you have an API to create them, usually protected by credentials, and the problem is just to isolate each set of credentials and make it usable, for example, in a single namespace. With recyclable resources, the set of credentials is attached to a long-living object.
>
> My goal would be to achieve the exact same process for both disposable and recyclable resources. The API for creating them is the Kubernetes API of the cluster holding the BareMetalHosts. The credentials are the credentials for accessing that API. In this way we achieve the same way of working as the other infrastructure providers, which simplifies a lot and also gives us multi-tenancy.
>
> The proposal for how to handle interaction between namespaces (with HostClaims in one namespace and BareMetalHosts in another) worries me. It is generally not a good idea to allow cross-namespace references. If we have to do this, I think it must be one way only. Then all BareMetalHosts would be in a special namespace and no namespace reference is needed from the HostClaim side. This would be similar to ClusterIssuers from cert-manager.
>
> If I understand correctly, the need to have the HostClaim in a separate namespace is so that we could keep it together with the Metal3Machine. I don't understand why this is needed though. Would it not make more sense to keep the HostClaim with the BareMetalHost and propagate the status back to the Metal3Machine? This is much closer to other providers (CAPA/CAPO, etc.). The user has no insight at all into the actual cloud resources outside of what is propagated to the InfrastructureMachine (unless they check it through a separate API). I think propagating more of the BareMetalHost status back to the Metal3Machine makes perfect sense and can hopefully simplify this a lot.

> **Reply (Author):**
>
> > Would it not make more sense to keep the HostClaim with the BareMetalHost and propagate the status back to the Metal3Machine?
>
> Except that if you do this, you have:
>
> * a Cluster API-only solution (i.e. I cannot use it for a workload that is not defining a node);
> * a solution that will only work with BMH and cannot be so easily extended to other resources (even if this can be challenged: see the other comment).


There is another model where a single management cluster is used to create and
manage several clusters across a set of bare-metal servers. This is the focus
of the [Sylva Project](https://sylvaproject.org/) of the Linux Foundation.
Another example is [Das Schiff](https://github.com/telekom/das-schiff).
<!-- cSpell:ignore Sylva Schiff -->

> **Comment (Member):** Could you please add these in the cSpell config instead?

One of the issues encountered today is that the compute resources
(BareMetalHost) and the cluster definition (Cluster, MachineDeployment,
Machines, Metal3Machines, etc.) must be in the same namespace. Since the goal
is to share the compute resources, this means that a single namespace is used
for all resources. Consequently, unless very complex access control
rules are defined, cluster administrators have visibility over all clusters
and full control over the servers as the credentials are stored in the same
namespace.

The solution so far is to completely proxy the access to the Kubernetes
resources that define the clusters.

Another unrelated problem is that Cluster-API has been designed
to define clusters using homogeneous compute resources: it is challenging to
define a cluster with both bare-metal servers and virtual machines in a private
or public cloud.
> **Comment (Member):** What's the reasoning the CAPI developers give for not supporting heterogeneous clusters? Are they aware of the use cases?

This [blog post](https://metal3.io/blog/2022/07/08/One_cluster_multiple_providers.html)
proposes several approaches but none is entirely satisfactory.

On the other hand, workloads can be easily defined in terms of OS images and
initial configurations and standards such as qcow or cloud-init have emerged
and are used by various infrastructure technologies.
Due to the unique challenges of managing bare-metal, the Metal3 project has
developed a set of abstractions and tools that could be used in different
settings. The main mechanism is the selection process that occurs between
the Metal3Machine and the BareMetalHost which assigns a disposable workload
> **Comment (Member):** Now that confuses me: "disposable" here is the higher-level Kubernetes node, the same as the CAPI Machine.

> **Reply (Author):** The OS image and the cloud-init created by CAPM3 define a workload for the underlying bare-metal server which has a single purpose: being a Kubernetes node of the target cluster. When the cluster is deleted or scaled down, this workload is removed from the server, which goes back to the ready state, so it is disposable.

(being a Kubernetes node) to a recyclable compute resource (a
server).

This proposal introduces a new resource called HostClaim that solves
both problems by decoupling the definition of the workload performed by
the Metal3 machine controller from the actual compute resource.
> **Comment (Member):** How can hiding the BMH protect the tenant? I think we are not considering the uniqueness of the bare-metal resource: hiding the BMH resource does not protect the real server. The uniqueness of bare-metal, AFAIK, is that you own the hardware; how can it be protected, since it can eventually access the infrastructure layer? Unless it is physically isolated, it cannot be protected if the hardware network is connected to other servers.

> **Reply (Author):** Do you mean access to the BMC? It must not be given to tenants, in the same way that they must not see the BMC credentials. An important note is that if you pivot a cluster with HostClaims, you are not pivoting Ironic (unless everybody is in the same namespace, and this is essentially for bootstrap). Because bootstrap is a special case, there is a dedicated scenario stating essentially that for bootstrap the old behaviour is kept, because everything is moved.
>
> Do you mean network isolation between servers? Maintaining it is outside the scope of this proposal. I think DT has a proposal where each cluster ensures its own isolation through "signed BGP" advertisement: somehow the newly booted server comes with credentials that determine which routes it will have access to and publish. In our integration project, we will probably try to make an evolution of the networking solution we had with Kanod baremetalpools: https://orange-opensource.gitlab.io/kanod/reference/kiab/network.html#kiab-network. But I think it is safe to say that it is outside the scope of this proposal.

This resource acts as both a security boundary and a way to hide the
implementation details of the compute resource.

### Goals

* Split responsibilities between infrastructure teams, who manage servers, and
> **Comment (Member):** The infrastructure team and cluster administrator roles need to be defined in a typical bare-metal scenario, as well as the user (tenant).

> **Reply (Author):** It should probably state more explicitly that the administrators of the target clusters are the tenants.

cluster administrators, who create/update/scale baremetal clusters deployed
on those servers, using traditional Kubernetes RBAC to ensure isolation.
* Provide a framework where cluster administrators can consume compute
resources that are not baremetal servers, as long as they offer similar APIs,
using the cluster-api-provider-metal3 to manage the life-cycle of those
resources.
* Define a resource where a user can request a compute resource to execute
an arbitrary workload described by an OS image and an initial configuration.
The user does not need to know exactly which resource is used and may not
have full control over this resource (typically no BMC access).

### Non-Goals

* How to implement HostClaim for specific compute resources that are not
BareMetalHost.
* Discovery of which capabilities are exposed by the cluster.
Which kinds of compute resources are available and the semantics of the
selectors are not handled.
* Compute resource quotas. The HostClaim resource should make it possible to
develop a framework to limit the number/size of compute resources allocated
to a tenant, similar to how quotas work for pods. However, the specification
of such a framework will be addressed in another design document.
* Pivoting client clusters resources (managed clusters that are not the
initial cluster).

## Proposal

### User Stories

#### As a user I would like to execute a workload on an arbitrary server

The OS image is available in qcow format on a remote server at ``url_image``.
It supports cloud-init and a script can launch the workload at boot time
(e.g., a systemd service).

The cluster offers bare-metal as a service using Metal3 baremetal-operator.
However, as a regular user, I am not allowed to directly access the definitions
of the servers. All servers are labeled with an ``infra-kind`` label whose
value depends on the characteristics of the computer.

* I create a resource with the following content:

```yaml
apiVersion: metal3.io/v1alpha1
kind: HostClaim
metadata:
  name: my-host
spec:
  online: false
  kind: baremetal
  hostSelector:
    matchLabels:
      infra-kind: medium
```

* After a while, the system associates the claim with a real server, and
the resource's status is populated with the following information:

```yaml
status:
  addresses:
  - address: 192.168.133.33
    type: InternalIP
  - address: fe80::6be8:1f93:7f65:59cf%ens3
    type: InternalIP
  - address: localhost.localdomain
    type: Hostname
  - address: localhost.localdomain
    type: InternalDNS
  bootMACAddress: "52:54:00:01:00:05"
  conditions:
  - lastTransitionTime: "2024-03-29T14:33:19Z"
    status: "True"
    type: Ready
  - lastTransitionTime: "2024-03-29T14:33:19Z"
    status: "True"
    type: AssociateBMH
  lastUpdated: "2024-03-29T14:33:19Z"
  nics:
  - MAC: "52:54:00:01:00:05"
    ip: 192.168.133.33
    name: ens3
```

* I also examine the annotations and labels of the HostClaim resource. They
have been enriched with information from the BareMetalHost resource.
* I create three secrets named ``my-user-data``, ``my-meta-data``, and
  ``my-network-data`` in the same namespace. I use the information from the
  status and metadata to customize the scripts they contain (a sketch of such
  a secret is shown after this list).
* I modify the HostClaim to point to those secrets and start the server:

```yaml
apiVersion: metal3.io/v1alpha1
kind: HostClaim
metadata:
  name: my-host
spec:
  online: true
  image:
    checksum: https://url_image.qcow2.md5
    url: https://url_image.qcow2
    format: qcow2
  userData:
    name: my-user-data
  networkData:
    name: my-network-data
  kind: baremetal
  hostSelector:
    matchLabels:
      infra-kind: medium
```

* The workload is launched. When the machine is fully provisioned, the boolean
  field ``ready`` in the status becomes true. I can stop the server by changing
  the ``online`` field. I can also perform a reboot by targeting specific
  annotations in the reserved ``host.metal3.io`` domain.
* When I destroy the host, the association is broken and another user can take
over the server.
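
For illustration, the ``my-user-data`` secret could look like the sketch below. The ``userData`` key and the cloud-config payload are assumptions; the exact secret format expected by the HostClaim controller is not specified here.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-user-data
type: Opaque
stringData:
  # Assumed key name; the HostClaim controller defines the actual contract.
  userData: |
    #cloud-config
    hostname: my-host
    runcmd:
      # Hypothetical systemd service that launches the workload at boot time.
      - systemctl enable --now my-workload.service
```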

#### As an infrastructure administrator I would like to host several isolated clusters

All the servers in the data-center are registered as BareMetalHosts in one or
several namespaces under the control of the infrastructure manager. A namespace
is created for each tenant of the infrastructure, and the tenants create
standard cluster definitions in those namespaces. The only difference from
standard bare-metal cluster definitions is the presence of a ``kind`` field in
the Metal3Machine templates, whose value is set to ``baremetal``, as in the
sketch below.
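
A minimal sketch of such a template; the names are illustrative, and the field layout follows the simplified form used in the hybrid-cluster example later in this document:

```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3MachineTemplate
metadata:
  name: tenant-a-workers            # illustrative name
  namespace: tenant-a               # the tenant namespace holding the cluster definition
spec:
  kind: baremetal                   # the new field introduced by this proposal
  dataTemplate:
    name: tenant-a-workers-metadata
  hostSelector:
    matchLabels:
      infra-kind: medium
  image:
    checksum: https://example.com/image.qcow2.md5
    format: qcow2
    url: https://example.com/image.qcow2
```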

When the cluster is started, a HostClaim is created for each Metal3Machine
associated with the cluster. The ``hostSelector`` and ``kind`` fields are
inherited from the Metal3Machine and are used to select the BareMetalHost
bound to the claim. The associated BareMetalHost is not in the same
namespace as the HostClaim. The exact definition of the BareMetalHost remains
> **Comment (Member):** This is something we should clarify, I think. How should the BareMetalHost be picked? Across all namespaces? Should the HostClaim specify where they are? We should be able to have the BareMetalHosts in a completely different cluster than the Metal3Machines, IMO. Where should the HostClaims be then?
>
> My idea would be to add an identityRef to the Metal3Cluster. It references a Secret with credentials that would be used for accessing the BareMetalHosts or HostClaims. The same identityRef field could be added to the Metal3MachineTemplate also. These could be propagated to the HostClaims so we get something like this:
>
> ```yaml
> apiVersion: metal3.io/v1alpha1
> kind: HostClaim
> metadata:
>   name: my-host
> spec:
>   identityRef:
>     name: my-credentials
> ```
>
> Then the HostClaim can be in the same namespace as the Metal3Machine. However, it means that the user must be in possession of credentials that can access and modify the BareMetalHosts! This is because the controller (CAPM3) would need to use the credentials of the HostClaim to reach out to the BareMetalHost and modify it.
>
> The other option is that the HostClaim itself lives next to the BareMetalHost. We then set the identityRef on the Metal3Cluster and Metal3MachineTemplate. Then CAPM3 creates HostClaims with the help of the provided credentials. It does not require access to modify the BareMetalHosts directly. This option seems more reasonable to me, but it has the drawback that the user cannot easily inspect the HostClaim.

> **Reply (Author):** This is another approach. In your approach, a HostClaim represents a right to use a specific host and is the private part in the BMH realm. It is associated with a set of credentials representing the cluster. Then you will need a mechanism to create new HostClaims. I believe that your proposal is not so far from the Kanod BareMetalPool approach, except that you never give access to the real credentials of the BMH: you only expose the standard API offered by the BMH.
>
> I think there are different objectives that can be addressed in different ways in both solutions:
>
> * Having the BareMetalHosts in a different cluster. It is a legitimate need, as hardware is often handled by different teams, but I did not put it in the objectives of this document. It can be addressed later, as experimented in Kanod. We use the capability of HostClaims to target different kinds of compute resources (hybrid) to implement virtual HostClaims synchronized with a remote HostClaim in another cluster (https://gitlab.com/Orange-OpenSource/kanod/host-remote-operator). There is an equivalent of the identityRef in that solution. You have HostClaims on both sides and they must be synchronized: the HostClaim in the cluster namespace has its spec copied to the remote HostClaim and inherits the status from the remote, and yes, the contract is tricky on metadata (labels and annotations).
> * Ensuring multi-tenancy isolation. This is done by the fact that HostClaims are not in the same namespace as BareMetalHosts and only offer an API with the credentials hidden. The user and the Metal3Machine controller can completely inspect the HostClaim object. When there is a remote HostClaim, the one on the BMH cluster could be in the same namespace, but it may be harder to avoid name collisions between objects, and if we want to use quotas, namespaces are also useful to distinguish tenants.
>
> Your solution will not address the first use case (use without CAPM3) because your version of HostClaim is invisible to the end user and you rely on the CAPM3 controller to talk directly with the service hiding the BareMetalHost.

> **Comment (Author):** @lentzi90 I think the main point now is to decide whether having BMHs in different clusters is a goal that needs a scenario, or a non-goal. If it is a goal, then I also need to bring the remote HostClaim into the scope of this document; otherwise we keep it as a non-goal and mention that it will be addressed later.

> **Reply (Member):**
>
> > We use the capability of HostClaims to target different kinds of compute resources (hybrid) to implement virtual HostClaims synchronized with a remote HostClaim in another cluster.
>
> This does not make sense to me and sounds too complicated. Why not just propagate the status back to the Metal3Machine instead? Then there is no need for the remote HostClaim.
>
> > Your solution will not address the first use case (use without CAPM3) because your version of HostClaim is invisible to the end user and you rely on the CAPM3 controller to talk directly with the service hiding the BareMetalHost.
>
> Can you explain this a bit more? The way I see it, we can propagate the status of the HostClaim back to the Metal3Machine. Then the user will get the information. If the user wants to manually handle HostClaims and skip CAPM3, then they already have access to the HostClaim and can see the status there.
>
> > I think the main point now is to decide whether having BMHs in different clusters is a goal that needs a scenario, or a non-goal.
>
> I think it must be a goal because it is ultimately a consequence of multi-tenancy. It is possible to implement it so that it only works in one cluster, but it should be trivial to support a remote cluster if the local cluster already works, and then I do not see a reason to limit it.

> **Reply (Author, @pierrecregut, Apr 17, 2024):** First, HostClaim as I defined it is very similar to Metal3Machine. In fact, the only difference in the spec is providerID (only in the Metal3Machine) and online (only in the Host), but in practice it is a little more than that, as there are a few things handled through annotations that are specific to Cluster API in the Metal3Machine controller. HostClaims are a way to decouple what is specific to Cluster API from what is generic to a compute resource.
>
> Multi-tenancy for BareMetalHost is not specific to use with Cluster API. I would like to be able to launch workloads on servers that are shared with other users of the infrastructure, even if the workload is not being a node of a cluster. So I think that the implementation of multi-tenancy should be between HostClaim and BareMetalHost, so in the HostClaim controller.
>
> To just implement multi-tenancy in the same cluster, I just need to make sure that I hide the credentials but provide sufficient feedback on the BareMetalHost. If I want to use it without CAPM3, this feedback cannot be propagated to the Metal3Machine, which does not exist in that scenario.
>
> I can extend the solution to work with BareMetalHosts on a remote cluster. As in your proposal, it involves having a set of credentials for a remote API that gives access to the BMH. Again, I could implement it directly in the HostClaim controller, but if I want to support hybrid clusters and if I need the same mechanism to access KubeVirt VMs on a remote cluster, then I prefer an implementation that lets me control any kind of HostClaim on a remote cluster. That is exactly what remote HostClaims do.
>
> Your solution trivially supports remote clusters because it uses the mechanisms to access a remote cluster even when it is not needed.
>
> On the other hand, we could also decide that the real controllers are always implemented by a service implementing the HostClaim API. The notion of kind we have could disappear: you would simply target another endpoint. Technically this would be almost the same implementation, but the HostClaim on the remote cluster would only exist "on the wire" and not as a resource. This means that the service implementation must be able to rebuild the model (internally you need it) from the compute resource only (through labels and annotations added to mark how they are associated to the source HostClaim resources).
>
> This is probably closer to what you had in mind, but the resource would still be in the cluster (tenant) namespace, not in the BareMetalHost namespace.
>
> The only drawback from my point of view is the complexity of this server when it is not needed (single cluster). Especially with metadata, the flow of information between clusters is not easy, and having a single implementation may ensure more coherence. We also lose the ability to implement quotas for HostClaims, as this must be done on a resource existing on the target cluster, not on the tenant cluster.

hidden from the cluster user, but parts of its status and metadata are copied
back to the HostClaim namespace. With this information,
the data template controller has enough details to compute the different
secrets (userData, metaData and networkData) associated with the Metal3Machine.
Those secrets are linked to the HostClaim and, ultimately, to the
BareMetalHost.

When the cluster is modified, new Machine and Metal3Machine resources replace
the previous ones. The HostClaims follow the life-cycle of the Metal3Machines
and are destroyed unless they are tagged for node reuse. The BareMetalHosts are
recycled and are bound to new HostClaims, potentially belonging to other
clusters.

#### As a cluster administrator I would like to build a cluster with different kinds of nodes

This scenario assumes that:

* the cloud technologies CT_i use qcow images and cloud-init to
define workloads.
* Clouds C_i implementing CT_i are accessible through
credentials and endpoints described in a resource Cr_i.
* a HostClaim controller exists for each CT_i. Compute resources can
be parameterized through arguments arg_ij in the HostClaim.

One can build a cluster where each machine deployment MD_i
contains Metal3Machine templates referring to kind CT_i.
The arguments identify the credentials to use (Cr_i):

```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3MachineTemplate
metadata:
  name: md-i
spec:
  dataTemplate:
    name: dt-i
  kind: CT_i
  args:
    arg_i1: v1
    ...
    arg_ik: vk
  hostSelector:
    matchLabels:
      ...
  image:
    checksum: https://image_url.qcow2.md5
    format: qcow2
    url: https://image_url.qcow2
```

The Metal3Machine controllers will create HostClaims with different kinds
handled by different controllers creating the different compute resources.
Connectivity must be established between the different subnets where each
controller creates its compute resources.

The argument extension is not strictly necessary but is cleaner than using
matchLabels in specific domains to convey information to controllers.
Controllers for disposable resources such as virtual machines typically do not
use hostSelectors. Controllers for a "bare-metal as a service" offering
may use selectors.
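
For illustration, the HostClaim created by the Metal3Machine controller for a machine of deployment MD_i might look like the sketch below; the resource name and the generated user-data secret name are hypothetical:

```yaml
apiVersion: metal3.io/v1alpha1
kind: HostClaim
metadata:
  name: md-i-worker-0              # hypothetical name derived from the Metal3Machine
spec:
  online: true
  kind: CT_i
  args:
    arg_i1: v1
    arg_ik: vk
  image:
    checksum: https://image_url.qcow2.md5
    format: qcow2
    url: https://image_url.qcow2
  userData:
    name: md-i-worker-0-user-data  # secret rendered by the data template controller
```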

#### As a cluster administrator I would like to install a new baremetal cluster from a transient cluster

The bootstrap process can be performed as usual from an ephemeral cluster
(e.g., a KinD cluster). The constraint that all resources must be in the same
namespace (Cluster and BareMetalHost resources) must be respected. The
BareMetalHost should be marked as movable.

The only difference from the behavior without HostClaims is the presence of an
intermediate HostClaim resource, but the chain of resources is kept during the
transfer and the pause annotation is used to stop Ironic.

Because this operation is only performed by the administrator of a cluster
manager, the fact that the cluster definition and the BareMetalHosts are in
the same namespace should not be an issue.

The tenant clusters cannot be pivoted, which can be expected from a security
point of view, as it would give the bare-metal server credentials to the
tenants. Partial pivot can be achieved with the help of HostClaims replicated
on other clusters, but the specification of the associated
controller is beyond the scope of this document.

## Design Details

TBD.