When trying to build a compute node image for a CitC instance deployed in a new project on Bristol Digital Labs prototype system's OpenStack, I found Packer was unable to complete the image build:
    [citc@mgmt ~]$ sudo /usr/local/bin/run-packer
    openstack.openstack: output will be in this color.

    ==> openstack.openstack: Loading flavor: m1.small
        openstack.openstack: Verified flavor. ID: 1d816549-b9ad-47d4-9139-218bfc22681f
    ==> openstack.openstack: Creating temporary keypair: packer_6619b72d-0884-3bb9-1dbc-800d6e2ddb50 ...
    ==> openstack.openstack: Created temporary keypair: packer_6619b72d-0884-3bb9-1dbc-800d6e2ddb50
        openstack.openstack: Found Image ID: b9e4cc7a-ed29-4a15-807b-dc80cdbd9983
    ==> openstack.openstack: Launching server...
    ==> openstack.openstack: Launching server...
        openstack.openstack: Server ID: c8d3dfc3-6a36-4b4f-b2ea-d93ee3c0bf88
    ==> openstack.openstack: Waiting for server to become ready...
        openstack.openstack: Floating IP not required
    ==> openstack.openstack: Using SSH communicator to connect: 10.0.1.96
    ==> openstack.openstack: Waiting for SSH to become available...
    ==> openstack.openstack: Timeout waiting for SSH.
    ==> openstack.openstack: Terminating the source server: c8d3dfc3-6a36-4b4f-b2ea-d93ee3c0bf88 ...
    ==> openstack.openstack: Deleting temporary keypair: packer_6619b72d-0884-3bb9-1dbc-800d6e2ddb50 ...
    Build 'openstack.openstack' errored after 5 minutes 13 seconds: Timeout waiting for SSH.
The instance was created, but Packer was unable to connect to it over SSH, so the image build failed.
Inspecting the OpenStack instance details, I found that the instance was created with the "default" security group:
    % openstack server show c8ecea93-12c3-4c3e-bcb5-c02f37f9da6c -c hostname -c flavor -c image -c security_groups
    +-----------------+--------------------------------------------------+
    | Field           | Value                                            |
    +-----------------+--------------------------------------------------+
    | flavor          | m1.small (m1.small)                              |
    | hostname        | packer-one-airedale-v1712962160                  |
    | image           | Rocky-8.8 (b9e4cc7a-ed29-4a15-807b-dc80cdbd9983) |
    | security_groups | name='default'                                   |
    |                 | name='default'                                   |
    +-----------------+--------------------------------------------------+
On inspection, I found that the default security group for this project did not have a rule allowing SSH ingress from the mgmt instance, and in general we cannot rely on such a rule being present.
I believe the reason why this problem has not arisen before on the Bristol Digital Labs prototype systems is that previous CitC deployments have been in a project where the default security group had a rule added which allowed ingress from any IP on 22/TCP.
To work around this issue, I modified the local clone of this repository to specify that the image build instance should use the cluster-one-airedale security group created for this CitC instance.
After re-running Ansible for the mgmt instance, I was able to successfully run Packer and build a new compute node image.
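For reference, a minimal sketch of what such a workaround could look like in the Packer config. This is not the exact diff I applied: it assumes the build is defined in a `source "openstack" "openstack"` block (consistent with the `openstack.openstack` prefix in the log above) and uses the OpenStack builder's `security_groups` option with the group name hard-coded.

```hcl
# Fragment only; the existing settings in the source block are left as they are.
source "openstack" "openstack" {
  # ... existing flavor, image, network and SSH settings unchanged ...

  # Attach the cluster's own security group to the build instance instead of
  # relying on whatever rules the project's "default" group happens to have.
  security_groups = ["cluster-one-airedale"]
}
```

Hard-coding the name only works for this one cluster, which is why a variable would be the better long-term fix.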
I think that this change could be implemented more generally by modifying roles/packer/files/all.pkr.hcl in this repo to use security groups specified as variables in roles/packer/templates/variables.pkrvars.hcl.j2.
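A rough sketch of how that might look, assuming a new list variable. The variable name `security_groups` and its default are my own choices for illustration and are not taken from the repository.

```hcl
# roles/packer/files/all.pkr.hcl (fragment)

# Hypothetical variable. An empty default is intended to keep the current
# behaviour (OpenStack applying the project's "default" group) when no value
# is supplied; this should be checked before relying on it.
variable "security_groups" {
  type    = list(string)
  default = []
}

source "openstack" "openstack" {
  # ... existing settings unchanged ...
  security_groups = var.security_groups
}
```

The rendered variables file would then set the value for each cluster. For this deployment it would need to come out as:

```hcl
# Rendered output of roles/packer/templates/variables.pkrvars.hcl.j2 (fragment).
# How the template obtains the group name from Ansible is left open here.
security_groups = ["cluster-one-airedale"]
```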
Possibly also of interest: the Packer build in this case seems to connect over SSH to an IP on the Ceph network, rather than the cluster network. Security groups specified in the Packer config appear to be applied to all ports on the instance, so this does not prevent the image build from succeeding.