
Validation of Ceph cluster fails due to Unexpected playbook failure. Check ansible-runner-service directory #85

Open
aasraoui opened this issue Jan 8, 2021 · 7 comments

aasraoui commented Jan 8, 2021

Ceph Installer - Cockpit-ceph-installer.pdf

pcuzner commented Jan 11, 2021

Could you drop a screenshot into the issue instead of a PDF, please? PDFs don't render, and they could be mangled to do nasty stuff.

Until then, some basic checks:

  • I think current ceph-ansible has a validate role which requires ansible 2.9 ... is that in place?
  • Are you using master of the installer?
  • Playbooks write to /usr/share/ansible-runner-service/artifacts, so if you hit an error you can pick the playbook's run directory out of the error message and look at the stdout file in that folder (this will be the regular ansible output from the playbook run - unless things have gone really wrong!). See the sketch below.
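
A minimal sketch of that check (the <run-uuid> placeholder is whatever directory name shows up in your error message):

# each playbook run gets its own directory under artifacts/
ls /usr/share/ansible-runner-service/artifacts/
# the plain-text ansible output for a run lives in its stdout file
cat /usr/share/ansible-runner-service/artifacts/<run-uuid>/stdout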

@aasraoui

Below is a capture of the stdout log:

Identity added: /usr/share/ansible-runner-service/artifacts/6e68835e-51b8-11eb-8c06-080027191e45/ssh_key_data (/usr/share/ansible-runner-service/artifacts/6e68835e-51b8-11eb-8c06-080027191e45/ssh_key_data)
[WARNING]: log file at /root/ansible/ansible.log is not writeable and we cannot create it, aborting

PLAY [Validate hosts against desired cluster state] ****************************

TASK [CEPH_CHECK_ROLE] *********************************************************
Friday 08 January 2021  13:50:18 +0000 (0:00:00.274)       0:00:00.274 ********
ok: [Metrics]
ok: [Rgw]
ok: [Mds]
ok: [Osd]
[WARNING]: Unhandled error in Python interpreter discovery for host Mon:
Failed to connect to the host via ssh: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
fatal: [Mon]: UNREACHABLE! => {"changed": false, "msg": "Data could not be sent to remote host \"Mon\". Make sure this host can be reached over ssh: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).\r\n", "unreachable": true}

PLAY RECAP *********************************************************************
Mds     : ok=1  changed=0  unreachable=0  failed=0  skipped=0  rescued=0  ignored=0
Metrics : ok=1  changed=0  unreachable=0  failed=0  skipped=0  rescued=0  ignored=0
Mon     : ok=0  changed=0  unreachable=1  failed=0  skipped=0  rescued=0  ignored=0
Osd     : ok=1  changed=0  unreachable=0  failed=0  skipped=0  rescued=0  ignored=0
Rgw     : ok=1  changed=0  unreachable=0  failed=0  skipped=0  rescued=0  ignored=0

Friday 08 January 2021  13:56:20 +0000 (0:06:01.955)       0:06:02.230 ********

CEPH_CHECK_ROLE ------------------------------------------------------- 361.95s


aasraoui commented Jan 11, 2021

I can ssh to the Mon node; not sure why it is not reachable!
[root@Cockpit-ceph-installer ceph-ansible]# ssh Mon
Last login: Sun Jan 10 20:27:27 2021 from 10.0.0.113
[root@Mon ~]#

pcuzner commented Jan 11, 2021

What's strange is that you added the host first. The act of adding a host confirms that the ssh key the installer uses is in the authorized_keys file on the target. So at some point 'mon' was accessible using the installer's public key, but right now it doesn't appear to be. Checking with a root login to mon is misleading, since the installer uses its own key - unless you provided your keys to the installer.

Next steps (see the sketch below):
  • compare your authorized_keys file on mon to one of the osd or rgw hosts
  • try connecting manually using the private key in /usr/share/ansible-runner-service/env/ssh_key (i.e. use -i /usr/share/ansible-runner-service/env/ssh_key)
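
A sketch of both checks, assuming root ssh from the installer host works against the other nodes:

# 1. compare the installer's key entry on a healthy node against mon
diff <(ssh root@Osd cat /root/.ssh/authorized_keys) <(ssh root@Mon cat /root/.ssh/authorized_keys)

# 2. connect with the installer's own private key
ssh -i /usr/share/ansible-runner-service/env/ssh_key root@Mon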

@aasraoui

The authorized_keys on the Osd node is different from the Mon node; a manual connection to Mon with the private key works:
[root@Cockpit-ceph-installer .ssh]# ssh root@Mon -i /usr/share/ansible-runner-service/env/ssh_key
root@mon's password:
Last login: Mon Jan 11 04:35:12 2021 from 10.0.0.113
[root@Mon ~]#

aasraoui commented Jan 12, 2021

I have updated the Mon node with the same authorized key as the other nodes; now validation is failing because there are no OSDs on the cluster!
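
For reference, one way to push the installer's public key onto a node (a sketch - the ssh_key.pub path is an assumption that the public half sits next to the private key):

ssh-copy-id -f -i /usr/share/ansible-runner-service/env/ssh_key.pub root@Mon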

[screenshot: host roles selected in the installer]

pcuzner commented Jan 12, 2021

And the problem is?

The installer expects you to have nodes with disks for OSDs, so the osd role can be applied to them. Looking at your screenshot, you've ticked the osd role too, so from my perspective this is working as expected.

For a storage cluster, you need storage.

Also, just for awareness: when you see errors and warnings, clicking the triangle icon expands the row to show the error text.

If you're just kicking the tyres, you could use just 2 machines - one for Ceph and the other for monitoring. Just make sure you have free disks on the node you want to deploy Ceph to (a quick check is sketched below), and use the container mode deployment (not rpm).
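
A quick way to confirm a node has free disks (a sketch; device names will vary):

lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT
# a disk with no partitions, filesystem, or mountpoint is a candidate OSD device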
