Cluster deployed with already expired token #299

Open
mtangaro opened this issue Nov 5, 2018 · 3 comments

@mtangaro

mtangaro commented Nov 5, 2018

Dear experts,
When deploying an elastic cluster (with SLURM as the resource manager and Galaxy as the workflow manager), CLUES sometimes cannot contact the orchestrator to deploy nodes, because the token injected by the orchestrator has already expired.

Indeed, from the information in the token you can see:

"iss": "https://iam.recas.ba.infn.it/",
"exp": 1541413974,
"iat": 1541410374,

corresponding to:

iat: Monday 5 November 2018 10:32:54 (CET, i.e. 09:32:54 UTC)
exp: Monday 5 November 2018 11:32:54 (CET, i.e. 10:32:54 UTC)
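
For reference, the conversion of the two claims can be reproduced with a few lines of Python. This is only a minimal sketch: the `claims` dictionary and the `to_utc` helper are illustrative names, not part of CLUES or the orchestrator. The output is in UTC, so the times quoted above correspond to UTC+1 (CET).

```python
from datetime import datetime, timezone

# Claims copied from the token shown above; the dict itself is illustrative.
claims = {"iat": 1541410374, "exp": 1541413974}

def to_utc(ts):
    """Convert a Unix timestamp (seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)

issued = to_utc(claims["iat"])
expires = to_utc(claims["exp"])
lifetime = claims["exp"] - claims["iat"]  # token lifetime in seconds

print(issued.isoformat())   # 2018-11-05T09:32:54+00:00
print(expires.isoformat())  # 2018-11-05T10:32:54+00:00
print(lifetime)             # 3600
```

The token therefore has a lifetime of exactly one hour from the moment it is issued.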

The CLUES log:
[PLUGIN-INDIGO-ORCHESTRATOR];ERROR;2018-11-05 11:40:23,067;ERROR getting deployment info: {"code":401,"title":"Unauthorized","message":"Invalid token: ***TOKEN***"}
[PLUGIN-INDIGO-ORCHESTRATOR];WARNING;2018-11-05 11:40:23,067;No resources obtained from orchestrator.
[PLUGIN-INDIGO-ORCHESTRATOR];DEBUG;2018-11-05 11:40:30,674;The access token is valid for -4056 seconds.
[PLUGIN-INDIGO-ORCHESTRATOR];ERROR;2018-11-05 11:40:30,674;Error refreshing access token: No client info provided.

This is not rare, and users who are not INDIGO experts cannot recover the cluster.

@mtangaro mtangaro added the bug label Nov 5, 2018
@alberto-brigandi
Member

Hi Marco,
The token is not expired at infrastructure creation time (otherwise the creation request to IM would fail).
It probably reaches its expiration time because the Ansible roles run tasks that take a long time to complete before the token is used.
Could you please give me an estimate of how long the Ansible roles take to reach the point where the injected access token is used?

@mtangaro
Author

mtangaro commented Nov 7, 2018

Hi Alberto,
Without any non-elastic node deployed, CLUES is installed as the "first" step and takes 20 minutes to start after the deployment submission.
I'm using this template: https://github.com/indigo-dc/tosca-types/blob/master/examples/galaxy_elastic_cluster_full_elixirIT.yaml
Actually, with a non-elastic node this may take much longer, even hours, since I have to wait for Galaxy to be installed and for NFS to be configured between the master and worker nodes.
I'm including @micafer in the loop; maybe he can help.

@micafer

micafer commented Nov 7, 2018

Hi @mtangaro,
The problem is that the orchestrator plugin is the component that obtains the refresh token, and it does so using the injected access token. If CLUES is started after the access token has expired, this step fails.
We need a way to obtain the refresh token at the beginning of the configuration of the front-end node.
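
To illustrate the timing constraint, here is a minimal sketch of how the remaining validity of an access token could be checked before it is used. This is not the CLUES plugin's actual code: `decode_claims`, `seconds_left`, and the locally built `demo_token` are hypothetical, and the payload is decoded without verifying the signature, which is only acceptable for inspecting `exp`.

```python
import base64
import json

def decode_claims(token):
    """Return the payload claims of a JWT without verifying its signature."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))

def seconds_left(token, now):
    """Seconds until the token's exp claim; negative once it has expired."""
    return decode_claims(token)["exp"] - int(now)

def _b64(obj):
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

# Hypothetical demo token carrying the iat/exp values reported in this issue.
# The third segment is a placeholder, not a real signature.
demo_token = ".".join([
    _b64({"alg": "none"}),
    _b64({"iat": 1541410374, "exp": 1541413974}),
    "signature-placeholder",
])

at_injection = seconds_left(demo_token, now=1541410374)
at_log_time = seconds_left(demo_token, now=1541413974 + 4056)

print(at_injection)  # 3600
print(at_log_time)   # -4056
```

With a one-hour token, any configuration step that runs longer than 3600 seconds before the token is first used ends up with a negative value, which is the situation reported in the log.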
