Fix access to public S3 buckets from restricted Instance/Job Roles #6553
base: master
Conversation
…o access public s3
Signed-off-by: jorgee <[email protected]>
```bash
        echo "$output"
        return 1
    fi
else
```
Is there no way to avoid the fallback logic and make this predictable?
It is tricky; in this situation the only way to check whether a bucket is public is to try to access it. We could reduce the number of fallbacks by reusing the outcome of the `ls` attempt for the `cp`: if the `ls` worked with `--no-sign-request`, we use that flag for the `cp` as well. The code would be more complex, but it would fall back only once per file.
This could be the alternative, only falling back for `ls`:
```bash
nxf_s3_download() {
    local source=\$1
    local target=\$2
    local file_name=\$(basename \$1)
    local opts=(--only-show-errors)
    local ls_output
    if ! ls_output=\$($cli s3 ls \$source 2>&1); then
        if echo "\$ls_output" | grep -Eq "(AccessDenied|Forbidden|403)"; then
            echo "Access denied, retrying unsigned request..."
            if ! ls_output=\$($cli s3 ls --no-sign-request \$source 2>&1); then
                echo \$ls_output
                return 1
            else
                opts+=(--no-sign-request)
            fi
        else
            echo \$ls_output
            return 1
        fi
    fi
    local is_dir=\$(echo \$ls_output | grep -F "PRE \${file_name}/" -c)
    if [[ \$is_dir == 1 ]]; then
        opts+=(--recursive)
    fi
    $cli s3 cp "\${opts[@]}" "\$source" "\$target"
}
```
Closes #4732
Although public S3 buckets can be accessed without S3 credentials, access to these buckets through the AWS SDK or CLI can fail when the credentials come from instance or job roles that only grant access to certain private S3 buckets. This produces errors when running pipelines that combine public and private S3 buckets.
The `aws.client.anonymous` option does not solve the issue, for two reasons:
1. It only applies to SDK actions on the head node, while some stage-in operations happen inside the AWS Batch job, which uses the AWS CLI.
2. It applies to all clients; when enabled, the head job cannot access private buckets.
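For reference, the flag under discussion is a global client setting in `nextflow.config`, which is why it cannot serve a run that mixes public and private buckets:

```groovy
aws {
    client {
        // Applies to every S3 client created by the run, not per bucket
        anonymous = true
    }
}
```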
This PR implements a fallback mechanism for S3 download operations in both the CLI and the SDK.
I think this approach could deprecate the `aws.client.anonymous` flag.
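The retry-unsigned pattern can be sketched in a self-contained way. Note the `aws` function below is a mock standing in for the real CLI (a signed `ls` is denied, an unsigned one succeeds), and `nxf_s3_ls`/`S3_OPTS` are illustrative names, not the PR's actual code:

```shell
#!/usr/bin/env bash

# Mock 'aws' CLI for illustration only: deny signed requests, allow unsigned ones.
aws() {
    if [[ " $* " == *" --no-sign-request "* ]]; then
        echo "2024-01-01 00:00:00        123 data.txt"
    else
        echo "An error occurred (AccessDenied) when calling the ListObjectsV2 operation" >&2
        return 1
    fi
}

# Try a signed request first; on an access error, retry unsigned and
# remember the flag so a later 'cp' can reuse it without a second fallback.
nxf_s3_ls() {
    local source=$1
    local output
    if ! output=$(aws s3 ls "$source" 2>&1); then
        if echo "$output" | grep -Eq "(AccessDenied|Forbidden|403)"; then
            echo "Access denied, retrying unsigned request..." >&2
            output=$(aws s3 ls --no-sign-request "$source" 2>&1) || { echo "$output"; return 1; }
            S3_OPTS=(--no-sign-request)   # reuse for subsequent commands
        else
            echo "$output"
            return 1
        fi
    fi
    echo "$output"
}

nxf_s3_ls s3://public-bucket/data.txt
# stdout: 2024-01-01 00:00:00        123 data.txt
```

With the flag cached in `S3_OPTS`, each file pays the fallback cost at most once, which is the behavior the review thread converges on.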