Skip to content

Fix: Race between layer and Lambda update (#5927) #6259

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: develop
Choose a base branch
from

Conversation

dsotirho-ucsc
Copy link
Contributor

@dsotirho-ucsc dsotirho-ucsc commented May 14, 2024

Connected issues: #5927

Checklist

Author

  • PR is a draft
  • Target branch is develop
  • Name of PR branch matches issues/<GitHub handle of author>/<issue#>-<slug>
  • On ZenHub, PR is connected to all issues it (partially) resolves
  • PR description links to connected issues
  • PR title matches1 that of a connected issue or comment in PR explains why they're different
  • PR title references all connected issues
  • For each connected issue, there is at least one commit whose title references that issue

1 when the issue title describes a problem, the corresponding PR
title is Fix: followed by the issue title

Author (partiality)

  • Added p tag to titles of partial commits
  • This PR is labeled partial or completely resolves all connected issues
  • This PR partially resolves each of the connected issues or does not have the partial label

Author (chains)

  • This PR is blocked by previous PR in the chain or is not chained to another PR
  • The blocking PR is labeled base or this PR is not chained to another PR
  • This PR is labeled chained or is not chained to another PR

Author (reindex, API changes)

  • Added r tag to commit title or the changes introduced by this PR will not require reindexing of any deployment
  • This PR is labeled reindex:dev or the changes introduced by it will not require reindexing of dev
  • This PR is labeled reindex:anvildev or the changes introduced by it will not require reindexing of anvildev
  • This PR is labeled reindex:anvilprod or the changes introduced by it will not require reindexing of anvilprod
  • This PR is labeled reindex:prod or the changes introduced by it will not require reindexing of prod
  • This PR is labeled reindex:partial and its description documents the specific reindexing procedure for dev, anvildev, anvilprod and prod or requires a full reindex or carries none of the labels reindex:dev, reindex:anvildev, reindex:anvilprod and reindex:prod
  • This PR and its connected issues are labeled API or this PR does not modify a REST API
  • Added a (A) tag to commit title for backwards (in)compatible changes or this PR does not modify a REST API
  • Updated REST API version number in app.py or this PR does not modify a REST API

Author (upgrading deployments)

  • Ran make image_manifests.json and committed the resulting changes or this PR does not modify azul_docker_images, or any other variables referenced in the definition of that variable
  • Documented upgrading of deployments in UPGRADING.rst or this PR does not require upgrading deployments
  • Added u tag to commit title or this PR does not require upgrading deployments
  • This PR is labeled upgrade or does not require upgrading deployments
  • This PR is labeled deploy:shared or does not modify image_manifests.json, and does not require deploying the shared component for any other reason
  • This PR is labeled deploy:gitlab or does not require deploying the gitlab component
  • This PR is labeled deploy:runner or does not require deploying the runner image

Author (hotfixes)

  • Added F tag to main commit title or this PR does not include permanent fix for a temporary hotfix
  • Reverted the temporary hotfixes for any connected issues or the none of the stable branches (anvilprod and prod) have temporary hotfixes for any of the issues connected to this PR

Author (before every review)

  • Rebased PR branch on develop, squashed old fixups
  • Ran make requirements_update or this PR does not modify requirements*.txt, common.mk, Makefile and Dockerfile
  • Added R tag to commit title or this PR does not modify requirements*.txt
  • This PR is labeled reqs or does not modify requirements*.txt
  • make integration_test passes in personal deployment or this PR does not modify functionality that could affect the IT outcome

Peer reviewer (after approval)

  • PR is not a draft
  • Ticket is in Review requested column
  • PR is awaiting requested review from system administrator
  • PR is assigned to only the system administrator

System administrator (after approval)

  • Actually approved the PR
  • Labeled connected issues as demo or no demo
  • Commented on connected issues about demo expectations or all connected issues are labeled no demo
  • Decided if PR can be labeled no sandbox
  • A comment to this PR details the completed security design review
  • PR title is appropriate as title of merge commit
  • N reviews label is accurate
  • Moved ticket to Approved column
  • PR is assigned to only the operator

Operator (before pushing merge the commit)

  • Checked reindex:… labels and r commit title tag
  • Checked that demo expectations are clear or all connected issues are labeled no demo
  • Squashed PR branch and rebased onto develop
  • Sanity-checked history
  • Pushed PR branch to GitHub
  • Ran _select dev.shared && CI_COMMIT_REF_NAME=develop make -C terraform/shared apply_keep_unused or this PR is not labeled deploy:shared
  • Ran _select dev.gitlab && CI_COMMIT_REF_NAME=develop make -C terraform/gitlab apply or this PR is not labeled deploy:gitlab
  • Ran _select anvildev.shared && CI_COMMIT_REF_NAME=develop make -C terraform/shared apply_keep_unused or this PR is not labeled deploy:shared
  • Ran _select anvildev.gitlab && CI_COMMIT_REF_NAME=develop make -C terraform/gitlab apply or this PR is not labeled deploy:gitlab
  • Checked the items in the next section or this PR is labeled deploy:gitlab
  • PR is assigned to only the system administrator or this PR is not labeled deploy:gitlab

System administrator

  • Background migrations for dev.gitlab are complete or this PR is not labeled deploy:gitlab
  • Background migrations for anvildev.gitlab are complete or this PR is not labeled deploy:gitlab
  • PR is assigned to only the operator

Operator (before pushing merge the commit)

  • Ran _select dev.gitlab && make -C terraform/gitlab/runner or this PR is not labeled deploy:runner
  • Ran _select anvildev.gitlab && make -C terraform/gitlab/runner or this PR is not labeled deploy:runner
  • Added sandbox label or PR is labeled no sandbox
  • Pushed PR branch to GitLab dev or PR is labeled no sandbox
  • Pushed PR branch to GitLab anvildev or PR is labeled no sandbox
  • Build passes in sandbox deployment or PR is labeled no sandbox
  • Build passes in anvilbox deployment or PR is labeled no sandbox
  • Reviewed build logs for anomalies in sandbox deployment or PR is labeled no sandbox
  • Reviewed build logs for anomalies in anvilbox deployment or PR is labeled no sandbox
  • Deleted unreferenced indices in sandbox or this PR does not remove catalogs or otherwise causes unreferenced indices in dev
  • Deleted unreferenced indices in anvilbox or this PR does not remove catalogs or otherwise causes unreferenced indices in anvildev
  • Started reindex in sandbox or this PR is not labeled reindex:dev
  • Started reindex in anvilbox or this PR is not labeled reindex:anvildev
  • Checked for failures in sandbox or this PR is not labeled reindex:dev
  • Checked for failures in anvilbox or this PR is not labeled reindex:anvildev
  • The title of the merge commit starts with the title of this PR
  • Added PR # reference to merge commit title
  • Collected commit title tags in merge commit title but only included p if the PR is also labeled partial
  • Moved connected issues to Merged lower column in ZenHub
  • Moved blocked issues to Triage or no issues are blocked on the connected issues
  • Pushed merge commit to GitHub

Operator (chain shortening)

  • Changed the target branch of the blocked PR to develop or this PR is not labeled base
  • Removed the chained label from the blocked PR or this PR is not labeled base
  • Removed the blocking relationship from the blocked PR or this PR is not labeled base
  • Removed the base label from this PR or this PR is not labeled base

Operator (after pushing the merge commit)

  • Pushed merge commit to GitLab dev
  • Pushed merge commit to GitLab anvildev
  • Build passes on GitLab dev
  • Reviewed build logs for anomalies on GitLab dev
  • Build passes on GitLab anvildev
  • Reviewed build logs for anomalies on GitLab anvildev
  • Ran _select dev.shared && make -C terraform/shared apply or this PR is not labeled deploy:shared
  • Ran _select anvildev.shared && make -C terraform/shared apply or this PR is not labeled deploy:shared
  • Deleted PR branch from GitHub
  • Deleted PR branch from GitLab dev
  • Deleted PR branch from GitLab anvildev

Operator (reindex)

  • Deindexed all unreferenced catalogs in dev or this PR is neither labeled reindex:partial nor reindex:dev
  • Deindexed all unreferenced catalogs in anvildev or this PR is neither labeled reindex:partial nor reindex:anvildev
  • Deindexed specific sources in dev or this PR is neither labeled reindex:partial nor reindex:dev
  • Deindexed specific sources in anvildev or this PR is neither labeled reindex:partial nor reindex:anvildev
  • Indexed specific sources in dev or this PR is neither labeled reindex:partial nor reindex:dev
  • Indexed specific sources in anvildev or this PR is neither labeled reindex:partial nor reindex:anvildev
  • Started reindex in dev or this PR does not require reindexing dev
  • Started reindex in anvildev or this PR does not require reindexing anvildev
  • Checked for, triaged and possibly requeued messages in both fail queues in dev or this PR does not require reindexing dev
  • Checked for, triaged and possibly requeued messages in both fail queues in anvildev or this PR does not require reindexing anvildev
  • Emptied fail queues in dev or this PR does not require reindexing dev
  • Emptied fail queues in anvildev or this PR does not require reindexing anvildev

Operator

  • Propagated the deploy:shared, deploy:gitlab, deploy:runner, reindex:partial, reindex:anvilprod and reindex:prod labels to the next promotion PRs or this PR carries none of these labels
  • Propagated any specific instructions related to the deploy:shared, deploy:gitlab, deploy:runner, reindex:partial, reindex:anvilprod and reindex:prod labels, from the description of this PR to that of the next promotion PRs or this PR carries none of these labels
  • PR is assigned to no one

Shorthand for review comments

  • L line is too long
  • W line wrapping is wrong
  • Q bad quotes
  • F other formatting problem

@github-actions github-actions bot added the orange [process] Done by the Azul team label May 14, 2024
Copy link

codecov bot commented May 14, 2024

Codecov Report

Attention: Patch coverage is 18.75000% with 13 lines in your changes missing coverage. Please review.

Project coverage is 85.35%. Comparing base (6e66a5b) to head (f612650).
Report is 449 commits behind head on develop.

Files with missing lines Patch % Lines
src/azul/lambdas.py 20.00% 12 Missing ⚠️
src/azul/terraform.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #6259      +/-   ##
===========================================
- Coverage    85.38%   85.35%   -0.03%     
===========================================
  Files          155      155              
  Lines        20735    20743       +8     
===========================================
+ Hits         17704    17706       +2     
- Misses        3031     3037       +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@coveralls
Copy link

coveralls commented May 14, 2024

Coverage Status

coverage: 85.376% (-0.02%) from 85.399%
when pulling f612650 on issues/dsotirho-ucsc/5927-layer-and-lambda-race
into 6e66a5b on develop.

@dsotirho-ucsc dsotirho-ucsc force-pushed the issues/dsotirho-ucsc/5927-layer-and-lambda-race branch 2 times, most recently from 1de4271 to 6275fec Compare May 15, 2024 22:44
@dsotirho-ucsc
Copy link
Contributor Author

6259_IT_2024-05-16.txt

Apply complete! Resources: 7 added, 10 changed, 7 destroyed.
python /Users/daniel/repo/azul2/scripts/delete_lambda_versions.py
2024-05-15 22:27:05,758    INFO MainThread botocore.credentials: Found credentials in shared credentials file: ~/.aws/credentials
2024-05-15 22:27:05,832    INFO MainThread azul.deployment: Allocated new Boto3 client for 'lambda' with ID 4406124240
2024-05-15 22:27:07,575    INFO MainThread __main__: Deleting published version 8 of azul-indexer-daniel-aggregate_retry
2024-05-15 22:27:08,081    INFO MainThread __main__: Deleting published version 8 of azul-service-daniel
2024-05-15 22:27:09,137    INFO MainThread __main__: Deleting published version 8 of azul-indexer-daniel-aggregate
2024-05-15 22:27:10,433    INFO MainThread __main__: Deleting published version 8 of azul-indexer-daniel-indexercachehealth
2024-05-15 22:27:14,468    INFO MainThread __main__: Deleting published version 8 of azul-service-daniel-servicecachehealth
2024-05-15 22:27:15,399    INFO MainThread __main__: Deleting published version 8 of azul-service-daniel-manifest
2024-05-15 22:27:16,776    INFO MainThread __main__: Deleting published version 8 of azul-indexer-daniel
2024-05-15 22:27:19,294    INFO MainThread __main__: Deleting published version 8 of azul-indexer-daniel-contribute_retry
2024-05-15 22:27:20,534    INFO MainThread __main__: Deleting published version 8 of azul-indexer-daniel-contribute
python /Users/daniel/repo/azul2/scripts/post_deploy_tdr.py
…

@dsotirho-ucsc
Copy link
Contributor Author

It seems AWS keeps an internal counter of the published version of each Lambda, so even if the published version(s) are deleted before the deploy, the next published version will be +1 larger than the last. Since this is the case, there seems no benefit in running the delete prior to the deploy as opposed to after, so I opted for the latter.

Comment on lines 712 to 713
# race conditions when an update to the function's configuration and
# code rely on the update of each other in order to work correctly.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# race conditions when an update to the function's configuration and
# code rely on the update of each other in order to work correctly.
# race conditions when there's a cyclic dependency between an update
# to the function's configuration and an update to its code

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my way is easier to read, assuming I've correctly parsed the intent here

Comment on lines 28 to 35
if version['Version'] == '$LATEST':
pass
else:
version_number = version['Version']
log.info('Deleting published version %s of %s', version_number, lambda_.name)
aws.lambda_.delete_function(
FunctionName=lambda_.name,
Qualifier=version_number
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if version['Version'] == '$LATEST':
pass
else:
version_number = version['Version']
log.info('Deleting published version %s of %s', version_number, lambda_.name)
aws.lambda_.delete_function(
FunctionName=lambda_.name,
Qualifier=version_number
)
version = version['Version']
if version == '$LATEST':
pass
else:
log.info('Deleting published version %s of %s', version, lambda_.name)
aws.lambda_.delete_function(
FunctionName=lambda_.name,
Qualifier=version
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to reuse the version variable instead of introducing a new one (especially since I'm confused about the difference in their names). Also, we can assign to it earlier to avoid repeating the dictionary lookup.

Comment on lines 24 to 26
response = aws.lambda_.list_versions_by_function(
FunctionName=lambda_.name
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fits on one line

Suggested change
response = aws.lambda_.list_versions_by_function(
FunctionName=lambda_.name
)
response = aws.lambda_.list_versions_by_function(FunctionName=lambda_.name)

@nadove-ucsc nadove-ucsc removed their assignment May 16, 2024
@dsotirho-ucsc dsotirho-ucsc force-pushed the issues/dsotirho-ucsc/5927-layer-and-lambda-race branch from 6275fec to e5d5938 Compare May 16, 2024 23:24
@dsotirho-ucsc
Copy link
Contributor Author

6259_IT_2024-05-16.txt

Apply complete! Resources: 7 added, 10 changed, 7 destroyed.
python /Users/daniel/repo/azul2/scripts/delete_lambda_versions.py
2024-05-16 16:35:52,794    INFO MainThread botocore.credentials: Found credentials in shared credentials file: ~/.aws/credentials
2024-05-16 16:35:52,870    INFO MainThread azul.deployment: Allocated new Boto3 client for 'lambda' with ID 4483204624
2024-05-16 16:35:55,007    INFO MainThread __main__: Fetching the published versions of azul-indexer-daniel-aggregate_retry
2024-05-16 16:35:55,148    INFO MainThread __main__: Deleting published version 11 of azul-indexer-daniel-aggregate_retry
2024-05-16 16:35:55,406    INFO MainThread __main__: Fetching the published versions of azul-service-daniel
2024-05-16 16:35:55,542    INFO MainThread __main__: Deleting published version 10 of azul-service-daniel
2024-05-16 16:35:55,810    INFO MainThread __main__: Fetching the published versions of azul-indexer-daniel-aggregate
2024-05-16 16:35:55,959    INFO MainThread __main__: Deleting published version 11 of azul-indexer-daniel-aggregate
2024-05-16 16:35:56,210    INFO MainThread __main__: Fetching the published versions of azul-indexer-daniel-indexercachehealth
2024-05-16 16:35:56,343    INFO MainThread __main__: Deleting published version 11 of azul-indexer-daniel-indexercachehealth
2024-05-16 16:35:56,598    INFO MainThread __main__: Fetching the published versions of azul-service-daniel-servicecachehealth
2024-05-16 16:35:56,733    INFO MainThread __main__: Deleting published version 10 of azul-service-daniel-servicecachehealth
2024-05-16 16:35:57,024    INFO MainThread __main__: Fetching the published versions of azul-service-daniel-manifest
2024-05-16 16:35:57,172    INFO MainThread __main__: Deleting published version 10 of azul-service-daniel-manifest
2024-05-16 16:35:57,456    INFO MainThread __main__: Fetching the published versions of azul-indexer-daniel
2024-05-16 16:35:57,611    INFO MainThread __main__: Deleting published version 11 of azul-indexer-daniel
2024-05-16 16:35:57,892    INFO MainThread __main__: Fetching the published versions of azul-indexer-daniel-contribute_retry
2024-05-16 16:35:58,020    INFO MainThread __main__: Deleting published version 11 of azul-indexer-daniel-contribute_retry
2024-05-16 16:35:58,279    INFO MainThread __main__: Fetching the published versions of azul-indexer-daniel-contribute
2024-05-16 16:35:58,431    INFO MainThread __main__: Deleting published version 11 of azul-indexer-daniel-contribute
python /Users/daniel/repo/azul2/scripts/post_deploy_tdr.py
…

@nadove-ucsc nadove-ucsc removed their assignment May 17, 2024
@dsotirho-ucsc dsotirho-ucsc force-pushed the issues/dsotirho-ucsc/5927-layer-and-lambda-race branch from e5d5938 to d126563 Compare May 17, 2024 17:20
@dsotirho-ucsc
Copy link
Contributor Author

6259_IT_2024-05-17.txt

return [
Lambda.from_response(function)
for response in self._lambda.get_paginator('list_functions').paginate()
for function in response['Functions']
if deployment_only is False or deployment(function) == config.deployment_stage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if deployment_only is False or deployment(function) == config.deployment_stage
if not deployment_only or deployment(function) == config.deployment_stage

@nadove-ucsc nadove-ucsc removed their assignment May 17, 2024
@dsotirho-ucsc dsotirho-ucsc force-pushed the issues/dsotirho-ucsc/5927-layer-and-lambda-race branch from d126563 to 184a0e0 Compare May 17, 2024 21:39
@dsotirho-ucsc
Copy link
Contributor Author

6259_IT_2024-05-17.txt

nadove-ucsc
nadove-ucsc previously approved these changes May 17, 2024
@nadove-ucsc nadove-ucsc removed their assignment May 17, 2024
@nadove-ucsc nadove-ucsc marked this pull request as ready for review May 17, 2024 22:38
@nadove-ucsc nadove-ucsc requested a review from hannes-ucsc as a code owner May 17, 2024 22:38
Copy link
Member

@hannes-ucsc hannes-ucsc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not ask for any code changes in my previous review, only for additional evidence, yet the code was changed. Was this intentional, and if so, why?

@hannes-ucsc hannes-ucsc removed their assignment Aug 22, 2024
@dsotirho-ucsc
Copy link
Contributor Author

Yes, a fixup commit was added to re-combine the two list_lambdas() methods into one.

Background: When I refactored the code out of manage_lambdas() that listed the Lambda functions. I did this as a separate new method list_deployment_lambdas() instead of combining with the existing list_lambdas(). I didn't combine these at the time due to the conflicting behavior (from all deployments or not, & for all Lambda versions or not) being required by different call sites.

When testing manage_lambdas.py I discovered a bug introduced by this PR where now that Lambda functions have multiple versions (latest & a numbered version), manage_lambdas.py was unnecessarily fetching all versions of my Lambda functions and process each function twice as a result.

❯ python scripts/manage_lambdas.py --disable
2024-08-22 17:29:46,024    INFO MainThread botocore.credentials: Found credentials in shared credentials file: ~/.aws/credentials
2024-08-22 17:29:46,070    INFO MainThread azul.deployment: Allocated new Boto3 client for 'lambda' with ID 4371164368
2024-08-22 17:29:47,774    INFO MainThread azul.lambdas: Setting concurrency limit for azul-indexer-daniel-aggregate_retry to zero.
2024-08-22 17:29:48,270    INFO MainThread azul.lambdas: Setting concurrency limit for azul-service-daniel to zero.
2024-08-22 17:29:48,771    INFO MainThread azul.lambdas: Setting concurrency limit for azul-indexer-daniel-aggregate to zero.
2024-08-22 17:29:49,272    INFO MainThread azul.lambdas: Setting concurrency limit for azul-indexer-daniel-indexercachehealth to zero.
2024-08-22 17:29:49,826    INFO MainThread azul.lambdas: Setting concurrency limit for azul-service-daniel-servicecachehealth to zero.
2024-08-22 17:29:50,341    INFO MainThread azul.lambdas: Setting concurrency limit for azul-service-daniel-manifest to zero.
2024-08-22 17:29:50,848    INFO MainThread azul.lambdas: Setting concurrency limit for azul-indexer-daniel to zero.
2024-08-22 17:29:51,362    INFO MainThread azul.lambdas: Setting concurrency limit for azul-indexer-daniel-contribute_retry to zero.
2024-08-22 17:29:51,852    INFO MainThread azul.lambdas: Setting concurrency limit for azul-indexer-daniel-contribute to zero.
2024-08-22 17:29:52,352 WARNING MainThread azul.lambdas: azul-indexer-daniel is already disabled.
2024-08-22 17:29:52,597 WARNING MainThread azul.lambdas: azul-indexer-daniel-contribute is already disabled.
2024-08-22 17:29:52,833 WARNING MainThread azul.lambdas: azul-indexer-daniel-aggregate_retry is already disabled.
2024-08-22 17:29:53,091 WARNING MainThread azul.lambdas: azul-indexer-daniel-contribute_retry is already disabled.
2024-08-22 17:29:53,333 WARNING MainThread azul.lambdas: azul-indexer-daniel-aggregate is already disabled.
2024-08-22 17:29:53,573 WARNING MainThread azul.lambdas: azul-indexer-daniel-indexercachehealth is already disabled.
2024-08-22 17:29:53,818 WARNING MainThread azul.lambdas: azul-service-daniel-servicecachehealth is already disabled.
2024-08-22 17:29:54,067 WARNING MainThread azul.lambdas: azul-service-daniel-manifest is already disabled.
2024-08-22 17:29:54,307 WARNING MainThread azul.lambdas: azul-service-daniel is already disabled.

This prompted me to take another look at the two list_lambdas() methods and to combine them

all_versions: bool = False
) -> list[Lambda]:
"""
Return a list of AWS Lambda functions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Return a list of AWS Lambda functions
Return a list of all AWS Lambda functions (or function versions) in the current account, or the given deployment.

for lambda_name in [metadata['FunctionName'] for metadata in lambda_page['Functions']]:
if any(lambda_name.startswith(prefix) for prefix in lambda_prefixes):
self.manage_lambda(lambda_name, enabled)
for lambda_ in self.list_lambdas(deployment=config.deployment_stage):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for lambda_ in self.list_lambdas(deployment=config.deployment_stage):
for function in self.list_lambdas(deployment=config.deployment_stage):

As noted before, we are phasing this term out. You don't have to remove all usages but just don't add new ones.


def delete_stale_function_versions(self):
"""
Delete all but the latest published version of every AWS Lambda function
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Delete all but the latest published version of every AWS Lambda function
Delete all but the latest published version of every AWS Lambda function

Can there be more than one published version? If not, "the latest" qualifier seems redundant.

Copy link
Contributor Author

@dsotirho-ucsc dsotirho-ucsc Aug 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can there be more than one published version?

Yes, however I was using the term "published" incorrectly. In AWS terminology a published version is a copy of the unpublished ($LATEST) version. There can be many published versions (each with a unique version number), but only one unpublished version.

https://docs.aws.amazon.com/lambda/latest/dg/configuration-versions.html

In my latest fixup commit I've changed the name and docstring of this method accordingly.

Copy link
Member

@hannes-ucsc hannes-ucsc Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#6259 (comment)

Then I believe either the deletion logic is wrong or the publish TF functionality doesn't work. One of the published versions must be live so the logic below deletes it, or the latest version is live which raises the question as to why we published an immutable version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might of misunderstood your original question. There can be multiple published versions of a function in the sense that AWS allows it, however with this PR there will only ever be at most one published version, along with the unpublished version.

Both the published and unpublished versions are live, however only the unpublished version is used since our code does not qualify the function ARN with a version. https://docs.aws.amazon.com/lambda/latest/dg/configuration-versions.html#versioning-versions-using

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still don't understand it. Please raise in PL.

paginator = self._lambda.get_paginator('list_functions')
lambda_prefixes = [
config.qualified_resource_name(lambda_infix, stage=deployment)
for lambda_infix in config.lambda_names()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for lambda_infix in config.lambda_names()
for lambda_name in config.lambda_names()

lambda_prefixes = [
config.qualified_resource_name(lambda_infix, stage=deployment)
for lambda_infix in config.lambda_names()
if deployment is not None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In larger constructs I see the point of a constant comprehension guard, but here no.

for function in response['Functions']
if deployment is None or any(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if deployment is None or any(
if lambda_prefixes is None or any(

@hannes-ucsc hannes-ucsc removed their assignment Aug 24, 2024
@dsotirho-ucsc dsotirho-ucsc force-pushed the issues/dsotirho-ucsc/5927-layer-and-lambda-race branch from fd98952 to d21db43 Compare August 27, 2024 20:23
@dsotirho-ucsc
Copy link
Contributor Author

6259_IT_2024-08-27.txt


def delete_stale_function_versions(self):
"""
Delete all but the latest published version of every AWS Lambda function
Copy link
Member

@hannes-ucsc hannes-ucsc Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#6259 (comment)

Then I believe either the deletion logic is wrong or the publish TF functionality doesn't work. One of the published versions must be live so the logic below deletes it, or the latest version is live which raises the question as to why we published an immutable version.

@hannes-ucsc hannes-ucsc removed their assignment Aug 28, 2024
@dsotirho-ucsc dsotirho-ucsc force-pushed the issues/dsotirho-ucsc/5927-layer-and-lambda-race branch from d21db43 to 2567ecb Compare September 10, 2024 23:07
@dsotirho-ucsc dsotirho-ucsc force-pushed the issues/dsotirho-ucsc/5927-layer-and-lambda-race branch from 2567ecb to 0ba2a94 Compare September 11, 2024 16:25
@dsotirho-ucsc dsotirho-ucsc force-pushed the issues/dsotirho-ucsc/5927-layer-and-lambda-race branch from 0ba2a94 to f612650 Compare September 11, 2024 16:29
@dsotirho-ucsc
Copy link
Contributor Author

dsotirho-ucsc commented Sep 11, 2024

Copy link
Member

@hannes-ucsc hannes-ucsc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
4+ reviews [process] Lead requested changes four times or more orange [process] Done by the Azul team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants