RFC: AWS Serverless Support #5591 (Draft)

wants to merge 5 commits into base: main
Conversation

@txase commented Feb 14, 2025

This draft PR contains a proof of concept (POC) of support for deploying an instance of Vaultwarden into an AWS account using entirely "serverless" services (likely falling within the free-tier usage limits as well). I'm looking for feedback and agreement from Vaultwarden maintainers on whether these contributions could be merged into vaultwarden (with further refinement).

Architecture

CloudFront CDN
├─ API Lambda Function
│  ├─ Data S3 Bucket
│  ├─ Aurora DSQL Database
│  └─ Amazon Simple Email Service (SES)
└─ Web-vault static assets S3 Bucket

All of this is implemented in the PR behind feature flags: dsql, s3, and ses. All three can be enabled together via the aws feature flag.
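For illustration, this is roughly how those flags could gate AWS-specific code at compile time (the module names below are hypothetical, not necessarily the ones used in this PR):

```rust
// Hypothetical module layout: the `s3`, `ses`, and `dsql` feature flags
// compile in the corresponding AWS backends only when enabled.
#[cfg(feature = "s3")]
mod s3_backend { /* S3-backed persistent file storage */ }

#[cfg(feature = "ses")]
mod ses_mail { /* SES mail sending */ }

#[cfg(feature = "dsql")]
mod dsql_pool { /* Aurora DSQL connection management */ }

// The umbrella `aws` feature would simply enable all three in Cargo.toml:
//   aws = ["dsql", "s3", "ses"]
```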

Unimplemented Functionality

I believe everything works except the items listed below, though I'm new to vaultwarden and may have missed something along the way. In my own usage I haven't hit any significant issues from the missing functionality.

  • Background jobs (e.g. to delete expired data)
  • Websockets for events

Open Questions / Concerns

  • Aurora DSQL is in preview. It's mostly compatible with Postgres, but it has a few incompatibilities that could be an issue if they remain at GA:
    • Foreign keys (FKs) are not supported. However, after auditing the code I did not find any Postgres functionality that requires FKs (unlike SQLite and MySQL). FKs would be nice for extra data-consistency safety, but they don't appear to be necessary. The seed migration file in this PR is a dump from a local Postgres vaultwarden database with the FK constraints stripped out. This is a known, documented issue with the DSQL preview, and it is unclear if it will be resolved before GA.
    • DDL statements cannot be executed in a transaction, so migrations are a bit more fraught. I've yet to have a migration fail, though. This is a known documented issue with the DSQL preview, and it is unclear if it will be resolved before GA.
    • Tables can be altered to add columns, but new columns cannot have constraints. This is an undocumented issue I found, and it gives me more concern that further migration difficulties will arise over time. Hopefully this will be resolved before GA.
  • API Lambda Function settings:
    • MemorySize is set to 3008 MB. This is the largest possible Function size for new accounts (an undocumented AWS limit). Cost is linearly related to this value, but lower values mean longer cold start times, especially when creating TLS connections to external services. For a production environment with a lot of traffic this would be tuned down to the smallest value that achieves the desired performance characteristics. Here, I set it to the max because traffic will generally be minimal for private usage, and cost will likely fit under the free tier.
    • Timeout is set to 5 minutes. Cold start times seem to average about 3-4 seconds. Warmed functions (or cold functions after the init work) generally take less than a second to complete (if no external services need to be invoked, warmed functions generally complete in under 10 ms). Importing my personal vault (~600 items) took just under 30 seconds. A 5 minute limit seems large, but I don't want to cause issues for anyone importing or exporting a sizable vault.
  • Attachment size limit: 6 MB. With client changes it would be possible to upload directly from clients to S3 via signed URLs. This is how the official Bitwarden service works, but it uses Azure Blob Storage instead of AWS S3. Azure uses POST instead of S3's PUT to upload, and although S3 also supports POST with signed URLs, the data must be form-encoded. This means existing Bitwarden clients cannot upload directly to S3 and must instead upload through the API Lambda Function. Unfortunately, the AWS Lambda service has a 6 MB size limit for request payloads. Note: this only applies to uploads; downloads are streamed directly from S3 via signed GET URLs, which have no size limit (see the sketch below).
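For reference, generating such a signed GET URL is only a few lines with the aws-sdk-s3 crate; this is a minimal sketch (bucket, key, and expiry are illustrative, not the exact code in this PR):

```rust
use std::time::Duration;
use aws_sdk_s3::presigning::PresigningConfig;

// Sketch: produce a time-limited URL the client can GET directly from S3,
// so attachment downloads never pass through the Lambda payload limit.
async fn attachment_download_url(
    client: &aws_sdk_s3::Client,
    bucket: &str,
    key: &str,
) -> Result<String, Box<dyn std::error::Error>> {
    let presigned = client
        .get_object()
        .bucket(bucket)
        .key(key)
        .presigned(PresigningConfig::expires_in(Duration::from_secs(300))?)
        .await?;
    Ok(presigned.uri().to_string())
}
```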

Deployment Instructions

See the aws/README.md file in this PR.

Proposed Plan of Attack

This PR is too large to review and merge sanely. With agreement from Vaultwarden maintainers in the comments below, I propose developing and merging the following as separate PRs, in sequence:

  • Add the persistent_fs module, migrating existing filesystem functionality into a "local" backend implementation (a rough sketch of such an abstraction follows this list). See src/persistent_fs/mod.rs and src/persistent_fs/local.rs. This would not change any functionality; it would simply rearchitect file access in preparation for the addition of an S3 backend.
  • Add the S3 persistent FS backend implementation
  • Add SES support for sending emails
  • Add DSQL support, with some means for denoting it is beta or unstable functionality (e.g. label the feature flag dsql-beta)
  • Add a GitHub Actions workflow to build a Lambda function package on release
    • In an ideal case, the vaultwarden project would also own an AWS account with regional S3 Buckets to upload releases to. This would simplify deployment, allowing users to deploy to their own AWS accounts without having to download and upload function code packages themselves. I can help set this up if needed.
  • Add the CloudFormation template and deployment instructions
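To make the first step more concrete, here is a rough sketch of what such a pluggable backend trait could look like; the trait and method names are illustrative and not necessarily what src/persistent_fs actually exposes:

```rust
use std::io;
use std::path::PathBuf;

// Illustrative abstraction over persistent file storage. A "local" backend
// keeps today's behaviour; an S3 backend would implement the same trait.
pub trait PersistentFs: Send + Sync {
    fn read(&self, path: &str) -> io::Result<Vec<u8>>;
    fn write(&self, path: &str, data: &[u8]) -> io::Result<()>;
    fn delete(&self, path: &str) -> io::Result<()>;
    fn exists(&self, path: &str) -> io::Result<bool>;
}

pub struct LocalFs {
    root: PathBuf,
}

impl PersistentFs for LocalFs {
    fn read(&self, path: &str) -> io::Result<Vec<u8>> {
        std::fs::read(self.root.join(path))
    }

    fn write(&self, path: &str, data: &[u8]) -> io::Result<()> {
        let full = self.root.join(path);
        if let Some(parent) = full.parent() {
            std::fs::create_dir_all(parent)?;
        }
        std::fs::write(full, data)
    }

    fn delete(&self, path: &str) -> io::Result<()> {
        std::fs::remove_file(self.root.join(path))
    }

    fn exists(&self, path: &str) -> io::Result<bool> {
        Ok(self.root.join(path).exists())
    }
}
```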

@txase txase marked this pull request as ready for review February 14, 2025 16:07
@txase txase marked this pull request as draft February 14, 2025 16:08
@BlackDex (Collaborator)

Thanks for the PR.

I'm not sure if this is something we want to have embedded by default, though; @dani-garcia, what is your opinion here?
My main concern is testing this: I'm not sure there is a way to test it without an AWS account. S3 can be replaced with MinIO, for example, but I'm not sure about the other pieces.

Further, I do like the abstraction of the filesystem part, but it's still a lot of custom work. It might be nicer to try using Opendal instead? It supports a lot of different file and object storage backends.

@txase (Author) commented Feb 17, 2025

I hadn't seen Opendal before, but it looks like it should be a perfect fit for this use case. I'd be happy to refactor the fs work to use it. Hopefully that would also address the testing concerns for the fs layer, since it abstracts away most of the backend-specific functionality.
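For context, a minimal sketch of what an Opendal-based layer could look like; the bucket, region, and paths are made up, and the exact builder API differs between opendal versions:

```rust
use opendal::{services, Operator};

// Sketch: the same Operator API drives both a local-filesystem backend
// (today's behaviour) and an S3 backend, chosen purely by configuration.
async fn demo() -> opendal::Result<()> {
    let local = Operator::new(services::Fs::default().root("/data"))?.finish();

    let s3 = Operator::new(
        services::S3::default()
            .bucket("vaultwarden-data")
            .region("us-east-1"),
    )?
    .finish();

    // Identical calls against either backend.
    local.write("attachments/example.bin", vec![1, 2, 3]).await?;
    let _bytes = s3.read("attachments/example.bin").await?;
    Ok(())
}
```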

The SES code is quite limited in scope. It may be hard to test without an aws account, but it sort of “just works” because it takes the whole raw email from lettre and sends it.

The DSQL part is the most challenging. That said, most of the changes involve implementing ManageConnection in order to generate and cache pre-signed connection URLs. AWS and GCP are moving in this direction for greater security via temporary credentials. I hope we can find a solution here we all like, but I would be fine with deferring the DSQL work for now, given it's still in Preview with known issues.
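To give a feel for the shape of that change, here is a rough sketch of such a manager built on diesel's r2d2 integration; generate_dsql_auth_token is a hypothetical placeholder for the AWS SDK's token generator, and this is not the exact code in this PR:

```rust
use diesel::prelude::*;
use diesel::r2d2::ManageConnection;

// Hypothetical placeholder for the AWS SDK call that mints a short-lived,
// pre-signed IAM auth token for the DSQL cluster.
fn generate_dsql_auth_token(cluster_endpoint: &str) -> String {
    format!("presigned-token-for-{cluster_endpoint}")
}

pub struct DsqlConnectionManager {
    cluster_endpoint: String,
    database: String,
    user: String,
}

impl ManageConnection for DsqlConnectionManager {
    type Connection = PgConnection;
    type Error = diesel::r2d2::Error;

    fn connect(&self) -> Result<Self::Connection, Self::Error> {
        // Mint (or fetch from a short-lived cache) a fresh token per physical
        // connection, so no long-term database password ever exists.
        // A real implementation would percent-encode the token first.
        let token = generate_dsql_auth_token(&self.cluster_endpoint);
        let url = format!(
            "postgres://{}:{}@{}/{}?sslmode=require",
            self.user, token, self.cluster_endpoint, self.database
        );
        PgConnection::establish(&url).map_err(diesel::r2d2::Error::ConnectionError)
    }

    fn is_valid(&self, conn: &mut Self::Connection) -> Result<(), Self::Error> {
        diesel::sql_query("SELECT 1")
            .execute(conn)
            .map(|_| ())
            .map_err(diesel::r2d2::Error::QueryError)
    }

    fn has_broken(&self, _conn: &mut Self::Connection) -> bool {
        false
    }
}
```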

@williamdes (Contributor)

> Thanks for the PR.
>
> I'm not sure if this is something we want to have embedded by default, though; @dani-garcia, what is your opinion here? My main concern is testing this: I'm not sure there is a way to test it without an AWS account. S3 can be replaced with MinIO, for example, but I'm not sure about the other pieces.

Same

MinIO or Garage (in Rust) are good replacements.

> Amazon Simple Email Service (SES)

At work I used the SMTP interface for quite a while, so no code changes are strictly required. Having API support is nice indeed.

> Aurora DSQL Database

How is this different from MySQL Aurora?

> Further, I do like the abstraction of the filesystem part, but it's still a lot of custom work. It might be nicer to try using Opendal instead? It supports a lot of different file and object storage backends.

I would also like to see a first set of changes merged (all the parts that don't require AWS).

@dani-garcia (Owner)

Agreed with some of the points mentioned. Ideally I'd like to have as much of this code as possible integrated into crates that we reference, instead of including the code in the project. That would ensure that other developers get to use these services as well, while also removing some maintenance load from us. For example:

  • Using opendal like @BlackDex mentioned, or some other fs abstraction, so we don't have to care about where the files are, or about adding support for other platforms in the future.
  • Seeing how relatively simple the SES mail code is, you could implement a lettre Transport and either see if it can be upstreamed to lettre or publish a lettre-ses crate (a rough sketch of such a transport follows this list).
  • Somewhat similar with the DSQL connection, if that can be upstreamed or extracted to a dsql-r2d2 or dsql-diesel crate.
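As a concrete illustration of the lettre idea, here is a rough sketch of such a transport, assuming the blocking Transport trait; the SES call itself is stubbed out as a hypothetical helper (the real aws-sdk-sesv2 client is async, so a published crate would more likely implement AsyncTransport):

```rust
use lettre::address::Envelope;
use lettre::Transport;

// Hypothetical error type; a published lettre-ses crate would wrap the AWS SDK error.
#[derive(Debug)]
pub struct SesError(String);

impl std::fmt::Display for SesError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "SES transport error: {}", self.0)
    }
}

impl std::error::Error for SesError {}

pub struct SesTransport;

impl Transport for SesTransport {
    type Ok = ();
    type Error = SesError;

    // lettre hands over the fully rendered RFC 5322 message, so the transport
    // only needs to forward those raw bytes to SES's raw-email send operation.
    fn send_raw(&self, _envelope: &Envelope, email: &[u8]) -> Result<Self::Ok, Self::Error> {
        send_via_ses(email).map_err(SesError)
    }
}

// Hypothetical stand-in for the aws-sdk-sesv2 call.
fn send_via_ses(_raw_message: &[u8]) -> Result<(), String> {
    Ok(())
}
```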

If we can get those points solved, we'd be in a situation where there is barely any AWS-specific code in vaultwarden, and I'd feel more confident about including support for it.

The missing features in DSQL are a bit annoying, and it would be unfortunate if we needed to maintain a separate migration path without foreign keys plus other DSQL-specific changes, so I agree that delaying that until it's out of preview seems like a good idea.

The unimplemented functionality shouldn't be a problem: we could refactor background jobs to store the time since the last run in the db and run them on startup. WebSockets might be harder to address, but they shouldn't be necessary for functionality; the clients sync periodically regardless of whether they receive updates via WebSockets. There is also work in progress to move some clients to WebPush, so this will matter even less in the future.

@txase (Author) commented Feb 18, 2025

Thanks for all the comments! What I'm hearing is:

  • The project would welcome contributions that make it easy to deploy an AWS serverless architecture (admittedly a low bar, but some projects / maintainers can have odd stances here)
  • The project would like integrations to be via upstream abstractions like:
    • Opendal for files
    • Lettre transports for email
    • "Something" for a Diesel DSQL connection
  • Known issues (e.g. no background jobs, no WebSockets, DSQL quirks) look like they can be managed or overcome, and it is OK to bring in the above contributions first before nailing down exactly how to solve for them

I propose developing and merging the following PRs in sequence, deferring DSQL integration for now:

  • Abstract files to Opendal (or some other abstraction library if Opendal doesn't work for some reason)
  • Add support for S3 as an Opendal file backend
  • Add SES support, ideally via a Lettre Transport that lives somewhere upstream

Please follow up to confirm agreement (or point out where I'm missing something) before I get too deep into the weeds of Opendal :).

@txase (Author) commented Feb 18, 2025

I also wanted to address one point @williamdes brought up in #5591 (comment) about SES having an SMTP interface, which would mean no code changes to Vaultwarden are required. Unfortunately, the SES SMTP proxy requires long-term credentials:

> Important
> Don't use temporary AWS credentials to derive SMTP credentials. The SES SMTP interface doesn't support SMTP credentials that have been generated from temporary security credentials.

When it's so easy to run a service that uses temporary AWS credentials, which is a security best-practice, I have a hard time proposing a solution that requires long-term credentials :). Speaking for myself, I don't want to run a Vaultwarden instance that uses any long-term credentials.
