Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CopyFile: support assets, archives, and recursive copying #423

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

thomas11
Copy link
Contributor

@thomas11 thomas11 commented Apr 30, 2024

This PR allows the CopyFile resource to recursively copy directories as well, similar to scp -r.

This PR enhances the CopyFile resource in a few ways.

  • It can now copy directories as well, recursively with their contents.
  • It now takes the standard Pulumi asserts and archives as input, allowing for seamless interop with other resources.
  • It now checks whether the specified file or directory have changed (in content, not timestamp) and copies only if they did.

In light of these changes, I've renamed CopyFile to just Copy which is a breaking change.

Resolves #23
Resolves #33
Resolves #42

TODO

  • The current implementation fails as soon as a file or directory already exists on the remote, like the previous CopyFile.
  • Before GA, should we rename CopyFile to Copy or RemoteCopy or insert your proposal here?
  • The integration test copies a bunch of EC2 setup code with TestEc2RemoteTs, which we should share instead. What would be a good way to do that, given the code is in the TS tests, not in the Go driver?

Design considerations

The behavior of the copy operation was modeled after cp and scp.

source | dest - exists as dir | dest - does not exist | dest - exists as file
-------|----------------------|-----------------------|-----------------------
dir    | dest/dir             | dest/dir              | error
dir/   | dest/x for x in dir  | dest/dir              | error
file   | dest/file            | dest                  | dest (overwritten)

Specifically:

  • When copying a directory, we overwrite existing files. (The current CopyFile resource for single files does that, too.)
  • When copying a directory, we don't clear the remote directory first so that no left-over files from previous copies will be around. We can always add a flag for that if needed.

Implementation notes

  • I've looked around for open source Go packages implementing scp of folders, but to my surprise, I couldn't find one with an acceptable license that 1) wasn't ancient and 2) was cross-platform.
  • The current implementation copies files sequentially and is therefore probably slow for large trees. If we replaced the implementation, I don't think the shape of the resource would change, so this shouldn't be a GA blocker.
  • The size.ts file in both steps of the integration test is useless because it always has the same value. However, I found that without a TS file present in the additional step, it would not be run. Haven't debugged further.

provider/pkg/provider/remote/copy.go Outdated Show resolved Hide resolved
provider/pkg/provider/remote/copy.go Outdated Show resolved Hide resolved
provider/pkg/provider/remote/copy.go Outdated Show resolved Hide resolved
provider/pkg/provider/remote/copy.go Outdated Show resolved Hide resolved
provider/pkg/provider/remote/copyController.go Outdated Show resolved Hide resolved
provider/pkg/provider/remote/copyController.go Outdated Show resolved Hide resolved
examples/ec2_dir_copy/index.ts Outdated Show resolved Hide resolved
provider/pkg/provider/remote/copyController.go Outdated Show resolved Hide resolved
provider/pkg/provider/remote/copyController.go Outdated Show resolved Hide resolved
provider/pkg/provider/remote/copyController.go Outdated Show resolved Hide resolved
}
defer local.Close()

remote, err := sftp.Create(dst)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the parents of dst do not exist?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When copyFile is called from our copyDir, that shouldn't be possible.

When it's called directly because the user wants to copy a file, one could argue that it should fail because the user asked to copy to a non-existant location. It's one of the questions in my (internal) ideation doc.

@thomas11 thomas11 force-pushed the tkappler/archive branch 2 times, most recently from df31734 to 3fcd01d Compare May 17, 2024 08:32
@thomas11 thomas11 requested review from danielrbradley and a team May 22, 2024 17:41
Copy link
Member

@iwahbe iwahbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good to me.

provider/pkg/provider/remote/copyController.go Outdated Show resolved Hide resolved
Comment on lines +35 to +36
Asset *asset.Asset `pulumi:"asset,optional"`
Archive *archive.Archive `pulumi:"archive,optional"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these defined as two properties rather than just having the Asset field?

AFAIK, when a property is of type Asset, that automatically allows the user to pass in an Archive. You can see this in the SDK when your Asset fields actually get generated with the type AssetOrArchive. This whole heirarchy of types is somewhat confusing for users, but I think it makes it superfluous to include both fields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oooh interesting! I just looked at the definitions of Asset and Archive and saw that they're separate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the design of Assets overall is pretty confusing on both the provider and consumer sides 🤷

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works a bit differently because it's using raw property values rather than the new framework. Perhaps this is a missing feature in the go-provider to detect which kind of asset we've got at runtime?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the question is also: what's the difference between the asset/archive types in sdk/v3/go/common/resource and sdk/v3/go/pulumi

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This discussion led to filing pulumi/pulumi-go-provider#237

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants