Handle lost sparseness in non http data sources #3213
base: main
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED
/test all
/retest
/test pull-containerized-data-importer-fossa
Really nice, just a couple of comments
return info.VirtualSize >= imageContentSize, nil
fmt.Fprintf(ginkgo.GinkgoWriter, "INFO: VerifySparse comparison: OriginalVirtual: %d vs SizeOnDisk: %d\n", originalVirtualSize, info.ActualSize)
// The size on disk of a sparse image is significantly lower than the image's virtual size
return originalVirtualSize-info.ActualSize >= units.MB, nil
Probably intentional, but why use units.MB instead of the previously used MiB?
Yeah, intentional, but it's been a while so I don't remember the exact reasoning; I'll have to take a look.
What is the 1MB fudge factor all about?
It's probably to ensure we align the size on 1MiB. qemu gets unhappy if the image is not aligned on the actual storage block size.
I think this check is a little too strict; I'm wondering why the current virtual size is not appropriate.
So I looked up the 1MiB->1MB switch.
It's because qemu-img convert only manages to shave off one megabyte (not mebibyte) for our test image:
INFO: VerifySparse comparison: OriginalVirtual: 18874368 vs SizeOnDisk: 17829888
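To make the MB-vs-MiB choice concrete, here is a standalone sketch (not CDI code; the constants mirror units.MB and units.MiB from docker/go-units) plugging in the numbers from that log line:

```go
package main

import "fmt"

const (
	mb  = 1000 * 1000 // decimal megabyte, like units.MB
	mib = 1024 * 1024 // binary mebibyte, like units.MiB
)

// savings is how many bytes sparsification shaved off the on-disk footprint.
func savings(originalVirtualSize, actualSizeOnDisk int64) int64 {
	return originalVirtualSize - actualSizeOnDisk
}

func main() {
	// Numbers from the test log above.
	saved := savings(18874368, 17829888)
	fmt.Println(saved)        // 1044480
	fmt.Println(saved >= mb)  // true: the check passes with units.MB
	fmt.Println(saved >= mib) // false: it would fail with units.MiB
}
```

Since 1,044,480 falls between 1,000,000 and 1,048,576, a MiB threshold would reject this particular converted image while an MB threshold accepts it.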
So this check is specific to a particular image?
No, you pass the virtual size of the used image as a parameter
pkg/importer/upload-datasource.go
@@ -51,8 +54,7 @@ func (ud *UploadDataSource) Info() (ProcessingPhase, error) {
if ud.contentType == cdiv1.DataVolumeArchive {
return ProcessingPhaseTransferDataDir, nil
}
if !ud.readers.Convert {
// Uploading a raw file, we can write that directly to the target.
if ud.directWriteException {
Maybe pass the source content type and do the sourceContentType == common.BlockdeviceClone check here? It makes this easier to debug and understand than having an arbitrary directWriteException that's not too informative. Just a nit though, feel free to ignore.
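A minimal sketch of that suggestion, with hypothetical names (needsDirectWrite and the string value stand in for the real fields; the actual value of common.BlockdeviceClone is assumed): decide based on the source content type instead of an opaque boolean flag.

```go
package main

import "fmt"

// blockdeviceClone stands in for common.BlockdeviceClone; the actual
// constant value in CDI may differ.
const blockdeviceClone = "blockdevice-clone"

// needsDirectWrite sketches the reviewer's idea: the decision to bypass
// conversion is derived from the source content type at the call site,
// which is self-documenting when stepping through the code.
func needsDirectWrite(sourceContentType string) bool {
	return sourceContentType == blockdeviceClone
}

func main() {
	fmt.Println(needsDirectWrite(blockdeviceClone)) // true: write directly to the target
	fmt.Println(needsDirectWrite("archive"))        // false: goes through the normal path
}
```

The trade-off discussed below is that the upload data source then has to know about source content types at all.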
Yeah, that makes sense
It could be a little awkward having it know about the source content type
/hold
@@ -284,24 +283,15 @@ func (f *Framework) VerifyBlankDisk(namespace *k8sv1.Namespace, pvc *k8sv1.Persi
}

// VerifySparse checks a disk image being sparse after creation/resize.
func (f *Framework) VerifySparse(namespace *k8sv1.Namespace, pvc *k8sv1.PersistentVolumeClaim, imagePath string) (bool, error) {
func (f *Framework) VerifySparse(namespace *k8sv1.Namespace, pvc *k8sv1.PersistentVolumeClaim, imagePath string, originalVirtualSize int64) (bool, error) {
Why not get rid of this function and check !VerifyImagePreallocated()?
Mainly because of the third point in the PR description
The third issue is that we almost always resize which results in significantly larger virtual size.
What we really care about is comparing against the original virtual size of the image.
We could get rid of one of them by negating the other, but those additions have to stay.
What we really care about is comparing against the original virtual size of the image.
Why?
Since the current virtual size is post-resize, you always end up comparing 1Gi or more (the requested disk image size) against a modest couple of megabytes on disk, which is always going to pass, a sort of false negative.
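A toy before/after sketch of that comparison (hypothetical helpers, not the actual VerifySparse code): the old check pits the post-resize virtual size against the on-disk size, while the new one measures savings relative to the image's original virtual size.

```go
package main

import "fmt"

const mb = 1000 * 1000 // decimal megabyte, like units.MB

// oldCheck (sketch): the current virtual size is the requested disk size
// after a resize (e.g. 1Gi), so this comparison nearly always passes.
func oldCheck(currentVirtualSize, actualSizeOnDisk int64) bool {
	return currentVirtualSize >= actualSizeOnDisk
}

// newCheck (sketch): compare against the *original* virtual size and
// require the on-disk footprint to be meaningfully (>= 1 MB) smaller.
func newCheck(originalVirtualSize, actualSizeOnDisk int64) bool {
	return originalVirtualSize-actualSizeOnDisk >= mb
}

func main() {
	// A fully allocated 18MiB image (sparseness lost), PVC resized to 1Gi:
	fmt.Println(oldCheck(1<<30, 18874368))    // true: false negative, lost sparseness goes unnoticed
	fmt.Println(newCheck(18874368, 18874368)) // false: correctly flagged as not sparse
	// A genuinely sparse image, using the numbers from the log above:
	fmt.Println(newCheck(18874368, 17829888)) // true
}
```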
The first issue is that we look at content size since #2168, which is not beneficial in the sparseness check as opposed to disk usage. The second issue is that the check we have in tests for sparseness is not honest because of the "equal to" part. The virtual size has to be strictly greater than the content (as the content is displayed by tools that understand sparseness). The third issue is that we almost always resize, which results in a significantly larger virtual size. What we really care about is comparing against the original virtual size of the image. Signed-off-by: Alex Kalenyuk <[email protected]>
Basically a follow-up to #3219 for upload sources, which suffer from the same issue of losing sparseness. Signed-off-by: Alex Kalenyuk <[email protected]>
- ImageIO raw images are not passed through any sparsification mechanism - VDDK ones do, but the fake plugin used for testing lies that they're not Signed-off-by: Alex Kalenyuk <[email protected]>
/retest
@akalenyu: The following test failed:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /lifecycle stale
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /lifecycle rotten
/remove-lifecycle rotten
What this PR does / why we need it:
Handle some non-http sources where we lose sparseness similarly to #3219
For this to be complete, we must fix the broken sparseness check in e2e so it can guard against regressions.
The first issue is that we look at content size since #2168, which, unlike disk usage, is not useful for a sparseness check.
The second issue is that the check we have in tests for sparseness is not honest because of the "equal to" part.
The virtual size has to be strictly greater than the size on disk.
The third issue is that we almost always resize, which results in a significantly larger virtual size.
What we really care about is comparing against the original virtual size of the image.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #
Special notes for your reviewer:
Release note: