Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(legacy): improve deterministic build of the polyfill bundle by sorting the polyfills discovered by babel #16566

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

shankerwangmiao
Copy link

Description

The polyfill bundle generated by the legacy plugin is non deterministic, because the order of calling renderChunk() affects the input of the finally generated polyfill bundle.

This PR solves this issue by sorting the polyfills in need discovered by babel. The polyfills added directly from the plugin configuration is kept ordered the first, same as before.

Previously, #13672 (comment) mentioned this issue.

Copy link

stackblitz bot commented Apr 30, 2024

Review PR in StackBlitz Codeflow Run & review this pull request in StackBlitz Codeflow.

Copy link
Member

@bluwy bluwy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we sort (and convert to array for) the polyfills before these two spots instead?

isDebug &&
console.log(
`[@vitejs/plugin-legacy] modern polyfills:`,
modernPolyfills,
)

isDebug &&
console.log(
`[@vitejs/plugin-legacy] legacy polyfills:`,
legacyPolyfills,
)

We can then pass the sorted array to buildPolyfillChunk() below them and it should be deterministic?

@bluwy bluwy added plugin: legacy p2-nice-to-have Not breaking anything but nice to have (priority) labels May 1, 2024
@shankerwangmiao
Copy link
Author

Could we sort (and convert to array for) the polyfills before these two spots instead?

isDebug &&
console.log(
`[@vitejs/plugin-legacy] modern polyfills:`,
modernPolyfills,
)

isDebug &&
console.log(
`[@vitejs/plugin-legacy] legacy polyfills:`,
legacyPolyfills,
)

We can then pass the sorted array to buildPolyfillChunk() below them and it should be deterministic?

Hi, thanks for your review. It could be simpler to directly sort the array here. However, I think the ordering of polyfill groups should be kept as before. Originally, the corejs polyfills specified in options.{modern,legacy}Polyfills comes the first, followed by those specified in additional{Modern,Legacy}Polyfills, and those discovered by babel. If we directly sort the combined polyfills list, this ordering will be broken, and some projects may rely on this to work properly. This proposed change keeps this ordering, and only sort the polyfills generated by babel.

@shankerwangmiao
Copy link
Author

Also some issues with the CI. I found that the CI sometimes fails on tests not related with legacy plugin.

See:

They should be running on the same changes, but I have no idea why some of the tests failed

@bluwy
Copy link
Member

bluwy commented May 2, 2024

Yeah I'd ignore the CI fails that aren't related to plugin-legacy. It's flaky lately.

If we directly sort the combined polyfills list, this ordering will be broken, and some projects may rely on this to work properly.

I did some digging and found babel/babel-polyfills#192. So it seems like some order matters and presumably babel-preset-env will respect that when it returns. But in your PR, you're sorting the list too, which is incorrect?

In which case, the current behaviour is safer wrt ordering. Otherwise it's better to sort the entire array based based on core-js-compat/modules.json instead?

@shankerwangmiao
Copy link
Author

Ah, thanks for pointing it out. I'll see how to make the polyfill list stable and following the correct ordering requirement.

@shankerwangmiao shankerwangmiao force-pushed the fix-legacy-polyfill-deterministic branch from 2b44e8a to 3f53346 Compare May 2, 2024 15:56
@shankerwangmiao
Copy link
Author

I did some digging and found babel/babel-polyfills#192. So it seems like some order matters and presumably babel-preset-env will respect that when it returns. But in your PR, you're sorting the list too, which is incorrect?

In which case, the current behaviour is safer wrt ordering. Otherwise it's better to sort the entire array based based on core-js-compat/modules.json instead?

Hi, I believe sorting according to core-js-compat/modules.json is definitely correct. However, by doing that, we are skipping the abstract layer provided by babel/preset-env, so I don't think it will be a clean solution.

My idea is to record all the generated polyfills and their ordering in each chunk and generate a combined list of polyfills by doing a topological sort, and generating a lexicographically minimum one to make the result stable.

However, by implementing it, as you can see from the CI result, I discovered that the order of polyfills given by babel is not strictly following the order in core-js-compat/modules.json. For example, in playground/legacy, the polyfills in the legacy chunk assets/chunk-main-legacy.!~{003}~.js is

core-js/modules/es.symbol.js
core-js/modules/es.symbol.description.js
core-js/modules/es.symbol.async-iterator.js
core-js/modules/es.symbol.iterator.js
core-js/modules/es.symbol.to-string-tag.js
core-js/modules/es.error.cause.js
core-js/modules/es.array.concat.js
core-js/modules/es.array.from.js
core-js/modules/es.array.iterator.js
core-js/modules/es.array.join.js
core-js/modules/es.array.map.js
core-js/modules/es.array.push.js
core-js/modules/es.function.name.js
core-js/modules/es.json.stringify.js
core-js/modules/es.json.to-string-tag.js
core-js/modules/es.map.js
core-js/modules/es.math.to-string-tag.js
core-js/modules/es.object.freeze.js
core-js/modules/es.object.get-prototype-of.js
core-js/modules/es.object.keys.js
core-js/modules/es.array.slice.js
core-js/modules/es.object.to-string.js
core-js/modules/es.promise.js
core-js/modules/es.set.js
core-js/modules/es.string.ends-with.js
core-js/modules/es.regexp.exec.js
core-js/modules/es.regexp.test.js
core-js/modules/es.regexp.to-string.js
core-js/modules/es.string.iterator.js
core-js/modules/es.string.raw.js
core-js/modules/esnext.iterator.constructor.js
core-js/modules/esnext.iterator.for-each.js
core-js/modules/esnext.iterator.map.js
core-js/modules/esnext.set.difference.v2.js
core-js/modules/esnext.set.intersection.v2.js
core-js/modules/esnext.set.is-disjoint-from.v2.js
core-js/modules/esnext.set.is-subset-of.v2.js
core-js/modules/esnext.set.is-superset-of.v2.js
core-js/modules/esnext.set.symmetric-difference.v2.js
core-js/modules/esnext.set.union.v2.js
core-js/modules/web.dom-collections.for-each.js
core-js/modules/web.dom-collections.iterator.js
core-js/modules/web.dom-exception.constructor.js
core-js/modules/web.dom-exception.stack.js
core-js/modules/web.dom-exception.to-string-tag.js
core-js/modules/web.structured-clone.js
core-js/modules/web.url.js
core-js/modules/web.url.to-json.js
core-js/modules/web.url-search-params.js
core-js/modules/web.url-search-params.delete.js
core-js/modules/web.url-search-params.has.js
core-js/modules/web.url-search-params.size.js

We can see core-js/modules/es.array.slice.js is ordered after core-js/modules/es.object.keys.js which is opposite to the ordering in core-js-compat/modules.json.

I wonder whether this is a bug in babel or intended.

@shankerwangmiao shankerwangmiao force-pushed the fix-legacy-polyfill-deterministic branch from 3f53346 to 9a98cf4 Compare May 2, 2024 18:45
@shankerwangmiao
Copy link
Author

This issue is addressed by adding a circle breaking mechanism.

@yoyo837
Copy link
Contributor

yoyo837 commented May 7, 2024

Thanks for the great job. Really looking forward to this fix.

@bluwy
Copy link
Member

bluwy commented May 8, 2024

Thanks for investigating this @shankerwangmiao. Seems like babel-preset-env might have a bug with the ordering but I can't find out why.

From the current implementation, I don't understand why we need a complex sorting algorithm to sort it though. Why not do a single final sort based on core-js-compat/modules.json? The sort function should be something like this.

@shankerwangmiao
Copy link
Author

From the current implementation, I don't understand why we need a complex sorting algorithm to sort it though. Why not do a single final sort based on core-js-compat/modules.json? The sort function should be something like this.

Because there is no insurance that someday babel won't add other polyfill dependency from other packages, I believe not directly relying on the core-js-compat/modules.json, but relying on solely the output of babel would be a better approach.

@bluwy
Copy link
Member

bluwy commented May 8, 2024

If babel adds another polyfill, wouldn't that still be fine? It shouldn't be having a specific order if it's not core-js, but if it needs a specific order, we can add it later too.

Perhaps what I don't understand too is that, if babel-preset-env outputs imports that aren't in the correct order, how does a sorting algorithm fix into a correct order, and especially would always respect core-js-compat/modules.json without reading it?

I can see that it's in the correct order now locally, but will it always be in the correct order if, let's say, there's only one chunk which babel outputs in an incorrect order?

@shankerwangmiao
Copy link
Author

If babel adds another polyfill, wouldn't that still be fine? It shouldn't be having a specific order if it's not core-js, but if it needs a specific order, we can add it later too.

Perhaps what I don't understand too is that, if babel-preset-env outputs imports that aren't in the correct order, how does a sorting algorithm fix into a correct order, and especially would always respect core-js-compat/modules.json without reading it?

I can see that it's in the correct order now locally, but will it always be in the correct order if, let's say, there's only one chunk which babel outputs in an incorrect order?

The basic assumption is that the ordering given by babel is (or should be) always right, no matter whether the polyfills are from core-js or other sources. We utilize this information rather than the information from core-js-compat/modules.json to keep all the polyfills in the correct order. The "correct order" means that if a polyfill A comes before B in a chunk, A will come before B in the final output; if a polyfill A and a polyfill B never appear in the same chunk, there will be no ordering constrain between them.

@bluwy
Copy link
Member

bluwy commented May 9, 2024

Even if babel is correctly outputting in the right order today, there's still the problem where additionalModernPolyfills or additionalLegacyPolyfills get merged and there are not in the right order. Having in the incorrect order is very unlikely to be intended, so we should also sort and fix it up.

I get that fixing the ordering issue wasn't the initial goal of this PR, it was to only make it deterministic. But knowing now that the bug exist, and the way to solve this is to sort based on core-js-compat/modules.json, we can do that directly to kill two birds with one stone? Otherwise we have to followup with another PR that does that.

@shankerwangmiao
Copy link
Author

You are right about this. My original point is to only keep the output deterministic, not to solve the ordering issue.

If we sort according to core-js-compat/modules.json, what if the user choose to include polyfills from other than those provided by core-js in additional*Polyfills? Also, currently, additional*Polyfills comes before the polyfills imported by babel, what if the user relies on the ordering?

@bluwy
Copy link
Member

bluwy commented May 9, 2024

What I had in mind is that, if there's any polyfills that aren't within core-js-compat/modules.json, then we keep the ordering as-is. However, it is more complex I think 🤔 For example:

// es.array.slice must be before es.object.keys

// Case 1:
const additional =  ['core-js/modules/es.array.slice.js', 'something-else.js']
const babel = ['core-js/modules/es.object.keys.js']
const result = ['core-js/modules/es.array.slice.js', 'something-else.js', 'core-js/modules/es.object.keys.js']

// Case 2:
const additional =  ['core-js/modules/es.object.keys.js', 'something-else.js']
const babel = ['core-js/modules/es.array.slice.js']
const result = ['core-js/modules/es.array.slice.js', 'core-js/modules/es.object.keys.js', 'something-else.js']
// Move es.array.slice before es.object.keys

I'm not sure if a .sort algorithm would be enough to do the trick, but I think keeping the basic idea of moving things to the front should make sure non-corejs modules are not simply moved entirely to last. In other words, non-corejs modules if it has another polyfill before it, there shouldn't be something injected between them. I might need to think about this more.

@shankerwangmiao
Copy link
Author

I guess a simple sort is not enough. Maybe we need topology sorting. However, I have no idea on how to deal with contradictory ordering constrains.

@Huodoo
Copy link

Huodoo commented May 14, 2024

I guess a simple sort is not enough. Maybe we need topology sorting. However, I have no idea on how to deal with contradictory ordering constrains.

Maybe a sorting algorithm is enough. This will just sort the polyfill import statement

@bluwy
Copy link
Member

bluwy commented May 14, 2024

I had a few ideas recently to solve the ordering issue, but all are quite complex. @shankerwangmiao I'm happy to solve the deterministic issue for now, but I think we could simplify the implementation.

The forth argument of renderChunk has meta.chunks. This object is properly ordered. On the first renderChunk call, we can create a map like this:

chunkFileNameToPolyfills = new Map<string, { modern: Set<string>, legacy: Set<string> }>();
for (const fileName in meta.chunks) {
  chunkFileNameToPolyfills.set(fileName,  { modern: new Set(), legacy: new Set() }
}

And then later on the code in renderChunk will always only populate the sets in chunkFileNameToPolyfills. And finally in generateBundle, we can iterate chunkFileNameToPolyfills and combine it all back into the main modernPolyfills and legacyPolyfills. What do you think?

@shankerwangmiao shankerwangmiao force-pushed the fix-legacy-polyfill-deterministic branch from 9a98cf4 to e1ca296 Compare May 14, 2024 10:05
@shankerwangmiao
Copy link
Author

shankerwangmiao commented May 14, 2024

I followed your suggestion, but with a simpler solution. In renderChunk(), chunkName and polyfills are recorded in an array. In generateBundle(), we simply sort it according the chunkName, and combine the polyfills together.

The change made to the original process is simple: only the order of polyfill groups generated by chunks and pushed to {legacy,modern}Polyfills is changed.

@shankerwangmiao shankerwangmiao force-pushed the fix-legacy-polyfill-deterministic branch from e1ca296 to 9573279 Compare May 14, 2024 10:12
@Huodoo
Copy link

Huodoo commented May 14, 2024

I had a few ideas recently to solve the ordering issue, but all are quite complex. @shankerwangmiao I'm happy to solve the deterministic issue for now, but I think we could simplify the implementation.

The forth argument of renderChunk has meta.chunks. This object is properly ordered. On the first renderChunk call, we can create a map like this:

chunkFileNameToPolyfills = new Map<string, { modern: Set<string>, legacy: Set<string> }>();
for (const fileName in meta.chunks) {
  chunkFileNameToPolyfills.set(fileName,  { modern: new Set(), legacy: new Set() }
}

And then later on the code in renderChunk will always only populate the sets in chunkFileNameToPolyfills. And finally in generateBundle, we can iterate chunkFileNameToPolyfills and combine it all back into the main modernPolyfills and legacyPolyfills. What do you think?

modernPolyfills and legacyPolyfills is just used for generate a file like

import "core-js/modules/es.regexp.exec.js";import "core-js/modules/es.regexp.test.js";import "core-js/modules/es.regexp.to-string.js";import "core-js/modules/es.string.match.js";import "core-js/modules/es.string.replace.js";

#16668

@bluwy
Copy link
Member

bluwy commented May 14, 2024

@Huodoo please read the above discussion, particularly #16566 (comment). The import order matters.

Comment on lines 265 to 267
.sort((a, b) =>
a.chunkName < b.chunkName ? -1 : a.chunkName > b.chunkName ? 1 : 0,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't sort it this way because the chunk order matters and it would be good to respect that, hence I suggested using meta.chunks because it gives the right order or chunks.

This way we also prevent a large polyfill re-order change after this PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, the PR is updated as requested. Now the ordering of the chunks is taken from the forth parameter of renderChunk

@shankerwangmiao shankerwangmiao force-pushed the fix-legacy-polyfill-deterministic branch from 9573279 to 4eee61f Compare May 14, 2024 15:45
@shankerwangmiao shankerwangmiao force-pushed the fix-legacy-polyfill-deterministic branch from 4eee61f to 37c344f Compare May 14, 2024 16:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
p2-nice-to-have Not breaking anything but nice to have (priority) plugin: legacy
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants