Is ESM supposed to only work with file:// URLs #1423

Open
mcollina opened this issue Aug 2, 2023 · 13 comments

@mcollina
Member

mcollina commented Aug 2, 2023

As part of nodejs/node#48740 and our recent survey, quite a few people asked for having easy access to __filename and __dirname inside ESM.

Right now to get those values, users have to do a bit of additional work:

import { fileURLToPath } from 'node:url';
const __filename = fileURLToPath(import.meta.url);
const __dirname = fileURLToPath(new URL('.', import.meta.url));

Note that import.meta.url is a string starting with file://, and file:// URLs are not standardized in any way.

After much discussion in the meeting, the distilled version of the question comes in two forms:

  1. should most Node.js APIs and the ecosystem accept that, in JavaScript, files are represented by file://{path} strings?
  2. should most Node.js APIs and the ecosystem work with files using operating-system paths (represented as strings)?

I don't think the two are mutually exclusive. They are just two separate use cases, and catering for only one is problematic.
I hope we can reach a consensus on this, so we can use it to drive further decisions.

@aduh95
Contributor

aduh95 commented Aug 2, 2023

I feel obligated to make a clarification regarding the title of the issue: ESM is supposed to work with any URL – you can host modules from a custom protocol if you so wish. When loading an ESM file, Node.js assigns it a file: URL based on its path. The question is not "do we want to restrict the other protocols?", it's "can we have paths as well as URLs for modules loaded from the FS?".

@GeoffreyBooth
Member

GeoffreyBooth commented Aug 2, 2023

I was asking @mcollina about this on Slack, trying to clarify the question(s) being asked, and in discussion I think we boiled it down to this:

  • It’s a philosophical question: do we yield and make it easier to work with paths in ESM, and the ecosystem doesn’t need to change; or do we stand firm and thereby add pressure on ecosystem tools to accept URL strings or URL instances as input?

@GeoffreyBooth
Member

For added context, import.meta.resolve is ready to be unflagged as soon as someone puts up the PR. Once that’s available without a flag, there will be a lot more URL strings (like import.meta.url) in use in ESM.

@benjamingr
Member

benjamingr commented Aug 2, 2023

I don't think file paths and URLs are mutually exclusive. I can see cases where people would want paths and cases where people would want URLs, depending on what they're building with Node.js.

If we add import.meta.filename for example, I think it addresses a specific use case and is helpful to people (at least according to them via the survey) and we don't make life harder for people using URLs.

So for me:

It’s a philosophical question: do we yield and make it easier to work with paths in ESM, and the ecosystem doesn’t need to change; or do we stand firm and thereby add pressure on ecosystem tools to accept URL strings or URL instances as input?

Generally I prefer that Node.js not dictate things to the ecosystem, but rather listen to what people in the ecosystem are asking for and make them more productive, so from that point of view I'm on the "yield" side (though it's more "listen" than "yield").

@GeoffreyBooth
Member

GeoffreyBooth commented Aug 2, 2023

To answer my own question, I think I’d like to make it easier to work with both path strings and URL strings in ESM. Yes we should add import.meta.filename and dirname, once we get a blessed API from WinterCG; but we should also make URL strings (and therefore, the output of import.meta.resolve) as usable across as many of our APIs as possible. I opened nodejs/node#48994 regarding that.

I see @aduh95’s point that URL strings (or URL instances) are in general better than path strings, and we should be trying to encourage migration to them; just like how I feel that in general ESM is better than CommonJS and we should encourage migration to it too. But I think we should take a more carrots than sticks approach to that, making the “better” path as good as we can make it to try to entice people that way, rather than making the “bad” path harder than it needs to be.

@ljharb
Member

ljharb commented Aug 2, 2023

Ecosystems can’t be forced to change; anything that requires it for ESM makes adoption harder. Migration is achieved by preserving existing patterns and also providing a preferred (by the authors) pattern that’s considered better by the ecosystem (not by the authors).

@JakobJingleheimer

JakobJingleheimer commented Aug 2, 2023

I know this issue is in the TSC repo, but as possibly one of the biggest ESM cheerleaders (and one who has dropped a block on related issues/PRs), I'm very much in favour of easing adoption (in appropriate ways). This is something I myself found wanting in days of yore. That Worker() doesn't work with a relative path is absolutely infuriating and is what literally no-one wants (but technical limitations 🥲).

IF it's truly needed nowadays and import.meta is the right place for it, huzzah (it's easily achieved). I'm not sure if additional APIs are needed, but open to being convinced.

Apologies for the intrusion.

/exit

@aduh95
Contributor

aduh95 commented Aug 2, 2023

I see @aduh95’s point that URL strings (or URL instances) are in general better than path strings, and we should be trying to encourage migration to them

That's not my point nor my opinion, I don't think one is better than the other. A CJS module has a 1-to-1 relationship with a path (and it's easy to expose it via __filename), an ES module has a 1-to-1 relationship with a URL (and it's easy to expose it via import.meta.url); if we try to expose something defined as "the path of the current ES module", that can only work in a subset of cases, and that feels wrong.
I'm personally rather happy with the current status quo (with fileURLToPath and support for URL instances in most built-in APIs), but I hear that's not an experience shared by everyone, I guess there's a tradeoff to find between consistency and UX.

@MoLow
Member

MoLow commented Aug 3, 2023

I think the determination that ESM works with URLs and not paths is only correct from the point of view of the implementation/the spec.
From the viewpoint of an ESM user, the most common case is import * from './some/path', which is only later resolved into a URL.
So as far as users are concerned (which is what we should care about, in my opinion), in the most common case they are working with paths, and the fact that the path is resolved into a URL is an implementation detail that doesn't concern most people in most cases.

an ES module has a 1-to-1 relationship with a URL (and it's easy to expose it via import.meta.url); if we try to expose something defined as "the path of the current ES module", that can only work in a subset of cases, and that feels wrong.

I think an ES module has a 1-to-1 relationship with a specifier, which is not necessarily a URL, right? why is it incorrect for node to expose dirname and filename only in cases where the specifier is a path?

@aduh95
Contributor

aduh95 commented Aug 3, 2023

@MoLow in the case of import * from './some/path', they are using a relative URL, not a path. They might think it's a path, but really it's not, and as soon as you introduce a % char to that use-case, it will become quite glaring that it's indeed not the same thing. By default (assuming we're talking about modules on the FS, which is indeed the most common use-case atm), that specifier (which is a relative URL) will be resolved to an absolute URL, which then will be later internally converted to a path to actually load the module from the FS.

// let's assume import.meta.url === 'file:///root/entry.js'
import "./dir%20name/index.js"; // loads '/root/dir name/index.js'
import "./dir%2520name/index.js"; // loads '/root/dir%20name/index.js'
import "./dir%name/index.js"; // throws
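
The three cases above can be approximated with the global URL class alone. This is only a sketch: resolveToPosixPath is a hypothetical helper, not a Node.js API, and it assumes a POSIX-style file system and a plain relative specifier (no bare specifiers, import maps, or custom loaders).

```javascript
// Assumed importer URL, matching the comment in the example above.
const base = 'file:///root/entry.js';

// Hypothetical helper: resolve a specifier as a relative URL against the
// importer's URL, then percent-decode the resulting path.
function resolveToPosixPath(specifier, parentURL) {
  const url = new URL(specifier, parentURL);
  return decodeURIComponent(url.pathname);
}

// %20 is decoded to a space.
const spaced = resolveToPosixPath('./dir%20name/index.js', base);

// %25 is decoded to a literal percent sign, leaving "%20" in the path.
const literalPercent = resolveToPosixPath('./dir%2520name/index.js', base);

// "%na" is not a valid percent-escape, so decoding fails.
let malformedError;
try {
  resolveToPosixPath('./dir%name/index.js', base);
} catch (error) {
  malformedError = error;
}

console.log(spaced);                            // /root/dir name/index.js
console.log(literalPercent);                    // /root/dir%20name/index.js
console.log(malformedError instanceof URIError); // true
```

Treating the specifier as a path instead would have looked for a directory literally named "dir%20name" in the first case, which is the gap between the two models.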

I think an ES module has a 1-to-1 relationship with a specifier, which is not necessarily a URL, right?

That's not correct: an ES module has a unique URL, and a URL can reference at most one ES module (see footnote 1). Several specifiers can be resolved to the same URL, or the same specifier can be resolved to multiple URLs; there's no 1-to-1 relationship with specifiers.

// import.meta.url === 'file:///root/a.js'
import "./b.js"; // references file:///root/b.js
// import.meta.url === 'file:///root/subdir/a.js'
import '../b.js'; // also references file:///root/b.js
import './b.js'; // references file:///root/subdir/b.js

In the above example, the same specifier ./b.js is resolved to two different URLs (and therefore two different modules), and two different specifiers (./b.js and ../b.js) are resolved to the same URL (therefore the same module).
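
The same many-to-many relationship can be checked with the global URL class, assuming the specifiers are resolved as relative URLs against the importer's URL. This models relative specifiers only and says nothing about bare specifiers or loader hooks.

```javascript
// Importer URLs taken from the example above.
const rootImporter = 'file:///root/a.js';
const subdirImporter = 'file:///root/subdir/a.js';

// Resolve each specifier against the URL of the module that contains it.
const fromRoot = new URL('./b.js', rootImporter).href;
const fromSubdirUp = new URL('../b.js', subdirImporter).href;
const fromSubdir = new URL('./b.js', subdirImporter).href;

// Two different specifiers, one URL (one module).
console.log(fromRoot === fromSubdirUp); // true

// One specifier, two URLs (two modules).
console.log(fromRoot === fromSubdir);   // false
```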

why is it incorrect for node to expose dirname and filename only in cases where the specifier is a path?

Specifiers are never paths. Using a POSIX path as a specifier may or may not load the correct file, or can throw.

Footnotes

  1. Assuming no import attributes. When adding import attributes to the equation, the spec says there can be several modules using the same URL, but Node.js will likely keep the current limitation until there's a use case to justify breaking that assumption.

@GeoffreyBooth
Member

why is it incorrect for node to expose dirname and filename only in cases where the specifier is a path?

Don't forget Node also supports modules loaded from data: URLs and (with a flag) https: URLs. Neither of these can get a dirname or filename value.
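
As a rough illustration, any filename or dirname value would have to be conditional on the module's URL scheme. The helper below is hypothetical (hasFilesystemPath is not a Node.js API) and the example URLs are made up for the sketch.

```javascript
// Hypothetical helper: only file: URLs correspond to a location on disk,
// so only they could meaningfully provide a filename or dirname.
function hasFilesystemPath(moduleURL) {
  return new URL(moduleURL).protocol === 'file:';
}

const fromDisk = hasFilesystemPath('file:///root/entry.js');
const fromData = hasFilesystemPath('data:text/javascript,console.log(1)');
const fromNetwork = hasFilesystemPath('https://example.com/mod.js');

console.log(fromDisk);    // true
console.log(fromData);    // false
console.log(fromNetwork); // false
```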

@benjamingr
Member

@JakobJingleheimer meant to comment earlier but forgot - I don't speak for the TSC and I'm a very new member - but something being in the TSC repo or anywhere else does not mean it's only for the TSC to discuss and feedback/opinions/ideas are always welcome and appreciated.

@jsumners

jsumners commented Aug 3, 2023

@JakobJingleheimer meant to comment earlier but forgot - I don't speak for the TSC and I'm a very new member - but something being in the TSC repo or anywhere else does not mean it's only for the TSC to discuss and feedback/opinions/ideas are always welcome and appreciated.

If that is the case, then...

I'm all for following specs. I'm really quite a pedant about it 99.9999% of the time. However, we are talking about a runtime/framework designed for running JavaScript code as a system application. As such, I expect to be able to work with the filesystem within my scripts without having to jump through translation layers, i.e. URLs. If it is determined that ESM scripts should only use URLs then consider this user as one that will completely write off ESM as not useful in his work and will stick with CJS.
