-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds support for native URL parsing if node API is unavailable #149
base: master
Are you sure you want to change the base?
Conversation
@@ -1,7 +1,19 @@ | |||
|
|||
var URL = require('url'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the intent would be clearer to rather test if the global URL
is set and require it otherwise?
Or we could also rely directly on it since it is available since node@10
.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense - you'd probably know better than me if there are folks depending on this using ancient node versions, but as long as this came in a major bump IMO it'd be simplest to just use the global URL since it is available in all maintenance+ releases of node
lib/clean-host.js
Outdated
@@ -97,7 +109,7 @@ module.exports = function extractHostname(value) { | |||
url = '//' + url; | |||
} | |||
|
|||
var parts = URL.parse(url, null, true); | |||
var parts = parseURL(url); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🏻
Also discovers & resolves a few URL related issues: - WHATWG parses any potentially valid integer string as an IP - add explicit IP detection to avoid this case and preserve existing parsing functionality - WHATWG does not fuzzy parse ipv6 addresses, so add simple regex-based ipv6 extraction to handle this case and preserve existing functionality - WHATWG does not parse scheme-less URLs so change prefixing behavior to include a scheme when falling back to URL-based parsing
d956699
to
2fb0a6c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WHATWG URL parsing turns out to be quite subtly different. I did a pass on the unit tests to get it all passing but there are potentially some concessions here that are undesirable (mainly using some regexes to try to align with the legacy parsing behavior).
var needsTrimming = checkTrimmingNeeded(url); | ||
if (needsTrimming) { | ||
url = url.trim(); | ||
} | ||
|
||
var needsLowerCase = checkLowerCaseNeeded(url); | ||
if (needsLowerCase) { | ||
url = url.toLowerCase(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This diff is a little funky, but I just moved the lowercase check to before the ipv6 check since the later IP checks expect a lowercased string.
url = '//' + url; | ||
url = 'https://' + url; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WHATWG URL does not support scheme-less URLs so I had to add a scheme to get it to parse.
if (ipLikeRE.test(parts.hostname) && !ipLikeRE.test(value)) { | ||
return value; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WHATWG URL will parse the string http://42
as the URL http://0.0.0.42
, which is not how the legacy URL parser worked (it would fail to parse). This just aligns them. A slight difference in this is that when value
is not a string, we will no longer coerce it to one when returning it. I believe this is closer to the documented API but is technically a breaking change. LMK if you'd like me to restore the test and the string-coercion.
expect(tld.extractHostname(42)).to.equal('42'); | ||
expect(tld.extractHostname(42)).to.equal(42); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned above - "return the initial value" feels like it should return an integer, not a string. But this is a change in behavior.
Fixes #148