Multiple slashes
Multiple consecutive slashes are allowed by the standards and mean something different from a single slash. That is, "/foo/bar" should be parsed into the path segments "foo" and "bar", while "/foo//bar" yields "foo", "" and "bar".
Some people use significant empty segments in their paths (see this). However, in the most common case multiple slashes are not significant and are an unintended consequence of bad serialization.
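To make the difference concrete, here is a minimal Python sketch (not Galimatias code; the function names `path_segments` and `collapse_empty_segments` are made up for illustration). It assumes an absolute path and treats collapsing as an explicit, separate step, since doing it silently would destroy significant empty segments.

```python
from typing import List


def path_segments(path: str) -> List[str]:
    """Split an absolute URL path into segments, keeping empty ones."""
    if not path.startswith("/"):
        raise ValueError("this sketch only handles absolute paths")
    return path[1:].split("/")


def collapse_empty_segments(segments: List[str]) -> List[str]:
    """Opt-in, lossy step: drop empty segments except the last one.

    The last segment is kept even when empty, because it represents
    a trailing slash rather than a doubled slash.
    """
    kept = [s for s in segments[:-1] if s != ""]
    kept.append(segments[-1])
    return kept


print(path_segments("/foo/bar"))    # ['foo', 'bar']
print(path_segments("/foo//bar"))   # ['foo', '', 'bar']
print(collapse_empty_segments(path_segments("/foo//bar")))  # ['foo', 'bar']
```

After collapsing, "/foo//bar" and "/foo/bar" become indistinguishable, which is exactly the information loss described above.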
Trailing slash
It's generally accepted that a trailing slash can be added to a URL path if there is no "file extension" (e.g. /foo -> /foo/, but not /foo.html -> /foo.html/). However, that changes semantics according to RFC 3986 and might break well-formed URLs in lots of cases.
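The "no file extension" heuristic could look roughly like the following Python sketch. The function name `add_trailing_slash` is hypothetical, and using `posixpath.splitext` to decide what counts as an extension is an assumption of this example, not something any standard defines.

```python
import posixpath


def add_trailing_slash(path: str) -> str:
    """Append "/" unless the last segment looks like a file name.

    Heuristic only: "looks like a file name" means splitext finds
    an extension in the last path segment.
    """
    if path.endswith("/"):
        return path
    last_segment = path.rsplit("/", 1)[-1]
    _, extension = posixpath.splitext(last_segment)
    if extension:
        return path
    return path + "/"


print(add_trailing_slash("/foo"))       # /foo/
print(add_trailing_slash("/foo.html"))  # /foo.html
print(add_trailing_slash("/api/v1.2"))  # /api/v1.2 (".2" is taken for an extension)
```

The last call shows how fragile the heuristic is: a version-like segment is mistaken for a file name, while an extensionless resource such as /foo gets rewritten to a URL the server may treat differently.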
Further considerations
Both of these normalizations can break standards-compliant URLs, so they should be optional and the user should be warned. Also, when this normalization is performed (during parsing or after parsing) matters, since it can change the result of resolving /../.
The proper way to handle these cases (as Google seems to do) is to normalize according to the result of fetching the URL, following redirects and honoring <link rel="canonical">.
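The ordering problem can be shown with a small Python sketch. `remove_dot_segments` here is a simplified, segment-based stand-in for the RFC 3986 algorithm that only handles absolute paths, and `collapse_slashes` is a hypothetical helper; neither is taken from Galimatias.

```python
import re


def collapse_slashes(path: str) -> str:
    """Replace every run of slashes with a single slash (lossy)."""
    return re.sub(r"/{2,}", "/", path)


def remove_dot_segments(path: str) -> str:
    """Simplified dot-segment removal for absolute paths."""
    segments = path.split("/")[1:]
    out = []
    for i, seg in enumerate(segments):
        is_last = i == len(segments) - 1
        if seg == ".":
            if is_last:
                out.append("")  # "/a/." resolves to "/a/"
        elif seg == "..":
            if out:
                out.pop()       # ".." removes the previous segment, even an empty one
            if is_last:
                out.append("")
        else:
            out.append(seg)
    return "/" + "/".join(out)


path = "/a/b//../c"
print(remove_dot_segments(path))                    # /a/b/c
print(remove_dot_segments(collapse_slashes(path)))  # /a/c
```

When dot segments are resolved first, ".." consumes the empty segment and the result is /a/b/c. When slashes are collapsed first, ".." consumes "b" and the result is /a/c. The same input therefore points at two different resources depending on where the normalization runs.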
Because of all of this, I still doubt that providing these normalizations in Galimatias is a sane choice.