
Don't build/serve website from NGINX #3636

Open
MattIPv4 opened this issue Feb 20, 2024 · 4 comments


MattIPv4 commented Feb 20, 2024

The website is now served from Vercel, so serving it from NGINX as well is redundant. The NGINX copy, available at origin.nodejs.org, is also being indexed by Google, so removing it should resolve that.

Note that dist, download, docs, api, etc. should still be served from NGINX.

The site build script can be removed, as well as various parts of the NGINX config relating to the serving of Next.js:

# Sets a custom 404 error page for the whole Server region
# we use the Next.js generated 404 page for all the 404's of our Website
# including for binaries, assets and et cetera.
error_page 404 @localized_404;

# This Location directive is the primary Location directive for any request not handled by the
# mutual exclusivity Location directives (the ones starting with ^~) and pretty much handles the requests for the Website pages
# as in general all other requests should not fall here.
location / {
    # We rewrite all Website pages ending with a trailing slash, removing the trailing slash
    # This is done because our Next.js deployment doesn't use trailingSlash, in other words
    # /en/blog actually translates into /en/blog.html within Next.js built (exported) files
    # Removing the trailing slash from the request allows us to do an external permanent redirect
    # that will then fall back to this same Location block.
    # For all Website pages we won't have any single **/index.html, meaning that we don't need to
    # test for $uri/
    rewrite ^/(.*)/$ /$1 permanent;
    # Tries the $uri first and if there's no $uri for that e.g. /en/blog
    # it attempts with /en/blog.html, which for the Website will exist.
    # This is basically a rewrite to remove the ".html" extension from our Website pages
    # NOTE: By disabling the trailingSlash config option on Next.js, fewer folders need to be created.
    # If a file doesn't exist, it attempts to invoke the @english_fallback, as in most cases
    # for the Website, it means that, for example, /es/blog will not exist, but /en/blog exists
    # so it attempts to open that page in its English version. Note that @english_fallback
    # will only redirect two-letter-code pages to English ones, everything else goes right to 404.
    try_files $uri $uri.html @english_fallback;
    location ~ \.json$ {
        add_header access-control-allow-origin *;
    }
}
# This Location is used for handling static Next.js files. As we don't want to log access
# to static directories and also we don't want to log not found requests here
# As this is a static directory that in theory should not change over time, we disable access
# logs and also cache 404's errors as this folder contents change completely on every
# We don't use ^~ as there are other Rewrite directives below that should also be taken into consideration
# before failing the request with a 404 if it doesn't exist
location /static {
    access_log off;
    log_not_found off;
    open_file_cache_errors on;
}
# This Location directive is used to handle Next.js internal _next directory
# As this is an internal directory requested by Next.js itself, we disable access
# logs and also cache 404's errors as this folder contents change completely on every build
# We use ^~ to tell NGINX to not process any other Location directive or Rewrite after this match
location ^~ /_next {
    access_log off;
    log_not_found off;
    open_file_cache_errors on;
}

# When a website 404 occurs, attempt to load the English version of the page
# if the request was for a localised page.
# Also, store the original language of the request if it was localised
# We'll use this language for the 404 in the try_files in @localized_404
location @english_fallback {
    # @TODO: Handle Localization Fallback through Next.js SSR as this is a hacky approach and requires
    # continuous maintenance of the supported languages
    if ($uri ~* ^/(ar|be|ca|de|es|fa|fr|gl|id|it|ja|ka|ko|nl|pt-br|ro|ru|tr|uk|zh-cn|zh-tw)/) {
        set $lang $1;
    }
    rewrite ^/(ar|be|ca|de|es|fa|fr|gl|id|it|ja|ka|ko|nl|pt-br|ro|ru|tr|uk|zh-cn|zh-tw)/(.*)$ /en/$2;
}
# This location directive handles all 404 responses for the server
# If the request was a localised website page, use the requested language
# as set by the @english_fallback location block
# Otherwise, this will fallback to $lang being "en" as defined numerous lines above
location @localized_404 {
    # We disable caching of 404 pages as we always want Cloudflare to check if the file now exists
    # Some 404s may be caused by the server reaching maximum concurrent file system open() requests
    # Disabling cache allows Cloudflare to re-evaluate the same $uri once our server recovers and then properly cache it
    add_header Cache-Control "private, no-store, max-age=0" always;
    # If this was a rewritten i18n request from @english_fallback, use the localized 404
    # If there is no 404 page for that locale, fallback to the English 404
    # As a last resort, fallback to NGINX's default 404. This should never happen, and will emit a [crit]
    try_files /$lang/404.html /en/404.html =404;
}

- "/home/nodejs/build-site.sh nodejs"

- "*/5 * * * * nodejs /home/nodejs/check-build-site.sh nodejs"

if [ "X$site" != "Xiojs" ] && [ "X$site" != "Xnodejs" ]; then
  echo "Usage: check-build-site.sh < iojs | nodejs >"

if [ "X$site" != "Xiojs" ] && [ "X$site" != "Xnodejs" ]; then
  echo "Usage: build-site.sh < iojs | nodejs >"

if [ "$site" == "nodejs" ]; then
  build_cmd="npm run deploy"
  rsync_from="build/"
else

I'm probably missing some other bits.
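
For anyone reviewing what the quoted `location /`, `@english_fallback` and `@localized_404` blocks do before they are removed, the request flow can be sketched in Python. This is only an illustrative model, not part of the build repo: `resolve` and the `files` set are hypothetical names, the `^~` locations (dist, download, docs, api, `_next`) are not modelled, and the default `$lang` of "en" is taken from the comment on `@localized_404`.

```python
import re

# Locale list copied from the @english_fallback block quoted above.
LOCALES = "ar|be|ca|de|es|fa|fr|gl|id|it|ja|ka|ko|nl|pt-br|ro|ru|tr|uk|zh-cn|zh-tw"
LOCALE_PREFIX = re.compile(rf"^/({LOCALES})/", re.IGNORECASE)  # the `if ($uri ~* ...)` test
LOCALE_REWRITE = re.compile(rf"^/({LOCALES})/(.*)$")           # the case-sensitive `rewrite`
TRAILING_SLASH = re.compile(r"^/(.*)/$")


def resolve(uri, files):
    """Return (status, path) for a request, given a set of existing file paths.

    Hypothetical helper: `files` stands in for the exported site on disk.
    """
    # rewrite ^/(.*)/$ /$1 permanent;
    match = TRAILING_SLASH.match(uri)
    if match:
        return 301, "/" + match.group(1)

    lang = "en"  # assumed default, per the comment on @localized_404
    for _ in range(2):  # the original URI, then at most one English retry
        # try_files $uri $uri.html @english_fallback;
        for candidate in (uri, uri + ".html"):
            if candidate in files:
                return 200, candidate
        # @english_fallback: remember the locale, then rewrite to /en/...
        match = LOCALE_PREFIX.match(uri)
        if match:
            lang = match.group(1)
        rewritten = LOCALE_REWRITE.sub(r"/en/\2", uri)
        if rewritten == uri:
            break  # not a localized URI, so the request ends in a 404
        uri = rewritten

    # @localized_404: try_files /$lang/404.html /en/404.html =404;
    for candidate in (f"/{lang}/404.html", "/en/404.html"):
        if candidate in files:
            return 404, candidate
    return 404, None
```

With `files = {"/en/blog.html", "/en/404.html", "/es/404.html"}`, `resolve("/es/blog", files)` returns `(200, "/en/blog.html")` and `resolve("/es/missing", files)` returns `(404, "/es/404.html")`, which matches the behaviour the config comments describe.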


targos commented Feb 21, 2024

The first thing I can do is delete the website contents from the server. It's outdated now anyway. @nodejs/build-infra wdyt?

MattIPv4 commented

👍 Not build infra, but removing the site content seems like a good first step and would solve for the Google indexing issue (and act as a confirmation we don't need it before removing all the code/config for building/serving it).


targos commented Feb 23, 2024

I moved all website files to a folder at /home/www/nodejs_old in case we need to recover something, and updated the robots.txt to disallow everything.

Remains:

$ ls nodejs
robots.txt  traffic-manager
$ cat nodejs/robots.txt
User-Agent: *
Disallow: /
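
As a quick sanity check that this robots.txt blocks crawling of every path, here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `is_crawlable`, the "Googlebot" user agent string and the example URLs are assumptions for illustration only. The check also runs offline against the file contents shown above rather than fetching anything from the server.

```python
from urllib.robotparser import RobotFileParser

# Contents of nodejs/robots.txt as shown in the listing above.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /
"""


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Return True if `user_agent` may fetch `url` under the given robots.txt text.

    Hypothetical helper; it parses the text locally and does no network I/O.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://origin.nodejs.org/", "https://origin.nodejs.org/en/about"):
        print(url, is_crawlable(ROBOTS_TXT, url))
```

Both URLs should report `False` with the file above, while an empty robots.txt would allow them.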

targos added a commit that referenced this issue Feb 24, 2024
Website is now built and deployed on Vercel infra

Refs: #3636

targos commented Feb 24, 2024

Opened #3641 to remove the build scripts and webhook.

targos added a commit that referenced this issue Feb 28, 2024
Website is now built and deployed on Vercel infra

Refs: #3636