[BUG] HOST url scheme without https #93

Open
flame0 opened this issue Oct 15, 2018 · 3 comments

flame0 commented Oct 15, 2018

Hello!
I am using Django + nginx + HTTPS.
If I set ROBOTS_USE_SCHEME_IN_HOST = True,

I get this result in robots.txt: Host: http://site.com
But I expected: Host: https://site.com

Maybe this happens because of nginx:
it proxies traffic to gunicorn over plain HTTP.

location / {
    proxy_pass http://web:8080;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}

How can I fix it?

flame0 changed the title from "[BUG] HOST scheme without https" to "[BUG] HOST url scheme without https" on Oct 15, 2018

troyshu commented Nov 27, 2018

I think you're right that your issue stems from how you use nginx. According to the docs, it looks like ROBOTS_USE_SCHEME_IN_HOST uses the protocol of the current request. I imagine what's happening with your current setup is that since the requests that gunicorn gets are http, ROBOTS_USE_SCHEME_IN_HOST adds http instead of https.

I'm using gunicorn without nginx in front of it, and redirect everything in my DNS to https, so ROBOTS_USE_SCHEME_IN_HOST = True properly sets the Host to use https for me.


ntravis commented Feb 12, 2019

Django handles this by relying on a trusted header (i.e. one that your proxy scrubs and sets itself, typically X-Forwarded-Proto), and it seems this approach could be pulled into this project as well. See this link for their implementation details. I would guess that this part of the code could be adjusted to use a setting (defaulting to X-Forwarded-Proto if enabled, with other headers allowed if needed):

    def get_domain(self):
        scheme = self.request.is_secure() and 'https' or 'http'
        if not self.current_site.domain.startswith(('http', 'https')):
            return "%s://%s" % (scheme, self.current_site.domain)
        return self.current_site.domain
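
To make the suggestion above concrete, here is a minimal standalone sketch of what a header-aware get_domain could look like. It is not code from django-robots: the DEFAULT_SCHEME_HEADER name, the scheme_header parameter, and the resolve_scheme helper are all hypothetical, and the function takes plain values instead of self.request so that it can run without Django.

```python
from typing import Mapping, Optional

# Hypothetical default; django-robots does not define such a setting.
DEFAULT_SCHEME_HEADER = "HTTP_X_FORWARDED_PROTO"


def resolve_scheme(
    meta: Mapping[str, str],
    is_secure: bool,
    scheme_header: Optional[str] = DEFAULT_SCHEME_HEADER,
) -> str:
    """Return 'https' or 'http' for the Host line.

    meta          -- a request.META-style mapping
    is_secure     -- the value request.is_secure() would return
    scheme_header -- META key to consult first; None disables the lookup
    """
    if scheme_header:
        forwarded = meta.get(scheme_header, "").strip().lower()
        # Only accept the two values we know how to handle.
        if forwarded in ("http", "https"):
            return forwarded
    # Fall back to the existing behaviour quoted above.
    return "https" if is_secure else "http"


def get_domain(
    domain: str,
    meta: Mapping[str, str],
    is_secure: bool,
    scheme_header: Optional[str] = DEFAULT_SCHEME_HEADER,
) -> str:
    """Prefix the domain with a scheme unless it already carries one."""
    if domain.startswith(("http://", "https://")):
        return domain
    return "%s://%s" % (resolve_scheme(meta, is_secure, scheme_header), domain)
```

A header like this should only be consulted when the proxy in front of the app overwrites it on every request. Otherwise a client could send the header itself and influence the generated Host line.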

@some1ataplace

To fix this issue, you can try updating your nginx configuration to add the X-Forwarded-Proto header. This passes the original protocol on to Django, so that django-robots can generate the correct Host value in robots.txt, provided Django is configured to trust the header.

Here's an example of how to update your nginx configuration:

location / {
    proxy_pass http://web:8080/;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme; # Add this line
}

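This sets the X-Forwarded-Proto header to the value of $scheme, which is https when the client connected to nginx over HTTPS. Note that Django ignores this header by default: it is only honoured once SECURE_PROXY_SSL_HEADER is set in settings.py. With that in place, request.is_secure() returns True for these requests and the Host line in robots.txt uses https.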
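
One detail that is easy to trip over: the header nginx sends is called X-Forwarded-Proto, but on the Django side it appears in request.META under a transformed key. The small sketch below illustrates that naming convention (the helper name wsgi_meta_key is made up for this example):

```python
def wsgi_meta_key(header_name: str) -> str:
    """Map an HTTP header name to the key it gets in a WSGI environ / request.META."""
    key = header_name.upper().replace("-", "_")
    # Content-Type and Content-Length are the two headers stored without the prefix.
    if key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
        return key
    return "HTTP_" + key


print(wsgi_meta_key("X-Forwarded-Proto"))
```

This is why the nginx configuration refers to X-Forwarded-Proto while Django settings refer to HTTP_X_FORWARDED_PROTO.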


It looks like the issue is that your Django application does not know it is being served over HTTPS by Nginx. Nginx terminates TLS and talks to gunicorn over plain HTTP, so unless the X-Forwarded-Proto header is set and trusted, Django assumes the request came in over HTTP, and the Host line in robots.txt gets the wrong scheme.

To fix this, you have a couple of options:

  1. Set the X-Forwarded-Proto header in your Nginx configuration. You can do this by adding the following line to your Nginx location block:

    proxy_set_header X-Forwarded-Proto $scheme;

    This tells Nginx to forward the scheme (HTTP or HTTPS) that the user connected with to your Django application. Then, in your Django settings file (settings.py), add the following line to make Django aware of the X-Forwarded-Proto header:

    SECURE_PROXY_SSL_HEADER = ('HTTP_X_FORWARDED_PROTO', 'https')

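    This tells Django to trust the X-Forwarded-Proto header and treat the connection as HTTPS if the header is set to 'https'. Only enable this if your proxy always sets or strips the header itself; otherwise a client could spoof it.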

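  2. Alternatively, if you want your site to strictly use HTTPS, you can directly modify the django-robots package to always use https as the scheme in the Host line. To do this, locate the robots/views.py file in the django-robots package and find the line that builds the host. The exact code may differ between versions (the get_domain method quoted earlier in this thread is where the scheme is chosen), but it may look like this: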

    host = full_host(request)

    Change this line to:

    host = 'https://' + request.get_host()

    This will directly set the scheme to https for the Host line in your robots.txt. Keep in mind that modifying the package directly is not recommended, as it could cause issues when updating the package or deploying your project.

After making the necessary changes, restart your Nginx and Django services to ensure your changes take effect.
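
To see why option 1 needs both the Nginx header and the Django setting, here is a simplified model of the decision. It is a sketch of the behaviour described above, not Django's actual source; the function name effective_scheme is made up, and the environ dicts stand in for what gunicorn hands to Django.

```python
from typing import Mapping, Optional, Tuple


def effective_scheme(
    environ: Mapping[str, str],
    secure_proxy_ssl_header: Optional[Tuple[str, str]] = None,
) -> str:
    """Return the scheme the app would report for a request.

    environ                 -- WSGI-style mapping for the request
    secure_proxy_ssl_header -- (META key, secure value) pair, mirroring the
                               shape of Django's SECURE_PROXY_SSL_HEADER
    """
    if secure_proxy_ssl_header is not None:
        header, secure_value = secure_proxy_ssl_header
        value = environ.get(header)
        if value is not None:
            # The trusted header wins over the scheme of the proxied connection.
            return "https" if value.strip() == secure_value else "http"
    # Without a trusted header, only the nginx -> gunicorn hop is visible.
    return environ.get("wsgi.url_scheme", "http")


# Request as gunicorn sees it behind nginx with X-Forwarded-Proto set.
proxied = {"wsgi.url_scheme": "http", "HTTP_X_FORWARDED_PROTO": "https"}

print(effective_scheme(proxied))
print(effective_scheme(proxied, ("HTTP_X_FORWARDED_PROTO", "https")))
```

With the header present but the setting missing, the first call still reports http, which matches the behaviour in the original report. Only the second call, with the setting supplied, reports https.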
