chore(docs): reorder, clarify PyPI installation #28221

Status: Open (5 commits to merge into base branch `master`)
56 changes: 34 additions & 22 deletions docs/docs/installation/pypi.mdx
@@ -11,9 +11,7 @@ This page describes how to install Superset using the `apache-superset` package

### OS Dependencies

Superset stores database connection information in its metadata database. For that purpose, we use
the cryptography Python library to encrypt connection passwords. Unfortunately, this library has OS
level dependencies.
Superset uses the `cryptography` Python library to encrypt database connection passwords. This library requires the installation of OS-level dependencies.

**Debian and Ubuntu**

@@ -100,24 +98,18 @@ We highly recommend installing Superset inside of a virtual environment. Python
pip install virtualenv
```

You can create and activate a virtual environment using:

First create a directory for Superset:
```
# virtualenv is shipped in Python 3.6+ as venv instead of pyvenv.
# See https://docs.python.org/3.6/library/venv.html
python3 -m venv venv
. venv/bin/activate
mkdir superset
Member commented:

Hmmm. I'm a tad confused. Shouldn't the virtualenv be called venv—assuming it's inside the root superset folder?

Contributor Author replied:

Right now the docs don't say to make a folder called `superset`; that's something I'm adding here. I believe what they say currently makes a virtualenv called `venv` in the root directory, which is not descriptive and is potentially a problem if another tutorial says to do the same.

Now maybe `mkdir superset` is unnecessary if `python3 -m venv superset` below would create the directory.

I may be misunderstanding though, I'm new to this aspect.

```
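
On the question raised in the thread above: `python3 -m venv <dir>` creates the target directory itself when it does not exist yet, so a preceding `mkdir superset` is not strictly required. A minimal sketch to confirm this, assuming `python3` is on the `PATH`; the throwaway directory from `mktemp` and the `--without-pip` flag are only there to keep the demonstration quick and side-effect free:

```shell
# Work in a throwaway directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir"

# No mkdir beforehand: venv creates the target directory itself.
# --without-pip only keeps this demonstration fast; omit it for a real install.
python3 -m venv --without-pip superset

# The activation script now exists inside the newly created directory.
ls superset/bin/activate
```

Running `mkdir superset` first is harmless, since `venv` also accepts an existing directory.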

Or with pyenv-virtualenv:
Member commented:

I think we still want to keep pyenv documentation around, i.e., developers may want to switch which Python environment they want to use. In essence pyenv is a precursor to setting up Superset.

Contributor Author (@sfirke, Apr 29, 2024) replied:

Again I'm new to this but my newbie perspective was: the docs say:

You can create and activate a virtual environment using: [virtual env instructions] ... Or with pyenv-virtualenv:

I was unclear on which one I should use. When I read a little about it, it seemed like `venv` was the most widely used, so I eliminated the alternative.

If we should include both, can someone add a note to help a user decide which to use?

Contributor Author replied:

@john-bodley circling back to this -- are you good with my changes or is there another committer you think could weigh in? Or if nothing else I can revert these parts, I'd rather just get the bulk of this merged.


Then create and activate a virtual environment there:
```
# Here we name the virtual env 'superset'
pyenv virtualenv superset
pyenv activate superset
python3 -m venv superset
. superset/bin/activate
```

Once you activated your virtual environment, all of the Python packages you install or uninstall
While your virtual environment is activated, all of the Python packages you install or uninstall
will be confined to this environment. You can exit the environment by running `deactivate` on the
command line.
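
To check whether a virtual environment is currently active before installing anything, one option is to compare the interpreter's prefixes. This is a small sketch rather than part of the documented procedure; it relies only on the standard `sys.prefix` and `sys.base_prefix` attributes and on the `VIRTUAL_ENV` variable that the `activate` script exports:

```shell
# Prints "True" when the running interpreter belongs to a virtual environment,
# "False" otherwise.
python3 -c 'import sys; print(sys.prefix != sys.base_prefix)'

# The activate script also exports VIRTUAL_ENV; it is unset again after `deactivate`.
echo "VIRTUAL_ENV=${VIRTUAL_ENV:-<not set>}"
```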

@@ -129,21 +121,39 @@ First, start by installing `apache-superset`:
pip install apache-superset
```

Then, you need to initialize the database:
Superset's configuration is stored in a Python file. One setting, `SECRET_KEY`, is required for the application to start. Create your config file:
```
touch superset/superset_config.py
```
Then make this file discoverable by setting its path in the `SUPERSET_CONFIG_PATH` environment variable:
```
export SUPERSET_CONFIG_PATH=superset/superset_config.py
```
Note that this stores the path in the environment only temporarily. If you wish for this to persist through reboots, permanently set this environment variable by [adding it to your `~/.profile` file](https://unix.stackexchange.com/questions/117467/how-to-permanently-set-environmental-variables).
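
One way to make the variable persist is sketched here. `persist_export` is a hypothetical helper, not something Superset provides; it appends an `export` line to a profile file only if that exact line is not already there. Passing an absolute path is an assumption worth considering, because a relative path such as `superset/superset_config.py` resolves against whatever directory the `superset` command is later run from:

```shell
# persist_export NAME VALUE [FILE]
# Hypothetical helper: append `export NAME="VALUE"` to FILE (default ~/.profile) once.
persist_export() {
    name=$1
    value=$2
    file=${3:-"$HOME/.profile"}
    line="export $name=\"$value\""
    # -x matches whole lines, -F treats the pattern as a fixed string,
    # so the line is appended only when it is not already present.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

For example, `persist_export SUPERSET_CONFIG_PATH "$(pwd)/superset/superset_config.py"` records the absolute path of the file created above; the change takes effect in new login shells.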

Now generate a strong value for `SECRET_KEY` and write it to your config file:
```
superset db upgrade
echo "SECRET_KEY='$(openssl rand -base64 42)'" | tee -a superset/superset_config.py
```
Do not lose this key. Consider maintaining a secure backup of your `superset_config.py` file.
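
Because `tee -a` appends, running the generation command a second time leaves two `SECRET_KEY` lines in the file, and the later assignment silently overrides the earlier one when Python loads the module. A quick sanity check might look like the sketch below; `count_secret_keys` is a hypothetical helper name used only for illustration:

```shell
# count_secret_keys FILE
# Hypothetical helper: print how many top-level SECRET_KEY assignments FILE contains.
count_secret_keys() {
    grep -c '^SECRET_KEY[[:space:]]*=' "$1"
}
```

Running `count_secret_keys superset/superset_config.py` should print `1`. Since the file holds a secret, restricting its permissions with `chmod 600 superset/superset_config.py` is also worth considering.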

:::tip
Note that some configuration is mandatory for production instances of Superset. In particular, Superset will not start without a user-specified value of SECRET_KEY. Please see [Configuring Superset](/docs/configuration/configuring-superset).
:::
You can also tell Superset where to store its metadata, that is, the record of which charts, dashboards, etc. have been created. By default, this is a SQLite database at the filepath `~/.superset/superset.db`.

Finish installing by running through the following commands:
In a production setup, you would point this at, say, a PostgreSQL database that gets backed up. To do so, set the variable `SQLALCHEMY_DATABASE_URI` in `superset_config.py`.
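
As an illustration of what that setting could look like, the sketch below prints a config line for a PostgreSQL metadata database. The user, password, host, port, and database name are all placeholders (assumptions, not values from this PR), and connecting to PostgreSQL additionally requires a driver such as `psycopg2` to be installed in the virtual environment:

```shell
# Placeholder credentials, host, and database name; replace with your own.
DB_URI='postgresql://superset_user:change-me@localhost:5432/superset'

# Print the Python assignment that would go into superset_config.py.
printf "SQLALCHEMY_DATABASE_URI = '%s'\n" "$DB_URI"
```

Appending the printed line to the config file (for example by adding `>> superset/superset_config.py` to the `printf` command) applies it the next time a `superset` command starts.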

Set an environment variable so that `superset` commands will work in the terminal. As above, this is temporary, and can be set to persist by adding it to your `~/.profile`.
```
# Create an admin user in your metadata database (use `admin` as username to be able to load the examples)
export FLASK_APP=superset
```

Now initialize the database:
```
superset db upgrade
```

Finish installing by running through the following commands:
```
# Create an admin user in your metadata database (use `admin` as username to be able to load the examples)
superset fab create-admin

# Load some data to play with
@@ -158,3 +168,5 @@ superset run -p 8088 --with-threads --reload --debugger

If everything worked, you should be able to navigate to `hostname:port` in your browser (e.g.
locally by default at `localhost:8088`) and log in using the username and password you created.

Next, see [Configuring Superset](/docs/configuration/configuring-superset) to further set up your instance.