Allow dataroot to be configurable at a global level #46

Open
nickjer opened this issue Aug 1, 2017 · 3 comments

Comments

@nickjer
Contributor

nickjer commented Aug 1, 2017

This isn't super important as we make a shared filesystem for the home directory a requirement of OOD, but some centers such as TACC don't have a shared filesystem for their home directories.

I believe the HOME directory is local to each individual cluster (with a small 10 GB quota). Instead, they offer a filesystem called WORK that is shared across all clusters (not backed up, with 1 TB of space), not to be confused with SCRATCH (where they purge files).

So it may be beneficial for a sys admin to be able to change the root of the OOD_DATAROOT directories from HOME to WORK somehow.

@ericfranz

@ericfranz
Contributor

ericfranz commented Aug 1, 2017

So this one is tricky.

For each of the apps, you can just set OOD_DATAROOT explicitly in the .env.local file. For example, in myjobs we do this (though it is not necessary):

OOD_DATAROOT=$HOME/$OOD_PORTAL/data/sys/myjobs
DATABASE_PATH=$OOD_DATAROOT/production.sqlite3

We could probably do a quick test to confirm that setting OOD_DATAROOT=/fs/scratch/$USER/ood/data/sys/myjobs for myjobs works.

However... the assumption is still that there is a shared directory somewhere that can be called the dataroot, that the PUN has access to AND the running batch jobs have access to.

If we had a situation where that was not the case, we would need another solution to share data (results, rendered job template, etc.) between OnDemand and individual jobs.
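The shared-directory assumption above could be checked before changing anything. Here is a minimal sketch, assuming a sys admin can run one command on the web node (as the PUN user) and another inside a batch job. The function names `mark_dataroot` and `verify_dataroot` and the `.ood_marker` file are hypothetical and not part of OnDemand.

```shell
# Hypothetical helpers for illustration; not part of OnDemand.

# Run on the web node as the PUN user: drop a marker file in the candidate dataroot.
mark_dataroot() {
  dir=$1
  mkdir -p "$dir" || return 1
  # Record which host wrote the marker (uname -n is the POSIX way to get the node name).
  uname -n > "$dir/.ood_marker" || return 1
}

# Run inside a batch job: succeed only if the marker is visible and the directory is writable.
verify_dataroot() {
  dir=$1
  [ -f "$dir/.ood_marker" ] || { echo "marker not visible in $dir" >&2; return 1; }
  [ -w "$dir" ] || { echo "$dir is not writable" >&2; return 1; }
  echo "shared dataroot ok (marker written on $(cat "$dir/.ood_marker"))"
}
```

If `verify_dataroot` fails inside a job for a directory that was marked on the web node, that directory cannot serve as the dataroot under the current assumption.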

@ericfranz
Contributor

That said, being able to share configuration between all the apps makes sense. For example, instead of setting the data root for every single app individually, we could set the parent directory once (perhaps as a template string).
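As a rough sketch of the "set the parent once" idea, written as plain shell: the variable `OOD_DATAROOT_PARENT` and the helper `app_dataroot` are assumptions for illustration, not existing OnDemand settings, and a real .env.local file may not support shell functions.

```shell
# Hypothetical: OOD_DATAROOT_PARENT is an assumed name, not an existing OnDemand setting.
# It would be set once in a shared env; here it falls back to the HOME-based layout.
: "${OOD_DATAROOT_PARENT:=$HOME/ondemand/data}"

# Derive a per-app dataroot from the shared parent, e.g. app_dataroot sys/myjobs
app_dataroot() {
  # Strip one trailing slash from the parent so the joined path has no "//".
  printf '%s/%s\n' "${OOD_DATAROOT_PARENT%/}" "$1"
}

# Each app would then compute its own values instead of hard-coding them.
OOD_DATAROOT=$(app_dataroot sys/myjobs)
DATABASE_PATH=$OOD_DATAROOT/production.sqlite3
```

With this shape, moving every app's data would mean changing one value rather than editing each app's env file.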

@ericfranz
Contributor

ericfranz commented Aug 2, 2017

This is related to #31. A shared env could provide something like OOD_DATAROOT_PARENT=$HOME/ondemand/data, which a site could change to OOD_DATAROOT_PARENT=$WORK/ondemand/data.

But that wouldn't really work well in your case: according to your description, $WORK is not backed up, so the sqlite3 database is not something we would want to store there.

So far we have assumed that the shared directory holding the job template and results for a queued, running, or recently completed job is the same location that holds an app's sqlite3 database and other persistent data.

And it seems like we would want the ability to separate those out.
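One way the separation could look, as a sketch only: the variables `OOD_PERSISTENT_PARENT` and `OOD_STAGING_PARENT`, the function `split_roots`, and `JOB_STAGING_DIR` are all hypothetical names introduced here. The idea is that persistent data stays on a backed-up filesystem while job files go to whatever is shared with the compute nodes.

```shell
# Hypothetical names throughout; none of these are existing OnDemand settings.
# Persistent data (sqlite3 database, settings) goes under a backed-up parent.
# Job staging data (rendered templates, results) goes under a parent shared with batch jobs.
split_roots() {
  app=$1
  persistent=${OOD_PERSISTENT_PARENT:-$HOME/ondemand/data}
  # Prefer $WORK for staging when the site defines it, otherwise fall back to $HOME.
  staging=${OOD_STAGING_PARENT:-${WORK:-$HOME}/ondemand/data}
  DATABASE_PATH=$persistent/$app/production.sqlite3
  JOB_STAGING_DIR=$staging/$app
}

split_roots sys/myjobs
```

At a site where HOME is shared and backed up, both values resolve under HOME and nothing changes. At a site like the one described, only the staging directory moves to WORK.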
