
DEA Notebooks Hackathon: Make DEA Notebooks faster!

Robbi Bishop-Taylor edited this page Jul 10, 2024 · 28 revisions

Aim and guidance for the Hackathon

Our aim is to make DEA Notebooks faster and more efficient so they run more quickly for our users and in our integration tests.

Important

We want to preserve the overall "purpose" of our notebooks when making changes.

Ideally, we should only make changes if we can do it without making our examples less useful or informative to our users:

  1. Before making any changes, read through and run each cell in the notebook carefully. Try to understand its overall purpose or message, i.e. what is it trying to convey to our users? What functionality is it showing off? What do we need to keep to make sure the example is still useful?

  2. Once you understand the purpose and approach of the notebook, look for places where we can make it faster to run without affecting its overall purpose. For most notebooks, the main changes we should look at include:

  • Reducing the time period (e.g. can the notebook be run on one year of data instead of two without affecting its conclusions?)
  • Reducing the area/extent (e.g. can we load data for a smaller area and still demonstrate the same functionality?)
  • Loading fewer products (e.g. do we have to load data from Landsat 7 and 8 if just Landsat 8 will do?)
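
As a rough illustration of how the three reductions above combine, the sketch below compares a hypothetical "before" and "after" query. The coordinates, dates and product names are assumptions chosen for the example rather than values from any particular notebook, and `relative_load_size` is a made-up helper that only gives a crude proxy for data volume (area × days × number of products). It does not call the datacube.

```python
from datetime import date

# Hypothetical "before" query: all values here are illustrative assumptions.
original_query = {
    "x": (153.30, 153.50),
    "y": (-27.60, -27.40),
    "time": ("2018-01-01", "2019-12-31"),
    "products": ["ga_ls7e_ard_3", "ga_ls8c_ard_3"],
}

# Hypothetical "after" query with each of the three reductions applied.
reduced_query = {
    "x": (153.35, 153.45),                  # half the width
    "y": (-27.55, -27.45),                  # half the height
    "time": ("2019-01-01", "2019-12-31"),   # one year instead of two
    "products": ["ga_ls8c_ard_3"],          # Landsat 8 only
}


def relative_load_size(query):
    """Crude proxy for data volume: area (deg^2) x days x number of products."""
    width = query["x"][1] - query["x"][0]
    height = query["y"][1] - query["y"][0]
    start, end = (date.fromisoformat(t) for t in query["time"])
    days = (end - start).days + 1
    return width * height * days * len(query["products"])


ratio = relative_load_size(reduced_query) / relative_load_size(original_query)
print(f"Reduced query is roughly {ratio:.1%} of the original load")
```

The proxy ignores cloud masking, resolution and the actual number of satellite passes, so treat the ratio as a quick sanity check before timing the notebook properly.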

Some other completely optional/more advanced ideas include:

  • Filtering to less cloudy images by metadata (e.g. cloud_cover=(0, 10)) to load only clear images
  • Updating code to be more efficient (e.g. using built-in xarray or numpy tools instead of for-loops etc)
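
To make the second idea concrete, here is a minimal sketch of replacing nested for-loops with whole-array operations. It uses synthetic numpy arrays as a stand-in for a (time, y, x) stack of imagery, and the per-pixel mean NDVI calculation is only an assumed example of the kind of code a notebook might contain.

```python
import numpy as np

# Synthetic stand-in for red and near-infrared bands with shape (time, y, x).
rng = np.random.default_rng(seed=0)
red = rng.uniform(0.05, 0.3, size=(10, 50, 50))
nir = rng.uniform(0.2, 0.6, size=(10, 50, 50))


def mean_ndvi_loop(red, nir):
    """Per-pixel mean NDVI through time, using explicit Python loops."""
    n_time, n_y, n_x = red.shape
    out = np.zeros((n_y, n_x))
    for t in range(n_time):
        for i in range(n_y):
            for j in range(n_x):
                r, n = red[t, i, j], nir[t, i, j]
                out[i, j] += (n - r) / (n + r)
    return out / n_time


def mean_ndvi_vectorised(red, nir):
    """Same calculation using whole-array operations."""
    ndvi = (nir - red) / (nir + red)
    return ndvi.mean(axis=0)


loop_result = mean_ndvi_loop(red, nir)
vector_result = mean_ndvi_vectorised(red, nir)

# Both approaches should agree to within floating-point tolerance.
print(np.allclose(loop_result, vector_result))
```

The same arithmetic should carry over to xarray objects, where the reduction would typically be written as `.mean(dim="time")` instead of `.mean(axis=0)`.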
  3. Once you have made some changes to the notebook, double check that the notebook markdown cells/descriptions still match the analysis (e.g. update references to the time period and location to match the new values you have chosen).

  4. Re-run the entire notebook, then commit it back into the repo for review! (see Git details below)

Technical guide to editing notebooks

  1. Open the DEA Sandbox: https://app.sandbox.dea.ga.gov.au/
  2. Launch the "Default environment 2 Cores, 16G Memory" server option (this is what our external users use, and is most similar to how we run our tests)


  3. If it is your first time editing a notebook, follow this guide to setting up DEA Notebooks with Git: https://github.com/GeoscienceAustralia/dea-notebooks/wiki/Edit-a-DEA-Notebook