Python package to facilitate interoperation between python and a SQLite database of geochemical data.

sarttiso/geochemdb

geochemdb

This Python module provides a class and methods for adding, updating, and reading data from a SQLite database. It is geared towards a database of geochemical measurements on geologic materials, but it aims to be general enough to admit a variety of data sources.

A minimal SQLite schema for utilizing this package is presented below and available in this repository as schema.sql. SQLiteStudio provides a convenient way to access and modify SQLite databases.

Getting Started

Installation

The geochemdb package can be installed with

pip install geochemdb

Create Database

Option 1

I have provided the schema.sql file for the structure shown below. You can use it to create your own empty database by executing:

sqlite3 your-geochem-database.db < schema.sql
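The same step can also be done from Python with the standard-library sqlite3 module. The following is a minimal sketch, not part of geochemdb: the helper name create_database is hypothetical, and it simply runs whatever schema text it is given.

```python
import sqlite3


def create_database(db_path, schema_sql):
    """Open (or create) the database at db_path and run the schema SQL text.

    Hypothetical helper for illustration; it is not part of geochemdb.
    """
    con = sqlite3.connect(db_path)
    con.executescript(schema_sql)  # executes every statement in the script
    return con
```

Assuming schema.sql is in the working directory, it could be called as `create_database("your-geochem-database.db", open("schema.sql").read())`; remember to close the returned connection when finished.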

Option 2

Alternatively, download and install SQLiteStudio. Create a new database from the "Database/Add a Database" menu. Create the file wherever you like, and name it as you like. Right-click the newly created database in the pane on the left and click "Execute SQL from file." In the new window, select the input file by navigating to the schema.sql file I provide and click OK.

Option 3

Finally, you can just download the example database geochemdb_example.db in the example_data folder.

Note

Samples must be added to the database manually. You can use SQLiteStudio for this.
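If you prefer a script to a GUI, a sample row can also be inserted with Python's standard-library sqlite3 module. The sketch below uses a stand-in Samples table in an in-memory database; the column names, the add_sample helper, and the sample values are all assumptions for illustration, so check schema.sql for the actual column names before adapting it.

```python
import sqlite3

# Stand-in table for illustration only; the real Samples table is defined
# in schema.sql and may use different column names.
con = sqlite3.connect(":memory:")
con.execute(
    """
    CREATE TABLE Samples (
        name TEXT PRIMARY KEY,
        latitude REAL,
        longitude REAL,
        horizontal_precision REAL,
        elevation REAL,
        vertical_precision REAL
    )
    """
)


def add_sample(con, name, latitude, longitude, horizontal_precision,
               elevation, vertical_precision):
    """Insert one sample row (hypothetical helper, not part of geochemdb)."""
    with con:  # commits on success, rolls back if the insert fails
        con.execute(
            "INSERT INTO Samples VALUES (?, ?, ?, ?, ?, ?)",
            (name, latitude, longitude, horizontal_precision,
             elevation, vertical_precision),
        )


# Made-up values for a made-up sample.
add_sample(con, "SAMPLE-01", 34.5, -119.8, 10.0, 120.0, 5.0)
```

Because the sample name is the primary key in this sketch, inserting a second row with the same name raises sqlite3.IntegrityError.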

Tutorial

I have prepared a guide showing the functionality of geochemdb. This guide is the best way to get a sense for how to use the package to interface with your new database.

SQLite Schema

The schema presented below is largely inspired by the following references:

Chamberlain, K.J., Lehnert, K.A., McIntosh, I.M., Morgan, D.J., and Wörner, G., 2021, Time to change the data culture in geochemistry: Nature Reviews Earth & Environment, v. 2, p. 737–739, doi:10.1038/s43017-021-00237-w.

Staudigel, H. et al., 2003, Electronic data publication in geochemistry: Geochemistry, Geophysics, Geosystems, v. 4, doi:10.1029/2002GC000314.

Full implementation of the metadata collection described therein is a work-in-progress, but this schema captures the most important elements.

[Figure: database schema diagram]

Primary Tables

The geochemical data are assumed to follow a hierarchy that progresses from sample → aliquot → analysis → measurement. Various metadata are tabulated at each level. This structure is presented here and in the guide with the example of split-stream laser ablation mass spectrometry (LASS-ICPMS) of zircon.

Samples

Samples are uniquely identified by names and contain metadata. Minimal metadata for a sample include:

  • latitude and longitude
  • horizontal precision
  • elevation
  • vertical precision

In this example, the sample would be the rock from which zircons are separated.

Aliquots

Aliquots reflect portions of a sample and have a many-to-one (foreign key) relationship with the Samples table. Minimal metadata for an aliquot include:

  • material (e.g., whole rock, zircon, etc.)

In the example, an aliquot would be a single laser ablation spot on an individual zircon. In general, an aliquot is a part of a sample that undergoes analysis.
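The many-to-one relationship between aliquots and samples can be illustrated with the standard-library sqlite3 module. This is a sketch with stand-in tables; the column names and sample/aliquot names are assumptions, not the actual schema. Note that SQLite only enforces foreign keys when they are enabled on the connection.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this is enabled per connection.
con.execute("PRAGMA foreign_keys = ON")

# Stand-in tables; the real definitions live in schema.sql.
con.executescript(
    """
    CREATE TABLE Samples (name TEXT PRIMARY KEY);
    CREATE TABLE Aliquots (
        aliquot TEXT PRIMARY KEY,
        sample TEXT NOT NULL REFERENCES Samples(name),
        material TEXT
    );
    """
)

# One sample (the rock), two aliquots (laser spots on separated zircons).
con.execute("INSERT INTO Samples VALUES ('SAMPLE-01')")
con.executemany(
    "INSERT INTO Aliquots VALUES (?, ?, ?)",
    [
        ("SAMPLE-01_z1_spot1", "SAMPLE-01", "zircon"),
        ("SAMPLE-01_z2_spot1", "SAMPLE-01", "zircon"),
    ],
)

# An aliquot that points at a nonexistent sample is rejected.
try:
    con.execute("INSERT INTO Aliquots VALUES ('orphan', 'NO-SUCH-SAMPLE', 'zircon')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```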

Analyses

Analyses are analytical events that yield measurements. In the case of LASS-ICPMS, a single spot (aliquot) yields analyses on two separate instruments: one that measures trace elements, another that measures U and Pb. Analyses must include:

  • date
  • instrument
  • technique
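To make the LASS-ICPMS case concrete, the sketch below stores two analyses for one aliquot, one per instrument. It uses a stand-in Analyses table in an in-memory database; the column names, analysis names, instrument labels, and date are placeholders, not values from the actual schema or any real dataset.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Stand-in table; NOT NULL mirrors the required date, instrument, and technique.
con.execute(
    """
    CREATE TABLE Analyses (
        analysis TEXT PRIMARY KEY,
        aliquot TEXT NOT NULL,
        date TEXT NOT NULL,
        instrument TEXT NOT NULL,
        technique TEXT NOT NULL
    )
    """
)

# One laser spot (aliquot) analyzed on two instruments during the same session.
rows = [
    ("spot1_trace", "SAMPLE-01_z1_spot1", "2024-01-15", "instrument A", "LASS-ICPMS"),
    ("spot1_UPb", "SAMPLE-01_z1_spot1", "2024-01-15", "instrument B", "LASS-ICPMS"),
]
with con:
    con.executemany("INSERT INTO Analyses VALUES (?, ?, ?, ?, ?)", rows)

# Look up which instruments analyzed a given aliquot.
instruments = [
    r[0]
    for r in con.execute(
        "SELECT instrument FROM Analyses WHERE aliquot = ? ORDER BY instrument",
        ("SAMPLE-01_z1_spot1",),
    )
]
```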

Measurements

Measurements store the geochemical data gathered during analyses. A row exists in the Measurements table for each element, ratio, or other value measured or derived during an analysis. For now, measurements are uniquely identifiable solely by the analysis and the quantity being measured. Measurements must include:

  • measurement unit
  • uncertainty unit
  • reference material
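Identifying a measurement by its analysis and quantity corresponds to a composite key over those two columns. The sketch below shows one way to express that with the standard-library sqlite3 module; the table is a stand-in, the column names are assumptions, and all values are made up.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Stand-in table; the composite primary key makes (analysis, quantity) unique.
con.execute(
    """
    CREATE TABLE Measurements (
        analysis TEXT NOT NULL,
        quantity TEXT NOT NULL,
        mean REAL,
        measurement_unit TEXT,
        uncertainty REAL,
        uncertainty_unit TEXT,
        reference_material TEXT,
        PRIMARY KEY (analysis, quantity)
    )
    """
)

# Two different quantities from the same analysis: both rows are accepted.
with con:
    con.executemany(
        "INSERT INTO Measurements VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            ("spot1_trace", "U", 250.0, "ppm", 5.0, "ppm", "placeholder RM"),
            ("spot1_trace", "U/Th", 2.1, "ratio", 0.1, "ratio", "placeholder RM"),
        ],
    )

# A second row for the same analysis and quantity violates the key.
try:
    with con:
        con.execute(
            "INSERT INTO Measurements (analysis, quantity) VALUES (?, ?)",
            ("spot1_trace", "U"),
        )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```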

Auxiliary Tables

The following tables provide valid values and combinations of values for several of the metadata fields described above to facilitate data validation and consistency.

Quantities

This table lists valid quantities that can be included in Measurements, such as elemental concentrations, ratios, and error correlations.

MeasurementUnits

This table specifies valid units for Measurements, such as ppm, wt pct, etc.

QuantitiesMeasurementUnits

This table encodes a many-to-many relationship between Quantities and MeasurementUnits, thereby specifying the valid combinations of the two values. For example, ratios cannot be measured in ppm, and so the only valid measurement_unit for the quantity U/Th would be a ratio.
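A junction table like this can be queried to check whether a quantity and unit pair is allowed. The following sketch uses stand-in tables with assumed column names and a hypothetical helper, is_valid_combination; it is not how geochemdb itself necessarily performs validation.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Stand-in tables; the real definitions live in schema.sql.
con.executescript(
    """
    CREATE TABLE Quantities (quantity TEXT PRIMARY KEY);
    CREATE TABLE MeasurementUnits (measurement_unit TEXT PRIMARY KEY);
    CREATE TABLE QuantitiesMeasurementUnits (
        quantity TEXT NOT NULL REFERENCES Quantities(quantity),
        measurement_unit TEXT NOT NULL
            REFERENCES MeasurementUnits(measurement_unit),
        PRIMARY KEY (quantity, measurement_unit)
    );
    INSERT INTO Quantities VALUES ('U'), ('U/Th');
    INSERT INTO MeasurementUnits VALUES ('ppm'), ('ratio');
    INSERT INTO QuantitiesMeasurementUnits VALUES ('U', 'ppm'), ('U/Th', 'ratio');
    """
)


def is_valid_combination(con, quantity, measurement_unit):
    """Return True if the pair is listed in the junction table (hypothetical helper)."""
    row = con.execute(
        "SELECT 1 FROM QuantitiesMeasurementUnits "
        "WHERE quantity = ? AND measurement_unit = ?",
        (quantity, measurement_unit),
    ).fetchone()
    return row is not None
```

With the rows above, the pair ("U/Th", "ratio") is accepted and ("U/Th", "ppm") is not.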

Instruments

This table records the instruments that yield measurements during analyses. Every analysis must have an instrument.

Techniques

Valid analytical techniques are tabulated here. The same instrument can be used in different ways; for example, ICPMS measurements can be made on laser ablation aerosols or on solutions. This table records these methodological distinctions.

Materials

Rock samples can yield various materials. An aliquot consists of a particular material, for example a zircon, and this table lists all the valid materials an aliquot can have.
