Commit d68d093

committed
Updated docs
1 parent 157f8eb commit d68d093

30 files changed

+13499
-381
lines changed

_images/BackendClassHierarchy.png

20.7 KB

_images/DriverClassHierarchy.png

-20.7 KB
Binary file not shown.

_images/PluginClassHierarchy.png

-1.04 KB

_sources/backends.rst.txt

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
Backends
2+
========================
3+
4+
Backends have front-end and back-end functions. A Backend front-end connects users to the DSI Core middleware, and a Backend back-end allows DSI middleware data structures to be read from and written to persistent external storage. Backends are modular to support user contribution. Backend contributors are encouraged to offer custom Backend abstract classes and Backend implementations. A contributed Backend abstract class may extend another Backend to inherit the properties of the parent. In order to be compatible with DSI Core middleware, Backends should create an interface to Python built-in data structures or data structures from the Python ``collections`` library. Backend extensions will be accepted conditional on the extension of ``backends/tests`` to demonstrate the new Backend capability. We cannot accept pull requests that are not tested.
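As a rough illustration of the contribution pattern described above, the following minimal sketch shows an abstract class with one concrete back-end that persists a ``collections.OrderedDict`` to disk. Every name here (``SketchBackend``, ``JsonFileBackend``, ``put_artifacts``, ``get_artifacts``, ``round_trip``) is hypothetical and chosen for illustration only; the actual ``dsi.backends`` classes may look quite different.

```python
import json
import os
import tempfile
from abc import ABC, abstractmethod
from collections import OrderedDict


class SketchBackend(ABC):
    """Hypothetical abstract base; not the real dsi.backends class."""

    @abstractmethod
    def put_artifacts(self, collection):
        """Write a column-oriented metadata collection to storage."""

    @abstractmethod
    def get_artifacts(self):
        """Read the stored collection back as an OrderedDict."""


class JsonFileBackend(SketchBackend):
    """Hypothetical back-end that persists metadata as a JSON file."""

    def __init__(self, filename):
        self.filename = filename

    def put_artifacts(self, collection):
        with open(self.filename, "w", encoding="utf-8") as handle:
            json.dump(collection, handle)

    def get_artifacts(self):
        with open(self.filename, "r", encoding="utf-8") as handle:
            # object_pairs_hook keeps the column order in an OrderedDict.
            return json.load(handle, object_pairs_hook=OrderedDict)


def round_trip(collection):
    """Store a collection in a temporary file and read it back."""
    with tempfile.TemporaryDirectory() as workdir:
        backend = JsonFileBackend(os.path.join(workdir, "metadata.json"))
        backend.put_artifacts(collection)
        return backend.get_artifacts()
```

A contributed Backend would presumably ship with a test along these lines, for example one that checks ``round_trip`` returns the same keys and values it was given.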
5+
6+
7+
.. image:: BackendClassHierarchy.png
8+
9+
.. automodule:: dsi.backends.filesystem
10+
:members:
11+
12+
.. automodule:: dsi.backends.sqlite
13+
:members:
14+
15+
.. automodule:: dsi.backends.gufi
16+
:members:
17+
18+
.. automodule:: dsi.backends.parquet
19+
:members:
20+
21+

_sources/core.rst.txt

Lines changed: 29 additions & 29 deletions
@@ -2,27 +2,27 @@ Core
22
===================
33
The DSI Core middleware defines the Terminal concept. An instantiated Terminal is the human/machine DSI interface. The person setting up a Core Terminal only needs to know how they want to ask questions, and what metadata they want to ask questions about. If they don’t see an option to ask questions the way they like, or they don’t see the metadata they want to ask questions about, then they should ask a Driver Contributor or a Plugin Contributor, respectively.
44

5-
A Core Terminal is a home for Plugins, and an interface for Drivers. A Core Terminal is instantiated with a set of default Plugins and Drivers, but they must be loaded before a user query is attempted. Here's an example of how you might work with DSI using an interactive Python interpreter for your data science workflows::
5+
A Core Terminal is a home for Plugins (Readers/Writers), and an interface for Backends. A Core Terminal is instantiated with a set of default Plugins and Backends, but they must be loaded before a user query is attempted. Here's an example of how you might work with DSI using an interactive Python interpreter for your data science workflows::
66

77
>>> from dsi.core import Terminal
88
>>> a=Terminal()
99
>>> a.list_available_modules('plugin')
1010
>>> # ['Bueno', 'Hostname', 'SystemKernel']
11-
>>> a.load_module('plugin','Bueno','consumer',filename='./data/bueno.data')
12-
>>> # Bueno plugin consumer loaded successfully.
13-
>>> a.load_module('plugin','Hostname','producer')
14-
>>> # Hostname plugin producer loaded successfully.
11+
>>> a.load_module('plugin','Bueno','reader',filename='./data/bueno.data')
12+
>>> # Bueno plugin reader loaded successfully.
13+
>>> a.load_module('plugin','Hostname','writer')
14+
>>> # Hostname plugin writer loaded successfully.
1515
>>> a.list_loaded_modules()
16-
>>> # {'producer': [<dsi.plugins.env.Hostname object at 0x7f21232474d0>],
17-
>>> # 'consumer': [<dsi.plugins.env.Bueno object at 0x7f2123247410>],
16+
>>> # {'writer': [<dsi.plugins.env.Hostname object at 0x7f21232474d0>],
17+
>>> # 'reader': [<dsi.plugins.env.Bueno object at 0x7f2123247410>],
1818
>>> # 'front-end': [],
1919
>>> # 'back-end': []}
2020

2121

22-
At this point, you might decide that you are ready to collect data for inspection. It is possible to utilize DSI Drivers to load additional metadata to supplement your Plugin metadata, but you can also sample Plugin data and search it directly.
22+
At this point, you might decide that you are ready to collect data for inspection. It is possible to utilize DSI Backends to load additional metadata to supplement your Plugin metadata, but you can also sample Plugin data and search it directly.
2323

2424

25-
The process of transforming a set of Plugin producers and consumers into a querable format is called transloading. A DSI Core Terminal has a ``transload()`` method which may be called to execute all Plugins at once::
25+
The process of transforming a set of Plugin writers and readers into a queryable format is called transloading. A DSI Core Terminal has a ``transload()`` method which may be called to execute all Plugins at once::
2626

2727
>>> a.transload()
2828
>>> a.active_metadata
@@ -36,32 +36,32 @@ Once a Core Terminal has been transloaded, no further Plugins may be added. Howe
3636
>>> a.active_metadata
3737
>>> # OrderedDict([('uid', [1000, 1000, 1000, 1000]), ('effective_gid', [1000, 1000, 1000...
3838

39-
If you perform data science tasks using Python, it is not necessary to create a DSI Core Terminal front-end because the data is already in a Python data structure. If your data science tasks can be completed in one session, it is not required to interact with DSI Drivers. However, if you do want to save your work, you can load a DSI Driver with a back-end function::
39+
If you perform data science tasks using Python, it is not necessary to create a DSI Core Terminal front-end because the data is already in a Python data structure. If your data science tasks can be completed in one session, it is not required to interact with DSI Backends. However, if you do want to save your work, you can load a DSI Backend with a back-end function::
4040

41-
>>> a.list_available_modules('driver')
41+
>>> a.list_available_modules('backend')
4242
>>> # ['Gufi', 'Sqlite', 'Parquet']
43-
>>> a.load_module('driver','Parquet','back-end',filename='parquet.data')
44-
>>> # Parquet driver back-end loaded successfully.
43+
>>> a.load_module('backend','Parquet','back-end',filename='parquet.data')
44+
>>> # Parquet backend loaded successfully.
4545
>>> a.list_loaded_modules()
46-
>>> # {'producer': [<dsi.plugins.env.Hostname object at 0x7f21232474d0>],
47-
>>> # 'consumer': [<dsi.plugins.env.Bueno object at 0x7f2123247410>],
46+
>>> # {'writer': [<dsi.plugins.env.Hostname object at 0x7f21232474d0>],
47+
>>> # 'reader': [<dsi.plugins.env.Bueno object at 0x7f2123247410>],
4848
>>> # 'front-end': [],
49-
>>> # 'back-end': [<dsi.drivers.parquet.Parquet object at 0x7f212325a110>]}
49+
>>> # 'back-end': [<dsi.backends.parquet.Parquet object at 0x7f212325a110>]}
5050
>>> a.artifact_handler(interaction_type='put')
5151

5252
The contents of the active DSI Core Terminal metadata storage will be saved to a Parquet object at the path you provided at module loading time.
5353

5454
It is possible that you prefer to perform data science tasks using a higher level abstraction than Python itself. This is the purpose of the DSI Driver front-end functionality. Unlike Plugins, Drivers can be added after the initial ``transload()`` operation has been performed::
5555

56-
>>> a.load_module('driver','Parquet','front-end',filename='parquet.data')
57-
>>> # Parquet driver front-end loaded successfully.
56+
>>> a.load_module('backend','Parquet','front-end',filename='parquet.data')
57+
>>> # Parquet backend front-end loaded successfully.
5858
>>> a.list_loaded_modules()
59-
>>> # {'producer': [<dsi.plugins.env.Hostname object at 0x7fce3c612b50>],
60-
>>> # 'consumer': [<dsi.plugins.env.Bueno object at 0x7fce3c622110>],
61-
>>> # 'front-end': [<dsi.drivers.parquet.Parquet object at 0x7fce3c622290>],
62-
>>> # 'back-end': [<dsi.drivers.parquet.Parquet object at 0x7fce3c622650>]}
59+
>>> # {'writer': [<dsi.plugins.env.Hostname object at 0x7fce3c612b50>],
60+
>>> # 'reader': [<dsi.plugins.env.Bueno object at 0x7fce3c622110>],
61+
>>> # 'front-end': [<dsi.backends.parquet.Parquet object at 0x7fce3c622290>],
62+
>>> # 'back-end': [<dsi.backends.parquet.Parquet object at 0x7fce3c622650>]}
6363

64-
Any front-end may be used, but in this case the Parquet driver has a front-end implementation which builds a jupyter notebook from scratch that loads your metadata collection into a Pandas Dataframe. The Parquet front-end will then launch the Jupyter Notebook to support an interactive data science workflow::
64+
Any front-end may be used, but in this case the Parquet backend has a front-end implementation which builds a Jupyter notebook from scratch that loads your metadata collection into a pandas DataFrame. The Parquet front-end will then launch the Jupyter notebook to support an interactive data science workflow::
6565

6666
>>> a.artifact_handler(interaction_type='inspect')
6767
>>> # Writing Jupyter notebook...
@@ -74,13 +74,13 @@ You can then close your Jupyter notebook, ``transload()`` additionally to increa
7474

7575
Although this demonstration only used one Plugin per Plugin functionality, any number of Plugins can be added to collect an arbitrary amount of queryable metadata::
7676

77-
>>> a.load_module('plugin','SystemKernel','producer')
78-
>>> # SystemKernel plugin producer loaded successfully
77+
>>> a.load_module('plugin','SystemKernel','writer')
78+
>>> # SystemKernel plugin writer loaded successfully
7979
>>> a.list_loaded_modules()
80-
>>> # {'producer': [<dsi.plugins.env.Hostname object at 0x7fce3c612b50>, <dsi.plugins.env.SystemKernel object at 0x7fce68519250>],
81-
>>> # 'consumer': [<dsi.plugins.env.Bueno object at 0x7fce3c622110>],
82-
>>> # 'front-end': [<dsi.drivers.parquet.Parquet object at 0x7fce3c622290>],
83-
>>> # 'back-end': [<dsi.drivers.parquet.Parquet object at 0x7fce3c622650>]}
80+
>>> # {'writer': [<dsi.plugins.env.Hostname object at 0x7fce3c612b50>, <dsi.plugins.env.SystemKernel object at 0x7fce68519250>],
81+
>>> # 'reader': [<dsi.plugins.env.Bueno object at 0x7fce3c622110>],
82+
>>> # 'front-end': [<dsi.backends.parquet.Parquet object at 0x7fce3c622290>],
83+
>>> # 'back-end': [<dsi.backends.parquet.Parquet object at 0x7fce3c622650>]}
8484

8585
.. automodule:: dsi.core
8686
:members:

_sources/drivers.rst.txt

Lines changed: 0 additions & 21 deletions
This file was deleted.

_sources/index.rst.txt

Lines changed: 2 additions & 2 deletions
@@ -15,8 +15,8 @@ The Data Science Infrastructure Project (DSI)
1515
installation
1616
core
1717
plugins
18-
drivers
19-
permissions
18+
backends
19+
2020

2121
Indices and tables
2222
==================

_sources/introduction.rst.txt

Lines changed: 5 additions & 5 deletions
@@ -19,21 +19,21 @@ The DSI system is composed of three fundamental parts:
1919

2020
DSI Core Middleware
2121
-------------------
22-
DSI's core middleware is focused on delivering user-queries on unified metadata which are distributed across many files and security domains. DSI currently supports Linux, and is tested on RedHat- and Debian-based distributions. The DSI Core middleware is a home for DSI Plugins and an interface for DSI Drivers.
22+
DSI's core middleware is focused on delivering user-queries on unified metadata which are distributed across many files and security domains. DSI currently supports Linux, and is tested on RedHat- and Debian-based distributions. The DSI Core middleware is a home for DSI Plugins and an interface for DSI Backends.
2323

2424
Plugin Abstract Classes
2525
-----------------------
2626
Plugins transform an arbitrary data source into a format that is compatible with our middleware. We call the parsed and queryable attributes "metadata" (data about the data). Metadata share the same security profile as the source data.
2727

28-
Plugins can operate as data consumers or data producers. A simple data consumer might parse an application's output file and place it into a middleware compatible data structure: Python built-ins and members of the popular Python ``collection`` module. A simple data producer might execute an application to supplement existing data and queriable metadata.
28+
Plugins can operate as data readers or data writers. A simple data reader might parse an application's output file and place it into a middleware-compatible data structure: Python built-ins and members of the popular Python ``collections`` module. A simple data writer might execute an application to supplement existing data and queryable metadata.
2929

3030
Plugins are defined by a base abstract class, and support child abstract classes which inherit the properties of their ancestors.
3131

3232
.. image:: PluginClassHierarchy.png
3333

34-
Driver Abstract Classes
35-
-----------------------
36-
Drivers are an interface between the User and the Core, or an interface between the Core and a storage medium. Drivers can operate as Front-ends or Back-ends, and a Driver contributor can choose to implement one or both. Driver front-ends are built to deliver an experience which is compatible with a User Story. A simple supporting User Story is a need to query metadata by SQL query. Because the set of queriable metadata are spread across filesystems and security domains, a supporting Driver Back-end is required to assemble query results and present them to the DSI core middleware for transformation and return, creating an experience which is compatible with the User Story.
34+
Backend Abstract Classes
35+
------------------------
36+
Backends are an interface between the User and the Core, or an interface between the Core and a storage medium. Backends can operate as front-ends or back-ends, and a Backend contributor can choose to implement one or both. Backend front-ends are built to deliver an experience which is compatible with a User Story. A simple supporting User Story is a need to query metadata by SQL query. Because the set of queryable metadata is spread across filesystems and security domains, a supporting Backend back-end is required to assemble query results and present them to the DSI Core middleware for transformation and return, creating an experience which is compatible with the User Story.
3737

3838
.. image:: user_story.png
3939
:scale: 50%

_sources/permissions.rst.txt

Lines changed: 0 additions & 55 deletions
This file was deleted.

_sources/plugins.rst.txt

Lines changed: 4 additions & 1 deletion
@@ -1,9 +1,12 @@
11
Plugins
22
===================
3-
Plugins connect data-producing applications to DSI middleware. Plugins have "producer" or "consumer" functions. A Plugin consumer function deals with existing data files or input streams. A Plugin producer deals with generating new data. Plugins are modular to support user contribution. Plugin contributors are encouraged to offer custom Plugin abstract classes and Plugin implementations. A contributed Plugin abstract class may extend another plugin to inherit the properties of the parent. In order to be compatible with DSI middleware, Plugins should produce data in Python built-in data structures or data structures sourced from the Python ``collections`` library. Plugin extensions will be accepted conditional to the extention of ``plugins/tests`` to demonstrate the new Plugin capability. We can not accept pull requests that are not tested.
3+
Plugins connect data-producing applications to DSI middleware. Plugins have "writer" or "reader" functions. A Plugin reader function deals with existing data files or input streams. A Plugin writer deals with generating new data. Plugins are modular to support user contribution. Plugin contributors are encouraged to offer custom Plugin abstract classes and Plugin implementations. A contributed Plugin abstract class may extend another Plugin to inherit the properties of the parent. In order to be compatible with DSI middleware, Plugins should produce data in Python built-in data structures or data structures sourced from the Python ``collections`` library. Plugin extensions will be accepted conditional on the extension of ``plugins/tests`` to demonstrate the new Plugin capability. We cannot accept pull requests that are not tested.
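To make the reader idea above more concrete, here is a minimal sketch of a reader that parses ``key=value`` text into a column-oriented ``collections.OrderedDict``. The names ``SketchPlugin``, ``KeyValueReader``, ``output_collector`` and ``add_rows`` are assumptions made for this example and are not taken from the ``dsi.plugins`` source; the input format is likewise invented.

```python
from abc import ABC, abstractmethod
from collections import OrderedDict


class SketchPlugin(ABC):
    """Hypothetical abstract base; not the real dsi.plugins class."""

    def __init__(self):
        # Column-oriented storage: one list of values per metadata key.
        self.output_collector = OrderedDict()

    @abstractmethod
    def add_rows(self):
        """Append parsed metadata to the collector and return it."""


class KeyValueReader(SketchPlugin):
    """Hypothetical reader that parses 'key=value' lines of text."""

    def __init__(self, text):
        super().__init__()
        self.text = text

    def add_rows(self):
        for line in self.text.splitlines():
            if "=" not in line:
                continue  # skip blank or malformed lines
            key, value = line.split("=", 1)
            column = self.output_collector.setdefault(key.strip(), [])
            column.append(value.strip())
        return self.output_collector
```

Because the result is a plain ``OrderedDict`` of lists, it stays within the built-in and ``collections`` data structures that the paragraph above asks Plugins to produce.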
44

55
.. image:: PluginClassHierarchy.png
66

7+
.. automodule:: dsi.plugins.plugin
8+
:members:
9+
710
.. automodule:: dsi.plugins.metadata
811
:members:
912

_static/_sphinx_javascript_frameworks_compat.js

Lines changed: 14 additions & 3 deletions
@@ -1,9 +1,20 @@
1-
/* Compatability shim for jQuery and underscores.js.
1+
/*
2+
* _sphinx_javascript_frameworks_compat.js
3+
* ~~~~~~~~~~
4+
*
5+
* Compatability shim for jQuery and underscores.js.
6+
*
7+
* WILL BE REMOVED IN Sphinx 6.0
8+
* xref RemovedInSphinx60Warning
29
*
3-
* Copyright Sphinx contributors
4-
* Released under the two clause BSD licence
510
*/
611

12+
/**
13+
* select a different prefix for underscore
14+
*/
15+
$u = _.noConflict();
16+
17+
718
/**
819
* small helper function to urldecode strings
920
*
