APP : App Overview

Nuwan Waidyanatha edited this page Oct 13, 2023 · 2 revisions

rezAWARE APPs

Apps are components of the data-as-a-product architecture. Three key apps can be implemented to deliver data science software services:

  1. Wrangler - the leading ETL facilitator for retrieving, transforming, and extending archived data for analytics
  2. Mining - applies AI/ML models to the curated data to generate warehouses
  3. Visualization - analytic tools and objects for visualizing the warehouse data

Additionally, there are:

  • INSTALLER - houses the setup scripts
  • README - a quick intro to rezaware and installation instructions
  • requirements.txt - lists the necessary and sufficient libraries
  • rezaware.py - the main set of classes for configuring and initializing an instance

Each app follows a common structure with:

  • Modules - domain- and sector-specific packages with classes and methods
  • Data - an associated folder structure that mirrors the modules for storing input, output, and temporary data
  • DB (database) - stores all module-wise SQL scripts for creating database schemas, tables, views, and functions
  • Dags - module- and workflow-specific scheduler files for running various pipelines
  • Logs - a folder structure that follows the module hierarchy for storing package-wise logs
  • Notebooks - used for experimenting with the workflows of the module- and requirement-specific pipelines
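
The common layout listed above could be scaffolded roughly as follows. This is only a minimal sketch: the lowercase folder names and the `scaffold_app` helper are assumptions made for illustration and are not taken from the rezaware code base.

```python
from pathlib import Path
import tempfile

# Hypothetical folder names derived from the list above; the actual
# rezaware folder names may differ.
APP_FOLDERS = ("modules", "data", "db", "dags", "logs", "notebooks")


def scaffold_app(root: Path, app: str) -> list[Path]:
    """Create the common sub-folders for one app and return their paths."""
    created = []
    for name in APP_FOLDERS:
        folder = root / app / name
        # parents=True also creates the app folder; exist_ok makes reruns safe
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    # Work in a temporary directory so the sketch leaves nothing behind
    with tempfile.TemporaryDirectory() as tmp:
        for folder in scaffold_app(Path(tmp), "wrangler"):
            print(folder.relative_to(tmp))
```

Because every app gets the same sub-folders, the same helper would apply unchanged to the Mining and Visualization apps.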

Folder Structure

The folder structure for data mirrors that of the modules: simply replace the modules folder with data. This makes it easy to manage and reference data specific to a module-entity and function-package. This level of abstraction allows a common framework for reads and writes across all apps, module-entities, and function-packages. For example:

## folder path to entity = ota and function = scraper
wrangler/data/ota/scraper
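
The modules-to-data substitution described above can be sketched in a few lines of Python. The helper name `data_path_for` and the example module path `wrangler/modules/ota/scraper` are assumptions for illustration; rezaware may resolve these paths differently.

```python
from pathlib import PurePosixPath


def data_path_for(module_path: str) -> str:
    """Return the data folder that mirrors a module folder.

    Hypothetical helper: swaps the first 'modules' path component for 'data'.
    """
    parts = list(PurePosixPath(module_path).parts)
    if "modules" not in parts:
        raise ValueError(f"no 'modules' folder in path: {module_path}")
    parts[parts.index("modules")] = "data"
    return str(PurePosixPath(*parts))


# entity = ota, function = scraper
print(data_path_for("wrangler/modules/ota/scraper"))  # wrangler/data/ota/scraper
```

Deriving the data path from the module path, rather than storing it separately, keeps the two hierarchies aligned by construction.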