logicalclocks/hopsworks-tutorials

👨🏻‍🏫 Hopsworks Tutorials

Welcome to our collection of tutorials on the fundamentals of Hopsworks and machine learning development. The tutorials cover a range of use cases and common subjects in the field, and show how to use models in a production environment with the Hopsworks Feature Store.

⚙️ How to run the tutorials:

To run the tutorials, you need a Hopsworks account. To create one, go to app.hopsworks.ai. With a managed account, just run the Jupyter notebooks from within Hopsworks.

Generally the notebooks contain the information you will need on how to interact with the Hopsworks Platform.

If you have an app.hopsworks.ai account, you can connect to Hopsworks with the following lines. You will be prompted with a link to your token, which is then used to connect to the feature store.

import hopsworks
 
project = hopsworks.login()
fs = project.get_feature_store()
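
For scripts or scheduled jobs where an interactive prompt is not practical, the same login call can be given an API key directly. The following is a minimal sketch, not an official recipe: it assumes that `hopsworks.login()` accepts an `api_key_value` argument and that you have stored your key in an environment variable named `HOPSWORKS_API_KEY`. The helper names `login_kwargs` and `connect` are hypothetical and used only for illustration.

```python
import os


def login_kwargs(env=os.environ):
    """Build keyword arguments for hopsworks.login() from an environment mapping.

    Assumption: the API key lives in HOPSWORKS_API_KEY and is passed to
    hopsworks.login() as api_key_value. An empty dict is returned when the
    variable is missing, which falls back to the interactive prompt.
    """
    api_key = env.get("HOPSWORKS_API_KEY")
    return {"api_key_value": api_key} if api_key else {}


def connect():
    """Log in and return the project's feature store handle."""
    # Imported lazily so login_kwargs() can be used without the package installed.
    import hopsworks

    project = hopsworks.login(**login_kwargs())
    return project.get_feature_store()
```

Calling `fs = connect()` then behaves like the two lines above, but skips the prompt whenever the environment variable is set.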

In some cases, you may also need to install the hopsworks package before you can use it. Simply start your notebook with:

!pip install -U hopsworks --quiet
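
If you are unsure whether the package is already available in your environment, you can check before installing. This is a small sketch using only the Python standard library. The helper name `is_installed` is hypothetical, and the check only tells you that a package can be found, not which version is installed.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be found in the current environment."""
    return importlib.util.find_spec(package_name) is not None


if not is_installed("hopsworks"):
    # In a notebook, run the pip command shown above in its own cell.
    print("hopsworks is not installed; run: pip install -U hopsworks")
```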

The walkthroughs and tutorials are provided as Python notebooks, so you will need to run a Jupyter environment or work within a Google Colaboratory notebook. With the latter option, some minor errors might be displayed, or some libraries might require different versions to work.

✍🏻 Concepts:

In order to understand the tutorials you need to be familiar with general concepts of Machine Learning and Python development. You may find some useful information in the Hopsworks documentation.

🗄️ Table of Contents:

  • Basic Tutorials:
    • QuickStart: Introductory tutorial to get started quickly.
    • Churn: Predict customers that are at risk of churning.
    • Fraud Batch: Detect fraudulent transactions (batch use case).
    • Fraud Online: Detect fraudulent transactions (online use case).
    • Iris: Classify iris flower species.
    • Loan Approval: Predict loan approvals.
  • Advanced Tutorials:
    • Air Quality: Creating an air quality AI assistant that displays and explains air quality indicators for specific dates or periods, using Function Calling for LLMs and a RAG approach without a vector database.
    • Bitcoin: Predict the Bitcoin price using time series features and sentiment analysis of tweets.
    • Citibike: Predict the number of Citibike users at each Citibike station in New York City.
    • Credit Scores: Predict clients' repayment abilities.
    • Electricity: Predict the electricity prices in several Swedish cities based on weather conditions, previous prices, and Swedish holidays.
    • NYC Taxi Fares: Predict the fare amount for a taxi ride in New York City given the pickup and dropoff locations.
    • Recommender System: Build a recommender system for fashion items.
    • TimeSeries: Timeseries price prediction.
    • LLM PDF: An AI assistant that utilizes a Retrieval-Augmented Generation (RAG) system to provide accurate answers to user questions by retrieving relevant context from PDF documents.
    • Fraud Cheque Detection: Building an AI assistant that detects fraudulent scanned cheque images and generates explanations for the fraud classification, using a fine-tuned open-source LLM.
    • Keras model and Sklearn Transformation Functions with Hopsworks Model Registry: How to register Sklearn transformation functions and a Keras model in the Hopsworks Model Registry, then retrieve them and use them in training and inference pipelines.
    • PyTorch model and Sklearn Transformation Functions with Hopsworks Model Registry: How to register Sklearn transformation functions and a PyTorch model in the Hopsworks Model Registry, then retrieve them and use them in training and inference pipelines.
    • Sklearn Transformation Functions With Hopsworks Model Registry: How to register an sklearn.pipeline with transformation functions and a classifier in the Hopsworks Model Registry and use it in training and inference pipelines.
    • Custom Transformation Functions: How to register custom transformation functions in the Hopsworks Feature Store and use them in training and inference pipelines.
  • Integrations:
    • BigQuery Storage Connector: Create an External Feature Group using BigQuery Storage Connector.
    • Google Cloud Storage: Create an External Feature Group using GCS Storage Connector.
    • Redshift: Create an External Feature Group using Redshift Storage Connector.
    • Snowflake: Create an External Feature Group using Snowflake Storage Connector.
    • DBT Tutorial with BigQuery: Perform feature engineering in DBT on BigQuery.
    • WandB: Build a machine learning model with Weights & Biases.
    • Great Expectations: Introduction to Great Expectations concepts and classes which are relevant for integration with the Hopsworks MLOps platform.
    • Neo4j: Perform anti-money laundering (AML) predictions using a Neo4j graph representation of transactions.
    • Polars: Introductory tutorial on using Polars.
    • PySpark Streaming: Real-time feature computation from streaming data using PySpark and Hopsworks Feature Store.
    • Monitoring: How to implement feature monitoring in your production pipeline.
    • Bytewax: Real-time feature computation using Bytewax.
    • Apache Beam: Real-time feature computation using Apache Beam, Google Cloud Dataflow and Hopsworks Feature Store.
    • Apache Flink: Real-time feature computation using Apache Flink and Hopsworks Feature Store.
    • MageAI: Build and operate an ML system with Mage and Hopsworks.

📝 Feedback & Comments:

We welcome feedback and suggestions. You can contact us on any of the following channels: