Predicting Molecular Bond Strengths with Topological Machine Learning

What is it?

This repository showcases the core functionalities of giotto-learn, a Python library for topological machine learning. The accompanying blog post can be found here.

This demo is based on the Predicting Molecular Properties competition on Kaggle, where the task is to predict the bond strength between atoms in molecules.

Getting started

The easiest way to get started is to create a conda environment as follows:

conda create python=3.7 --name molecule -y
conda activate molecule
pip install -r requirements.txt

Results

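The scoring function is described on Kaggle. It is the logarithm of the mean absolute error, computed separately for each coupling type and then averaged over all types (lower is better):

$$\text{score} = \frac{1}{T} \sum_{t=1}^{T} \log\left( \frac{1}{n_t} \sum_{i=1}^{n_t} \left| y_i - \hat{y}_i \right| \right)$$

where:

  • T is the number of coupling types
  • n_t is the number of observations of type t
  • y_i is the actual coupling value for observation i
  • ŷ_i is the predicted coupling value for observation i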
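
As a rough illustration, the metric (the log of the mean absolute error per coupling type, averaged over types) can be sketched in plain Python. The function name `log_mae_score`, the `floor` guard against taking the logarithm of zero, and the toy data are assumptions made for this sketch; they are not taken from the repository's code.

```python
import math
from collections import defaultdict


def log_mae_score(y_true, y_pred, types, floor=1e-9):
    """Average over coupling types of log(mean absolute error within the type).

    `floor` is an assumed lower bound on the per-type MAE so that a perfect
    prediction does not trigger log(0).
    """
    # Collect the absolute errors separately for each coupling type.
    abs_errors = defaultdict(list)
    for actual, predicted, coupling_type in zip(y_true, y_pred, types):
        abs_errors[coupling_type].append(abs(actual - predicted))

    # log(MAE) per type, then the plain mean across the T types.
    log_maes = [
        math.log(max(sum(errs) / len(errs), floor))
        for errs in abs_errors.values()
    ]
    return sum(log_maes) / len(log_maes)


# Toy data: two coupling types with two observations each (illustrative only).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.5, 3.0, 5.0]
types = ["1JHC", "1JHC", "2JHH", "2JHH"]

score = log_mae_score(y_true, y_pred, types)
print(score)  # both types have MAE 0.5, so this equals log(0.5)
```

Because the logarithm is taken per type before averaging, a type with few observations counts as much as a type with many, so a model cannot score well by fitting only the most common coupling type.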

The figure below compares the results obtained with and without topological features.

External code

The following Kaggle notebooks were used for this project:

Some related publications

To get an introduction to the application of topological data analysis to machine learning, see:

Using topological data analysis to make predictions about molecules is not a new idea. The following papers are related:

The following papers inspired the feature engineering: