Skip to content

sol4ik/quantum-topological-analysis-of-stock-market-data

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

45 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Quantum-topological analysis of stock market data

Here is a short description of the final project for Architecture of Computer Systems course by Solomiia Leno on quantum-topological analysis of stock market data.

table of contents

  1. aim of project
  2. topological data analysis
  3. why quantum?
  4. algorithm
  5. quantum algorithm
    1. time complexity
  6. implementation details
    1. limitations
    2. circuits
  7. data and testing
  8. usage
  9. references

aim of project

The aim of the project is to implement quantum algorithm for detecting possible time allocaions of stock market crashes using topological data analysis on IBM Quantum Experience API for quantum computers.

topological data analysis

Topological data analysis (shortly - TDA) is an approach to the analysis of topologically stuctured data.This approach manages well the highly dimnesioned or noisy data.

TDA provides a general framework to analyze such data in a manner that is insensitive to the particular metric chosen and provides dimensionality reduction and robustness to noise. Beyond this, it inherits functoriality, a fundamental concept of modern mathematics, from its topological nature, which allows it to adapt to new mathematical tools.

why quantum?

Due to quantum superposition principle, one quantum bit may take infinitely many different states (each being a different superposition of two classical sates - 0 and 1).

Thus n quantum bits allow us to store all 2^n possible topologiacl stucture that may be created with n pieces of data.

Same principle allows one to perform parallel computations, in our case - computation of topological model (simplicial complex) and persistent Betti numbers calculation.

All in all, total time complexity of quantum inplementation of persistent homology is O(n^3) while implementation on classical computer has complexity of O(2^2n).

algorithm

The algorithm is based on the perisitent homology approach to data analysis. As topological indicator Betti numbers are chosen (usually this method produces very noisy data but it is the most convenient algorithm for quantisation).

Detailed explanation of TDA based analysis of stock market crashes along with Python code is in the experimets/algorithm.ipynb Jupyter notebook.

quantum algorithm

Main reference for the quantum algorithm development was Quantum algorithms for topological and geometric analysis of big data by S.Lloyd (link in the references section).

Here is detailed quantum algorithm pipeline:

  1. download the data on stock prices from Yahoo! Finance and save in the .csv format

  2. load the data and store the values of Close price only

    • it is a common practice to use Close values when working with stock market data
  3. create data cloud with Takens' embedding algorithm

    • the embedding dimension must be a power of 2
    • in the implementation presented, embedding_dim = 2
  4. normalize the data cloud

    • consider each point as a vector from R^n and normalize it to be of unit length
  5. express each data point within three alngles for U3 gates

    • U3 is a quantum gate that allows one to set qubit to any possible state. U3 is denoted by the following rotation matrix

    U3 matrix

    • since this gate is applied to |0> state, the resulting state looks as following

    quantum state

    • thus, since we start with |0> state alpha = 0, and phi = 0 because both coordinates of data point are real numbers
  6. create the IBM Q circuit with needed amount of quantum bits

    • it is possible to have as many as 32 quantum bits in the circuit, but if you want to get convenient results with quantum computer you are limited with only 5 quantum bits due to coherence
    • this way you are limited to only 2 2-dimensional data points
  7. set quantum bits to needed states with U3 gates

  8. calculate pairwise distances between the data points

    • set as a and b quantum states denoting hte points we want to calculate the distance between and 0 as the state of contol bit, the one the distance will be calculated to

    distance calulation

    • now, the control bit is in the quantum state denoting distance between a and b
  9. contruct simplicial complex from the pairwise distances between points

    • in the presented implemetation Grover's search algorithm is used for this task
    • you can find detailed description of Grover's search algorithm here
    • main idea is to implement Grover's algorithm with oracle that allows multiple solutions, since multiple simplicial complexes can be cinstructed from a single data cloud
  10. perform quantum persistent homology algorithm

    • now that each quentum state |s_k> denotes a simpex from general data model, one needs to project |s_k> onto sum of its boundary simplices. Such a mapping is denoted by

    boundary mapping

    where s_k_1(l) is the k-th simplex without l-th vertix.

    • after boundary mapping is done, the general quantum state is denoted as sigma_k
    • at last, one wants to perform quantum phase estimation algorithm for the following Hermitian matrix

    B_k hermitian matrix

    • now, each Betti number can be estimated as

    k-th Betti number

    where H_k+1 is the vector space spanned by vectoros s_k+1

  11. plot the Betti curve, set threshold value and detect time periods when the crash happened

    • Betti curves produce quite noisy results and thus good crash estimation depends on threshold value chosen
    • in the implementation proposed threshold = 0.3

time complexity

Depending on the implementation time complexity of pesistent homology algorithm implementation cannot get beteer than O(2^2n)

Quantum algorithm, developed by Lloyd, is estimated have time complexity O(n^3) with perfectly parallel access to qRAM.

Due to IBM Q limitations it is impossible to implement the whole algorithm at once, still it is possible to implement some parts of it with follwing time complexities:

  • Grover's serach algorithm - O(sqrt n)
  • quantum qhase estimtion algorithm - O(1/epsilon) with epsilon being the additive error

implementation details

The quantum algorithm is implemented with IBM Quantum Experience as well as plain Python code.

All the source code can be found in the modules directory.

  • preprocessing.py
    • contains funntions for data preprocessing: load data, transform to data cloud using Takens' embedding and data normalization
  • quantum.py
    • contains functions that are directly related to quantum algorithm implementation: circuit configuration, Grove's search alogorithm implementation etc.
  • postprocessing.py
  • main.py
    • brings all the functional together and manages the access to IBMQ backend

limitations

When it comes to implementing the quantum persistent homology algorithm several limitations arise.

  • due coherence and high error rate it is possible to get concenient results only by working with 5-qubit computers or simulation
    • thus, it is possible to process only two 2-dimnesional points within the quantum implementation of the algorithm
  • IBM Q API does not allow one to create custom gate for 3 qubits which is needed for distances calculations
  • IBM Q does not provide parallel access to qRAM, thus time complexity estimated by LLoyd gets worse in real-life implementations
  • again, due to lack of computational accuracy and qRAM it is impossible to conveniently implement the calculation of quantum density matrix needed for further calculations of Betti numbers

circuits

Since not all of the algorithm can be implemented within IBM Quantum Experience API, I worked on several smaller circuits.

You can find the ciruits schemas in the citcuits directory.

data and testing

As testing data I chose 4 significant histirical cases of stock market crashes: Wall Street crash, Black Monday crash, Crisis of 2008 and 2020 Coronavirus crash.

All the data is stored in .csv files and contains info on values of Dow-Jones and S&P 500 indexes on the time periods of the crashes.

You can find the data files as well as more detailed description and visualizations in this repo - data directory.

usage

In order to try the software developed you need to clone this repository and run the following commands in the terminal

cd quantum-comuting/modules

virtualenv venv
sourse venv/bin/activate

pip install -r requirements.txt

touch ibmq_token.txt

Go to IBM Quantum Experience and create personal account or simply log in if you have one.

In My Account section find Quiskit in local environment and copy your account token. Insert the token in newly created ibmq_token.txt file or right in the main.py code.

Run the program with

python main.py

In order to configure the backend you want to run the circuit on, go to main.py and find the following code

simulator = Aer.get_backend('specify backed you want to use here')

BUT for development I recommend using IBMQ directly as it allows one to run circuit gates step by step, view partial results and monitor the queue for backend.

references

Here is the link to my project presentation.

All the test data are downloaded from Yahoo! Finance.

Some of the sources I worked with:

  • Vakarchuk I. Quantum mechanics. Lviv State University of Ivan Franko; 1998. 614 p.
  • Lloyd, Seth & Garnerone, Silvano & Zanardi, Paolo. (2014). Quantum algorithms for topological and geometric analysis of big data. Nature Communications. 10.1038/ncomms10138.
  • Dawid Kopczyk. (2018). Quantum machine learning for data scientists
  • Detecting stock market crashes with topological data analysis
  • Persistent homology with examples

About

Course project for Architecture of Computer Systems

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published