Here is a short description of the final project for Architecture of Computer Systems course by Solomiia Leno on quantum-topological analysis of stock market data.
- aim of project
- topological data analysis
- why quantum?
- algorithm
- quantum algorithm
- implementation details
- data and testing
- usage
- references
The aim of the project is to implement quantum algorithm for detecting possible time allocaions of stock market crashes using topological data analysis on IBM Quantum Experience API for quantum computers.
Topological data analysis (shortly - TDA) is an approach to the analysis of topologically stuctured data.This approach manages well the highly dimnesioned or noisy data.
TDA provides a general framework to analyze such data in a manner that is insensitive to the particular metric chosen and provides dimensionality reduction and robustness to noise. Beyond this, it inherits functoriality, a fundamental concept of modern mathematics, from its topological nature, which allows it to adapt to new mathematical tools.
Due to quantum superposition principle, one quantum bit may take infinitely many different states (each being a different superposition of two classical sates - 0 and 1).
Thus n quantum bits allow us to store all 2^n possible topologiacl stucture that may be created with n pieces of data.
Same principle allows one to perform parallel computations, in our case - computation of topological model (simplicial complex) and persistent Betti numbers calculation.
All in all, total time complexity of quantum inplementation of persistent homology is O(n^3) while implementation on classical computer has complexity of O(2^2n).
The algorithm is based on the perisitent homology approach to data analysis. As topological indicator Betti numbers are chosen (usually this method produces very noisy data but it is the most convenient algorithm for quantisation).
Detailed explanation of TDA based analysis of stock market crashes along with Python code is in the experimets/algorithm.ipynb Jupyter notebook.
Main reference for the quantum algorithm development was Quantum algorithms for topological and geometric analysis of big data by S.Lloyd (link in the references section).
Here is detailed quantum algorithm pipeline:
-
download the data on stock prices from Yahoo! Finance and save in the .csv format
-
load the data and store the values of
Close
price only- it is a common practice to use Close values when working with stock market data
-
create data cloud with Takens' embedding algorithm
- the embedding dimension must be a power of 2
- in the implementation presented,
embedding_dim = 2
-
normalize the data cloud
-
express each data point within three alngles for U3 gates
- U3 is a quantum gate that allows one to set qubit to any possible state. U3 is denoted by the following rotation matrix
- since this gate is applied to |0> state, the resulting state looks as following
-
create the IBM Q circuit with needed amount of quantum bits
- it is possible to have as many as 32 quantum bits in the circuit, but if you want to get convenient results with quantum computer you are limited with only 5 quantum bits due to coherence
- this way you are limited to only 2 2-dimensional data points
-
set quantum bits to needed states with U3 gates
-
calculate pairwise distances between the data points
- set as a and b quantum states denoting hte points we want to calculate the distance between and 0 as the state of contol bit, the one the distance will be calculated to
- now, the control bit is in the quantum state denoting distance between a and b
-
contruct simplicial complex from the pairwise distances between points
- in the presented implemetation Grover's search algorithm is used for this task
- you can find detailed description of Grover's search algorithm here
- main idea is to implement Grover's algorithm with oracle that allows multiple solutions, since multiple simplicial complexes can be cinstructed from a single data cloud
-
perform quantum persistent homology algorithm
- now that each quentum state denotes a simpex from general data model, one needs to project onto sum of its boundary simplices. Such a mapping is denoted by
where is the k-th simplex without l-th vertix.
- after boundary mapping is done, the general quantum state is denoted as
- at last, one wants to perform quantum phase estimation algorithm for the following Hermitian matrix
- now, each Betti number can be estimated as
-
plot the Betti curve, set threshold value and detect time periods when the crash happened
- Betti curves produce quite noisy results and thus good crash estimation depends on threshold value chosen
- in the implementation proposed
threshold = 0.3
Depending on the implementation time complexity of pesistent homology algorithm implementation cannot get beteer than
Quantum algorithm, developed by Lloyd, is estimated have time complexity with perfectly parallel access to qRAM.
Due to IBM Q limitations it is impossible to implement the whole algorithm at once, still it is possible to implement some parts of it with follwing time complexities:
The quantum algorithm is implemented with IBM Quantum Experience as well as plain Python code.
All the source code can be found in the modules directory.
- preprocessing.py
- contains funntions for data preprocessing: load data, transform to data cloud using Takens' embedding and data normalization
- quantum.py
- contains functions that are directly related to quantum algorithm implementation: circuit configuration, Grove's search alogorithm implementation etc.
- postprocessing.py
- main.py
- brings all the functional together and manages the access to IBMQ backend
When it comes to implementing the quantum persistent homology algorithm several limitations arise.
- due coherence and high error rate it is possible to get concenient results only by working with 5-qubit computers or simulation
- thus, it is possible to process only two 2-dimnesional points within the quantum implementation of the algorithm
- IBM Q API does not allow one to create custom gate for 3 qubits which is needed for distances calculations
- IBM Q does not provide parallel access to qRAM, thus time complexity estimated by LLoyd gets worse in real-life implementations
- again, due to lack of computational accuracy and qRAM it is impossible to conveniently implement the calculation of quantum density matrix needed for further calculations of Betti numbers
Since not all of the algorithm can be implemented within IBM Quantum Experience API, I worked on several smaller circuits.
You can find the ciruits schemas in the citcuits directory.
As testing data I chose 4 significant histirical cases of stock market crashes: Wall Street crash, Black Monday crash, Crisis of 2008 and 2020 Coronavirus crash.
All the data is stored in .csv files and contains info on values of Dow-Jones and S&P 500 indexes on the time periods of the crashes.
You can find the data files as well as more detailed description and visualizations in this repo - data directory.
In order to try the software developed you need to clone this repository and run the following commands in the terminal
cd quantum-comuting/modules
virtualenv venv
sourse venv/bin/activate
pip install -r requirements.txt
touch ibmq_token.txt
Go to IBM Quantum Experience and create personal account or simply log in if you have one.
In My Account section find Quiskit in local environment and copy your account token. Insert the token in newly created ibmq_token.txt file or right in the main.py code.
Run the program with
python main.py
In order to configure the backend you want to run the circuit on, go to main.py and find the following code
simulator = Aer.get_backend('specify backed you want to use here')
BUT for development I recommend using IBMQ directly as it allows one to run circuit gates step by step, view partial results and monitor the queue for backend.
Here is the link to my project presentation.
All the test data are downloaded from Yahoo! Finance.
Some of the sources I worked with:
- Vakarchuk I. Quantum mechanics. Lviv State University of Ivan Franko; 1998. 614 p.
- Lloyd, Seth & Garnerone, Silvano & Zanardi, Paolo. (2014). Quantum algorithms for topological and geometric analysis of big data. Nature Communications. 10.1038/ncomms10138.
- Dawid Kopczyk. (2018). Quantum machine learning for data scientists
- Detecting stock market crashes with topological data analysis
- Persistent homology with examples