Supervised ML model training not reproducible from tutorial #354

qubitzer · 2025-01-23T15:17:25Z

Environment information

OS: Windows 11
Python: 3.10.9
mqt.predictor: 2.1.1

Entire python environment:
absl-py==2.1.0
ale-py==0.10.1
annotated-types==0.7.0
asttokens==3.0.0
beautifulsoup4==4.12.3
bqskit==1.2.0
bqskitrs==0.4.1
certifi==2024.12.14
cffi==1.17.1
charset-normalizer==3.4.1
cloudpickle==3.1.1
colorama==0.4.6
comm==0.2.2
contourpy==1.3.1
cryptography==44.0.0
cycler==0.12.1
debugpy==1.8.12
decorator==5.1.1
dill==0.3.9
docplex==2.29.241
exceptiongroup==1.2.2
executing==2.2.0
Farama-Notifications==0.0.4
fastdtw==0.3.4
filelock==3.17.0
fonttools==4.55.4
frozendict==2.4.6
fsspec==2024.12.0
graphviz==0.20.3
grpcio==1.69.0
gymnasium==1.0.0
h5py==3.12.1
html5lib==1.1
ibm-cloud-sdk-core==3.22.1
ibm-platform-services==0.59.1
idna==3.10
inflection==0.5.1
ipykernel==6.29.5
ipython==8.31.0
ipywidgets==8.1.5
jedi==0.19.2
Jinja2==3.1.5
joblib==1.4.2
jupyter_client==8.6.3
jupyter_core==5.7.2
jupyterlab_widgets==3.0.13
kiwisolver==1.4.8
lark==1.2.2
lxml==5.3.0
Markdown==3.7
markdown-it-py==3.0.0
MarkupSafe==3.0.2
matplotlib==3.10.0
matplotlib-inline==0.1.7
mdurl==0.1.2
more-itertools==10.6.0
mpmath==1.3.0
mqt.bench==1.1.9
mqt.predictor==2.1.1
multitasking==0.0.11
Nasdaq-Data-Link==1.0.4
nest-asyncio==1.6.0
networkx==3.4.2
numpy==1.26.4
opencv-python==4.11.0.86
packaging==24.2
pandas==2.2.3
parso==0.8.4
pbr==6.1.0
peewee==3.17.8
pillow==11.1.0
platformdirs==4.3.6
prompt_toolkit==3.0.50
protobuf==5.29.3
psutil==6.1.1
pure_eval==0.2.3
pycparser==2.22
pydantic==2.9.2
pydantic_core==2.23.4
pygame==2.6.1
Pygments==2.19.1
PyJWT==2.10.1
pyparsing==3.2.1
pyspnego==0.11.2
python-dateutil==2.9.0.post0
pytket==1.39.0
pytket-qiskit==0.62.0
pytz==2024.2
pywin32==308
pyzmq==26.2.0
qiskit==1.3.2
qiskit-aer==0.16.0
qiskit-algorithms==0.3.1
qiskit-finance==0.4.1
qiskit-ibm-runtime==0.34.0
qiskit-nature==0.7.2
qiskit-optimization==0.6.1
qwasm==1.0.1
requests==2.32.3
requests_ntlm==1.3.0
rich==13.9.4
rustworkx==0.15.1
sb3_contrib==2.4.0
scikit-learn==1.6.1
scipy==1.15.1
six==1.17.0
soupsieve==2.6
sspilib==0.2.0
stable_baselines3==2.4.1
stack-data==0.6.3
stevedore==5.4.0
symengine==0.13.0
sympy==1.13.1
tensorboard==2.18.0
tensorboard-data-server==0.7.2
threadpoolctl==3.5.0
torch==2.5.1
tornado==6.4.2
tqdm==4.67.1
traitlets==5.14.3
typing_extensions==4.12.2
tzdata==2025.1
urllib3==2.3.0
wcwidth==0.2.13
webencodings==0.5.1
websocket-client==1.8.0
Werkzeug==3.1.3
widgetsnbextension==4.0.13
yfinance==0.2.52

Description

Hi,

I tried to go through the steps in this tutorial.
Training the RL model with

import mqt.predictor

rl_pred = mqt.predictor.rl.Predictor(
    figure_of_merit="expected_fidelity", device_name="ibm_washington"
)
rl_pred.train_model(timesteps=1024, model_name="sample_model_rl")

seems to have worked. However, it took 36 minutes even though I had reduced the timesteps from 100000 to 1024.

When I follow the next step in the guide and execute

ml_pred = mqt.predictor.ml.Predictor()
ml_pred.generate_compiled_circuits(timeout=600)  # timeout in seconds
training_data, name_list, scores_list = ml_pred.generate_trainingdata_from_qasm_files(
    figure_of_merit="expected_fidelity"
)
mqt.predictor.ml.helper.save_training_data(
    training_data, name_list, scores_list, figure_of_merit="expected_fidelity"
)

I get:
RuntimeError Traceback (most recent call last)
Cell In[2], line 2
1 ml_pred = mqt.predictor.ml.Predictor()
----> 2 ml_pred.generate_compiled_circuits(timeout=600) # timeout in seconds
3 training_data, name_list, scores_list = ml_pred.generate_trainingdata_from_qasm_files(
4 figure_of_merit="expected_fidelity"
5 )
6 mqt.predictor.ml.helper.save_training_data(
7 training_data, name_list, scores_list, figure_of_merit="expected_fidelity"
8 )
...
95 except Exception as e:
96 print(e, filename, device_name)
---> 97 raise RuntimeError("Error during compilation: " + str(e)) from e

RuntimeError: Error during compilation: The RL model is not trained yet. Please train the model before using it.

Expected behavior

No response

How to Reproduce

Create virtual python environment on Windows
Install packages listed in "Environment information"
Run code listed in "Description"

burgholzer · 2025-01-23T20:16:21Z

Thanks for reporting this.
At first glance, this seems like a documentation issue, as we do test this whole procedure as part of our unit tests.
At least, we generate an RL model here

mqt-predictor/tests/compilation/test_predictor_rl.py

Lines 31 to 60 in 1630d2f

    
def test_qcompile_with_newly_trained_models() -> None:
    """Test the qcompile function with a newly trained model.

    Important: Those trained models are used in later tests and must not be deleted.
    To test ESP as well, training must be done with a device that provides all relevant information (i.e. T1, T2 and gate times).
    """
    figure_of_merit = "expected_fidelity"
    device = "ionq_harmony"  # fully specified calibration data
    qc = get_benchmark("ghz", 1, 3)
    predictor = rl.Predictor(figure_of_merit=figure_of_merit, device_name=device)

    model_name = "model_" + figure_of_merit + "_" + device
    model_path = Path(rl.helper.get_path_trained_model() / (model_name + ".zip"))
    if not model_path.exists():
        with pytest.raises(
            FileNotFoundError,
            match=re.escape("The RL model is not trained yet. Please train the model before using it."),
        ):
            rl.qcompile(qc, figure_of_merit=figure_of_merit, device_name=device)

    predictor.train_model(
        timesteps=100,
        test=True,
    )

    res = rl.qcompile(qc, figure_of_merit=figure_of_merit, device_name=device)
    assert isinstance(res, tuple)
    qc_compiled, compilation_information = res
    assert qc_compiled.layout is not None
    assert compilation_information is not None

And I would strongly suspect that this is then used in the ML tests here:

mqt-predictor/tests/device_selection/test_predictor_ml.py

Lines 17 to 47 in 1630d2f

    
def test_train_and_predictor_random_forest_classifier() -> None:
    """Test the training of a random forest classifier.

    This test must be executed prior to any prediction to make sure the model is trained using the latest scikit-learn version.
    """
    predictor = ml.Predictor()
    assert predictor.clf is None
    predictor.train_random_forest_classifier(visualize_results=False, save_classifier=False)

    assert predictor.clf is not None
    qc = benchmark_generator.get_benchmark("ghz", 1, 3)
    prediction = predictor.predict_probs(qc, "expected_fidelity")
    for elem in prediction:
        assert 0 <= elem <= 1
    file = Path("test_qasm.qasm")
    qc = benchmark_generator.get_benchmark("dj", 1, 3)
    with file.open("w", encoding="utf-8") as f:
        dump(qc, f)
    prediction = predictor.predict_probs(file, "expected_fidelity")
    for elem in prediction:
        assert 0 <= elem <= 1
    with pytest.raises(
        FileNotFoundError, match=re.escape("The ML model is not trained yet. Please train the model before using it.")
    ):
        ml.helper.predict_device_for_figure_of_merit(qc, "false_input")  # type: ignore[arg-type]

    (ml.helper.get_path_trained_model("expected_fidelity").parent / "non_zero_indices_expected_fidelity.npy").unlink()

Unfortunately, I do not have stable enough internet at the moment to try to reproduce this.
@nquetschlich do you have any input on this?

nquetschlich · 2025-01-23T22:30:04Z

Hi @qubitzer,

the root cause is the following: When you correctly triggered the training process of the RL compiler, you specifically trained for the ibm_washington device.

However, calling ml_pred.generate_compiled_circuits(timeout=600) starts the generation of the training data for the device selection. For this, it is assumed that an RL compiler has already been trained for every supported device. Since this is not the case, the shown error is raised.

I hope that this clarifies the behavior.
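To see which devices still lack a trained RL model before starting the data generation, a check along these lines might help. It is only a sketch: `missing_rl_models` is a hypothetical helper, and the file naming scheme is an assumption copied from the test excerpt quoted earlier in this thread ("model_" + figure_of_merit + "_" + device + ".zip").

```python
from pathlib import Path


def missing_rl_models(model_dir: Path, device_names: list[str], figure_of_merit: str) -> list[str]:
    """Return the devices for which no trained RL model file is found in model_dir."""
    missing = []
    for device in device_names:
        # Assumed naming scheme, taken from the quoted test:
        # "model_" + figure_of_merit + "_" + device + ".zip"
        model_path = model_dir / f"model_{figure_of_merit}_{device}.zip"
        if not model_path.exists():
            missing.append(device)
    return missing
```

With MQT Predictor installed, `model_dir` would presumably be the directory returned by `rl.helper.get_path_trained_model()`, which is the call the quoted test uses to build the model path.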

qubitzer · 2025-01-24T14:42:11Z

Hi @nquetschlich ,

thanks for the info.

Where do I find the technical device_names of all seven supported devices? I only found their non-technical names here.

nquetschlich · 2025-01-24T14:49:14Z

Hi @qubitzer, sorry for that, they are a bit hard to find and actually we should list them more clearly. They are taken from MQT Bench using the respective line here:

mqt-predictor/src/mqt/predictor/ml/predictor.py

Line 43 in 1630d2f

self.devices = get_available_devices()

This is calling the following MQT Bench function:

https://github.com/cda-tum/mqt-bench/blob/d6cb81b60ae284787ce2135a230533f68fd52212/src/mqt/bench/devices/__init__.py#L55

It would probably be easiest to extract them from MQT Bench as well, by calling the same function. Also, until MQT Predictor v2.1.0, we provided pre-trained models. If you want to skip the training, you could just use v2.0.0.

qubitzer · 2025-01-24T14:56:54Z

Thank you.
I used

from mqt.bench.devices import get_available_devices

for device in get_available_devices():
    print(device.name)

to get:

ibm_washington
ibm_montreal
ionq_harmony
ionq_aria1
oqc_lucy
rigetti_aspen_m3
quantinuum_h2
iqm_adonis
iqm_apollo

However, there is some discrepancy to the list in the tutorial.

Is this intended? Do I need to train an RL model for all 9 devices returned by the get_available_devices API?

nquetschlich · 2025-01-24T15:38:24Z

Thanks for pointing this out. This is not intended; we extended MQT Bench and apparently forgot to update the MQT Predictor documentation accordingly.
Unfortunately, the framework is not yet flexible enough to consider only a subset of the devices. Therefore, you would need to train models for all of them if you want to generate the training data necessary for the device selection (or use the old Predictor version with the pre-trained RL models and fewer devices).
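Training one model per device could be scripted roughly as follows. This is a sketch, not the documented workflow: `train_rl_models_for_devices`, `make_predictor` and `train_all_supported_devices` are hypothetical names, and `model_name` is left at its default on the assumption that the data generation looks for models under the default name, as the quoted test suggests. The `rl.Predictor(...)`, `train_model(timesteps=...)` and `get_available_devices()` calls are the ones already used in this thread.

```python
from typing import Any, Callable, Iterable


def train_rl_models_for_devices(
    device_names: Iterable[str],
    make_predictor: Callable[[str], Any],
    timesteps: int = 1024,
) -> list[str]:
    """Train one RL model per device name and return the names in training order."""
    trained = []
    for name in device_names:
        predictor = make_predictor(name)  # e.g. an rl.Predictor bound to this device
        predictor.train_model(timesteps=timesteps)
        trained.append(name)
    return trained


def train_all_supported_devices(figure_of_merit: str = "expected_fidelity", timesteps: int = 1024) -> list[str]:
    """Wire the loop to MQT Bench and MQT Predictor (imports are kept local on purpose)."""
    from mqt.bench.devices import get_available_devices
    from mqt.predictor import rl

    names = [device.name for device in get_available_devices()]
    return train_rl_models_for_devices(
        names,
        lambda name: rl.Predictor(figure_of_merit=figure_of_merit, device_name=name),
        timesteps=timesteps,
    )
```

Given the 36 minutes reported above for 1024 timesteps on a single device, the full loop over all devices can be expected to take correspondingly longer.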

nquetschlich · 2025-02-03T11:22:55Z

Dear @qubitzer, we have just released a new MQT Predictor version (v2.2.0, see https://github.com/cda-tum/mqt-predictor/releases/tag/v2.2.0) that should solve all your issues (this one as well as #356 and #357). Can you verify that it is the case for you? If not, we are happy to work on it before closing these issues.
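Before re-running the tutorial, it may be worth confirming that the environment actually picked up the new release. The following is a small sketch using only the standard library; the helper names are hypothetical, and the comparison deliberately ignores pre-release suffixes.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def release_tuple(version_string: str) -> tuple[int, ...]:
    """Extract (major, minor, patch) from a version string, padding missing parts with 0."""
    numbers = [int(n) for n in re.findall(r"\d+", version_string)[:3]]
    numbers += [0] * (3 - len(numbers))
    return tuple(numbers)


def is_at_least(installed: str, required: str) -> bool:
    """Compare only the numeric release parts; suffixes such as 'rc1' are ignored."""
    return release_tuple(installed) >= release_tuple(required)


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None
```

For example, `installed_version("mqt.predictor")` uses the distribution name as it appears in the environment listing above, and its result can be passed to `is_at_least(..., "2.2.0")`.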

burgholzer added this to MQT Applications and MQT Jan 23, 2025

github-project-automation bot moved this to In Progress in MQT Jan 23, 2025

github-project-automation bot moved this to In Progress in MQT Applications Jan 23, 2025
