Request for More Information on Hardcoded References in Preprocessing #1

ShaunFChen · 2023-07-02T03:55:12Z

Hello,

I have been exploring your NeuralCVD repository for our study and appreciate the considerable effort put into this tool. We believe it has the potential to make a significant contribution to our research. However, I have been encountering some difficulties during the preprocessing step.

The tool appears to have hardcoded references to files under:

path = "/data/analysis/ag-reils/steinfej/code/umbrella/pre/ukbb"
data_path = "/data/analysis/ag-reils/ag-reils-shared/cardioRS/data"

in the subfolder named mapping, also:

codes_gp_records = pd.read_feather(f"{data_path}/1_decoded/codes_gp_diagnoses_210119.feather").drop("level", axis=1)
codes_hospital_records = pd.read_feather(f"{data_path}/1_decoded/codes_hes_diagnoses_210120.feather")

which are not included in the output of "0_decode_ukbb.ipynb".
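
For illustration, here is a minimal sketch of how the hardcoded data location could be made configurable and checked for missing inputs before running the notebooks. It is not the repository's code: the environment variable `NEURALCVD_DATA_PATH` and the helpers `resolve_data_path` and `find_missing_inputs` are hypothetical names introduced here. Only the default path and the two feather file names are taken from the lines quoted above.

```python
import os
from pathlib import Path

# Default taken from the hardcoded value quoted in this issue.
DEFAULT_DATA_PATH = "/data/analysis/ag-reils/ag-reils-shared/cardioRS/data"

# Files referenced in the preprocessing code, relative to data_path.
EXPECTED_INPUTS = [
    "1_decoded/codes_gp_diagnoses_210119.feather",
    "1_decoded/codes_hes_diagnoses_210120.feather",
]


def resolve_data_path(env=None):
    """Return the data directory, preferring a (hypothetical) env override."""
    env = os.environ if env is None else env
    return Path(env.get("NEURALCVD_DATA_PATH", DEFAULT_DATA_PATH))


def find_missing_inputs(data_path, expected=EXPECTED_INPUTS):
    """List the expected relative paths that do not exist under data_path."""
    data_path = Path(data_path)
    return [rel for rel in expected if not (data_path / rel).is_file()]


if __name__ == "__main__":
    data_path = resolve_data_path()
    for rel in find_missing_inputs(data_path):
        print(f"missing: {data_path / rel}")
```

Running a check like this first would report every absent input at once, rather than failing at the first `pd.read_feather` call.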

While I understand that the UK Biobank codings are used in your tool, and I'm able to obtain those, there are other datasets which are not clear to me: atc, phecodes, snomed_cor_list, and athena_vocabulary_covid. I am having difficulty confirming the consistency of these data and their format with what the tool requires. In order to correctly run the tool and ensure the validity of our results, it's crucial that we have the same version and format of these specific datasets. Unfortunately, the current resources do not provide sufficient details to accurately reproduce this setup.

As a result, I kindly request you to share these referenced data directly, if it's possible and within compliance.

However, if direct access is not feasible due to any constraints, could you please provide further information on how to obtain or generate these datasets? This ideally includes the specific versions of these datasets, the expected formats, and any preprocessing steps required for compatibility with NeuralCVD.

Your assistance will greatly aid us in overcoming this roadblock, and will facilitate the effective use of this tool in our research.

Thank you for your time and for your invaluable contributions to the field.

Best regards,
Shaun


DhanushB2000 · 2024-06-01T16:41:48Z

Thank you for developing such useful code for exploring UK Biobank data.
I have also been working through this code to familiarise myself with UKBB data.
It would be great if you could share the files the code uses, such as those mentioned in the previous comment.
Your help would be much appreciated.

Regards,
Dhanush
