adql query parser function #696

Draft
wants to merge 46 commits into
base: master

burnout87
Collaborator

No description provided.

@burnout87 burnout87 linked an issue Jun 24, 2024 that may be closed by this pull request

codecov bot commented Jun 24, 2024

Codecov Report

Attention: Patch coverage is 21.00000% with 79 lines in your changes missing coverage. Please review.

Project coverage is 61.68%. Comparing base (f3e0e20) to head (b26a5da).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| cdci_data_analysis/analysis/ivoa_helper.py | 17.54% | 47 Missing ⚠️ |
| cdci_data_analysis/flask_app/app.py | 8.57% | 32 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #696      +/-   ##
==========================================
- Coverage   62.05%   61.68%   -0.38%     
==========================================
  Files          50       51       +1     
  Lines        9138     9237      +99     
==========================================
+ Hits         5671     5698      +27     
- Misses       3467     3539      +72     


@burnout87
Collaborator Author

burnout87 commented Sep 26, 2024

I made quite a bit of progress on this PR in the last two days.

A few important points first:

  • an endpoint in the dispatcher takes the query in ADQL format as an argument, converts it to a PostgreSQL query and runs it; the list of rows (the query result) is returned
    • queries in ADQL format are translated to PostgreSQL using the queryparser library
    • a special extension, pg_sphere, has to be installed on the PostgreSQL server (strict version requirements apply here) in order to execute certain functions
  • the idea is to use a view defined within the gallery DB, which is on a MySQL DB
    • thanks to pgloader, the data can be easily transferred from one DB to the other
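
The flow described in the first bullet can be sketched roughly as follows. This is only an illustration under stated assumptions: `translate_top_to_limit` is a toy stand-in for the queryparser translation step (it only rewrites `SELECT TOP n`), `run_adql_query` is a hypothetical name rather than the function added in this PR, and an in-memory SQLite database stands in for PostgreSQL so that the snippet runs on its own.

```python
import re
import sqlite3
from typing import Callable, List, Tuple


def translate_top_to_limit(adql: str) -> str:
    """Toy stand-in for a real ADQL translator (assumption, not queryparser).

    Rewrites `SELECT TOP n ...` as `SELECT ... LIMIT n` and leaves
    every other query untouched.
    """
    match = re.match(r"\s*SELECT\s+TOP\s+(\d+)\s+(.*)", adql,
                     flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        return adql
    limit, rest = match.groups()
    return f"SELECT {rest.rstrip().rstrip(';')} LIMIT {limit}"


def run_adql_query(adql: str,
                   translate: Callable[[str], str],
                   connection: sqlite3.Connection) -> List[Tuple]:
    """Translate an ADQL query, run it, and return the list of rows."""
    sql = translate(adql)
    cursor = connection.cursor()
    cursor.execute(sql)
    return cursor.fetchall()


# Hypothetical table and data, used only to exercise the sketch.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE products (title TEXT, ra REAL, dec REAL)")
connection.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [("Crab", 83.63, 22.01), ("Cyg X-1", 299.59, 35.20), ("Vela", 128.84, -45.18)],
)

rows = run_adql_query(
    "SELECT TOP 2 title FROM products ORDER BY ra",
    translate_top_to_limit,
    connection,
)
```

Passing the translator in as a callable keeps the endpoint logic independent of the translation library, which would make it easier to test without a PostgreSQL instance.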

Why PostgreSQL?

There is an equivalent extension for MySQL, called mysql_sphere, but it is no longer developed and is now quite outdated. I gave it a try anyway, but I was not able to compile it. I also had a brief exchange with the queryparser development team, and they advised against it.

By contrast, I am testing pg_sphere (running PostgreSQL version 16) and I can currently run ADQL queries successfully against the gallery DB.
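
For context on why a spherical extension is needed at all: ADQL defines geometry functions such as POINT, CIRCLE and CONTAINS, and evaluating them requires spherical-geometry support in the database. The sketch below is not how pg_sphere implements them; it only shows, in plain Python, what a cone-search predicate computes. The function names are made up for this example.

```python
import math


def angular_separation_deg(ra1: float, dec1: float,
                           ra2: float, dec2: float) -> float:
    """Great-circle separation in degrees between two sky positions.

    Uses the haversine formula; all inputs are in degrees.
    """
    ra1_r, dec1_r, ra2_r, dec2_r = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2_r - dec1_r) / 2) ** 2
         + math.cos(dec1_r) * math.cos(dec2_r)
         * math.sin((ra2_r - ra1_r) / 2) ** 2)
    # Clamp to 1 to guard against rounding slightly above the valid range.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def in_cone(ra: float, dec: float,
            center_ra: float, center_dec: float, radius_deg: float) -> bool:
    """True if (ra, dec) lies within radius_deg of the cone centre."""
    return angular_separation_deg(ra, dec, center_ra, center_dec) <= radius_deg
```

Doing this row by row in application code would not scale, which is why having the database evaluate it through an extension is attractive.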

@burnout87
Collaborator Author

I can't really understand this error: https://github.com/oda-hub/dispatcher-app/actions/runs/11054770284/job/30712623502?pr=696

Any ideas? It should not be related to this PR.

@burnout87
Collaborator Author

burnout87 commented Sep 27, 2024

> I can't really understand this error: https://github.com/oda-hub/dispatcher-app/actions/runs/11054770284/job/30712623502?pr=696
>
> Any ideas? It should not be related to this PR.

It was caused by a conflict between the pytest and pytest-xdist libraries, introduced by the queryparser-python3 library (only on Python 3.8).

@@ -708,10 +729,16 @@
    yield yaml.load(open(dispatcher_test_conf_with_gallery_fn), Loader=yaml.SafeLoader)['dispatcher']


@pytest.fixture
def dispatcher_test_conf_with_vo_options(dispatcher_test_conf_with_vo_options_fn):
    yield yaml.load(open(dispatcher_test_conf_with_vo_options_fn), Loader=yaml.SafeLoader)['dispatcher']

Check warning

Code scanning / CodeQL

File is not always closed Warning

File is opened but is not closed.
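
One possible way to address this warning is to open the file in a `with` block, so that it is closed as soon as it has been parsed, even if parsing raises. A minimal sketch, with assumptions: `load_conf_section` is a hypothetical helper that does not exist in the repository, and `json.load` is used as a stand-in parser so the snippet needs only the standard library. In the fixture itself one would presumably pass `yaml.safe_load`, which is equivalent to `yaml.load(..., Loader=yaml.SafeLoader)`.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable, IO


def load_conf_section(path: Path,
                      parse: Callable[[IO[str]], Any],
                      section: str = "dispatcher") -> Any:
    """Parse a config file and return one top-level section.

    The with-block guarantees the file handle is closed on exit,
    whether parsing succeeds or raises.
    """
    with open(path) as conf_file:
        return parse(conf_file)[section]


# Usage with a throwaway file; the keys below are invented for the example.
with tempfile.TemporaryDirectory() as tmp_dir:
    conf_path = Path(tmp_dir) / "conf.json"
    conf_path.write_text(json.dumps({"dispatcher": {"example_key": "example_value"}}))
    conf = load_conf_section(conf_path, json.load)
```

A fixture could then `yield load_conf_section(dispatcher_test_conf_with_vo_options_fn, yaml.safe_load)`; the same pattern would apply to the existing gallery fixture.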
@burnout87
Collaborator Author

I tried to import the whole product gallery production DB directly into the PostgreSQL instance (creating a PostgreSQL table out of the MySQL view, i.e. materializing it). The whole process took an unsustainable amount of time and was probably causing disruptions.

So I decided to try skipping the materialization step. With that, the import was very quick (less than a minute), and after some initial testing I realized that querying the view was still an option, as the time taken was acceptable.
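
To make the trade-off concrete, here is a small sketch of the difference between materializing a view into a table and querying the view directly. It uses an in-memory SQLite database purely as a stand-in (the real setup involves MySQL, pgloader and PostgreSQL), and all table, view and column names are invented for the example.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.executescript(
    """
    CREATE TABLE product (id INTEGER PRIMARY KEY, title TEXT);
    INSERT INTO product (title) VALUES ('first'), ('second');

    -- A view stores only the query; rows are computed when it is read.
    CREATE VIEW product_view AS SELECT id, title FROM product;

    -- Materializing copies every row of the view into a real table.
    CREATE TABLE product_materialized AS SELECT * FROM product_view;
    """
)

# A row added after materialization shows up in the view only.
connection.execute("INSERT INTO product (title) VALUES ('third')")


def count_rows(name: str) -> int:
    """Count the rows of a table or view (name is trusted, example only)."""
    return connection.execute(f"SELECT COUNT(*) FROM {name}").fetchone()[0]


view_count = count_rows("product_view")
materialized_count = count_rows("product_materialized")
```

The materialized copy pays its cost up front, when every row is copied, and can go stale afterwards; the view is cheap to create but is evaluated at query time, which is acceptable as long as queries stay fast enough.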

Development

Successfully merging this pull request may close these issues.

make VO endpoints: SIAP, TAP