Update benchmark questions (1/2) #178

Merged
merged 2 commits into from
Jun 24, 2024

Collaborator

@wongjingping wongjingping commented Jun 24, 2024

We keep only advising (for its date columns), atis (for its unix timestamp columns), and yelp (for its year/month columns).
We delete the date_functions questions for academic and scholar, and will replace them with questions from the 4 new schemas in a subsequent PR. This is because academic and scholar are semantically similar to advising, and their questions repeat the year/month-syntax questions in yelp.
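
For readers unfamiliar with the three date representations mentioned above, here is a minimal sketch of how the same "rows from 2021" filter differs across them. The tables and column names are hypothetical stand-ins, not the actual advising, atis, or yelp schemas, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables, one per date representation (not the benchmark schemas).
    CREATE TABLE date_style (id INTEGER, happened_on TEXT);          -- ISO date column
    CREATE TABLE unix_style (id INTEGER, happened_at INTEGER);       -- unix timestamp column
    CREATE TABLE ym_style   (id INTEGER, year INTEGER, month TEXT);  -- year / month columns
    INSERT INTO date_style VALUES (1, '2021-03-15'), (2, '2022-07-01');
    INSERT INTO unix_style VALUES (1, 1615766400), (2, 1656633600);
    INSERT INTO ym_style   VALUES (1, 2021, 'March'), (2, 2022, 'July');
""")

# The same question ("which rows fall in 2021?") needs different syntax per style.
queries = {
    "date": "SELECT id FROM date_style WHERE strftime('%Y', happened_on) = '2021'",
    "unix": "SELECT id FROM unix_style "
            "WHERE strftime('%Y', happened_at, 'unixepoch') = '2021'",
    "year_month": "SELECT id FROM ym_style WHERE year = 2021",
}
results = {name: [row[0] for row in conn.execute(sql)] for name, sql in queries.items()}
print(results)
```

Each style exercises a different piece of date handling, which is presumably why one schema per style is enough.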

Other single-question changes:

How many reviews were written for businesses located in California in the last 10 months?

Updated this date_functions question to use the actual date ranges in the data
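
One plausible reading of why the date range matters: a relative window such as "the last 10 months" only returns rows if it overlaps the dates actually present in the data. The sketch below illustrates that check. The helper names and the data range are hypothetical, and the day-of-month is clamped to 28 as a simplification.

```python
from datetime import date

def months_before(ref: date, months: int) -> date:
    """Return the date `months` calendar months before `ref` (day clamped to 28)."""
    total = ref.year * 12 + (ref.month - 1) - months
    year, month0 = divmod(total, 12)
    return date(year, month0 + 1, min(ref.day, 28))

def window_has_data(ref: date, months: int, data_min: date, data_max: date) -> bool:
    """True if the window [ref - months, ref] overlaps [data_min, data_max]."""
    start = months_before(ref, months)
    return start <= data_max and data_min <= ref

# Hypothetical data range; the reference date is the day of this PR.
data_min, data_max = date(2021, 1, 1), date(2021, 12, 31)
ref = date(2024, 6, 24)

print(window_has_data(ref, 10, data_min, data_max))  # window starts 2023-08-24
print(window_has_data(ref, 48, data_min, data_max))  # window starts 2020-06-24
```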

Return the course id's that are offered in either semesters 1 or 2 and ends before 1pm and had an instructor on thursday

Modified 1 question in advising to filter on the time and day-of-week columns, since no other questions were testing for those columns in the advising schema.
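
A rough sketch of what a time plus day-of-week filter for that question could look like. The single flattened table, its column names, and the 'Y'/'N' encoding are assumptions for illustration, not the actual advising schema or the benchmark's reference query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified table (the real advising schema may differ).
    CREATE TABLE course_offering (
        course_id INTEGER, semester INTEGER, end_time TEXT, thursday TEXT);
    INSERT INTO course_offering VALUES
        (101, 1, '11:30:00', 'Y'),   -- matches
        (102, 2, '14:00:00', 'Y'),   -- ends after 1pm
        (103, 3, '10:00:00', 'Y'),   -- wrong semester
        (104, 2, '12:59:00', 'N'),   -- not taught on Thursday
        (105, 2, '09:00:00', 'Y');   -- matches
""")

# Zero-padded HH:MM:SS strings compare correctly as text.
rows = conn.execute("""
    SELECT DISTINCT course_id
    FROM course_offering
    WHERE semester IN (1, 2)
      AND end_time < '13:00:00'
      AND thursday = 'Y'
    ORDER BY course_id
""").fetchall()
course_ids = [r[0] for r in rows]
print(course_ids)
```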

Will make all of the changes before translating them over to the other dialects in 1 go.

Updated some of the existing date_functions questions to use the actual date ranges in the data
@wongjingping wongjingping changed the title Update benchmark questions Update benchmark questions (1/2) Jun 24, 2024
@rishsriv
Member

Thank you! This looks good to me. Okay if we merge this along with the other upcoming PR with new questions added in? That way, we'll keep the current 25 questions for the date functions benchmark until the merge is done

@wendy-aw
Contributor

Thanks for the changes! I'll wait for you to complete the changes on this main set of questions before clarifying some other questions. Meanwhile I'll make changes to the translate script and dialects.py cos some errors were slipping thru (Thanks for spotting all those rishabh!)

@wongjingping
Collaborator Author

wongjingping commented Jun 24, 2024

Added 5 questions for broker and car_dealership each according to the following question types:

broker:

  • top agg timestamp
  • month agg
  • current date diff agg (min)
  • date diff date agg (min)
  • current date diff agg (count)
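
To make two of the broker question types concrete ("month agg" and "current date diff agg (min)"), here is a hedged sketch. The `trades` table is hypothetical, and a fixed `TODAY` value stands in for the current date so the output is reproducible.

```python
import sqlite3

TODAY = "2024-06-24"  # fixed stand-in for the current date (assumption)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table, not the actual broker schema.
    CREATE TABLE trades (trade_id INTEGER, trade_date TEXT, amount INTEGER);
    INSERT INTO trades VALUES
        (1, '2024-04-10', 100), (2, '2024-04-25', 50), (3, '2024-06-20', 75);
""")

# "month agg": count and total per calendar month.
monthly = conn.execute("""
    SELECT strftime('%Y-%m', trade_date) AS month, COUNT(*), SUM(amount)
    FROM trades
    GROUP BY month
    ORDER BY month
""").fetchall()

# "current date diff agg (min)": fewest days between "today" and any trade.
(days_since_latest,) = conn.execute(
    "SELECT CAST(MIN(julianday(?) - julianday(trade_date)) AS INTEGER) FROM trades",
    (TODAY,),
).fetchone()

print(monthly, days_since_latest)
```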

car_dealership:

  • top date diff date
  • agg extract dow
  • agg date compare
  • join info with latest snapshot in a given date range
  • date trunc quarter
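
Similarly, a sketch of the "agg extract dow" and "date trunc quarter" types. The `sales` table is hypothetical. SQLite has no `DATE_TRUNC`, so the quarter is derived from the month number here; in Postgres the equivalents would be `EXTRACT(DOW FROM ...)` and `DATE_TRUNC('quarter', ...)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table, not the actual car_dealership schema.
    CREATE TABLE sales (sale_id INTEGER, sale_date TEXT, price INTEGER);
    INSERT INTO sales VALUES
        (1, '2024-01-15', 20000),   -- Monday, Q1
        (2, '2024-02-05', 15000),   -- Monday, Q1
        (3, '2024-05-03', 30000);   -- Friday, Q2
""")

# Day-of-week aggregation: strftime('%w') returns '0' (Sunday) to '6' (Saturday).
by_dow = conn.execute("""
    SELECT strftime('%w', sale_date) AS dow, COUNT(*)
    FROM sales
    GROUP BY dow
    ORDER BY dow
""").fetchall()

# Quarter bucketing: months 1-3 -> 1, 4-6 -> 2, and so on (integer division).
by_quarter = conn.execute("""
    SELECT (CAST(strftime('%m', sale_date) AS INTEGER) + 2) / 3 AS quarter,
           SUM(price)
    FROM sales
    GROUP BY quarter
    ORDER BY quarter
""").fetchall()

print(by_dow, by_quarter)
```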

Let me know if we'd prefer to add other types of date queries here!

Will add 10 more for the other 2 schema later~

@rishsriv
Member

Thank you! Really appreciate the work on making the date functions evals more representative of actual usage :D

@rishsriv rishsriv merged commit b18bab9 into main Jun 24, 2024
2 checks passed
@rishsriv rishsriv deleted the jp/update1 branch June 24, 2024 11:22
3 participants