
Connecting to Apache Pinot as a data source #8579

Closed
AnishNair94 opened this issue Aug 14, 2024 · 8 comments
Labels: data source driver, enhancement, help wanted

Comments

@AnishNair94

Is your feature request related to a problem? Please describe.
The lack of Apache Pinot integration in Cube creates a bottleneck in our analytics workflow and limits our ability to fully leverage Cube's headless BI approach.

Describe the solution you'd like
I would like Cube to add native integration with Apache Pinot by developing a dedicated driver or connector.

Describe alternatives you've considered
It seems like the only other alternatives are to look for another headless BI tool or create our own.

Additional context
Apache Pinot version 1.0. We are looking for multi-stage engine support, which provides joins, subqueries, etc. and is closer to ANSI SQL.

@ovr
Member

ovr commented Aug 14, 2024

cc @igorlukanin Should we add this to #7076?

@igorlukanin added the enhancement, help wanted, and data source driver labels on Aug 14, 2024

If you are interested in working on this issue, please go ahead and open a PR for it.
We'd be happy to review and merge it.
If this is your first time contributing a pull request to Cube, please check our contribution guidelines.
You can also post any questions you have while contributing in the #contributors channel in the Cube Slack.

@igorlukanin
Member

@AnishNair94 Thanks for the proposal! I think it would be great if someone from the community volunteers to create a Pinot driver. We have some contribution guidelines here: https://github.com/cube-js/cube/blob/master/CONTRIBUTING.md#contributing-database-drivers

@jronsse
Contributor

jronsse commented Sep 2, 2024

Hello! We had the same requirement while building our application with Pinot as a data source. We ended up using CubeJS -> Trino -> Apache Pinot.

The Trino driver is built on the Presto driver, which polls Trino for query results, and that polling degrades performance. The default checkInterval is 800ms. As mentioned in this issue, you can lower the interval to get better response times. You can do so in CubeJS by providing a driverFactory:

const TrinoDriver = require('@cubejs-backend/trino-driver');

module.exports = {
  driverFactory: ({ securityContext }) => {
    // Poll Trino every 100 ms instead of the default 800 ms
    return new TrinoDriver({ checkInterval: 100 });
  },
};
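
For a rough feel of why the interval matters, here is a small sketch. It rests on a simplifying assumption (not taken from the driver source): the client checks the query state once per checkInterval, so a finished query is only noticed at the next check. The function name `observedLatencyMs` and the 250 ms query time are made up for illustration.

```javascript
// Simplified model, NOT the driver's actual code: assume the client checks
// query state once every `checkIntervalMs`, so a query that has finished is
// only noticed at the next check.
function observedLatencyMs(queryMs, checkIntervalMs) {
  if (checkIntervalMs <= 0) {
    throw new RangeError('checkIntervalMs must be positive');
  }
  // At least one check is always needed, even for an instant query.
  const checks = Math.max(1, Math.ceil(queryMs / checkIntervalMs));
  return checks * checkIntervalMs;
}

// Hypothetical query that takes 250 ms on the Trino side.
for (const interval of [800, 100]) {
  console.log(`checkInterval=${interval}ms -> ~${observedLatencyMs(250, interval)}ms`);
}
```

Under this model a shorter interval cuts the wait after the query finishes, at the cost of more frequent requests to Trino.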

I'm working on a cubejs driver for Apache Pinot whenever I get some spare time. Hopefully it won't take me too long.

@igorlukanin
Member

Hi @jronsse 👋

Amazing! Thanks for taking the lead on this!

@AnishNair94
Author

AnishNair94 commented Nov 3, 2024

Hey @jronsse
This is awesome!
I pulled the latest Docker image to try this out, but couldn't find Pinot in the list of data sources.
Can anyone help?

@jronsse
Contributor

jronsse commented Nov 3, 2024

Hello @AnishNair94 !

I just tried it on my end with the latest version, v1.1.2. Pinot does not show up in the data source selection in the UI, but you can still set it up through environment variables.
Here's an example:

CUBEJS_DB_TYPE=pinot
CUBEJS_DB_HOST=http[s]://pinot.broker.host
CUBEJS_DB_PORT=8099
CUBEJS_DB_USER=pinot_user
CUBEJS_DB_PASS=**********

The user and password are not mandatory.
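
If it helps to sanity-check the variables before starting Cube, here is a minimal sketch. The helper `describePinotEnv` is hypothetical and not part of Cube or the Pinot driver. Treating the type, host, and port as required and expecting the host to carry an http:// or https:// scheme are assumptions based only on the example above.

```javascript
// Hypothetical helper, not part of Cube: it only inspects environment-style
// key/value pairs and never talks to Pinot.
function describePinotEnv(env) {
  // Assumption: these three are treated as required, following the example.
  const required = ['CUBEJS_DB_TYPE', 'CUBEJS_DB_HOST', 'CUBEJS_DB_PORT'];
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing variables: ${missing.join(', ')}`);
  }
  if (env.CUBEJS_DB_TYPE !== 'pinot') {
    throw new Error(`Expected CUBEJS_DB_TYPE=pinot, got ${env.CUBEJS_DB_TYPE}`);
  }
  // Assumption: the host includes the scheme, as in the example above.
  if (!/^https?:\/\//.test(env.CUBEJS_DB_HOST)) {
    throw new Error('CUBEJS_DB_HOST should start with http:// or https://');
  }
  const broker = new URL(`${env.CUBEJS_DB_HOST}:${env.CUBEJS_DB_PORT}`);
  return {
    brokerUrl: broker.origin,
    // User and password are optional, so only report whether a user is set.
    usesCredentials: Boolean(env.CUBEJS_DB_USER),
  };
}

// Usage with placeholder values; pass process.env to check a real setup.
console.log(describePinotEnv({
  CUBEJS_DB_TYPE: 'pinot',
  CUBEJS_DB_HOST: 'https://pinot.broker.host',
  CUBEJS_DB_PORT: '8099',
}));
```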

Please let me know if you encounter any issues using the driver once you've got it up and running.

@igorlukanin
Member

I've added a note on the UI vs. env vars in the docs on the Pinot page. With that, I guess we're good to close this issue as resolved. Huge thanks for your contribution @jronsse 🙌


4 participants