feat: Add QuickBooks verified source #609

Open
wants to merge 13 commits into
base: master

@ah12068 ah12068 commented Apr 22, 2025

This is my first open source contribution 🥳

This PR adds QuickBooks Online as a source, as raised in issue #586.

I am using the python-quickbooks package (PyPI / GitHub) to create the dlt resource and pipeline.

I've only done this for the Customer object from QuickBooks, and I'm aware that many more objects are available. Hopefully it's a good start for others to contribute as well.

Any feedback is appreciated.

Additional Context

To get set up and test

  1. Sign up for QuickBooks
  2. Sign in to the QuickBooks developer portal
  3. In the QuickBooks developer portal, click 'My Hub', then 'Sandboxes', and create/add a sandbox company. Take note of the realm id; this is the company id
  4. In the QuickBooks developer portal, create a workspace or use the sample workspace (this should already exist and is provided by Intuit)
  5. In the workspace, create an application and take note of the client id and client secret (typically the development ones)
  6. In the app, go to settings, then 'Redirect URIs', and take note of the redirect URI
  7. Go to https://developer.intuit.com/app/developer/playground, select the workspace and app, choose the scope (I chose com.intuit.quickbooks.accounting), and click 'Get authorization code'
  8. Click 'Get tokens' and take note of the refresh token and access token in the API response (these last 1 hr for testing)
  9. Put the credentials in secrets.toml. Note that REALM ID = COMPANY ID
[sources.quickbooks_online]
environment='sandbox'
client_id=
client_secret=
company_id=
redirect_uri=
refresh_token=
access_token=
  10. Run the pipeline
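
Before running the pipeline, it can help to confirm that every key in the credentials block is filled in. The following is only a sketch of such a check: `REQUIRED_KEYS` and `missing_credentials` are hypothetical names that are not part of this PR, and the config is passed in as a plain dict rather than resolved through dlt.

```python
from typing import Dict, List, Optional

# Keys taken from the [sources.quickbooks_online] block in secrets.toml.
REQUIRED_KEYS = (
    "environment",
    "client_id",
    "client_secret",
    "company_id",
    "redirect_uri",
    "refresh_token",
    "access_token",
)


def missing_credentials(config: Dict[str, Optional[str]]) -> List[str]:
    """Return the required keys that are absent or blank, in declaration order."""
    missing = []
    for key in REQUIRED_KEYS:
        value = config.get(key)
        if value is None or not str(value).strip():
            missing.append(key)
    return missing


# Hypothetical, partially filled config for illustration.
example = {"environment": "sandbox", "client_id": "abc", "company_id": "123"}
print(missing_credentials(example))
```

An empty list means all seven values are present; it says nothing about whether the tokens are still valid.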

Notes

  1. Incremental loading should be possible, since each record carries the following: MetaData: {'CreateTime': 'YYYY-MM-DDThh:mmTZD', 'LastUpdatedTime': 'YYYY-MM-DDThh:mmTZD'} (ISO 8601 format), as described in the underlying package docs
  2. OAuth is a bit annoying to deal with, since I haven't had much experience with this type of authentication.
  3. I'd like to do more testing to get a better understanding, given that this only tests one table (which is always static if we're using the sample workspace provided by Intuit)
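
To illustrate the first note, here is a rough sketch of cursor-based filtering on the record metadata. It is not code from this PR: `select_updated` is a hypothetical helper, the field name `LastUpdatedTime` is assumed from the QuickBooks `MetaData` object, and the sample records are made up. In dlt itself, this role would normally be played by `dlt.sources.incremental`.

```python
from datetime import datetime
from typing import Any, Dict, Iterable, List, Tuple

Record = Dict[str, Any]


def select_updated(records: Iterable[Record], cursor: str) -> Tuple[List[Record], str]:
    """Keep records updated strictly after `cursor` and return them with the new cursor."""
    cutoff = datetime.fromisoformat(cursor)
    newest = cutoff
    new_cursor = cursor
    fresh: List[Record] = []
    for record in records:
        stamp = record["MetaData"]["LastUpdatedTime"]  # assumed field name
        updated = datetime.fromisoformat(stamp)
        if updated > cutoff:
            fresh.append(record)
            if updated > newest:
                newest, new_cursor = updated, stamp
    return fresh, new_cursor


# Made-up sample data with ISO 8601 timestamps that include a UTC offset.
customers = [
    {"Id": "1", "MetaData": {"CreateTime": "2025-04-01T09:00:00-07:00",
                             "LastUpdatedTime": "2025-04-01T09:00:00-07:00"}},
    {"Id": "2", "MetaData": {"CreateTime": "2025-04-02T09:00:00-07:00",
                             "LastUpdatedTime": "2025-04-20T12:30:00-07:00"}},
]
fresh, cursor = select_updated(customers, "2025-04-10T00:00:00-07:00")
print([c["Id"] for c in fresh], cursor)
```

The returned cursor would be stored between runs so that the next load only picks up records changed since the previous one.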

@rudolfix rudolfix requested review from anuunchin and djudjuu and removed request for anuunchin April 23, 2025 09:43

@djudjuu djudjuu left a comment

Hey @ah12068,
thanks for this contribution. This looks really solid.
Also, it surfaced a bug in our verified-sources repo that will get fixed soon (#613).
The only thing you could have done better is to talk to us before implementing something (a good piece of advice for any OSS contribution).
Then we would have told you whether we want this source or not, which atm is not likely, as it increases the amount of code that we have to maintain. [1]

UPDATED:
In order for this to become a real verified source, we'd need a bit more work so that the pipeline can run with just the credentials file, without any manual work (getting tokens from their website).
Intuit seems to be doing standard OAuth 2.0 authentication, which is something that dlt does support.

So let's try to get this to a state where the user sets the unchanging values in the secrets (client-id, client-secret, project-id, redirect-uri etc.) and the code connects to the OAuth endpoints to get the authorization code, which it then exchanges for the access and refresh tokens, and then instantiates the client with them.

here are the docs for dlt-oauth: https://dlthub.com/docs/general-usage/credentials/complex_types#oauth2credentials
it's an abstract class that is being implemented for GoogleCloud-Credentials, so you could copy how that is done.

If you're into it, you can also try to 'vibe-code' it and have an LLM help you with writing it. Intuit has a lot of docs, and some example client implementations, so that should be a very good starting point.
We put together some advice on how to achieve good results with it here: https://dlthub.com/docs/dlt-ecosystem/llm-tooling/cursor-restapi

  • [1] there was an issue for it, but it wasn't opened by someone from dltHub; maybe you missed that
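
As a rough illustration of the first step of the flow described above, the sketch below builds the URL the user would be sent to in order to obtain an authorization code. It is not the PR's implementation: `build_authorization_url` is a hypothetical helper, and the endpoint URL is an assumption that should be checked against Intuit's OAuth 2.0 documentation.

```python
import secrets
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed value; verify against Intuit's OAuth 2.0 docs before relying on it.
AUTHORIZATION_ENDPOINT = "https://appcenter.intuit.com/connect/oauth2"


def build_authorization_url(
    client_id: str,
    redirect_uri: str,
    scope: str = "com.intuit.quickbooks.accounting",
    state: Optional[str] = None,
) -> Tuple[str, str]:
    """Return the consent URL and the anti-CSRF state value sent with it."""
    state = state or secrets.token_urlsafe(16)
    query = urlencode(
        {
            "client_id": client_id,
            "response_type": "code",  # authorization code grant
            "scope": scope,
            "redirect_uri": redirect_uri,
            "state": state,
        }
    )
    return f"{AUTHORIZATION_ENDPOINT}?{query}", state


# Placeholder credentials for illustration only.
url, state = build_authorization_url("my-client-id", "https://example.com/callback", state="xyz")
params = parse_qs(urlparse(url).query)
print(params["response_type"], params["scope"])
```

After the user approves access, the provider redirects to `redirect_uri` with a `code` parameter, and the `state` value should be compared with the one returned here before the code is exchanged for tokens.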


expected_tables = ["customer", "invoice"]
# only those tables in the schema
assert set(t["name"] for t in pipeline.default_schema.data_tables()) == set(

note: once this is merged: https://github.com/dlt-hub/dlt/pull/2566/files, you can use load_tables_to_dicts instead


djudjuu commented May 7, 2025

@ah12068 just tagging you here again, because I updated my earlier reply


ah12068 commented May 29, 2025

UPDATED: In order for this to become a real verified source, we'd need a bit more work so that the pipeline can run with just the credentials file, without any manual work (getting tokens from their website). Intuit seems to be doing standard OAuth 2.0 authentication, which is something that dlt does support.

So let's try to get this to a state where the user sets the unchanging values in the secrets (client-id, client-secret, project-id, redirect-uri etc.) and the code connects to the OAuth endpoints to get the authorization code, which it then exchanges for the access and refresh tokens, and then instantiates the client with them.

here are the docs for dlt-oauth: https://dlthub.com/docs/general-usage/credentials/complex_types#oauth2credentials it's an abstract class that is being implemented for GoogleCloud-Credentials, so you could copy how that is done.

If you're into it, you can also try to 'vibe-code' it and have an LLM help you with writing it. Intuit has a lot of docs, and some example client implementations, so that should be a very good starting point. We put together some advice on how to achieve good results with it here: https://dlthub.com/docs/dlt-ecosystem/llm-tooling/cursor-restapi

I finally had some time to look into this, and thanks for the docs. It turns out Intuit has a repo with a sample app that implements the OAuth 2.0 flow.

So I've de-django-fied the code (and credited it) so that the auth process can obtain the relevant credentials automatically, or so that end users can easily obtain credentials and reuse them going forward. The reason for this approach is that I didn't quite understand the dlt implementation and ran into errors when trying to use it (it kept raising that ImplementationError()). But at least that means there is some decoupling should the dlt OAuth code change (the dependency in this case is dlt.sources.helpers.requests).

Let me know what you think.
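
For readers following along, a framework-free token refresh can be sketched roughly as below. This is not the code from the PR or from Intuit's sample app: `build_refresh_request` is a hypothetical helper, the token endpoint URL is an assumption to be checked against Intuit's docs, and the function only prepares the request instead of sending it.

```python
import base64
from typing import Dict, Tuple
from urllib.parse import urlencode

# Assumed value; verify against Intuit's OAuth 2.0 docs before relying on it.
TOKEN_ENDPOINT = "https://oauth.platform.intuit.com/oauth2/v1/tokens/bearer"


def build_refresh_request(
    client_id: str, client_secret: str, refresh_token: str
) -> Tuple[str, Dict[str, str], str]:
    """Prepare URL, headers and form body for a standard OAuth 2.0 refresh_token grant."""
    # Client authentication via HTTP Basic: base64("client_id:client_secret").
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {basic}",
        "Accept": "application/json",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    body = urlencode({"grant_type": "refresh_token", "refresh_token": refresh_token})
    return TOKEN_ENDPOINT, headers, body


# Placeholder credentials for illustration only.
url, headers, body = build_refresh_request("id", "secret", "tok")
print(url)
print(body)
```

The resulting tuple could be sent with any HTTP client, for example the `dlt.sources.helpers.requests` module mentioned above, and the JSON response would then carry the new access and refresh tokens.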

@ah12068 ah12068 requested a review from djudjuu May 29, 2025 08:04
3 participants