Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Teradata initial commit #2

Open
wants to merge 16 commits into
base: master
Choose a base branch
from
Open

Conversation

tallamohan
Copy link
Collaborator

Added TeradataResource and TeradataComputeClusterManager

@tallamohan tallamohan requested a review from sc250072 December 3, 2024 05:27
@sc250072
Copy link
Collaborator

sc250072 commented Dec 3, 2024 via email

Copy link
Collaborator

@sc250072 sc250072 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please address the comments and questions

@@ -0,0 +1,2 @@
[run]
branch = True
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this file used and what's important of this field.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .coveragerc file serves as the configuration file for coverage.py. The "branch" option determines whether to measure branch coverage alongside statement coverage. This configuration file is included in all adapters. To maintain consistency with Dagster and other integrations, I have added this file.

@@ -0,0 +1,2 @@
include LICENSE
include dagster_teradata/py.typed
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

py.typed contains only partial. What's the role of this file and why do we require it in MANIFEST file?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The py.typed file is used in Python projects to indicate that the package supports type annotations and is compliant with PEP 561, which outlines how type information is distributed. This file enables type checkers like mypy to recognize and use the type hints in your package.

By default, non-code files like py.typed are not automatically included in the package distribution. To ensure py.typed is included in the distribution (and therefore available to type checkers when the package is installed), it needs to be explicitly added to the MANIFEST.in file

If the py.typed file contains only the word partial, it indicates that the type annotations provided in the package are incomplete and may not cover all aspects of the API. This serves as a disclaimer for users relying on type checking, making it clear that additional caution or manual checks might be necessary.

This is also assed to maintain consistency with Dagster and other integrations.

# dagster-teradata

The docs for `dagster-teradata` can be found
[here](https://docs.dagster.io/_apidocs/libraries/dagster-teradata).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hope this file will be available after release of teradata?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be ready by the time we release, and the work is progressing in parallel.

from dagster_teradata.resources import (
TeradataConnection as TeradataConnection,
TeradataResource as TeradataResource,
fetch_last_updated_timestamps as fetch_last_updated_timestamps,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we require these imports in this file?

@@ -0,0 +1 @@
__version__ = "1!0+dev"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to update this file while release or Dagster team will take care of version change?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

"""Loads CSV, JSON, and Parquet format data from Azure Blob Storage to Teradata.

Args:
:param blob_source_key: The URI format specifying the location of the Azure blob object store.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

documentation missed for azure_client_id and azure_client_secret. Rearrange documentation of fields as per method parameters defined order

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add that

from dagster_teradata.teradata_compute_cluster_manager import TeradataComputeClusterManager


class TeradataResource(ConfigurableResource, IAttachDifferentObjectToOpContext):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel, this resource can be divided into three parts. One for Handling normal query execution, second one for DataTransfer and other for ComputeCluster. Please think on this

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will check the possibility

)


def fetch_last_updated_timestamps(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is role of this method?

config_schema=TeradataResource.to_config_schema(),
description="This resource is for connecting to the Teradata Vantage",
)
def teradata_resource(context) -> TeradataConnection:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this method used internal by dagster core framework?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is mandatory

@@ -0,0 +1,135 @@
from abc import abstractmethod
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is role of this class

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need "teradata_io_manager.py". I will remove that file completely

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants