Ingest sloan courses via api #1487
Merged
+365
−117
Changes from 2 commits
5 commits
d67f877 create sloan courses and course offerings data assets extracted from API (rachellougee)
df2c2b8 add sloan api definition and code location (rachellougee)
95bcf5b Create a new OAuth resource, refactor OpenEdxApiClient to inherit fro… (rachellougee)
237baf8 remove unused code (rachellougee)
ec94b6a remove sloan API method from OAuthApiClient and add them to assets file (rachellougee)
@@ -0,0 +1,158 @@

```python
import time
from collections.abc import Generator
from contextlib import contextmanager
from datetime import UTC, datetime, timedelta
from typing import Any, Optional, Self

import httpx
from dagster import ConfigurableResource, InitResourceContext, ResourceDependency
from pydantic import Field, PrivateAttr, ValidationError, validator

from ol_orchestrate.resources.secrets.vault import Vault

TOO_MANY_REQUESTS = 429


class OAuthApiClient(ConfigurableResource):
    client_id: str = Field(description="OAUTH2 client ID")
    client_secret: str = Field(description="OAUTH2 client secret")
    token_type: str = Field(
        default="JWT",
        description="Token type to generate for use with authenticated requests",
    )
    token_url: str = Field(
        description="URL to request token. e.g. https://lms.mitx.mit.edu/oauth2/access_token",
    )
    base_url: str = Field(
        description="Base URL of OAuth API client being queries. e.g. https://lms.mitx.mit.edu/",
    )
    http_timeout: int = Field(
        default=60,
        description=(
            "Time (in seconds) to allow for requests to complete before timing out."
        ),
    )
    _access_token: Optional[str] = PrivateAttr(default=None)
    _access_token_expires: Optional[datetime] = PrivateAttr(default=None)
    _http_client: httpx.Client = PrivateAttr(default=None)

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._initialize_client()

    def _initialize_client(self) -> None:
        if self._http_client is not None:
            return
        timeout = httpx.Timeout(self.http_timeout, connect=10)
        self._http_client = httpx.Client(timeout=timeout)

    @validator("token_type")
    def validate_token_type(cls, token_type):  # noqa: N805
        if token_type.lower() not in ["jwt", "bearer"]:
            raise ValidationError
        return token_type

    def _fetch_access_token(self) -> Optional[str]:
        now = datetime.now(tz=UTC)
        if self._access_token is None or (self._access_token_expires or now) <= now:
            payload = {
                "grant_type": "client_credentials",
                "client_id": self.client_id,
                "client_secret": self.client_secret,
                "token_type": self.token_type,
            }
            response = self._http_client.post(self.token_url, data=payload)
            response.raise_for_status()
            self._access_token = response.json()["access_token"]
            self._access_token_expires = now + timedelta(
                seconds=response.json()["expires_in"]
            )
        return self._access_token

    @property
    def _username(self) -> str:
        response = self._http_client.get(
            f"{self.base_url}/api/user/v1/me",
            headers={"Authorization": f"JWT {self._fetch_access_token()}"},
        )
        response.raise_for_status()
        return response.json()["username"]

    def _fetch_with_auth(
        self,
        request_url: str,
        page_size: int = 100,
        extra_params: dict[str, Any] | None = None,
    ) -> dict[Any, Any]:
        if self.token_url == f"{self.base_url}/oauth2/access_token":
            request_params = {"username": self._username, "page_size": page_size}
        else:
            request_params = {}

        response = self._http_client.get(
            request_url,
            headers={"Authorization": f"JWT {self._fetch_access_token()}"},
            params=httpx.QueryParams(**request_params),
        )

        try:
            response.raise_for_status()
        except httpx.HTTPStatusError as error_response:
            if error_response.response.status_code == TOO_MANY_REQUESTS:
                retry_after = error_response.response.headers.get("Retry-After", 60)
                delay = int(retry_after) if retry_after.isdigit() else 60
                time.sleep(delay)
                return self._fetch_with_auth(
                    request_url, page_size=page_size, extra_params=extra_params
                )
            raise
        return response.json()

    def get_sloan_courses(self):
        """
        Retrieve the course data from their API as JSON

        returns: JSON document representing an array of course objects
        """
        course_url = "https://mit-unified-portal-prod-78eeds.43d8q2.usa-e2.cloudhub.io/api/courses"
        return self._fetch_with_auth(course_url)

    def get_sloan_course_offerings(self):
        """
        Retrieve the course offerings data from their API as JSON

        returns: JSON document representing an array of course offering objects
        """
        course_offering_url = "https://mit-unified-portal-prod-78eeds.43d8q2.usa-e2.cloudhub.io/api/course-offerings"
        return self._fetch_with_auth(course_offering_url)


class OAuthApiClientFactory(ConfigurableResource):
    deployment: str = Field(description="The name of the deployment")
    _client: OAuthApiClient = PrivateAttr()
    vault: ResourceDependency[Vault]

    def _initialize_client(self) -> OAuthApiClient:
        client_secrets = self.vault.client.secrets.kv.v1.read_secret(
            mount_point="secret-data",
            path=f"pipelines/{self.deployment}/oauth-client",
        )["data"]

        self._client = OAuthApiClient(
            client_id=client_secrets["id"],
            client_secret=client_secrets["secret"],
            base_url=client_secrets["url"],
            token_url=client_secrets.get(
                "token_url", f"{client_secrets['url']}/oauth2/access_token"
            ),
        )
        return self._client

    @property
    def client(self) -> OAuthApiClient:
        return self._client

    @contextmanager
    def yield_for_execution(self, context: InitResourceContext) -> Generator[Self]:  # noqa: ARG002
        self._initialize_client()
        yield self
```
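
One detail in the 429 branch of `_fetch_with_auth` may deserve a second look. `headers.get("Retry-After", 60)` returns the integer `60` when the header is absent, and an `int` has no `isdigit()` method, so the next line would raise `AttributeError` in that case. Below is a minimal sketch of a more defensive parse. The helper name `parse_retry_after` is hypothetical (it is not part of this PR), and the sketch deliberately treats the HTTP-date form of the header as "unknown" and falls back to the default.

```python
DEFAULT_RETRY_DELAY = 60  # seconds; mirrors the fallback used in the diff


def parse_retry_after(value: object, default: int = DEFAULT_RETRY_DELAY) -> int:
    """Return a delay in seconds from a Retry-After header value.

    Hypothetical helper: falls back to `default` when the header is missing
    or is not a plain non-negative integer (for example the HTTP-date form).
    """
    # Normalise to a string so both str and int inputs are handled.
    text = str(value).strip() if value is not None else ""
    # isascii() guards against Unicode digits that int() cannot parse.
    if text.isascii() and text.isdigit():
        return int(text)
    return default
```

With a helper like this, the retry branch could call `parse_retry_after(response.headers.get("Retry-After"))` without depending on the type of the default value.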
These methods seem unnecessary in this context. If we wanted to make a Sloan API-specific resource we could put them there, but I think it's easy enough to just use `fetch_with_auth` and pass the URL in question. Another reason not to have these methods here is that they will then be inherited by the edX resource (which won't break anything, but is a leaky abstraction).
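
The inheritance concern can be illustrated with a small stand-alone sketch. The class names `BaseApiClient` and `EdxApiClient` and the `sloan.example` URL are placeholders chosen for illustration, not the classes or endpoints in this repository, and no network calls are made.

```python
class BaseApiClient:
    """Stand-in for a shared OAuth client (hypothetical, no network calls)."""

    def fetch_with_auth(self, request_url: str) -> dict:
        # A real client would attach a token and issue an HTTP GET here.
        return {"url": request_url}

    def get_sloan_courses(self) -> dict:
        # Source-specific helper placed on the shared base class.
        return self.fetch_with_auth("https://sloan.example/api/courses")


class EdxApiClient(BaseApiClient):
    """Stand-in for an edX resource that inherits from the base client."""


edx = EdxApiClient()

# The edX client now exposes a Sloan method it has no use for.
leaked = hasattr(edx, "get_sloan_courses")

# Passing the URL to the generic method needs no source-specific method at all.
direct = edx.fetch_with_auth("https://sloan.example/api/courses")
```

Nothing fails here, which matches the comment above: the cost is only that every subclass carries methods unrelated to its own API.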
Yes, I realized that they don't belong here. I moved them to the asset file so we don't need to create a new resource file for these two simple methods.
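
A rough sketch of what that move could look like, under stated assumptions: the function names, the `AuthenticatedFetcher` protocol, the public `fetch_with_auth` spelling, and the placeholder base URL are all hypothetical, and the actual asset file in the PR may be organized differently. `FakeClient` exists only so the sketch runs without network access.

```python
from typing import Any, Protocol

# Placeholder base URL for illustration; the real endpoints appear in the diff.
SLOAN_API_BASE = "https://sloan.example/api"


class AuthenticatedFetcher(Protocol):
    """Anything that can fetch a URL with credentials attached."""

    def fetch_with_auth(self, request_url: str) -> Any: ...


def fetch_sloan_courses(client: AuthenticatedFetcher) -> Any:
    """Asset-level helper: the URL lives next to the asset, not on the client."""
    return client.fetch_with_auth(f"{SLOAN_API_BASE}/courses")


def fetch_sloan_course_offerings(client: AuthenticatedFetcher) -> Any:
    """Asset-level helper for the course offerings endpoint."""
    return client.fetch_with_auth(f"{SLOAN_API_BASE}/course-offerings")


class FakeClient:
    """Test double that records requested URLs instead of calling an API."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def fetch_with_auth(self, request_url: str) -> Any:
        self.calls.append(request_url)
        return [{"source": request_url}]
```

Keeping the endpoint-specific functions beside the assets leaves the shared client generic, so subclasses of the client do not pick up Sloan-specific methods.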