Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[graphql] Replace Int with BigInt for overflow #26757

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

deepyaman
Copy link
Contributor

@deepyaman deepyaman commented Dec 30, 2024

Summary & Motivation

Stems from an internal issue to identify and triage any of the places we use an int as an ID/cursor in Graphene (there was a situation in internal testing where the ID exceeded int limits, even if no users have hit this).

The issue mentions, "In general, any use case of Graphene.Integer is suspect", so I've gone ahead and commented on all current uses of graphene.Int (except in limits, which discussed with @OwenKephart should never go that high). The comments are all part of a separate commit fca93a3 and will be reverted before merging; they're just here to explain the decision for not changing things.

How I Tested These Changes

BK

@@ -114,7 +114,7 @@ class GraphenePendingConcurrencyStep(graphene.ObjectType):
stepKey = graphene.NonNull(graphene.String)
enqueuedTimestamp = graphene.NonNull(graphene.Float)
assignedTimestamp = graphene.Float()
priority = graphene.Int()
priority = graphene.Int() # Can't imagine anybody needs to set BigInt priorities
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe should be a BigInt; sometimes people spam 9 key

@@ -225,7 +225,7 @@ class GrapheneAssetNode(graphene.ObjectType):
non_null_list(GrapheneMaterializationEvent),
partitions=graphene.List(graphene.NonNull(graphene.String)),
beforeTimestampMillis=graphene.String(),
limit=graphene.Int(),
limit=graphene.BigInt(),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All limits should be ints, since they're what are returned to the frontend

@@ -536,25 +538,25 @@ class GrapheneInstigationState(graphene.ObjectType):
typeSpecificData = graphene.Field(GrapheneInstigationTypeSpecificData)
runs = graphene.Field(
non_null_list("dagster_graphql.schema.pipelines.pipeline.GrapheneRun"),
limit=graphene.Int(),
limit=graphene.BigInt(),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
limit=graphene.BigInt(),
limit=graphene.Int(),

)
runsCount = graphene.NonNull(graphene.Int)
runsCount = graphene.NonNull(graphene.BigInt)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't be bigint number of runs on a single persons org

@@ -26,7 +26,7 @@


class GrapheneEvaluationStackListItemEntry(graphene.ObjectType):
list_index = graphene.NonNull(graphene.Int)
list_index = graphene.NonNull(graphene.BigInt)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a list index doesn't need billions? Hard to imagine


class Meta:
name = "PipelineRuns"


class GrapheneRuns(graphene.ObjectType):
results = non_null_list("dagster_graphql.schema.pipelines.pipeline.GrapheneRun")
count = graphene.Int()
count = graphene.BigInt()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't be billions on a single account

@@ -103,15 +103,15 @@ class Meta:

class GraphenePipelineRuns(graphene.Interface):
results = non_null_list("dagster_graphql.schema.pipelines.pipeline.GrapheneRun")
count = graphene.Int()
count = graphene.BigInt()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't be billions on a single account

@@ -41,7 +41,7 @@ class GrapheneRunsFeedCount(graphene.ObjectType):
class Meta:
name = "RunsFeedCount"

count = graphene.NonNull(graphene.Int)
count = graphene.NonNull(graphene.BigInt)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't be billions on a single account

Comment on lines 286 to 287
startIdx=graphene.BigInt(),
endIdx=graphene.BigInt(),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indices probably can't get into billions

@deepyaman deepyaman force-pushed the deepyaman/chore/use-bigint branch 2 times, most recently from 4d11713 to 2f130cb Compare December 31, 2024 17:13
@deepyaman deepyaman force-pushed the deepyaman/chore/use-bigint branch from 2f130cb to 09ab231 Compare December 31, 2024 17:16
This commit is only for review; will revert before
merging anything.
@deepyaman deepyaman force-pushed the deepyaman/chore/use-bigint branch from 09ab231 to fca93a3 Compare December 31, 2024 17:18
Copy link

github-actions bot commented Jan 1, 2025

Deploy preview for dagit-core-storybook ready!

✅ Preview
https://dagit-core-storybook-rhpvfph6t-elementl.vercel.app
https://deepyaman-chore-use-bigint.core-storybook.dagster-docs.io

Built with commit ce989dc.
This pull request is being automatically deployed with vercel-action

runId: String!
timestamp: Float!
}

scalar BigInt
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like should be using something like https://the-guild.dev/graphql/scalars/docs/scalars/big-int (the automatically-generated typing seems too broad/wrong?), but most of the files say to not modify them directly, so I've just left the autogenerated types. Maybe somebody knowledgeable about the UI module and GraphQL can weigh in?

@deepyaman deepyaman marked this pull request as ready for review January 1, 2025 15:21
@deepyaman
Copy link
Contributor Author

@gibsondan I added you as a suggested reviewer, and somebody who had recently touched these files; if there's somebody better to add who is familiar with the UI side, please feel free to assign them instead!

Copy link
Member

@gibsondan gibsondan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think BigInt essentially does not work in Javascript, you run into integer rounding issues causing it to change values and start rounding to the nearest 1000

So IMO the mission here is to change these to graphene.ID where needed (which is a string), not graphene.BigInt

priority = graphene.Int()
priority = (
graphene.BigInt()
) # Maybe somebody will spam hold down the 9 key on their keyboard ¯\_(ツ)_/¯
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we actually enforce this in Python, see https://github.com/dagster-io/dagster/pull/25172/files

@deepyaman
Copy link
Contributor Author

I think BigInt essentially does not work in Javascript, you run into integer rounding issues causing it to change values and start rounding to the nearest 1000

So IMO the mission here is to change these to graphene.ID where needed (which is a string), not graphene.BigInt

As in e9e61f1?

@gibsondan Makes sense. Then, I'll not change priority, but for storageId I'll use graphene.ID, and for the GrapheneCursor... I'll need to look into it, but maybe that may also be graphene.ID?

@gibsondan
Copy link
Member

gibsondan commented Jan 2, 2025 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants