Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[components] [RFC] Overhaul DgConfig, replace is_{workspace,project,component_lib} with type field #28002

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

smackesey
Copy link
Collaborator

@smackesey smackesey commented Feb 23, 2025

Summary & Motivation

Overhaul the dg config system. This is essential for a stable foundation going forward.

Status quo

The config system reads config from three sources and merges them all together.

  • the context root config file
  • a containing workspace config file
  • CLI options

Even though the kinds of options you can set in these different locations differ (for example, you can't set is_workspace from the CLI), they are currently all treated as having the same type. This:

  • makes our validation much looser than it should be
  • makes the code harder to follow

The implicit complexity will become a problem as we scale.

This PR

  • The basic approach of reading config from the above three sources is unchanged, but there is now much stronger typing and consequently tighter validation. Error messages for missing or wrong values in the config are now clear and immediate, no invalid config penetrates the internals.
  • The config file now takes a type argument specifying the "directory type" instead of setting booleans (is_workspace, is_project, is_component_library). There are only three types:
    • "workspace"
    • "project"
    • "library"
  • The config file now has subsections corresponding to the type.
    • tool.dg.project: this is required when tool.dg.type = "project". At a minimum you must specify the root package name of your project (comes automatically in scaffolding).
      • package_name
      • components_package_name (defaults to <package_name>.components)
      • components_lib_package_name (defaults to <package_name>.lib)
    • tool.dg.library: this is required when tool.dg.type = "library". At a minimum you must specify the root package name of your library (comes automatically in scaffolding).
      • package_name
      • components_lib_package_name
    • tool.dg.workspace does not exist, because there are no settings yet specific to the workspace.
    • tool.dg.global takes global settings, the same stuff you pass on the command line, verbose etc.
  • The config object now has sub-objects that correspond to different kinds of settings:
    • global: always defined
    • project only defined in a project context
    • library only defined in a library context

Example workspace config:

[tool.dg]
type = "workspace"

[tool.dg.global]
disable_cache = True

Example project config:

[tool.dg]
type = "project"

[tool.dg.project]  # required
package_name = "foo"  # required
components_package_name = "foo.alternate_components"  # optional
components_lib_package_name = "foo.alternate_lib"  # optional

[tool.dg.global]  # optional
verbose = true

Example library config:

[tool.dg]
type = "library"

[tool.dg.library]  # required
package_name = "foo"  # required
components_lib_package_name = "foo.alternate_lib"  # optional

[tool.dg.global]
verbose = true

Planning on an additional upstack PR that breaks up DgContext with constituent Workspace, Project, and Library objects, which is easy with the new sectioned config.

How I Tested These Changes

Added unit tests.

Copy link
Collaborator Author

smackesey commented Feb 23, 2025

@smackesey smackesey changed the title [components] Break up DgConfig [components] [RFC] Overhaul DgConfig, replace is_{workspace,project,component_lib} with type field Feb 23, 2025
@smackesey smackesey marked this pull request as ready for review February 23, 2025 05:04
Comment on lines +494 to +498
def _get_literal_values(type_: Any) -> tuple[object, ...]:
val = type_
while get_origin(val) is not Literal:
val = get_origin(type_)
return get_args(val)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There appears to be a bug in the _get_literal_values() function where the while loop uses get_origin(type_) instead of get_origin(val). This creates an infinite loop since type_ never changes. The line should be:

val = get_origin(val)

This will properly traverse the type hierarchy until reaching the Literal type.

Spotted by Graphite Reviewer

Is this helpful? React 👍 or 👎 to let us know.



def _raise_missing_required_key_error(key: str, type_str: str) -> Never:
raise DgValidationError(f"`Missing required value for `{key}`. Expected {type_str}.")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a misplaced backtick in the error message string. The current message:

f"`Missing required value for `{key}`. Expected {type_str}."

Should be:

f"Missing required value for `{key}`. Expected {type_str}."

Spotted by Graphite Reviewer

Is this helpful? React 👍 or 👎 to let us know.

Comment on lines +433 to +434
if not isinstance(section, dict):
raise DgValidationError("`tool.dg.project` must be a table.")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message in this function incorrectly references tool.dg.project when validating a library section. Since this is the _validate_file_config_library_section function, the error message should reference tool.dg.library instead.

Spotted by Graphite Reviewer

Is this helpful? React 👍 or 👎 to let us know.

def default(cls) -> Self:
return cls(DgGlobalConfig.default())

# Note that this funciton takes an argument called `container_workspace_file_config` instead of
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small typo in comment: funciton should be function

Spotted by Graphite Reviewer

Is this helpful? React 👍 or 👎 to let us know.

Copy link
Member

@schrockn schrockn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this eliminate the case where a project is also a component library?

Base automatically changed from sean/components/change-config-extension-mechanism to master February 23, 2025 18:00
@smackesey
Copy link
Collaborator Author

Does this eliminate the case where a project is also a component library?

Since writing this my thinking has evolved on "library". See my thoughts here:

https://www.notion.so/dagster/Outstanding-dg-API-ergonomics-questions-1a318b92e462807a855ff198b6d285ae?pvs=4#1a318b92e462801db343e98ffcef98ea

TLDR I'd like to change this PR to eliminate the "library" type, there should just be "workspace" and "project".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants