Add support for programmatic title generation #9183

Open
wants to merge 15 commits into
base: main
Changes from 9 commits
135 changes: 135 additions & 0 deletions docs/concepts/json_schema.md
@@ -330,6 +330,7 @@ Some field parameters are used exclusively to customize the generated JSON Schem
* `description`: The description of the field.
* `examples`: The examples of the field.
* `json_schema_extra`: Extra JSON Schema properties to be added to the field.
* `field_title_generator`: A function that programmatically sets the field's title, based on its name.

Here's an example:

@@ -481,6 +482,50 @@ print(json.dumps(Foo.model_json_schema(), indent=2))
"""
```

### Programmatic field title generation

The `field_title_generator` parameter can be used to programmatically generate the title for a field based on its name.

See the following example:

```py
import json

from pydantic import BaseModel, Field


def make_title(field_name: str) -> str:
Member

my main question is whether the full field should be passed to the function, not just the name.

Contributor Author

The full field as in the `FieldInfo` object? Then we would need to pass both the `FieldInfo` and the name of the field, since `FieldInfo` doesn't include the name.

Then the signature of make_title would be

def make_title(field_name: str, field_info: FieldInfo) -> str: ...

    return field_name.upper()


class Person(BaseModel):
    name: str = Field(field_title_generator=make_title)
    age: int = Field(field_title_generator=make_title)


print(json.dumps(Person.model_json_schema(), indent=2))
"""
{
  "properties": {
    "name": {
      "title": "NAME",
      "type": "string"
    },
    "age": {
      "title": "AGE",
      "type": "integer"
    }
  },
  "required": [
    "name",
    "age"
  ],
  "title": "Person",
  "type": "object"
}
"""
```
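
Because the generator in this revision is an ordinary callable from `str` to `str`, it can be written and checked on its own before being attached to a field. The following is a minimal sketch, assuming snake_case field names; `snake_to_title` is a hypothetical helper name used only for illustration and is not part of pydantic.

```python
def snake_to_title(field_name: str) -> str:
    """Turn a snake_case field name into a spaced, capitalized title.

    Hypothetical helper for illustration; not part of pydantic.
    """
    # Drop empty fragments so leading, trailing, or doubled underscores are ignored.
    words = [word for word in field_name.split('_') if word]
    return ' '.join(word.capitalize() for word in words)


print(snake_to_title('first_name'))
#> First Name
print(snake_to_title('_internal__id'))
#> Internal Id
```

A function like this could then be passed in the same way as `make_title` above, for example `Field(field_title_generator=snake_to_title)`.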

### Model-Level Customization

You can also use [model config][pydantic.config.ConfigDict] to customize JSON schema generation on a model.
@@ -490,6 +535,8 @@ Specifically, the following config options are relevant:
* [`json_schema_extra`][pydantic.config.ConfigDict.json_schema_extra]
* [`schema_generator`][pydantic.config.ConfigDict.schema_generator]
* [`json_schema_mode_override`][pydantic.config.ConfigDict.json_schema_mode_override]
* [`field_title_generator`][pydantic.config.ConfigDict.field_title_generator]
* [`class_title_generator`][pydantic.config.ConfigDict.class_title_generator]

### Using `json_schema_extra`

@@ -1029,6 +1076,94 @@ print(json.dumps(TypeAdapter(Person).json_schema(), indent=2))
```


### Using `field_title_generator`

The `field_title_generator` config option can be used to dynamically generate the title for each field based on its name.
This is similar to the field-level `field_title_generator` parameter, but the `ConfigDict` option is applied to all fields of the class.

See the following example:

```py
import json

from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_pascal


class Person(BaseModel):
    model_config = ConfigDict(field_title_generator=to_pascal)
    name: str
    age: int


print(json.dumps(Person.model_json_schema(), indent=2))
"""
{
  "properties": {
    "name": {
      "title": "Name",
      "type": "string"
    },
    "age": {
      "title": "Age",
      "type": "integer"
    }
  },
  "required": [
    "name",
    "age"
  ],
  "title": "Person",
  "type": "object"
}
"""
```
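
When both a field-level and a config-level generator are present, the diff to `_generate_schema.py` in this PR prefers the field-level one. The sketch below mirrors that resolution logic in plain Python so it can be run without pydantic. `FieldStub` and `resolve_title` are hypothetical names, and the use of `title_priority=2` to mark an explicitly set title is an assumption made for illustration, not something this page confirms.

```python
from dataclasses import dataclass
from typing import Callable, Optional

TitleGenerator = Callable[[str], str]


@dataclass
class FieldStub:
    """Hypothetical stand-in for a field's metadata; not pydantic's FieldInfo."""

    title: Optional[str] = None
    title_priority: Optional[int] = None
    field_title_generator: Optional[TitleGenerator] = None


def resolve_title(
    field: FieldStub, name: str, config_generator: Optional[TitleGenerator]
) -> Optional[str]:
    # A generator set on the field wins over the one from the model config.
    generator = field.field_title_generator or config_generator
    if generator is None:
        return field.title
    # Same condition as in the diff: generate unless a higher-priority title exists.
    if field.title_priority is None or field.title_priority <= 1 or field.title is None:
        title = generator(name)
        if not isinstance(title, str):
            raise TypeError(
                f'field_title_generator {generator} must return str, not {title.__class__}'
            )
        return title
    return field.title


# Config-level generator only.
print(resolve_title(FieldStub(), 'age', str.upper))
#> AGE
# Field-level generator takes precedence over the config-level one.
print(resolve_title(FieldStub(field_title_generator=str.title), 'age', str.upper))
#> Age
# Assumed: an explicit title with priority 2 is left untouched.
print(resolve_title(FieldStub(title='Years', title_priority=2), 'age', str.upper))
#> Years
```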

### Using `class_title_generator`

The `class_title_generator` config option is similar to the `field_title_generator` option, but it applies to the title of the class itself.

See the following example:

```py
import json

from pydantic import BaseModel, ConfigDict


def make_title(field_name: str) -> str:
    return f'Title-{field_name}'


class Person(BaseModel):
    model_config = ConfigDict(class_title_generator=make_title)
    name: str
    age: int


print(json.dumps(Person.model_json_schema(), indent=2))
"""
{
  "properties": {
    "name": {
      "title": "Name",
      "type": "string"
    },
    "age": {
      "title": "Age",
      "type": "integer"
    }
  },
  "required": [
    "name",
    "age"
  ],
  "title": "Title-Person",
  "type": "object"
}
"""
```
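
The order in which the class title is chosen can be read off `_get_class_title_from_config` and `modify_model_json_schema` in this PR: an explicit `title` in the config comes first, then `class_title_generator`, then the class name. The following is a rough sketch of that order in plain Python; `resolve_class_title` is a hypothetical function that folds both steps into one and is not part of pydantic.

```python
from typing import Callable, Optional


def resolve_class_title(
    class_name: str,
    title: Optional[str] = None,
    class_title_generator: Optional[Callable[[str], str]] = None,
) -> str:
    # An explicit `title` in the config takes precedence over the generator.
    if title:
        return title
    if class_title_generator:
        generated = class_title_generator(class_name)
        # The generator must return a string, as enforced in the diff.
        if not isinstance(generated, str):
            raise TypeError(
                f'class_title_generator {class_title_generator} must return str, '
                f'not {generated.__class__}'
            )
        return generated
    # With neither option set, fall back to the class name.
    return class_name


print(resolve_class_title('Person'))
#> Person
print(resolve_class_title('Person', class_title_generator=lambda name: f'Title-{name}'))
#> Title-Person
print(resolve_class_title('Person', title='Human', class_title_generator=str.upper))
#> Human
```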

## JSON schema types

Types, custom field types, and constraints (like `max_length`) are mapped to the corresponding spec formats in the
4 changes: 4 additions & 0 deletions pydantic/_internal/_config.py
@@ -57,6 +57,8 @@ class ConfigWrapper:
# to construct error `loc`s, default `True`
loc_by_alias: bool
alias_generator: Callable[[str], str] | AliasGenerator | None
class_title_generator: Callable[[str], str] | None
field_title_generator: Callable[[str], str] | None
ignored_types: tuple[type, ...]
allow_inf_nan: bool
json_schema_extra: JsonDict | JsonSchemaExtraCallable | None
@@ -243,6 +245,8 @@ def push(self, config_wrapper: ConfigWrapper | ConfigDict | None):
from_attributes=False,
loc_by_alias=True,
alias_generator=None,
class_title_generator=None,
field_title_generator=None,
ignored_types=(),
allow_inf_nan=True,
json_schema_extra=None,
78 changes: 70 additions & 8 deletions pydantic/_internal/_generate_schema.py
@@ -103,7 +103,6 @@
ModifyCoreSchemaWrapHandler = GetCoreSchemaHandler
GetCoreSchemaFunction = Callable[[Any, ModifyCoreSchemaWrapHandler], core_schema.CoreSchema]


TUPLE_TYPES: list[type] = [tuple, typing.Tuple]
LIST_TYPES: list[type] = [list, typing.List, collections.abc.MutableSequence]
SET_TYPES: list[type] = [set, typing.Set, collections.abc.MutableSet]
@@ -203,20 +202,27 @@ def apply_each_item_validators(


def modify_model_json_schema(
- schema_or_field: CoreSchemaOrField, handler: GetJsonSchemaHandler, *, cls: Any
+ schema_or_field: CoreSchemaOrField,
+ handler: GetJsonSchemaHandler,
+ *,
+ cls: Any,
+ title: str | None = None,
) -> JsonSchemaValue:
"""Add title and description for model-like classes' JSON schema.

Args:
schema_or_field: The schema data to generate a JSON schema from.
handler: The `GetCoreSchemaHandler` instance.
cls: The model-like class.
title: The title to set for the model's schema, defaults to the model's name

Returns:
JsonSchemaValue: The updated JSON schema.
"""
from ..dataclasses import is_pydantic_dataclass
from ..main import BaseModel
from ..root_model import RootModel
from ._dataclasses import is_builtin_dataclass

json_schema = handler(schema_or_field)
original_schema = handler.resolve_ref_schema(json_schema)
@@ -225,10 +231,12 @@ def modify_model_json_schema(
ref = original_schema['$ref']
original_schema.clear()
original_schema['allOf'] = [{'$ref': ref}]
- if 'title' not in original_schema:
+ if title is not None:
+ original_schema['title'] = title
+ elif 'title' not in original_schema:
original_schema['title'] = cls.__name__
- # BaseModel; don't use cls.__doc__ as it will contain the verbose class signature by default
- docstring = None if cls is BaseModel else cls.__doc__
+ # BaseModel + Dataclass; don't use cls.__doc__ as it will contain the verbose class signature by default
+ docstring = None if cls is BaseModel or is_builtin_dataclass(cls) or is_pydantic_dataclass(cls) else cls.__doc__
if docstring and 'description' not in original_schema:
original_schema['description'] = inspect.cleandoc(docstring)
elif issubclass(cls, RootModel) and cls.model_fields['root'].description:
@@ -531,7 +539,8 @@ def _model_schema(self, cls: type[BaseModel]) -> core_schema.CoreSchema:
)
config_wrapper = ConfigWrapper(cls.model_config, check=False)
core_config = config_wrapper.core_config(cls)
- metadata = build_metadata_dict(js_functions=[partial(modify_model_json_schema, cls=cls)])
+ title = self._get_class_title_from_config(cls, config_wrapper)
+ metadata = build_metadata_dict(js_functions=[partial(modify_model_json_schema, cls=cls, title=title)])

model_validators = decorators.model_validators.values()

@@ -608,6 +617,26 @@ def _model_schema(self, cls: type[BaseModel]) -> core_schema.CoreSchema:
self.defs.definitions[model_ref] = schema
return core_schema.definition_reference_schema(model_ref)

@staticmethod
def _get_class_title_from_config(
cls: type[BaseModel | StandardDataclass], config_wrapper: ConfigWrapper | None = None
) -> str | None:
"""Get the title of a class if `class_title_generator` or `title` are set in the config, else return None"""
if config_wrapper is None:
return None

if config_wrapper.title:
return config_wrapper.title

class_title_generator = config_wrapper.class_title_generator
if class_title_generator:
title = class_title_generator(cls.__name__)
if not isinstance(title, str):
raise TypeError(f'class_title_generator {class_title_generator} must return str, not {title.__class__}')
return title

return None

def _unpack_refs_defs(self, schema: CoreSchema) -> CoreSchema:
"""Unpack all 'definitions' schemas into `GenerateSchema.defs.definitions`
and return the inner schema.
@@ -1040,6 +1069,27 @@ def _apply_alias_generator_to_computed_field_info(
if computed_field_info.alias_priority == 1:
computed_field_info.alias = _get_first_non_null(serialization_alias, alias)

@staticmethod
def _apply_field_title_generator_to_field_info(
config_wrapper: ConfigWrapper, field_info: FieldInfo | ComputedFieldInfo, field_name: str
) -> None:
"""Apply a field_title_generator on a FieldInfo or ComputedFieldInfo instance if appropriate
Args:
config_wrapper: The config of the model
field_info: The FieldInfo or ComputedField instance to which the title_generator is (maybe) applied.
field_name: The name of the field from which to generate the title.
"""
field_title_generator = field_info.field_title_generator or config_wrapper.field_title_generator
if field_title_generator is None:
return

if field_info.title_priority is None or field_info.title_priority <= 1 or field_info.title is None:
title = field_title_generator(field_name)
if not isinstance(title, str):
raise TypeError(f'field_title_generator {field_title_generator} must return str, not {title.__class__}')

field_info.title = title

def _common_field_schema( # C901
self, name: str, field_info: FieldInfo, decorators: DecoratorInfos
) -> _CommonField:
@@ -1111,6 +1161,8 @@ def set_discriminator(schema: CoreSchema) -> CoreSchema:
schema = self._apply_field_serializers(
schema, filter_field_decorator_info_by_field(decorators.field_serializers.values(), name)
)
self._apply_field_title_generator_to_field_info(self._config_wrapper, field_info, name)

json_schema_updates = {
'title': field_info.title,
'description': field_info.description,
@@ -1280,14 +1332,16 @@ def _typed_dict_schema(self, typed_dict_cls: Any, origin: Any) -> core_schema.Co
and field_name in field_docstrings
):
field_info.description = field_docstrings[field_name]
self._apply_field_title_generator_to_field_info(self._config_wrapper, field_info, field_name)
fields[field_name] = self._generate_td_field_schema(
field_name, field_info, decorators, required=required
)

title = self._get_class_title_from_config(typed_dict_cls, ConfigWrapper(config))
metadata = build_metadata_dict(
- js_functions=[partial(modify_model_json_schema, cls=typed_dict_cls)], typed_dict_cls=typed_dict_cls
+ js_functions=[partial(modify_model_json_schema, cls=typed_dict_cls, title=title)],
+ typed_dict_cls=typed_dict_cls,
)

td_schema = core_schema.typed_dict_schema(
fields,
computed_fields=[
@@ -1525,6 +1579,7 @@ def _dataclass_schema(
dataclass_bases_stack.enter_context(self._types_namespace_stack.push(dataclass_base))

# Pushing a config overwrites the previous config, so iterate through the MRO backwards
config = None
for dataclass_base in reversed(dataclass.__mro__):
if dataclasses.is_dataclass(dataclass_base):
config = getattr(dataclass_base, '__pydantic_config__', None)
@@ -1584,6 +1639,11 @@ def _dataclass_schema(
model_validators = decorators.model_validators.values()
inner_schema = apply_model_validators(inner_schema, model_validators, 'inner')

title = self._get_class_title_from_config(dataclass, ConfigWrapper(config))
metadata = build_metadata_dict(
js_functions=[partial(modify_model_json_schema, cls=dataclass, title=title)]
)

dc_schema = core_schema.dataclass_schema(
dataclass,
inner_schema,
@@ -1592,6 +1652,7 @@ def _dataclass_schema(
fields=[field.name for field in dataclasses.fields(dataclass)],
slots=has_slots,
config=core_config,
metadata=metadata,
)
schema = self._apply_model_serializers(dc_schema, decorators.model_serializers.values())
schema = apply_model_validators(schema, model_validators, 'outer')
@@ -1713,6 +1774,7 @@ def _computed_field_schema(
self._apply_alias_generator_to_computed_field_info(
alias_generator=alias_generator, computed_field_info=d.info, computed_field_name=d.cls_var_name
)
self._apply_field_title_generator_to_field_info(self._config_wrapper, d.info, d.cls_var_name)

def set_computed_field_metadata(schema: CoreSchemaOrField, handler: GetJsonSchemaHandler) -> JsonSchemaValue:
json_schema = handler(schema)
7 changes: 7 additions & 0 deletions pydantic/config.py
@@ -33,11 +33,18 @@ class ConfigDict(TypedDict, total=False):
title: str | None
"""The title for the generated JSON schema, defaults to the model's name"""

class_title_generator: Callable[[str], str] | None
"""A callable that takes a class name and returns the title for it. Defaults to `None`."""
Member

we should be clear about whether this works for:

  • pydantic models?
  • dataclasses?
  • typeddicts?

Contributor Author

@NeevCohen NeevCohen May 31, 2024

It works for all model and model-like (pydantic dataclass, builtin dataclass, typeddict) types. ConfigDict can be applied to all of them so IMO it's a little redundant. Changing the name to model_title_generator makes this even more clear I feel.

I will add more documentation if you feel it's necessary.


field_title_generator: Callable[[str], str] | None
"""A callable that takes a field name and returns the title for it. Defaults to `None`."""

str_to_lower: bool
"""Whether to convert all characters to lowercase for str types. Defaults to `False`."""

str_to_upper: bool
"""Whether to convert all characters to uppercase for str types. Defaults to `False`."""

str_strip_whitespace: bool
"""Whether to strip leading and trailing whitespace for str types."""
