feat: agent.action.transform can now transform images to text descriptions #1096

Merged
merged 8 commits into from
Jun 6, 2025
27 changes: 27 additions & 0 deletions README.md
@@ -122,6 +122,7 @@ export async function updateDocumentTitle(_id, title) {
- [Example: Using the async flag](#example-using-the-async-flag)
- [Transforming Documents](#transforming-documents)
- [Transforming images](#transforming-images)
- [Image descriptions](#image-descriptions)
- [Example: Field-based transformation](#example-field-based-transformation)
- [Translating Documents](#translating-documents)
- [Example: Storing language in a field](#example-storing-language-in-a-field)
@@ -2111,6 +2112,32 @@ Image transform can have per-path instructions, just like any other target paths

- `target: [{path: ['image', 'asset'], instruction: 'Make the sky blue' }]`

##### Image descriptions

Images can be transformed to a textual description by targeting a `string`, `text` or Portable Text field (`array` with `block`)
with `operation: {type: 'image-description'}`.

Custom instructions for image description targets will be used to generate the description.

###### Targeting image fields

If a target is a descendant field of an image object, no `sourcePath` is required in the operation; the image is inferred from the target path.

For example:
- `target: {path: ['image', 'description'], operation: {type: 'image-description'} }`
- `target: {path: ['array', {_key: 'abc'}, 'alt'], operation: {type: 'image-description'} }` (assuming the array item at the keyed path is an image)
- `target: {path: ['image'], include: ['portableTextField'], operation: {type: 'image-description'}, instruction: 'Use formatting and headings to describe the image in great detail' }`
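
The targets above can be sketched in TypeScript as follows. This is a minimal illustration, not the client's implementation: the types are simplified local stand-ins for the shapes added in this PR, and `describeWithinImage` is a hypothetical helper introduced only for this example.

```ts
// Local stand-ins for the types added in this PR (assumed, simplified shapes).
type PathSegment = string | {_key: string}

interface ImageDescriptionOperation {
  type: 'image-description'
  sourcePath?: PathSegment[]
}

interface TransformTarget {
  path: PathSegment[]
  include?: string[]
  instruction?: string
  operation?: 'set' | ImageDescriptionOperation
}

// Hypothetical helper: builds a target for a field nested inside an image object.
// No `sourcePath` is set, so the image is expected to be inferred from the target path.
function describeWithinImage(path: PathSegment[], instruction?: string): TransformTarget {
  const target: TransformTarget = {path, operation: {type: 'image-description'}}
  if (instruction !== undefined) target.instruction = instruction
  return target
}

const imageFieldTargets: TransformTarget[] = [
  describeWithinImage(['image', 'description']),
  describeWithinImage(['array', {_key: 'abc'}, 'alt'], 'Describe the image in one short sentence'),
]
```

An array like `imageFieldTargets` would then be passed as the `target` of a transform request.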

###### Targeting non-image fields

If the field that receives the description lives outside an image object, use the `sourcePath` option to specify the path to the image field.
`sourcePath` must point to an image or image asset field.

For example:
- `target: {path: ['description'], operation: {type: 'image-description', sourcePath: ['image', 'asset'] } }`
- `target: {path: ['wrapper', 'title'], operation: {type: 'image-description', sourcePath: ['array', {_key: 'abc'}, 'image'] } }`
- `target: {path: ['wrapper'], include: ['portableTextField'], operation: {type: 'image-description', sourcePath: ['image', 'asset'] }, instruction: 'Use formatting and headings to describe the image in great detail' }`
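
As a rough sketch of how `sourcePath` targets can sit alongside plain targets, the snippet below builds a mixed target list. The types are simplified local stand-ins, and `describeFromSource` and `operationType` are hypothetical helpers for illustration only. Treating an omitted operation as `set` mirrors the default documented for `TransformOperation` in this PR; where that default is actually applied is not shown in the diff.

```ts
// Local stand-ins mirroring the shapes in this PR (simplified, not imported from the client).
type Segment = string | {_key: string}
type Operation = 'set' | {type: 'image-description'; sourcePath?: Segment[]}

interface Target {
  path: Segment[]
  include?: string[]
  instruction?: string
  operation?: Operation
}

// Hypothetical helper: the description is written to `path`,
// while the image is read from `sourcePath`.
function describeFromSource(path: Segment[], sourcePath: Segment[]): Target {
  return {path, operation: {type: 'image-description', sourcePath}}
}

// Hypothetical helper: an omitted operation is treated as the documented default, `set`.
function operationType(target: Target): 'set' | 'image-description' {
  const op = target.operation ?? 'set'
  return op === 'set' ? 'set' : op.type
}

const targets: Target[] = [
  {path: ['title']},
  describeFromSource(['description'], ['image', 'asset']),
  describeFromSource(['wrapper', 'title'], ['array', {_key: 'abc'}, 'image']),
]

const kinds = targets.map(operationType)
// kinds is ['set', 'image-description', 'image-description']
```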

##### Example: Field-based transformation

```ts
81 changes: 78 additions & 3 deletions src/agent/actions/transform.ts
@@ -2,7 +2,13 @@ import {type Observable} from 'rxjs'

import {_request} from '../../data/dataMethods'
import type {ObservableSanityClient, SanityClient} from '../../SanityClient'
import type {AgentActionParams, Any, HttpRequest, IdentifiedSanityDocumentStub} from '../../types'
import type {
AgentActionParams,
AgentActionPath,
Any,
HttpRequest,
IdentifiedSanityDocumentStub,
} from '../../types'
import {hasDataset} from '../../validators'
import type {
AgentActionAsync,
@@ -169,7 +175,61 @@ export type TransformTargetDocument =
| {operation: 'createIfNotExists'; _id: string}
| {operation: 'createOrReplace'; _id: string}

/** @beta */
/**
*
* @see #TransformOperation
* @beta
*/
export interface ImageDescriptionOperation {
type: 'image-description'
/**
* When omitted, the parent image value will be inferred from the target path.
*
* When specified, the `sourcePath` should be a path to an image (or image asset) field:
* - `['image']`
* - `['wrapper', 'mainImage']`
* - `['heroImage', 'asset'] // the asset segment is optional, but supported`
*/
sourcePath?: AgentActionPath
}

/**
*
* ## `set` by default
* By default, Transform will change the value of every target field in place using a set operation.
*
* ## Image description
*
* ### Targeting image fields
* Images can be transformed to a textual description by targeting a `string`, `text` or Portable Text field (`array` with `block`)
* with `operation: {type: 'image-description'}`.
*
* Custom instructions for image description targets will be used to generate the description.
*
* Such targets must be a descendant field of an image object.
*
* For example:
* - `target: {path: ['image', 'description'], operation: {type: 'image-description'} }`
* - `target: {path: ['array', {_key: 'abc'}, 'alt'], operation: {type: 'image-description'} } //assuming the item in the array on the key-ed path is an image`
* - `target: {path: ['image'], include: ['portableTextField'], operation: {type: 'image-description'}, instruction: 'Use formatting and headings to describe the image in great detail' }`
*
* ### Targeting non-image fields
* If the target image description lives outside an image object, use the `sourcePath` option to specify the path to the image field.
* `sourcePath` must be an image or image asset field.
*
* For example:
* - `target: {path: ['description'], operation: {type: 'image-description', sourcePath: ['image', 'asset'] } }`
* - `target: {path: ['wrapper', 'title'], operation: {type: 'image-description', sourcePath: ['array', {_key: 'abc'}, 'image'] } }`
* - `target: {path: ['wrapper'], include: ['portableTextField'], operation: {type: 'image-description', sourcePath: ['image', 'asset'] }, instruction: 'Use formatting and headings to describe the image in great detail' }`
*
* @beta
*/
export type TransformOperation = 'set' | ImageDescriptionOperation

/**
* @see #TransformOperation
* @beta
* */
export interface TransformTargetInclude extends AgentActionTargetInclude {
/**
* Specifies a tailored instruction of this target.
@@ -185,9 +245,18 @@ export interface TransformTargetInclude extends AgentActionTargetInclude {
* Fields or array items not on the include list, are implicitly excluded.
*/
include?: (AgentActionPathSegment | TransformTargetInclude)[]

/**
* Default: `set`
* @see #TransformOperation
*/
operation?: TransformOperation
}

/** @beta */
/**
* @see #TransformOperation
* @beta
* */
export interface TransformTarget extends AgentActionTarget {
/**
* Specifies a tailored instruction of this target.
@@ -204,6 +273,12 @@ export interface TransformTarget extends AgentActionTarget {
* Fields or array items not on the include list, are implicitly excluded.
*/
include?: (AgentActionPathSegment | TransformTargetInclude)[]

/**
* Default: `set`
* @see #TransformOperation
*/
operation?: TransformOperation
}

/** @beta */
2 changes: 2 additions & 0 deletions src/types.ts
@@ -1635,7 +1635,9 @@ export type {
export type {PatchDocument, PatchOperation, PatchTarget} from './agent/actions/patch'
export type {PromptRequest} from './agent/actions/prompt'
export type {
ImageDescriptionOperation,
TransformDocument,
TransformOperation,
TransformTarget,
TransformTargetDocument,
TransformTargetInclude,
4 changes: 3 additions & 1 deletion test/client.test.ts
@@ -4352,13 +4352,15 @@ describe('client', async () => {
temperature: 0.6,
async: false,
target: [
{path: ['title']},
{path: ['title'], operation: 'set'},
{path: ['description'], operation: {type: 'image-description', sourcePath: ['image']}},
{
instruction: 'based on $c – replace this field',
include: [
'object',
{
path: 'array',
operation: 'set',
include: [{_key: '123'}],
instruction: 'based on $b – replace this field',
types: {