Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(server): GetAssetSchemaFields which converts assets to schema fields #1461

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

jasonkarel
Copy link
Contributor

@jasonkarel jasonkarel commented Mar 24, 2025

Overview

Implement GetAssetSchemaFields GQL to convert asset to schema fields based on Asset ID
GQL query

query GetAssetSchemaFields($assetId: ID!) { assetSchemaFields(assetId: $assetId) { total_count fields { field_name field_type } } }

response example
{ "data": { "assetSchemaFields": { "total_count": 2, "fields": [ { "field_name": "productName", "field_type": "string" }, { "field_name": "productId", "field_type": "integer" } ] } } }

tech doc: https://www.notion.so/eukarya/Schema-or-model-exporting-importing-with-Items-b2ac52dfc8bc4a6eadeaf20edec8427f?pvs=4

Screenshot 2025-03-24 at 11 58 54

What I've done

  1. development
  2. local testing

What I haven't done

How I tested

  1. success case for geojson
  2. success case for JSON Schema
  3. invalid operator
  4. invalid asset ID

Screenshot

Which point I want you to review particularly

Memo

Summary by CodeRabbit

  • New Features

    • Introduced a new GraphQL query that retrieves detailed asset schema information, including field names, types, and counts.
    • Enabled processing of asset metadata from both JSON and GeoJSON formats for improved data extraction.
    • Enhanced validation and error handling for asset schema retrieval, ensuring more reliable responses.
    • Added new error messages for content validation to improve user feedback.
  • Bug Fixes

    • Improved error handling for invalid content types during schema conversion.

@jasonkarel jasonkarel requested a review from yk-eukarya as a code owner March 24, 2025 10:53
Copy link
Contributor

coderabbitai bot commented Mar 24, 2025

Walkthrough

This pull request adds functionality to retrieve and process asset schema fields via GraphQL. It introduces new types and methods for managing asset schema data in the GraphQL model and resolver layers. The changes update the execution context with marshaling functions and new error handling, and they extend the use case to process JSON and GeoJSON asset data. New tests ensure the conversion logic works as expected.

Changes

Files Change Summary
server/internal/adapter/gql/generated.go, server/internal/adapter/gql/resolver_asset.go Added new types (AssetSchemaField, AssetSchemaFieldResult), a new query (AssetSchemaFields), and corresponding resolver methods with integrated complexity and context management.
server/internal/adapter/gql/gqlmodel/models_gen.go, server/internal/adapter/gql/gqlmodel/convert_asset.go Introduced GraphQL model types and a conversion function (AssetsToSchemaFieldData) to map asset schema data into the GraphQL layer.
server/internal/usecase/interactor/asset.go, server/internal/usecase/interactor/asset_test.go Implemented conversion logic (ConvertToSchemaFields, extractSchemaFieldData, isJSONSchema, detectFieldType) for processing JSON/GeoJSON asset data and added tests for various success and failure scenarios.
server/internal/usecase/interfaces/asset.go Added new error variables and types (AssetSchemaFieldsData, AssetSchemaField), and extended the Asset interface with the schema field conversion method.
server/schemas/asset.graphql Updated the GraphQL schema by adding new types (AssetSchemaField, AssetSchemaFieldResult) and a query (assetSchemaFields) to retrieve asset schema fields by asset ID.
server/i18n/en.yml, server/i18n/ja.yml Added new error message entries for content validation in English and Japanese localization files.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Resolver
    participant Usecase
    participant Converter
    Client->>Resolver: Query assetSchemaFields(assetID)
    Resolver->>Usecase: ConvertToSchemaFields(ctx, assetID, operator)
    Usecase->>Converter: extractSchemaFieldData(contentType, content)
    Converter-->>Usecase: schema fields data
    Usecase-->>Resolver: AssetSchemaFieldsData
    Resolver->>Client: GraphQL Response (AssetSchemaFieldResult)
Loading

Possibly related PRs

  • feat(server): asset gzip upload #1363: The changes in the main PR, which introduce new types and methods for handling asset schema fields in GraphQL, are related to the modifications in the retrieved PR that also involve asset handling, specifically the addition of content encoding for asset uploads, as both PRs enhance the asset management capabilities in the system.

Suggested reviewers

  • yk-eukarya

Poem

I'm a bunny in the code carrot patch,
Hopping over changes without a scratch.
New schema fields and queries in sight,
Making GraphQL data feel just right.
With a twitch of my nose and a joyful leap,
I celebrate these updates—happy code to keep! 🐇

✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai plan to trigger planning for file edits and PR creation.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

netlify bot commented Mar 24, 2025

Deploy Preview for reearth-cms canceled.

Name Link
🔨 Latest commit e1498a0
🔍 Latest deploy log https://app.netlify.com/sites/reearth-cms/deploys/67e2403d2d8bc20008a23bc0

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
server/internal/usecase/interactor/asset_test.go (1)

1771-1885: Consider adding more edge cases to extractSchemaFieldData tests.

While the current test cases cover the main scenarios, there are potential edge cases that aren't tested:

  1. GeoJSON with empty properties
  2. JSON Schema with properties that don't have a type field
  3. Mixed type arrays in GeoJSON properties

Here's how you could add a test case for GeoJSON with empty properties:

+		{
+			name: "valid GeoJSON with empty properties",
+			args: args{
+				contentType: "application/geo+json",
+				content: mustMarshal(map[string]any{
+					"type": "Feature",
+					"geometry": map[string]any{
+						"type":        "Point",
+						"coordinates": []interface{}{102.0, 0.5},
+					},
+					"properties": map[string]any{},
+				}),
+			},
+			expectedFields: []interfaces.AssetSchemaField{},
+			expectedCount:  0,
+			expectErr:     nil,
+		},
server/internal/usecase/interactor/asset.go (2)

672-708: Add null handling when returning error responses.

Currently, you're returning a non-nil empty AssetSchemaFieldsData struct along with errors, which could be misleading.

Consider returning nil instead of an empty struct when an error occurs:

func (i *Asset) ConvertToSchemaFields(ctx context.Context, assetID id.AssetID, operator *usecase.Operator) (*interfaces.AssetSchemaFieldsData, error) {
	if operator.AcOperator.User == nil && operator.Integration == nil {
-		return &interfaces.AssetSchemaFieldsData{}, interfaces.ErrInvalidOperator
+		return nil, interfaces.ErrInvalidOperator
	}

	asset, err := i.repos.Asset.FindByID(ctx, assetID)
	if err != nil {
-		return &interfaces.AssetSchemaFieldsData{}, err
+		return nil, err
	}

	assetFileData, err := i.repos.AssetFile.FindByID(ctx, assetID)
	if err != nil {
-		return &interfaces.AssetSchemaFieldsData{}, err
+		return nil, err
	}

	// check if file is a json or geojson
	if assetFileData.ContentType() != "application/json" && assetFileData.ContentType() != "application/geo+json" {
-		return &interfaces.AssetSchemaFieldsData{}, interfaces.ErrInvalidContentTypeForSchemaConversion
+		return nil, interfaces.ErrInvalidContentTypeForSchemaConversion
	}

	// read file
	file, _, err := i.gateways.File.ReadAsset(ctx, asset.UUID(), assetFileData.Path(), nil)
	if err != nil {
-		return &interfaces.AssetSchemaFieldsData{}, err
+		return nil, err
	}
	// read file content
	content, err := io.ReadAll(file)
	if err != nil {
-		return &interfaces.AssetSchemaFieldsData{}, err
+		return nil, err
	}

	schemaFieldData, err := extractSchemaFieldData(assetFileData.ContentType(), content)
	if err != nil {
-		return &interfaces.AssetSchemaFieldsData{}, err
+		return nil, err
	}
	return schemaFieldData, nil
}

763-782: Consider enhancing field type detection for better type precision.

The current implementation correctly identifies basic types, but could be enhanced for more precise type detection.

You might want to consider additional type information or more granular type detection:

func detectFieldType(value any) string {
	switch v := value.(type) {
	case string:
+		// Optionally detect if string is a date/time format
+		// e.g., if timeVal, err := time.Parse(time.RFC3339, v); err == nil {
+		//    return "datetime"
+		// }
		return "string"
	case float64:
		// If the number is whole, assume it's an integer
		if v == float64(int(v)) {
			return "integer"
		}
		return "float"
	case bool:
		return "boolean"
	case map[string]interface{}:
		return "object"
	case []interface{}:
+		// Optionally detect homogeneous arrays
+		// e.g., if all elements are strings, return "string[]"
		return "array"
	default:
		return "unknown"
	}
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a49ae72 and 2bb13f8.

📒 Files selected for processing (8)
  • server/internal/adapter/gql/generated.go (12 hunks)
  • server/internal/adapter/gql/gqlmodel/convert_asset.go (2 hunks)
  • server/internal/adapter/gql/gqlmodel/models_gen.go (1 hunks)
  • server/internal/adapter/gql/resolver_asset.go (1 hunks)
  • server/internal/usecase/interactor/asset.go (2 hunks)
  • server/internal/usecase/interactor/asset_test.go (2 hunks)
  • server/internal/usecase/interfaces/asset.go (3 hunks)
  • server/schemas/asset.graphql (1 hunks)
🧰 Additional context used
🧬 Code Definitions (6)
server/internal/adapter/gql/resolver_asset.go (3)
server/internal/adapter/gql/gqlmodel/models_gen.go (3)
  • AssetSchemaFieldResult (130-133)
  • Asset (76-94)
  • Asset (96-96)
server/internal/usecase/interfaces/asset.go (1)
  • Asset (75-93)
server/internal/adapter/gql/gqlmodel/convert_asset.go (1)
  • AssetsToSchemaFieldData (146-165)
server/internal/adapter/gql/gqlmodel/convert_asset.go (2)
server/internal/usecase/interfaces/asset.go (2)
  • AssetSchemaFieldsData (65-68)
  • AssetSchemaField (70-73)
server/internal/adapter/gql/gqlmodel/models_gen.go (3)
  • AssetSchemaFieldResult (130-133)
  • AssetSchemaField (125-128)
  • FieldType (1767-1767)
server/internal/adapter/gql/gqlmodel/models_gen.go (2)
web/src/components/molecules/Schema/types.ts (1)
  • FieldType (43-60)
server/internal/usecase/interfaces/asset.go (1)
  • AssetSchemaField (70-73)
server/internal/usecase/interactor/asset_test.go (5)
server/internal/infrastructure/mongo/asset_test.go (2)
  • args (278-281)
  • args (530-532)
server/internal/infrastructure/memory/asset_test.go (1)
  • args (263-266)
server/internal/usecase/interactor/asset.go (3)
  • Asset (29-33)
  • detectFieldType (763-782)
  • isJSONSchema (756-761)
server/internal/usecase/interfaces/asset.go (3)
  • Asset (75-93)
  • AssetSchemaFieldsData (65-68)
  • AssetSchemaField (70-73)
server/internal/usecase/interactor/request.go (1)
  • f (54-54)
server/internal/adapter/gql/generated.go (1)
server/internal/adapter/gql/gqlmodel/models_gen.go (19)
  • AssetSchemaField (125-128)
  • FieldType (1767-1767)
  • AssetSchemaFieldResult (130-133)
  • e (1613-1619)
  • e (1621-1623)
  • e (1625-1636)
  • e (1638-1640)
  • e (1656-1662)
  • e (1664-1666)
  • e (1668-1679)
  • e (1681-1683)
  • e (1697-1703)
  • e (1705-1707)
  • e (1709-1720)
  • e (1722-1724)
  • e (1738-1744)
  • e (1746-1748)
  • e (1750-1761)
  • e (1763-1765)
server/internal/usecase/interactor/asset.go (1)
server/internal/usecase/interfaces/asset.go (5)
  • Asset (75-93)
  • AssetSchemaFieldsData (65-68)
  • ErrInvalidContentTypeForSchemaConversion (47-47)
  • AssetSchemaField (70-73)
  • ErrInvalidJSONSchema (46-46)
🔇 Additional comments (25)
server/internal/adapter/gql/gqlmodel/models_gen.go (1)

125-133: LGTM! Good addition of the new types to support asset schema fields.

The new AssetSchemaField and AssetSchemaFieldResult types are well-structured and properly aligned with both the GraphQL schema and the domain model. The JSON field tags follow the snake_case convention expected in GraphQL responses.

server/internal/adapter/gql/gqlmodel/convert_asset.go (1)

146-165: Function implementation looks good and handles edge cases properly.

The AssetsToSchemaFieldData function:

  • Handles nil input correctly
  • Pre-allocates the result slice with proper capacity (good for performance)
  • Correctly maps from domain model to GraphQL model

This follows best practices for conversion functions in Go.

server/internal/adapter/gql/resolver_asset.go (1)

199-210: Resolver implementation is correct and follows existing patterns.

The AssetSchemaFields resolver:

  • Properly converts the asset ID from string to the domain type
  • Returns appropriate errors when ID conversion fails
  • Uses the operator pattern for authorization
  • Converts the domain model to GraphQL model using the new converter function

The implementation is consistent with other resolvers in the file.

server/schemas/asset.graphql (2)

157-165: New GraphQL types look good and follow schema conventions.

The AssetSchemaField and AssetSchemaFieldResult types are well-defined with appropriate field names using snake_case as per GraphQL conventions.


170-170: Query definition looks good and aligns with the PR objectives.

The new assetSchemaFields query takes an assetId parameter and returns an AssetSchemaFieldResult as expected. The non-nullable return type is appropriate since error states are handled via GraphQL errors, not null returns.

server/internal/usecase/interactor/asset_test.go (4)

1486-1641: Excellent test coverage for the new ConvertToSchemaFields functionality.

The test is well-structured with clear test cases covering both successful scenarios (GeoJSON and JSON Schema) and error cases (invalid operator and asset not found). The test setup using mock assets and files is thorough.


1642-1705: Comprehensive unit test for the detectFieldType function.

Your test cases cover all the possible type conversions, including edge cases like float values that should be treated as integers. Good job on the test organization and parallel execution.


1706-1770: Well-structured tests for isJSONSchema function.

The test cases appropriately cover valid JSON schema and various invalid scenarios. The use of descriptive test names makes it clear what each test is verifying.


1886-1923: Good implementation of test helper functions.

The mustMarshal and mockAssetFiles helper functions are well-implemented and make the tests more readable.

server/internal/usecase/interfaces/asset.go (3)

43-47: Well-defined error variables with clear descriptions.

The new error variables use consistent naming and appropriate i18n translations.


65-73: Appropriate data structure for schema fields.

The AssetSchemaFieldsData and AssetSchemaField types are well-defined for representing schema information extracted from assets.


92-92: Interface extension follows the existing pattern.

The new ConvertToSchemaFields method follows the established pattern for the Asset interface, maintaining consistency.

server/internal/usecase/interactor/asset.go (1)

665-670: Good representation of the GeoJSON structure.

The GeoJSONFeature type correctly represents the basic structure needed for GeoJSON processing.

server/internal/adapter/gql/generated.go (12)

124-132: Complexity calculation definitions look correct.

These GraphQL complexity calculation definitions for AssetSchemaField and AssetSchemaFieldResult match the model structures shown in the relevant code snippets. The field names follow the snake_case convention used in the schema, which is consistent with GraphQL best practices.


577-577: Query complexity calculation properly added.

The complexity calculation function for the new AssetSchemaFields query is correctly added to the Query struct, with proper parameter signature matching the expected GraphQL input (assetID).


1044-1044: Resolver interface correctly defines the new query.

The signature for AssetSchemaFields matches the GraphQL schema definition, taking a context and assetID parameter and returning the appropriate result type with error handling.


1344-1371: Complexity calculation cases properly implemented.

All field complexity calculations for AssetSchemaField and AssetSchemaFieldResult are correctly added to the execution context with proper null checks. This ensures accurate query complexity calculations for GraphQL execution.


3479-3490: Query complexity calculation with proper argument handling.

The implementation correctly extracts and validates the assetId argument before calculating complexity. Error handling is appropriately implemented for argument extraction failures.


5373-5386: GraphQL schema definitions match requirements.

The schema definitions for AssetSchemaField, AssetSchemaFieldResult, and the assetSchemaFields query match the requirements specified in the PR objectives. The field names and types are correctly defined with non-nullable constraints.


8908-8934: Argument parsing for the query is correctly implemented.

The implementation properly extracts and validates the assetId argument for the new query with appropriate error handling and path context for GraphQL error reporting.


11481-11661: Field resolvers are properly implemented.

The resolvers for all fields in the new types (AssetSchemaField and AssetSchemaFieldResult) are correctly implemented with proper error handling, null checks, and context management. The child field contexts are also correctly defined for nested type resolution.


25103-25162: Query resolver implementation is correct.

The implementation of the assetSchemaFields query resolver correctly calls the resolver middleware and handles errors appropriately. The field context is properly set up with child field resolution for the result type.


42715-42801: Type resolvers are correctly implemented.

The type resolvers for AssetSchemaField and AssetSchemaFieldResult are properly implemented with field selection handling, concurrent execution support via deferred processing, and appropriate error handling. The implementation follows the pattern of other type resolvers in the codebase.


46825-46847: Concurrent execution for the query is properly set up.

The query is integrated into the concurrent execution context with proper error recovery and field invalidation handling, ensuring that GraphQL execution can process the query efficiently.


50629-50695: Marshaling functions are correctly implemented.

The marshaling functions for the new types handle null cases properly and support concurrent processing of arrays. The implementation follows the pattern used for other types in the codebase.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
server/i18n/en.yml (2)

40-41: Clarify New Error Message Keys.
The new entries for invalid content type and invalid content type for schema conversion have been added. Please verify that these keys exactly match the error constants defined in the asset conversion use case and that any downstream logic referring to these messages uses the identical keys. Also, consider if providing a non-empty default message would be beneficial for debugging purposes.


49-49: Review the 'invalid json schema' Message.
The invalid json schema message is newly added. Ensure that it aligns with the validation logic in use and consider whether a more descriptive default message might aid in resolving JSON schema validation errors.

server/i18n/ja.yml (2)

40-41: Verify Japanese Translations for New Error Messages.
The messages for invalid content type and invalid content type for schema conversion have been translated as "無効なコンテンツタイプです。" and "スキーマ変換のための無効なコンテンツタイプです。", respectively. Please confirm these translations are consistent with the application's terminology and that they convey enough context for end users encountering content validation issues.


49-49: Confirm Translation for 'invalid json schema'.
The translation "無効なJSONスキーマです。" appears correct. Verify that it is coherent with similar error messages in the file and with overall localization standards used in the project.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2bb13f8 and 9bf8426.

📒 Files selected for processing (2)
  • server/i18n/en.yml (1 hunks)
  • server/i18n/ja.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: ci-server / test

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
server/internal/usecase/interactor/asset.go (4)

692-708: Consider using defer to close the file resource.

The file resource opened by ReadAsset isn't explicitly closed, which might lead to resource leaks.

Add a defer statement to ensure the file is closed:

file, _, err := i.gateways.File.ReadAsset(ctx, asset.UUID(), assetFileData.Path(), nil)
if err != nil {
    return &interfaces.AssetSchemaFieldsData{}, err
}
+defer file.Close()

710-744: Check for invalid JSON Schema before accessing properties.

The code properly handles type assertions for property values, but it doesn't verify the existence of the "properties" field before checking if it's a valid JSON Schema.

Consider reordering the validation logic:

// ✅ Detect JSON Schema
if isJSONSchema(raw) {
    // Extract from "properties"
-   if props, ok := raw["properties"].(map[string]any); ok {
+   props, ok := raw["properties"].(map[string]any)
+   if !ok {
+       return result, interfaces.ErrInvalidJSONSchema
+   }
    for key, val := range props {
        if def, ok := val.(map[string]interface{}); ok {
            typeValue, ok := def["type"].(string)
            if !ok {
                continue
            }
            result.Fields = append(result.Fields, interfaces.AssetSchemaField{
                FieldName: key,
                FieldType: typeValue,
            })
        }
    }
    result.TotalCount = len(result.Fields)
    return result, nil
-   } else {
-       return result, interfaces.ErrInvalidJSONSchema
-   }
} else {
    return result, interfaces.ErrInvalidJSONSchema
}

774-793: Improve handling of numeric types.

The current implementation differentiates between integers and floats based on whether the value has decimal digits. Consider a more robust approach for numeric type detection.

func detectFieldType(value any) string {
    switch v := value.(type) {
    case string:
        return "string"
    case float64:
-       // If the number is whole, assume it's an integer
-       if v == float64(int(v)) {
+       // Check if the float can be represented as an integer without loss
+       intVal := int64(v)
+       if float64(intVal) == v {
            return "integer"
        }
        return "number"
    case bool:
        return "boolean"
    case map[string]interface{}:
        return "object"
    case []interface{}:
        return "array"
+   case nil:
+       return "null"
    default:
        return "unknown"
    }
}

767-772:

❓ Verification inconclusive

Add more robust JSON Schema validation.

The current JSON Schema validation is minimal. Consider adding more checks to ensure it's truly a valid JSON Schema.

func isJSONSchema(data map[string]interface{}) bool {
    _, hasSchema := data["$schema"]
    _, hasProperties := data["properties"]
    typeVal, isTypeObj := data["type"].(string)
-   return hasSchema && hasProperties && isTypeObj && typeVal == "object"
+   
+   // Check for standard JSON Schema URI patterns
+   isStandardSchema := false
+   if schemaVal, ok := data["$schema"].(string); ok {
+       isStandardSchema = strings.Contains(schemaVal, "json-schema.org") || 
+                          strings.Contains(schemaVal, "schema.json")
+   }
+   
+   return (hasSchema && isStandardSchema) && hasProperties && isTypeObj && typeVal == "object"
}

🌐 Web query:

What are standard JSON Schema URI patterns?

💡 Result:

JSON Schema defines specific patterns and conventions for URIs to ensure unambiguous references and validation. Here are the key standards and practices:

Base URI Determination

  • The base URI is resolved using the JSON Schema $id keyword at the root level, which sets the absolute URI for the schema[1][7].
    Example:
    {
      "$id": "https://example.com/schemas/address",
      "type": "object"
    }
    In this case, the base URI becomes https://example.com/schemas/address, overriding the retrieval URI[1].

URI Validation

  • The format: "uri" keyword enforces RFC 3986 compliance, requiring absolute URIs (not relative references)[6][9].
    Example validation rule:
    {
      "type": "string",
      "format": "uri"
    }
    This rejects relative URIs like ../path or #fragment unless resolved against a base URI[6].

Resolution Rules

  1. $ref resolves URIs against the base URI defined by $id[1][3].
    For instance, a $ref: "billing-address" under https://example.com/schemas/address resolves to https://example.com/schemas/billing-address.

  2. Relative references (e.g., subschema#/definitions/entry) are resolved using the base URI but are not valid standalone for format: "uri"[6][9].

Key Considerations

  • Avoid Ambiguity: Always use absolute URIs in $id to prevent resolution failures in anonymous or bundled schemas[1][7].
  • Terminology:
    • URI: Absolute identifier per RFC 3986.
    • URI Reference: Relative path requiring a base URI for resolution[6][9].
  • Rejected Formats: A proposed format: "url" (for WHATWG URL standards) was declined to maintain consistency with RFC 3986[9].

Example Usage in Schemas

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/schemas/geocoordinates",
  "type": "object",
  "properties": {
    "website": {
      "type": "string",
      "format": "uri"
    },
    "coordinates": {
      "$ref": "https://example.com/schemas/geocoordinates#/definitions/latlng"
    }
  }
}

By adhering to these patterns, JSON Schemas ensure interoperability and strict validation of resource identifiers.

Citations:


Improve JSON Schema URI Validation

The current update adds a check for standard URI patterns by looking for substrings like "json-schema.org/draft" or "schema.json". While this is a step up from the minimal check, the JSON Schema specification recommends stricter URI validation. For instance, valid schemas typically include a draft version (e.g., "https://json-schema.org/draft-07/schema#" or "https://json-schema.org/draft/2020-12/schema") and often rely on an absolute URI—possibly defined via the $id keyword—to avoid ambiguity.

Suggestions:

  • Consider using a regex or URL parser (e.g., Go’s net/url package) to validate that the $schema value is a well-formed and absolute URI.
  • Optionally, check for the existence of a $id field to further ensure proper schema identification.
  • Update unit tests to cover a range of valid and invalid schema URIs.

Affected File:

  • server/internal/usecase/interactor/asset.go (Lines: 767-772)

Diff Suggestion:

 func isJSONSchema(data map[string]interface{}) bool {
     _, hasSchema := data["$schema"]
     _, hasProperties := data["properties"]
     typeVal, isTypeObj := data["type"].(string)
-    return hasSchema && hasProperties && isTypeObj && typeVal == "object"
+    // Validate against standard JSON Schema URI patterns
+    isStandardSchema := false
+    if schemaVal, ok := data["$schema"].(string); ok {
+      // Basic check for known JSON Schema patterns.
+      // Consider using a regex or URL parsing to enforce proper URI format (e.g., ensure the URI is absolute and contains a valid draft version marker).
+      isStandardSchema = strings.Contains(schemaVal, "json-schema.org/draft") ||
+                         strings.Contains(schemaVal, "schema.json")
+    }
+
+    return hasSchema && isStandardSchema && hasProperties && isTypeObj && typeVal == "object"
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9bf8426 and e1498a0.

📒 Files selected for processing (1)
  • server/internal/usecase/interactor/asset.go (2 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
server/internal/usecase/interactor/asset.go (4)
server/internal/usecase/interfaces/asset.go (5)
  • Asset (75-93)
  • AssetSchemaFieldsData (65-68)
  • ErrInvalidContentTypeForSchemaConversion (47-47)
  • AssetSchemaField (70-73)
  • ErrInvalidJSONSchema (46-46)
server/internal/usecase/repo/asset.go (1)
  • Asset (17-26)
server/internal/usecase/gateway/file.go (1)
  • File (63-73)
server/pkg/file/file.go (1)
  • File (19-25)
🔇 Additional comments (2)
server/internal/usecase/interactor/asset.go (2)

665-670: LGTM: Clean GeoJSONFeature struct definition.

The struct definition for GeoJSONFeature is well-structured and includes all the essential fields for a GeoJSON feature with proper JSON tags.


672-691: Good implementation with proper validation.

The method correctly validates the operator, retrieves the asset and file data, and ensures the content type is appropriate before proceeding. The error handling is also comprehensive.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant