Skip to content

TypeError: Couldn't cast array of type #8

@rayquazaMega

Description

@rayquazaMega

Thank you for this excellent work! I'm attempting to load dataset from CharlieDreemur/OpenManus-RL using the following code:

import datasets
datasets.load_dataset('CharlieDreemur/OpenManus-RL', name=None)

and I got TypeError below:

TypeError: Couldn't cast array of type
struct<role: string, content: string, type: string>
to
{'role': Value(dtype='string', id=None), 'content': Value(dtype='string', id=None), 'loss': Value(dtype='bool', id=None)}

It appears that the keys in the JSON file are not consistent across rows. To address this, I tried modifying the feature structure, but I ran into the same error:

from datasets import Features, Value
features = Features({
         'role': Value('string'),
         'content': Value('string'),
         'type': Value('bool')
     })
datasets.load_dataset('CharlieDreemur/OpenManus-RL', features=features)
datasets.table.CastError: Couldn't cast
id: string
conversations: list<item: struct<role: string, content: string, loss: bool>>
  child 0, item: struct<role: string, content: string, loss: bool>
      child 0, role: string
      child 1, content: string
      child 2, loss: bool
content: string
role: string
type: bool
to
{'role': Value(dtype='string', id=None), 'content': Value(dtype='string', id=None), 'type': Value(dtype='bool', id=None)}
because column names don't match

How can i fix this?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions