Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deserializing enums fails at run-time #211

Closed
hniksic opened this issue Oct 13, 2020 · 9 comments
Closed

Deserializing enums fails at run-time #211

hniksic opened this issue Oct 13, 2020 · 9 comments

Comments

@hniksic
Copy link

hniksic commented Oct 13, 2020

What version of the csv crate are you using?

1.1.3

Briefly describe the question, bug or feature request.

I can't get csv/serde to deserialize an enum.

I am processing a CSV file whose records store several different variants of data. It has a discriminator field and columns specific to each variant (which are empty when the discriminator shows the other variant), as well as some common columns, for example:

EventType,C,A1,B1
A,1,2,
B,3,,4

EventType is the discriminator that discriminates between two variants, A and B. C is the field common to both variants, and A1 and B1 are fields belonging to variants A and B respectively.

I would like to deserialize those rows Rust enums, something like:

enum Event {
    A(AEvent),
    B(BEvent),
}

struct AEvent {
    C: u32,
    A1: u32,
}

struct BEvent {
    C: u32,
    B1: u32,
}

...but I can't get that to successfully run with serde/csv using the code below (or variants thereof that I tested).

Include a complete program demonstrating a problem.

use serde::Deserialize;

#[derive(Debug, Deserialize)]
#[serde(tag = "EventType")]
enum Event {
    A(AEvent),
    B(BEvent),
}

#[allow(non_snake_case)]
#[derive(Debug, Deserialize)]
struct AEvent {
    C: u32,
    A1: u32,
}

#[allow(non_snake_case)]
#[derive(Debug, Deserialize)]
struct BEvent {
    C: u32,
    B1: u32,
}

fn main() {
    let src = std::io::Cursor::new(
        r#"EventType,C,A1,B1
A,1,2,
B,3,,4
"#,
    );
    let mut reciter = csv::ReaderBuilder::new()
        .from_reader(src)
        .into_deserialize::<Event>();
    dbg!(reciter.next());
    dbg!(reciter.next());
}

What is the observed behavior of the code above?

The program outputs errors for both fields, the error message being "invalid type: string \"A\", expected internally tagged enum". Full output:

[src/main.rs:34] reciter.next() = Some(
    Err(
        Error(
            Deserialize {
                pos: Some(
                    Position {
                        byte: 18,
                        line: 2,
                        record: 1,
                    },
                ),
                err: DeserializeError {
                    field: None,
                    kind: Message(
                        "invalid type: string \"A\", expected internally tagged enum",
                    ),
                },
            },
        ),
    ),
)
[src/main.rs:35] reciter.next() = Some(
    Err(
        Error(
            Deserialize {
                pos: Some(
                    Position {
                        byte: 25,
                        line: 3,
                        record: 2,
                    },
                ),
                err: DeserializeError {
                    field: None,
                    kind: Message(
                        "invalid type: string \"B\", expected internally tagged enum",
                    ),
                },
            },
        ),
    ),
)

What is the expected or desired behavior of the code above?

It should output the deserialized records, something like:

reciter.next() = A(
    AEvent {
        C: 1,
        A1: 2,
    },
)
reciter.next() = B(
    BEvent {
        C: 3,
        B1: 4,
    },
)
@BurntSushi
Copy link
Owner

BurntSushi commented Oct 13, 2020

I don't see a way to do this. The only enum support offered is at the level of an individual field. There is no record level enum support like you want here and I don't really see a way to add it. You'll have to do this translation manually.

(In general, Serde and csv don't really fit together well. The existing support is a hodge podge of things with a bunch of arbitrary-seeming limitations. The main problem is that csv has no types and is not a recursive format.)

@zeenix
Copy link

zeenix commented May 2, 2021

I don't see a way to do this.

That's very understandable but if you could add this as a warning in the docs, that would be great. I spent a hour trying to find what I'm doing wrong and given that internally-tagged enums are quite common place in the serde world, I'm sure I'm not the only one who will hit this.

@BurntSushi
Copy link
Owner

@zeenix Could you perhaps propose a wording/location (best through a PR) where this warning would have save you time?

@zeenix
Copy link

zeenix commented May 3, 2021

@zeenix Could you perhaps propose a wording/location (best through a PR) where this warning would have save you time?

Sure, I'll add it to my TODO for the week. :)

@zeenix
Copy link

zeenix commented May 3, 2021

@BurntSushi here you go: #231

@vkill

This comment was marked as duplicate.

ambroisie added a commit to ambroisie/processor that referenced this issue Aug 23, 2022
It is unfortunate that both [1] and [2] conspire to make this code way
worse than it could otherwise be with a saner (de)serialization format.

We both need to introduce `TransactionRecord` due to tagged enums not
being powerful enough in CSV, and make its `amount` field optional to
deal with the varying number of fields for each kind of transaction.

[1]: BurntSushi/rust-csv#211
[2]: BurntSushi/rust-csv#172
@zeenix
Copy link

zeenix commented Mar 3, 2023

@zeenix Could you perhaps propose a wording/location (best through a PR) where this warning would have save you time?

It's been 2 years since I provided the PR but you didn't review. 😔

@The0x539
Copy link

The0x539 commented Sep 2, 2023

You'll have to do this translation manually.

I have no idea if there's a good way to do something along these lines in csv::deserializer, but this Deserialize impl should at least be a usable workaround for downstream users.
This approach has some limitations compared to what csv can do when deserializing a record into a plain struct, but it beats the brick wall presented by the status quo.

@wiktor-k
Copy link

I have no idea if there's a good way to do something along these lines in csv::deserializer, but this Deserialize impl should at least be a usable workaround for downstream users.

Thanks for the idea it looks quite simple! I've been previously looking at deserializing it as a map which would allow the discriminant field to be on any position.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants