Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VARCHAR[] cannot be extracted as Rust Vec<String> #1793

Open
aykut-bozkurt opened this issue Aug 6, 2024 · 12 comments
Open

VARCHAR[] cannot be extracted as Rust Vec<String> #1793

aykut-bozkurt opened this issue Aug 6, 2024 · 12 comments

Comments

@aykut-bozkurt
Copy link
Contributor

When I check the is_compatible_with function for String, we also check VARCHAROID. This way, we can map TEXT + VARCHAR datums to Rust String.

But it is not the case with VARCHAR[]. I get non-binary coercible types error when I try to extract Vec<String> from VARCHAR[] datum. I believe we lack a similar check at is_compatible_with for Vec<T>.

I wonder might my very naive approach below work out. It basically relies on if element types are compatible.

image

Looks like a related discussion. #1361

@eeeebbbbrrrr
Copy link
Contributor

Put up a PR with a unit test or two and lets see what CI says. In general, this looks like a good change. AFAIK, the structure of an ArrayType doesn't change based on its element type.

@workingjubilee
Copy link
Member

Reusing the IntoDatum::is_compatible_with logic will allow incorrect integer sizes to work. The integer sizes do in fact alter the structure of arrays, as integers are densely packed in the array, instead of being spaced to every eightbyte.

@workingjubilee
Copy link
Member

So no, we have to figure out something different.

@eeeebbbbrrrr
Copy link
Contributor

I was probably a little imprecise. What I mean is that an ArrayType will have the same shape as ArrayType so long as E and E2 are themselves binary compatible.

And yeah, I now see here in that diff that it's using our T::is_compatible_with() and not Postgres' "is_binary_compatible" function, which we'd need instead.

@workingjubilee
Copy link
Member

I'm not aware of such a function, but I believe that you are referring to the answer of "binary coercible", and that is the question that is answered by IntoDatum::is_compatible_with. "If you put this type in as an argument, can you always also get this other type out?" The problem is that it's not the question we need to answer for Arrays.

@workingjubilee
Copy link
Member

It's discussed a bit in https://www.postgresql.org/docs/current/sql-createcast.html

@eeeebbbbrrrr
Copy link
Contributor

eeeebbbbrrrr commented Aug 6, 2024

I'm not aware of such a function, but I believe that you are referring to the answer of "binary coercible", and that is the question that is answered by IntoDatum::is_compatible_with. "If you put this type in as an argument, can you always also get this other type out?" The problem is that it's not the question we need to answer for Arrays.

Yeah, digging through the postgres sources, IsBinaryCompatible() was replaced by IsBinaryCoercible sometime before pg12. hmm...

I guess we'd need to do something like lookup both types in the pg_type catalog and see if they have the same (at least) typstorage?

EDIT: Actually, I'm not sure what we'd check. typstorage isn't what I initially thought it was.

I'm not seeing anything in the Postgres sources that check for "array binary compatibility" -- at least not in the obvious places.

@eeeebbbbrrrr
Copy link
Contributor

eeeebbbbrrrr commented Aug 6, 2024

hmm. Maybe it's a combination of (typlen, typbyval, typalign).

EDIT: Maybe that's not enough either.

@workingjubilee
Copy link
Member

We're gonna have to handroll it if we want it. It will be easy enough to write something that only works in a reasonable number of cases and doesn't work in all cases, and that will probably be fine.

@aykut-bozkurt
Copy link
Contributor Author

My goal actually is to be able to represent different Postgres string in Rust and convert them back correctly, without losing the actual type, to Postgres types. Representing different Postgres strings via a single Rust struct String loses the actual type in that scenario (String cannot know what was the actual Postgres type). Hence, I internally created different structures for other string types.(mapping each Rust structure to a single Postgres type) e.g.

#[derive(Debug)]
pub(crate) struct Varchar(pub(crate) String);

impl IntoDatum for Varchar {
    fn into_datum(self) -> Option<pg_sys::Datum> {
        self.0.into_datum()
    }

    fn type_oid() -> pg_sys::Oid {
        VARCHAROID
    }
}

impl FromDatum for Varchar {
    unsafe fn from_polymorphic_datum(
        datum: pg_sys::Datum,
        is_null: bool,
        typoid: pg_sys::Oid,
    ) -> Option<Self>
    where
        Self: Sized,
    {
        let val = String::from_polymorphic_datum(datum, is_null, typoid);
        val.and_then(|val| Some(Self(val)))
    }
}

This way I can differentiate between different String types.

@workingjubilee
Copy link
Member

Can you explain why you wish to preserve these types?

@aykut-bozkurt
Copy link
Contributor Author

Data migration purposes. I serialize tuples in Postgres tables into a different data format and deserialize the format back into PG tuples.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants