Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query binders: Add support for postgres arrays #336

Closed
nitnelave opened this issue May 23, 2022 · 26 comments · Fixed by #467 · May be fixed by #413
Closed

Query binders: Add support for postgres arrays #336

nitnelave opened this issue May 23, 2022 · 26 comments · Fixed by #467 · May be fixed by #413
Milestone

Comments

@nitnelave
Copy link
Contributor

Motivation

Postgres arrays are not supported in the query binders (see #275) or drivers.

Proposed Solutions

We should implement the conversion from Vec<sea-query::Value> to some Vec<sqlx::Encode> in sea-query-binders/src/sqlx_postgres.rs. For this, we'll need to ensure that all values are of the same type, and do a second level of dispatch based on the value types to convert them to the right type.

@ikrivosheev
Copy link
Member

@nitnelave do you want to create PR for this issue?

@nitnelave
Copy link
Contributor Author

I don't actually have a need for this, I just wanted to file this for posterity and/or completeness sake.

@billy1624 billy1624 linked a pull request Jul 15, 2022 that will close this issue
@billy1624
Copy link
Member

@Sytten
Copy link
Contributor

Sytten commented Jul 15, 2022

So what I am wondering is what is our status for keep drivers and binders? I think keeping both is an unnecessary burden.
We discussed in the PR using the binders for the array but sea-orm still uses the driver so it would need to be converted.

@ikrivosheev
Copy link
Member

@Sytten maybe we can drop drivers from SeaQuery in the future? Now mark it as deprecated and add support other drivers into binder.

@Sytten
Copy link
Contributor

Sytten commented Jul 15, 2022

I think I got an idea for the implementation:

use crate::{SqlxValue, SqlxValues};
use sea_query::Value;

impl<'q> sqlx::Encode<'q, sqlx::postgres::Postgres> for SqlxValue {
    fn encode_by_ref(&self, buf: &mut sqlx::postgres::PgArgumentBuffer) -> sqlx::encode::IsNull {
        match &self.0 {
            Value::Bool(b) => {
                <Option<bool> as sqlx::Encode<'q, sqlx::postgres::Postgres>>::encode_by_ref(&b, buf)
            }
            Value::TinyInt(i) => {
                <Option<i8> as sqlx::Encode<'q, sqlx::postgres::Postgres>>::encode_by_ref(&i, buf)
            }
            Value::SmallInt(i) => {
                todo!()
            }
            Value::Int(i) => {
                todo!()
            }
            Value::BigInt(i) => {
                todo!()
            }
            Value::TinyUnsigned(i) => {
                todo!()
            }
            Value::SmallUnsigned(i) => {
                todo!()
            }
            Value::Unsigned(i) => {
                todo!()
            }
            Value::BigUnsigned(i) => {
                todo!()
            }
            Value::Float(f) => {
                todo!()
            }
            Value::Double(d) => {
                todo!()
            }
            Value::String(s) => {
                todo!()
            }
            Value::Char(c) => {
                todo!()
            }
            Value::Bytes(b) => {
                todo!()
            }
            #[cfg(feature = "with-chrono")]
            Value::ChronoDate(d) => {
                todo!()
            }
            #[cfg(feature = "with-chrono")]
            Value::ChronoTime(t) => {
                todo!()
            }
            #[cfg(feature = "with-chrono")]
            Value::ChronoDateTime(t) => {
                todo!()
            }
            #[cfg(feature = "with-chrono")]
            Value::ChronoDateTimeUtc(t) => {
                todo!()
            }
            #[cfg(feature = "with-chrono")]
            Value::ChronoDateTimeLocal(t) => {
                todo!()
            }
            #[cfg(feature = "with-chrono")]
            Value::ChronoDateTimeWithTimeZone(t) => {
                todo!()
            }
            #[cfg(feature = "with-time")]
            Value::TimeDate(t) => {
                todo!()
            }
            #[cfg(feature = "with-time")]
            Value::TimeTime(t) => {
                todo!()
            }
            #[cfg(feature = "with-time")]
            Value::TimeDateTime(t) => {
                todo!()
            }
            #[cfg(feature = "with-time")]
            Value::TimeDateTimeWithTimeZone(t) => {
                todo!()
            }
            #[cfg(feature = "with-uuid")]
            Value::Uuid(uuid) => {
                todo!()
            }
            #[cfg(feature = "with-rust_decimal")]
            Value::Decimal(d) => {
                todo!()
            }
            #[cfg(feature = "with-bigdecimal")]
            Value::BigDecimal(d) => {
                todo!()
            }
            #[cfg(feature = "with-json")]
            Value::Json(j) => {
                todo!()
            }
            #[cfg(feature = "postgres-array")]
            Value::Array(a) => <Option<Vec<SqlxValue>> as sqlx::Encode<
                'q,
                sqlx::postgres::Postgres,
            >>::encode_by_ref(
                &a.as_ref()
                    .map(|values| values.iter().map(|v| SqlxValue::from(v.clone())).collect()),
                buf,
            ),
            #[cfg(feature = "with-ipnetwork")]
            Value::IpNetwork(ip) => {
                todo!()
            }
            #[cfg(feature = "with-mac_address")]
            Value::MacAddress(mac) => {
                todo!()
            }
        }
    }
}

impl<'q> sqlx::Type<sqlx::postgres::Postgres> for SqlxValue {
    fn type_info() -> sqlx::postgres::PgTypeInfo {
        todo!()
    }
}

impl<'q> sqlx::postgres::PgHasArrayType for SqlxValue {
    fn array_type_info() -> sqlx::postgres::PgTypeInfo {
        todo!()
    }
}

impl<'q> sqlx::IntoArguments<'q, sqlx::postgres::Postgres> for SqlxValues {
    fn into_arguments(self) -> sqlx::postgres::PgArguments {
        let mut args = sqlx::postgres::PgArguments::default();
        for arg in self.0.into_iter() {
            use sqlx::Arguments;
            args.add(arg);
        }
        args
    }
}

I created a new SqlxValue to wrap Value.
The only thing is that PgHasArrayType and Type dont take self, so I can't return something sensible.
For Type it's not too bad because you can override it in the produce of Encode, but PgHasArrayType I don't know if it will work.

I prefer this method to the proposed one in #225 since it allows support for user custom types.

I am starting to believe that the Value is an anti pattern. It would be much easier if we had a bunch of boxed trait where we can expand them as need be. This is the way sqlx and diesel are built (u32 implements Type for example).

@Sytten
Copy link
Contributor

Sytten commented Jul 15, 2022

We have another problem with the current approach.
When trying to implement fn produces(&self) -> Option<sqlx::postgres::PgTypeInfo> of the encode trait, you have the problem that you don't know the underlining type of array value.
So I don't think the current approach will work.

At this point we have a couple of choices:

  1. Implement the casts like in Bind array values in SQLx PostgreSQL driver #225, seems like a big hack to me and I don't think it will even work because it doesn't support empty arrays (which is a use case)
  2. Add a bunch of options to Value for each type. So we would have Value::BoolArray. This is what sqlx does under the hood it seems like, though I am now so sure how they handle custom type arrays (It has to do with PgHasArrayType.
  3. Split the runtime type from the actual value. That would allow us to always has a type to work with even if the value is None or Some(vec![]).

My preference goes to 3, maybe even combined with 2. That would look similar to PgArguments which contains the data in one field and a vec of PgTypeInfo which is basically an enum with stuff like Int8, Int8Array, etc.
2 alone would also work and would require minimal changes to the overall codebase.

@ikrivosheev @nitnelave @billy1624 @tyt2y3 Let me know what you think of all this. This would be a big change but it would unblock a lot of things (including custom types).

@tyt2y3
Copy link
Member

tyt2y3 commented Jul 17, 2022

I would also incline towards 3, although can you illustrate more on the plan perhaps with some code snippets?

@ikrivosheev
Copy link
Member

@Sytten thank you! Can you explain (3)? Small PoC. I don't understand what you want.

@billy1624
Copy link
Member

billy1624 commented Jul 18, 2022

I created a new SqlxValue to wrap Value. The only thing is that PgHasArrayType and Type dont take self, so I can't return something sensible. For Type it's not too bad because you can override it in the produce of Encode, but PgHasArrayType I don't know if it will work.

We could avoid implementing PgHasArrayType on our side. If we employ (2) you proposed.

  1. Add a bunch of options to Value for each type. So we would have Value::BoolArray. This is what sqlx does under the hood it seems like, though I am now so sure how they handle custom type arrays (It has to do with PgHasArrayType.

On sea-query-binder it will be...

match arg {
    Value::BoolArray(b) => {
        args.add(b);
    }
    ...
}

@billy1624
Copy link
Member

Btw... by "custom type arrays" do you mean something like custom enum arrays?
For example, launchbadge/sqlx#298

@Sytten
Copy link
Contributor

Sytten commented Jul 18, 2022

Yeah the option 2 is the easiest but does not allow for custom types. Like currently in sea-orm you have to cast the enum to string before sending it back. The issue you linked for enum array has actually been fixed I think based on the last comment. But enum is just one example you can create new types in postgres that are tuples and any array of that type will be valid (like https://stackoverflow.com/questions/41663637/insert-into-array-of-custom-type-in-postgres).

We can also decide to deal with it later and use option 2 for now. I think the better solution long term is to have a Value that contains a Box of a couple of traits like encode, type info, etc. and use that to translate to the backend engine.

Option 3 is similar to that in a sense that the array option of Value would contain both option of the data (like current) and an enum value of the possible types (u8, bool, etc). That would reduce the boilerplate needed since you would still have one option for all arrays.

enum Type {
 u8,
 bool
}

struct Value {
  data: Option<Box<dyn ValueTrait>>, // Or the current Value enum
  type: Type
}

@billy1624
Copy link
Member

billy1624 commented Jul 19, 2022

Hey @Sytten, base on your explanation for option (3). Personally, I think this is a good direction for SeaQuery.

Also, I think we should go one step further. Storing reference of ValueTrait in SeaQuery. Because we just need reference for value binding.

P.S. That would be a complete rewrite of sea_query::Value API

@ikrivosheev
Copy link
Member

ikrivosheev commented Jul 19, 2022

@Sytten in the (3) option, how can we add custom type? Something like this:

enum Type {
 u8,
 bool,
 custom(...)
}
...
?

@ikrivosheev
Copy link
Member

To test an idea I will try implement it for: #347 and create Draft PR.

@billy1624 @Sytten ok?

@Sytten
Copy link
Contributor

Sytten commented Jul 19, 2022

@billy1624 Yeah it is a big rewrite so we should really think about it before going forward with it.
Considering this would also be the third rewrite of the driver/binder if we do the storing of the engine/backend (we should stick to a name) directly I really want to make sure we know what we want before we start coding...
This is also annoying since it means a big delay for the array support :(

@ikrivosheev Thats what sqlx seems to do in their engine. A alternative would be to do something like:

struct Value(Box<dyn ValueTrait>);

impl<T> ValueTrait for Option<T> {
   // Here provide a method that can provide the type using T::some_method()
}

That might be more elegant a solution.

@billy1624
Copy link
Member

Hey @ikrivosheev, yea, please go ahead and conduct a small POC :D

@Sytten
Copy link
Contributor

Sytten commented Jul 19, 2022

I am not sure a reference will cut it? It's nice in principle but I think we need to own the pointer otherwise it will make it harder for users to pass literal values.
I was thinking though that the implementation of ValueTrait could be dependent on the engine so something like:

#[cfg(feature = "postgres")]
trait ValueTrait: postgres::ToSql {
}

#[cfg(feature = "sqlx")]
trait ValueTrait: sqlx::Encode + sqlx::TypeInfo { 
}

EDIT: That might not work with the generic sqlx traits as is since they need a database parameter and a lifetime so we might need 3 implementation like:

#[cfg(feature = "engine-sqlx")]
#[cfg(feature = "database-postgres")]
trait ValueTrait: sqlx::Encode<'static, sqlx::postgres::Postgres> + sqlx::TypeInfo<sqlx::postgres::Postgres> { 
}

Unsure if the static lifetime is a good idea though, but exposing a lifetime to the trait might become soon very annoying.

@tyt2y3 tyt2y3 added this to the 0.27 milestone Aug 7, 2022
@billy1624
Copy link
Member

billy1624 commented Sep 8, 2022

Regarding the redesign of sea_query::Value. In the following I want to discuss a possible way to extend existing sea_query::Value by allowing db specific value to be stored.

I will narrow the scope and focus on SQLx only.

The goal is to make sea_query::Value extensible - meaning that foreign type can implement some trait and stored inside sea_query::Value. It's just a container for storing the value needed for parameter binding that is performed by the db driver, i.e. SQLx.

By design, SQLx, needs two traits for value binding, Encode and Type. And each takes a database generic, that means a type that bindable for a PostgreSQL database is for<'q> sqlx::Encode<'q, sqlx::Postgres> + sqlx::Type<sqlx::Postgres>. We can define it with super trait syntax:

trait SqlxPostgresValue: for<'q> sqlx::Encode<'q, sqlx::Postgres> + sqlx::Type<sqlx::Postgres> {}

Then, we can add a new variant in sea_query::Value to store custom PostgreSQL type:

pub enum Value {
    ...

    #[cfg(feature = "sqlx-postgres")]
    SqlxPostgres(Box<dyn SqlxPostgresValue>),
}

However, SqlxPostgresValue isn't object safe because Type has functions without self parameter. We'll need to consult SQLx team on this. Possibly a PR or forking SQLx.

Error Log
➜ cargo b
   Compiling sea-query-binder v0.1.0 (/Applications/MAMP/htdocs/sea-query/sea-query-binder)
error[E0038]: the trait `SqlxPostgresValue` cannot be made into an object
   --> /Applications/MAMP/htdocs/sea-query/sea-query-binder/src/sqlx_postgres.rs:129:16
    |
129 |     value: Box<dyn SqlxPostgresValue>,
    |                ^^^^^^^^^^^^^^^^^^^^^ `SqlxPostgresValue` cannot be made into an object
    |
note: for a trait to be "object safe" it needs to allow building a vtable to allow the call to be resolvable dynamically; for more information visit <https://doc.rust-lang.org/reference/items/traits.html#object-safety>
   --> /Users/billy/.cargo/registry/src/github.com-1ecc6299db9ec823/sqlx-core-0.6.1/src/types/mod.rs:175:8
    |
175 |     fn type_info() -> DB::TypeInfo;
    |        ^^^^^^^^^ ...because associated function `type_info` has no `self` parameter
...
184 |     fn compatible(ty: &DB::TypeInfo) -> bool {
    |        ^^^^^^^^^^ ...because associated function `compatible` has no `self` parameter
    |
   ::: /Applications/MAMP/htdocs/sea-query/sea-query-binder/src/sqlx_postgres.rs:126:7
    |
126 | trait SqlxPostgresValue: for<'q> sqlx::Encode<'q, sqlx::Postgres> + sqlx::Type<sqlx::Postgres> {}
    |       ----------------- this trait cannot be made into an object...
    = help: consider turning `type_info` into a method by giving it a `&self` argument or constraining it so it does not apply to trait objects
    = help: consider turning `compatible` into a method by giving it a `&self` argument or constraining it so it does not apply to trait objects
help: consider turning `type_info` into a method by giving it a `&self` argument
   --> |/Users/billy/.cargo/registry/src/github.com-1ecc6299db9ec823/sqlx-core-0.6.1/src/types/mod.rs:175:17
    |
175 |     fn type_info&self() -> DB::TypeInfo;
    |                 +++++
help: consider turning `compatible` into a method by giving it a `&self` argument
   --> |/Users/billy/.cargo/registry/src/github.com-1ecc6299db9ec823/sqlx-core-0.6.1/src/types/mod.rs:184:18
    |
184 |     fn compatible&self, (ty: &DB::TypeInfo) -> bool {
    |                  ++++++

For more information about this error, try `rustc --explain E0038`.
error: could not compile `sea-query-binder` due to previous error

Finally, we can do this for MySQL and SQLite as well:

pub enum Value {
    ...

    #[cfg(feature = "sqlx-postgres")]
    SqlxPostgres(Box<dyn SqlxPostgresValue>),

    #[cfg(feature = "sqlx-mysql")]
    SqlxMysql(Box<dyn SqlxMysqlValue>),

    #[cfg(feature = "sqlx-sqlite")]
    SqlxSqlite(Box<dyn SqlxSqliteValue>),
}

All in all, I think this is a good balance between "built-in types" (all existing variants in sea_query::Value) that are supported by SeaQuery. And, "custom types" that user opt-in to use - which are database specific and against SeaQuery's database agnostic big picture.

Thoughts and comments are welcome!! :D

@Sytten
Copy link
Contributor

Sytten commented Sep 8, 2022

This is basically what I was proposing so I am fine with the proposal. But we do have to fix the issue of object safety for the Type trait which is not easy unless sqlx marks Self as Sized.

What I am wondering is: why not convert every value to this trait. What advantage does keeping in the primitives separate from the generic use case has? It would cleanup the code a lot to just use the generic value IMO.

Also I want to make sure we can support diesel types with the same pattern using the ToSql trait.

@Sytten
Copy link
Contributor

Sytten commented Sep 8, 2022

Sorry for double posting but I saw https://users.rust-lang.org/t/how-to-create-vec-for-sqlx-values/66505 which is directly related to our issue of object safety and I think the solution is one we could explore if we fail to make progress with the sqlx team.

It matches well with the binders crate that we currently use too so thats good news I think.

Another possible solution is detailed here https://stackoverflow.com/questions/70660975/storing-disparate-sqlx-types-in-a-vector and it is rather similar to our current situation but I am not sure It will help for arrays.

@billy1624
Copy link
Member

billy1624 commented Sep 9, 2022

Hey @Sytten, I got a draft based on the link you provided, please check https://github.com/SeaQL/sea-query/compare/WIP-value.

We still have some missing pieces:

  1. A SeaQueryToString trait for query introspection. In the draft I use Display trait to as a temporary solution.
  2. SqlxPostgres(Box<dyn SqlxPostgresValue>) should allow NULL (None) value
  3. How to provide default implementation of SeaQueryToString for types like Vec<i32> or Vec<T> in general

CC @ikrivosheev @tyt2y3

@Sytten
Copy link
Contributor

Sytten commented Sep 9, 2022

It looks good. I like that we have our own set of traits. I think we should split the trait definition from the blanket implementation so we can just create a different blanket implementation for diesel.

Null is tricky if we want to keep our implementation object safe as the whole Type trait bot requiring self was in part to deal with nullable values. Basically we need to always store a value even on null.

@ikrivosheev
Copy link
Member

@billy1624 @Sytten @tyt2y3 well, I'm done with other features (split sea-query into crates, SqlWriter refactoring and some other small features), now i will continue to work on this feature!

@Sytten
Copy link
Contributor

Sytten commented Sep 15, 2022

Good stuff, still a few things to resolve before we can finalize this but you have good stuff to read above.
The object safety thing is a bit annoying but we have a work around, I think mostly is the issue with null.
I am wondering if we could go back to null being a separate value of the enum, it was like that at the beginning of sea-query.

@billy1624
Copy link
Member

Hey @ikrivosheev, the context of why we need to wrap every value in Option<T> in the first place: #107

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment