Skip to content

Aggregate APIs #47

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
tyt2y3 opened this issue Jul 15, 2021 · 33 comments
Open

Aggregate APIs #47

tyt2y3 opened this issue Jul 15, 2021 · 33 comments
Assignees
Labels
good first issue Good for newcomers

Comments

@tyt2y3
Copy link
Member

tyt2y3 commented Jul 15, 2021

Add a new trait AggregatorTrait to be impl by Selector*
count()
and sum avg max min by specifying a column to aggregate on

@tyt2y3
Copy link
Member Author

tyt2y3 commented Jul 17, 2021

56763a8

@billy1624
Copy link
Member

billy1624 commented Aug 13, 2021

So, it allows user to aggregate by column(s) / expression on any of the supported aggregation method (i.e. sum, avg, max, min)?

@tyt2y3
Copy link
Member Author

tyt2y3 commented Aug 13, 2021

My idea was to have a struct called Aggregator that will perform the action, as in Paginator.
The sum method on the AggregatorTrait will convert self into an Aggregator and execute in async.
The parameter to sum should be a ColumnTrait

@billy1624
Copy link
Member

How about "Select A, SUM(B) AS B FROM Table GROUP BY A"?

@tyt2y3
Copy link
Member Author

tyt2y3 commented Aug 13, 2021

I think it should work similar to Paginator, creating a sub query is the safest

@billy1624
Copy link
Member

billy1624 commented Aug 13, 2021

So the result is just a simple type T, perhaps a number in i32, i64, f32, f64, etc.

@tyt2y3
Copy link
Member Author

tyt2y3 commented Aug 13, 2021

So the result is just a simple type T, perhaps a number i32, i64, f32, f64, etc.

Oh, not yet thought of this. No, it can't be type T. We need to query the column type in runtime and select the appropriate Value variant

@billy1624
Copy link
Member

billy1624 commented Aug 13, 2021

Btw... we should consider changing the return type of Paginator::count(), make it more dynamic to account for db differences.

e.g. Postgres prefer i64; MySQL is okay with both i32 & i64

@tyt2y3
Copy link
Member Author

tyt2y3 commented Aug 13, 2021

Btw... we should consider changing the return type of Paginator::count(), make it more dynamic.

I think the result of count is not dependent on the column type, only dependent on the database driver. Using i64 is the safest bet.

@billy1624
Copy link
Member

Yes!

@billy1624
Copy link
Member

So all these methods (count, sum, avg, max, min) return single value as result?

So the result is just a simple type T, perhaps a number i32, i64, f32, f64, etc.

Oh, not yet thought of this. No, it can't be type T. We need to query the column type in runtime and select the appropriate Value variant

What you mean by selecting an appropriate Value variant at runtime?

@tyt2y3
Copy link
Member Author

tyt2y3 commented Aug 14, 2021

Well, if the column itself is a Double, then the avg result would be f64. It could be decimal, i32 and so on.
It is different from count that always return an i64.

@billy1624
Copy link
Member

billy1624 commented Aug 14, 2021

Yes, think so!

Generic types loll

@tyt2y3 tyt2y3 added this to the 0.2.0 milestone Aug 23, 2021
@tyt2y3 tyt2y3 modified the milestones: 0.2.0, 0.3.0 Aug 30, 2021
@tyt2y3 tyt2y3 removed this from the 0.3.x milestone Sep 24, 2021
@tyt2y3
Copy link
Member Author

tyt2y3 commented Sep 24, 2021

To whom it may concern, if it is important to you, make a comment.

@tyt2y3 tyt2y3 closed this as completed Sep 24, 2021
@BenJeau
Copy link
Contributor

BenJeau commented Jan 17, 2022

I'm moving from diesel to this crate, and love it! Although I'm using count https://docs.diesel.rs/1.4.x/diesel/dsl/fn.count.html and would like it if it was implemented here too. I can help implementing it, if someone could help me/indicate what to do :)

@billy1624
Copy link
Member

Hey @BenJeau, thanks for the help!

I have a rough idea in mind on how this could be done. It would be similar to PaginatorTrait.

#[async_trait::async_trait]
/// A Trait for any type that can paginate results
pub trait PaginatorTrait<'db, C>
where
C: ConnectionTrait,
{
/// Select operation
type Selector: SelectorTrait + Send + Sync + 'db;
/// Paginate the result of a select operation.
fn paginate(self, db: &'db C, page_size: usize) -> Paginator<'db, C, Self::Selector>;
/// Perform a count on the paginated results
async fn count(self, db: &'db C) -> Result<usize, DbErr>
where
Self: Send + Sized,
{
self.paginate(db, 1).num_items().await
}
}

Let say we create an AggregatorTrait then implement it for...

  • Selector<S>
  • Select<E>
  • SelectTwo<E, F>
#[async_trait::async_trait]
pub trait AggregatorTrait
{
    fn aggregate(self, db: &'db C) -> Aggregator;
}

pub struct Aggregator<'db, C>
where
    C: ConnectionTrait,
{
    pub(crate) query: SelectStatement,
    pub(crate) db: &'db C,
}

impl<'db, C> Aggregator<'db, C> {
    pub fn count<T>(col: T) -> Result<i64, DbErr>
    where
        T: ColumnTrait
    {
        todo!()
    }
}

So, the public API would be

let count: i64 = cake::Entity::find()
    .aggregate(db)
    .count(cake::Column::Id)
    .await?;

And we could use subquery to select the aggregated result, it would be easier to for us to implement.

SELECT COUNT(sub.id) FROM (SELECT cake.id, cake.name FROM cake) AS sub

@billy1624 billy1624 reopened this Jan 18, 2022
@billy1624 billy1624 added the good first issue Good for newcomers label Feb 9, 2022
SebastienGllmt pushed a commit to dcSpark/sea-orm that referenced this issue Apr 13, 2022
@kirawi
Copy link
Contributor

kirawi commented Apr 23, 2022

Is there still interest in this feature being implemented?

@tyt2y3
Copy link
Member Author

tyt2y3 commented Apr 24, 2022

I think so, it's marked as a good first issue anyway

@kirawi
Copy link
Contributor

kirawi commented May 16, 2022

I'll give this issue a shot in the coming days.

@billy1624 billy1624 moved this from Triage to Open for Contributions in SeaQL Dev Tracker Jul 12, 2022
@billy1624 billy1624 moved this to Triage in SeaQL Dev Tracker Jul 12, 2022
@LemarAb
Copy link

LemarAb commented Dec 25, 2022

Is this issue still active @billy1624 ?

@LemarAb
Copy link

LemarAb commented Dec 26, 2022

If nobody minds, I would like to claim this task :) .

@LemarAb
Copy link

LemarAb commented Dec 28, 2022

@billy1624 Following your suggestion, it seems that you would like the count function to actually return the number of rows. Do I assume corrrectly that sea-orm currently does not provide a way to access counts for tables?

@billy1624
Copy link
Member

Hey @LemarAb, we do have methods to construct aggregate expression in SeaORM.
https://docs.rs/sea-orm/latest/sea_orm/entity/prelude/trait.ColumnTrait.html#method.count

@LemarAb
Copy link

LemarAb commented Jan 6, 2023

@billy1624 so there is no intention to make the APIs compatible with further querying, as we return i64 immediately, only (one) aggregation should be performed per aggregate() call, correct? So it's just a way to make aggregation easier?

Can you elaborate on your previous comment:

And we could use subquery to select the aggregated result, it would be easier to for us to implement.

SELECT COUNT(sub.id) FROM (SELECT cake.id, cake.name FROM cake) AS sub

Why is there a necessity for a subquery ?

@billy1624
Copy link
Member

Hey @LemarAb, I saw it the other way round. User can perform filtering or even pre-aggregating, then aggregate it to produce the final result.

SELECT COUNT(sub.id)
FROM (SELECT cake.id, cake.name FROM cake WHERE ... GROUP BY ... HAVING ...) AS sub

Ideally we wrap the original select statement in a subquery because this way we always perform aggregation on a query result of a black box.

I imagine the API be like:

let count: i64 = cake::Entity::find()
    .count(cake::Column::Id) // method defined in `AggregatorTrait` to construct the `Aggregator` struct
    .one(db) // method defined in `Aggregator` struct
    .await?;

@LemarAb
Copy link

LemarAb commented Jan 11, 2023

Okay @billy1624 , we are on the same page when it comes to pre-filtering, but I am concerned with what comes after the count()call.

Your proposal further above intended an i64 to return as a result of the count method...what methods should be callable after that? As far as I am aware, we can implement the call to the db connection in the count method, just like we already do in fn paginate. So why the need to use theone() method?

@billy1624
Copy link
Member

billy1624 commented Jan 12, 2023

The one() method will run the query and retrieve the result as you might have guessed. I want it to be a two steps process and leave room for API addition in the future.

let count: i64 = cake::Entity::find()
    .count(cake::Column::Id) // 1) Convert into `Aggregator`
    .some_more_fn(...) // 1.x) Leave room for future API addition
    .one(db) // 2) Run the query
    .await?;

Does that make sense?

@baoyachi
Copy link
Contributor

Why not replace some_more_fn with and_then.

Ref :and_then

@billy1624 billy1624 moved this from Open for Contributions to Triage in SeaQL Dev Tracker Jan 13, 2023
@LemarAb
Copy link

LemarAb commented Jan 14, 2023

@billy1624 so opposite to what you suggested further above, count should not return the number right away, but return Aggregator to leave room for further aggregation APIs? Are there any functions supposed to be callable right now after count aside from one()?

@billy1624
Copy link
Member

For example we can implement QuerySelect for Aggregator.

@billy1624
Copy link
Member

Why not replace some_more_fn with and_then.

Ref :and_then

some_more_fn is just a placeholder, it's different from the concept of and_then

@colinrknox
Copy link

Is this still open? I've been looking into this, but I'm having trouble figuring out a way to generalize the result over a set of numeric types. I'm currently doing this in the Aggregator:

fn one<T: TryGetable>(db: &'db C) -> Result<T, DbErr>

I'm sure there is a better way of doing this based on the ColumnType of the ColumnTrait from the sum() parameter, but I can't figure it out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers
Projects
Status: Triage
Development

No branches or pull requests

8 participants
@tyt2y3 @colinrknox @baoyachi @BenJeau @billy1624 @kirawi @LemarAb and others