-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support aggregate function for vector. #463
feat: support aggregate function for vector. #463
Conversation
Signed-off-by: my-vegetable-has-exploded <[email protected]>
Signed-off-by: my-vegetable-has-exploded <[email protected]>
I will update the documentation for these functions, as well as the documentation for veci8 later. |
src/datatype/functions_vecf32.rs
Outdated
/// accumulate intermediate state for vector average | ||
#[pgrx::pg_extern(immutable, strict, parallel_safe)] | ||
fn _vectors_vector_accum( | ||
mut state: pgrx::composite_type!('static, "vectors.vector_accum_state"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the schema name hardcoded now? @usamoi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
Suggest add |
Is this same as |
sorry, I refer to |
src/datatype/functions_vecf32.rs
Outdated
.unwrap() | ||
.unwrap(); | ||
check_matched_dims(sum.len(), value.dims()); | ||
// TODO: pgrx::Array<T> don't support mutable operations currently, we can reuse the state once it's supported. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I double check this solution. It seem that the memory management of AggState
is well-designed. For transfn
like sfunc( internal-state, next-data-values ) ---> next-internal-state
, return internal-state
after modification as result is just okay.
See https://github.com/postgres/postgres/blob/52b49b796cc7fd976f4da6aa49c9679ecdae8bd5/src/backend/executor/nodeAgg.c#L81-L100 and https://doxygen.postgresql.org/int8_8c.html#aff6c10f1da41c3aca8253c3e1d6f1378 for more details.
I am going to change the composite_type
to a struct. And modify the state in-place.
c5ad8a2
to
de0f000
Compare
|
||
/// accumulate intermediate state for vector average | ||
#[pgrx::pg_extern(immutable, strict, parallel_safe)] | ||
fn _vectors_vector_accum<'a>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meet unused-lifetimes warning here. I don't configure out the reason here. As far as I know , this warning only occurs with #[pgrx::pg_extern].
The 235th line is the same.
cc @usamoi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The macro expands to
pub unsafe extern "C" fn _vectors_bvecf32_send_wrapper<'a>(
_fcinfo: ::pgrx::pg_sys::FunctionCallInfo,
) -> ::pgrx::pg_sys::Datum
The lifetime parameter is redundant here. You could use allow
to suppress this warning before pgrx fixes it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just discovered this problem too. After add #![allow(unused_lifetimes)] #![allow(clippy::extra_unused_lifetimes)]
in file header, it works. Thanks!
Signed-off-by: my-vegetable-has-exploded <[email protected]>
de0f000
to
4f8bc4f
Compare
Signed-off-by: my-vegetable-has-exploded <[email protected]>
@@ -0,0 +1,292 @@ | |||
#![allow(unused_lifetimes)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@usamoi PTAL |
use std::ptr::NonNull; | ||
|
||
#[repr(C, align(8))] | ||
pub struct AccumulateStateHeader { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'll be better if it's more generic so that we use reuse it for more usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think only AccumulateState
for vecf32
and AccumulateState
for vecf16
layout can be same? Just adding a generic type seems won't be so helpful?
Signed-off-by: my-vegetable-has-exploded <[email protected]>
da65beb
to
c68bbe9
Compare
@usamoi @my-vegetable-has-exploded Is this ready to merge? |
I think it's ready to go now. |
* implement vector avg. Signed-off-by: my-vegetable-has-exploded <[email protected]> * Implement sum、 vector_dims()、 vector_norm() Signed-off-by: my-vegetable-has-exploded <[email protected]> * update AccumulateState Signed-off-by: my-vegetable-has-exploded <[email protected]> * fix norm. Signed-off-by: my-vegetable-has-exploded <[email protected]> * add l2_norm for VectorBorrowed. Signed-off-by: my-vegetable-has-exploded <[email protected]> --------- Signed-off-by: my-vegetable-has-exploded <[email protected]> Signed-off-by: jinweios <[email protected]>
close #458
This pr mainly contains function including
avg
,sum
,vector_dims
,vector_norm
.vector_dims(vector)
Returns the dimensions of a given vector.
Arguments:a vector.
Return type integer representing the number of dimensions of the given vector.
vector_norms(vector)
Calculates the Euclidean norm of a given vector.
Arguments:a vector.
Return double precision representing the Euclidean norm of the given vector.
AVG
Calculates the average of the processed vectors.
Arguments
vector: vectors.
Return vector representing the average of processed vectors.
SUM
Arguments: vectors.
Return type vector representing the sum of processed vectors.
The aggregate function support need to add a state transition function sfunc and an optional final calculation function ffunc, following create aggregate document. And we add
COMBINEFUNC
here to support parallel aggregate.