FeatHub can use OverWindowTransform and SlidingWindowTransform to describe
how to derive a feature value by applying FeatHub expression and aggregation
function on multiple rows. In the following, we provide example usages of these
window transforms and the supported built-in aggregation functions.
OverWindowTransform derives a feature value by applying FeatHub expression and
aggregation function on multiple rows of a table at a time. It can be used in
DerivedFeatureView.
Below is an example usage of OverWindowTransform.
f_total_cost = Feature(
name="total_cost",
transform=OverWindowTransform(
expr="cost",
agg_func="SUM",
group_by_keys=["name"],
window_size=timedelta(days=2),
),
)
features = DerivedFeatureView(
name="feature_view",
source=source,
features=[
f_total_cost,
],
keep_source_fields=False,
)SlidingWindowTransform derives a feature value by applying FeatHub expression
and aggregation function on multiple rows in a sliding window. It can be used in
SlidingFeatureView.
Below is an example usage of SlidingWindowTransform.
f_total_cost = Feature(
name="total_cost",
transform=SlidingWindowTransform(
expr="cost",
agg_func="SUM",
window_size=timedelta(days=3),
group_by_keys=["name"],
limit=2,
step_size=timedelta(days=1),
),
)
features = SlidingFeatureView(
name="features",
source=source,
features=[f_total_cost],
)The following built-in aggregation functions are supported by the window transforms described above.
| Function | Description |
|---|---|
| AVG | Returns the average (arithmetic mean) of input values. |
| SUM | Returns the sum of input values. |
| MAX | Returns the maximum value of input values. |
| MIN | Returns the minimum value of input values. |
| FIRST_VALUE | Returns the first value in the ordered list of input values. |
| LAST_VALUE | Returns the last value in the ordered list of input values. |
| ROW_NUMBER | Assigns a unique, sequential number to each value in the ordered list of input values, starting with one. |
| COUNT | Returns the number of input values. |
| VALUE_COUNTS | Returns a map that maps each value to the number of occurrences of this value in the input values. |
| COLLECT_LIST | returns a list that contains the ordered list of input values. |