Add MSSE and RMSSE metrics #135

Closed
marcozanotti opened this issue Nov 15, 2024 · 6 comments · Fixed by #136
Labels
enhancement New feature or request feature

Comments

@marcozanotti
Contributor

Description

It would be very useful to have the Mean Squared Scaled Error (MSSE) and the Root Mean Squared Scaled Error (RMSSE) as metrics / losses.
The implementation should be quite straightforward because they follow the same procedure as MASE, but with squared errors instead of absolute errors.

References: https://otexts.com/fpp3/accuracy.html
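To make the request concrete, here is a minimal single-series sketch of the definition described above: the mean squared error of the forecast divided by the in-sample mean squared error of the seasonal naive forecast. The function name `msse_single_series` and its arguments are hypothetical and not part of any library. It assumes `seasonality >= 1` and a training series longer than `seasonality`, and it returns 0 when the scale is 0, which is one possible convention.

```python
import numpy as np


def msse_single_series(y_true, y_pred, y_train, seasonality):
    """Hypothetical helper: MSSE for one series.

    Forecast MSE divided by the in-sample MSE of the seasonal naive forecast.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    # Mean squared error of the forecast over the evaluation window.
    mse = np.mean((y_true - y_pred) ** 2)
    # In-sample seasonal naive errors: y_t - y_{t - seasonality}.
    naive_errors = y_train[seasonality:] - y_train[:-seasonality]
    scale = np.mean(naive_errors ** 2)
    # Assumption: report 0 when the scale is 0 (e.g. a constant training series).
    if scale == 0:
        return 0.0
    return mse / scale


# Training series with constant step 2, so the naive scale is 4.
result = msse_single_series(
    y_true=[9, 11], y_pred=[8, 13], y_train=[1, 3, 5, 7], seasonality=1
)
print(result)  # 0.625
```

A value below 1 means the forecast's squared errors are, on average, smaller than those of the seasonal naive model on the training data.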

Use case

No response

@marcozanotti marcozanotti added enhancement New feature or request feature labels Nov 15, 2024
@marcozanotti
Contributor Author

marcozanotti commented Nov 18, 2024

It should be something like this:

def msse(
    df: DFType,
    models: List[str],
    seasonality: int,
    train_df: DFType,
    id_col: str = "unique_id",
    target_col: str = "y",
) -> DFType:
    """Mean Squared Scaled Error (MSSE)

    MSSE measures the relative prediction
    accuracy of a forecasting method by comparing the mean squared error
    of its predictions against the in-sample mean
    squared error of the seasonal naive model.

    Parameters
    ----------
    df : pandas or polars DataFrame
        Input dataframe with id, actuals and predictions.
    models : list of str
        Columns that identify the models' predictions.
    seasonality : int
        Main frequency of the time series;
        Hourly 24, Daily 7, Weekly 52, Monthly 12, Quarterly 4, Yearly 1.
    train_df : pandas or polars DataFrame
        Training dataframe with id and actual values. Must be sorted by time.
    id_col : str (default='unique_id')
        Column that identifies each series.
    target_col : str (default='y')
        Column that contains the target.

    Returns
    -------
    pandas or polars DataFrame
        Dataframe with one row per id and one column per model.

    References
    ----------
    [1] https://robjhyndman.com/papers/mase.pdf
    """
    mean_sq_err = mse(df, models, id_col, target_col)
    if isinstance(train_df, pd.DataFrame):
        mean_sq_err = mean_sq_err.set_index(id_col)
        # assume train_df is sorted
        lagged = train_df.groupby(id_col, observed=True)[target_col].shift(seasonality)
        scale = train_df[target_col].sub(lagged).pow(2)
        scale = scale.groupby(train_df[id_col], observed=True).mean()
        res = mean_sq_err.div(_zero_to_nan(scale), axis=0).fillna(0)
        res.index.name = id_col
        res = res.reset_index()
    else:
        # assume train_df is sorted
        lagged = pl.col(target_col).shift(seasonality).over(id_col)
        scale_expr = pl.col(target_col).sub(lagged).pow(2).alias("scale")
        scale = train_df.select([id_col, scale_expr])
        scale = ufp.group_by(scale, id_col).mean()
        scale = scale.with_columns(_zero_to_nan(pl.col("scale")))

        def gen_expr(model):
            return pl.col(model).truediv(pl.col("scale")).fill_nan(0).alias(model)

        full_df = mean_sq_err.join(scale, on=id_col, how="left")
        res = _pl_agg_expr(full_df, models, id_col, gen_expr)
    return res

The RMSSE should then just be the square root of the MSSE.
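As a rough illustration of that last step, the sketch below takes an MSSE result shaped like the one described in the docstring (one row per id, one column per model) and applies an element-wise square root to the model columns. The helper name `rmsse_from_msse` and the sample values are hypothetical, and only the pandas case is covered; a polars version would need its own branch.

```python
import numpy as np
import pandas as pd


def rmsse_from_msse(msse_df, models):
    """Hypothetical helper: element-wise square root of the model columns."""
    res = msse_df.copy()  # leave the MSSE result untouched
    res[models] = np.sqrt(res[models])
    return res


# Made-up MSSE values, one row per series id and one column per model.
msse_df = pd.DataFrame({"unique_id": ["a", "b"], "model1": [0.25, 4.0]})
rmsse_df = rmsse_from_msse(msse_df, ["model1"])
print(rmsse_df["model1"].tolist())  # [0.5, 2.0]
```

Because the square root is applied after scaling, the id column and the handling of zero scales carry over unchanged from the MSSE result.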

@jmoralez
Member

Would you like to contribute that @marcozanotti?

@marcozanotti
Contributor Author

I would really like to, but unfortunately I am not a Python expert.

@jmoralez
Member

That function looks good. If you want to, you can open a PR and we can help you there. If not, that's ok, we'll work on this soon.

@marcozanotti
Contributor Author

Ok, I'll try to do that! Thanks

@marcozanotti
Contributor Author

@jmoralez I've opened the PR adding both MSSE and RMSSE.

@jmoralez jmoralez linked a pull request Nov 22, 2024 that will close this issue