-
Notifications
You must be signed in to change notification settings - Fork 3
Vmpy #26
base: master
Are you sure you want to change the base?
Vmpy #26
Changes from all commits
5024350
b0e08cd
47b8dcb
4d52395
9fa9b8b
0dc806a
4dd390d
2e50097
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -94,3 +94,4 @@ ENV/ | |
|
||
# MacOS | ||
.DS_Store | ||
.idea |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
Changelog | ||
========= | ||
|
||
0.X.X (unreleased) | ||
------------------ | ||
|
||
Breaking changes: | ||
|
||
- *add item here* | ||
|
||
New features: | ||
|
||
- *add item here* | ||
|
||
Bug fixes: | ||
|
||
- *add item here* | ||
|
||
|
||
0.7.3 (in progress) | ||
------------------ | ||
|
||
Breaking changes: | ||
|
||
- *add item here* | ||
|
||
New features: | ||
|
||
- add vmpy-0.1.6 as a dependency | ||
- compute_mean_max uses vmpy.metrics.power_duration_curve | ||
- compute_weighted_average_power uses vmpy.metrics.normalized_power | ||
- compute_power_per_kg uses vmpy.streams.wpk | ||
- compute_power_zones based on athlete.ftp, custom ftp or custom zones | ||
|
||
Bug fixes: | ||
|
||
- Use of pytest.approx() in test_algorithms | ||
- Fixed path handling in the pytest fixtures |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -11,6 +11,20 @@ | |
|
||
|
||
def mean_max_power(power): | ||
"""Mean-Max Power | ||
|
||
Also known as a Power-Duration-Curve | ||
|
||
Parameters | ||
---------- | ||
power: pd.Series | ||
|
||
Returns | ||
------- | ||
pd.Series | ||
|
||
TO-DO: to be replaced by vmpy.algorithms.power_duration_curve() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This module does not exist yet right? What's your idea behind this module? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It does, I've moved it to vmpy.metrics.power_duration_curve() the last moment. It calculates accumulated energy first and than does two arrays diff with a shift. It speeds up calculations by x4. The 8hours ride takes about 10 seconds instead of almost a minute on my PC. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh fancy! Did you benchmark it to my implementation or yours? |
||
""" | ||
mmp = [] | ||
|
||
for i in range(len(power)): | ||
|
@@ -23,11 +37,27 @@ def mean_max_power(power): | |
|
||
|
||
def mean_max_bests(power, duration, amount): | ||
"""Compute multiple best non-overlapping intervals | ||
|
||
Parameters | ||
---------- | ||
power : pd.Series | ||
duration : number | ||
Duration of the interval in seconds | ||
amount : int | ||
Number of the non-overlaping intervals | ||
|
||
Returns | ||
------- | ||
pd.Series | ||
""" | ||
moving_average = power.rolling(duration).mean() | ||
|
||
length = len(moving_average) | ||
mean_max_bests = [] | ||
|
||
for i in range(amount): | ||
|
||
if moving_average.isnull().all(): | ||
mean_max_bests.append(DataPoint(np.nan, np.nan)) | ||
continue | ||
|
@@ -45,10 +75,39 @@ def mean_max_bests(power, duration, amount): | |
|
||
|
||
def weighted_average_power(power): | ||
"""Weighted average power | ||
|
||
Also known as the Normalized Power | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Normalized Power®️ (😉) is a trademarked name and cannot be used commercially without permission from Peaksware. That reason alone is enough for me to avoid using that name. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, didn't know that, but knowing the behavior of Coggan on forums should have guessed it. WAP is good, fits well with Strava community. |
||
|
||
Parameters | ||
---------- | ||
power : pd.Series | ||
|
||
Returns | ||
------- | ||
number | ||
|
||
replaced by vmpy.metrics.normalized_power | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Idem: What are the differences between your implementation and mine? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For NP - none, but the vmpy function also allows to calculate xPower that uses the exponentially weighted average. |
||
""" | ||
|
||
wap = power.rolling(30).mean().pow(4).mean().__pow__(1/4) | ||
|
||
return wap | ||
|
||
|
||
def power_per_kg(power, weight): | ||
""" | ||
|
||
Parameters | ||
---------- | ||
power | ||
weight | ||
|
||
Returns | ||
------- | ||
|
||
replaced by vmpy.metrics.wpk | ||
""" | ||
ppkg = power / weight | ||
|
||
return ppkg |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,22 +1,36 @@ | ||
from . import algorithms | ||
from .base import BaseWorkoutDataFrame | ||
from .helpers import requires | ||
from vmpy.metrics import normalized_power, power_duration_curve | ||
from vmpy.streams import compute_zones, wpk | ||
|
||
|
||
class WorkoutDataFrame(BaseWorkoutDataFrame): | ||
_metadata = ['athlete'] | ||
|
||
@requires(columns=['power']) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Although I am not sure I still like the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In general, I think the require decorator limits the usage. If you get data from Strava for example, the power stream is called 'watts' there. I would propose to ditch require in favor of sane defaults with an ability to override during the function call. Somethin along the line wdf.calculate_awesome_power(arg='power', **kwargs) In this particular case I've removed require for an athlete to allow optionally specifying 'ftp' using keywords. It falls back on athlete['ftp'] as s default. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this is really similar to the discussion about accepting multiple input argument types and I am not in favor of it. I think as a developer you have a responsibility to make the interface of you library as well defined as possible, accepting multiple input types or accepting that certain data is stored in different columns makes the interface unclear and therefore more difficult to use. For me this would also make the purpose of the dataframe redundant as you cannot just use the class methods "as is" without understanding the underlying data structure. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
The problem with too simple interfaces is that they limit the use. In case of requiring 'power' only you force the user in figuring out how to rename the columns in pd.DataFrame to get it working. The power/watts duality is a one example forced on us by Strava, but with heartrate you will get a full multitude of the possibilities: {hr, bpm, heart_rate, you_name_it} The methods can simply be written with defaults in mind as Another aspect is a sole reliance on the Athlete obj. I think it is less canonical and therefore less obvious for the user to expect that only way to parametrize wdf methods is through the Athlete object. It is a great placeholder for defaults, but it is also quite natural to expect the parameters to be exposed as a function signature. Imagine you want to compare two Wbal calculations on the same data set, but with different W' and CP values. The only way to do it now is through redefining the Athlete properties, while I would naturally expect that W' and CP' should be available as inputs to the method call. |
||
def compute_power_zones(self, **kwargs): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this method touches a sensitive subject ;) : In my view the WorkoutDataFrame should not contain any business logic and should call a method from algorithms as directly as possible. That means adding a method to the algorithms module (which in turn could call vmpy) and call it from here. The WorkoutDataFrame is just a more high level way to work with these algorithms but a user should be able to use all logic in athletic_pandas outside of the WorkoutDataFrame. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It makes sense, especially in light of the email I wrote yesterday. Let's see how we separate Frame from the algos in the most efficient way. |
||
if kwargs.get('ftp', None): | ||
return compute_zones(self.power, ftp=kwargs.get('ftp')) | ||
elif kwargs.get('zones', None): | ||
return compute_zones(self.power, zones=kwargs.get('zones')) | ||
else: | ||
return compute_zones(self.power, ftp=self.athlete.ftp) | ||
|
||
@requires(columns=['power']) | ||
def compute_mean_max_power(self): | ||
return algorithms.mean_max_power(self.power) | ||
return power_duration_curve(self.power) | ||
# return algorithms.mean_max_power(self.power) | ||
|
||
@requires(columns=['power']) | ||
def compute_weighted_average_power(self): | ||
return algorithms.weighted_average_power(self.power) | ||
return normalized_power(self.power, type='NP') | ||
# return algorithms.weighted_average_power(self.power) | ||
|
||
@requires(columns=['power'], athlete=['weight']) | ||
def compute_power_per_kg(self): | ||
return algorithms.power_per_kg(self.power, self.athlete.weight) | ||
return wpk(self.power, self.athlete.weight) | ||
# return algorithms.power_per_kg(self.power, self.athlete.weight) | ||
|
||
@requires(columns=['power'], athlete=['cp', 'w_prime']) | ||
def compute_w_prime_balance(self, algorithm=None, *args, **kwargs): | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,4 +2,5 @@ coverage==4.4.1 | |
isort==4.2.15 | ||
lmfit==0.9.7 | ||
pandas==0.20.3 | ||
vmpy==0.1.6 | ||
pytest==3.2.5 |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -105,8 +105,8 @@ def test_extended_critical_power(self, version, expected): | |
power = pd.Series(range(500)) | ||
mean_max_power = algorithms.mean_max_power(power) | ||
ecp = algorithms.extended_critical_power(mean_max_power, version=version) | ||
assert ecp[1] == expected[0] | ||
assert ecp[250] == expected[1] | ||
assert ecp[1] == pytest.approx(expected[0], 0.1) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh I like the |
||
assert ecp[250] == pytest.approx(expected[1], 0.1) | ||
assert len(ecp) == expected[2] | ||
|
||
def test_heartrate_model(self): | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,6 @@ | ||
import unittest | ||
|
||
import os | ||
import numpy as np | ||
import pandas as pd | ||
import pytest | ||
|
@@ -23,8 +24,9 @@ def wdf(): | |
@pytest.fixture | ||
def wdf_small(): | ||
athlete = models.Athlete(cp=200, w_prime=20000) | ||
current_dir = os.path.dirname(os.path.abspath(__file__)) | ||
wdf = models.WorkoutDataFrame( | ||
pd.read_csv('tests/example_files/workout_1_short.csv') | ||
pd.read_csv(os.path.join(current_dir, 'example_files/workout_1_short.csv')) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am working on moving all pytest fixtures to a |
||
) | ||
wdf = wdf.set_index('time') | ||
wdf.athlete = athlete | ||
|
@@ -34,8 +36,9 @@ def wdf_small(): | |
@pytest.fixture | ||
def wdf_big(): | ||
athlete = models.Athlete(cp=200, w_prime=20000, weight=80) | ||
current_dir = os.path.dirname(os.path.abspath(__file__)) | ||
wdf = models.WorkoutDataFrame( | ||
pd.read_csv('tests/example_files/workout_1.csv') | ||
pd.read_csv(os.path.join(current_dir, 'example_files/workout_1.csv')) | ||
) | ||
wdf = wdf.set_index('time') | ||
wdf.athlete = athlete | ||
|
@@ -170,8 +173,36 @@ def test_compute_mean_max_power_missing_power(self, wdf): | |
with pytest.raises(exceptions.MissingDataException): | ||
assert wdf.compute_mean_max_power() is None | ||
|
||
def test_compute_power_zones(self): | ||
power = 4.5*np.array([0.54, 0.74, 0.89, 1.04, 1.19, 1.49, 10.0]) | ||
zones = [1,2,3,4,5,6,7] | ||
wdf = models.WorkoutDataFrame({'power': power}) | ||
wdf.athlete = models.Athlete(ftp=4.5) | ||
rv = wdf.compute_power_zones() | ||
assert type(rv) == pd.Series | ||
assert (rv == pd.Series(zones)).all() | ||
|
||
|
||
def test_compute_power_zones_custom_ftp(self): | ||
power = 4.5*np.array([0.54, 0.74, 0.89, 1.04, 1.19, 1.49, 10.0]) | ||
zones = [1,2,3,4,5,6,7] | ||
wdf = models.WorkoutDataFrame({'power': power}) | ||
rv = wdf.compute_power_zones(ftp=4.5) | ||
assert type(rv) == pd.Series | ||
assert (rv == pd.Series(zones)).all() | ||
|
||
|
||
def test_compute_power_zones_custom_zones(self): | ||
power = np.array([0.54, 0.74, 0.89, 1.04, 1.19, 1.49, 10.0]) | ||
zones = [1,2,3,4,5,6,7] | ||
wdf = models.WorkoutDataFrame({'power': power}) | ||
rv = wdf.compute_power_zones(zones=[-0.001, 0.55, 0.75, 0.9, 1.05, 1.2, 1.5, 10.0, 1000]) | ||
assert type(rv) == pd.Series | ||
assert (rv == pd.Series(zones)).all() | ||
|
||
|
||
def test_compute_weighted_average_power(self, wdf_big): | ||
assert wdf_big.compute_weighted_average_power() == 156.24624656343036 | ||
assert wdf_big.compute_weighted_average_power() == pytest.approx(156.24624656343036, 0.1) | ||
|
||
def test_compute_weighted_average_power_missing_power(self, wdf_big): | ||
del wdf_big['power'] | ||
|
@@ -181,6 +212,7 @@ def test_compute_weighted_average_power_missing_power(self, wdf_big): | |
|
||
def test_compute_power_per_kg(self, wdf_big): | ||
# wdf_big.athlete.ftp = 300 | ||
# wdf_big.athlete.weight = 80 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It was just a reminder for myself to remember what is the actual weight. Can be removed. |
||
ppkg = wdf_big.compute_power_per_kg() | ||
|
||
assert ppkg[1] == 1.1625000000000001 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True for now but I probably want to change to ndarray. Do you agree that's better?
Good that your are adding docstrings, I did not get to that yet...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments are based on what it does at the moment. In general, i prefer to have it array-like {list, ndarray, pd.Series} with output having the same time as an input.