Releases: nflverse/nflfastR
nflfastR 5.0.0
Major Changes
- Added new function calculate_stats() that combines the output of all calculate_player_stats*() functions with a more robust and faster approach. The calculate_player_stats*() functions will be deprecated in a future release. (#470)
- Added new exported dataframe nfl_stats_variables. It lists and explains all variables returned by calculate_stats(). A searchable table is available at https://www.nflfastr.com/articles/stats_variables.html. (#470)
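A rough sketch of how the new interface might be called is shown here. It requests season-level player stats for one regular season and then looks up the variable dictionary. The argument names summary_level, stat_type, and season_type, and the season 2023, are assumptions that do not come from these release notes; check ?calculate_stats before relying on them. The calls are wrapped in try() because they need the package and an internet connection.

```r
# Sketch only: requires nflfastR (>= 5.0.0) and an internet connection.
# Argument names are assumptions, not taken from the release notes.
season <- 2023  # hypothetical season chosen for illustration

player_stats <- try(
  nflfastR::calculate_stats(
    seasons = season,
    summary_level = "season",  # "week" would summarise per player and week
    stat_type = "player",      # "team" would aggregate at team level
    season_type = "REG"
  ),
  silent = TRUE
)

# The exported data frame describes the columns returned by calculate_stats()
variables <- try(nflfastR::nfl_stats_variables, silent = TRUE)

# Only print when the download succeeded
if (!inherits(player_stats, "try-error")) {
  print(head(player_stats))
}
```

If the call fails (package missing, no network), player_stats holds a try-error object instead of a data frame, so downstream code should check for that as done above.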
Bug Fixes and Minor Changes
- Drop {crayon}, {DT}, {httr}, {jsonlite}, {qs} dependencies. (#453)
- The function calculate_player_stats_def now returns season_type if argument weekly is set to TRUE for consistency with the other player stats functions. (#455)
- The function missing_raw_pbp() now allows filtering by season. (#457)
- More robust handling of player IDs in decode_player_ids(). (#458)
- Fixed rare cases where the value of the yrdln variable didn't equal "MID 50" at midfield. (#459)
- Fixed rare cases where drive_start_yard_line missed the blank space between team name and yard line number. (#459)
- Fixed play description in some 1999 and 2000 games where the string "D.Holland" replaced the kick distance. (#459)
- Fixed a problem where the goal_to_go variable was FALSE in actual goal to go situations. (#460)
- Fixed a bug in fixed_drive and fixed_drive_result where the second weather delay in 2023_13_ARI_PIT wasn't identified correctly. (#461)
- punter_player_id and punter_player_name are filled for blocked punt attempts. (#463)
- Fixed an issue affecting scores of 2022 games involving a return touchdown (#466)
- Added identification of scrambles from 1999 through 2004 with thanks to Aaron Schatz (#468, #489)
- Updated the dataframe stat_ids with some IDs that were previously missing. (#470)
- nflfastR tried to fix bugs in the underlying pbp data of JAX home games prior to the 2016 season. An update of the raw pbp data resolved those bugs so nflfastR needs to remove the hard coded adjustments. This means that nflfastR <= v4.6.1 will return incorrect pbp data for all Jacksonville home games prior to the 2016 season! (#478)
- Fixed a problem where clean_pbp() returned pass = 1 in actual rush plays in very rare cases. (#479)
- Removed extra lines for injury timeouts that were breaking fixed_drive (#482)
- The variable penalty_type now correctly lists the penalty "Kickoff Short of Landing Zone" introduced in the 2024 season. (#486)
- Fixed a bug where ep was incorrect on PAT attempts preceded by a timeout and then a penalty (extremely rare). This bug also caused the variables total_home_epa and total_away_epa to be incorrect for all subsequent plays in the same game. (#493)
Thank you to @ahmed-cheema, @andrewtek, @guga31bb, @isaactpetersen, @JoeMarino2021, @john-b-edwards, @marcusSasser, @mlounsberry, @morganandrew, @mrcaseb, @mscoop16, @parsnipz, @rjthompson2, and @Useight for their questions, feedback, and contributions towards this release.
nflfastR 4.6.1
- The function calculate_series_conversion_rates() now correctly aggregates season level conversion rates. Performance has also been improved. (#440)
- Adjusted test behavior at CRAN's request.
Thank you to @andrewtek, @gregalvi86, @Ic4ru5Wing, @JoeMarino2021, @jreddy1990, @marvin3FF, @mrcaseb, @RicShern, @SPNE, and @trivialfis for their questions, feedback, and contributions towards this release.
nflfastR 4.6.0
New Features
- nflfastR now fully supports loading raw pbp data from the local file system. The best way to use this feature is to set options("nflfastR.raw_directory" = {"your/local/directory"}). Alternatively, both build_nflfastR_pbp() and fast_scraper() support the argument dir which defaults to the above option. (#423)
- Added the new function save_raw_pbp() which efficiently downloads raw play-by-play data and saves it to the local file system. This serves as a helper to set up the system for faster play-by-play parsing via the above functionality. (#423)
- Added the new function missing_raw_pbp() that computes a vector of game IDs missing in the local raw play-by-play directory. (#423)
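The three features above are meant to be used together, and a minimal sketch of that workflow could look like the following. The directory under tempdir() and the choice to fetch only three games are hypothetical, picked to keep the example small. The calls are wrapped in try() because they need the package and an internet connection.

```r
# Sketch only: requires nflfastR (>= 4.6.0) and an internet connection.
raw_dir <- file.path(tempdir(), "nflfastR_raw")  # hypothetical location
dir.create(raw_dir, showWarnings = FALSE, recursive = TRUE)

# Register the directory globally so the dir arguments default to it
options("nflfastR.raw_directory" = raw_dir)

# Game IDs whose raw files are not yet stored in raw_dir
missing_ids <- try(nflfastR::missing_raw_pbp(dir = raw_dir), silent = TRUE)

if (!inherits(missing_ids, "try-error") && length(missing_ids) > 0) {
  some_ids <- head(missing_ids, 3)  # keep the download small

  # Download the raw files once ...
  try(nflfastR::save_raw_pbp(some_ids, dir = raw_dir), silent = TRUE)

  # ... then parse them from disk instead of the remote source
  pbp <- try(nflfastR::build_nflfastR_pbp(some_ids, dir = raw_dir), silent = TRUE)
}
```

Passing dir explicitly is redundant once the option is set. It is spelled out here only to make the data flow visible.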
Minor Improvements and Bugfixes
- The internal function get_pbp_nfl() now uses ifelse() instead of dplyr::if_else() to handle some null-checking, which fixes a bug found in the 2022_21_CIN_KC match.
- The function calculate_player_stats() now summarises target share and air yards share correctly when called with argument weekly = FALSE (#413)
- The function calculate_player_stats() now returns the opponent team when called with argument weekly = TRUE (#414)
- The function calculate_player_stats_def() no longer errors when small subsets of pbp data are missing stats. (#415)
- The function calculate_series_conversion_rates() no longer returns NA values if a small subset of pbp data is missing series on offense or defense. (#417)
- fixed_drive now correctly increments on plays where posteam lost a fumble but remains posteam because defteam also lost a fumble during the same play. (#419)
- nflfastR now fixes missing drive number counts in raw pbp data in order to provide accurate drive information. (#420)
- nflfastR now returns correct kick_distance on all punts and kickoffs. (#422)
- Decode player IDs in 2023 pbp. (#425)
- Drop the pseudo plays TV Timeout and Two-Minute Warning. (#426)
- Fix posteam on kickoffs and PATs following a defensive TD in 2023+ pbp. (#427)
- calculate_player_stats() no longer counts lost fumbles on plays where a player fumbles, a teammate recovers and then loses a fumble to the defense. (#431)
- The variables passer, receiver, and rusher no longer return NA on "abnormal" plays - like direct snaps, aborted snaps, laterals etc. - that resulted in a penalty. (#435)
Thank you to @903124, @ak47twq, @andrewtek, @darkhark, @dennisbrookner, @marvin3FF, @mistakia, @mrcaseb, @nicholasmendoza22, @rickstarblazer, @RileyJohnson22, and @tanho63 for their questions, feedback, and contributions towards this release.
nflfastR 4.5.1
- New implementation of tests to be able to identify breaking changes in reverse dependencies (#396, #406)
- calculate_standings() no longer freezes when computing standings from schedules where some games are missing results, i.e. upcoming games.
- Fixed a bug that caused problems with upcoming dplyr and tidyselect updates that weren't backwards compatible.
- Significant performance improvements of internal functions. (#402)
- Wrap examples in try() to avoid CRAN problems. (#404)
- Fixed a bug where calculate_standings() wasn't able to handle nflverse pbp data. (#404)
nflfastR 4.5.0
New (experimental) functions
- Added new function calculate_player_stats_def() that aggregates defensive player stats either at game level or overall. (#288)
- Added the situation report nflverse_sitrep, which is an alias of the already available report()
- Added new function calculate_player_stats_kicking() that aggregates player stats for field goals and extra points at game level or overall. (#381)
- Added new function calculate_series_conversion_rates() that computes series conversion and series result rates at a game level or season level. (#393)
Bugfixes and Minor Improvements
- Internal change to calculate_player_stats() that reflects new nflverse data infrastructure.
- calculate_player_stats() now unifies player names and joins the following player information via nflreadr::load_players():
  - player_display_name - Full name of the player
  - position - Position of the player
  - position_group - Position group of the player
  - headshot_url - URL to a player headshot image
- Make data work in 2022 (hopefully)
- Fix Amon-Ra St. Brown breaking the name parser
- Add gsis_id patch to clean_pbp().
- Fixed a bug where calculate_player_stats_def() failed in situations where play-by-play data is missing certain stats. (#382)
- Spot-fixing calculate_player_stats() for NA names.
nflfastR 4.4.0
New Functions, Options, Data
- Added new function calculate_standings() that computes regular season division standings and playoff seeds from nflverse data.
- The database function update_db() now supports the option "nflfastR.dbdirectory" which can be used to set the directory of the nflfastR pbp database globally and independent of any project structure or working directories.
- The embedded data frame ?teams_colors_logos has been updated to reflect the most recent team color themes and gained additional variables for conference and division as well as logo urls to the conference and league logos. (#290)
- The embedded data frame ?teams_colors_logos has been updated with the Washington Commanders. (#312)
Deprecation
- The argument qs in the functions load_pbp() and load_player_stats() has been deprecated as of nflfastR 4.3.0. This release removes the argument entirely.
Bugfixes and Minor Improvements
- Fixed bug where a player could be duplicated in calculate_player_stats() in very rare cases caused by plays with laterals. (#289)
- Fixed a bug where the function add_xpass() failed when called with an empty data frame. (#296)
- Fixed a bug where play_type showed no_play on plays with penalties that don't result in a replay of the down. (#277, #281)
- Fixed a bug in the variable descriptions of total_home_score and total_away_score. (#300)
- fast_scraper_rosters() and fast_scraper_schedules() now call nflreadr::load_rosters() and nflreadr::load_schedules() under the hood (#304)
- Fixed a bug causing missing EPA on game-ending turnovers in overtime
- Bump minimum nflreadr version to 1.2.0 for data repository change
- Fix a bug affecting yardline for a very small number of plays in the 2000 season (#323)
- update_db() now uses a default play to predefine column types for all db drivers. (#324)
- Fix a bug that resulted in incorrect xyac_mean_yardage on 4th downs (#327)
- Fix a bug that resulted in missing xyac information for plays involving J.O'Shaughnessy (#329)
- Fix a bug that resulted in missing epa on the last play of some games involving NE and BUF (#331)
- fast_scraper() and build_nflfastR_pbp() now return data frames of class nflverse_data to be consistent with nflreadr.
- Fix behavior of EP model in neutral site games (treats both teams as away teams)
nflfastR 4.3.0
Minor Changes
- Add nflreadr to dependencies and drop lubridate and magrittr dependency
- The functions load_pbp() and load_player_stats() now call nflreadr::load_pbp() and nflreadr::load_player_stats() respectively. Therefore the argument qs has been deprecated in both functions. It will be removed in a future release. Running load_player_stats() without any argument will now return player stats of the current season only (the default in nflreadr).
- The deprecated arguments source and pp in the functions fast_scraper_*() and build_nflfastR_pbp() have been removed
- Added the variables racr ("Receiver Air Conversion Ratio"), target_share, air_yards_share, wopr ("Weighted Opportunity Rating") and pacr ("Passing Air Conversion Ratio") to the output of calculate_player_stats()
- Added the function report() which will be used by the maintainers to help users debug their problems (#274).
Bug Fixes
nflfastR 4.2.0
- All wpa variables are NA on end game line
- All wp variables are 0, 0.5, 1, or NA on end game line
- Fix bug where win prob on PATs assumed a PAT placed at 15 yard line, even in older seasons
- The function decode_player_ids() now really decodes the new variable fantasy_id (#229)
- Fixed a bug that caused slightly differing wp values depending on the first game in the data set (#183)
- Edited GitHub references to point to nflverse
- Added the variables sack_yards, sack_fumbles, rushing_fumbles and receiving_fumbles to the output of the function calculate_player_stats(), thanks to Mike Filicicchia (@TheMathNinja). (#239)
- Fixed a bug where calculate_player_stats() falsely counted lost fumbles on aborted snaps (#238)
- Added the variable season_type to the output of calculate_player_stats() and load_player_stats() in preparation of the extended Regular Season starting in 2021 (#240)
- Updated season_type definitions in preparation of the extended Regular Season starting in 2021 (#242)
- Fix for fixed_drive where it wasn't incrementing when there was a muffed punt followed by timeout (#244)
- Fix for fixed_drive where it wasn't incrementing following an interception with the intercepting player then losing a fumble (#247)
- Fix for more issues with missing play info in 2018_01_ATL_PHI (#246)
- Added the variables safety_player_name and safety_player_id to the play-by-play data (#252)
- Dropped the dependency usethis
nflfastR 4.1.0
Breaking changes
Functions
- Added the function calculate_player_stats() that aggregates official passing, rushing, and receiving stats either at game level or overall
- Added the function load_player_stats() that loads weekly player stats from 1999 to the most recent season
- The performance of the functions add_xyac() and clean_pbp() has been significantly improved
New Variables
- Added the new columns td_player_name and td_player_id to clearly identify the player who scored a touchdown (this is especially helpful for plays with multiple fumbles or laterals resulting in a touchdown)
- The function calculate_player_stats() now adds the variable dakota, the epa + cpoe composite, for players with minimum 5 pass attempts.
- Added column home_opening_kickoff to clean_pbp()
- Added the variables sack_player_id, sack_player_name, half_sack_1_player_id, half_sack_1_player_name, half_sack_2_player_id and half_sack_2_player_name which identify players that recorded sacks (or half sacks). Also updated the description of the variables qb_hit_1_player_id, qb_hit_1_player_name, qb_hit_2_player_id and qb_hit_2_player_name to make more clear that they did not record a sack. (#180)
Minor improvements and fixes
- The variable qb_scramble was incomplete for the 2005 season because of missing scramble indicators in the play description. This has been mostly fixed courtesy of charting data from Football Outsiders (with thanks to Aaron Schatz!). Some notes on this fix: Weeks 1-16 are based on charting. Weeks 17-21 are guesses (basically every QB run except those that were a) a loss, b) no gain, or c) on 3/4 down with 1-2 to go). Plays nullified by penalty are not included.
- Change name, id, rusher, and rusher_id to be the player charged with the fumble on aborted snaps when the QB is unable to make a play (i.e. pass, sack, or scramble) (#162)
- The function clean_pbp() now standardizes the team name columns tackle_with_assist_*_team
- Fix bug in drive that was causing incorrect overtime win probabilities (#194)
- Fixed a bug where posteam was not NA on end of quarter 2 (or end of quarter 4 in overtime games) causing wrong values for fixed_drive, fixed_drive_result, series and series_result
- Fixed a bug where fixed_drive and series were falsely incrementing on kickoffs recovered by the kicking team or on defensive touchdowns followed by timeouts
- Fixed a bug where fixed_drive and series were falsely incrementing on muffed punts recovered by the punting team for a touchdown
- Fixed a bug where add_xpass() crashed when run with data already including xpass variables.
- Fixed a bug in epa when a safety is scored by the team beginning the play in possession of the ball (#186)
- Fix some bugs related to David and Duke Johnson on the Texans in 2020 (#163)
- Fix yet another bug related to correctly identifying possession team on kickoffs nullified by penalty (#199)
- Fixed a bug where calculate_player_stats() forgot to clean player names by using their IDs
- Fixed a bug where special teams touchdowns were missing in the output of calculate_player_stats() (#203)
- Fix for some old Jaguars games where the wrong team was awarded points for safeties and kickoff return TDs (#209)
- The function update_db() no longer falsely closes a database connection provided by the argument db_connection (#210)
- Fixed a bug where yards_gained was missing yardage on plays with laterals. (#216)
- Fixed a bug where there were stats wrongly given on a play with penalty (#218)
- fixed_drive now increments properly on onside kick recoveries (#215)
- fixed_drive no longer counts a muffed kickoff as a one-play drive on its own (#217)
- fixed_drive now properly increments after a safety (#219)
- Improved parser for penalty_type and updated the description of the variable to make more clear it's the first penalty that happened on a play. (#223)
nflfastR 4.0.0
Breaking changes
Changed Functions
- Deprecated the arguments source and pp all across the package. Using them will cause a warning. Parallel processing has to be activated by choosing an appropriate future::plan() before calling the relevant functions. For more information please see the package documentation.
- The function build_nflfastR_pbp() will now run decode_player_ids() by default (can be deactivated with the argument decode = FALSE).
- The function build_nflfastR_pbp() will now run add_xpass() by default and add the new variables xpass and pass_oe.
- The functions fast_scraper() and build_nflfastR_pbp() now allow the output of fast_scraper_schedules() directly as input so it's not necessary anymore to pull the game_id first.
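The changed calling convention described above can be sketched roughly as follows. The snippet picks a parallel plan with future::plan() and passes a slice of the schedule data frame straight to build_nflfastR_pbp(). The "multisession" strategy, the 2020 season, and the four-game subset are illustrative assumptions, not recommendations from these notes. The calls are wrapped in try() because they need the packages and an internet connection.

```r
# Sketch only: requires nflfastR (>= 4.0.0), future, and an internet connection.
n_games <- 4  # hypothetical subset size to keep the example quick

# Activate parallel processing; plan() returns the previously active plan
old_plan <- try(future::plan("multisession"), silent = TRUE)

# The schedule data frame can be used as input directly,
# so there is no need to extract the game_id column first
games <- try(nflfastR::fast_scraper_schedules(2020), silent = TRUE)

if (!inherits(games, "try-error")) {
  pbp <- try(
    nflfastR::build_nflfastR_pbp(head(games, n_games), decode = TRUE),
    silent = TRUE
  )
}

# Restore whatever plan was active before
if (!inherits(old_plan, "try-error")) {
  future::plan(old_plan)
}
```

Setting decode = FALSE would skip the decode_player_ids() step that now runs by default.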
New Functions and Variables
- Added the new function load_pbp() that loads complete seasons into memory for fast access of the play-by-play data.
- Added the new variables rushing_yards, lateral_rushing_yards, passing_yards, receiving_yards, lateral_receiving_yards to fix an old bug where yards_gained gets overwritten on plays with laterals (#115).
- Added columns vegas_wpa and vegas_home_wpa which contain Win Probability Added from the spread-adjusted WP model
- Added column out_of_bounds
- Added columns fantasy, fantasy_id, fantasy_player_name, and fantasy_player_id that indicate the rusher or receiver on the play
- Added columns tackle_with_assist, tackle_with_assist_1_player_id, tackle_with_assist_1_player_name, tackle_with_assist_1_team, tackle_with_assist_2_player_id, tackle_with_assist_2_player_name, tackle_with_assist_2_team
Models and Miscellaneous
- Tuned spread-adjusted win probability model one final (?) time. Expected points is now no longer required for calculate_win_probability()
- Added field descriptions vignette("field_descriptions") with a searchable list of all nflfastR variables
- Switched data source for 2001-2010 to what is used for 2011 and on
- All models have been moved to the fastrmodels package
- Added the data frames ?field_descriptions and ?stat_ids to the package
Minor improvements and fixes
- Fix bug where fixed_drive and series weren't updating after muffed punt (#144)
- Fix bug induced by fixing the above (#149)
- Fix bug where some special teams plays were incorrectly being labeled as pass plays (#125)
- Fix bug where points for safeties were given to the defteam instead of the posteam (#152)
- Fix bug where a muffed punt TD was given to the wrong team in a 2011 Jaguars game (#154)
- Win probability is now calculated prior to PAT attempts rather than using WP on the ensuing kickoff
- Improved performance of internal functions that speed up the rebuilding process in update_db() (added qs and curl to dependencies)
- Fixed a bug where calculate_expected_points() and calculate_win_probability() duplicated some existing variables instead of replacing them (#170)
- Fixed a bug where penalty_type wasn't "no_play" although it should have been (#172)
- Fixed a bug where penalty_team could be incorrect in games of the Jaguars in the seasons 2011 - 2015 (#174)
- Fixed a bug related to the calculation of epa on plays before a failed pass interference challenge in a few 2019 games (#175)
- Fixed a bug related to lots of fields with NA on offsetting penalties (#44)
- Fixed a bug in epa when possession team changes at end of 1st or 3rd quarter (#182)
- Fixed a bug where various functions have left open connections
- vegas_wp is now NA on final line since there is no possession team