Title: | Functions to Efficiently Access NFL Play by Play Data |
---|---|
Description: | A set of functions to access National Football League play-by-play data from <https://www.nfl.com/>. |
Authors: | Sebastian Carl [aut], Ben Baldwin [cre, aut], Lee Sharpe [ctb], Maksim Horowitz [ctb], Ron Yurko [ctb], Samuel Ventura [ctb], Tan Ho [ctb], John Edwards [ctb] |
Maintainer: | Ben Baldwin <[email protected]> |
License: | MIT + file LICENSE |
Version: | 5.0.0.9000 |
Built: | 2024-12-03 20:29:32 UTC |
Source: | https://github.com/nflverse/nflfastR |
A set of functions to access National Football League play-by-play data from https://www.nfl.com/.
Prior to nflfastR v4.0, parallel processing could be activated with an argument pp in the relevant functions and progress updates were always shown. Both of these methods are bad practice and were therefore removed in nflfastR v4.0.
The next sections describe how to make nflfastR work in parallel processes and show progress updates if the user wants to.
Nearly all nflfastR functions support parallel processing using furrr::future_map() if it is enabled by a call to future::plan() prior to the function call. Please see the documentation of the functions for detailed information.
As an example, the following code block will resolve all function calls in the current session using multiple sessions in the background and load play-by-play data for the 2018 through 2020 seasons or build them freshly for the 2018 and 2019 Super Bowls:
future::plan("multisession")
load_pbp(2018:2020)
build_nflfastR_pbp(c("2018_21_NE_LA", "2019_21_SF_KC"))
We recommend choosing a default parallel processing method and saving it as an environment variable in the R user profile to make sure all futures will be resolved with the chosen method by default. This can be done by following the steps below.
First, run the following line; the file .Renviron should open automatically. If you haven't saved any environment variables yet, this will be an empty file.
usethis::edit_r_environ()
In the opened file .Renviron, add the next line, then save the file and restart your R session. Please note that this example sets "multisession" as the default. For most users this should be the appropriate plan, but please make sure it truly is.
R_FUTURE_PLAN="multisession"
After the session is freshly restarted, please check if the above method worked by running the next line. If the output is FALSE, you successfully set up a default non-sequential future::plan(). If the output is TRUE, all functions will behave as if they were called with purrr::map() and NOT in multisession.
inherits(future::plan(), "sequential")
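As an additional check, it can help to confirm that the environment variable itself is visible to the R session. The following is a minimal base R sketch; future_plan_from_env() is a hypothetical helper written for this illustration and is not part of future or nflfastR.

```r
# Hypothetical helper: report the value of R_FUTURE_PLAN as seen by this session.
future_plan_from_env <- function() {
  value <- Sys.getenv("R_FUTURE_PLAN", unset = NA_character_)
  # Treat both an unset and an empty variable as "not set"
  if (is.na(value) || !nzchar(value)) "not set" else value
}

future_plan_from_env()
```

If this returns "not set" after a restart, the line was probably not saved in the .Renviron file that R actually reads.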
For more information on possible plans please see the future package Readme.
For more information on .Renviron please see this book chapter.
Most nflfastR functions are able to show progress updates using progressr::progressor() if they are turned on before the function is called. There are at least two basic ways to do this: either activate progress updates globally (for the current session) with progressr::handlers(global = TRUE), or pipe the function call into progressr::with_progress():
load_pbp(2018:2020) %>% progressr::with_progress()
Just like in the previous section, it is possible to activate global progression handlers by default. This can be done by following the steps below.
First, run the following line; the file .Rprofile should open automatically. If you haven't saved any code yet, this will be an empty file.
usethis::edit_r_profile()
In the opened file .Rprofile, add the next line, then save the file and restart your R session. All code in this file will be executed when a new R session starts. The part if (requireNamespace("progressr", quietly = TRUE)) makes sure this will only run if the package progressr is installed, to avoid crashing R sessions.
if (requireNamespace("progressr", quietly = TRUE)) progressr::handlers(global = TRUE)
After the session is freshly restarted, please check if the above method worked by running the next line. If the output is TRUE, you successfully activated global progression handlers for all sessions.
progressr::handlers(global = NA)
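Why the requireNamespace() guard in .Rprofile is safe can be sketched in base R: for a package that is not installed, the call returns FALSE instead of raising an error, so the guarded statement is simply skipped. The package name below is made up and assumed not to be installed, and the assignment inside the if block is only a stand-in for the real handler call.

```r
# "notInstalledPkgForDemo" is a made-up name, assumed not to be installed.
pkg <- "notInstalledPkgForDemo"

# Returns FALSE (no error) when the namespace cannot be loaded
available <- requireNamespace(pkg, quietly = TRUE)
available

handlers_enabled <- FALSE
if (available) {
  # stand-in for progressr::handlers(global = TRUE)
  handlers_enabled <- TRUE
}
handlers_enabled
```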
For more information on how to work with progress handlers please see progressr::progressr.
For more information on .Rprofile please see this book chapter.
Maintainer: Ben Baldwin [email protected]
Authors:
Sebastian Carl [email protected]
Other contributors:
Lee Sharpe [contributor]
Maksim Horowitz [email protected] [contributor]
Ron Yurko [email protected] [contributor]
Samuel Ventura [email protected] [contributor]
Tan Ho [contributor]
John Edwards [email protected] [contributor]
Useful links:
Report bugs at https://github.com/nflverse/nflfastR/issues
Compute QB epa
add_qb_epa(pbp, ...)
pbp | is a Data frame of play-by-play data scraped using |
... | Additional arguments passed to a message function (for internal use). |
Add the variable 'qb_epa', which gives the QB credit for EPA up to the point where a receiver lost a fumble after a completed catch. This makes EPA work more like passing yards on plays with fumbles.
Build columns from the expected dropback model. Will return NA on data prior to 2006 since that was before the NFL started marking scrambles. Must be run on a data frame that has already had clean_pbp() run on it. Note that the function build_nflfastR_pbp() and the database function update_db() already include this function.
add_xpass(pbp, ...)
pbp | is a Data frame of play-by-play data scraped using |
... | Additional arguments passed to a message function (for internal use). |
The input Data Frame of the parameter pbp with the following columns added:
Probability of dropback scaled from 0 to 1.
Dropback percent over expected on a given play scaled from 0 to 100.
Add expected yards after completion (xyac) variables
add_xyac(pbp, ...)
pbp | is a Data frame of play-by-play data scraped using |
... | Additional arguments passed to a message function (for internal use). |
Build columns that capture what we should expect after the catch.
The input Data Frame of the parameter 'pbp' with the following columns added:
Expected value of EPA gained after the catch, starting from where the catch was made. Zero yards after the catch would be listed as zero EPA.
Probability play earns positive EPA (relative to where play started) based on where ball was caught.
Probability play earns a first down based on where the ball was caught.
Average expected yards after the catch based on where the ball was caught.
Median expected yards after the catch based on where the ball was caught.
build_nflfastR_pbp is a convenient wrapper around 6 nflfastR functions:
Please see either the documentation of each function or the nflfastR Field Descriptions website to learn about the output.
build_nflfastR_pbp(
  game_ids,
  dir = getOption("nflfastR.raw_directory", default = NULL),
  ...,
  decode = TRUE,
  rules = TRUE
)
game_ids | Vector of character ids or a data frame including the variable |
dir | Path to local directory (defaults to option "nflfastR.raw_directory") where nflfastR searches for raw game play-by-play data. See |
... | Additional arguments passed to the scraping functions (for internal use) |
decode | If |
rules | If |
To load valid game_ids please use the package function fast_scraper_schedules().
An nflfastR play-by-play data frame like the ones that can be loaded from https://github.com/nflverse/nflverse-data.
For information on parallel processing and progress updates please see nflfastR.
# Build nflfastR pbp for the 2018 and 2019 Super Bowls
try({ # to avoid CRAN test problems
  build_nflfastR_pbp(c("2018_21_NE_LA", "2019_21_SF_KC"))
})

# It is also possible to directly use the
# output of `fast_scraper_schedules` as input
try({ # to avoid CRAN test problems
  library(dplyr, warn.conflicts = FALSE)
  fast_scraper_schedules(2020) %>%
    slice_tail(n = 3) %>%
    build_nflfastR_pbp()
})
Compute expected points for provided plays. Returns the data with probabilities of each scoring event and EP added. The following columns must be present: season, home_team, posteam, roof (coded as 'outdoors', 'dome', or 'open'/'closed'/NA (retractable)), half_seconds_remaining, yardline_100, down, ydstogo, posteam_timeouts_remaining, defteam_timeouts_remaining.
calculate_expected_points(pbp_data)
pbp_data | Play-by-play dataset to estimate expected points for. |
Computes expected points for provided plays. Returns the data with probabilities of each scoring event and EP added. The following columns must be present:
season
home_team
posteam
roof (coded as 'outdoors', 'dome', or 'open'/'closed'/NA (retractable))
half_seconds_remaining
yardline_100
down
ydstogo
posteam_timeouts_remaining
defteam_timeouts_remaining
The original pbp_data with the following columns appended to it:
expected points.
probability of no more scoring this half.
probability next score opponent field goal this half.
probability next score opponent safety this half.
probability of next score opponent touchdown this half.
probability next score field goal this half.
probability next score safety this half.
probability next score touchdown this half.
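As a rough sketch of how the appended probabilities relate to expected points, the value can be read as a probability-weighted sum over the possible next-score outcomes. The point values (7, 3, 2) and every column name other than td_prob and ep are assumptions made for this illustration; the text above does not state them, and this is not presented as the package's implementation.

```r
# Made-up probabilities for a single play (they sum to 1).
# Column names other than td_prob are assumed for illustration.
probs <- data.frame(
  no_score_prob   = 0.05,
  opp_fg_prob     = 0.10,
  opp_safety_prob = 0.01,
  opp_td_prob     = 0.15,
  fg_prob         = 0.24,
  safety_prob     = 0.00,
  td_prob         = 0.45
)

# Assumed point values: touchdown 7, field goal 3, safety 2.
# Opponent scores count negatively; "no score" contributes 0.
ep_sketch <- function(p) {
  3 * p$fg_prob + 2 * p$safety_prob + 7 * p$td_prob -
    3 * p$opp_fg_prob - 2 * p$opp_safety_prob - 7 * p$opp_td_prob
}

ep_value <- ep_sketch(probs)
ep_value
```

With these made-up inputs the sketch yields 2.5 expected points for the possession team.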
try({ # to avoid CRAN test problems
  library(dplyr)
  data <- tibble::tibble(
    "season" = 1999:2019,
    "home_team" = "SEA",
    "posteam" = "SEA",
    "roof" = "outdoors",
    "half_seconds_remaining" = 1800,
    "yardline_100" = c(rep(80, 17), rep(75, 4)),
    "down" = 1,
    "ydstogo" = 10,
    "posteam_timeouts_remaining" = 3,
    "defteam_timeouts_remaining" = 3
  )

  nflfastR::calculate_expected_points(data) %>%
    dplyr::select(season, yardline_100, td_prob, ep)
})
A "Series" begins on a 1st and 10 and each team attempts to either earn a new 1st down (on offense) or prevent the offense from converting a new 1st down (on defense). Series conversion rate represents how many series have been either converted to a new 1st down or ended in a touchdown. This function computes series conversion rates on offense and defense from nflverse play-by-play data along with other series results. The function automatically removes series that ended in a QB kneel down.
calculate_series_conversion_rates(pbp, weekly = FALSE)
pbp | Play-by-play data as returned by |
weekly | If |
A data frame of series information including the following columns:
The NFL season
NFL team abbreviation
Week if weekly is TRUE
The number of series the offense played (excludes QB kneel downs, kickoffs, extra point/two point conversion attempts, non-plays, and plays that do not list a "posteam")
The rate at which a series ended in either new 1st down or touchdown while the offense was on the field
The rate at which an offense earned a 1st down or scored a touchdown on 1st down
The rate at which an offense earned a 1st down or scored a touchdown on 2nd down
The rate at which an offense earned a 1st down or scored a touchdown on 3rd down
The rate at which an offense earned a 1st down or scored a touchdown on 4th down
The rate of series that ended in a new 1st down while the offense was on the field (does not include offensive touchdown)
The rate of series that ended in an offensive touchdown while the offense was on the field
The rate of series that ended in a field goal attempt while the offense was on the field
The rate of series that ended in a punt while the offense was on the field
The rate of series that ended in a turnover (including on downs), in an opponent score, or at the end of half (or game) while the offense was on the field
The number of series the defense played (excludes QB kneel downs, kickoffs, extra point/two point conversion attempts, non-plays, and plays that do not list a "posteam")
The rate at which a series ended in either new 1st down or touchdown while the defense was on the field
The rate at which a defense allowed a 1st down or touchdown on 1st down
The rate at which a defense allowed a 1st down or touchdown on 2nd down
The rate at which a defense allowed a 1st down or touchdown on 3rd down
The rate at which a defense allowed a 1st down or touchdown on 4th down
The rate of series that ended in a new 1st down while the defense was on the field (does not include offensive touchdown)
The rate of series that ended in an offensive touchdown while the defense was on the field
The rate of series that ended in a field goal attempt while the defense was on the field
The rate of series that ended in a punt while the defense was on the field
The rate of series that ended in a turnover (including on downs), in an opponent score, or at the end of half (or game) while the defense was on the field
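The idea behind the conversion rate can be sketched on a toy table with one row per series. The column names (posteam, series_result) and the result labels are hypothetical and chosen only for this illustration; the sketch drops kneel-down series and then takes the share of series that ended in a new 1st down or a touchdown.

```r
# Toy data: one row per offensive series (labels are made up for illustration)
series <- data.frame(
  posteam = c("KC", "KC", "KC", "KC", "KC", "SF", "SF", "SF", "SF", "SF"),
  series_result = c(
    "First down", "Touchdown", "Punt", "First down", "QB kneel",
    "Punt", "First down", "Field goal", "Turnover", "Touchdown"
  ),
  stringsAsFactors = FALSE
)

# Remove series that ended in a QB kneel down
series <- series[series$series_result != "QB kneel", ]

# A series counts as converted if it ended in a new 1st down or a touchdown
series$converted <- series$series_result %in% c("First down", "Touchdown")

# Share of converted series per team
rates <- aggregate(converted ~ posteam, data = series, FUN = mean)
names(rates)[2] <- "conversion_rate"
rates
```

Here KC converts 3 of 4 counted series (0.75) and SF converts 2 of 5 (0.4).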
try({ # to avoid CRAN test problems
  pbp <- nflfastR::load_pbp(2021)

  weekly <- calculate_series_conversion_rates(pbp, weekly = TRUE)
  dplyr::glimpse(weekly)

  overall <- calculate_series_conversion_rates(pbp, weekly = FALSE)
  dplyr::glimpse(overall)
})
This function calculates division standings as well as playoff seeds per conference based on either nflverse play-by-play data or nflverse schedule data.
calculate_standings(
  nflverse_object,
  tiebreaker_depth = 3,
  playoff_seeds = NULL
)
nflverse_object | Data object of class |
tiebreaker_depth | A single value equal to 1, 2, or 3. The default is 3. The value controls the depth of tiebreakers that shall be applied. The deepest currently implemented tiebreaker is strength of schedule. The following values are valid: |
playoff_seeds | Number of playoff teams per conference. If |
A tibble with NFL regular season standings
try({ # to avoid CRAN test problems
  # load nflverse data both schedules and pbp
  scheds <- fast_scraper_schedules(2014)
  pbp <- load_pbp(c(2018, 2021))

  # calculate standings based on pbp
  calculate_standings(pbp)

  # calculate standings based on schedules
  calculate_standings(scheds)
})
Compute various NFL stats based off nflverse Play-by-Play data.
calculate_stats(
  seasons = nflreadr::most_recent_season(),
  summary_level = c("season", "week"),
  stat_type = c("player", "team"),
  season_type = c("REG", "POST", "REG+POST")
)
seasons | A numeric vector of 4-digit years associated with given NFL seasons - defaults to latest season. If set to TRUE, returns all available data since 1999. |
summary_level | Summarize stats by |
stat_type | Calculate |
season_type | One of |
A tibble of player/team stats summarized by season/week.
nfl_stats_variables for a description of all variables.
https://www.nflfastr.com/articles/stats_variables.html for a searchable table of the stats variable descriptions.
try({ # to avoid CRAN test problems
  stats <- calculate_stats(2023, "season", "player")
  dplyr::glimpse(stats)
})
Compute win probability for provided plays. Returns the data with probabilities of winning the game. The following columns must be present: receive_2h_ko (1 if game is in 1st half and possession team will receive 2nd half kickoff, 0 otherwise), score_differential, home_team, posteam, half_seconds_remaining, game_seconds_remaining, spread_line (how many points home team was favored by), down, ydstogo, yardline_100, posteam_timeouts_remaining, defteam_timeouts_remaining.
calculate_win_probability(pbp_data)
pbp_data | Play-by-play dataset to estimate win probability for. |
Computes win probability for provided plays. Returns the data with spread and non-spread-adjusted win probabilities. The following columns must be present:
receive_2h_ko (1 if game is in 1st half and possession team will receive 2nd half kickoff, 0 otherwise)
score_differential
home_team
posteam
half_seconds_remaining
game_seconds_remaining
spread_line (how many points home team was favored by)
down
ydstogo
yardline_100
posteam_timeouts_remaining
defteam_timeouts_remaining
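Because all of the columns listed above must be present, it can be useful to check a data frame before passing it on. The following base R sketch uses a hypothetical helper, missing_columns(), which is not part of nflfastR; it only compares the required names against the names of the input.

```r
# Required input columns, as listed above
wp_required <- c(
  "receive_2h_ko", "score_differential", "home_team", "posteam",
  "half_seconds_remaining", "game_seconds_remaining", "spread_line",
  "down", "ydstogo", "yardline_100",
  "posteam_timeouts_remaining", "defteam_timeouts_remaining"
)

# Hypothetical helper: names in `required` that are absent from `data`
missing_columns <- function(data, required) {
  setdiff(required, names(data))
}

# Deliberately incomplete toy input
plays <- data.frame(
  receive_2h_ko = 0, home_team = "SEA", posteam = "SEA",
  down = 1, ydstogo = 10
)

still_missing <- missing_columns(plays, wp_required)
still_missing
```

An empty result means every required column is present; the toy input above is still missing seven of them.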
The original pbp_data with the following columns appended to it:
win probability.
win probability taking into account pre-game spread.
try({ # to avoid CRAN test problems
  library(dplyr)
  data <- tibble::tibble(
    "receive_2h_ko" = 0,
    "home_team" = "SEA",
    "posteam" = "SEA",
    "score_differential" = 0,
    "half_seconds_remaining" = 1800,
    "game_seconds_remaining" = 3600,
    "spread_line" = c(1, 3, 4, 7, 14),
    "down" = 1,
    "ydstogo" = 10,
    "yardline_100" = 75,
    "posteam_timeouts_remaining" = 3,
    "defteam_timeouts_remaining" = 3
  )

  nflfastR::calculate_win_probability(data) %>%
    dplyr::select(spread_line, wp, vegas_wp)
})
Clean Play by Play Data
clean_pbp(pbp, ...)
pbp | is a Data frame of play-by-play data scraped using |
... | Additional arguments passed to a message function (for internal use). |
Build columns that capture what happens on all plays, including penalties, using string extraction from play description. Loosely based on Ben's nflfastR guide (https://www.nflfastr.com/articles/beginners_guide.html) but updated to work with the RS data, which has a different player format in the play description; e.g. 24-M.Lynch instead of M.Lynch. The function also standardizes team abbreviations so that, for example, the Chargers are always represented by 'LAC' regardless of which year it was. Starting in 2022, play-by-play data was missing gsis player IDs of rookies. This function tries to fix as many as possible.
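The two player formats mentioned above (24-M.Lynch and M.Lynch) can be handled with one regular expression that treats the jersey prefix as optional. This is a minimal base R sketch on made-up play descriptions; the pattern is an assumption for illustration and is not the expression used inside clean_pbp().

```r
# Made-up play descriptions in both formats
desc <- c(
  "(12:40) 24-M.Lynch left tackle for 3 yards.",
  "(12:40) M.Lynch left tackle for 3 yards."
)

# Optional "<jersey>-" prefix, then initial, dot, and last name
token_pattern <- "([0-9]+-)?[A-Z]\\.[A-Z][A-Za-z]+"
tokens <- regmatches(desc, regexpr(token_pattern, desc))

# Split each token into jersey number (if present) and name
has_jersey <- grepl("^[0-9]+-", tokens)
player_name <- sub("^[0-9]+-", "", tokens)
jersey_number <- as.integer(ifelse(has_jersey, sub("-.*$", "", tokens), NA))

data.frame(player_name, jersey_number)
```

Both descriptions give the name M.Lynch; only the first one carries a jersey number (24), the second one is NA.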
The input Data Frame of the parameter 'pbp' with the following columns added:
Binary indicator whether epa > 0 in the given play.
Name of the dropback player (scrambles included) including plays with penalties.
Jersey number of the passer.
Name of the rusher (no scrambles) including plays with penalties.
Jersey number of the rusher.
Name of the receiver including plays with penalties.
Jersey number of the receiver.
Binary indicator if the play was a pass play (sacks and scrambles included).
Binary indicator if the play was a rushing play.
Binary indicator if the play was a special teams play.
Binary indicator if the play ended in a first down.
Binary indicator if the play description indicates "Aborted".
Binary indicator: 1 if the play was a 'normal' play (including penalties), 0 otherwise.
ID of the player in the 'passer' column.
ID of the player in the 'rusher' column.
ID of the player in the 'receiver' column.
Name of the 'passer' if it is not 'NA', or name of the 'rusher' otherwise.
Name of the rusher on rush plays or receiver on pass plays.
ID of the rusher on rush plays or receiver on pass plays.
Name of the rusher on rush plays or receiver on pass plays (from official stats).
ID of the rusher on rush plays or receiver on pass plays (from official stats).
Jersey number of the player listed in the 'name' column.
ID of the player in the 'name' column.
= 1 if play description contains "ran ob", "pushed ob", or "sacked ob"; = 0 otherwise.
= 1 if the home team received the opening kickoff, 0 otherwise.
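Indicators that depend on phrases in the play description, such as the out-of-bounds flag described above, can be sketched with grepl(). The play descriptions below are made up, and the code only illustrates the rule as stated in the text rather than the package's internals.

```r
# Made-up play descriptions
desc <- c(
  "M.Lynch right end ran ob at SEA 45 for 7 yards.",
  "R.Wilson sacked at SEA 30 for -6 yards.",
  "D.Metcalf pushed ob at SF 20 for 15 yards."
)

# 1 if the description contains "ran ob", "pushed ob", or "sacked ob"; 0 otherwise
out_of_bounds <- as.integer(grepl("ran ob|pushed ob|sacked ob", desc))
out_of_bounds
```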
For information on parallel processing and progress updates please see nflfastR.
Takes all columns ending with 'player_id' as well as the variables 'passer_id', 'rusher_id', 'fantasy_id', 'receiver_id', and 'id' of an nflfastR play-by-play data set and decodes the player IDs to the commonly known GSIS ID format 00-00xxxxx.
The function uses by default the highly efficient decode_ids of the package gsisdecoder. In the unlikely event that there is a problem with this function, an nflfastR internal decoder can be used with the option fast = FALSE.
The 2022 play by play data introduced new player IDs that can't be decoded with gsisdecoder. In that case, IDs are joined through nflreadr::load_players.
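The encoded IDs shown in the examples for this function suggest a simple layout: once the dashes are removed, hex digits 5 to 24 are the ASCII codes of the ten characters of the GSIS ID. The base R sketch below relies on that observation only; decode_id_sketch() is a hypothetical helper and is not the decoder used by gsisdecoder or nflfastR.

```r
# Hypothetical helper for illustration; not the decoder used by the packages.
decode_id_sketch <- function(id) {
  hex <- gsub("-", "", id, fixed = TRUE)  # drop the dashes
  hex <- substr(hex, 5, 24)               # skip 4 hex digits, keep 10 bytes
  starts <- seq(1, nchar(hex) - 1, by = 2)
  # Convert each pair of hex digits to its byte value, then to characters
  bytes <- strtoi(substring(hex, starts, starts + 1), base = 16L)
  rawToChar(as.raw(bytes))
}

decode_id_sketch("32013030-2d30-3033-3338-3733fa30c4fa")
# returns "00-0033873"
```

The result matches the already decoded ID that the examples list for the same player name.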
decode_player_ids(pbp, ..., fast = TRUE)
pbp | is a Data frame of play-by-play data scraped using |
... | Additional arguments passed to a message function (for internal use). |
fast | If |
The input data frame of the parameter pbp with decoded player IDs.
# Decode data frame consisting of some names and ids
decode_player_ids(data.frame(
  name = c("P.Mahomes", "B.Baldwin", "P.Mahomes", "S.Carl", "J.Jones"),
  id = c(
    "32013030-2d30-3033-3338-3733fa30c4fa",
    NA_character_,
    "00-0033873",
    NA_character_,
    "32013030-2d30-3032-3739-3434d4d3846d"
  )
))
Load and parse NFL play-by-play data and add all of the original nflfastR variables. As nflfastR now provides multiple functions which add information to the output of this function, it is recommended to use build_nflfastR_pbp instead.
fast_scraper(
  game_ids,
  dir = getOption("nflfastR.raw_directory", default = NULL),
  ...,
  in_builder = FALSE
)
game_ids | Vector of character ids or a data frame including the variable |
dir | Path to local directory (defaults to option "nflfastR.raw_directory") where nflfastR searches for raw game play-by-play data. See |
... | Additional arguments passed to the scraping functions (for internal use) |
in_builder | If |
To load valid game_ids please use the package function fast_scraper_schedules (the function can directly handle the output of that function).
Data frame where each individual row represents a single play for all passed game_ids containing the following detailed information (description partly extracted from nflscrapR):
Numeric play id that when used with game_id and drive provides the unique identifier for a single play.
Ten digit identifier for NFL game.
Legacy NFL game ID.
String abbreviation for the home team.
String abbreviation for the away team.
'REG' or 'POST' indicating if the game belongs to regular or post season.
Season week.
String abbreviation for the team with possession.
String indicating whether the posteam team is home or away.
String abbreviation for the team on defense.
String abbreviation for which team's side of the field the team with possession is currently on.
Numeric distance in the number of yards from the opponent's endzone for the posteam.
Date of the game.
Numeric seconds remaining in the quarter.
Numeric seconds remaining in the half.
Numeric seconds remaining in the game.
String indicating which half the play is in, either Half1, Half2, or Overtime.
Binary indicator for whether or not the row of the data is marking the end of a quarter.
Numeric drive number in the game.
Binary indicator for whether or not a score occurred on the play.
Quarter of the game (5 is overtime).
The down for the given play.
Binary indicator for whether or not the posteam is in a goal down situation.
Time at start of play provided in string format as minutes:seconds remaining in the quarter.
String indicating the current field position for a given play.
Numeric yards in distance from either the first down marker or the endzone in goal down situations.
Numeric value for total yards gained on the given drive.
Detailed string description for the given play.
String indicating the type of play: pass (includes sacks), run (includes scrambles), punt, field_goal, kickoff, extra_point, qb_kneel, qb_spike, no_play (timeouts and penalties), and missing for rows indicating end of play.
Numeric yards gained (or lost) by the possessing team, excluding yards gained via fumble recoveries and laterals.
Binary indicator for whether or not the play was in shotgun formation.
Binary indicator for whether or not the play was in no_huddle formation.
Binary indicator for whether or not the QB dropped back on the play (pass attempt, sack, or scrambled).
Binary indicator for whether or not the QB took a knee.
Binary indicator for whether or not the QB spiked the ball.
Binary indicator for whether or not the QB scrambled.
String indicator for pass length: short or deep.
String indicator for pass location: left, middle, or right.
Numeric value for distance in yards perpendicular to the line of scrimmage at where the targeted receiver either caught or didn't catch the ball.
Numeric value for distance in yards perpendicular to the yard line where the receiver made the reception to where the play ended.
String indicator for location of run: left, middle, or right.
String indicator for line gap of run: end, guard, or tackle.
String indicator for result of field goal attempt: made, missed, or blocked.
Numeric distance in yards for kickoffs, field goals, and punts.
String indicator for the result of the extra point attempt: good, failed, blocked, safety (touchback in defensive endzone is 1 point apparently), or aborted.
String indicator for result of two point conversion attempt: success, failure, safety (touchback in defensive endzone is 1 point apparently), or return.
Numeric timeouts remaining in the half for the home team.
Numeric timeouts remaining in the half for the away team.
Binary indicator for whether or not a timeout was called by either team.
String abbreviation for which team called the timeout.
String abbreviation for which team scored the touchdown.
String name of the player who scored a touchdown.
Unique identifier of the player who scored a touchdown.
Number of timeouts remaining for the possession team.
Number of timeouts remaining for the team on defense.
Score for the home team at the end of the play.
Score for the away team at the end of the play.
Score for the posteam at the start of the play.
Score for the defteam at the start of the play.
Score differential between the posteam and defteam at the start of the play.
Score for the posteam at the end of the play.
Score for the defteam at the end of the play.
Score differential between the posteam and defteam at the end of the play.
Predicted probability of no score occurring for the rest of the half based on the expected points model.
Predicted probability of the defteam scoring a FG next.
Predicted probability of the defteam scoring a safety next.
Predicted probability of the defteam scoring a TD next.
Predicted probability of the posteam scoring a FG next.
Predicted probability of the posteam scoring a safety next.
Predicted probability of the posteam scoring a TD next.
Predicted probability of the posteam scoring an extra point.
Predicted probability of the posteam scoring the two point conversion.
Using the scoring event probabilities, the estimated expected points with respect to the possession team for the given play.
Expected points added (EPA) by the posteam for the given play.
Cumulative total EPA for the home team in the game so far.
Cumulative total EPA for the away team in the game so far.
Cumulative total rushing EPA for the home team in the game so far.
Cumulative total rushing EPA for the away team in the game so far.
Cumulative total passing EPA for the home team in the game so far.
Cumulative total passing EPA for the away team in the game so far.
EPA from the air yards alone. For completions this represents the actual value provided through the air. For incompletions this represents the hypothetical value that could've been added through the air if the pass was completed.
EPA from the yards after catch alone. For completions this represents the actual value provided after the catch. For incompletions this represents the difference between the hypothetical air_epa and the play's raw observed EPA (how much the incomplete pass cost the posteam).
EPA from the air yards alone only for completions.
EPA from the yards after catch alone only for completions.
Cumulative total completions air EPA for the home team in the game so far.
Cumulative total completions air EPA for the away team in the game so far.
Cumulative total completions yac EPA for the home team in the game so far.
Cumulative total completions yac EPA for the away team in the game so far.
Cumulative total raw air EPA for the home team in the game so far.
Cumulative total raw air EPA for the away team in the game so far.
Cumulative total raw yac EPA for the home team in the game so far.
Cumulative total raw yac EPA for the away team in the game so far.
Estimated win probability for the posteam given the current situation at the start of the given play.
Estimated win probability for the defteam.
Estimated win probability for the home team.
Estimated win probability for the away team.
Win probability added (WPA) for the posteam.
Win probability added (WPA) for the posteam: spread_adjusted model.
Win probability added (WPA) for the home team: spread_adjusted model.
Estimated win probability for the home team at the end of the play.
Estimated win probability for the away team at the end of the play.
Estimated win probability for the posteam given the current situation at the start of the given play, incorporating pre-game Vegas line.
Estimated win probability for the home team incorporating pre-game Vegas line.
Cumulative total rushing WPA for the home team in the game so far.
Cumulative total rushing WPA for the away team in the game so far.
Cumulative total passing WPA for the home team in the game so far.
Cumulative total passing WPA for the away team in the game so far.
WPA through the air (same logic as air_epa).
WPA from yards after the catch (same logic as yac_epa).
The air_wpa for completions only.
The yac_wpa for completions only.
Cumulative total completions air WPA for the home team in the game so far.
Cumulative total completions air WPA for the away team in the game so far.
Cumulative total completions yac WPA for the home team in the game so far.
Cumulative total completions yac WPA for the away team in the game so far.
Cumulative total raw air WPA for the home team in the game so far.
Cumulative total raw air WPA for the away team in the game so far.
Cumulative total raw yac WPA for the home team in the game so far.
Cumulative total raw yac WPA for the away team in the game so far.
Binary indicator for if the punt was blocked.
Binary indicator for if a running play converted the first down.
Binary indicator for if a passing play converted the first down.
Binary indicator for if a penalty converted the first down.
Binary indicator for if the first down was converted on third down.
Binary indicator for if the posteam failed to convert first down on third down.
Binary indicator for if the first down was converted on fourth down.
Binary indicator for if the posteam failed to convert first down on fourth down.
Binary indicator for if the pass was incomplete.
Binary indicator for if a touchback occurred on the play.
Binary indicator for if the pass was intercepted.
Binary indicator for if the punt ended inside the twenty yard line.
Binary indicator for if the punt was in the endzone.
Binary indicator for if the punt went out of bounds.
Binary indicator for if the punt was downed.
Binary indicator for if the punt was caught with a fair catch.
Binary indicator for if the kickoff ended inside the twenty yard line.
Binary indicator for if the kickoff was in the endzone.
Binary indicator for if the kickoff went out of bounds.
Binary indicator for if the kickoff was downed.
Binary indicator for if the kickoff was caught with a fair catch.
Binary indicator for if the fumble was forced.
Binary indicator for if the fumble was not forced.
Binary indicator for if the fumble went out of bounds.
Binary indicator if the play had a solo tackle (could be multiple due to fumbles).
Binary indicator for whether or not a safety occurred.
Binary indicator for whether or not a penalty occurred.
Binary indicator for whether or not a tackle for loss on a run play occurred.
Binary indicator for if the fumble was lost.
Binary indicator for if the kicking team recovered the kickoff.
Binary indicator for if the kicking team recovered the kickoff and scored a TD.
Binary indicator if the QB was hit on the play.
Binary indicator for if the play was a run.
Binary indicator for if the play was a pass attempt (includes sacks).
Binary indicator for if the play ended in a sack.
Binary indicator for if the play resulted in a TD.
Binary indicator for if the play resulted in a passing TD.
Binary indicator for if the play resulted in a rushing TD.
Binary indicator for if the play resulted in a return TD.
Binary indicator for extra point attempt.
Binary indicator for two point conversion attempt.
Binary indicator for field goal attempt.
Binary indicator for kickoff.
Binary indicator for punts.
Binary indicator for if a fumble occurred.
Binary indicator for if the pass was completed.
Binary indicator for if an assist tackle occurred.
Binary indicator for if a lateral occurred on the reception.
Binary indicator for if a lateral occurred on a run.
Binary indicator for if a lateral occurred on a return.
Binary indicator for if a lateral occurred on a fumble recovery.
Unique identifier for the player that attempted the pass.
String name for the player that attempted the pass.
Numeric yards by the passer_player_name, including yards gained in pass plays with laterals. This should equal official passing statistics.
Unique identifier for the receiver that was targeted on the pass.
String name for the targeted receiver.
Numeric yards by the receiver_player_name, excluding yards gained in pass plays with laterals. This should equal official receiving statistics but could miss yards gained in pass plays with laterals. Please see the description of lateral_receiver_player_name for further information.
Unique identifier for the player that attempted the run.
String name for the player that attempted the run.
Numeric yards by the rusher_player_name, excluding yards gained in rush plays with laterals. This should equal official rushing statistics but could miss yards gained in rush plays with laterals. Please see the description of lateral_rusher_player_name for further information.
Unique identifier for the player that received the last(!) lateral on a pass play.
String name for the player that received the last(!) lateral on a pass play. If there were multiple laterals in the same play, this will only be the last player who received a lateral. Please see https://github.com/mrcaseb/nfl-data/tree/master/data/lateral_yards for a list of plays where multiple players recorded lateral receiving yards.
Numeric yards by the lateral_receiver_player_name in pass plays with laterals. Please see the description of lateral_receiver_player_name for further information.
Unique identifier for the player that received the last(!) lateral on a run play.
String name for the player that received the last(!) lateral on a run play. If there were multiple laterals in the same play, this will only be the last player who received a lateral. Please see https://github.com/mrcaseb/nfl-data/tree/master/data/lateral_yards for a list of plays where multiple players recorded lateral rushing yards.
Numeric yards by the lateral_rusher_player_name in run plays with laterals. Please see the description of lateral_rusher_player_name for further information.
Unique identifier for the player that received the lateral on a sack.
String name for the player that received the lateral on a sack.
Unique identifier for the player that intercepted the pass.
String name for the player that intercepted the pass.
Unique identifier for the player that received the lateral on an interception.
String name for the player that received the lateral on an interception.
Unique identifier for the punt returner.
String name for the punt returner.
Unique identifier for the player that received the lateral on a punt return.
String name for the player that received the lateral on a punt return.
String name for the kickoff returner.
Unique identifier for the kickoff returner.
Unique identifier for the player that received the lateral on a kickoff return.
String name for the player that received the lateral on a kickoff return.
Unique identifier for the punter.
String name for the punter.
String name for the kicker on FG or kickoff.
Unique identifier for the kicker on FG or kickoff.
Unique identifier for the player that recovered their own kickoff.
String name for the player that recovered their own kickoff.
Unique identifier for the player that blocked the punt or FG.
String name for the player that blocked the punt or FG.
Unique identifier for one of the potential players with the tackle for loss.
String name for one of the potential players with the tackle for loss.
Unique identifier for one of the potential players with the tackle for loss.
String name for one of the potential players with the tackle for loss.
Unique identifier for one of the potential players that hit the QB. No sack as the QB was not the ball carrier. For sacks please see sack_player or half_sack_*_player.
String name for one of the potential players that hit the QB. No sack as the QB was not the ball carrier. For sacks please see sack_player or half_sack_*_player.
Unique identifier for one of the potential players that hit the QB. No sack as the QB was not the ball carrier. For sacks please see sack_player or half_sack_*_player.
String name for one of the potential players that hit the QB. No sack as the QB was not the ball carrier. For sacks please see sack_player or half_sack_*_player.
Team of one of the players with a forced fumble.
Unique identifier of one of the players with a forced fumble.
String name of one of the players with a forced fumble.
Team of one of the players with a forced fumble.
Unique identifier of one of the players with a forced fumble.
String name of one of the players with a forced fumble.
Team of one of the players with a solo tackle.
Team of one of the players with a solo tackle.
Unique identifier of one of the players with a solo tackle.
Unique identifier of one of the players with a solo tackle.
String name of one of the players with a solo tackle.
String name of one of the players with a solo tackle.
Unique identifier of one of the players with a tackle assist.
String name of one of the players with a tackle assist.
Team of one of the players with a tackle assist.
Unique identifier of one of the players with a tackle assist.
String name of one of the players with a tackle assist.
Team of one of the players with a tackle assist.
Unique identifier of one of the players with a tackle assist.
String name of one of the players with a tackle assist.
Team of one of the players with a tackle assist.
Unique identifier of one of the players with a tackle assist.
String name of one of the players with a tackle assist.
Team of one of the players with a tackle assist.
Binary indicator for if there has been a tackle with assist.
Unique identifier of one of the players with a tackle with assist.
String name of one of the players with a tackle with assist.
Team of one of the players with a tackle with assist.
Unique identifier of one of the players with a tackle with assist.
String name of one of the players with a tackle with assist.
Team of one of the players with a tackle with assist.
Unique identifier of one of the players with a pass defense.
String name of one of the players with a pass defense.
Unique identifier of one of the players with a pass defense.
String name of one of the players with a pass defense.
Team of the first player with a fumble.
Unique identifier of the first player who fumbled on the play.
String name of the first player who fumbled on the play.
Unique identifier of the second player who fumbled on the play.
String name of the second player who fumbled on the play.
Team of the second player with a fumble.
Team of one of the players with a fumble recovery.
Yards gained by one of the players with a fumble recovery.
Unique identifier of one of the players with a fumble recovery.
String name of one of the players with a fumble recovery.
Team of one of the players with a fumble recovery.
Yards gained by one of the players with a fumble recovery.
Unique identifier of one of the players with a fumble recovery.
String name of one of the players with a fumble recovery.
Unique identifier of the player who recorded a solo sack.
String name of the player who recorded a solo sack.
Unique identifier of the first player who recorded half a sack.
String name of the first player who recorded half a sack.
Unique identifier of the second player who recorded half a sack.
String name of the second player who recorded half a sack.
String abbreviation of the return team.
Yards gained by the return team.
String abbreviation of the team with the penalty.
Unique identifier for the player with the penalty.
String name for the player with the penalty.
Yards gained (or lost) by the posteam from the penalty.
Binary indicator for whether or not a replay or challenge occurred.
String indicating the result of the replay or challenge.
String indicating the penalty type of the first penalty in the given play. Will be NA if desc is missing the type.
Binary indicator for whether or not the defense had an attempt on a two point conversion, which happens following a turnover.
Binary indicator for whether or not the defense successfully scored on the two point conversion.
Binary indicator for whether or not the defense had an attempt on an extra point attempt, which happens when the defense recovers the ball following a blocked attempt.
Binary indicator for whether or not the defense successfully scored on an extra point attempt.
String name for the player who scored a safety.
Unique identifier for the player who scored a safety.
Four-digit number indicating the season to which the game belongs.
Numeric value indicating the probability for a complete pass based on comparable game situations.
For a single pass play this is 1 - cp when the pass was completed or 0 - cp when the pass was incomplete. Aggregated over a whole game or season, it indicates how far the passer's completion percentage was above or below expectation.
Starts at 1 and increments with each new first down; numbers are shared across both teams. NA: kickoffs, extra point/two point conversion attempts, non-plays, no posteam.
1: scored touchdown, gained enough yards for first down.
Possible values: First down, Touchdown, Opp touchdown, Field goal, Missed field goal, Safety, Turnover, Punt, Turnover on downs, QB kneel, End of half
Kickoff time in eastern time zone.
Column provided by NFL to fix out-of-order plays. Available 2011 and beyond with source "nfl".
Time of day of play in UTC "HH:MM:SS" format. Available 2011 and beyond with source "nfl".
Game site name.
String describing the weather including temperature, humidity and wind (direction and speed). Doesn't change during the game!
UUID of the game in the new NFL API.
Time on the playclock when the ball was snapped.
Binary indicator for deleted plays.
Play type as listed in the NFL source. Slightly different to the regular play_type variable.
Binary indicator for whether play is special teams play from NFL source. Available 2011 and beyond with source "nfl".
Type of special teams play from NFL source. Available 2011 and beyond with source "nfl".
Game time at the end of a given play.
String indicating the yardline at the end of the given play consisting of team half and yard line number.
Local day time when the drive started (currently not used by the NFL and therefore mostly 'NA').
Numeric value of how many regular plays happened in a given drive.
Time of possession in a given drive.
Number of first downs in a given drive.
Binary indicator for if the offense was able to get inside the opponent's 20 yard line.
Binary indicator for if the drive ended with a score.
Numeric value indicating in which quarter the given drive has started.
Numeric value indicating in which quarter the given drive has ended.
Numeric value of how many yards the offense gained or lost through penalties in the given drive.
String indicating how the offense got the ball.
String indicating how the offense lost the ball.
Game time at the beginning of a given drive.
Game time at the end of a given drive.
String indicating where a given drive started consisting of team half and yard line number.
String indicating where a given drive ended consisting of team half and yard line number.
Play_id of the first play in the given drive.
Play_id of the last play in the given drive.
Manually created drive number in a game.
Manually created drive result.
Total points scored by the away team.
Total points scored by the home team.
Either 'Home' or 'Neutral' indicating if the home team played at home or at a neutral site.
Equals home_score - away_score and means the game outcome from the perspective of the home team.
Equals home_score + away_score and means the total points scored in the given game.
The closing spread line for the game. A positive number means the home team was favored by that many points, a negative number means the away team was favored by that many points. (Source: Pro-Football-Reference)
The closing total line for the game. (Source: Pro-Football-Reference)
Binary indicator for if the given game was a division game.
One of 'dome', 'outdoors', 'closed', 'open' indicating the roof status of the stadium the game was played in. (Source: Pro-Football-Reference)
What type of ground the game was played on. (Source: Pro-Football-Reference)
The temperature at the stadium only for 'roof' = 'outdoors' or 'open'. (Source: Pro-Football-Reference)
The speed of the wind in miles/hour only for 'roof' = 'outdoors' or 'open'. (Source: Pro-Football-Reference)
First and last name of the home team coach. (Source: Pro-Football-Reference)
First and last name of the away team coach. (Source: Pro-Football-Reference)
ID of the stadium the game was played in. (Source: Pro-Football-Reference)
Name of the stadium the game was played in. (Source: Pro-Football-Reference)
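The game-level descriptions above define the result as home_score - away_score, the total as home_score + away_score, and the sign convention of the closing spread line. The following is a minimal sketch of those relationships on made-up data, assuming the columns are named result, total and spread_line as in the nflfastR data dictionary. The column home_covered is a hypothetical helper and not part of the dataset.

```r
# Toy game-level data; scores and lines are made up for illustration.
games <- data.frame(
  home_score  = c(27, 17, 20),
  away_score  = c(20, 24, 20),
  spread_line = c(3.5, -2.5, 1)
)

# result and total as defined in the descriptions above.
games$result <- games$home_score - games$away_score
games$total  <- games$home_score + games$away_score

# Hypothetical helper column: did the home team beat the closing spread?
# A positive spread_line means the home team was favored by that many points,
# so the home team covers when its winning margin exceeds the line.
games$home_covered <- games$result > games$spread_line

print(games)
```

Because the spread is expressed from the home team's perspective, no sign flip is needed before comparing it with result.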
For information on parallel processing and progress updates please see nflfastR.
build_nflfastR_pbp(), save_raw_pbp()
# Get pbp data for two games
try({# to avoid CRAN test problems
fast_scraper(c("2019_01_GB_CHI", "2013_21_SEA_DEN"))
})

# It is also possible to directly use the
# output of `fast_scraper_schedules` as input
try({# to avoid CRAN test problems
library(dplyr, warn.conflicts = FALSE)
fast_scraper_schedules(2020) %>%
  slice_tail(n = 3) %>%
  fast_scraper()
})
Load Rosters
fast_scraper_roster(...)
... |
Arguments passed on to nflreadr::load_rosters |
See nflreadr::load_rosters for details.
A tibble of season-level roster data.
For information on parallel processing and progress updates please see nflfastR.
# Roster of the 2019 and 2020 seasons
try({# to avoid CRAN test problems
fast_scraper_roster(2019:2020)
})
This returns game/schedule information as maintained by Lee Sharpe.
fast_scraper_schedules(...)
... |
Arguments passed on to nflreadr::load_schedules |
See nflreadr::load_schedules for details.
A tibble of game information for past and/or future games.
For information on parallel processing and progress updates please see nflfastR.
# Get schedules for the whole 2015 - 2018 seasons
try({# to avoid CRAN test problems
fast_scraper_schedules(2015:2018)
})
nflfastR Field Descriptions
field_descriptions
A data frame including names and descriptions of all variables in an nflfastR dataset.
The searchable table on the nflfastR website
field_descriptions
Loads play by play seasons from the nflverse-data repository
load_pbp(...)
... |
Arguments passed on to
|
The complete nflfastR dataset as returned by nflfastR::build_nflfastR_pbp() (see below) for all given seasons.
https://nflreadr.nflverse.com/articles/dictionary_pbp.html for a web version of the data dictionary
dictionary_pbp for the data dictionary bundled as a package dataframe
https://www.nflfastr.com/reference/build_nflfastR_pbp.html for the nflfastR function nflfastR::build_nflfastR_pbp()
Issues with this data should be filed here: https://github.com/nflverse/nflverse-pbp
try({# to avoid CRAN test problems
pbp <- load_pbp(2019:2020)
dplyr::glimpse(pbp)
})
Load Player Level Weekly Stats
load_player_stats(...)
... |
Arguments passed on to
|
A tibble of week-level player statistics that aims to match NFL official box scores.
The function calculate_player_stats() and the corresponding examples on the nflfastR website
try({# to avoid CRAN test problems
stats <- load_player_stats()
dplyr::glimpse(stats)
})
Uses nflreadr::load_schedules() to load game IDs of finished games and compares these IDs to all files saved under dir.
This function is intended to serve as input for save_raw_pbp().
missing_raw_pbp(
  dir = getOption("nflfastR.raw_directory", default = NULL),
  seasons = TRUE,
  verbose = TRUE
)
dir |
Path to local directory (defaults to option "nflfastR.raw_directory"). nflfastR will download the raw game files split by season into one sub directory per season. |
seasons |
a numeric vector of seasons to return, default |
verbose |
If |
A character vector of missing game IDs. If no files are missing, returns NULL invisibly.
try( missing <- missing_raw_pbp(tempdir()) )
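Since the function is intended to feed save_raw_pbp(), the two calls can be chained. The following is a minimal sketch under stated assumptions: the directory name raw_dir is hypothetical, the season 2021 and the limit of two games are arbitrary choices to keep the download small, and the calls need the nflfastR package plus a network connection, so they are guarded and wrapped in try().

```r
# Hypothetical local directory for the raw play-by-play files.
raw_dir <- file.path(tempdir(), "nflfastR_raw")
dir.create(raw_dir, showWarnings = FALSE)

if (requireNamespace("nflfastR", quietly = TRUE)) {
  try({
    # Game IDs of finished 2021 games that are not yet saved under raw_dir.
    # Returns NULL (invisibly) if nothing is missing.
    missing <- nflfastR::missing_raw_pbp(dir = raw_dir, seasons = 2021)

    if (!is.null(missing)) {
      # Download only the first two missing games to keep the example small.
      downloads <- nflfastR::save_raw_pbp(head(missing, 2), dir = raw_dir)
    }
  })
}
```

Running the same two steps again later should only fetch games that have finished in the meantime, because files already present under raw_dir are no longer reported as missing.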
NFL Stats Variables
nfl_stats_variables
A data frame explaining all variables returned by the function calculate_stats().
nfl_stats_variables
This function gives a quick overview of the versions of R and the operating system as well as the versions of nflverse packages, options, and their dependencies. It's primarily designed to help you get a quick idea of what's going on when you're helping someone else debug a problem.
report(...)
... |
Arguments passed on to nflreadr::nflverse_sitrep |
See nflreadr::nflverse_sitrep for details.
report(recursive = FALSE)
nflverse_sitrep(pkg = "nflreadr", recursive = TRUE)
The functions build_nflfastR_pbp() and fast_scraper() support loading raw pbp data from local file systems instead of GitHub servers. This function is intended to help set this up. It loads raw pbp data and saves it in the given directory, split by season into subdirectories.
save_raw_pbp(
  game_ids,
  dir = getOption("nflfastR.raw_directory", default = NULL)
)
game_ids |
A vector of nflverse game IDs. |
dir |
Path to local directory (defaults to option "nflfastR.raw_directory"). nflfastR will download the raw game files split by season into one sub directory per season. |
The function returns a data frame with one row for each downloaded file and the following columns:
success: if the HTTP request was successfully performed, regardless of the response status code. This is FALSE in case of a network error, or in case you tried to resume from a server that did not support this. A value of NA means the download was interrupted while in progress.
status_code: the HTTP status code from the request. A successful download is usually 200 for full requests or 206 for resumed requests. Anything else could indicate that the downloaded file contains an error page instead of the requested content.
resumefrom: the file size before the request, in case a download was resumed.
url: final url (after redirects) of the request.
destfile: downloaded file on disk.
error: if success == FALSE this column contains an error message.
type: the Content-Type response header value.
modified: the Last-Modified response header value.
time: total elapsed download time for this file in seconds.
headers: vector with http response headers for the request.
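Because success only says that the request was performed, a download is best treated as good when success is TRUE and status_code is 200 or 206. The following is a minimal sketch of that check on a made-up data frame that mimics three of the columns described above. The file names and values are placeholders, not real output.

```r
# Made-up stand-in for the data frame returned by save_raw_pbp().
downloads <- data.frame(
  destfile    = c("game_a", "game_b", "game_c"),  # placeholder file names
  success     = c(TRUE, TRUE, FALSE),
  status_code = c(200L, 404L, NA),
  stringsAsFactors = FALSE
)

# A download counts as good only if the request succeeded AND the server
# answered with 200 (full request) or 206 (resumed request).
ok <- !is.na(downloads$success) &
  downloads$success &
  downloads$status_code %in% c(200L, 206L)

# Rows that deserve a second look (error page, network error, interruption).
failed <- downloads[!ok, ]
print(failed)
```

In this toy example, game_b is flagged even though success is TRUE, because a 404 response suggests the saved file holds an error page rather than the requested content.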
build_nflfastR_pbp(), missing_raw_pbp()
# CREATE LOCAL TEMP DIRECTORY
local_dir <- tempdir()

# LOAD AND SAVE A GAME TO TEMP DIRECTORY
save_raw_pbp("2021_20_BUF_KC", dir = local_dir)

# REMOVE THE DIRECTORY
unlink(file.path(local_dir, 2021))
NFL Stat IDs and their Meanings
stat_ids
A data frame including NFL stat IDs, names and descriptions used in an nflfastR dataset.
http://www.nflgsis.com/gsis/Documentation/Partners/StatIDs.html
stat_ids
NFL Team names, colors and logo urls.
teams_colors_logos
A data frame with 36 rows and 10 variables containing NFL team level information, including franchises in multiple cities:
Team abbreviation
Complete Team name
Team id used in the roster function
Nickname
Conference
Division
Primary color
Secondary color
Tertiary color
Quaternary color
Url to Team logo on wikipedia
Url to higher quality logo on espn
Url to team wordmarks
Url to AFC and NFC logos
Url to NFL logo
The primary and secondary colors have been taken from nfl.com with some modifications for better team distinction and most recent team color themes. The tertiary and quaternary colors are taken from Lee Sharpe's teamcolors.csv, who has taken them from the teamcolors package created by Ben Baumer and Gregory Matthews. The Wikipedia logo urls are taken from Lee Sharpe's logos.csv. Team wordmarks are taken from nfl.com.
teams_colors_logos
update_db updates or creates a database with nflfastR play by play data of all completed games since 1999.
update_db(
  dbdir = getOption("nflfastR.dbdirectory", default = "."),
  dbname = "pbp_db",
  tblname = "nflfastR_pbp",
  force_rebuild = FALSE,
  db_connection = NULL
)
dbdir |
Directory in which the database is or shall be located. Can also
be set globally with |
dbname |
File name of an existing or desired SQLite database within |
tblname |
The name of the play by play data table within the database |
force_rebuild |
Hybrid parameter (logical or numeric) to rebuild parts of or the complete play by play data table within the database (please see details for further information) |
db_connection |
A |
This function creates and updates a data table with the name tblname within a SQLite database (other drivers via db_connection) located in dbdir and named dbname. The data table combines all play by play data for every available game back to the 1999 season and adds the most recent completed games as soon as they are available for nflfastR.
The argument force_rebuild is of hybrid type. It can rebuild the play by play data table either for the whole nflfastR era (with force_rebuild = TRUE) or just for specified seasons (e.g. force_rebuild = c(2019, 2020)).
Please note the following behavior:
force_rebuild = TRUE: The data table with the name tblname will be removed completely and rebuilt from scratch. This is helpful when new columns are added during the Off-Season.
force_rebuild = c(2019, 2020): The data table with the name tblname will be preserved and only rows from the 2019 and 2020 seasons will be deleted and re-added. This is intended to be used for ongoing seasons because the NFL fixes bugs in the underlying data during the week and we recommend rebuilding the current season every Thursday during the season.
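The season-wise behavior can be pictured as "delete the rows of the given seasons, then append fresh rows". The following is a minimal sketch of that idea on a toy in-memory SQLite table, assuming the DBI and RSQLite packages are installed. It illustrates the described effect only and is not the code update_db runs internally; the toy columns and values are made up.

```r
library(DBI)

# Toy in-memory database standing in for the play by play database.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Existing table with one made-up row per season.
old <- data.frame(
  season  = c(2018L, 2019L, 2020L),
  play_id = c(1L, 1L, 1L),
  epa     = c(0.10, 0.20, 0.30)
)
dbWriteTable(con, "nflfastR_pbp", old)

# Seasons to rebuild and their (made-up) corrected rows.
seasons <- c(2019L, 2020L)
fresh <- data.frame(
  season  = c(2019L, 2020L),
  play_id = c(1L, 1L),
  epa     = c(0.25, 0.35)
)

# Step 1: delete only the rows of the chosen seasons.
dbExecute(
  con,
  sprintf(
    "DELETE FROM nflfastR_pbp WHERE season IN (%s)",
    paste(seasons, collapse = ", ")
  )
)

# Step 2: append the fresh rows; the 2018 row is left untouched.
dbWriteTable(con, "nflfastR_pbp", fresh, append = TRUE)

rebuilt <- dbGetQuery(con, "SELECT season, epa FROM nflfastR_pbp ORDER BY season")
dbDisconnect(con)
print(rebuilt)
```

With force_rebuild = TRUE, by contrast, the whole table would be dropped and recreated, which is what allows newly added columns to appear.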
The parameter db_connection is intended for advanced users who want to use other DBI drivers, such as MariaDB, Postgres or odbc. Please note that the arguments dbdir and dbname are dropped in case a db_connection is provided but the argument tblname will still be used to write the data table into the database.