CatapultR Demo

Sergey Sandler has created an R package that includes many useful functions involving the OpenField Cloud API. One of the hopes of the package is that it can help reduce the manual effort of accessing and exporting data from OpenField. Automating this process in an R script file can reduce the time involved and ensure reproducibility of common tasks. The objective of this post is to demonstrate some of the ways in which Sport Scientists might find the package useful.

Sergey Sandler (Catapult Sports - Data Science Team, www.catapultsports.com), Brian Hart (Catapult Sports - Data Science Team, www.catapultsports.com)
02-20-2020


Difficulty Level: Beginner

What is the purpose of the Catapult R package?

The main purpose of the R package is to provide users the ability to access OpenField Cloud through a variety of functions.

There is a function for reading 100 Hz files generated through post-processing (provided that the high frequency module is enabled for the account), and the package also supports 10 Hz data from a mixed (dual stream) activity, which is not available through Connect.


Prerequisites

To use the catapultR package, you must have R installed on your computer. On Windows, do not install R in the “Program Files” directory, because the path to the target directory must not contain spaces. You can download R from the Comprehensive R Archive Network (CRAN). Just use the link that corresponds to a location near you.

See the Access catapultR page for information on how to download the package and the Installation and Login Demo to learn more about how to validate your user credentials.


Now we can load the package along with a few other helpful packages. Setting options(digits = 12) makes R print up to 12 significant digits, so the centisecond part of the timestamps returned by some of the functions is displayed rather than rounded away.


# load packages ------------------------------------------------------------
library(catapultR)
library(tidyverse)
library(lubridate)
options(digits = 12)
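
To see what the digits setting changes, here is a small base R sketch. The timestamp is made up for illustration and does not come from the demo account. The option only affects how numbers are printed, not how they are stored.

```r
# A made-up Unix timestamp with a fractional (centisecond) part
ts <- 1422103865.37

# With R's default of 7 significant digits the fraction is not shown
short <- format(ts, digits = 7)

# With 12 significant digits the centiseconds become visible
long <- format(ts, digits = 12)

print(short)  # "1422103865"
print(long)   # "1422103865.37"
```

The stored value is identical in both cases, so calculations are unaffected by the option.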



The package is copyrighted by Catapult Sports. This can be seen by running the following lines in R.


# read the LICENSE file once and join its first two lines
license_lines <- readr::read_lines(system.file("LICENSE", package = "catapultR"))
paste(license_lines[1], license_lines[2])

[1] "YEAR: 2022 COPYRIGHT HOLDER: Catapult Sports Pty Ltd"

Try running these lines for some additional documentation.


Validate User Credentials

If enabled on your account, you can also generate an API token string in OpenField Cloud. If you would like this feature enabled on your OpenField account, please speak with your Catapult Sports Customer Success Representative.

I will use a Women’s College Soccer demo account for this example, but note that the token string below has been generated randomly for presentation purposes only. Typical token strings created in OpenField Cloud will be much longer.


token <- ofCloudCreateToken(sToken = "STe67MbNGfkO5XH8jaZDyoQvnz4VBYiL0Clg2RcW",  sRegion = "America")

token

ofCredentials: 
  Name: brian_wsoccer_demo
  Region:  America
  Stage:  main
  TokenTime:  1648585862
  TokenExpireTime:  1648589462
  ApiStatus:  200
  ApiMessage:  
  ApiTimeout:  60
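
The TokenTime and TokenExpireTime fields printed above look like Unix timestamps in seconds. That reading is an assumption of this sketch rather than something stated by the package. Under that assumption, base R can show when the token was issued and how long it remains valid.

```r
# Values copied from the printed credentials above
token_time        <- 1648585862
token_expire_time <- 1648589462

# Convert Unix seconds to a UTC date-time
to_utc <- function(x) as.POSIXct(x, origin = "1970-01-01", tz = "UTC")

issued  <- to_utc(token_time)
expires <- to_utc(token_expire_time)

# Lifetime of the token in minutes
lifetime_min <- (token_expire_time - token_time) / 60

format(issued, "%Y-%m-%d %H:%M:%S")   # "2022-03-29 20:31:02"
format(expires, "%Y-%m-%d %H:%M:%S")  # "2022-03-29 21:31:02"
lifetime_min                          # 60
```

For a long-running script it may therefore be worth recreating the token before each batch of requests.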


The token will be used to access data for a specific user’s OpenField account.
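
Because a real token string grants access to an account, it is usually safer to keep it out of the script. The sketch below reads it from an environment variable. The variable name OPENFIELD_TOKEN and the helper read_token_string() are hypothetical names chosen for this example and are not part of catapultR.

```r
# OPENFIELD_TOKEN is a hypothetical variable name chosen for this sketch
read_token_string <- function(var = "OPENFIELD_TOKEN") {
  value <- Sys.getenv(var, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  value
}

# Intended usage (not run here; requires catapultR and a valid token):
# token <- ofCloudCreateToken(sToken = read_token_string(), sRegion = "America")
```

The variable could be defined in a user-level .Renviron file so that the token never appears in shared code.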


Get the customer information for the account:


rmarkdown::paged_table(ofCloudGetCustomerInfo(token))

See what teams are associated with the account:


rmarkdown::paged_table(ofCloudGetTeams(token))

See what modules are available on the account:


ofCloudGetModules(token)

 [1] "GoalKeeping"                    "Download"                      
 [3] "APICache"                       "GPS"                           
 [5] "AppData"                        "PluginSecurity:Adhoc"          
 [7] "VelocityBandSet2"               "Gen2AccelerationBands.Released"
 [9] "Gen1VelocityBands.Released"     "AdhocParams.Released"          
[11] "Gen1AccelerationBands.Released" "Gen2VelocityBands.Released"    
[13] "GoalkeepingDive"                "GK.V2"                         
[15] "GK.V1"                          "StreamFirmwareRelease"         
[17] "APITokenAdmin"                  "StreamOpenfieldOptimeyeStable" 
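
Since the modules come back as a plain character vector, a script can check up front whether the features it needs are enabled. The sketch below uses a few names copied from the output above, plus one hypothetical name ("SomeOtherModule") to show the missing case.

```r
# A few module names copied from the output above; on a real account
# this vector would come from ofCloudGetModules(token)
modules <- c("GoalKeeping", "Download", "APICache", "GPS", "APITokenAdmin")

# "SomeOtherModule" is a hypothetical name used to show the missing case
wanted <- c("GPS", "APITokenAdmin", "SomeOtherModule")

# TRUE/FALSE for each wanted module
enabled <- wanted %in% modules
names(enabled) <- wanted
print(enabled)

# Names of any wanted modules that are not enabled
missing_modules <- wanted[!enabled]
print(missing_modules)
```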

Extracting Aggregated Data

One potential use case for Sport Scientists is extracting aggregated data (by Period or Activity) for a group of athletes. This might be useful for season reviews or assessing longitudinal trends.


activities <- ofCloudGetActivities(credentials = token)

stats_df <- ofCloudGetStatistics(
        token, 
        params = c("athlete_name", "position_name", "date", "start_time", "end_time", 
                   "total_distance", "total_duration", "total_player_load", "max_vel", 
                   "hsr_efforts", "max_heart_rate", "mean_heart_rate", 
                   "period_id", "period_name", "activity_name"), 
        groupby = c("athlete", "period", "activity"), 
        filters = list(name = "activity_id",
                       comparison = "=",
                       values = activities$id))

# arrange by start time of period -----------------------------------------
stats_df <- stats_df %>% 
  arrange(start_time)

# set date data type ------------------------------------------------------
stats_df$date <- lubridate::ymd(stats_df$date)

# inspect data ------------------------------------------------------------
tibble::glimpse(stats_df)

Rows: 17,673
Columns: 21
$ athlete_name      <chr> "Jamari Mckisson", "Jamari Mckisson", "Sea~
$ activity_name     <chr> "Match", "Training", "Training", "Training~
$ period_id         <chr> "eusivpz4-i64b-uc9d-g7f8-59fe6zhibduv", "e~
$ period_name       <chr> "Drills B", "Drills B", "Drills B", "Drill~
$ start_time        <dbl> 1422103865, 1422103865, 1422103865, 142210~
$ end_time          <dbl> 1422104689, 1422104689, 1422104689, 142210~
$ position_name     <chr> "Attacker", "Attacker", "Defender", "Defen~
$ total_distance    <dbl> 683.91998, 683.91998, 750.60999, 633.72998~
$ total_duration    <dbl> 824, 824, 824, 824, 824, 824, 824, 824, 82~
$ total_player_load <dbl> 72.48, 72.48, 84.05, 81.01, 89.35, 95.70, ~
$ max_heart_rate    <int> 189, 189, 205, 152, 187, 187, 164, 195, 16~
$ max_vel           <dbl> 7.2995, 7.2995, 8.1965, 7.7135, 8.1525, 7.~
$ mean_heart_rate   <dbl> 116.94803, 116.94803, 166.59029, 119.38738~
$ hsr_efforts       <int> 6, 6, 6, 6, 6, 6, 8, 6, 0, 6, 5, 0, 6, 5, ~
$ date              <chr> "2015-03-17", "2015-03-17", "2015-03-17", ~
$ activity_id       <chr> "ytz8ugmv-67uy-45tn-3fgn-yrb7ax8pd3i5", "j~
$ int_day_id        <int> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, ~
$ start_time_h      <chr> "12:51:05", "12:51:05", "12:51:05", "12:51~
$ end_time_h        <chr> "13:04:49", "13:04:49", "13:04:49", "13:04~
$ date_id           <chr> "2015-03-17", "2015-03-17", "2015-03-17", ~
$ date_name         <chr> "2015-03-17", "2015-03-17", "2015-03-17", ~
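
The start_time and end_time columns are Unix timestamps in seconds. As a quick check, the sketch below converts the first row's values from the output above into clock times. Interpreting them in UTC reproduces the start_time_h and end_time_h values shown for that row, and their difference matches total_duration. Whether every account reports these columns in UTC is an assumption.

```r
# First-row values copied from the glimpse() output above
start_time <- 1422103865
end_time   <- 1422104689

# Convert Unix seconds to an HH:MM:SS string in UTC
to_clock <- function(x) {
  format(as.POSIXct(x, origin = "1970-01-01", tz = "UTC"), "%H:%M:%S")
}

start_h    <- to_clock(start_time)   # "12:51:05"
end_h      <- to_clock(end_time)     # "13:04:49"
duration_s <- end_time - start_time  # 824

print(c(start_h, end_h))
print(duration_s)
```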


Using the function ofCloudGetStatistics(), we were able to save all specified parameters, aggregated by athlete, period, and activity, as a data frame. From this data frame, it’s pretty easy to do some exploratory analysis…


# Example Analysis with Aggregated Data -----------------------------------
stats_df_filtered <- stats_df %>% 
  dplyr::filter(total_player_load > 0,
                total_distance > 0)
stats_df_filtered$position_name <- factor(stats_df_filtered$position_name)


stats_df_filtered %>% 
  ggplot(aes(x = total_player_load)) +
  geom_density(aes(fill = position_name), alpha = 0.7) +
  facet_wrap(~ position_name, ncol = 1) +
  scale_x_continuous(limits = c(0, 700)) +
  theme_minimal() +
  theme(legend.position = "none") +
  labs(title = "Total PL by Position Distributions",
       x = "Total PlayerLoad")


stats_df_filtered %>% 
  dplyr::filter(max_vel < 20,
                date > max(date) - 100) %>% 
  ggplot(aes(x = date, y = max_vel)) +
  geom_point(aes(color = position_name), alpha = 0.15) +
  geom_smooth(aes(color = position_name), se = FALSE) +
  theme_minimal() +
  theme(legend.position = "top",
        legend.title = element_blank()) +
  labs(title = "Max Velocity by Position",
       subtitle = "Most Recent 100 Days",
       x = "Date",
       y = "Max Velocity (m/s)") +
  scale_y_continuous(limits = c(0, 10))
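
Plots like these are often paired with a numeric summary. The sketch below uses made-up rows that mimic a few stats_df columns (no real athlete data) to derive per-minute rates and average them by position. It assumes total_distance is in metres and total_duration in seconds.

```r
# Made-up rows that mimic a few stats_df columns (not real athlete data)
demo <- data.frame(
  position_name     = c("Attacker", "Attacker", "Defender", "Defender"),
  total_distance    = c(700, 500, 800, 600),   # assumed metres
  total_duration    = c(600, 600, 1200, 600),  # assumed seconds
  total_player_load = c(70, 50, 90, 60)
)

# Convert totals into per-minute rates so periods of different length compare fairly
demo$metres_per_min <- demo$total_distance / (demo$total_duration / 60)
demo$pl_per_min     <- demo$total_player_load / (demo$total_duration / 60)

# Average the rates within each position
by_position <- aggregate(cbind(metres_per_min, pl_per_min) ~ position_name,
                         data = demo, FUN = mean)
print(by_position)
```

Rates are used rather than raw totals because period durations vary, which would otherwise dominate the comparison.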



Downloading IMA Event Data

There are also functions to extract IMA events from Catapult OFD files. For a list of events available, we can use the function ima_events().


ima_events()

 [1] "ima_acceleration"               
 [2] "ima_jump"                       
 [3] "ima_jump_ml"                    
 [4] "ima_impact"                     
 [5] "goalkeeping_v1"                 
 [6] "goalkeeping_v2"                 
 [7] "cricket_delivery_au"            
 [8] "cricket_delivery"               
 [9] "running_symmetry"               
[10] "ice_hockey_stride"              
[11] "ice_hockey_bout"                
[12] "baseball_pitch_v1"              
[13] "baseball_swing_v1"              
[14] "baseball_pitch"                 
[15] "baseball_swing"                 
[16] "baseball_throw"                 
[17] "free_running"                   
[18] "football_movement_analysis"     
[19] "rugby_union_scrum"              
[20] "rugby_union_contact_involvement"
[21] "rugby_union_kick"               
[22] "rugby_league_tackle"            
[23] "us_football_lineman_contact"    
[24] "us_football_throw"              
[25] "us_football_impact"             
[26] "ice_hockey_goaltender_movement" 
[27] "basketball"                     
[28] "tennis"                         
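
Because the event names share a sport prefix, the names for one sport can be picked out with a pattern match instead of being typed by hand. The sketch below works on a subset of the names printed above.

```r
# A subset of the names printed above; with catapultR loaded this
# vector would come from ima_events()
event_names <- c("ima_acceleration", "ima_jump", "rugby_union_scrum",
                 "rugby_union_contact_involvement", "rugby_union_kick",
                 "rugby_league_tackle", "tennis")

# Keep only names that start with the rugby union prefix
rugby_union <- grep("^rugby_union_", event_names, value = TRUE)
print(rugby_union)
```

The anchored pattern keeps "rugby_league_tackle" out of the result.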


For this example, we will extract all Rugby Union IMA events from a game in our Men’s Rugby Union demo account. First we have to get the token corresponding to this account.


token <- ofCloudCreateToken(sToken = "8SyAJZonGYUH5adDgINl1re7WkOPXhB0EtbVzcMp",  sRegion = "America")


Now, we can see which athletes and activities are associated with this account. Typically, you will want to identify specific athletes or activities for which you want more information. Once you know when these Activities occurred and the Device IDs that you’re interested in, you can access IMA Event data, Generation 2 Effort data, 10 Hz Sensor Data, or even 100 Hz high frequency data if the necessary modules are enabled on the account.

In order to maintain anonymity, no identifiable data will be shared for the following examples.

Athletes


athletes <- ofCloudGetAthletes(token)
glimpse(athletes)

Rows: 299
Columns: 30
$ id                         <fct> ia65ocdu-zlut-ru7h-z5dj-xteapn7wc~
$ first_name                 <chr> "Blas", "Darren", "Zane", "Abbige~
$ last_name                  <chr> "Bell", "al-Omer", "al-Shaheed", ~
$ jersey                     <chr> "123", "DM5", "JC", "EO1", "EE9",~
$ nickname                   <chr> "", "", "Mailman", "", "", "", "M~
$ height                     <dbl> 180.649721537, 180.445787567, 181~
$ weight                     <dbl> 76.8430889782, 68.2587630431, 71.~
$ date_of_birth              <dbl> 930562319, 957398680, 909982778, ~
$ velocity_max               <dbl> 10.00, 10.00, 9.50, 9.50, 10.00, ~
$ acceleration_max           <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 10, 10~
$ heart_rate_max             <int> 0, 0, 200, 196, 200, 199, 0, 200,~
$ player_load_max            <int> 0, 0, 500, 0, 500, 0, 0, 0, 500, ~
$ image                      <chr> "", "", "", "", "", "", "", "", "~
$ icon                       <chr> "circle", "circle", "circle", "tr~
$ stroke_colour              <chr> "#030303", "#030303", "#030303", ~
$ fill_colour                <chr> "#ff0000", "#ff0000", "#ff0000", ~
$ trail_colour_start         <chr> "", "", "", "", "", "", "", "", "~
$ trail_colour_end           <chr> "", "", "", "", "", "", "", "", "~
$ is_synced                  <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ~
$ is_deleted                 <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ~
$ created_at                 <chr> "2015-06-22 10:04:02", "2015-07-0~
$ modified_at                <chr> "2015-07-31 08:38:44", "2016-09-1~
$ date_of_birth_date         <chr> "1970-01-01", "1970-01-01", "1970~
$ tag_list                   <list> <>, <>, <>, <>, <>, <>, <>, <>, ~
$ tags                       <list> [<data.frame[0 x 0]>], [<data.fr~
$ current_team_id            <chr> "vb8tor4u-psbg-45ub-1i69-s8oy37e9~
$ max_player_load_per_minute <int> 12, 12, 15, 15, 15, 15, 12, 12, 1~
$ position                   <chr> "SH", "S", "C", "FB", "SH", "S", ~
$ position_id                <chr> "b93eae54-7071-11e4-afbc-0afe0c90~
$ position_name              <chr> "Scrum-Half", "Second Row", "Cent~


Activities


# Get data for each activity (Demo Account) ----------------------------------------------
to <- as.integer(as.POSIXct(as.Date("2018-07-02")))
from <- as.integer(as.POSIXct(as.Date("2017-03-02"))) # 16 months (about 70 weeks) before `to`
activities <- ofCloudGetActivities(token, from = from, to = to)
glimpse(activities)
nrow(activities)

Rows: 180
Columns: 19
$ id             <chr> "a65ocdv3-lutn-ru7h-z5dj-xteapn7wcybg", "zsjh~
$ name           <chr> "Training", "Training", "Rehab", "Match", "Pr~
$ start_time     <dbl> 1585302081, 1585216096, 1585126999, 158508408~
$ end_time       <dbl> 1585304365, 1585217700, 1585129048, 158508782~
$ modified_at    <chr> "2020-01-31 17:15:48", "2020-01-31 17:15:48",~
$ game_id        <chr> "a65ocdv3-lutn-ru7h-z5dj-xteapn7wcybg", "zsjh~
$ owner_id       <chr> "efcb4474-22a6-41fc-95ce-8389d6e8fd36", "efcb~
$ owner          <df[,10]> <data.frame[23 x 10]>
$ periods        <list> [<data.frame[1 x 4]>], [<data.frame[1 x 4~
$ tags           <list> <"Friday", "Compressed", "GPS", "Compressed"~
$ tag_list       <list> [<data.frame[4 x 5]>], [<data.frame[2 x 5]>]~
$ athlete_count  <int> 1, 1, 1, 1, 1, 1, 27, 1, 2, 2, 7, 34, 1, 2, ~
$ period_count   <int> 1, 1, 1, 1, 1, 1, 46, 1, 2, 1, 6, 7, 1, 1, 1,~
$ venue_name     <chr> "Training Pitch 2", "Practice Area", "Practic~
$ venue_width    <int> 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 7~
$ venue_length   <int> 105, 105, 105, 105, 105, 105, 105, 105, 105, ~
$ venue_rotation <int> 169, 169, 169, 169, 169, 169, 169, 169, 169, ~
$ venue_lat      <dbl> 44.7755572, 44.7755572, 44.7755572, 44.775557~
$ venue_lng      <dbl> 20.6654257, 20.6654257, 20.6654257, 20.665425~

[1] 180

The column names for the activities dataframe are printed above. We can see that each row contains information for a single activity. Between March 2, 2017 and July 2, 2018, there were 180 activities associated with the account.
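
The from and to arguments in the call above are Unix timestamps in seconds. The base R sketch below computes the length of the date range used above and shows how a window of a fixed number of weeks could be built instead. The helper to_unix() is a hypothetical name for this example.

```r
to_date   <- as.Date("2018-07-02")
from_date <- as.Date("2017-03-02")

# Length of the range used above, in weeks (487 days)
span_weeks <- as.numeric(difftime(to_date, from_date, units = "weeks"))

# Start date of a window of exactly 20 weeks ending on the same day
from_20w <- to_date - 20 * 7

# Hypothetical helper: midnight UTC of a Date as Unix seconds
to_unix <- function(d) as.integer(as.POSIXct(format(d), tz = "UTC"))

print(round(span_weeks, 1))  # 69.6
print(format(from_20w))      # "2018-02-12"
print(to_unix(to_date))      # 1530489600
```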


Accessing IMA Events

We can use the activities data frame above to identify an activity or period from which we would like to extract IMA Events. We also need the athlete_id for each athlete, and the corresponding name of each athlete will also be helpful. For this example, I will just select a random activity. To get the athlete_ids and athlete_names for this activity, we can use the function ofCloudGetAthleteDevicesInActivity().

Get Device Info for the Activity


# We want athlete name, athlete_id, and device_id
# we can match the name using the athlete_id from the 
# info returned by ofCloudGetAthleteDevicesInActivity()
device_info <- ofCloudGetAthleteDevicesInActivity(
  token, 
  activity_id = activities$id[27])

glimpse(device_info)

Rows: 32
Columns: 12
$ device_id          <int> 1777, 8557, 8556, 1471, 4147, 8564, 8691,~
$ athlete_id         <chr> "o9frvm2q-uz7g-u81r-k4ot-y6jnge2wr8tk", "~
$ athlete_name       <chr> "Cory Nyamekye", "Yeeleng Hart", "Shane a~
$ athlete_first_name <chr> "Cory", "Yeeleng", "Shane", "Mersadez", "~
$ athlete_last_name  <chr> "Nyamekye", "Hart", "al-Badour", "Reyes Z~
$ mapping_start_time <dbl> 1582011089, 1582010980, 1579245856, 15835~
$ mapping_end_time   <dbl> 1584086393, 1642503372, 1584020817, 15838~
$ is_current_mapping <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,~
$ jersey             <chr> "CH5", "AC8", "KP2", "DT2", "SM4", "TL3",~
$ team_id            <chr> "drwtlz68-ausy-u8g5-hrwp-e7cbd391pa8k", "~
$ team_name          <chr> "Cyclones", "Cyclones", "Cyclones", "Cycl~
$ player_id          <chr> "", "", "", "", "", "", "", "", "", "", "~

The column names for device_info have been printed above. The athlete_id column can be used to join athlete names and device IDs to the event data.


Get the IMA Events for a Single Athlete

Now that we have the device information (device_id, athlete_id, athlete_name, etc.), we can go ahead and get the IMA events. The code below will get the events for a single athlete (the tenth row of device_info). The function ofCloudGetActivityEvents() returns a list of dataframes, one for each event type that was found.


# get rugby union scrum, contact involvement, and kick events
events <- ofCloudGetActivityEvents(token, 
                         athlete_id = device_info$athlete_id[10], 
                         activity_id = activities$id[27], 
                         events = c("rugby_union_scrum", 
                                    "rugby_union_contact_involvement", 
                                    "rugby_union_kick")) 
# specify event types from ima_events()

names(events)

[1] "rugby_union_contact_involvement"
[2] "rugby_union_scrum"              

Scrums:


rmarkdown::paged_table(events[["rugby_union_scrum"]])

Contact Involvements:


rmarkdown::paged_table(events[["rugby_union_contact_involvement"]])

Kicks:


events[["rugby_union_kick"]]

NULL

The athlete in the example above did not have any kicks. This is not concerning as most positions on a rugby team do not kick.
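
Since an event type with no detections comes back as NULL rather than an empty data frame, downstream code needs to allow for that. The sketch below uses a mock list in place of a real API response. The helper count_events() and the mock values are hypothetical.

```r
# Mock of the list returned for one athlete: two event types present, kicks absent
events_mock <- list(
  rugby_union_scrum = data.frame(start_time = c(10.5, 95.2, 300.0)),
  rugby_union_contact_involvement = data.frame(start_time = c(42.1, 120.7))
)

requested <- c("rugby_union_scrum",
               "rugby_union_contact_involvement",
               "rugby_union_kick")

# Hypothetical helper: number of events of one type, 0 when the type is missing
count_events <- function(events, type) {
  ev <- events[[type]]
  if (is.null(ev)) 0L else nrow(ev)
}

counts <- vapply(requested, function(type) count_events(events_mock, type),
                 integer(1))
print(counts)
```

Using `[[` with a name that is not in the list returns NULL, which is why the same helper works for both present and absent event types.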


Extract IMA Events for All Athletes

We can also loop through the function and get events for all athletes in the activity. To store this data, we will use a nested dataframe. For each athlete and event combination, we will add a row. The last column of each row will contain a full dataframe of that athlete's events.


# get rugby union scrum, contact involvement, and kick events
events_list <- list()

for (i in seq_along(device_info$athlete_id)) {
  events <- ofCloudGetActivityEvents(token, 
                         athlete_id = device_info$athlete_id[i], 
                         activity_id = activities$id[27], 
                         events = c("rugby_union_scrum", 
                                    "rugby_union_contact_involvement", 
                                    "rugby_union_kick"))
  
  # check to make sure there are scrum events
  if (!is.null(events$rugby_union_scrum)) {
    scrum_df <- events$rugby_union_scrum %>% 
      mutate(athlete_id = device_info$athlete_id[i],
             event_type = "rugby_union_scrum") %>% 
      group_by(athlete_id, event_type) %>% 
      nest()
  } else {scrum_df <- tibble()}
  
  # check to make sure there are contact involvement events
  if (!is.null(events$rugby_union_contact_involvement)) {
    contact_involvement_df <- events$rugby_union_contact_involvement %>% 
      mutate(athlete_id = device_info$athlete_id[i],
             event_type = "rugby_union_contact_involvement") %>% 
      group_by(athlete_id, event_type) %>% 
      nest()
  } else {contact_involvement_df <- tibble()}
  
  # check to make sure there are kick events
  if (!is.null(events$rugby_union_kick)) {
    kick_df <- events$rugby_union_kick %>% 
      mutate(athlete_id = device_info$athlete_id[i],
             event_type = "rugby_union_kick") %>% 
      group_by(athlete_id, event_type) %>% 
      nest()
  } else {kick_df <- tibble()}
  
  # combine the nested rows for this athlete
  events_list[[i]] <- scrum_df %>% 
    bind_rows(contact_involvement_df) %>% 
    bind_rows(kick_df)
}

all_events <- bind_rows(events_list)
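
The three near-identical blocks in the loop could also be collapsed by iterating over the event type names. The sketch below is one possible way to do that, not the package authors' method. It needs dplyr and tibble (both part of the tidyverse), and mock_fetch() and nest_events() are hypothetical helpers, with mock_fetch() standing in for the API call and returning made-up data.

```r
library(dplyr)
library(tibble)

event_types <- c("rugby_union_scrum",
                 "rugby_union_contact_involvement",
                 "rugby_union_kick")

# Hypothetical stand-in for ofCloudGetActivityEvents(); returns made-up data
mock_fetch <- function(athlete_id) {
  out <- list(
    rugby_union_contact_involvement =
      data.frame(start_time = c(5, 50), confidence = c(0.9, 0.8))
  )
  if (athlete_id == "athlete-a") {
    out$rugby_union_scrum <- data.frame(start_time = 20, duration = 6)
  }
  out
}

# One row per event type that is present, with the events in a list-column
nest_events <- function(events, athlete_id, event_types) {
  rows <- lapply(event_types, function(type) {
    ev <- events[[type]]
    if (is.null(ev)) return(NULL)  # skip event types with no detections
    tibble(athlete_id = athlete_id,
           event_type = type,
           data = list(as_tibble(ev)))
  })
  bind_rows(rows)
}

athlete_ids <- c("athlete-a", "athlete-b")
all_events_demo <- bind_rows(
  lapply(athlete_ids, function(id) nest_events(mock_fetch(id), id, event_types))
)
print(all_events_demo)
```

Adding another event type then only requires extending event_types rather than copying a block.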


Now that we have the nested dataframe of all events, we can join the athlete_name and device_id from the device_info dataframe. The link between each of these dataframes is athlete_id. In the example here, I will not join the athlete_name so as to maintain anonymity.


all_events <- all_events %>% 
  left_join(device_info %>% select(athlete_id, device_id), by = "athlete_id") %>% 
  select(device_id, everything()) # reorder columns

head(all_events, n = 10)

# A tibble: 10 x 5
   device_id athlete_id       event_type      data    activity_id     
       <int> <chr>            <chr>           <list>  <chr>           
 1      2873 25c5a41f-3a14-4~ rugby_union_co~ <tibbl~ zsjh129a-4j1h-o~
 2      2873 25c5a41f-3a14-4~ rugby_union_ki~ <tibbl~ zsjh129a-4j1h-o~
 3      1777 81e1c9e6-2cdb-4~ rugby_union_sc~ <tibbl~ k4ot52q6-jge2-n~
 4      1777 81e1c9e6-2cdb-4~ rugby_union_co~ <tibbl~ k4ot52q6-jge2-n~
 5      8557 7bfc92ec-e036-4~ rugby_union_sc~ <tibbl~ 6zu297gl-lpf6-z~
 6      8557 7bfc92ec-e036-4~ rugby_union_co~ <tibbl~ 6zu297gl-lpf6-z~
 7      8556 1d11a3b2-2f9c-4~ rugby_union_sc~ <tibbl~ jq18h9r3-eu29-2~
 8      8556 1d11a3b2-2f9c-4~ rugby_union_co~ <tibbl~ jq18h9r3-eu29-2~
 9      1471 e9d56978-34d0-4~ rugby_union_co~ <tibbl~ dzbvif3n-yles-6~
10      4147 2224cf40-36f3-4~ rugby_union_sc~ <tibbl~ kdzn8bc2-qbjm-y~


You can see that there is a row for each athlete and event type combination. Nested dataframes are useful in this situation because the data for each event differs slightly in terms of column name and number of columns. To look at a single event type for all athletes, we can just filter for that specific event type and unnest the dataframe.

The code below demonstrates how to accomplish this for scrum events, but the same logic can be applied to other events.


# look only at scrum events

all_events %>% 
  dplyr::filter(event_type == "rugby_union_scrum") %>% 
  unnest(cols = "data") # specify column to "unnest"

# A tibble: 100 x 12
   device_id athlete_id       event_type   start_time end_time version
       <int> <chr>            <chr>             <dbl>    <dbl> <chr>  
 1      2873 81e1c9e6-2cdb-4~ rugby_union~     1.59e9   1.58e9 2.3    
 2      2873 81e1c9e6-2cdb-4~ rugby_union~     1.59e9   1.58e9 2.3    
 3      2873 81e1c9e6-2cdb-4~ rugby_union~     1.59e9   1.58e9 2.3    
 4      2873 81e1c9e6-2cdb-4~ rugby_union~     1.59e9   1.58e9 2.3    
 5      2873 81e1c9e6-2cdb-4~ rugby_union~     1.59e9   1.58e9 2.3    
 6      2873 81e1c9e6-2cdb-4~ rugby_union~     1.59e9   1.58e9 2.3    
 7      2873 81e1c9e6-2cdb-4~ rugby_union~     1.59e9   1.58e9 2.3    
 8      1777 7bfc92ec-e036-4~ rugby_union~     1.59e9   1.58e9 2.3    
 9      1777 7bfc92ec-e036-4~ rugby_union~     1.59e9   1.58e9 2.3    
10      1777 7bfc92ec-e036-4~ rugby_union~     1.59e9   1.58e9 2.3    
# ... with 90 more rows, and 6 more variables: confidence <dbl>,
#   duration <dbl>, post_event_active <chr>,
#   RugbyScrumPostScrumBigTime <dbl>, post_event_load <dbl>,
#   activity_id <chr>


Now that we have the IMA event data, we may want to plot each event over time. To do this, we can just unnest each event type and select the time of the event only.


scrum_events <- all_events %>% 
  dplyr::filter(event_type == "rugby_union_scrum") %>% 
  unnest(cols = "data") %>% 
  select(device_id, event_type, start_time)

contact_involvement_events <- all_events %>% 
  dplyr::filter(event_type == "rugby_union_contact_involvement") %>% 
  unnest(cols = "data") %>% 
  select(device_id, event_type, start_time)

kick_events <- all_events %>% 
  dplyr::filter(event_type == "rugby_union_kick") %>% 
  unnest(cols = "data") %>% 
  select(device_id, event_type, start_time)

all_events_unnested <- scrum_events %>% 
  bind_rows(contact_involvement_events) %>% 
  bind_rows(kick_events)

all_events_unnested %>% 
  ggplot(aes(x = start_time, y = as.factor(device_id))) +
  geom_point(aes(color = event_type), alpha = 0.8, size = 2) +
  theme_minimal() +
  theme(legend.position = "top",
        legend.title = element_blank()) +
  labs(title = "Rugby Union Events For All Athletes",
       x = "Unix Time (s)",
       y = "Device ID") +
  guides(colour = guide_legend(override.aes = list(alpha = 1)))

There are many different ways to look at this data, but the above example demonstrates at least one way to recombine each event from all athletes into a single plot.

After plotting the events, you can visually inspect for trends. It looks like there are some athletes that have more events than others. Perhaps those missing events for longer periods of time were not playing. There is a clear break in the middle of all files as well. For scrums, we can see that the blue dots tend to line up, occurring at the same time for multiple athletes. This is what we expect because several players are involved in a scrum at the same time. Kicks are only seen in a few of the athletes. Again, this makes sense because typically the halves do most of the kicking.
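
The visual impression that scrums line up across athletes can also be checked numerically. The sketch below uses made-up start times and an arbitrary 10-second bin width, which is an assumption of this example, to count how many distinct devices registered a scrum in each time bin.

```r
# Made-up scrum start times (Unix-style seconds) for three devices
scrums <- data.frame(
  device_id  = c(1777, 8557, 8556, 1777, 8557, 8556),
  start_time = c(1000.2, 1001.0, 1003.9, 1600.5, 1601.1, 1900.0)
)

window_s <- 10  # arbitrary bin width in seconds (assumption of this sketch)

# Assign each event to the start of its time bin
scrums$bin <- floor(scrums$start_time / window_s) * window_s

# Number of distinct devices with a scrum in each bin
devices_per_bin <- tapply(scrums$device_id, scrums$bin,
                          function(x) length(unique(x)))

# Bins where at least two devices recorded a scrum
shared <- devices_per_bin[devices_per_bin >= 2]
print(shared)
```

Fixed bins can split events that straddle a bin boundary, so a rolling window may be preferable for a real analysis.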


Generation 2 Efforts

If Gen2 Efforts are enabled for the account, you can extract them with the function ofCloudGetActivityEfforts(). You must specify an athlete and an activity (or a period if using ofCloudGetPeriodEfforts()), and you can also limit your selection to a specific time window using the from and to arguments. You can restrict the query to specific bands using the bands argument. This may be helpful if you only want higher band velocity efforts, or if you’d like to look at decelerations only. See the function help for more information.

Velocity:


# velocity efforts
vel_efforts <- ofCloudGetActivityEfforts(
           token, 
           athlete_id = device_info$athlete_id[1], 
           activity_id = activities$id[27], 
           velocityOrAcceleration = TRUE, 
           bands = 1:8)

rmarkdown::paged_table(vel_efforts)

Acceleration:


# acceleration efforts
accel_efforts <- ofCloudGetActivityEfforts(
             token, 
             athlete_id = device_info$athlete_id[1], 
             activity_id = activities$id[27], 
             velocityOrAcceleration = FALSE, 
             bands = -3:3)

rmarkdown::paged_table(accel_efforts)

These functions can be looped through all athletes and saved in a single nested dataframe using the approach demonstrated above for IMA Events.
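
As a simpler alternative to nesting, the results for all athletes can be stacked into one long data frame when every result has the same columns. The base R sketch below takes that approach. The helper collect_for_athletes() and the mock fetch function are hypothetical, and the mock's column names are invented for illustration.

```r
# Hypothetical helper: call `fetch` for each athlete and stack the results
collect_for_athletes <- function(athlete_ids, fetch) {
  results <- lapply(athlete_ids, function(id) {
    # Treat a failed request as "no data" instead of stopping the loop
    out <- tryCatch(fetch(id), error = function(e) NULL)
    if (is.null(out) || nrow(out) == 0) return(NULL)
    out$athlete_id <- id
    out
  })
  do.call(rbind, Filter(Negate(is.null), results))
}

# Mock fetch function with made-up columns; fails for one athlete
mock_fetch_efforts <- function(id) {
  if (id == "athlete-c") stop("no data for this athlete")
  data.frame(band = c(1, 2), start_time = c(10, 20))
}

efforts_demo <- collect_for_athletes(
  c("athlete-a", "athlete-b", "athlete-c"),
  mock_fetch_efforts
)
print(efforts_demo)

# Intended usage (not run here; requires catapultR and a valid token):
# fetch <- function(id) ofCloudGetActivityEfforts(
#   token, athlete_id = id, activity_id = activities$id[27],
#   velocityOrAcceleration = TRUE, bands = 1:8)
```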


Extracting and Analyzing 10 Hz Data

To get 10 Hz “Sensor Data” for a single athlete, the function ofCloudGetActivitySensorData() can be used. You must specify an athlete_id and activity_id for the 10 Hz data. You can use the function ofCloudGetAthleteDevicesInActivity() to get a dataframe of relevant athlete_id’s for each activity. We did this above to create the device_info dataframe.


Get 10 Hz data for Single Athlete Using ofCloudGetActivitySensorData()

We can access the 10 Hz data for a specified athlete_id and activity_id.


# get 10hz for a single athlete
sd <- ofCloudGetActivitySensorData(
     token, 
     athlete_id = device_info$athlete_id[1], 
     activity_id = activities$id[27])

glimpse(sd)

# look at middle of file
sd %>% 
  unnest(cols = "data") %>% 
  slice(2000:2100) %>% 
  rmarkdown::paged_table() 

Rows: 1
Columns: 11
$ athlete_id         <chr> "7a62b7e1-401f-4282-b309-04ae3e459db3"
$ device_id          <int> 5074
$ player_id          <chr> ""
$ athlete_first_name <chr> "Adrian"
$ athlete_last_name  <chr> "Tafoya"
$ jersey             <chr> "CH5"
$ team_id            <chr> "75054b55-9900-11e3-b9b6-22000af8166b"
$ team_name          <chr> "Thunder"
$ stream_type        <chr> "gps"
$ data               <list> [<data.frame[7079 x 13]>]
$ activity_id        <chr> "ia65ocdu-zlut-ru7h-z5dj-xteapn7wcybg"


Get 10 Hz data for Multiple Athletes

We might be interested in comparing load accumulation for all athletes over the course of a game or training session. For this, we’ll need the sensor data for every athlete of interest in the activity. The methodology is similar to that used for the event data above.

For this example, I have limited the data to four athletes (rows 5 through 8 of device_info).


sd_data_list <- list()

for (i in 5:8) { # just select four athletes (rows 5 to 8)
  
  sd_data <- ofCloudGetActivitySensorData(
                token, 
                athlete_id = device_info$athlete_id[i], 
                activity_id = activities$id[27]) %>% 
    unnest(cols = "data")
  
  sd_data_list[[i]] <- sd_data
}

all_sd_data <- bind_rows(sd_data_list) %>% 
  group_by(device_id) %>% 
  arrange(ts, cs) %>% 
  mutate(time_unix = ts + cs/100,
         time_seconds = time_unix - min(time_unix),
         time_minutes = time_seconds/60)

all_sd_data %>%  
  ggplot(aes(x = time_minutes, y = pl, 
             group = as.factor(device_id), 
             color = as.factor(device_id))) +
  geom_line() +
  theme_minimal() +
  theme(legend.title = element_blank()) +
  labs(title = "Cumulative PlayerLoad",
       x = "Elapsed Time (Minutes)",
       y = "")


We can see from the plot that athletes accumulate PlayerLoad at different rates throughout the session.
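
To put numbers on those rates, the cumulative trace can be turned into load accumulated per minute. The sketch below uses made-up 10 Hz data for one device with the same ts, cs, and pl column names as the code above. The accumulation rates are arbitrary, and pl is assumed to be cumulative.

```r
# Made-up 10 Hz sample: 3 minutes of data for one device
n  <- 3 * 60 * 10
ts <- 1585302081 + (0:(n - 1)) %/% 10  # whole seconds
cs <- ((0:(n - 1)) %% 10) * 10         # centiseconds: 0, 10, ..., 90
time_unix <- ts + cs / 100

# Cumulative PlayerLoad rising at a different (arbitrary) rate each minute
rate_per_sample <- rep(c(0.010, 0.020, 0.005), each = 600)
pl <- cumsum(rate_per_sample)

# Elapsed minutes and the whole-minute bin of each sample
elapsed_min <- (time_unix - min(time_unix)) / 60
minute_bin  <- floor(elapsed_min)

# Cumulative load at the end of each minute, then the increase per minute
pl_end     <- tapply(pl, minute_bin, max)
pl_per_min <- diff(c(0, pl_end))
print(round(pl_per_min, 2))

# The median spacing between samples should be about 0.1 s at 10 Hz
print(round(median(diff(time_unix)), 3))
```

Taking the maximum within each bin works here because a cumulative value never decreases.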

100 Hz data is likely to be available in the future as well.


Other Resources

The demonstrations in this document only account for a small fraction of what is possible with the catapultR package. For more information about the package and functions, see the package documentation.


Citation

For attribution, please cite this work as

Sandler & Hart (2020, Feb. 20). catapultR: CatapultR Demo. Retrieved from http://catapultr.catapultsports.com/posts/2020-02-20-catapultr-demo/

BibTeX citation

@misc{sandler2020catapultr,
  author = {Sandler, Sergey and Hart, Brian},
  title = {catapultR: CatapultR Demo},
  url = {http://catapultr.catapultsports.com/posts/2020-02-20-catapultr-demo/},
  year = {2020}
}