Each function in the catapultR package has a very specific purpose. For this reason, it is often necessary to use several of them in order to access all desired data. This post introduces an alternative approach using an external function that serves as a wrapper for many commonly used catapultR functions.
Difficulty Level: Beginner
Due to the narrow scope of each individual catapultR function, access to a complete set of data often requires combining outputs from several functions. This is intentional because it helps the user better understand the organization of Catapult data in the cloud. The specificity also allows each individual function more flexibility. However, for many common data queries, it would be convenient to have a single function that could be used to access and join the data. The image below contains a simplified representation of the OpenField data model accessible through the Connect API.
This blog post introduces a function (and the thought process behind it) that aims to simplify the process for accessing data. It is not meant as a replacement for all requests, and it is recommended that users customize the logic to meet their specific needs.
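To make the idea of combining outputs concrete, here is a minimal sketch in base R. The three data frames are mock stand-ins for what separate API calls might return; the column names (activity_id, athlete_id, and so on) are assumptions chosen for illustration, not the actual catapultR output schema.

```r
# Mock stand-ins for the outputs of separate API calls.
# All column names here are hypothetical, chosen only for illustration.
activities <- data.frame(
  activity_id   = c("a1", "a2"),
  activity_name = c("Training", "Match"),
  stringsAsFactors = FALSE
)

athletes <- data.frame(
  activity_id  = c("a1", "a1", "a2"),
  athlete_id   = c("p1", "p2", "p1"),
  athlete_name = c("Player 1", "Player 2", "Player 1"),
  stringsAsFactors = FALSE
)

stats <- data.frame(
  activity_id       = c("a1", "a1", "a2"),
  athlete_id        = c("p1", "p2", "p1"),
  total_player_load = c(410.2, 388.7, 512.9),
  stringsAsFactors = FALSE
)

# Join step by step on the shared keys: first activities to athletes,
# then the per-athlete statistics.
joined <- merge(activities, athletes, by = "activity_id")
joined <- merge(joined, stats, by = c("activity_id", "athlete_id"))
```

A wrapper function hides exactly this kind of repeated join, which is why it is convenient for common queries but less flexible than calling the individual functions.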
To begin, a more clearly defined set of requirements is useful. Ultimately, we want a function that:
The function will combine ofCloudGetActivities(), ofCloudGetAthleteDevicesInActivity(), ofCloudGetStatistics(), ofCloudGetActivityEvents(), ofCloudGetActivityEfforts(), and ofCloudGetActivitySensorData() into a single function.
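One way such a wrapper could route a single argument to several underlying calls is sketched below. This is not the code from the downloadable script: get_data_sketch(), the fetchers list, and every data_type value other than "activities_only" are hypothetical, and the stub functions stand in for the real catapultR calls so the example runs without an account.

```r
# Hypothetical sketch: route one data_type argument to one of several fetchers.
# `fetchers` is a named list of functions standing in for the catapultR calls.
get_data_sketch <- function(data_type = c("activities_only", "stats", "events",
                                          "efforts", "sensor"),
                            fetchers, ...) {
  # Restrict data_type to the allowed choices listed in the default.
  data_type <- match.arg(data_type)
  fetch <- fetchers[[data_type]]
  if (!is.function(fetch)) {
    stop("No fetcher registered for data_type = '", data_type, "'")
  }
  fetch(...)
}

# Stub fetchers that return small data frames instead of calling the API.
stub_fetchers <- list(
  activities_only = function(...) data.frame(activity_id = c("a1", "a2")),
  stats = function(...) data.frame(activity_id = "a1", total_player_load = 410.2)
)

result <- get_data_sketch("activities_only", fetchers = stub_fetchers)
```

Passing the fetchers in as a list keeps the dispatch logic separate from the API calls, which makes it easier to customize or test.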
In general, the function will use the following logic:
Function Arguments: The get_cat_data() function takes several arguments, including the token returned by ofCloudGetToken() and a vector of parameter slugs (e.g., c("total_player_load", "max_vel")); these parameters can be found in the slug column from ofCloudGetParameters().
For the actual code used to create this function, you can download the R script file below. Feel free to alter the code to better meet your needs. The function is not part of the catapultR package and is intended only as an example; it does not meet production-level standards for error handling or efficiency.
We will use an anonymized rugby account for a few examples. The first step is always to log in. We will also load a few additional packages and source the R script for the function. You will see that we can use the function get_cat_data() to accomplish almost everything by changing a few of the arguments.
To keep from printing excessively large tables in this post, all dataframes will be limited to the first 100 rows.
You will see in the examples below that the get_cat_data() function provides several helpful messages. There are also built-in progress bars, since many of the queries take a while to complete.
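As a rough illustration of how a progress bar and a query counter can be built with base R, consider the sketch below. fetch_with_progress() and fetch_one are hypothetical names; the stub passed as fetch_one stands in for a real API call, and the actual script may implement this differently.

```r
# Hypothetical sketch: loop over items with a progress bar and a query counter.
# `fetch_one` stands in for whatever API call is made per item.
fetch_with_progress <- function(ids, fetch_one) {
  stopifnot(length(ids) > 0)  # txtProgressBar needs max > min
  n_queries <- 0
  results <- vector("list", length(ids))
  pb <- utils::txtProgressBar(min = 0, max = length(ids), style = 3)
  for (i in seq_along(ids)) {
    results[[i]] <- fetch_one(ids[[i]])
    n_queries <- n_queries + 1
    utils::setTxtProgressBar(pb, i)
  }
  close(pb)
  # Report how many calls were made once the loop finishes.
  print(paste("Total API Query Count:", n_queries))
  do.call(rbind, results)
}

demo <- fetch_with_progress(
  ids = c("a1", "a2", "a3"),
  fetch_one = function(id) data.frame(activity_id = id, stringsAsFactors = FALSE)
)
```

Counting queries alongside the progress bar gives a quick sense of how expensive a request was, which is useful when deciding how wide a date range to ask for.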
Login:
# first, load a few packages
library(catapultR)
library(tidyverse)
library(lubridate)
source("one_func_to_rule_them_all.R")
token <- ofCloudCreateToken(sToken = "8SyAJZonGYUH5adDgINl1re7WkOPXhB0EtbVzcMp", sRegion = "America")
Now that we’re logged in, we can see what activities are available.
Get Activities: Below, we will get activities for a specified date range with the data_type argument set to "activities_only".
activities <- get_cat_data(credentials = token,
                           data_type = "activities_only",
                           from = "2021-01-01", to = Sys.Date())
rmarkdown::paged_table(head(activities, 100))
[1] "Total API Query Count: 1"
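The call above passes from as a character string and to as a Date, so a wrapper has to accept both forms. The sketch below shows one way to normalise and validate such a date range in base R. as_query_date() and date_range_seconds() are hypothetical helpers, and the conversion to epoch seconds is an assumption about what an underlying API might expect, not a statement about catapultR.

```r
# Hypothetical helper: accept a "YYYY-MM-DD" string or a Date and return a Date.
as_query_date <- function(x) {
  d <- as.Date(x)
  if (length(d) != 1 || is.na(d)) stop("Expected a single valid date")
  d
}

# Hypothetical helper: validate a from/to pair and convert to epoch seconds (UTC).
# Whether the underlying API expects epoch seconds is an assumption here.
date_range_seconds <- function(from, to = Sys.Date()) {
  from <- as_query_date(from)
  to <- as_query_date(to)
  if (from > to) stop("`from` must not be later than `to`")
  c(
    from = as.numeric(as.POSIXct(format(from), tz = "UTC")),
    to   = as.numeric(as.POSIXct(format(to), tz = "UTC"))
  )
}

rng <- date_range_seconds("2021-01-01", "2021-01-02")
```

Checking the range up front means a reversed or malformed date fails immediately instead of after several slow queries.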