Coaching meets coding: Two passions, one project
I hope you'll find this interesting, but no promises... At least you tech-savvy people out there will enjoy this.
COACHING MEETS CODING <3
For a long time, my weekly process of checking in on my athletes’ training involved going to Strava, searching for each athlete, and scrolling through their feed to see what they’d done for the week. I found this process tedious and saw the benefit of automating it to get my athletes’ weeks in one consolidated location, saving me precious time in my busy weeks. Plus, I was looking for something to do to develop my coding skills, so this was the perfect excuse. Thankfully, I was well aware of Strava’s API (Application Programming Interface), which would enable me to pull my athletes’ activities from Strava in whatever programming language I desired. [1]
The first thing that I had to do was choose my programming language. After doing some research, I decided to use Google Sheets to hold the activity data, using its built-in, JavaScript-based platform, Google Apps Script, as my development backend to handle the data extraction, processing, and injection into sheets from within a spreadsheet. Google Sheets would serve as my database, holding all of my athletes’ runs in one sheet from which I can do whatever I please.
After choosing this development path and creating an initial spreadsheet, the next step was to start the development process. The first thing I had to do was determine how to use the Strava API to extract my athletes’ data. This took quite some time given that the process differs from one development platform to the next, and frankly, not many people online went down the route of using Google Apps Script. However, I was lucky enough to come across this startup guide from Ben Collins, which really got things rolling for me. Using his guide, I was eventually able to grab data on my own activities, which is supported by Strava’s API.
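For the curious, here is a minimal sketch of what a request for your own activities can look like. It is not the code from Ben Collins's guide or from my repository; the helper name `buildActivitiesRequest` and the placeholder token are made up for illustration. It assumes you already hold a valid access token and targets Strava's `/athlete/activities` endpoint with its `page` and `per_page` parameters.

```javascript
// Base URL of Strava's v3 API.
const STRAVA_API_BASE = "https://www.strava.com/api/v3";

/**
 * Builds the URL and request options for one page of the logged-in
 * athlete's activities. The function name is hypothetical, and
 * accessToken must be a real OAuth access token at run time.
 */
function buildActivitiesRequest(accessToken, page, perPage) {
  const query =
    "page=" + encodeURIComponent(page) +
    "&per_page=" + encodeURIComponent(perPage);
  return {
    url: STRAVA_API_BASE + "/athlete/activities?" + query,
    options: {
      method: "get",
      headers: { Authorization: "Bearer " + accessToken },
    },
  };
}

// In Apps Script, the request could then be sent along these lines:
//   const req = buildActivitiesRequest(token, 1, 50);
//   const response = UrlFetchApp.fetch(req.url, req.options);
//   const activities = JSON.parse(response.getContentText());
const exampleRequest = buildActivitiesRequest("<my_access_token>", 1, 50);
console.log(exampleRequest.url);
```

Keeping the URL building separate from the network call makes it easy to check the request by eye before spending any of your API rate limit.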
Next up: Figure out how to grab my athletes’ data. The problem here was that Strava’s API did not explicitly support this functionality. They made this difficult to accomplish, seemingly only providing guidance on how to use the API for your own activities. I believe this was intentional, probably for liability/privacy reasons. LAME! This lack of clarity plus the rather vague API documentation made things difficult. I had to do a lot of digging through forums (welcome to the world of coding) and dropping questions under the Strava API Developer Discussions board. THANKFULLY, I figured it out. It ended up being quite simple. I sent each of my athletes a link; they’d click it, log in to their account, and click “Authorize” to give my API Application permission to grab their activities. For those curious, the link looked like this (but with my client ID where you see “<my_client_id>”):
https://www.strava.com/oauth/authorize?client_id=<my_client_id>&response_type=code&redirect_uri=http://localhost/exchange_token&approval_prompt=force&scope=activity:read_all
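If you want to build that link in code, a sketch could look like the one below. The function names are hypothetical and this is not necessarily how my repository does it. The second helper goes one step past what I described above: under Strava's documented OAuth flow, once an athlete clicks “Authorize,” their browser is redirected to the `redirect_uri` with a `code` parameter, and that code is then exchanged for tokens with a POST to `https://www.strava.com/oauth/token`. The client secret and code values are placeholders.

```javascript
/**
 * Builds the authorization link to send to an athlete.
 * Values are URL-encoded, so the result looks slightly different from
 * the hand-written link above but carries the same parameters.
 */
function buildAuthorizeUrl(clientId, redirectUri, scope) {
  const params = {
    client_id: clientId,
    response_type: "code",
    redirect_uri: redirectUri,
    approval_prompt: "force",
    scope: scope,
  };
  const query = Object.keys(params)
    .map(function (key) {
      return key + "=" + encodeURIComponent(params[key]);
    })
    .join("&");
  return "https://www.strava.com/oauth/authorize?" + query;
}

/**
 * Builds the form fields for exchanging the one-time authorization code
 * (taken from the redirect URL) for access and refresh tokens.
 */
function buildTokenExchangePayload(clientId, clientSecret, code) {
  return {
    client_id: clientId,
    client_secret: clientSecret,
    code: code,
    grant_type: "authorization_code",
  };
}

// In Apps Script, the exchange could be sent along these lines:
//   UrlFetchApp.fetch("https://www.strava.com/oauth/token", {
//     method: "post",
//     payload: buildTokenExchangePayload(id, secret, code),
//   });
console.log(
  buildAuthorizeUrl("<my_client_id>", "http://localhost/exchange_token", "activity:read_all")
);
```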
BOOM! I could grab my goons’ activities (yes, I call them my goons). Now it was time for the fun stuff. The first thing I wanted was to have all of my athletes’ runs in one sheet, which I literally called “GOONS.” Thanks to Ben Collins’s resource (linked above) and the Strava API’s getLoggedInAthleteActivities and getActivityById endpoints [2], my sheet now contains the following columns:
ATHLETE - The athlete to whom the activity belongs.
ACTIVITY ID - The identifier (ID) of the activity extracted.
RUN - The name (i.e. title).
MOVING TIME - The moving time.
DISTANCE - The distance (in miles).
PACE - The pace (in minutes/mile).
FULL DATE - The date (e.g. 4/12/2023).
TIME - The time (e.g. 4:50:43 PM).
DAY - The weekday (e.g. MON, TUE, …).
MONTH - The month (e.g. 1 = January, 2 = February, …).
DATE - The day of the month (e.g. 12).
YEAR - The year (e.g. 2023).
SPM AVG - The average number of strides per minute (i.e. cadence).
HR AVG - The average heart rate (in beats/minute).
WKT TYPE - The run type (0 = None, 1 = Race, 2 = Long run, 3 = Workout).
DESCRIPTION - The description, or caption.
TOTAL ELEV GAIN - The total elevation gain (in meters).
MANUAL - Whether the run was manual or not.
MAX SPEED - The highest speed (in meters/second).
CALORIES - The number of calories burned.
ACHIEVEMENT COUNT - The number of achievements gained (e.g. Strava PRs).
KUDOS COUNT - The number of kudos received.
COMMENT COUNT - The number of comments received.
ATHLETE COUNT - The number of (identified) athletes run with.
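To give a feel for how a Strava activity turns into one of these rows, here is a hedged sketch covering a subset of the columns. The field names on `activity` (such as `moving_time`, `distance`, and `workout_type`) follow Strava's activity model, where distance comes in meters and moving time in seconds. The helper names and the two-decimal rounding are my assumptions for illustration, not a copy of the code in my repository.

```javascript
// Meters in one statute mile, used to convert Strava's metric distances.
const METERS_PER_MILE = 1609.344;

// Rounds to two decimal places for display in the sheet.
function round2(value) {
  return Math.round(value * 100) / 100;
}

// Returns the value, or null when Strava omitted the field.
function orNull(value) {
  return value == null ? null : value;
}

/**
 * Maps one Strava activity object to a subset of the "GOONS" columns.
 * athleteName is supplied by the caller because each athlete's
 * activities are fetched with that athlete's own token.
 */
function activityToRow(athleteName, activity) {
  const miles = activity.distance / METERS_PER_MILE;
  const minutesPerMile = miles > 0 ? activity.moving_time / 60 / miles : null;
  return {
    "ATHLETE": athleteName,
    "ACTIVITY ID": activity.id,
    "RUN": activity.name,
    "MOVING TIME": activity.moving_time, // seconds
    "DISTANCE": round2(miles), // miles
    "PACE": minutesPerMile === null ? null : round2(minutesPerMile), // minutes/mile
    "SPM AVG": orNull(activity.average_cadence),
    "HR AVG": orNull(activity.average_heartrate),
    "WKT TYPE": activity.workout_type == null ? 0 : activity.workout_type,
    "DESCRIPTION": orNull(activity.description),
    "TOTAL ELEV GAIN": orNull(activity.total_elevation_gain), // meters
    "MANUAL": Boolean(activity.manual),
    "KUDOS COUNT": orNull(activity.kudos_count),
  };
}

// Example with made-up values: a 5-mile run in 40 minutes.
const exampleRow = activityToRow("Example Goon", {
  id: 1,
  name: "Morning Run",
  moving_time: 2400,
  distance: 8046.72,
  workout_type: 2,
});
console.log(exampleRow["DISTANCE"], exampleRow["PACE"]);
```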
You can check out this mocked sheet to see what the “GOONS” sheet looks like. It contains my athletes’ runs from Wednesday, April 12, 2023, all the way back to January 2022.
One of the most annoying, head-bang-inducing moments of this data acquisition and interpretation process was dealing with the time-based (MOVING TIME, PACE, TIME) and date-based values (e.g. FULL DATE). Turns out, JavaScript kind of sucks REALLY bad at this. There’s no particularly easy way of doing it. I digress…
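For anyone fighting the same battle, here is a small sketch of the kind of helpers that take the sting out of it. These are illustrative, not the exact functions from my repository. The date helper assumes the timestamp arrives as a local-time string shaped like "2023-04-12T16:50:43Z" (which is how I understand Strava's `start_date_local` field), and it reads the pieces straight out of the string so that no time zone conversion sneaks in.

```javascript
// Left-pads a number to two digits, e.g. 5 -> "05".
function pad2(n) {
  return (n < 10 ? "0" : "") + n;
}

// Formats a duration in seconds as h:mm:ss.
function formatDuration(totalSeconds) {
  const s = Math.round(totalSeconds);
  const hours = Math.floor(s / 3600);
  const minutes = Math.floor((s % 3600) / 60);
  return hours + ":" + pad2(minutes) + ":" + pad2(s % 60);
}

// Formats pace as m:ss per mile from moving time (seconds) and miles.
function formatPace(movingSeconds, miles) {
  if (miles <= 0) return "";
  const secondsPerMile = Math.round(movingSeconds / miles);
  return Math.floor(secondsPerMile / 60) + ":" + pad2(secondsPerMile % 60);
}

// Splits a local timestamp string into the date-based columns.
function splitLocalDate(startDateLocal) {
  const m = /^(\d{4})-(\d{2})-(\d{2})T/.exec(startDateLocal);
  if (!m) return null;
  const year = Number(m[1]);
  const month = Number(m[2]);
  const date = Number(m[3]);
  const weekdays = ["SUN", "MON", "TUE", "WED", "THU", "FRI", "SAT"];
  // Date.UTC avoids any shift from the script's own time zone.
  const day = weekdays[new Date(Date.UTC(year, month - 1, date)).getUTCDay()];
  return {
    fullDate: month + "/" + date + "/" + year,
    day: day,
    month: month,
    date: date,
    year: year,
  };
}

console.log(formatDuration(17443)); // 4:50:43
console.log(formatPace(2400, 5)); // 8:00
console.log(splitLocalDate("2023-04-12T16:50:43Z").day); // WED
```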
NICE! I have all of this JUICY data on my lovely goon squad. Keeping to my vision of simplifying the weekly athlete check-ins, I then used the data from this database-like sheet to create and populate individual sheets for each athlete. These individualized sheets (we’ll call them “athlete sheets”) contain the current week’s activities for each of my athletes. In this way, I don’t have to look through the mess that was my database-serving “GOONS” sheet.
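Picking out “the current week” can be sketched like this. I'm assuming weeks start on Monday (my training plan weeks do, e.g. the week of March 27, 2023) and that each run carries a hypothetical `isoDate` field in YYYY-MM-DD form. Both the field name and the function names are for illustration only.

```javascript
// Returns the Monday (YYYY-MM-DD) of the week containing isoDate.
function mondayOf(isoDate) {
  const parts = isoDate.split("-").map(Number);
  const d = new Date(Date.UTC(parts[0], parts[1] - 1, parts[2]));
  const daysSinceMonday = (d.getUTCDay() + 6) % 7; // MON -> 0, SUN -> 6
  d.setUTCDate(d.getUTCDate() - daysSinceMonday);
  return d.toISOString().slice(0, 10);
}

// Keeps only the runs that fall in the same Monday-to-Sunday week as isoDate.
function runsInSameWeek(runs, isoDate) {
  const weekStart = mondayOf(isoDate);
  return runs.filter(function (run) {
    return mondayOf(run.isoDate) === weekStart;
  });
}

// Example with made-up runs.
const sampleRuns = [
  { isoDate: "2023-04-09", title: "Sunday long run" },
  { isoDate: "2023-04-10", title: "Monday easy run" },
  { isoDate: "2023-04-12", title: "Wednesday workout" },
];
console.log(runsInSameWeek(sampleRuns, "2023-04-12").length); // 2
```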
I also implemented sections below the activities to summarize the athlete’s week and their workouts, the former containing information on their total mileage and time, number of runs, average mileage, time, and pace per run, and finally, long run distance and date. The workout section included information on the workout number (e.g. 1 for the first workout of the week), run ID, title, description, full date, day, and time.
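The weekly recap boils down to a handful of sums and averages. Here is a hedged sketch, assuming each run object has hypothetical `miles`, `movingSeconds`, and `date` fields. I'm also taking the “long run” to be the longest run of the week and the average pace to be total time divided by total mileage, which are simplifications for this example.

```javascript
/**
 * Summarizes one athlete's week from an array of runs.
 * Each run is assumed to look like { miles, movingSeconds, date }.
 */
function summarizeWeek(runs) {
  const summary = {
    runCount: runs.length,
    totalMiles: 0,
    totalSeconds: 0,
    avgMiles: 0,
    avgSeconds: 0,
    avgPaceSecondsPerMile: 0,
    longRunMiles: 0,
    longRunDate: null,
  };
  runs.forEach(function (run) {
    summary.totalMiles += run.miles;
    summary.totalSeconds += run.movingSeconds;
    // Track the longest run of the week as the "long run".
    if (run.miles > summary.longRunMiles) {
      summary.longRunMiles = run.miles;
      summary.longRunDate = run.date;
    }
  });
  if (runs.length > 0) {
    summary.avgMiles = summary.totalMiles / runs.length;
    summary.avgSeconds = summary.totalSeconds / runs.length;
  }
  if (summary.totalMiles > 0) {
    summary.avgPaceSecondsPerMile = summary.totalSeconds / summary.totalMiles;
  }
  return summary;
}

// Example with made-up runs.
const weekSummary = summarizeWeek([
  { miles: 5, movingSeconds: 2400, date: "4/10/2023" },
  { miles: 10, movingSeconds: 5100, date: "4/12/2023" },
]);
console.log(weekSummary.totalMiles, weekSummary.longRunDate); // 15 4/12/2023
```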
With these sheets in place, I then implemented additional code to consolidate all of the recap data from the athlete sheets into one last sheet, “GOONS RECAP.” And yes, that’s the name.
Boom shakalaka, baby!!! This was all that I (initially) set out to accomplish.
At the end of every week, this is now what my training check-in process looks like (thanks, automation):
Look at my “GOONS RECAP” sheet to get a general feel for how my athletes’ weeks went.
Navigate to each athlete’s sheet, grab the recap data, and paste it beside the corresponding week in their training plan.
You can navigate to the week of March 27, 2023, under this spreadsheet’s “PLAN” sheet to see what this looks like.
Compare each athlete’s recap data with what was planned for the given week, check in with each athlete via text [3], and make any necessary adjustments to the following week(s).
Hey, remember when I used the word “initially” above? That’s because I’ve had some new ideas since then. One of those ideas is a phrase that starts with “machine” and ends with “learning.” I sure hope you guessed it: machine learning (ML), baby. I will save this for another post, but to entertain the mind, I’ll share a couple of my ideas:
Utilize the data from the “GOONS” sheet to produce an ML model that can predict an athlete’s race performance based on their training.
Thoughts/Concerns:
There is a dependency on the athlete to make sure they’re marking their workouts as workouts in Strava. However, I can combat this by adding code to look at each activity and determine, from the data, whether we have a workout or not. This could make incorrect classifications, though.
In order to produce an effective race-predictive model, there must be plenty of race data for each athlete. Without this, our model will epically fail and (essentially) have zero credibility. This is because, without race data, all of the training data is rendered useless. If an athlete solely trains and never races, how could we train a model—on that athlete’s data—to be proficient in predicting race times? We can’t!
Utilize the data from the “GOONS” sheet to generate an ML model capable of approximating an athlete’s current fitness level based on their past runs.
Thoughts/Concerns:
With the currently existing columns (e.g. “MOVING TIME”), referred to as “predictors” in ML, can we produce an effective model? We will likely need to create additional columns within the Python code from the existing data. This could mean columns like…
Fitness rating (see below)
Number of runs from the last seven days
Number of days run from the last seven days
Total mileage from the last seven days
Average pace from the last seven days
Average heart rate from the last seven days
Total number of workouts from the last seven days
How do we define “fitness”? This is something I’ll need to pin down in quantifiable terms in order to accomplish this. I suspect this will get rather messy, as there are many factors to fitness.
My current thought, subject to change, is that I can create a fitness rating system around distance, pace, heart rate average, average cadence (or strides per minute), workout type (easy run, long run, workout, race), total elevation gain, and the above statistics (e.g. number of runs from the last seven days to the total number of workouts from the last seven days).
Sorry Strava, but whatever ML model you’re using for the fitness growth subscription feature is complete bologna… I would, however, be curious as to the data utilized for that feature. I can imagine they came across MANY complications; I’ll probably come across a lot of the same ones…
Assuming success, this ML model’s resulting “fitness rating” metric (attached to each activity) could certainly be used by the first ML model (see idea number one above) in predicting race performance.
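To make the “last seven days” columns from idea number two concrete, here is a rough sketch of how they could be computed for a given date. I mentioned Python for the ML code, but this example stays in JavaScript to match the Apps Script side of the project. The field names (`isoDate`, `miles`, `movingSeconds`, `avgHeartRate`, `workoutType`) are hypothetical, the window is taken as the given date plus the six days before it, and the fitness rating is left out since I haven't defined it yet.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Converts YYYY-MM-DD to a UTC timestamp so day arithmetic is exact.
function toUtcMs(isoDate) {
  const p = isoDate.split("-").map(Number);
  return Date.UTC(p[0], p[1] - 1, p[2]);
}

/**
 * Computes rolling features over the seven days ending on isoDate.
 * workoutType 3 is treated as a workout, matching the WKT TYPE column.
 */
function lastSevenDayFeatures(runs, isoDate) {
  const end = toUtcMs(isoDate);
  const start = end - 6 * MS_PER_DAY;
  const recent = runs.filter(function (run) {
    const t = toUtcMs(run.isoDate);
    return t >= start && t <= end;
  });
  const daysSeen = {};
  let miles = 0, seconds = 0, workouts = 0, hrSum = 0, hrCount = 0;
  recent.forEach(function (run) {
    daysSeen[run.isoDate] = true;
    miles += run.miles;
    seconds += run.movingSeconds;
    if (run.workoutType === 3) workouts += 1;
    // Some runs have no heart rate data, so average only those that do.
    if (run.avgHeartRate != null) {
      hrSum += run.avgHeartRate;
      hrCount += 1;
    }
  });
  return {
    runCount: recent.length,
    daysRun: Object.keys(daysSeen).length,
    totalMiles: miles,
    avgPaceSecondsPerMile: miles > 0 ? seconds / miles : null,
    avgHeartRate: hrCount > 0 ? hrSum / hrCount : null,
    workoutCount: workouts,
  };
}

// Example with made-up runs; the April 5 run falls outside the window.
const sampleHistory = [
  { isoDate: "2023-04-12", miles: 5, movingSeconds: 2400, avgHeartRate: 150, workoutType: 0 },
  { isoDate: "2023-04-12", miles: 3, movingSeconds: 1600, avgHeartRate: null, workoutType: 0 },
  { isoDate: "2023-04-08", miles: 8, movingSeconds: 3600, avgHeartRate: 160, workoutType: 3 },
  { isoDate: "2023-04-05", miles: 10, movingSeconds: 5000, avgHeartRate: 155, workoutType: 2 },
];
console.log(lastSevenDayFeatures(sampleHistory, "2023-04-12"));
```

Averaging the per-run heart rates is a simplification; weighting each run by its moving time would likely be closer to a true seven-day average.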
Because I need a LOT of data to make this work, I may be asking additional Strava athletes to volunteer to authorize my application so that I can grab their runs, too, and use them in producing the ML models discussed above. If you have made it this far (I’d be surprised, lol) and would be willing, please drop a message in the comments and/or email me (see contact information below). Of course, you’d be one of the first to be provided these services (hehe).
If you’d like to see the code for yourself (including my in-progress ML initiative), here is my GitHub repository. Feel free to use it to serve a similar purpose if you are a coach, or simply if you’re interested. If you run into any problems during the development process, add an issue under the “Issues” tab and/or reach out to me and I will do my best to address it.
If you have further questions, please ask me of course! See my contact information at the bottom of this post; I would love to help.
Next up on All Things Running… Don’t get caught up in the present: See the bigger picture…
SUPPORT ME
Follow me on my other media:
If you want personalized coaching and/or training advice, I’d be happy to help. You can email me at jacobreesmontgomery@gmail.com or send me a direct message on Instagram.
[1] An API is essentially a contract between two parties, one party being the provider of the services (Strava) and the other being a user (e.g. a developer like myself).
[2] An API endpoint, according to SmartBear, is “[t]he place that APIs send requests and where the resource lives.” In this case, the request would be my ask for data on my athletes, and the resource would be the location of that Strava data. For an API user such as myself, the endpoint is where we make our requests, and from it we eventually get a response back with our requested information. Without the endpoint, there would be no data extraction.
[3] This is a very important element of my coaching that I’ve discussed in a previous post.