Empirical Cycling Community Notes

Ten Minute Tips 18: Metrics Are Not Fitness


Training for Performance, Not Just the Metric: A Deep Dive

Introduction: The Core Problem

The central theme of the episode is the distinction between training for performance and training to the metric. It addresses a common pitfall in modern, data-driven endurance sport: athletes and coaches become fixated on improving numbers in a software model, sometimes at the expense of real-world competitive ability.

Research Goal vs. Athlete Goal: In a scientific lab setting, the goal is often to isolate a variable and measure a specific metric’s response to an intervention (e.g., does a new training protocol increase VO2 max?). For an athlete, the goal is holistic and performance-based: to win a race, which involves a complex interplay of physiology, tactics, and psychology that cannot be captured by a single number.
“Training to the Metric”: This describes a training focus where the primary goal becomes increasing a specific data point, such as Functional Threshold Power (FTP) or Functional Reserve Capacity (FRC), rather than improving the physiological capabilities needed for a specific event.
The Ruler Analogy: The podcast uses a powerful analogy: asking someone to measure the distance from Washington D.C. to Boston with a one-foot ruler. The tool is completely inappropriate for the task. Similarly, a metric might be a valid measurement tool, but it can be misapplied or misunderstood, leading to incorrect conclusions about fitness.

1. The “Teaching to the Test” Trap: A Case Study on FTP

One of the most common examples of training to the metric is “teaching to the test” with FTP assessments, particularly shortcut protocols such as the 20-minute test or a ramp test.

The Interplay of Energy Systems

An effort like a 20-minute test is not a pure measure of aerobic fitness. It includes a significant anaerobic contribution. Your body is working above its maximal sustainable aerobic state, and the difference is made up by your finite anaerobic energy reserves.

Scenario A: Gaming the Test: An athlete can focus exclusively on high-intensity anaerobic training for several weeks. This will increase their ability to produce power anaerobically. When they re-test their 20-minute power, the number will likely go up. However, this increase comes from a larger anaerobic contribution, not an improvement in their sustainable aerobic power (their actual FTP). The test score improved, but their core aerobic fitness may have stagnated or even declined.
Scenario B: Genuine Improvement: An athlete focuses on a large volume of aerobic training (e.g., sweet spot, threshold, long endurance rides) with minimal anaerobic work. If their 20-minute power increases, it’s highly probable that this reflects a genuine improvement in their aerobic system’s capabilities.

Key Takeaway: The context of the training leading up to a test is crucial for interpreting the results. Without that context, a simple number can be misleading.
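
To make the two scenarios concrete, here is a minimal sketch. It assumes the simple two-parameter critical power relationship, P(t) = CP + W'/t, as a stand-in for the aerobic/anaerobic split; the podcast does not prescribe this formula, and every number below is invented for illustration. The function name modeled_power is hypothetical.

```python
# Sketch under stated assumptions: the 2-parameter critical power model,
# P(t) = CP + W'/t. All athlete numbers are made up for illustration.

def modeled_power(cp_watts: float, w_prime_joules: float, duration_s: float) -> float:
    """Average power the model says can be held for duration_s seconds."""
    return cp_watts + w_prime_joules / duration_s

TEST_DURATION_S = 20 * 60  # a 20-minute test

# Baseline athlete: CP 300 W, anaerobic reserve 15 kJ.
baseline = modeled_power(300.0, 15_000.0, TEST_DURATION_S)

# Scenario A: anaerobic-only block. CP unchanged, reserve grows to 21 kJ.
scenario_a = modeled_power(300.0, 21_000.0, TEST_DURATION_S)

# Scenario B: aerobic block. CP rises by 5 W, reserve unchanged.
scenario_b = modeled_power(305.0, 15_000.0, TEST_DURATION_S)

print(f"baseline 20-min power:   {baseline:.1f} W")
print(f"scenario A 20-min power: {scenario_a:.1f} W")
print(f"scenario B 20-min power: {scenario_b:.1f} W")
```

Under these assumed numbers both scenarios produce the same 5 W gain on the test, even though only Scenario B raised the sustainable aerobic power. The test result alone cannot tell them apart; the training history can.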

2. Demystifying Performance Models (WKO5, Critical Power)

Modern training software uses models to separate aerobic and anaerobic contributions to power. Understanding how these models work is essential to avoid misinterpreting the data.

Key Metrics:

Aerobic “Floor”: Represented by FTP (in WKO5) or Critical Power (CP). This is the modeled maximal sustainable power output from the aerobic system.
Anaerobic “Battery”: Represented by FRC (Functional Reserve Capacity, in kJ) or W’ (W prime, in kJ). This is the modeled amount of work that can be performed above the aerobic floor using anaerobic pathways.

The Inverse Relationship in the Model

These models are built on a fundamental mathematical relationship. In simple terms:

Total Power Output = Aerobic Power + Anaerobic Power

Because of this, when one component of the model goes up, the other often goes down to account for the same performance data.

The Example: An athlete performs a block of aerobic training. Their FTP increases from 330W to 350W. The model, observing this higher aerobic “floor,” now calculates that less anaerobic energy is needed to produce the same short-duration power outputs. Consequently, their FRC might drop from 17.5 kJ to 13 kJ.
The Reality: The athlete’s actual short-duration power (e.g., their best 30-second or 60-second effort) might be completely unchanged. They haven’t lost any physiological anaerobic power. The metric changed because the model re-balanced the aerobic/anaerobic contributions to explain the new, higher FTP.

Key Takeaway: A drop in FRC or W’ alongside a rise in FTP does not automatically mean a loss of anaerobic power. It is often an artifact of the model adjusting to a higher aerobic baseline. The true test is looking at raw power numbers for short durations.
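
A rough way to see this trade-off is to plug the example's numbers into the two-parameter critical power relationship, P(t) = CP + W'/t. This is an assumption made for illustration: WKO5 uses its own, more elaborate power-duration model, and treating FTP as CP and FRC as W' here is a simplification. The function name modeled_power is hypothetical.

```python
# Sketch under stated assumptions: 2-parameter critical power model,
# P(t) = CP + W'/t, with FTP standing in for CP and FRC for W'.
# The parameter values are the ones quoted in the example above.

def modeled_power(cp_watts: float, w_prime_joules: float, duration_s: float) -> float:
    """Average power the model says can be held for duration_s seconds."""
    return cp_watts + w_prime_joules / duration_s

cp_before, w_before = 330.0, 17_500.0  # 330 W, 17.5 kJ
cp_after, w_after = 350.0, 13_000.0    # 350 W, 13 kJ

# Duration at which both parameter pairs predict the same power:
# cp_before + w_before / t == cp_after + w_after / t
crossover_s = (w_before - w_after) / (cp_after - cp_before)

power_before = modeled_power(cp_before, w_before, crossover_s)
power_after = modeled_power(cp_after, w_after, crossover_s)
print(f"crossover duration: {crossover_s:.0f} s")
print(f"power before: {power_before:.1f} W, after: {power_after:.1f} W")

# At shorter durations the two parameter pairs disagree, so the fitted
# numbers alone say little about what the rider can actually do there.
short_before = modeled_power(cp_before, w_before, 60.0)
short_after = modeled_power(cp_after, w_after, 60.0)
print(f"60 s prediction before: {short_before:.0f} W, after: {short_after:.0f} W")
```

In this sketch a single effort of roughly 3 minutes 45 seconds is explained equally well by "330 W plus 17.5 kJ" and by "350 W plus 13 kJ". The predictions for a 60-second effort differ, which is why the raw short-duration power data, not the fitted reserve value, is the thing to check.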

3. The Danger of Averages: Two Critical Logical Fallacies

Much of our training “wisdom” comes from scientific studies that report group averages. Applying this group data to an individual can be a mistake due to two logical fallacies.

A. The Fallacy of Division

This fallacy occurs when you assume that what is true for the whole (the group average) must be true for all the parts (each individual).

Ramp Test Application: A study might find that, on average, a rider’s FTP is 75% of their peak 1-minute power in a ramp test. The fallacy is assuming that this 75% figure applies precisely to you. Due to individual variations in physiology, one athlete’s FTP might be 70% of their peak 1-minute power, while another’s might be 82%. Using the 75% average for both would overestimate the first athlete’s FTP and underestimate the second’s, leading to incorrect training zones for each.
Lactate Threshold Application: The concept of a lactate threshold at 4 millimoles per liter (mmol/L) is a classic example. This number is an average from pooled data of many athletes. The podcast references a study where the individual values for maximal lactate steady state ranged from 3.05 to 5.52 mmol/L. Applying a rigid 4.0 mmol/L target to every individual is a misapplication of the data.
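
The size of the ramp test error is easy to work out. The sketch below uses the 70%, 75%, and 82% figures from the example above with an assumed peak 1-minute power of 400 W; that wattage, the athlete labels, and the function name ftp_from_ramp are all hypothetical.

```python
# Sketch: how far off the group-average fraction can be for an individual.
# The 400 W peak and the athlete labels are illustrative assumptions.

def ftp_from_ramp(peak_1min_watts: float, fraction: float) -> float:
    """Estimate FTP as a fraction of peak 1-minute ramp test power."""
    return peak_1min_watts * fraction

PEAK_1MIN_WATTS = 400.0
GROUP_AVERAGE_FRACTION = 0.75

average_estimate = ftp_from_ramp(PEAK_1MIN_WATTS, GROUP_AVERAGE_FRACTION)

# Individual fractions taken from the example in the text.
individual_fractions = {"athlete_low": 0.70, "athlete_high": 0.82}

errors = {}
for name, fraction in individual_fractions.items():
    individual_ftp = ftp_from_ramp(PEAK_1MIN_WATTS, fraction)
    # Positive error: the average overestimates this athlete's FTP.
    errors[name] = average_estimate - individual_ftp
    print(f"{name}: individual FTP {individual_ftp:.0f} W, "
          f"average-based estimate off by {errors[name]:+.0f} W")
```

With these assumed numbers the group average sets one athlete's FTP about 20 W too high and the other's about 28 W too low, enough to shift every training zone derived from it.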

B. The Fallacy of Composition

This is the reverse of the Fallacy of Division. It occurs when you assume that what is true for one part (an individual) must be true for the whole (everyone else).

The “Pro Workout” Problem: An athlete sees their favorite professional cyclist post a specific, brutal workout on social media. The fallacy is believing that because this workout works for that elite individual, it will work for everyone. This ignores vast differences in genetic potential, training history (often decades long), recovery capacity, and lifestyle.
Individualization is Key: The podcast emphasizes that different athletes respond to different training stimuli. Some thrive on steady-state intervals, while others see better results from intermittent, high-intensity work. A successful coach or self-coached athlete discovers what works for the individual, rather than blindly copying what works for someone else.

4. From Diagnosis to Action: A Practical Framework

Instead of fixating on metrics, the podcast proposes a performance-first diagnostic framework. Analyze your race performance to identify your primary limiter, and then train to fix it.

Diagnostic Questions:

Scenario 1: “I get to the end of the race in the front group, but I get gapped in the final sprint. I just don’t have the ‘kick’.”
- Diagnosis: Your aerobic system is strong enough to get you to the finish, but you lack anaerobic power and capacity.
- Action: Prioritize training that targets the anaerobic system: sprints from a low speed, short maximal efforts (30-60 seconds), and neuromuscular training.
Scenario 2: “I feel strong at the start, but I run out of steam and get dropped before the critical moments of the race.”
- Diagnosis: Your anaerobic system might be fine for short bursts, but your aerobic engine (your endurance) is the limiting factor. Your “gas tank” is too small.
- Action: Prioritize training that builds your aerobic base: long endurance rides, sweet spot intervals, and threshold intervals.

The Role of Metrics: Once you have diagnosed the performance limiter, you can then select the appropriate metric to track progress.

If you’re working on your sprint, track your 5-second and 15-second peak power.
If you’re working on your aerobic engine, track your FTP or your power output for longer durations (30-60 minutes).
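
Peak power for a given duration can be read straight from a ride file without any model. The following is a minimal sketch, assuming power is recorded once per second with no gaps; the function name best_mean_power and the sample ride are hypothetical.

```python
# Sketch: best average power over a fixed window, assuming 1 Hz samples
# with no dropouts. The sample ride below is invented for illustration.
from itertools import accumulate

def best_mean_power(power_watts: list[float], window_s: int) -> float:
    """Highest average power over any window_s consecutive samples."""
    if window_s <= 0 or window_s > len(power_watts):
        raise ValueError("window_s must be between 1 and the number of samples")
    # Running totals let each window sum be computed with one subtraction.
    cumulative = [0.0, *accumulate(power_watts)]
    return max(
        (cumulative[start + window_s] - cumulative[start]) / window_s
        for start in range(len(power_watts) - window_s + 1)
    )

# One minute easy, a 5 s sprint at 900 W, 10 s at 600 W, one minute easy.
ride = [200.0] * 60 + [900.0] * 5 + [600.0] * 10 + [200.0] * 60

peak_5s = best_mean_power(ride, 5)
peak_15s = best_mean_power(ride, 15)
print(f"best 5 s power:  {peak_5s:.0f} W")
print(f"best 15 s power: {peak_15s:.0f} W")
```

Tracking these raw values over time avoids the model re-balancing issue described in section 2, because nothing is being fitted or inferred.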

This approach places the focus on real-world outcomes and uses data as a tool to validate that the training is working, rather than making the data the goal itself.