A summary of Birkie results from 1999 to 2016

This website reviews results from 1999 through 2016 of the marathon-length (~50-55k) American Birkebeiner cross country ski race, commonly referred to as the "Birkie". A large database containing 75,000+ individual results was assembled to address a number of interesting questions. For example, how does age affect performance? Cross country skiing may be unique among endurance sports in that excellent performance can be extended well into the sixth decade and beyond.

Other questions involve how classic and freestyle techniques compare historically, or whether relative performance of male and female skiers has changed over the years. There’s also the perennial question of whether wave assignments are done appropriately. The list is endless.

The data used in this website were not acquired directly from the American Birkebeiner Ski Foundation (ABSF), and the ABSF is in no manner responsible for the summaries provided by this website. Rather, this dataset was created by a lengthy and cumbersome process of downloading the publicly available PDF results files provided by the Birkie immediately after each race, starting in 1999. These PDF files were then converted to Microsoft Excel files using the software Enolsoft PDF Converter and eventually combined across years. The process of converting PDF to Excel files, while conceptually simple, is actually quite tedious because of the number of transcription errors that can occur. Extensive error-checking procedures were used to eliminate as many of these mistakes as possible, but there is no doubt that a few errors remain, for which I am entirely responsible.

The Birkie was cancelled in 2000 and shortened to an untimed event in 2007 (for all but the elite wave), so there are a total of 16 timed freestyle and classic events in this database. Results from the shorter Kortelopet race are not included. From 2002 on it was easy to distinguish between skiers using freestyle and classical techniques in the Birkie since the techniques were considered as separate entry classes. In 1999 and 2001, all skiers were timed and ranked together, but, fortunately, technique was noted in the printed results provided in the Birch Scroll.

Before looking at any performance-related issues, a fundamental question needed to be answered: what is the most interesting way to measure performance? That is, what metric should be used? Most people relate best to finish time, but there’s a problem with finish time. Birkie distances have varied over the years because the courses for both the classic and freestyle events have changed somewhat. Since some of the most interesting questions involve changes in performance over years, finish times as stated in yearly results cannot be compared directly. Performance comparisons could be based on race pace, which is directly comparable from year to year, but many of us don’t relate well to pace comparisons. When we talk to each other, do we ask what pace we skied the Birkie? Not likely. Rather, the question is, "What was your finish time?"

So race pace was used to standardize results to a finish time at a common distance - 55k for a classical race and 51k for a freestyle race, which are close to the stated distances for the current courses.
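As a rough illustration of that standardization, here is a minimal Python sketch. It assumes the adjustment is a simple rescaling (pace on the actual course multiplied by the common distance), which is one reading of the description above rather than a record of the exact calculation used. The function names and the 50k course length in the example are hypothetical.

```python
def standardized_finish_time(finish_seconds, course_km, standard_km):
    """Scale a finish time to a common distance, holding race pace constant."""
    if finish_seconds <= 0 or course_km <= 0 or standard_km <= 0:
        raise ValueError("times and distances must be positive")
    pace = finish_seconds / course_km  # seconds per kilometre on the actual course
    return pace * standard_km


def format_hms(total_seconds):
    """Render seconds as H:MM:SS, rounded to the nearest second."""
    seconds = round(total_seconds)
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Hypothetical example: a 3:00:00 freestyle finish on a course assumed to be
# 50k long, restated at the 51k common freestyle distance.
example = standardized_finish_time(3 * 3600, course_km=50.0, standard_km=51.0)
print(format_hms(example))  # 3:03:36
```

Because the pace is held fixed, a skier's standardized time moves in proportion to the distance ratio, so results from a short-course year and a long-course year land on the same scale.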

What do means mean?

Many of us do the annual exercise of asking, "What does my finish time have to be to get me into a faster wave?" Or perhaps we ask, "I'm in my upper 40s, so what does one more year of aging do to my finish time?" Our answers look something like, "Well, based on last year, I need to take 12 min. and 13 sec. off my finish time to make it to the next wave," or, "Over the last two years, I might have slowed down by a minute or so, but perhaps I could buy a more expensive fluoro topcoat to make up the difference next year."

Using finishing times from only one or two previous years is an unreliable way to model what has happened in the past because there is so much year-to-year variation. Recent conditions could have been unusually cold, warm, sloppy, snowy, or icy. Statisticians refer to these uncontrolled conditions as confounding factors. As every Birkie skier knows only too well, every year brings something different - confounding factors abound, and there is no good way to account for them. Therefore, models of performance should be based on a large number of individual observations collected over many years spanning all sorts of confounding conditions.

It’s nice to have 16 years of results from North America’s most popular cross country ski event because comparisons among age groups, between genders, or between techniques can be quite robust. One can argue with some precision that over the 16 years of the freestyle race, males in the 30-34 age group averaged about 2 min. 22 sec. faster than those in the 40-44 age group. After all, there were 4328 observations used to calculate the former mean and 6813 observations for the latter mean. With such large sample sizes we can reasonably explore how aging impacts performance.
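The kind of comparison described here, pooling many years of results and comparing age-group means, can be sketched in a few lines of Python. The five results below are made-up placeholders, not values from the database, and the helper names are hypothetical. The point is only to show the bookkeeping: assign each result to a 5-year age group, then report the count and mean for each group.

```python
from collections import defaultdict
from statistics import mean


def age_group(age, width=5):
    """Label a 5-year age group such as '30-34'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def group_means(results):
    """Return {age group: (number of observations, mean time in seconds)}."""
    times = defaultdict(list)
    for age, seconds in results:
        times[age_group(age)].append(seconds)
    return {group: (len(values), mean(values)) for group, values in times.items()}


# Hypothetical (age, standardized finish time in seconds) pairs pooled over years.
toy_results = [(31, 10500), (33, 11100), (34, 10800), (41, 10900), (44, 11300)]
summary = group_means(toy_results)
n_young, mean_young = summary["30-34"]
n_old, mean_old = summary["40-44"]
print(f"30-34: n={n_young}, 40-44: n={n_old}, difference={mean_old - mean_young:.0f} s")
```

Reporting the count next to each mean matters because the count shows how much weight a difference between two groups can bear.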

In the summaries presented on this website, results for men are often presented before those for women. It is easier to spot trends for men since there are usually more men in each category being examined, and the means are therefore estimated more precisely. For example, as mentioned above, there were 6813 men in the 40-44 age group of the freestyle race, while there were only 1296 women. This issue becomes even more critical for the older age groups. In the 70-74 age group, there have been only 9 women in 16 years, while there have been 258 men.
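To put rough numbers on this, recall that the standard error of a mean shrinks with the square root of the sample size. The sketch below compares the counts quoted above. It assumes, purely for illustration, that finish times are equally variable for men and women within an age group, which the text does not state. The function name is hypothetical.

```python
from math import sqrt


def relative_standard_error(n_small, n_large):
    """How many times larger the standard error of the mean is for the smaller
    group, assuming both groups have the same standard deviation."""
    if n_small <= 0 or n_large <= 0:
        raise ValueError("sample sizes must be positive")
    return sqrt(n_large / n_small)


# Counts quoted in the text (women first, then men).
print(round(relative_standard_error(1296, 6813), 2))  # 40-44 age group: 2.29
print(round(relative_standard_error(9, 258), 2))      # 70-74 age group: 5.35
```

Under that equal-spread assumption, the mean for women aged 40-44 is a bit more than twice as uncertain as the mean for men, and for the 70-74 group it is more than five times as uncertain.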

But means, even if made up of lots of observations, can sometimes be deceptive. What does the average human adult look like? There are about 7.4 billion people on the planet. If we could measure all adults, the mean height might be around 5 1/2 ft and the mean weight around 120 lbs, and these means are based on billions of observations. But one would also calculate with great precision that the human adult has, on average, one large breast and one testicle. So in some instances we need to drill down to variation within categories for more insight. This is done as necessary to highlight interesting observations.

Finally, this dataset is a combination of both longitudinal and cross-sectional data. It would be nice to separate out the two components, and perhaps this will be possible in the future. Longitudinal data would provide a better look at performance trends with aging, but there is no way to get at the longitudinal part since individual identities are not linked to each result. The Birkie files held by ABSF have this information, but it is not part of the dataset used for this website.