Grandma’s Marathon Women’s Participation History

You might not know that I’m currently a full-time master’s student studying marketing analytics and that I previously worked as a software engineer. I spent about two and a half years writing code in Ruby with a splash of Python and JS. For my master’s program, I’ve been learning R for data visualization, and my semester-long project looks at the impact of event sponsorship on participation rates for marathons.

While that’s a work in progress, I started looking at other races, specifically historical results, and was excited when I found that all the results for Grandma’s Marathon are available online.

Because I’m still getting comfortable with R, for now I just want to explore this data and see what trends I notice.

Process:

Anonymized participant names using this script
Manually filled in the sex of participants for the years with all-male participation (1978 and 1979)
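
For readers who want to do something similar, here is a minimal sketch of one way the anonymization step could look. It is not the script linked above: the `Name` and `Time` columns, the toy rows, and the `runner_0001` style of ID are all assumptions made for illustration.

```r
# Toy results table; the column names and values are made up for illustration.
raw_results <- data.frame(
  Name = c("Runner One", "Runner Two", "Runner One"),
  Time = c("3:10:05", "3:42:51", "4:01:17"),
  stringsAsFactors = FALSE
)

# Give each distinct name a sequential ID, so the same person keeps the same ID.
name_ids <- match(raw_results$Name, unique(raw_results$Name))

# Overwrite the names with the IDs and leave every other column untouched.
anonymized_results <- raw_results
anonymized_results$Name <- sprintf("runner_%04d", name_ids)

print(anonymized_results)
```

Sequential IDs are only one option. A salted hash would work too if the IDs need to stay stable across separate files.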

Where can I learn more about the first Grandma’s Marathon?

Oh, you’d like to learn about the woman who won in 1977? Here you go

Some basic setup

library(dplyr)    # provides %>%, mutate(), filter()
library(ggplot2)  # provides the plotting functions used below

# grandmas_directory (path to the CSV folder, ending in a slash) and
# blog_palatte (fill colors for the plots) are assumed to be defined already.

# Read one anonymized results file per year
grandmas_1977 <- read.csv(paste0(grandmas_directory, 'anonymized_grandmas_1977.csv'))
grandmas_1978 <- read.csv(paste0(grandmas_directory, 'anonymized_grandmas_1978.csv'))
grandmas_1979 <- read.csv(paste0(grandmas_directory, 'anonymized_grandmas_1979.csv'))
grandmas_1980 <- read.csv(paste0(grandmas_directory, 'anonymized_grandmas_1980.csv'))
grandmas_1981 <- read.csv(paste0(grandmas_directory, 'anonymized_grandmas_1981.csv'))
grandmas_1982 <- read.csv(paste0(grandmas_directory, 'anonymized_grandmas_1982.csv'))
grandmas_1983 <- read.csv(paste0(grandmas_directory, 'anonymized_grandmas_1983.csv'))
grandmas_1984 <- read.csv(paste0(grandmas_directory, 'anonymized_grandmas_1984.csv'))
grandmas_1985 <- read.csv(paste0(grandmas_directory, 'anonymized_grandmas_1985.csv'))
grandmas_1986 <- read.csv(paste0(grandmas_directory, 'anonymized_grandmas_1986.csv'))
grandmas_1987 <- read.csv(paste0(grandmas_directory, 'anonymized_grandmas_1987.csv'))

# Label every row with its race year (mutate() recycles the value, so rowwise() is not needed)
grandmas_1977 <- grandmas_1977 %>% mutate(Year = 1977)
grandmas_1978 <- grandmas_1978 %>% mutate(Year = 1978)
grandmas_1979 <- grandmas_1979 %>% mutate(Year = 1979)
grandmas_1980 <- grandmas_1980 %>% mutate(Year = 1980)
grandmas_1981 <- grandmas_1981 %>% mutate(Year = 1981)
grandmas_1982 <- grandmas_1982 %>% mutate(Year = 1982)
grandmas_1983 <- grandmas_1983 %>% mutate(Year = 1983)
grandmas_1984 <- grandmas_1984 %>% mutate(Year = 1984)
grandmas_1985 <- grandmas_1985 %>% mutate(Year = 1985)
grandmas_1986 <- grandmas_1986 %>% mutate(Year = 1986)
grandmas_1987 <- grandmas_1987 %>% mutate(Year = 1987)

# Stack the yearly tables into one data frame
all_years_results <- rbind(grandmas_1977, grandmas_1978, grandmas_1979, grandmas_1980,
                           grandmas_1981, grandmas_1982, grandmas_1983, grandmas_1984,
                           grandmas_1985, grandmas_1986, grandmas_1987)
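
The read-and-label steps above repeat the same two lines for every year, so they could also be written as a loop. The sketch below shows the idea on two tiny CSV files written to a temporary folder so it runs on its own. `demo_directory`, `read_year`, `demo_results`, and the `Place` column are hypothetical names, not part of my actual setup.

```r
library(dplyr)

# Hypothetical stand-in for the real results folder: two tiny CSV files
# written to a temporary directory so this sketch is self-contained.
demo_directory <- paste0(tempdir(), "/")
write.csv(data.frame(Place = 1:2, Sex = c("M", "F")),
          paste0(demo_directory, "anonymized_grandmas_1977.csv"), row.names = FALSE)
write.csv(data.frame(Place = 1:3, Sex = c("M", "M", "M")),
          paste0(demo_directory, "anonymized_grandmas_1978.csv"), row.names = FALSE)

# Read one year's file and label every row with that year.
read_year <- function(year, directory) {
  path <- paste0(directory, "anonymized_grandmas_", year, ".csv")
  read.csv(path) %>% mutate(Year = year)
}

# Read each year in turn and stack the results into one data frame.
demo_results <- bind_rows(lapply(1977:1978, read_year, directory = demo_directory))

print(demo_results)
```

With the real files, swapping in `1977:1987` and `grandmas_directory` should give the same table as `all_years_results`, assuming every file shares the same columns.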

Discussion

The first thing that stood out was actually the data cleanup I mentioned earlier: Women participated in the first Grandma’s Marathon, but the second and third annual events were run only by men.

ggplot(data = all_years_results, aes(x = Year, fill = Sex)) +
  geom_histogram(position = "dodge", binwidth = 1) +
  geom_text(aes(label = ..count..), position = position_dodge(width = 0.9),
            stat = "count", vjust = -0.25) +
  labs(title = "Participation in Grandma's Marathon by Gender, 1977-1987") +
  ylab("Participants") + xlab("Year") +
  scale_fill_manual(values = blog_palatte, na.value = "#000000")

[Figure: Participation in Grandma’s Marathon by Gender, 1977-1987]

How did women’s participation rate vary in those first eleven years?

for (i in 1977:1987) {
  # Filter to one year once, then count men and women within it
  year_results <- all_years_results %>% filter(Year == i)
  men <- nrow(year_results %>% filter(Sex == "M"))
  women <- nrow(year_results %>% filter(Sex == "F"))
  # The denominator is every row for the year, including any rows whose Sex is missing
  percent_women <- round((women / nrow(year_results)) * 100, 2)
  percent_women <- paste(percent_women, "%", sep = "")
  year_data <- paste("**", i, "data:** ", "men: ", men, "|", "women: ", women, "|", "women as proportion of total: ", percent_women)
  print(year_data)
}

## [1] "** 1977 data:**  men:  66 | women:  6 | women as proportion of total:  5.17%"
## [1] "** 1978 data:**  men:  585 | women:  0 | women as proportion of total:  0%"
## [1] "** 1979 data:**  men:  1290 | women:  0 | women as proportion of total:  0%"
## [1] "** 1980 data:**  men:  2182 | women:  183 | women as proportion of total:  7.71%"
## [1] "** 1981 data:**  men:  2837 | women:  365 | women as proportion of total:  11.4%"
## [1] "** 1982 data:**  men:  3444 | women:  585 | women as proportion of total:  14.52%"
## [1] "** 1983 data:**  men:  3587 | women:  608 | women as proportion of total:  14.49%"
## [1] "** 1984 data:**  men:  4039 | women:  769 | women as proportion of total:  15.99%"
## [1] "** 1985 data:**  men:  3695 | women:  707 | women as proportion of total:  16.06%"
## [1] "** 1986 data:**  men:  3650 | women:  751 | women as proportion of total:  17.06%"
## [1] "** 1987 data:**  men:  3358 | women:  841 | women as proportion of total:  20.03%"
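
The same numbers could also come out as one table instead of eleven printed strings. This is a minimal sketch on a made-up `toy_results` data frame, not the real results. The names `toy_results` and `participation_by_year` are hypothetical. As in the loop above, the denominator is every row for the year, so a row with a missing `Sex` value still counts toward the total.

```r
library(dplyr)

# Made-up rows for illustration only; one 1977 row has no recorded sex.
toy_results <- data.frame(
  Year = c(1977, 1977, 1977, 1978, 1978),
  Sex  = c("M", "F", NA, "M", "M")
)

# One row per year: counts of men and women, all rows, and women's share.
participation_by_year <- toy_results %>%
  group_by(Year) %>%
  summarise(
    men = sum(Sex == "M", na.rm = TRUE),
    women = sum(Sex == "F", na.rm = TRUE),
    total = n(),
    percent_women = round(women / total * 100, 2)
  )

print(participation_by_year)
```

Keeping the result as a data frame also makes it easy to hand the percentages straight to ggplot.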

And another way we can look at this:

all_women <- all_years_results %>% filter(Sex == "F")
ggplot(data = all_women, aes(x = Year, fill = Sex)) +
  geom_histogram(position = "dodge", binwidth = 1) +
  geom_text(aes(label = ..count..), position = position_dodge(width = 0.9),
            stat = "count", vjust = -0.25) +
  labs(title = "Women's Participation in Grandma's Marathon, 1977-1987") +
  ylab("Participants") + xlab("Year") +
  scale_fill_manual(values = blog_palatte, na.value = "#000000")

[Figure: Women’s Participation in Grandma’s Marathon, 1977-1987]
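
Raw counts hide the fact that the whole field was growing, so a third view could plot women's share of the results instead. The sketch below retypes the percentages from the printed summary earlier in this post rather than recomputing them from the data. `women_share` and `share_plot` are hypothetical names, and the title and axis labels are placeholders.

```r
library(ggplot2)

# Percentages copied by hand from the printed yearly summary in this post.
women_share <- data.frame(
  Year = 1977:1987,
  percent_women = c(5.17, 0, 0, 7.71, 11.4, 14.52, 14.49, 15.99, 16.06, 17.06, 20.03)
)

# Line chart of women's share of the results, with one tick per race year.
share_plot <- ggplot(women_share, aes(x = Year, y = percent_women)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks = 1977:1987) +
  labs(title = "Women as a Share of Grandma's Marathon Results, 1977-1987",
       x = "Year", y = "Women (% of results)")

print(share_plot)
```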

Get in touch

Following graduation, I’m seeking a position at the intersection of marketing, data, and endurance sports. My professional site is a bit of a work in progress but can be found here