
Grandma’s Marathon Women’s Participation History
You might not know that I’m currently a full-time master’s student studying marketing analytics and that I previously worked as a software engineer. I spent about two and a half years writing code in Ruby with a splash of Python and JS. For my master’s program, I’ve been learning R for data visualization, and my semester-long project looks at the impact of event sponsorship on participation rates for marathons.
While that’s a work in progress, I started looking at other races, specifically historical results, and was excited when I found that all the results for Grandma’s Marathon are available online.
Because I’m still getting comfortable with R, for now I just want to explore this data and see what trends I notice.
Process:
- Anonymized participant names using this script
- Manually filled in sex of participants for years with all-male participation (1978, 1979)
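The two cleanup steps above could also be done in R. This is only a minimal sketch on made-up rows, not the script linked above: the column names (`Name`, `Sex`, `Time`), the `Runner_ID` format, and the assumption that the gap is simply a missing `Sex` value on every row of an all-male year are all hypothetical.

```r
library(dplyr)

# Toy stand-in for one year's raw results; names and columns are invented.
raw_results <- data.frame(
  Name = c("Runner One", "Runner Two", "Runner Three"),
  Sex  = c(NA_character_, NA_character_, NA_character_),
  Time = c("2:45:10", "3:02:55", "3:30:41"),
  stringsAsFactors = FALSE
)

anonymized <- raw_results %>%
  mutate(
    # Replace each name with a sequential, non-identifying ID
    Runner_ID = sprintf("runner_%04d", row_number()),
    # Fill in the missing sex for a year known to be all-male
    Sex = ifelse(is.na(Sex), "M", Sex)
  ) %>%
  select(-Name)  # drop the real names entirely

print(anonymized)
```

Sequential IDs are the simplest option; they do not let you track the same runner across years, which a keyed hash of the name would.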
Where can I learn more about the first Grandma’s Marathon?
Oh, you’d like to learn about the woman who won in 1977? Here you go
Some basic setup
library(dplyr)
library(ggplot2)

# Read each year's anonymized results and tag every row with its race year.
# grandmas_directory is assumed to be defined earlier and to end with a path separator.
years <- 1977:1987
yearly_results <- lapply(years, function(year) {
  results <- read.csv(paste(grandmas_directory, 'anonymized_grandmas_', year, '.csv', sep = ""))
  results$Year <- year
  results
})

# Stack all years into one data frame (the files must share the same columns).
all_years_results <- do.call(rbind, yearly_results)
Discussion
The first thing that stood out was actually the data cleanup I mentioned earlier: women participated in the first Grandma’s Marathon, but the second and third annual events (1978 and 1979) were run only by men.
# Dodged bars of participants per year by sex, each labeled with its count.
# after_stat(count) replaces the deprecated ..count.. notation.
ggplot(data = all_years_results, aes(x = Year, fill = Sex)) +
  geom_histogram(position = "dodge", binwidth = 1) +
  geom_text(
    aes(label = after_stat(count)),
    position = position_dodge(width = 0.9),
    stat = 'count', vjust = -0.25
  ) +
  labs(title = "Participation in Grandma's Marathon by Gender, 1977-1987") +
  ylab("Participants") + xlab("Year") +
  scale_fill_manual(values = blog_palatte, na.value = "#000000")
How did women’s participation rates vary in those first eleven years?
for (i in 1977:1987) {
  # All result rows for this year, including any whose Sex is neither "M" nor "F"
  year_rows <- all_years_results %>% filter(Year == i)
  men <- nrow(year_rows %>% filter(Sex == "M"))
  women <- nrow(year_rows %>% filter(Sex == "F"))
  # Women as a percentage of every result row for the year
  percent_women <- round((women / nrow(year_rows)) * 100, 2)
  percent_women <- paste(percent_women, "%", sep = "")
  year_data <- paste("**", i, "data:** ", "men: ", men, "|", "women: ", women, "|", "women as proportion of total: ", percent_women)
  print(year_data)
}
## [1] "** 1977 data:** men: 66 | women: 6 | women as proportion of total: 5.17%"
## [1] "** 1978 data:** men: 585 | women: 0 | women as proportion of total: 0%"
## [1] "** 1979 data:** men: 1290 | women: 0 | women as proportion of total: 0%"
## [1] "** 1980 data:** men: 2182 | women: 183 | women as proportion of total: 7.71%"
## [1] "** 1981 data:** men: 2837 | women: 365 | women as proportion of total: 11.4%"
## [1] "** 1982 data:** men: 3444 | women: 585 | women as proportion of total: 14.52%"
## [1] "** 1983 data:** men: 3587 | women: 608 | women as proportion of total: 14.49%"
## [1] "** 1984 data:** men: 4039 | women: 769 | women as proportion of total: 15.99%"
## [1] "** 1985 data:** men: 3695 | women: 707 | women as proportion of total: 16.06%"
## [1] "** 1986 data:** men: 3650 | women: 751 | women as proportion of total: 17.06%"
## [1] "** 1987 data:** men: 3358 | women: 841 | women as proportion of total: 20.03%"
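One thing worth noticing in the output above: the percentage is taken over every result row for the year, not just the rows marked M or F. For 1977, 6 women out of 66 + 6 = 72 runners would be 8.33%, while the printed 5.17% is consistent with roughly 116 rows, which suggests some rows have no recorded sex. The sketch below shows the same summary done with `group_by()` and `summarise()`, reporting both denominators. The data frame is a made-up toy, not the real results.

```r
library(dplyr)

# Toy data: one 1977 row has no recorded sex.
toy_results <- data.frame(
  Year = c(1977, 1977, 1977, 1977, 1980, 1980, 1980),
  Sex  = c("M", "M", "F", NA, "M", "F", "F"),
  stringsAsFactors = FALSE
)

participation <- toy_results %>%
  group_by(Year) %>%
  summarise(
    men = sum(Sex == "M", na.rm = TRUE),
    women = sum(Sex == "F", na.rm = TRUE),
    total_rows = n(),  # counts rows with missing Sex too
    pct_women_all_rows = round(women / total_rows * 100, 2),
    pct_women_known_sex = round(women / (men + women) * 100, 2)
  )

print(participation)
```

In the toy 1977 rows the two percentages differ (25 versus 33.33) because of the single row with a missing `Sex`; in the toy 1980 rows they match.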
And another way we can look at this:
# Keep only the rows recorded as women
all_women <- all_years_results %>% filter(Sex == "F")

# Bars of women participants per year, each labeled with its count.
# after_stat(count) replaces the deprecated ..count.. notation.
ggplot(data = all_women, aes(x = Year, fill = Sex)) +
  geom_histogram(position = "dodge", binwidth = 1) +
  geom_text(
    aes(label = after_stat(count)),
    position = position_dodge(width = 0.9),
    stat = 'count', vjust = -0.25
  ) +
  labs(title = "Women's Participation in Grandma's Marathon, 1977-1987") +
  ylab("Participants") + xlab("Year") +
  scale_fill_manual(values = blog_palatte, na.value = "#000000")
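The raw counts mix two effects: more women running, and the race itself getting bigger. A line chart of the share of women separates them. This is just a sketch that re-types the percentages printed earlier in the post into a small data frame; the `women_share` and `share_plot` names are my own, and the styling is left at ggplot2 defaults.

```r
library(ggplot2)

# Percentages copied from the printed per-year output earlier in the post.
women_share <- data.frame(
  Year = 1977:1987,
  Percent_Women = c(5.17, 0, 0, 7.71, 11.4, 14.52, 14.49, 15.99, 16.06, 17.06, 20.03)
)

share_plot <- ggplot(women_share, aes(x = Year, y = Percent_Women)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks = 1977:1987) +  # one tick per race year
  labs(
    title = "Women as a Share of Grandma's Marathon Results, 1977-1987",
    x = "Year",
    y = "Women (% of all result rows)"
  )

print(share_plot)
```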
Get in touch
Following graduation I’m seeking a position at the intersection of marketing, data, and endurance sports. My professional site is a bit of a work in progress but can be found here