Alongside the issue of opening up access to STEM for would-be coders, retaining women who are already in the industry is a key problem. Data showing the general exodus from the industry of the girls who grew up to be programmers is abundant, but how do you identify potential causes for this attrition? One group of researchers from the Department of Computer Science at North Carolina State University, USA, has taken to GitHub to probe whether the “pure meritocracy” of the open source community may in fact be acting to conceal gender bias in the workplace.

After processing vast torrents of data, the answer to the question, “To what extent does gender bias exist among people who judge GitHub pull requests?” proved more surprising than the initial hypothesis suggested. Based on previous studies of gender bias, the researchers went in with the assumption that women would be more likely overall to have their pull requests rejected. This turned out to be less clear-cut than anticipated.

Although GitHub doesn’t distinguish users by gender, by extracting user email addresses from GHTorrent and hunting down the matching Google+ profiles, the researchers were able to classify 35.3% of 4,037,953 GitHub user profiles with email addresses. Thanks to the humongous data source that the online code repository represents, this research could possibly be the largest study to date on gender bias in the workplace. Whilst it could well be argued that the experience of using GitHub is very different from that of the average office, a substantial part of activity on GitHub is done in a professional context, so it was judged to be a good source for the study.
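To put that percentage in absolute terms, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration using the two figures quoted above; the variable names are our own, and because 35.3% is a rounded figure the result is an approximation (roughly 1.4 million profiles), not a count taken from the study.

```python
# Figures quoted in the article.
profiles_with_email = 4_037_953
classified_share = 0.353  # 35.3%, as reported (rounded)

# Approximate counts; the exact numbers depend on the unrounded percentage.
classified = round(profiles_with_email * classified_share)
unclassified = profiles_with_email - classified

print(f"Classified by gender: ~{classified:,}")
print(f"Left unclassified:    ~{unclassified:,}")
```

In other words, close to two thirds of the profiles with email addresses could not be assigned a gender and fall outside the comparison.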

You can read the full analysis here, but if you don’t have time to wade through the whole thing, here’s a quick summary of their findings:

  1. Women are more likely to have pull requests accepted than men.

  2. Women continue to have high acceptance rates as they gain experience.

  3. Women’s pull requests are less likely to serve an immediate project need.

  4. Women’s changes are larger.

  5. Women’s acceptance rates are higher across programming languages.

  6. Women have lower acceptance rates as outsiders when they are identifiable as women.


There are a number of possible reasons why women may score higher than men for accepted pull requests. Self-selection bias could be a strong factor: women in open source are statistically more likely to hold advanced qualifications such as Master’s and PhD degrees. In contrast, men are much less likely to leave STEM once they have entered, regardless of their abilities or level of qualification, which may account for the fact that the difference in attrition rates between men and women in higher education seems slight, in spite of the higher number of men entering these fields of study.

There could also be survivorship bias at play within GitHub. Although many women, in comparison to men, will have their initial pull requests rejected and will thus be less encouraged to make further submissions, those who stay the course could be “better equipped” to defend their contributions down the line. Interestingly, this is in line with research by Harvard Business Review, in which two thirds of the women questioned reported having to prove their capabilities “over and over again”. Another reason could be that, as past papers suggest, women are as a rule held to higher standards than men, a phenomenon that is observed across the workforce as a whole.

Looking at the results of the investigation in their entirety, the team conclude that, even though women on GitHub “may be more competent overall, bias against them exists nonetheless.” The multi-billion-dollar question now is: how do we solve it?


