AI operates in societies alive with gender and ethnicity. Algorithmic bias arises from a variety of sources, ranging from human bias embedded in training data to unconscious choices in algorithm design. As machine learning becomes increasingly ubiquitous in everyday life, such bias, if uncorrected, can lead to social inequities. Researchers need to understand how gender and ethnicity operate within the context of their algorithms in order to enhance social equalities or, at least, not reinforce existing inequalities. Here we suggest avenues for reducing bias in training data and algorithms in an effort to produce AI that enhances social equalities.
1. Mapping Known Examples of Human Bias Amplified by Technology
2. Mapping Solutions
3. Systemic Solutions: Attending to infrastructure issues; performing rigorous social benefit reviews; creating interdisciplinary and socially diverse teams; and integrating social issues into the core CS curriculum
Machine learning algorithms can contain significant gender and ethnic bias. Where in the machine learning pipeline does bias reside: in the input data, in the algorithm itself, or in how the system is deployed? More importantly, how can humans intervene in automated processes to enhance or, at least, not harm social equalities? And who should make these decisions?
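To make the data side of this question concrete, the following minimal sketch (Python, with purely hypothetical synthetic data) shows how bias already present in historical training labels can pass straight through an otherwise neutral learning algorithm to its predictions.

```python
# Minimal sketch, assuming hypothetical synthetic data: qualified members of
# group B were recorded as "hired" less often in the historical labels, and a
# standard classifier trained on those labels reproduces the gap.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
group = rng.integers(0, 2, n)        # 0 = group A, 1 = group B
skill = rng.normal(0.0, 1.0, n)      # true qualification, identically distributed

qualified = skill > 0
# Historical recording bias: group B's positive outcomes are logged only half the time.
recorded = np.where(group == 1, rng.random(n) < 0.5, True)
label = (qualified & recorded).astype(int)

model = LogisticRegression().fit(np.column_stack([skill, group]), label)
pred = model.predict(np.column_stack([skill, group]))

for g in (0, 1):
    rate = pred[(group == g) & qualified].mean()
    print(f"group {'AB'[g]}: predicted-positive rate among the qualified = {rate:.2f}")
# Typically group A's rate is close to 1.0 while group B's is markedly lower,
# even though qualification is identically distributed in both groups.
```

The learning algorithm here does nothing discriminatory on its own; it faithfully reproduces the gap handed to it by the data, which is why auditing input data is a natural first place to intervene.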
Importantly, AI is creating the future: our devices, programs, and processes shape human attitudes, behaviors, and culture. Because these systems learn from data generated in the past, AI may unintentionally perpetuate past bias into the future, even when governments, universities, and companies such as Google and Facebook have implemented policies to foster equality. So the big question is: how can we humans best ensure that AI supports social justice?
Method: Analyzing Gender
Gender refers to cultural attitudes and behaviors. Humans function in large and complex societies through learned behaviors. The ways we speak, our mannerisms, the things we use, and our behaviors all signal who we are and establish rules for interaction. Gender is one of these sets of behaviors and attitudes; ethnicity is another, and the two often intersect.
Gender consists of:
- Gender Norms consist of spoken and unspoken cultural rules (ranging from legislated to unconscious rules) produced through social institutions (such as families, schools, workplaces, laboratories, universities, or boardrooms) and wider cultural products (such as textbooks, literature, and social media) that influence individuals’ behaviors, expectations, and experiences.
- Gender Identity refers to how individuals or groups perceive and present themselves, and how they are perceived by others. Gender identities are malleable, change over the life course, and are context specific. Gender identities may intersect with other identities, such as ethnicity, class, or sexual orientation, to yield multifaceted self-understandings.
- Gender Relations refer to social and power relations between people of different gender identities within families, the workplace, and societies at large.
Known Examples of Gender Bias
Known Examples of Ethnic Bias
Method: Analyzing Factors Intersecting with Sex and Gender
It is important to analyze sex and gender, but other significant factors intersect with them, which is what scholars call "intersectionality." These factors or variables can be biological, socio-cultural, or psychological, and may include age, disability, ethnicity, nationality, religion, and sexual orientation, among others.
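One concrete way to take intersectionality into account when evaluating a model is to disaggregate its error rates by combinations of attributes rather than by one attribute at a time, in the spirit of Buolamwini and Gebru (2018). The sketch below illustrates only the mechanics; the column names and toy predictions are hypothetical.

```python
# Minimal sketch of an intersectional error analysis: report accuracy for each
# gender x ethnicity subgroup instead of a single aggregate number.
# The DataFrame columns ("gender", "ethnicity", "y_true", "y_pred") are
# hypothetical placeholders for a real evaluation set.
import pandas as pd

def intersectional_report(df: pd.DataFrame) -> pd.DataFrame:
    """Accuracy and sample count for every gender x ethnicity subgroup."""
    df = df.assign(correct=(df["y_true"] == df["y_pred"]).astype(float))
    return (df.groupby(["gender", "ethnicity"])
              .agg(accuracy=("correct", "mean"), n=("correct", "size"))
              .reset_index())

toy = pd.DataFrame({
    "gender":    ["F", "F", "M", "M", "F", "M"],
    "ethnicity": ["A", "B", "A", "B", "B", "A"],
    "y_true":    [1, 1, 0, 1, 0, 0],
    "y_pred":    [1, 0, 0, 1, 1, 0],
})
print(intersectional_report(toy))
# Per-subgroup reporting surfaces gaps that an aggregate accuracy would hide.
```

Small subgroup counts (the `n` column) are themselves a warning sign: an evaluation set that barely contains a subgroup cannot certify that the system works well for it.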
It is important to be able to detect when an algorithm is potentially biased. Several groups are developing tools for this purpose.
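Toolkits for such audits exist, but the basic checks are simple enough to sketch directly. The functions below compute two widely used diagnostics, the demographic-parity gap and the equal-opportunity (true-positive-rate) gap, in plain NumPy; the toy arrays are illustrative assumptions rather than any particular group's tool.

```python
# Minimal sketch of two common bias diagnostics. y_true and y_pred are 0/1
# labels and predictions; `group` holds one protected attribute per example.
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Largest difference in positive-prediction rates across groups."""
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)

def equal_opportunity_gap(y_true, y_pred, group):
    """Largest difference in true-positive rates (recall) across groups."""
    tprs = [y_pred[(group == g) & (y_true == 1)].mean() for g in np.unique(group)]
    return max(tprs) - min(tprs)

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(demographic_parity_gap(y_pred, group))         # 0.00: equal selection rates
print(equal_opportunity_gap(y_true, y_pred, group))  # ~0.17: unequal recall
# Gaps near zero do not prove fairness, but large gaps flag subpopulations
# where the system behaves differently and deserves closer review.
```

Note that in this toy example the two diagnostics disagree: selection rates are identical across groups while recall is not, which previews the next question of which notion of fairness should govern a given system.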
As we strive to improve the fairness of data and AI, we need to think carefully about appropriate notions of fairness. Should data, for example, represent the world as it is, or a world we aspire to, that is, one that achieves social equality? Who should make these decisions? The computer scientists and engineers working on these problems? Ethics teams within companies? Government oversight committees? If computer scientists, how should they be educated?
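If a team decides its data should reflect "a world we aspire to," one simple and transparent mechanism is to reweight training examples toward a target distribution. The sketch below, with hypothetical column names and toy data, gives each gender equal total weight; it illustrates one possible intervention rather than a recommendation.

```python
# Minimal sketch: per-example weights that give every group equal total weight,
# encoding an aspirational (equal-share) distribution rather than the dataset's
# empirical one. The "gender" column and toy data are hypothetical.
import pandas as pd

def parity_weights(df: pd.DataFrame, column: str = "gender") -> pd.Series:
    """Weight each example so that every group contributes equally to training."""
    counts = df[column].value_counts()
    target_share = 1.0 / len(counts)                 # equal share per group
    empirical_share = counts / len(df)
    return df[column].map(target_share / empirical_share)

toy = pd.DataFrame({"gender": ["M"] * 8 + ["F"] * 2})
weights = parity_weights(toy)
print(weights.groupby(toy["gender"]).sum())          # both groups now total 5.0
# Such weights can be passed to most learners (e.g., via the sample_weight
# argument that many scikit-learn estimators accept).
```

Whether such a correction is appropriate, and who gets to decide, is exactly the normative question raised above; the code only makes the chosen answer explicit and auditable.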
Creating AI that results in both high-quality techniques and social justice requires a number of important steps; here we highlight the three outlined above.
Note: Some materials in this case study draw from Zou & Schiebinger (2018).
Angwin, J., & Larson, J. (2016, May 23). Bias in criminal risk scores is mathematically inevitable, researchers say. ProPublica. https://www.propublica.org/article/bias-in-criminal-risk-scores-is-mathematically-inevitable-researchers-say
Bivens, R. (2017). The gender binary will not be deprogrammed: Ten years of coding gender on Facebook. New Media & Society, 19 (6), 880-898.
Bolukbasi, T., Chang, K. W., Zou, J. Y., Saligrama, V., & Kalai, A. T. (2016). Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. Advances in Neural Information Processing Systems, 4349-4357.
Buolamwini, J., & Gebru, T. (2018). Gender shades: Intersectional accuracy disparities in commercial gender classification. Conference on Fairness, Accountability and Transparency, 77-91.
Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183-186.
Corbett-Davies, S., Pierson, E., Feller, A., Goel, S., & Huq, A. (2017). Algorithmic decision making and the cost of fairness. Conference on Knowledge Discovery and Data Mining (KDD).
Commission Nationale Informatique & Libertés (CNIL). (2017). How Can Humans Keep the Upper Hand: Ethical Matters Raised by Algorithms and Artificial Intelligence. French Data Protection Authority.
Datta, A., Tschantz, M. C., & Datta, A. (2015). Automated experiments on ad privacy settings. Proceedings on Privacy Enhancing Technologies, 2015 (1), 92-112.
Dwork, C., Hardt, M., Pitassi, T., Reingold, O., & Zemel, R. (2012). Fairness through awareness. Proceedings of the 3rd innovations in theoretical computer science conference, 214-226. ACM.
Esteva, A., Kuprel, B., Novoa, R. A., Ko, J., Swetter, S. M., Blau, H. M., & Thrun, S. (2017). Dermatologist-level classification of skin cancer with deep neural networks. Nature, 542(7639), 115-118.
Munoz, C., Smith, M., & Patil, D. J. (2016). Big Data: A Report on Algorithmic Systems, Opportunity, and Civil Rights. Executive Office of the President.
Ford, H., & Wajcman, J. (2017). ‘Anyone can edit’, not everyone does: Wikipedia’s infrastructure and the gender gap. Social Studies of Science, 47(4), 511-527.
Garg, N., Schiebinger, L., Jurafsky, D., & Zou, J. (2018). Word embeddings quantify 100 years of gender and ethnic stereotypes. Proceedings of the National Academy of Sciences, 115(16), E3635-E3644.
Gebru, T., Morgenstern, J., Vecchione, B., Vaughan, J. W., Wallach, H., Daumé III, H., & Crawford, K. (2018). Datasheets for datasets. arXiv preprint arXiv:1803.09010.
Hardt, M., Price, E., & Srebro, N. (2016). Equality of opportunity in supervised learning. Advances in Neural Information Processing Systems, 3315-3323.
Hébert-Johnson, U., Kim, M. P., Reingold, O., & Rothblum, G. N. (2017). Calibration for the (computationally-identifiable) masses. arXiv preprint arXiv:1711.08513.
Kim, M. P., Ghorbani, A., & Zou, J. (2018). Multiaccuracy: Black-box post-processing for fairness in classification. arXiv preprint arXiv:1805.12317.
Kleinberg, J., Lakkaraju, H., Leskovec, J., Ludwig, J., & Mullainathan, S. (2017). Human decisions and machine predictions. The Quarterly Journal of Economics, 133 (1), 237-293.
Kleinberg, J., Mullainathan, S., & Raghavan, M. (2017). Inherent trade-offs in the fair determination of risk scores. Proceedings of Innovations in Theoretical Computer Science (ITCS).
MIT Media Lab (2018). http://datanutrition.media.mit.edu
Nielsen, M. W., Andersen, J. P., Schiebinger, L., & Schneider, J. W. (2017). One and a half million medical papers reveal a link between author gender and attention to gender and sex analysis. Nature Human Behaviour, 1(11), 791.
Noble, S.U. (2018). Algorithms of oppression: How search engines reinforce racism. NYU Press.
Popejoy, A. B., & Fullerton, S. M. (2016). Genomics is failing on diversity. Nature, 538(7624), 161-164.
Prates, M. O., Avelar, P. H., & Lamb, L. (2018). Assessing gender bias in machine translation—a case study with Google Translate. arXiv preprint arXiv:1809.02208.
Schiebinger, L., Klinge, I., Sánchez de Madariaga, I., Paik, H. Y., Schraudner, M., & Stefanick, M. (Eds.) (2011-2018). Gendered Innovations in Science, Health & Medicine, Engineering, and Environment: Machine translation.
Shankar, S., Halpern, Y., Breck, E., Atwood, J., Wilson, J., & Sculley, D. (2017). No classification without representation: assessing geodiversity issues in open data sets for the developing world. arXiv preprint arXiv:1711.08536.
Sweeney, L. (2013). Discrimination in online ad delivery. Queue, 11(3), 10.
Wagner, C., Garcia, D., Jadidi, M., & Strohmaier, M. (2015, April). It's a Man's Wikipedia? Assessing Gender Inequality in an Online Encyclopedia. ICWSM, 454-463.
Wikimedia (2018), personal communication.
Zhao, J., Wang, T., Yatskar, M., Ordonez, V. & Chang, K.-W. (2017). Men also like shopping: reducing gender bias amplification using corpus-level constraints. arXiv preprint arXiv:1707.09457.
Zhao, J., Zhou, Y., Li, Z., Wang, W., & Chang, K. W. (2018). Learning Gender-Neutral Word Embeddings. arXiv preprint arXiv:1809.01496.
Zou, J. & Schiebinger, L. (2018). Design AI that’s fair. Nature, 559(7714), 324-326.