Co-liberative Computing


Data Feminism is a paradigm that reimagines data and its applications while acknowledging the power imbalances inherent in data science. It recognizes that power is unequally distributed globally and that data is itself a form of power. Because data is often used unjustly, the primary goal of Data Feminism is to recognize and reshape these imbalances. Data Feminism goes beyond a narrow focus on gender: it adopts an intersectional approach, acknowledging that factors such as race, class, sexuality, ability, age, and religion intersect to shape individuals' experiences and opportunities.

The Data Feminism course connects ethical and social justice themes with advances in data science, exploring how people who work with data can actively challenge and transform power differentials through a Data Feminism lens. The course is positioned at the intersection of data science and intersectional feminism. Its objectives are drawn mainly from the seven principles outlined in the book "Data Feminism" by Catherine D'Ignazio and Lauren F. Klein.


The Course Examiner

Amir H. Payberah


The Course Structure

The course contains seven modules, each dedicated to one of the outlined objectives. Within each module, students will engage in two sessions: one lecture and one discussion. In the lecture session of each module, the instructor will provide a comprehensive introduction to the module's context, offering an overview of the designated reading material. Subsequently, students will have one week to thoroughly review the assigned reading materials and submit a detailed critique of the selected papers. The discussion session of each module will be dedicated to a thorough review and in-depth exploration of the module's topic and associated papers.


Intended Learning Outcomes (ILOs)

After the course, the student should be able to:

  • ILO1: understand the theoretical and technical issues related to data justice.

  • ILO2: apply acquired knowledge to employ data and data science as tools to confront injustices magnified by data and associated techniques.

  • ILO3: analyze and evaluate data science practices by recognizing their biases and taking actions to address them.


Prerequisites

The students should have completed courses in machine learning and deep learning and be familiar with Python programming.


Assessment

Grading in this course will be based on four distinct tasks: completion of module reading assignments, presentation, active participation during module sessions, and the final project. The assignments can be undertaken in groups of two students.

  • Task 1 (reading assignments): each student/group is required to submit a comprehensive review for a set of assigned papers corresponding to each module.
    Reading Assignments 1
    Reading Assignments 2
    Reading Assignments 3
    Reading Assignments 4
    Reading Assignments 5
    Reading Assignments 6
    Reading Assignments 7

  • Task 2 (presentation): each student/group will act as the moderator for a pre-selected module. In this capacity, they are responsible for presenting the set of assigned papers, contributing to the collective understanding of the module's content.

  • Task 3 (group discussion): students are expected to attend the group presentation sessions and actively engage in the subsequent group discussions.

  • Task 4 (final project): the final project requires each student/group to reproduce a paper relevant to the course topics. A set of papers will be provided to the students, but they also have the option to propose alternative papers for consideration.


Grading

The course will be assessed on a Pass/Fail basis, and successful completion is contingent on meeting specific criteria. These criteria encompass completing at least 75% of the reading assignments, delivering a presentation during the discussion session, attending a minimum of 75% of the student presentation sessions, and successfully implementing the chosen paper, incorporating basic experiments.
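
As a rough illustration of the thresholds above, the following Python sketch checks the pass criteria. It assumes seven reading assignments and seven student presentation sessions (one per module), and that assignments and sessions are counted in whole units; the function names and these counting rules are illustrative assumptions, not an official grading tool.

```python
def min_required(total: int, percent: int = 75) -> int:
    """Smallest whole count that is at least `percent` percent of `total`."""
    # Integer ceiling of total * percent / 100, avoiding float rounding.
    return (total * percent + 99) // 100


def passes(readings_done: int, presented: bool, sessions_attended: int,
           project_done: bool, n_modules: int = 7) -> bool:
    """Return True if all four pass criteria are met (assumed counting rules)."""
    needed = min_required(n_modules)
    return (readings_done >= needed
            and presented
            and sessions_attended >= needed
            and project_done)


if __name__ == "__main__":
    # With seven modules, 75% corresponds to 5.25, so six whole units are needed.
    print(min_required(7))
```

Under these assumptions, a student or group would need to complete at least six of the seven reading assignments and attend at least six of the seven presentation sessions.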


Credits

This is a 7.5 ECTS credit course that spans 224 hours over 14 weeks, including the time allocated for the final project.


Schedule

Module 1: Critiquing Power in Data Science

Lecture Session: Sep. 10, 13:00-15:00 [slides]
Discussion Session: Sep. 17, 13:00-15:00 [slides]

Required Reading
   - Data Feminism, Catherine D'Ignazio and Lauren F. Klein (intro, ch. 1-2)
   - Black Feminist Thought, Patricia Hill Collins (ch. 12)
   - Design Justice, Sasha Costanza-Chock (intro)
   - Dig Deep: Beyond Lean In, bell hooks [link]
   - Feminism for the 99%: A Manifesto, Cinzia Arruzza, Tithi Bhattacharya, and Nancy Fraser (theses 1-10)

Optional Reading
   - Data Grab, Ulises A. Mejias and Nick Couldry (ch. 1, ch. 6)
   - Feminist Theory: From Margin to Center, bell hooks (ch. 1)
   - Algorithms of Oppression, Safiya Umoja Noble (ch. 1)
   - Race after Technology, Ruha Benjamin (intro)
   - Automating Inequality, Virginia Eubanks (ch. 4)
   - Demarginalizing the Intersection of Race and Sex, Kimberlé Crenshaw
   - Restorative Justice and Reparations, Margaret Urban Walker
   - Combahee River Collective Statement [link]
   - Exclusive: Workers at Google DeepMind Push Company to Drop Military Contracts [link]
   - Forget Project Maven. Here Are A Couple Other DoD Projects Google Is Working On [link]


Module 2: Ghost Work

Lecture Session: Sep. 24, 13:00-15:00 [slides]
Discussion Session: Oct. 1, 13:00-15:00 [slides]

Required Reading
   - Data Feminism, Catherine D'Ignazio and Lauren F. Klein (ch. 7)
   - The Exploited Labor Behind Artificial Intelligence, Adrienne Williams et al. [link]
   - Ghost Work, Mary L. Gray and Siddharth Suri (ch. 1)
   - Ethical Norms and Issues in Crowdsourcing Practices: A Habermasian Analysis, Daniel Schlagwein et al., Information Systems Journal, 2019
   - The Data-Production Dispositif, Milagros Miceli et al., ACM CSCW, 2022
   - The Cultural Work of Microwork, Lilly Irani, New Media & Society, 2015
   - Difference and Dependence Among Digital Workers: The Case of Amazon Mechanical Turk, Lilly Irani, South Atlantic Quarterly, 2015
   - Turkopticon: Interrupting Worker Invisibility in Amazon Mechanical Turk, Lilly Irani et al., ACM SIGCHI, 2013
   - We are Dynamo: Overcoming Stalling and Friction in Collective Action for Crowd Workers, Niloufar Salehi et al., ACM SIGCHI, 2015
   - A Typology of Artificial Intelligence Data Work, James Muldoon et al., Big Data & Society, 2024
   - Digital Labour Platforms and the Future of Work, Janine Berg et al., ILO Report, 2018

Optional Reading
   - Atlas of AI, Kate Crawford (ch. 2)
   - Wages Against Housework, Silvia Federici [link]
   - Justice for Data Janitors, Lilly Irani [link]
   - Digital Labour Markets in the Platform Economy, Florian Schmidt, 2017
   - Whose Truth? Power, Labor, and the Production of Ground-Truth Data, Milagros Miceli, 2023
   - Platformization of Inequality: Gender and Race in Digital Labor Platforms, Isabel Munoz et al., ACM CSCW, 2024


Module 3: Data Colonialism

Lecture Session: Oct. 8, 13:00-15:00 [slides]
Discussion Session: Oct. 15, 13:00-15:00 [slides]

Required Reading
   - Data Feminism, Catherine D'Ignazio and Lauren F. Klein (ch. 4-5)
   - Data Colonialism: Rethinking Big Data’s Relation to the Contemporary Subject, Nick Couldry and Ulises A. Mejias, 2019
   - Against Cleaning, Katie Rawson and Trevor Muñoz [link]
   - Situated Knowledges: The Science Question in Feminism and the Privilege of Partial Perspective, Donna Haraway, 2013
   - Social Media for Large Studies of Behavior, Derek Ruths and Jürgen Pfeffer, 2014
   - Artificial Intelligence and Inclusion: Formerly Gang-Involved Youth as Domain Experts for Analyzing Unstructured Twitter Data, W. Frey et al., 2020
   - Datasheets for Datasets, Timnit Gebru et al., 2021
   - The Dataset Nutrition Label (2nd Gen), Kasia S. Chmielinski et al., 2018
   - Documenting Data Production Processes: A Participatory Approach for Data Work, Milagros Miceli et al., 2022


Optional Reading
   - The Anti-Eviction Mapping Project: Counter Mapping and Oral History Toward Bay Area Housing Justice, Manissa Maharawal et al., 2018
   - All Data Are Local: Thinking Critically in a Data-Driven Society, Yanni Loukissas (introduction)
   - Data Grab, Ulises A. Mejias and Nick Couldry (ch. 1, ch. 6)
   - Indigenous Statistics: A Quantitative Research Methodology, Maggie Walter and Chris Andersen (introduction)
   - Why “Data for Good” Lacks Precision, Sara Hooker [link]
   - Tampering with Twitter’s Sample API, Jürgen Pfeffer et al., EPJ Data Science, 2018
   - Word Embeddings Quantify 100 Years of Gender and Ethnic Stereotypes, Nikhil Garg et al., 2018
   - Data Biographies: Getting to Know Your Data, Heather Krause [link]
   - Data User Guides, Bob Gradeck [link]
   - The Subjects and Stages of AI Dataset Development: A Framework for Dataset Accountability, Mehtab Khan et al., 2023


Module 4: Bias and Fairness in Data

Lecture Session: Oct. 22, 13:00-15:00 [slides]
Discussion Session: Oct. 29, 13:00-15:00 [slides]

Required Reading
   - TBA

Optional Reading
   - TBA


Module 5: Bias and Fairness in Models

Lecture Session: Nov. 5, 13:00-15:00 [slides]
Discussion Session: Nov. 12, 13:00-15:00 [slides]

Required Reading
   - TBA

Optional Reading
   - TBA


Module 6: Intersectionality

Lecture Session: Nov. 15, 13:00-15:00 [slides]
Discussion Session: Nov. 26, 13:00-15:00 [slides]

Required Reading
   - TBA

Optional Reading
   - TBA


Module 7: Emotion and Embodiment

Lecture Session: Dec. 3, 13:00-15:00 [slides] (Guest Lecturer: Miriah Meyer)
Discussion Session: Dec. 10, 13:00-15:00 [slides]

Required Reading
   - TBA

Optional Reading
   - TBA