Co-liberative Computing


Reading and Implementation on Data Bias

For this assignment, please read the following articles:
1. Artificial Intelligence and Inclusion: Formerly Gang-Involved Youth as Domain Experts for Analyzing Unstructured Twitter Data, William R. Frey et al., Social Science Computer Review, 2020
2. Gender Bias and Stereotypes in Large Language Models, Hadas Kotek et al., ACM Collective Intelligence Conference, 2023
3. Cultural Bias and Cultural Alignment of Large Language Models, Yan Tao et al., PNAS Nexus, 2024
4. Assessing and Remedying Coverage for a Given Dataset, Abolfazl Asudeh et al., IEEE ICDE, 2019
5. Machine Bias, Julia Angwin et al., ProPublica, 2016 [link]
After completing these readings, please critically answer the following questions.

Based on paper [1], answer the following questions:
1. The authors emphasize that there may not be one correct interpretation of a tweet. Using the case examples in the paper, discuss how ambiguity itself becomes a source of bias in AI systems that demand fixed labels. How do the authors attempt to manage that ambiguity?
2. The authors position inclusion of domain experts as a way to make AI systems fairer. Does inclusion within an unchanged system risk legitimizing its underlying logic?

Based on paper [2], answer the following questions:
1. The authors analyze model behavior when associating occupations with gender. What patterns do they observe in how LLMs assign gendered roles to different professions? How do these patterns reflect deeper imbalances or stereotypes embedded in the model's training data?
2. The paper reports that small changes in prompt wording can alter the model's gender associations. Identify one example of this sensitivity from the study. What does it reveal about the instability of fairness testing in LLMs?

Based on paper [3], answer the following questions:
1. The authors show that asking the model to respond as if from a specific country improves alignment for many nations but worsens it for others. How do the authors interpret this paradox, and what does it imply about the internal cultural representation of LLMs?
2. The authors argue that cultural alignment is essential for distributing AI benefits more evenly across societies. Based on their evidence, does cultural alignment actually equalize representation, or does it simply adjust the model to mimic global diversity on Western terms? What might a more justice-oriented approach to cultural alignment look like?

Based on paper [4] and link [5], answer the following questions:
Your task is to evaluate coverage issues [4] in the COMPAS dataset [5] using an existing Python fairness library, e.g., AIF360 or Fairlearn. Specifically, you should:
1. Assess Coverage: Investigate the distribution of subpopulations based on protected attributes (e.g., race, gender), as well as the distribution of the outcome variable (recidivism) across these groups, in the COMPAS dataset. Identify any underrepresented or missing subpopulations. (See the first sketch after this list.)
2. Measure Bias: Compute the following fairness metrics for race and gender: (i) statistical parity, which measures the difference in favorable outcomes between privileged and unprivileged groups, (ii) predictive parity, which assesses whether the probability of a correct positive prediction (i.e., predicting recidivism for those who reoffend) is the same across groups, and (iii) equalized odds, which requires both true positive rates (TPR) and false positive rates (FPR) to be the same across groups. Compare and contrast the results. Which metric provides the most meaningful insight into bias? Do these metrics align, or do they reveal different aspects of bias? (See the second sketch after this list.)
3. Train a Classifier: Train a logistic regression model to predict recidivism using the dataset. After training the model, evaluate its predictions for bias using the fairness metrics defined in step 2. (See the third sketch after this list.)
4. Apply Bias Mitigation Techniques: Apply the following pre-processing bias mitigation techniques to reduce bias in your model: (i) reweighting, which adjusts the weights of individuals in the dataset so that privileged and unprivileged groups receive equal consideration during model training, (ii) disparate impact remover, which modifies feature values to reduce the influence of protected attributes (such as race and gender) on the predictions, and (iii) learning fair representations, which learns a new, fair feature representation of the data. Compare the effectiveness of these techniques in reducing bias. Do these bias mitigation strategies lead to better outcomes and reduce bias across all groups? (See the fourth sketch after this list.)
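
Sketch 1 (step 1, coverage). A minimal sketch using pandas; it assumes ProPublica's compas-scores-two-years.csv is reachable at the GitHub URL below (swap in a local path otherwise), and the coverage threshold of 50 records is an arbitrary placeholder you should replace with a value you can justify.

```python
import pandas as pd

# ProPublica's two-year COMPAS file; the URL is an assumption -- use a local copy if preferred.
URL = "https://raw.githubusercontent.com/propublica/compas-analysis/master/compas-scores-two-years.csv"
df = pd.read_csv(URL)

# Coverage: record counts for each (race, sex) subgroup.
counts = df.groupby(["race", "sex"]).size().unstack(fill_value=0)
print("Subgroup counts:\n", counts, "\n")

# Outcome distribution: two-year recidivism rate within each subgroup.
rates = df.groupby(["race", "sex"])["two_year_recid"].mean().unstack()
print("Recidivism rate per subgroup:\n", rates.round(3), "\n")

# Flag subgroups whose count falls below an (arbitrary, assumed) coverage threshold.
THRESHOLD = 50
flat = counts.stack()
print("Potentially under-covered subgroups:\n", flat[flat < THRESHOLD])
```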
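
Sketch 2 (step 2, fairness metrics). A sketch using Fairlearn, reusing the df loaded above. Following the ProPublica analysis, a COMPAS decile score of 5 or higher is treated as a positive (high-risk) prediction; that threshold, and the omission of ProPublica's row filtering, are simplifying assumptions.

```python
from sklearn.metrics import precision_score
from fairlearn.metrics import (MetricFrame, demographic_parity_difference,
                               equalized_odds_difference, true_positive_rate,
                               false_positive_rate)

y_true = df["two_year_recid"]
y_pred = (df["decile_score"] >= 5).astype(int)  # "high risk" cutoff is an assumption

for attr in ["race", "sex"]:
    sf = df[attr]
    print(f"--- {attr} ---")
    # (i) Statistical parity: difference in positive-prediction rates between groups.
    print("Statistical (demographic) parity difference:",
          demographic_parity_difference(y_true, y_pred, sensitive_features=sf))
    # (iii) Equalized odds: largest gap in TPR or FPR across groups.
    print("Equalized odds difference:",
          equalized_odds_difference(y_true, y_pred, sensitive_features=sf))
    # (ii) Predictive parity: per-group precision (PPV), with TPR/FPR for comparison.
    mf = MetricFrame(metrics={"PPV": precision_score,
                              "TPR": true_positive_rate,
                              "FPR": false_positive_rate},
                     y_true=y_true, y_pred=y_pred, sensitive_features=sf)
    print(mf.by_group, "\n")
```

Comparing the per-group PPV table against the TPR/FPR gaps is what lets you discuss whether the three metrics agree or expose different aspects of bias.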
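
Sketch 3 (step 3, classifier). A logistic regression trained on a deliberately small, assumed feature set; protected attributes are excluded from the features here, which is itself a modeling choice worth discussing. The model's own predictions are then scored with the same Fairlearn metrics as in step 2.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from fairlearn.metrics import demographic_parity_difference, equalized_odds_difference

# The feature list is an assumption -- extend or replace it as you see fit.
features = ["age", "priors_count", "juv_fel_count", "juv_misd_count",
            "juv_other_count", "c_charge_degree"]
data = df.dropna(subset=features + ["two_year_recid", "race", "sex"]).copy()

X = pd.get_dummies(data[features], drop_first=True)  # one-hot encode the charge degree
y = data["two_year_recid"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_hat = clf.predict(X_test)

# Re-apply the step-2 metrics to the model's predictions on the held-out split.
for attr in ["race", "sex"]:
    sf = data.loc[y_test.index, attr]
    print(attr,
          "| statistical parity:",
          demographic_parity_difference(y_test, y_hat, sensitive_features=sf),
          "| equalized odds:",
          equalized_odds_difference(y_test, y_hat, sensitive_features=sf))
```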
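
Sketch 4 (step 4, mitigation). A sketch of one of the three techniques, the reweighting step, via AIF360's Reweighing class; DisparateImpactRemover and LFR follow the same fit/transform pattern and are indicated only as commented-out variants. The binary privileged/unprivileged encodings of race and sex below are assumptions, as is the reduced feature set.

```python
from sklearn.linear_model import LogisticRegression
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

# AIF360 expects an all-numeric dataframe; the encodings below are assumptions.
d = df[["age", "priors_count", "race", "sex", "two_year_recid"]].dropna().copy()
d["race"] = (d["race"] == "Caucasian").astype(int)  # 1 = assumed privileged group
d["sex"] = (d["sex"] == "Female").astype(int)

bld = BinaryLabelDataset(df=d, label_names=["two_year_recid"],
                         protected_attribute_names=["race", "sex"],
                         favorable_label=0, unfavorable_label=1)  # favorable = no recidivism
priv, unpriv = [{"race": 1}], [{"race": 0}]

# (i) Reweighting: learn instance weights that balance group/label combinations.
rw = Reweighing(unprivileged_groups=unpriv, privileged_groups=priv)
bld_rw = rw.fit_transform(bld)

before = BinaryLabelDatasetMetric(bld, unprivileged_groups=unpriv, privileged_groups=priv)
after = BinaryLabelDatasetMetric(bld_rw, unprivileged_groups=unpriv, privileged_groups=priv)
print("Mean difference before:", before.mean_difference())
print("Mean difference after reweighting:", after.mean_difference())

# Retrain the step-3 classifier using the learned instance weights; note that AIF360
# keeps the protected attributes among the features unless you drop them explicitly.
clf = LogisticRegression(max_iter=1000)
clf.fit(bld_rw.features, bld_rw.labels.ravel(), sample_weight=bld_rw.instance_weights)

# (ii)/(iii) The other two pre-processing techniques follow the same pattern, e.g.:
# from aif360.algorithms.preprocessing import DisparateImpactRemover, LFR
# bld_dir = DisparateImpactRemover(repair_level=1.0, sensitive_attribute="race").fit_transform(bld)
# bld_lfr = LFR(unprivileged_groups=unpriv, privileged_groups=priv).fit(bld).transform(bld)
```

Repeating the step-2 evaluation on each mitigated variant is what supports the comparison the assignment asks for.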