Fred Hohman

Discovery of Intersectional Bias in Machine Learning Using Automatic Subgroup Generation

Debugging Machine Learning Models Workshop at ICLR (Debug ML), 2019


As machine learning is applied to data about people, it is crucial to understand how learned models treat different demographic groups. Many factors, including what training data and class of models are used, can encode biased behavior into learned outcomes. These biases are often small when considering a single feature (e.g., sex or race) in isolation, but appear more blatantly at the intersection of multiple features. We present our ongoing work of designing automatic techniques and interactive tools to help users discover subgroups of data instances on which a model underperforms. Using a bottom-up clustering technique for subgroup generation, users can quickly find areas of a dataset in which their models are encoding bias. Our work presents some of the first user-focused, interactive methods for discovering bias in machine learning models.


    title={Discovery of Intersectional Bias in Machine Learning Using Automatic Subgroup Generation},
    author={Cabrera, {\'A}ngel and Kahng, Minsuk and Hohman, Fred and Morgenstern, Jamie and Chau, Duen Horng},
    journal={Debugging Machine Learning Models Workshop (Debug ML) at ICLR},