MIT Researchers Develop Novel Technique that Reduces Bias in AI Models

Researchers at MIT have unveiled a new technique that mitigates bias while maintaining accuracy in machine-learning models.

Researchers at the Massachusetts Institute of Technology (MIT) have developed a process that improves the fairness of machine-learning models without compromising their accuracy. The method finds and eliminates specific data points that contribute most to a model's shortcomings, particularly in relation to underrepresented subgroups.

AI systems can fail to predict outcomes accurately for underrepresented groups in their training data. For example, a model trained on data skewed towards male patients may make inaccurate treatment recommendations for female patients when used in a hospital setting.

Current strategies to address this bias often involve removing data points until all groups are represented equally. However, this approach can result in the deletion of significant amounts of data, ultimately compromising the overall efficacy of the model.
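
To make the trade-off concrete, here is a minimal sketch of that conventional balancing step, assuming NumPy arrays and a hypothetical per-example `groups` array (illustrative only, not code from the MIT work). Because every group is cut down to the size of the smallest one, a heavily skewed dataset can lose most of its examples.

```python
import numpy as np

def balance_by_subsampling(X, y, groups, seed=0):
    """Conventional balancing: randomly drop examples from larger groups
    until every group matches the size of the smallest one."""
    rng = np.random.default_rng(seed)
    unique_groups, counts = np.unique(groups, return_counts=True)
    target = counts.min()  # the smallest group sets the budget for all groups
    keep = np.concatenate([
        rng.choice(np.flatnonzero(groups == g), size=target, replace=False)
        for g in unique_groups
    ])
    return X[keep], y[keep], groups[keep]
```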

This new technique removes far fewer data points than wholesale balancing, preserving the model's overall accuracy. Furthermore, it can identify hidden sources of bias in datasets that lack subgroup labels, which are far more common in practice than labelled ones.

“Many other algorithms that try to address this issue assume each datapoint matters as much as every other datapoint. We are showing that assumption is not true. There are specific points in our dataset that are contributing to this bias, and we can find those data points, remove them, and get better performance,” stated Kimia Hamidieh, an electrical engineering and computer science (EECS) graduate student at MIT and co-lead author of the paper detailing these findings.

The process builds upon previous work by Hamidieh and her colleagues in which they introduced TRAK, a data attribution method designed to identify the most important training examples for a specific model output.

In this new approach, they take the incorrect predictions the model makes about underrepresented subgroups and use TRAK to identify which training examples contributed most to those errors. These examples are then removed and the model is retrained on the remaining data.
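
In code, this amounts to an error-driven pruning loop. The sketch below is a simplified illustration, not the authors' implementation: a plain per-example gradient dot-product (in the spirit of TracIn) stands in for TRAK, the model is a logistic regression, and names such as `val_groups`, `minority`, and `k` are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit(X, y):
    return LogisticRegression(max_iter=1000).fit(X, y)

def example_gradients(model, X, y):
    """Per-example gradient of the logistic loss w.r.t. the weights,
    (p - y) * x; the intercept term is omitted for brevity."""
    p = model.predict_proba(X)[:, 1]
    return (p - y)[:, None] * X

def prune_and_retrain(X_tr, y_tr, X_val, y_val, val_groups, minority, k):
    """Find the model's errors on the underrepresented subgroup, score how
    strongly each training point drives those errors, drop the k worst
    offenders, and retrain on what remains."""
    model = fit(X_tr, y_tr)

    # 1. Validation examples from the minority subgroup that the model gets wrong.
    wrong = (model.predict(X_val) != y_val) & (val_groups == minority)

    # 2. Attribute those errors to training points. TRAK plays this role in the
    #    paper; here a gradient dot-product is used as a crude stand-in.
    g_tr = example_gradients(model, X_tr, y_tr)
    g_err = example_gradients(model, X_val[wrong], y_val[wrong])
    harm = -(g_tr @ g_err.sum(axis=0))  # positive = training point pushes the errors up

    # 3. Remove the k most harmful training points and retrain on the rest.
    drop = np.argsort(harm)[-k:] if k > 0 else np.array([], dtype=int)
    keep = np.setdiff1d(np.arange(len(y_tr)), drop)
    return fit(X_tr[keep], y_tr[keep]), keep
```

In the paper, the attribution step is performed with TRAK rather than this simplified proxy, which is what allows the method to pinpoint and remove only the small number of points driving the subgroup errors.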

Funded, in part, by the National Science Foundation and the U.S. Defense Advanced Research Projects Agency (DARPA), this new technique has significant real-world potential. It could one day, for instance, be used to reduce misdiagnosis in the healthcare sector and provide more accurate treatment options for underrepresented patients.


Discover how Narus can help you harness AI safely and successfully.