For more information about the broad and growing landscape of research related to bias in AI, we recommend the excellent and papers, both of which give useful overviews of methods related to bias and fairness. We take inspiration from related initiatives such as Datasheets for Datasets, A Nutrition Label for Rankings, and Data Statements for Natural Language Processing, and have been heartened to see that this area of work has inspired some of the large platforms and research initiatives to launch their own related projects, including Apple’s Privacy Labels, Google’s Model Cards, IBM’s AI FactSheets 360, and the Partnership on AI’s About ML project. You can learn more about our work and its position in the landscape in our published white papers.

The exercise of mapping the space for our internal use as a team has proved invaluable in articulating a clear and growing need for effective dataset documentation and algorithmic auditing. To understand the unique offering of our Label, and to learn from others so that we do not reinvent the wheel, we have been tracking related research, the development of new and related tools, and the general trajectory of labeling as an intervention. Since 2018, we have seen a confluence of initiatives arise in the domain of tools to combat bias in data.

Read more about the methodology behind the second generation in our most recent white paper.

DNP is a research organization as well as a product development team. Alongside development of the tool, we have been doing ongoing research into the broader landscape of tools and practices designed to address problems in underlying data, whether due to the data itself, the data collection practices, or the dataset documentation.
The second-generation Dataset Nutrition Label now provides targeted information about a dataset based on its intended use case, including alerts and flags that are pertinent to that particular use. Our belief is that deeper transparency into dataset health can lead to better data decisions, which in turn lead to better AI.

Founded in 2018 through the Assembly Fellowship, the Data Nutrition Project takes inspiration from nutritional labels on food, aiming to build labels that highlight the key ingredients in a dataset, such as metadata and populations, as well as unique or anomalous features regarding distributions, missing data, and comparisons to other ‘ground truth’ datasets.

Building on the ‘modular’ framework initially presented in our 2018 prototype, and based on feedback from data scientists and dataset owners, we have further adjusted the Label to support a common user journey: a data scientist looking for a dataset with a particular purpose in mind. The Data Nutrition Project aims to create a standard label for interrogating datasets.