Data-Centric AI

Project Member(s): Xu, M.

Funding or Partner Organisation: Google LLC (Google PhD Fellowship)

Start year: 2022

Summary: The great advances in deep learning over the past decades have been powered by ever-bigger models crunching ever-bigger amounts of data. Building and using datasets for AI systems is often artisanal: painstaking and expensive. Training on such massive data also incurs huge computational and infrastructural costs. How to efficiently create a dataset of high-quality samples has therefore become an active research topic in the machine learning community, known as data-centric AI (DCAI). DCAI represents a shift in focus from the model to the underlying data, with the goal of making building, maintaining, and evaluating datasets easier, cheaper and more repeatable. DCAI aims to provide machine learning-based tools for automated data governance, such as data denoising, data condensation, data augmentation and data quality evaluation. In this project, we address research problems in several aspects of data-centric AI, including data reliability, data interpretability, and data efficiency.

FOR Codes: Computer vision, Machine learning, Application software packages