Presentation + Paper
6 June 2022
Using influence functions to identify potential improvements for synthetic data generation
S. Ross Glandon, Rajeev Agrawal, Jing-Ru C. Cheng, Andrew Maxwell
Abstract

Computer vision, enabled by artificial intelligence and deep learning, has a nearly limitless number of possible applications, both military and civilian. Object detection methods are a particularly notable type of computer vision, with broad usefulness in a variety of systems, such as autonomous vehicles, robotics, and security. Development of effective object detection methods faces many challenges; a significant one is the lack of good labeled data for the target domain, as hand-labeled data is time-consuming and expensive to produce. Synthetic data generation seeks to solve this problem by programmatically generating both training data and labels simultaneously, allowing for the creation of arbitrarily large training datasets. However, synthetic data has several drawbacks: generating realistic imagery is challenging and computationally expensive, and models trained with synthetic data frequently suffer in accuracy when applied to real test data.

In our research, we use model explainability techniques to connect model predictions back to the model training data, in order to identify the most important features that need to be represented accurately in synthetic training data. Influence functions score training samples by how much each sample influenced a particular prediction, by approximating the effect of retraining the model with that individual sample left out of the training set. In this work, we seek to extend influence functions to identify the most valuable features in real and synthetic training data for use in improving our synthetic data generation tools.
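The leave-one-out idea behind influence functions can be illustrated with a small sketch. This is not the authors' method or code: it assumes the commonly used formulation in which the influence of a training sample z on the loss at a test sample is -grad L(z_test)^T H^{-1} grad L(z), with H the Hessian of the training objective, and it substitutes a toy ridge regression model for an object detector so that the Hessian is exact and actual retraining is cheap. All function names and parameters (`fit`, `influence_scores`, `actual_loo_change`, `lam`) are hypothetical.

```python
import numpy as np

def fit(X, y, lam, n_total=None):
    """Minimize (1/n_total) * sum_i 0.5*(x_i @ theta - y_i)**2 + (lam/2)*||theta||**2."""
    n = n_total if n_total is not None else len(X)
    H = X.T @ X / n + lam * np.eye(X.shape[1])
    return np.linalg.solve(H, X.T @ y / n)

def influence_scores(X, y, x_test, y_test, lam):
    """Return -grad L(z_test)^T H^{-1} grad L(z_i) for every training sample i."""
    n, d = X.shape
    theta = fit(X, y, lam)
    H = X.T @ X / n + lam * np.eye(d)            # Hessian of the training objective
    g_train = (X @ theta - y)[:, None] * X       # per-sample loss gradients, shape (n, d)
    g_test = (x_test @ theta - y_test) * x_test  # test loss gradient, shape (d,)
    s_test = np.linalg.solve(H, g_test)          # H^{-1} grad L(z_test); H is symmetric
    return -g_train @ s_test

def actual_loo_change(X, y, x_test, y_test, lam):
    """Change in test loss after really retraining without each sample in turn."""
    n = len(X)
    base = 0.5 * (x_test @ fit(X, y, lam) - y_test) ** 2
    changes = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        # Keep the 1/n scaling so that removal equals upweighting sample i by -1/n.
        theta_i = fit(X[keep], y[keep], lam, n_total=n)
        changes[i] = 0.5 * (x_test @ theta_i - y_test) ** 2 - base
    return changes

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    w = rng.normal(size=5)
    y = X @ w + 0.5 * rng.normal(size=200)
    x_test = rng.normal(size=5)
    y_test = x_test @ w + 1.0                    # fixed offset so the test residual is not ~0

    scores = influence_scores(X, y, x_test, y_test, lam=0.01)
    predicted = -scores / len(X)                 # removal corresponds to epsilon = -1/n
    actual = actual_loo_change(X, y, x_test, y_test, lam=0.01)
    print("correlation:", np.corrcoef(predicted, actual)[0, 1])
    print("most harmful training samples:", np.argsort(predicted)[:5])
```

In this sketch, samples whose removal is predicted to lower the test loss the most are listed as the most harmful. For a deep object detector the Hessian cannot be formed explicitly, so the inverse-Hessian-vector product would have to be approximated; that step is outside the scope of this toy example.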
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
S. Ross Glandon, Rajeev Agrawal, Jing-Ru C. Cheng, and Andrew Maxwell "Using influence functions to identify potential improvements for synthetic data generation", Proc. SPIE 12113, Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications IV, 1211309 (6 June 2022); https://doi.org/10.1117/12.2619021
KEYWORDS
Data modeling
Statistical modeling
Visual process modeling
Machine learning
Machine vision
Computer vision technology
Image sensors
