Crucial Data Omitted During Self-Driving Tests
Feb. 21, 2020—A popular self-driving car dataset contains thousands of labeling omissions, VentureBeat reported. Udacity Dataset 2 comprises 15,000 images captured while driving in Mountain View and neighboring cities during daylight.
The errors concern the labels that identify the objects an onboard artificial intelligence sees and catalogs. Labels allow an AI system to understand the implications of patterns; mislabeled or unlabeled items can lead to low accuracy and, in turn, to poor decision-making.
Thousands of unlabeled vehicles, hundreds of unlabeled pedestrians, and dozens of unlabeled cyclists are present in roughly 5,000 of the samples, or 33 percent (217 lack any annotations at all but actually contain cars, trucks, street lights, or pedestrians).
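An audit that surfaces figures like these can be sketched as a pass over the annotation records, counting images that carry no labels at all. The sketch below is a hypothetical simplification; the dictionary schema and file names are illustrative assumptions, not Udacity Dataset 2's actual format.

```python
# Hypothetical audit sketch: flag images with no annotations.
# The mapping (image name -> list of labeled boxes) is an assumed,
# simplified schema, not the real Udacity Dataset 2 layout.

def audit(annotations):
    """Return (count, fraction) of images that carry no labels at all."""
    empty = [img for img, boxes in annotations.items() if not boxes]
    return len(empty), len(empty) / len(annotations)

# Toy data: three frames, one of which is entirely unannotated
# even though it may actually contain cars or pedestrians.
toy = {
    "frame_001.jpg": [("car", (10, 20, 50, 60))],
    "frame_002.jpg": [],  # no annotations at all
    "frame_003.jpg": [("pedestrian", (5, 5, 15, 40))],
}

count, frac = audit(toy)
print(count)            # number of fully unannotated frames
print(round(frac, 2))   # their share of the dataset
```

Run over a full annotation set, the same pass would yield counts comparable to the 217 fully unannotated samples described above.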
In a statement, Udacity noted that it created the data set “as a tool purely for educational purposes” and that it never suggested the data set was fully labeled or complete. It also said that its self-driving car — which currently operates for educational purposes only on a closed test track — hasn’t operated on public streets for several years.