Abstract
Embedded intelligent systems ranging from tiny implantable biomedical devices to large swarms of autonomous unmanned aerial systems are becoming pervasive in our daily lives. While we depend on the flawless functioning of such intelligent systems, and often take their behavioral correctness and safety for granted, it is notoriously difficult to generate test cases that expose subtle errors in the implementations of machine learning algorithms. Hence, the validation of intelligent systems is usually achieved by studying their behavior on representative data sets, using methods such as cross-validation and bootstrapping. In this paper, we present a new testing methodology for studying
the correctness of intelligent systems. Our approach uses symbolic
decision procedures coupled with statistical hypothesis testing to generate test cases that expose such errors. We also use our algorithm to analyze the
robustness of a human detection algorithm built using the OpenCV
open-source computer vision library. We show that the human
detection implementation can fail to detect humans in perturbed
video frames even when the perturbations are so small that the
corresponding frames look identical to the naked eye.
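The kind of imperceptible perturbation described above can be sketched as follows. This is a minimal illustration, not the paper's actual algorithm: it adds noise of at most one gray level per pixel to a synthetic frame, so the perturbed frame is visually indistinguishable from the original, and notes (in comments) how such a frame would be passed to OpenCV's stock HOG-based people detector. The frame contents, noise model, and `eps` bound are illustrative assumptions.

```python
import numpy as np

def perturb_frame(frame, eps=1, seed=0):
    """Add a uniform random perturbation of at most `eps` gray levels
    per pixel, clipped back to the valid [0, 255] range."""
    rng = np.random.default_rng(seed)
    noise = rng.integers(-eps, eps + 1, size=frame.shape, dtype=np.int16)
    return np.clip(frame.astype(np.int16) + noise, 0, 255).astype(np.uint8)

# A synthetic 64x48 grayscale gradient standing in for a real video frame.
frame = (np.arange(48 * 64) % 256).astype(np.uint8).reshape(48, 64)
perturbed = perturb_frame(frame)

# The perturbation is imperceptible: no pixel changes by more than 1/255.
max_diff = int(np.abs(perturbed.astype(np.int16) - frame.astype(np.int16)).max())
print(max_diff)

# In the robustness experiment, both the original and the perturbed frame
# would be fed to OpenCV's default people detector and the detections
# compared, e.g.:
#   hog = cv2.HOGDescriptor()
#   hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
#   boxes, weights = hog.detectMultiScale(perturbed)
```

A mismatch between the detections on `frame` and `perturbed` would be exactly the kind of robustness failure the abstract reports.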