Autonomous and semi-autonomous vehicles are increasingly deployed on land and water, in the air, and in space. These systems strategically use machine learning (ML) components to perceive their environment and arrive at decisions that affect the health, safety, and even security of the vehicle. Unfortunately, very little is understood about the logic behind the decisions taken by these ML components. This is highly undesirable, since an erroneous decision taken by an ML component can have serious legal and ethical ramifications. It is
therefore important to generate “explanations” of decisions taken by ML components used in cyber-physical systems, such that these explanations are credible and also easily interpretable by humans. Since the behaviour of an ML component is often controlled by hundreds of thousands of tunable parameters, finding human-understandable explanations in terms of these parameters is difficult. We therefore propose to view ML components as black boxes, and to infer explanations from samples of their input-output behaviour, drawn according to the principles of Probably Approximately Correct (PAC) learning. We also propose to use techniques from automated reasoning and formal methods to arrive at a set of Pareto-optimal explanations that successively trade off human-understandability against accuracy of prediction. We focus on explanations that can be represented as decision diagrams or loop-free programs in domain-specific languages, with a user-graded collection of predicates on the features. This allows user control over the shape and quality of explanations.
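The proposal above can be illustrated with a minimal sketch. All names here are hypothetical: `black_box` stands in for a queried ML component, the candidate predicates and their complexity grades stand in for a user-graded predicate collection, and the Pareto filter keeps only explanations not dominated in both simplicity and accuracy. Sampling is i.i.d. over the input space, in the spirit of PAC learning.

```python
import random

# Hypothetical black-box ML component (assumption: a trained
# decision model that we may only query, not inspect).
def black_box(x):
    speed, dist = x
    return 1 if dist < 2.0 * speed + 5.0 else 0  # e.g. a "brake" decision

# PAC-style sampling: draw i.i.d. inputs and record the black box's
# outputs; with enough samples, an explanation that agrees with the
# sample is probably approximately correct on the input distribution.
random.seed(0)
samples = [(random.uniform(0, 30), random.uniform(0, 100)) for _ in range(2000)]
data = [(x, black_box(x)) for x in samples]

# Candidate explanations: user-graded predicates on the features,
# listed as (description, predicate, complexity grade).
candidates = [
    ("dist < 40",          lambda s, d: d < 40.0,          1),
    ("speed > 12",         lambda s, d: s > 12.0,          1),
    ("dist < 2*speed",     lambda s, d: d < 2.0 * s,       2),
    ("dist < 2*speed + 5", lambda s, d: d < 2.0 * s + 5.0, 3),
]

def accuracy(pred):
    """Fraction of samples on which the predicate matches the black box."""
    return sum(1 for (s, d), y in data if int(pred(s, d)) == y) / len(data)

scored = [(desc, grade, accuracy(p)) for desc, p, grade in candidates]

# Pareto filter: discard an explanation if another is at least as simple
# AND at least as accurate, and strictly better on one of the two axes.
pareto = [e for e in scored
          if not any(o[1] <= e[1] and o[2] >= e[2]
                     and (o[1] < e[1] or o[2] > e[2])
                     for o in scored)]

for desc, grade, acc in sorted(pareto, key=lambda e: e[1]):
    print(f"grade {grade}: '{desc}'  accuracy {acc:.3f}")
```

A real instantiation would replace the hand-listed predicates with a search over decision diagrams or loop-free DSL programs, typically driven by an automated-reasoning backend; the sketch only shows the sampling-and-trade-off loop.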