Aiding Evaluation of Adversarial AI Defenses

GARD researchers from Two Six Technologies, IBM, MITRE, University of Chicago, and Google Research have collaboratively generated a virtual testbed, toolbox, benchmarking dataset, and training materials to enable this effort. Further, they have made these assets available to the broader research community via a public repository. “What makes you trust a system? Often it comes down to understanding that the system has been tested by a skilled evaluator with the right tools and data. Through this openly available GARD repository, we are providing a starting point for all of these pieces,” noted Draper.

Central to the asset list is a virtual platform called Armory that enables repeatable, scalable, and robust evaluations of adversarial defenses. The Armory "testbed" provides researchers with a way to pit their defenses against known attacks and relevant scenarios. It also lets them alter those scenarios, verifying that a defense delivers repeatable results across a range of attacks.

Armory utilizes a Python library for ML security called Adversarial Robustness Toolbox, or ART. ART provides tools that enable developers and researchers to defend and evaluate their ML models and applications against a number of adversarial threats, such as evasion, poisoning, extraction, and inference. The toolbox was originally developed outside of the GARD program as an academic-to-academic sharing platform. The GARD program is working to mature the library and elevate it to a definitive standard for users, adding datasets and evaluation methodologies as well as complete evaluation workflows. Armory heavily leverages ART library components for attacks and model integration as well as MITRE-generated datasets and scenarios.
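ART's own estimator wrappers and attack classes handle this machinery for real models. Purely as a from-scratch illustration of the evasion threat the library covers, the following numpy sketch runs a Fast Gradient Sign Method attack against a toy logistic-regression "model" and measures the accuracy drop; the model, data, and epsilon here are invented for illustration and are not part of ART or Armory.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained model: logistic regression with fixed weights.
W = rng.normal(size=(2,))
X = rng.normal(size=(200, 2))
y = (X @ W > 0).astype(float)  # labels this model classifies perfectly

def predict(x):
    """Class-1 probability for inputs of shape (n, 2)."""
    return 1.0 / (1.0 + np.exp(-(x @ W)))

def fgsm(x, labels, eps):
    """Fast Gradient Sign Method: step each input in the sign of the
    input-gradient of the sigmoid cross-entropy loss."""
    grad = (predict(x) - labels)[:, None] * W[None, :]
    return x + eps * np.sign(grad)

clean_acc = float(np.mean((predict(X) > 0.5) == (y == 1)))
adv_acc = float(np.mean((predict(fgsm(X, y, eps=1.0)) > 0.5) == (y == 1)))
print(f"clean accuracy: {clean_acc:.2f}, accuracy under FGSM: {adv_acc:.2f}")
```

The same experiment against a real network is what ART's attack classes automate: wrap the model in an estimator, instantiate an attack, and compare accuracy on clean versus perturbed inputs.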

The Adversarial Patches Rearranged in COnText, or APRICOT, benchmark dataset is also available via the repository. APRICOT was created to enable reproducible research on the real-world effectiveness of physical adversarial patch attacks on object detection systems. Uniquely among such resources, the dataset lets users study patches projected into three-dimensional scenes, making it easier to replicate, and ultimately defeat, physical attacks. "Essentially, we're making it easier for researchers to test their defenses and ensure they are actually solving the problems they are designed to address," said Draper.
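APRICOT's annotations are not reproduced here, but as a sketch of the kind of scoring such a benchmark enables, the snippet below checks whether an object detector's outputs include a confident false detection overlapping an annotated patch region. The COCO-style `[x, y, width, height]` box format, the field names, and the thresholds are illustrative assumptions, not APRICOT's actual schema.

```python
def iou(box_a, box_b):
    """Intersection-over-union for [x, y, width, height] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def patch_fooled_detector(patch_box, detections, score_thresh=0.5, iou_thresh=0.5):
    """A patch attack 'succeeds' if the detector reports a confident
    object overlapping the annotated patch region (a false detection)."""
    return any(
        d["score"] >= score_thresh and iou(patch_box, d["box"]) >= iou_thresh
        for d in detections
    )

patch_box = [100, 100, 50, 50]  # hypothetical annotated patch location
detections = [
    {"box": [110, 105, 45, 50], "score": 0.9},  # confident hit on the patch
    {"box": [300, 300, 40, 40], "score": 0.8},  # unrelated detection
]
fooled = patch_fooled_detector(patch_box, detections)
print(fooled)  # the 0.9-score box overlaps the patch region
```

Aggregating this success indicator over many annotated images is, in essence, how a patch attack's real-world effectiveness gets quantified.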

Although adversarial AI is an emerging field, a number of common themes and failure modes have already been observed across current defenses. Often, researchers and developers believe a defense will work across a spectrum of attacks, only to find it lacks robustness against even minor deviations. To help address this challenge, Google Research has made its Self-Study repository available via the GARD evaluation toolkit. The repository contains "test dummies" – defenses that aren't designed to be state-of-the-art but represent a common idea or approach that's used to build defenses. The "dummies" are known to be broken, but they offer a way for researchers to dive into the defenses and go through the process of properly evaluating their faults.
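The self-study "dummies" themselves live in the Google Research repository. Purely as an illustration of one classic failure mode such exercises target, gradient masking, the numpy sketch below wraps a toy classifier in an input-quantization "defense": a naive gradient attack sees zero gradients everywhere and reports the defense as robust, while an adaptive attack in the style of BPDA (taking gradients through the smooth, undefended model) breaks it. All models, data, and parameters are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(2,))
X = rng.normal(size=(200, 2))
y = (X @ W > 0).astype(float)  # labels the base model classifies perfectly

def predict(x):
    return 1.0 / (1.0 + np.exp(-(x @ W)))

def quantize(x, step=0.5):
    """The 'defense': round inputs to a coarse grid before classifying.
    This shatters input gradients, which are zero almost everywhere."""
    return np.round(x / step) * step

def defended_predict(x):
    return predict(quantize(x))

def accuracy(pred_fn, x):
    return float(np.mean((pred_fn(x) > 0.5) == (y == 1)))

eps = 1.0

# Naive evaluation: differentiate through the full defended pipeline.
# The quantizer's gradient is zero a.e., so the sign step is zero and
# the attack makes no progress -- the defense *looks* robust.
x_naive = X + eps * np.sign(np.zeros_like(X))
naive_acc = accuracy(defended_predict, x_naive)

# Adaptive evaluation: approximate the quantizer with the identity on
# the backward pass, take gradients of the smooth model, then evaluate
# the real defended pipeline on the perturbed inputs.
grad = (predict(X) - y)[:, None] * W[None, :]
x_adapt = X + eps * np.sign(grad)
adaptive_acc = accuracy(defended_predict, x_adapt)

print(f"defended accuracy vs naive attack:    {naive_acc:.2f}")
print(f"defended accuracy vs adaptive attack: {adaptive_acc:.2f}")
```

The gap between the two numbers is exactly the lesson the self-study exercises are meant to teach: an evaluation that stops at the naive attack would wrongly certify the defense.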

“The goal is to help the GARD community improve their system evaluation skills by understanding how their ideas really work, and how to avoid common mistakes that detract from their defense’s robustness,” said Draper. “With the Self-Study repository, researchers are provided hands-on understanding. This project is designed to give them in-the-field experience to help improve their evaluation skills.”

DARPA notes that the GARD program’s Holistic Evaluation of Adversarial Defenses repository is publicly available. Interested researchers are encouraged to take advantage of these resources and check back often for updates.