At CAMLIS Red 2025, we introduced BlackIce, an open-source, containerized toolkit that bundles 14 widely used AI security tools into a single, reproducible environment. In this post, we highlight the motivation behind BlackIce, outline its core capabilities, and share resources to help you get started.
BlackIce was motivated by four practical challenges faced by AI red teamers: (1) each tool has a unique setup and configuration that is time consuming, (2) tools often require separate runtime environments because of dependency conflicts, (3) managed notebooks expose a single Python interpreter per kernel, and (4) the tool landscape is large and hard to navigate for newcomers.
Inspired by Kali Linux for traditional penetration testing, BlackIce aims to let teams bypass setup hassles and focus on security testing by providing a ready-to-run container image.
BlackIce provides a version-pinned Docker image that bundles 14 selected open-source tools spanning Responsible AI, Security testing, and classical adversarial ML. Exposed through a unified command-line interface, these tools can be run from the shell or within a Databricks notebook that uses a compute environment built from the image. Below is a summary of the tools included in this initial release, along with their supporting organizations and GitHub star counts at the time of writing:
| Tool | Organization | Stars |
|---|---|---|
| LM Eval Harness | Eleuther AI | 10.3K |
| Promptfoo | Promptfoo | 8.6K |
| CleverHans | CleverHans Lab | 6.4K |
| Garak | NVIDIA | 6.1K |
| ART | IBM | 5.6K |
| Giskard | Giskard | 4.9K |
| CyberSecEval | Meta | 3.8K |
| PyRIT | Microsoft | 2.9K |
| EasyEdit | ZJUNLP | 2.6K |
| Promptmap | N/A | 1K |
| Fuzzy AI | CyberArk | 800 |
| Fickling | Trail of Bits | 560 |
| Rigging | Dreadnode | 380 |
| Judges | Quotient AI | 290 |
To show how BlackIce fits into established AI risk frameworks, we mapped its capabilities to MITRE ATLAS and the Databricks AI Security Framework (DASF). The table below illustrates that the toolkit covers critical areas such as prompt injection, data leakage, hallucination detection, and supply chain security.
| BlackIce Capability | MITRE ATLAS | Databricks AI Security Framework (DASF) |
|---|---|---|
| Prompt-injection and jailbreak testing of LLMs | AML.T0051 LLM Prompt Injection; AML.T0054 LLM Jailbreak; AML.T0056 LLM Meta Prompt Extraction | 9.1 Prompt inject; 9.12 LLM jailbreak |
| Indirect prompt injection via untrusted content (e.g., RAG/email) | AML.T0051 LLM Prompt Injection [Indirect] | 9.9 Input resource control |
| LLM data leakage testing | AML.T0057 LLM Data Leakage | 10.6 Sensitive data output from a model |
| Hallucination stress-testing and detection | AML.T0062 Discover LLM Hallucinations | 9.8 LLM hallucinations |
| Adversarial example generation and evasion testing (CV/ML) | AML.T0015 Evade ML Model; AML.T0043 Craft Adversarial Data | 10.5 Black box attacks |
| Supply-chain and artifact safety scanning (e.g., malicious pickles) | AML.T0010 AI Supply Chain Compromise; AML.T0011.000 Unsafe AI Artifacts | 7.3 ML supply chain vulnerabilities |
BlackIce organizes its integrated tools into two categories. Static tools evaluate AI applications through simple command-line interfaces and require little to no programming expertise. Dynamic tools offer similar evaluation capabilities but also support advanced Python-based customization, allowing users to develop custom attack code. Within the container image, static tools are installed in isolated Python virtual environments (or separate Node.js projects), each maintaining independent dependencies and accessible directly from the CLI. Alternatively, dynamic tools are installed into the global Python environment, with dependency conflicts managed via a global_requirements.txt file.
Some tools in the image required minor additions or modifications to connect seamlessly with Databricks Model Serving endpoints. We applied custom patches to these tools so they can interact directly with Databricks workspaces out of the box.
For a detailed explanation of the build process, including how to add new tools or update tool versions, see the Docker build README in the GitHub repo.
The BlackIce image is available on Databricks’ Docker Hub, and the current version can be pulled using the following command:
To use BlackIce within a Databricks workspace, configure your compute with Databricks Container Services and specify databricksruntime/blackice:17.3-LTS as the Docker image URL in the Docker menu when creating the cluster.
After the cluster is created, you can attach it to this demo notebook to see how multiple AI security tools can be orchestrated within a single environment to test AI models and systems for vulnerabilities such as prompt injections and jailbreak attacks.
Check out our GitHub Repo to learn more about the integrated tools, find examples for running them with Databricks-hosted models, and access all Docker build artifacts.
For additional details on the tool selection process and the Docker image architecture, see our CAMLIS Red Paper.
Product
November 21, 2024/3 min read

