Announcing BlackIce: A Containerized Red Teaming Toolkit for AI Security Testing

Published: January 21, 2026

Summary

Announces the release of BlackIce, an open-source, containerized toolkit for AI security testing, first introduced at CAMLIS Red 2025
Explains how BlackIce unifies 14 open-source tools, mapped to MITRE ATLAS and the Databricks AI Security Framework (DASF)
Shares links to the paper, GitHub repo, and Docker image to get started

At CAMLIS Red 2025, we introduced BlackIce, an open-source, containerized toolkit that bundles 14 widely used AI security tools into a single, reproducible environment. In this post, we highlight the motivation behind BlackIce, outline its core capabilities, and share resources to help you get started.

Why BlackIce

BlackIce was motivated by four practical challenges faced by AI red teamers: (1) each tool has its own time-consuming setup and configuration, (2) tools often require separate runtime environments because of dependency conflicts, (3) managed notebooks expose a single Python interpreter per kernel, and (4) the tool landscape is large and hard for newcomers to navigate.

Inspired by Kali Linux for traditional penetration testing, BlackIce aims to let teams bypass setup hassles and focus on security testing by providing a ready-to-run container image.

What’s inside

BlackIce provides a version-pinned Docker image that bundles 14 selected open-source tools spanning responsible AI, security testing, and classical adversarial ML. The tools are exposed through a unified command-line interface and can be run from the shell or from a Databricks notebook attached to compute built from the image. Below is a summary of the tools included in this initial release, along with their supporting organizations and GitHub star counts at the time of writing:

Tool	Organization	Stars
LM Eval Harness	Eleuther AI	10.3K
Promptfoo	Promptfoo	8.6K
CleverHans	CleverHans Lab	6.4K
Garak	NVIDIA	6.1K
ART	IBM	5.6K
Giskard	Giskard	4.9K
CyberSecEval	Meta	3.8K
PyRIT	Microsoft	2.9K
EasyEdit	ZJUNLP	2.6K
Promptmap	N/A	1K
Fuzzy AI	CyberArk	800
Fickling	Trail of Bits	560
Rigging	Dreadnode	380
Judges	Quotient AI	290

To show how BlackIce fits into established AI risk frameworks, we mapped its capabilities to MITRE ATLAS and the Databricks AI Security Framework (DASF). The table below illustrates that the toolkit covers critical areas such as prompt injection, data leakage, hallucination detection, and supply chain security.

BlackIce Capability	MITRE ATLAS	Databricks AI Security Framework (DASF)
Prompt-injection and jailbreak testing of LLMs	AML.T0051 LLM Prompt Injection; AML.T0054 LLM Jailbreak; AML.T0056 LLM Meta Prompt Extraction	9.1 Prompt inject; 9.12 LLM jailbreak
Indirect prompt injection via untrusted content (e.g., RAG/email)	AML.T0051 LLM Prompt Injection [Indirect]	9.9 Input resource control
LLM data leakage testing	AML.T0057 LLM Data Leakage	10.6 Sensitive data output from a model
Hallucination stress-testing and detection	AML.T0062 Discover LLM Hallucinations	9.8 LLM hallucinations
Adversarial example generation and evasion testing (CV/ML)	AML.T0015 Evade ML Model; AML.T0043 Craft Adversarial Data	10.5 Black box attacks
Supply-chain and artifact safety scanning (e.g., malicious pickles)	AML.T0010 AI Supply Chain Compromise; AML.T0011.000 Unsafe AI Artifacts	7.3 ML supply chain vulnerabilities
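
To make the last row more concrete, here is a minimal sketch of the kind of check an artifact scanner can run on a pickle file. It uses only Python's standard pickletools module and is not Fickling's implementation or API; the opcode list and the Payload class are illustrative assumptions. The idea is to walk the opcode stream without ever loading the pickle and to flag opcodes that can import names or call functions at load time.

```python
import pickle
import pickletools

# Opcodes that can import names or invoke callables during unpickling.
# This set is an illustrative assumption, not an exhaustive policy.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}


def suspicious_opcodes(data: bytes) -> list[str]:
    """Return the names of risky opcodes found in a pickle, in stream order.

    The pickle is only disassembled, never loaded, so nothing is executed.
    """
    return [
        op.name
        for op, _arg, _pos in pickletools.genops(data)
        if op.name in SUSPICIOUS_OPCODES
    ]


class Payload:
    """Hypothetical malicious object: unpickling it would call print(...)."""

    def __reduce__(self):
        return (print, ("this would run at load time",))


benign = pickle.dumps({"weights": [0.1, 0.2]})
malicious = pickle.dumps(Payload())

print("benign:", suspicious_opcodes(benign))
print("malicious:", suspicious_opcodes(malicious))
```

A plain dictionary of floats produces no flagged opcodes, while the payload object produces a global lookup followed by a REDUCE call. Real scanners go further, for example by inspecting which modules are imported, so treat this only as an illustration of the principle.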

How It Works

BlackIce organizes its integrated tools into two categories. Static tools evaluate AI applications through simple command-line interfaces and require little to no programming expertise. Dynamic tools offer similar evaluation capabilities but also support advanced Python-based customization, allowing users to develop custom attack code. Within the container image, static tools are installed in isolated Python virtual environments (or separate Node.js projects), each maintaining independent dependencies and accessible directly from the CLI. Dynamic tools, in contrast, are installed into the global Python environment, with dependency conflicts managed via a global_requirements.txt file.
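
The isolation of static tools can be sketched as follows. This is not BlackIce's actual layout or CLI: the /opt/venvs root, the tool name example-tool, and both helper functions are hypothetical and serve only to illustrate the pattern. Each static tool gets its own virtual environment, and calling the executable inside that environment makes the tool run with its own interpreter and dependencies, independent of the global Python environment used by the dynamic tools.

```python
import os
import subprocess
from pathlib import Path

# Hypothetical location of per-tool virtual environments inside the image.
VENV_ROOT = Path("/opt/venvs")


def static_tool_command(tool, entry_point, args, venv_root=VENV_ROOT):
    """Build the argv for a CLI tool that lives in its own virtual environment."""
    # Virtual environments keep executables in bin/ (POSIX) or Scripts/ (Windows).
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    return [str(venv_root / tool / bin_dir / entry_point), *args]


def run_static_tool(tool, entry_point, args, venv_root=VENV_ROOT):
    """Run the tool in a subprocess so its dependencies never touch the caller's."""
    cmd = static_tool_command(tool, entry_point, args, venv_root)
    return subprocess.run(cmd, capture_output=True, text=True, check=False)


# Only builds the command; nothing is executed here.
cmd = static_tool_command("example-tool", "example-tool", ["--help"])
print(cmd)
```

Because the tool runs in a separate process, this pattern also works from a notebook, where the kernel itself exposes only a single Python interpreter.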

Some tools in the image required minor additions or modifications to connect seamlessly with Databricks Model Serving endpoints. We applied custom patches to these tools so they can interact directly with Databricks workspaces out of the box.

For a detailed explanation of the build process, including how to add new tools or update tool versions, see the Docker build README in the GitHub repo.

Get Started

The BlackIce image is available on Databricks’ Docker Hub, and the current version can be pulled using the following command:

docker pull databricksruntime/blackice:17.3-LTS

To use BlackIce within a Databricks workspace, configure your compute with Databricks Container Services and specify databricksruntime/blackice:17.3-LTS as the Docker image URL in the Docker menu when creating the cluster.
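
If you create clusters programmatically rather than through the UI, the same setting can be expressed in a cluster specification. The sketch below builds such a specification as a plain Python dictionary. The docker_image.url field follows the Databricks Clusters API as we understand it; the spark_version string, node type, cluster name, and the helper function itself are assumptions for illustration, so check them against your workspace before use.

```python
import json

# Image tag taken from this post.
BLACKICE_IMAGE = "databricksruntime/blackice:17.3-LTS"


def blackice_cluster_spec(cluster_name, node_type_id, num_workers=1):
    """Build a cluster specification that uses the BlackIce image.

    Hypothetical helper: it only assembles a dictionary and calls no API.
    """
    return {
        "cluster_name": cluster_name,
        # Assumption: runtime version string matching the 17.3 LTS image.
        "spark_version": "17.3.x-scala2.13",
        "node_type_id": node_type_id,
        "num_workers": num_workers,
        # Databricks Container Services: custom image for the cluster.
        "docker_image": {"url": BLACKICE_IMAGE},
    }


# Cluster name and node type are placeholders.
spec = blackice_cluster_spec("blackice-redteam", "i3.xlarge")
print(json.dumps(spec, indent=2))
```

The resulting JSON could then be submitted with whatever client you already use for cluster management. Databricks Container Services must be enabled in the workspace for the docker_image setting to take effect.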

After the cluster is created, you can attach it to this demo notebook to see how multiple AI security tools can be orchestrated within a single environment to test AI models and systems for vulnerabilities such as prompt injections and jailbreak attacks.

Check out our GitHub Repo to learn more about the integrated tools, find examples for running them with Databricks-hosted models, and access all Docker build artifacts.

For additional details on the tool selection process and the Docker image architecture, see our CAMLIS Red Paper.
