Skip to main content

Announcing BlackIce: A Containerized Red Teaming Toolkit for AI Security Testing

Announcing BlackIce: A Containerized Red Teaming Toolkit for AI Security Testing

Published: January 21, 2026

Security and Trust3 min read

Summary

  • Announces the release of BlackIce, an open-source, containerized toolkit for AI security testing, first introduced at CAMLIS Red 2025
  • Explains how BlackIce unifies 14 open-source tools, mapped to MITRE ATLAS and the Databricks AI Security Framework (DASF)
  • Shares links to the paper, GitHub repo, and Docker image to get started

At CAMLIS Red 2025, we introduced BlackIce, an open-source, containerized toolkit that bundles 14 widely used AI security tools into a single, reproducible environment. In this post, we highlight the motivation behind BlackIce, outline its core capabilities, and share resources to help you get started.

Why BlackIce

BlackIce was motivated by four practical challenges faced by AI red teamers: (1) each tool has a unique setup and configuration that is time consuming, (2) tools often require separate runtime environments because of dependency conflicts, (3) managed notebooks expose a single Python interpreter per kernel, and (4) the tool landscape is large and hard to navigate for newcomers. 

Inspired by Kali Linux for traditional penetration testing, BlackIce aims to let teams bypass setup hassles and focus on security testing by providing a ready-to-run container image.

What’s inside

BlackIce provides a version-pinned Docker image that bundles 14 selected open-source tools spanning Responsible AI, Security testing, and classical adversarial ML. Exposed through a unified command-line interface, these tools can be run from the shell or within a Databricks notebook that uses a compute environment built from the image. Below is a summary of the tools included in this initial release, along with their supporting organizations and GitHub star counts at the time of writing:

ToolOrganizationStars
LM Eval HarnessEleuther AI10.3K
PromptfooPromptfoo8.6K
CleverHansCleverHans Lab6.4K
GarakNVIDIA6.1K
ARTIBM5.6K
GiskardGiskard4.9K
CyberSecEvalMeta3.8K
PyRITMicrosoft2.9K
EasyEditZJUNLP2.6K
PromptmapN/A1K
Fuzzy AICyberArk800
FicklingTrail of Bits560
RiggingDreadnode380
JudgesQuotient AI290

To show how BlackIce fits into established AI risk frameworks, we mapped its capabilities to MITRE ATLAS and the Databricks AI Security Framework (DASF). The table below illustrates that the toolkit covers critical areas such as prompt injection, data leakage, hallucination detection, and supply chain security.

BlackIce CapabilityMITRE ATLASDatabricks AI Security Framework (DASF)
Prompt-injection and jailbreak testing of LLMsAML.T0051 LLM Prompt Injection; AML.T0054 LLM Jailbreak; AML.T0056 LLM Meta Prompt Extraction9.1 Prompt inject; 9.12 LLM jailbreak
Indirect prompt injection via untrusted content (e.g., RAG/email)AML.T0051 LLM Prompt Injection [Indirect]9.9 Input resource control
LLM data leakage testingAML.T0057 LLM Data Leakage10.6 Sensitive data output from a model
Hallucination stress-testing and detectionAML.T0062 Discover LLM Hallucinations9.8 LLM hallucinations
Adversarial example generation and evasion testing (CV/ML)AML.T0015 Evade ML Model; AML.T0043 Craft Adversarial Data10.5 Black box attacks
Supply-chain and artifact safety scanning (e.g., malicious pickles)AML.T0010 AI Supply Chain Compromise; AML.T0011.000 Unsafe AI Artifacts7.3 ML supply chain vulnerabilities

How It Works

BlackIce organizes its integrated tools into two categories. Static tools evaluate AI applications through simple command-line interfaces and require little to no programming expertise. Dynamic tools offer similar evaluation capabilities but also support advanced Python-based customization, allowing users to develop custom attack code. Within the container image, static tools are installed in isolated Python virtual environments (or separate Node.js projects), each maintaining independent dependencies and accessible directly from the CLI. Alternatively, dynamic tools are installed into the global Python environment, with dependency conflicts managed via a global_requirements.txt file.

Some tools in the image required minor additions or modifications to connect seamlessly with Databricks Model Serving endpoints. We applied custom patches to these tools so they can interact directly with Databricks workspaces out of the box.

For a detailed explanation of the build process, including how to add new tools or update tool versions, see the Docker build README in the GitHub repo.

Get Started

The BlackIce image is available on Databricks’ Docker Hub, and the current version can be pulled using the following command:

To use BlackIce within a Databricks workspace, configure your compute with Databricks Container Services and specify databricksruntime/blackice:17.3-LTS as the Docker image URL in the Docker menu when creating the cluster.

After the cluster is created, you can attach it to this demo notebook to see how multiple AI security tools can be orchestrated within a single environment to test AI models and systems for vulnerabilities such as prompt injections and jailbreak attacks.

Check out our GitHub Repo to learn more about the integrated tools, find examples for running them with Databricks-hosted models, and access all Docker build artifacts.

For additional details on the tool selection process and the Docker image architecture, see our CAMLIS Red Paper.

Never miss a Databricks post

Subscribe to our blog and get the latest posts delivered to your inbox

What's next?

Introducing Predictive Optimization for Statistics

Product

November 20, 2024/4 min read

Introducing Predictive Optimization for Statistics

How to present and share your Notebook insights in AI/BI Dashboards

Product

November 21, 2024/3 min read

How to present and share your Notebook insights in AI/BI Dashboards