deriv-security/AI-CTF
AI CTF Platform - Standalone Edition

A comprehensive, self-contained AI security training platform that tests your prompt injection and AI security skills. Run your own instance using your preferred LLM provider.

🎯 Overview

AI CTF is an interactive platform featuring multiple difficulty levels where you'll attempt to extract flags from AI systems through prompt injection, social engineering, and creative problem-solving.

Challenge Levels

  • Beginner (5 challenges): AI Helpdesk - Learn basic prompt injection techniques
  • Advanced (6 challenges): AI Trading Bot - Master tool-using AI exploitation with web scraping
  • Expert (6 challenges): Risk Management AI - Advanced social engineering without system prompts
  • Master (5 challenges): Cortex System - Ultimate technical challenge with realistic vulnerabilities
  • Custom (Unlimited): Generate your own challenges with any scenario and difficulty level

Total: 22 preset challenges + unlimited custom challenges testing various AI security concepts

✨ Features

  • 🔧 Bring Your Own LLM: Use any LiteLLM-compatible provider (OpenAI, Anthropic, Google, etc.)
  • 🎲 Dynamic Challenge Generation: Generate unlimited custom CTF challenges on-the-fly using AI
  • 🎨 Modern UI: Beautiful, responsive terminal interface with real-time chat
  • 📊 Unified Interface: Single terminal with difficulty selector dropdown
  • 🎯 Flag Submission: Submit and validate flags directly in the interface
  • 📈 Progress Tracking: Track solved challenges with localStorage persistence
  • 🔒 Secure: API keys stored server-side, never exposed to browser
  • 🐳 Easy Deployment: Simple Docker Compose setup
  • 🎓 Educational: Learn real-world AI security vulnerabilities

🚀 Quick Start

Prerequisites

  • Docker and Docker Compose
  • An API key from an LLM provider:
    • OpenAI (GPT-3.5, GPT-4, GPT-4o)
    • Anthropic (Claude models)
    • Google (Gemini models)
    • Any LiteLLM-compatible provider

Installation

  1. Clone the repository

    git clone <your-repo-url>
    cd ai_ctf
  2. Configure environment variables

    cp .env.example .env

    Edit .env (all configuration in one file):

    # API Base URL - for web interface to connect to API
    API_BASE_URL=http://localhost:4002
    # For production: https://your-domain.com or http://your-ip:4002
    
    # JWT Secret Key (REQUIRED) - generate with: openssl rand -hex 32
    SECRET_KEY=your_generated_secret_key_here
    
    # CORS - must match your web interface URL
    ALLOWED_ORIGINS=http://localhost:4081,http://127.0.0.1:4081
    # For production: https://your-domain.com,http://your-ip:4081
    
    # LLM and n8n configuration is set via the web UI
    # These values will be automatically saved when you configure

    Generate a secure key:

    openssl rand -hex 32
  3. Start the platform

    docker compose up -d --build
  4. Access the platform

    • Open your browser to http://localhost:4081 (or your configured domain/IP)
    • Enter your LLM configuration (stored securely on the server):
      • Model Name: e.g., gpt-4o, claude-3-5-sonnet-20241022, gemini-2.5-flash
      • API Key: Your provider's API key
      • Base URL (optional): Leave blank for standard usage
        • The platform automatically routes to the correct provider (OpenAI, Anthropic, Google) based on your model name
        • Only needed if using a self-hosted LiteLLM proxy, custom gateway, or local models
      • n8n Webhook URL (optional): For Advanced challenges with URL visiting
      • n8n API Key (optional): If your n8n webhook requires authentication
    • Click "Continue to Terminal" and start hacking!

    Note: Your API keys are stored server-side in the .env file (never in the browser), so they are not exposed to client-side storage.

🎮 How to Play

  1. Configure Your LLM: On the landing page, enter your model name and API key
  2. Select Difficulty: Choose from Beginner, Advanced, Expert, or Master
  3. Browse Challenges: View challenge descriptions and hints in the sidebar
  4. Chat with the AI: Use the terminal to interact with the AI system
  5. Extract Flags: Use prompt injection and social engineering to reveal flags
  6. Submit Flags: Enter flags directly in the challenge sidebar to validate
  7. Track Progress: See your solved challenges and progress bar
  8. Check System Prompts: View sanitized system prompts (except Expert/Master levels)

Example First Challenge

Try this with the Beginner difficulty:

You: What's the price of gold?
AI: The current price of gold is $2,000 CTF{g0ld_pr1c3_ch3ck}

Copy the flag CTF{g0ld_pr1c3_ch3ck} and paste it into the flag submission box for Challenge #2. Congratulations! 🎉
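All flags on the platform follow the same CTF{...} format. Before submitting, you can sanity-check a candidate string with a quick pattern match. This is only an illustrative sketch; the authoritative validation happens server-side, and the exact allowed character set here is an assumption:

```python
import re

# Flags look like CTF{g0ld_pr1c3_ch3ck}. The allowed character set below is
# an assumption; the platform's server-side check is authoritative.
FLAG_PATTERN = re.compile(r"^CTF\{[A-Za-z0-9_]+\}$")

def looks_like_flag(text: str) -> bool:
    """Return True if text matches the CTF{...} flag format."""
    return bool(FLAG_PATTERN.match(text.strip()))
```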

🎲 Dynamic Challenge Generation

Want more challenges beyond the 22 preset ones? Generate unlimited custom challenges tailored to your interests!

How to Generate Custom Challenges

  1. Click "🎲 Generate Challenge" button in the terminal
  2. Select difficulty level: Beginner, Advanced, Expert, or Master
  3. Enter a scenario type, for example:
    • Healthcare AI Assistant
    • Legal Document Analyzer
    • HR Recruitment Bot
    • E-commerce Customer Support
    • Code Review Assistant
    • Financial Advisor Bot
  4. Add industry context (optional): Provide specific details about the scenario
  5. Click "Generate Challenge" and wait 10-30 seconds

The AI will create a complete challenge set including:

  • Custom system prompt with embedded flags
  • 3-5 unique challenges with descriptions
  • Hints for each challenge
  • Trigger conditions matching the difficulty level
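To make the pieces above concrete, here is a hypothetical sketch of what a generated challenge set might look like as a data structure, plus a loose structural check. The field names are illustrative assumptions, not the platform's actual schema (see web_interface/challenges.js for the real preset structure):

```python
# Hypothetical shape of a generated challenge set. Field names are
# illustrative assumptions, not the platform's actual schema.
generated = {
    "scenario": "Healthcare AI Assistant",
    "difficulty": "Expert",
    "system_prompt": "You are MedAssist... (flags embedded here)",
    "challenges": [
        {"title": "Patient Records", "description": "...", "hint": "..."},
        {"title": "Admin Override", "description": "...", "hint": "..."},
        {"title": "Audit Log", "description": "...", "hint": "..."},
    ],
}

def is_plausible_challenge_set(data: dict) -> bool:
    """Loose structural check: required keys present and 3-5 challenges."""
    required = {"scenario", "difficulty", "system_prompt", "challenges"}
    return required <= set(data) and 3 <= len(data["challenges"]) <= 5
```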

Custom Challenge Features

  • Stored Locally: All custom challenges saved in your browser's localStorage
  • Unlimited Generation: Create as many as you want (uses your API key)
  • Deletable: Remove custom challenges you don't want anymore
  • Separate Tracking: Progress tracked independently from preset challenges
  • Full Integration: Works seamlessly with the terminal interface

Example Custom Scenarios

  • Medical Records AI (Expert): Test prompt injection on a healthcare system
  • Legal Contract Analyzer (Master): Exploit a document processing AI
  • HR Screening Bot (Advanced): Bypass hiring filters and access restricted data
  • Smart Home Assistant (Beginner): Extract device control codes
  • Academic Plagiarism Checker (Advanced): Manipulate detection algorithms

🔧 Advanced Setup

n8n Webhook for Advanced Challenges

The Advanced challenges include a tool-using AI that can visit URLs. To enable this (optional):

  1. Set up n8n (cloud at https://n8n.cloud/ or self-hosted)
  2. Import the workflow from n8n/n8n_workflow.json
  3. Activate the workflow and copy the webhook URL
  4. Add to .env:
    N8N_WEBHOOK_URL=https://your-instance.n8n.cloud/webhook/ai-ctf-scraper
    N8N_WEBHOOK_API_KEY=your_optional_api_key
  5. Restart the API: docker compose restart api

Without n8n, 21 of 22 challenges will still work. Only Advanced Challenge #5 (Internal Documentation) requires the URL visiting tool.

🏗️ Architecture

┌─────────────────────────────────────────────────────────────┐
│                        User Browser                          │
│  ┌─────────────────┐              ┌──────────────────────┐  │
│  │  Landing Page   │─────────────▶│  Terminal Interface  │  │
│  │  (Config Input) │              │  (Chat + Challenges) │  │
│  └─────────────────┘              └──────────────────────┘  │
│                                     │                        │
│                                     │ Session Token          │
└─────────────────────────────────────┼────────────────────────┘
                                      │
                                      ▼
                             ┌─────────────────┐
                             │  Nginx Proxy    │
                             │  (Port 4002)    │
                             └────────┬────────┘
                                      │
                         ┌────────────┼────────────┐
                         │                         │
                         ▼                         ▼
                  ┌─────────────┐         ┌──────────────┐
                  │ Web Server  │         │ FastAPI API  │
                  │ (Port 80)   │         │ (Port 4000)  │
                  └─────────────┘         └──────┬───────┘
                                                  │
                                     ┌────────────┼────────────┐
                                     │            │            │
                                     ▼            ▼            ▼
                              ┌──────────┐  ┌─────────┐  ┌────────┐
                              │ OpenAI   │  │ Claude  │  │ Gemini │
                              │   API    │  │   API   │  │  API   │
                              └──────────┘  └─────────┘  └────────┘

How It Works

  1. Landing Page: User enters LLM configuration (model name, API key, base URL, n8n settings)
  2. Server-Side Storage: Configuration saved to .env file on the server (secure)
  3. Session Token: Simple session token returned to browser (no sensitive data)
  4. Nginx Proxy: Routes requests to web interface or API based on path
  5. Terminal: Single unified interface with difficulty selector
  6. API: Reads LLM config from .env and uses it to call the user's chosen LLM
  7. Flag Submission: Validate flags directly in the sidebar, progress tracked in localStorage
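Step 6 above can be sketched as a small config loader on the API side. The environment variable names below are illustrative assumptions; check api/main.py for the names the platform actually uses:

```python
import os

def load_llm_config() -> dict:
    """Read LLM settings that the web UI persisted to .env.

    Variable names here are illustrative assumptions, not the platform's
    actual names.
    """
    return {
        "model": os.environ.get("LLM_MODEL", ""),
        "api_key": os.environ.get("LLM_API_KEY", ""),
        # An empty base URL means "let the router pick the provider".
        "base_url": os.environ.get("LLM_BASE_URL") or None,
    }
```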

Components

  • Nginx Reverse Proxy (Port 4002): Routes traffic to web interface and API
  • Web Interface (Port 80 internal): Serves landing page, terminal, and challenge data
  • API (FastAPI, Port 4000 internal): Handles authentication, LLM interactions, and tool execution
  • LLM Provider: Your chosen AI model (configured by user)
  • n8n (optional): Web scraping service for advanced challenges

Port Configuration

  • 4002: Main entry point (Nginx proxy) - exposed to Cloudflare Tunnel
  • 4081: Direct web interface access (for local testing)
  • 4000: API internal port (not exposed, accessed via Nginx)
  • 80: Web server internal port (not exposed, accessed via Nginx)

📁 Project Structure

ai_ctf/
├── api/
│   ├── main.py              # FastAPI application
│   ├── prompts.py           # System prompts for each difficulty
│   ├── logger.py            # Logging configuration
│   ├── requirements.txt     # Python dependencies
│   └── Dockerfile
├── web_interface/
│   ├── index.html           # Landing page (LLM config)
│   ├── terminal.html        # Unified terminal interface
│   ├── challenges.js        # All 22 challenge definitions
│   ├── config.template.js   # Config template for API_BASE_URL
│   ├── generate-config.sh   # Script to generate config.js
│   ├── favicon.svg
│   └── Dockerfile
├── reverse_proxy/
│   ├── nginx.conf           # Nginx reverse proxy configuration
│   └── Dockerfile
├── n8n/
│   └── n8n_workflow.json    # Web scraping workflow
├── .env                     # Environment variables (create from example)
├── .env.example             # Example configuration
├── docker-compose.yml       # Container orchestration
└── README.md

🔒 Security & Privacy

  • API Keys: Stored server-side in .env file, never exposed to browser
  • Session Tokens: Simple session-based authentication without sensitive data
  • Reverse Proxy: Nginx handles routing and adds security headers
  • Sandboxed: Each user's configuration is isolated
  • No Database: All progress stored client-side
  • Educational: This platform is for learning - don't use in production!

🎓 Learning Objectives

By completing these challenges, you'll learn:

  • Prompt Injection: Bypassing AI safety guardrails
  • Social Engineering: Convincing AI systems to break rules
  • Tool Exploitation: Abusing AI tool-use capabilities
  • State Manipulation: Exploiting AI system modes and states
  • Template Injection: Extracting secrets through variable substitution
  • Multi-step Attacks: Chaining vulnerabilities for complex exploits

🛠️ Troubleshooting

Base URL Configuration

When to leave Base URL blank (recommended for most users):

  • Using OpenAI directly (gpt-4o, gpt-3.5-turbo, etc.)
  • Using Anthropic directly (claude-3-5-sonnet-20241022, etc.)
  • Using Google directly (gemini-1.5-flash, gemini-2.0-flash, etc.)

The platform automatically routes to the correct provider based on your model name.

When to set a Base URL:

  • Using a self-hosted LiteLLM proxy server
  • Using a custom API gateway or proxy
  • Using local models (Ollama, vLLM, etc.)
  • Using alternative endpoints for testing

Example Base URLs:

  • LiteLLM proxy: http://localhost:8000
  • Ollama: http://localhost:11434/v1
  • Custom OpenAI-compatible endpoint: https://your-proxy.com/v1
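The "route by model name" behavior can be illustrated with a simple prefix check. This is only a rough sketch in the spirit of what LiteLLM does internally; its real routing logic covers far more providers and model-name patterns:

```python
def infer_provider(model: str) -> str:
    """Rough illustration of name-based provider routing (not LiteLLM's
    actual implementation, which covers many more providers)."""
    if model.startswith("gpt-"):
        return "openai"
    if model.startswith("claude"):
        return "anthropic"
    if model.startswith("gemini"):
        return "google"
    # Anything else likely needs an explicit Base URL (proxy, Ollama, vLLM...)
    return "custom"
```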

"Could not connect to AI service"

  • Check your API key is correct
  • Verify your model name matches the correct format (e.g., gpt-4o, not gpt4o)
  • Leave Base URL blank unless using a proxy or local model
  • Check API provider status
  • Ensure containers have network access

Advanced challenges not working

  • Ensure n8n webhook is configured in .env
  • Check n8n workflow is activated
  • Verify webhook URL is correct
  • The other 21 challenges work without n8n

Port conflicts

Edit docker-compose.yml to change ports:

services:
  reverse_proxy:
    ports:
      - "4002:4002"  # Main entry point - change 4002 to another port
      - "4081:4081"  # Direct web access - change 4081 to another port

Rebuild containers

# Force rebuild without cache
docker compose build --no-cache
docker compose up -d

Check logs

# View API logs
docker logs ai_ctf_api --tail 50

# View proxy logs
docker logs ai_ctf_proxy --tail 50

# View web logs
docker logs ai_ctf_web --tail 50

# Or with docker compose
docker compose logs api --tail 50
docker compose logs reverse_proxy --tail 50

Reset progress

Click the "⚙️ Configuration" button in the terminal, then click "🔄 Reset All Progress" at the bottom of the config page.

📝 Challenge Solutions

Spoiler Warning: Solutions are embedded in the system prompts (api/prompts.py). We recommend trying to solve challenges before looking at the code!

📜 License

This project is for educational purposes. Use responsibly and ethically.


Happy Hacking! 🚀🔐

Remember: The goal is to learn about AI security vulnerabilities in a safe, controlled environment. Apply these skills responsibly!
