
DeepDefend

Multi-Modal Deepfake Detection System
Detect AI-generated deepfakes in videos using computer vision and audio analysis

Python 3.10+ FastAPI

Overview

DeepDefend is a deepfake detection system that combines video frame analysis with audio analysis to identify AI-generated synthetic media. Machine learning models score each modality, and an LLM fuses the evidence into a detailed, interval-by-interval report with explainable results.

Why DeepDefend?

  • Multi-Modal Analysis: Combines video and audio detection for higher accuracy
  • AI-Powered Fusion: Uses an LLM to generate human-readable reports
  • Interval Breakdown: Shows exactly which parts of the video are suspicious
  • REST API: Easy integration with any frontend or application

Features

Core Detection Capabilities

  • Video Analysis

    • Frame-by-frame deepfake detection using pre-trained models
    • Face detection and region-specific analysis
    • Suspicious region identification (eyes, mouth, face boundaries)
    • Confidence scoring per frame
  • Audio Analysis

    • Voice synthesis detection
    • Spectrogram analysis for audio artifacts
    • Frequency pattern recognition
    • Audio splicing detection
  • AI-Powered Reporting

    • LLM-based evidence fusion (Google Gemini)
    • Natural language explanation of findings
    • Verdict with confidence percentage
    • Timestamped suspicious intervals

Processing Pipeline

Video Input
    ↓
┌───────────────────┐
│ Media Extraction  │ → Extract frames (5 per interval)
│                   │ → Extract audio chunks
└────────┬──────────┘
         │
         ├──────────────────────┬──────────────────────┐
         ▼                      ▼                      ▼
┌─────────────────┐   ┌─────────────────┐   ┌────────────────┐
│ Video Analysis  │   │ Audio Analysis  │   │ Timeline Gen   │
│ • Face detect   │   │ • Spectrogram   │   │ • 2s intervals │
│ • Region scan   │   │ • Voice synth   │   │ • Metadata     │
│ • Fake score    │   │ • Artifacts     │   │                │
└────────┬────────┘   └────────┬────────┘   └────────┬───────┘
         │                     │                     │
         └──────────────┬──────────────┬─────────────┘
                        ▼              ▼
                ┌──────────────────────────┐
                │   LLM Fusion Engine      │
                │ • Combine evidence       │
                │ • Generate verdict       │
                │ • Natural language report│
                └────────────┬─────────────┘
                             ▼
                      Final Report
                    (JSON Response)
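
The interval and frame-sampling steps in the diagram can be sketched in a few lines of Python. This is only an illustration: it assumes fixed 2-second windows with the last one truncated at the end of the video, and 5 frames sampled evenly inside each window. The function names `make_intervals` and `frame_timestamps` are hypothetical and are not taken from the DeepDefend code, whose actual sampling rule may differ.

```python
from typing import List, Tuple


def make_intervals(duration: float, step: float = 2.0) -> List[Tuple[float, float]]:
    """Split [0, duration] into consecutive windows of `step` seconds.

    Assumption: the final window is truncated at `duration`.
    """
    intervals = []
    start = 0.0
    while start < duration:
        end = min(start + step, duration)
        intervals.append((start, end))
        start += step
    return intervals


def frame_timestamps(start: float, end: float, frames: int = 5) -> List[float]:
    """Return `frames` timestamps, one at the midpoint of each equal sub-window."""
    width = (end - start) / frames
    return [start + (i + 0.5) * width for i in range(frames)]


if __name__ == "__main__":
    for start, end in make_intervals(5.0):
        print(start, end, frame_timestamps(start, end))
```

Sampling at sub-window midpoints avoids picking the exact boundary frame twice for adjacent intervals; other strategies (for example, the first 5 frames of each window) would be equally plausible.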

Demo

Live Demo

API: https://deepdefend-api.hf.space
Docs: https://deepdefend-api.hf.space/docs

Example Analysis

Sample output:
{
  "verdict": "DEEPFAKE",
  "confidence": 87.5,
  "overall_scores": {
    "overall_video_score": 0.823,
    "overall_audio_score": 0.756,
    "overall_combined_score": 0.789
  },
  "detailed_analysis": "This video shows strong indicators of deepfake manipulation...",
  "suspicious_intervals": [
    {
      "interval": "4.0-6.0",
      "video_score": 0.891,
      "audio_score": 0.834,
      "video_regions": ["eyes", "mouth"],
      "audio_regions": ["voice_synthesis_artifacts"]
    }
  ],
  "total_intervals_analyzed": 15,
  "video_info": {
    "duration": 12.498711111111112,
    "fps": 29.923085402583734,
    "total_frames": 374,
    "file_size_mb": 31.36
  },
  "analysis_id": "4cd98ea5-8c14-4cae-8da4-689345b0aabc",
  "timestamp": "2025-10-10T23:34:35.724916"
}
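
A client might consume a report like the one above as follows. This is a sketch under assumptions: it relies only on the fields shown in the sample, and the helper name `flag_intervals` and the 0.8 threshold are illustrative choices rather than part of the API.

```python
import json
from typing import Dict, List

# Trimmed copy of the sample report shown above.
SAMPLE = """
{
  "verdict": "DEEPFAKE",
  "confidence": 87.5,
  "suspicious_intervals": [
    {
      "interval": "4.0-6.0",
      "video_score": 0.891,
      "audio_score": 0.834,
      "video_regions": ["eyes", "mouth"],
      "audio_regions": ["voice_synthesis_artifacts"]
    }
  ]
}
"""


def flag_intervals(report: Dict, threshold: float = 0.8) -> List[str]:
    """Return one summary line per interval whose video or audio score meets the threshold."""
    lines = []
    for item in report.get("suspicious_intervals", []):
        # The sample encodes an interval as "start-end" in seconds.
        start, end = (float(x) for x in item["interval"].split("-"))
        if max(item["video_score"], item["audio_score"]) >= threshold:
            regions = ", ".join(item["video_regions"] + item["audio_regions"])
            lines.append(f"{start:.1f}s to {end:.1f}s: {regions}")
    return lines


report = json.loads(SAMPLE)
print(report["verdict"], report["confidence"])
for line in flag_intervals(report):
    print(line)
```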

Installation

Prerequisites

  • Python 3.10 or higher
  • FFmpeg installed on your system
  • Google Gemini API key
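
The first two prerequisites can be checked before installing anything. The snippet below is a minimal sketch using only the Python standard library; the function name `check_prerequisites` is hypothetical, and it assumes the FFmpeg executable is named `ffmpeg` and must be on `PATH`.

```python
import shutil
import sys


def check_prerequisites(min_version=(3, 10)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < min_version:
        problems.append("Python %d.%d or higher is required" % min_version)
    # shutil.which returns None when the executable is not found on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing prerequisite:", problem)
```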

Local Setup

  1. Clone the repository
git clone https://github.com/yourusername/deepdefend.git
cd deepdefend
  2. Create a virtual environment
python -m venv venv

# On Linux/Mac
source venv/bin/activate

# On Windows
venv\Scripts\activate
  3. Install dependencies
pip install -r requirements.txt
  4. Download ML models
python models/download_model.py

This will download ~2GB of models from Hugging Face.

  5. Configure environment
cp .env.example .env
# Edit .env and add your GOOGLE_API_KEY
  6. Run the server
uvicorn main:app --reload

The API will be available at http://127.0.0.1:8000
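
To confirm the server started, one option is to request the interactive docs page, which FastAPI serves at `/docs` by default. The helper below is a rough sketch using only the standard library; the name `server_is_up` is hypothetical, and it assumes the default docs route has not been disabled or moved.

```python
import urllib.error
import urllib.request


def server_is_up(base_url: str = "http://127.0.0.1:8000", timeout: float = 3.0) -> bool:
    """Return True if GET {base_url}/docs answers with HTTP 200."""
    url = base_url.rstrip("/") + "/docs"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("Server reachable:", server_is_up())
```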

Docker Setup

# Build image
docker build -t deepdefend .

# Run container
docker run -p 8000:8000 -e GOOGLE_API_KEY=your_key deepdefend

Tech Stack

Backend

  • Framework: FastAPI 0.109.0
  • Server: Uvicorn
  • ML Framework: PyTorch 2.3.1
  • Transformers: Hugging Face Transformers 4.36.2

ML Models

Processing

  • Computer Vision: OpenCV, Pillow
  • Audio Processing: Librosa, SoundFile
  • Video Processing: FFmpeg

Deployment

  • Container: Docker
  • Platform: Hugging Face Spaces

Project Structure

deepdefend/
│
├── extraction/
│   ├── media_extractor.py     # Frame & audio extraction
│   └── timeline_generator.py  # Timeline creation
│
├── analysis/
│   ├── video_analyser.py      # Video deepfake detection
│   ├── audio_analyser.py      # Audio deepfake detection
│   ├── llm_analyser.py        # LLM-based fusion
│   └── prompt.py              # LLM prompts
│
├── models/
│   ├── download_model.py      # Model downloader
│   ├── load_models.py         # Model loader
│   ├── video_model/           # (Downloaded)
│   └── audio_model/           # (Downloaded)
│
├── main.py                    # FastAPI application
├── pipeline.py                # Main detection pipeline
├── requirements.txt           # Python dependencies
├── Dockerfile                 # Container configuration
├── .gitignore
└── README.md