Spaces:

nishchandel
/

deepdefend-api

Sleeping

App Files Files Community

deepdefend-api / README.md

nishchandel

Initial deployment without models

60efa5a 2 months ago

preview code

raw

history blame

7.26 kB

DeepDefend

Multi-Modal Deepfake Detection System
Detect AI-generated deepfakes in videos using computer vision and audio analysis

Overview

DeepDefend is a comprehensive deepfake detection system that combines video frame analysis and audio analysis to identify AI-generated synthetic media. Using machine learning models and AI-powered evidence fusion, it provides detailed, interval-by-interval analysis with explainable results.

Why DeepDefend?

Multi-Modal Analysis: Combines video and audio detection for higher accuracy
AI-Powered Fusion: Uses LLM to generate human-readable reports
Interval Breakdown: Shows exactly which parts of the video are suspicious
REST API: Easy integration with any frontend or application

Features

Core Detection Capabilities

Video Analysis
- Frame-by-frame deepfake detection using pre-trained models
- Face detection and region-specific analysis
- Suspicious region identification (eyes, mouth, face boundaries)
- Confidence scoring per frame
Audio Analysis
- Voice synthesis detection
- Spectrogram analysis for audio artifacts
- Frequency pattern recognition
- Audio splicing detection
AI-Powered Reporting
- LLM-based evidence fusion (Google Gemini)
- Natural language explanation of findings
- Verdict with confidence percentage
- Timestamped suspicious intervals

Processing Pipeline

Video Input
    ↓
┌───────────────────┐
│ Media Extraction  │ → Extract frames (5 per interval)
│                   │ → Extract audio chunks
└────────┬──────────┘
         │
         ├──────────────────────┬──────────────────────┐
         ▼                      ▼                      ▼
┌─────────────────┐   ┌─────────────────┐   ┌────────────────┐
│ Video Analysis  │   │ Audio Analysis  │   │ Timeline Gen   │
│ • Face detect   │   │ • Spectrogram   │   │ • 2s intervals │
│ • Region scan   │   │ • Voice synth   │   │ • Metadata     │
│ • Fake score    │   │ • Artifacts     │   │                │
└────────┬────────┘   └────────┬────────┘   └────────┬───────┘
         │                     │                      │
         └──────────────┬──────────────┬─────────────┘
                        ▼              ▼
                ┌──────────────────────────┐
                │   LLM Fusion Engine      │
                │ • Combine evidence       │
                │ • Generate verdict       │
                │ • Natural language report│
                └────────────┬─────────────┘
                             ▼
                      Final Report
                    (JSON Response)

Demo

Live Demo

API: https://deepdefend-api.hf.space
Docs: https://deepdefend-api.hf.space/docs

Example Analysis

Click to see sample output

{
  "verdict": "DEEPFAKE",
  "confidence": 87.5,
  "overall_scores": {
    "overall_video_score": 0.823,
    "overall_audio_score": 0.756,
    "overall_combined_score": 0.789
  },
  "detailed_analysis": "This video shows strong indicators of deepfake manipulation...",
  "suspicious_intervals": [
    {
      "interval": "4.0-6.0",
      "video_score": 0.891,
      "audio_score": 0.834,
      "video_regions": ["eyes", "mouth"],
      "audio_regions": ["voice_synthesis_artifacts"]
    }
  ],
  "total_intervals_analyzed": 15,
  "video_info": {
    "duration": 12.498711111111112,
    "fps": 29.923085402583734,
    "total_frames": 374,
    "file_size_mb": 31.36
  },
  "analysis_id": "4cd98ea5-8c14-4cae-8da4-689345b0aabc",
  "timestamp": "2025-10-10T23:34:35.724916"
}

Installation

Prerequisites

Python 3.10 or higher
FFmpeg installed on your system
Google Gemini API key

Local Setup

Clone the repository

git clone https://github.com/yourusername/deepdefend.git

Create virtual environment

python -m venv venv

# On Linux/Mac
source venv/bin/activate

# On Windows
venv\Scripts\activate

Install dependencies

pip install -r requirements.txt

Download ML models

python models/download_model.py

This will download ~2GB of models from Hugging Face

Configure environment

cp .env.example .env
# Edit .env and add your GOOGLE_API_KEY

Run the server

uvicorn main:app --reload

The API will be available at http://127.0.0.1:8000

Docker Setup

# Build image
docker build -t deepdefend .

# Run container
docker run -p 8000:8000 -e GOOGLE_API_KEY=your_key deepdefend

Tech Stack

Backend

Framework: FastAPI 0.109.0
Server: Uvicorn
ML Framework: PyTorch 2.3.1
Transformers: Hugging Face Transformers 4.36.2

ML Models

Video Detection: dima806/deepfake_vs_real_image_detection
Audio Detection: mo-thecreator/Deepfake-audio-detection
LLM Fusion: Google Gemini 2.5 Flash

Processing

Computer Vision: OpenCV, Pillow
Audio Processing: Librosa, SoundFile
Video Processing: FFmpeg

Deployment

Container: Docker
Platforms: Hugging Face Spaces

Project Structure

deepdefend/
│   
│── extraction/
│   ├── media_extractor.py     # Frame & audio extraction
│   └── timeline_generator.py  # Timeline creation
│
│── analysis/
│   ├── video_analyser.py      # Video deepfake detection
│   ├── audio_analyser.py      # Audio deepfake detection
│   ├── llm_analyser.py        # LLM-based fusion
│   └── prompt.py              # LLM prompts
│ 
│── models/
│   ├── download_model.py      # Model downloader
│   ├── load_models.py         # Model loader
│   ├── video_model/           # (Downloaded)
│   └── audio_model/           # (Downloaded)
│
│── main.py                    # FastAPI application
│── pipeline.py                # Main detection pipeline
│── requirements.txt           # Python dependencies
│── Dockerfile                 # Container configuration
├── .gitignore
└── README.md