# DeepDefend
> **Multi-Modal Deepfake Detection System**
> Detect AI-generated deepfakes in videos using computer vision and audio analysis
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![FastAPI](https://img.shields.io/badge/FastAPI-0.109.0-009688.svg)](https://fastapi.tiangolo.com)
## Overview
DeepDefend is a deepfake detection system that combines **video frame analysis** and **audio analysis** to identify AI-generated synthetic media. It runs pre-trained machine learning models on each modality, fuses their evidence with an LLM, and reports the result interval by interval with a plain-language explanation.
### Why DeepDefend?
- **Multi-Modal Analysis**: Combines video and audio detection for higher accuracy
- **AI-Powered Fusion**: Uses an LLM to fuse the evidence and generate human-readable reports
- **Interval Breakdown**: Shows exactly which parts of the video are suspicious
- **REST API**: Easy integration with any frontend or application
## Features
### Core Detection Capabilities
- **Video Analysis**
  - Frame-by-frame deepfake detection using pre-trained models
  - Face detection and region-specific analysis
  - Suspicious region identification (eyes, mouth, face boundaries)
  - Confidence scoring per frame
- **Audio Analysis**
  - Voice synthesis detection
  - Spectrogram analysis for audio artifacts
  - Frequency pattern recognition
  - Audio splicing detection
- **AI-Powered Reporting**
  - LLM-based evidence fusion (Google Gemini)
  - Natural language explanation of findings
  - Verdict with confidence percentage
  - Timestamped suspicious intervals
### Processing Pipeline
```
                       Video Input
                            ↓
                  ┌───────────────────┐
                  │ Media Extraction  │ → Extract frames (5 per interval)
                  │                   │ → Extract audio chunks
                  └─────────┬─────────┘
                            │
         ┌──────────────────┼──────────────────┐
         ▼                  ▼                  ▼
┌────────────────┐ ┌────────────────┐ ┌────────────────┐
│ Video Analysis │ │ Audio Analysis │ │ Timeline Gen   │
│ • Face detect  │ │ • Spectrogram  │ │ • 2s intervals │
│ • Region scan  │ │ • Voice synth  │ │ • Metadata     │
│ • Fake score   │ │ • Artifacts    │ │                │
└────────┬───────┘ └────────┬───────┘ └────────┬───────┘
         │                  │                  │
         └──────────────────┼──────────────────┘
                            ▼
              ┌───────────────────────────┐
              │ LLM Fusion Engine         │
              │ • Combine evidence        │
              │ • Generate verdict        │
              │ • Natural language report │
              └─────────────┬─────────────┘
                            ▼
                      Final Report
                     (JSON Response)
```
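The extraction and timeline stages in the diagram can be sketched as follows. This is a minimal illustration rather than the project's code: `build_timeline`, its parameters, and the choice of evenly spaced sample points are assumptions. Only the 2-second interval length and the 5 frames per interval come from the diagram.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    start: float               # seconds
    end: float                 # seconds
    frame_times: list[float]   # timestamps at which frames would be sampled


def build_timeline(duration: float, interval_len: float = 2.0,
                   frames_per_interval: int = 5) -> list[Interval]:
    """Split a video into fixed-length intervals with evenly spaced frame samples."""
    if duration <= 0 or interval_len <= 0 or frames_per_interval < 1:
        raise ValueError("duration, interval_len and frames_per_interval must be positive")
    timeline = []
    index = 0
    while index * interval_len < duration:
        start = index * interval_len
        end = min(start + interval_len, duration)   # the last interval may be shorter
        step = (end - start) / frames_per_interval
        # Sample at the midpoint of each equal sub-slice of the interval.
        times = [round(start + step * (i + 0.5), 3) for i in range(frames_per_interval)]
        timeline.append(Interval(start, end, times))
        index += 1
    return timeline


timeline = build_timeline(5.0)
print([(iv.start, iv.end) for iv in timeline])
```

Sampling at sub-slice midpoints avoids picking frames exactly on interval boundaries, where neighbouring intervals would otherwise share a frame.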
## Demo
### Live Demo
- **API**: [https://deepdefend-api.hf.space](https://deepdefend-api.hf.space)
- **Docs**: [https://deepdefend-api.hf.space/docs](https://deepdefend-api.hf.space/docs)
### Example Analysis
<details>
<summary>Click to see sample output</summary>
```json
{
  "verdict": "DEEPFAKE",
  "confidence": 87.5,
  "overall_scores": {
    "overall_video_score": 0.823,
    "overall_audio_score": 0.756,
    "overall_combined_score": 0.789
  },
  "detailed_analysis": "This video shows strong indicators of deepfake manipulation...",
  "suspicious_intervals": [
    {
      "interval": "4.0-6.0",
      "video_score": 0.891,
      "audio_score": 0.834,
      "video_regions": ["eyes", "mouth"],
      "audio_regions": ["voice_synthesis_artifacts"]
    }
  ],
  "total_intervals_analyzed": 15,
  "video_info": {
    "duration": 12.498711111111112,
    "fps": 29.923085402583734,
    "total_frames": 374,
    "file_size_mb": 31.36
  },
  "analysis_id": "4cd98ea5-8c14-4cae-8da4-689345b0aabc",
  "timestamp": "2025-10-10T23:34:35.724916"
}
```
</details>
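A client consuming a report like the sample might extract the flagged time ranges as shown here. This is a minimal sketch: `flagged_ranges` and the 0.8 threshold are illustrative assumptions, while the field names are taken from the sample output.

```python
import json


def flagged_ranges(report: dict, threshold: float = 0.8) -> list[tuple[float, float]]:
    """Return (start, end) pairs in seconds where either score meets the threshold."""
    ranges = []
    for item in report.get("suspicious_intervals", []):
        # Intervals are encoded as "start-end" strings, e.g. "4.0-6.0".
        start_text, end_text = item["interval"].split("-")
        if max(item["video_score"], item["audio_score"]) >= threshold:
            ranges.append((float(start_text), float(end_text)))
    return ranges


# Trimmed copy of the sample response, keeping only the fields used here.
sample = json.loads("""
{
  "verdict": "DEEPFAKE",
  "confidence": 87.5,
  "suspicious_intervals": [
    {"interval": "4.0-6.0", "video_score": 0.891, "audio_score": 0.834}
  ]
}
""")

print(sample["verdict"], flagged_ranges(sample))
```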
## Installation
### Prerequisites
- Python 3.10 or higher
- FFmpeg installed on your system
- Google Gemini API key
### Local Setup
1. **Clone the repository**
```bash
git clone https://github.com/yourusername/deepdefend.git
cd deepdefend
```
2. **Create virtual environment**
```bash
python -m venv venv
# On Linux/Mac
source venv/bin/activate
# On Windows
venv\Scripts\activate
```
3. **Install dependencies**
```bash
pip install -r requirements.txt
```
4. **Download ML models**
```bash
python models/download_model.py
```
*This will download ~2GB of models from Hugging Face*
5. **Configure environment**
```bash
cp .env.example .env
# Edit .env and add your GOOGLE_API_KEY
```
6. **Run the server**
```bash
uvicorn main:app --reload
```
The API will be available at `http://127.0.0.1:8000`
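Because the LLM fusion step depends on the key configured in step 5, a startup check can fail fast when it is missing. The helper below is a hypothetical sketch and not part of the repository. It assumes only that the key reaches the process as the `GOOGLE_API_KEY` environment variable; how `.env` gets loaded is left to the application.

```python
import os
from collections.abc import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Gemini API key, or raise a clear error if it is unset or blank."""
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to .env or export it")
    return key


# Example with an explicit mapping, so nothing depends on the real environment.
print(require_api_key({"GOOGLE_API_KEY": "your_key"}))
```

Passing the environment as a parameter keeps the check easy to test without touching real credentials.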
### Docker Setup
```bash
# Build image
docker build -t deepdefend .
# Run container
docker run -p 8000:8000 -e GOOGLE_API_KEY=your_key deepdefend
```
## Tech Stack
### Backend
- **Framework**: FastAPI 0.109.0
- **Server**: Uvicorn
- **ML Framework**: PyTorch 2.3.1
- **Transformers**: Hugging Face Transformers 4.36.2
### ML Models
- **Video Detection**: [dima806/deepfake_vs_real_image_detection](https://huggingface.co/dima806/deepfake_vs_real_image_detection)
- **Audio Detection**: [mo-thecreator/Deepfake-audio-detection](https://huggingface.co/mo-thecreator/Deepfake-audio-detection)
- **LLM Fusion**: Google Gemini 2.5 Flash
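For orientation, the two Hugging Face checkpoints listed here could be loaded with the standard Transformers `pipeline` helper, as sketched below. This is not the repository's `load_models.py`. It assumes both checkpoints work with the generic `image-classification` and `audio-classification` tasks, and that the "fake" class can be found by a substring match on the label name. `load_detectors` and `fake_probability` are hypothetical names.

```python
def load_detectors():
    """Build pipelines for the two detectors (downloads weights on first use)."""
    # Imported lazily; requires the `transformers` and `torch` packages.
    from transformers import pipeline

    video = pipeline("image-classification",
                     model="dima806/deepfake_vs_real_image_detection")
    audio = pipeline("audio-classification",
                     model="mo-thecreator/Deepfake-audio-detection")
    return video, audio


def fake_probability(predictions: list[dict]) -> float:
    """Pick the score of the 'fake' label from a classification pipeline result.

    Pipelines return a list like [{"label": ..., "score": ...}, ...].
    Matching on the substring "fake" is an assumption about the label names.
    """
    for pred in predictions:
        if "fake" in pred["label"].lower():
            return float(pred["score"])
    return 0.0


# Works on any result in the pipeline output format; the values here are made up.
print(fake_probability([{"label": "Real", "score": 0.2},
                        {"label": "Fake", "score": 0.8}]))
```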
### Processing
- **Computer Vision**: OpenCV, Pillow
- **Audio Processing**: Librosa, SoundFile
- **Video Processing**: FFmpeg
### Deployment
- **Container**: Docker
- **Platforms**: Hugging Face Spaces
## Project Structure
```
deepdefend/
│
├── extraction/
│   ├── media_extractor.py      # Frame & audio extraction
│   └── timeline_generator.py   # Timeline creation
│
├── analysis/
│   ├── video_analyser.py       # Video deepfake detection
│   ├── audio_analyser.py       # Audio deepfake detection
│   ├── llm_analyser.py         # LLM-based fusion
│   └── prompt.py               # LLM prompts
│
├── models/
│   ├── download_model.py       # Model downloader
│   ├── load_models.py          # Model loader
│   ├── video_model/            # (Downloaded)
│   └── audio_model/            # (Downloaded)
│
├── main.py                     # FastAPI application
├── pipeline.py                 # Main detection pipeline
├── requirements.txt            # Python dependencies
├── Dockerfile                  # Container configuration
├── .gitignore
└── README.md
```