# DeepDefend

> **Multi-Modal Deepfake Detection System**  
> Detect AI-generated deepfakes in videos using computer vision and audio analysis

[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![FastAPI](https://img.shields.io/badge/FastAPI-0.109.0-009688.svg)](https://fastapi.tiangolo.com)

## Overview

DeepDefend is a deepfake detection system that combines **video frame analysis** and **audio analysis** to identify AI-generated synthetic media. Pre-trained models score the video and audio of each interval, and an LLM fuses that evidence into an interval-by-interval report with an explainable verdict.

### Why DeepDefend?

- **Multi-Modal Analysis**: Combines video and audio detection for higher accuracy
- **AI-Powered Fusion**: Uses an LLM to generate human-readable reports
- **Interval Breakdown**: Shows exactly which parts of the video are suspicious
- **REST API**: Easy integration with any frontend or application

## Features

### Core Detection Capabilities

- **Video Analysis**
  - Frame-by-frame deepfake detection using pre-trained models
  - Face detection and region-specific analysis
  - Suspicious region identification (eyes, mouth, face boundaries)
  - Confidence scoring per frame

- **Audio Analysis**
  - Voice synthesis detection
  - Spectrogram analysis for audio artifacts
  - Frequency pattern recognition
  - Audio splicing detection

- **AI-Powered Reporting**
  - LLM-based evidence fusion (Google Gemini)
  - Natural language explanation of findings
  - Verdict with confidence percentage
  - Timestamped suspicious intervals

### Processing Pipeline

```
Video Input
    ↓
┌───────────────────┐
│ Media Extraction  │ → Extract frames (5 per interval)
│                   │ → Extract audio chunks
└────────┬──────────┘
         │
         ├──────────────────────┬──────────────────────┐
         ▼                      ▼                      ▼
┌─────────────────┐   ┌─────────────────┐   ┌────────────────┐
│ Video Analysis  │   │ Audio Analysis  │   │ Timeline Gen   │
│ • Face detect   │   │ • Spectrogram   │   │ • 2s intervals │
│ • Region scan   │   │ • Voice synth   │   │ • Metadata     │
│ • Fake score    │   │ • Artifacts     │   │                │
└────────┬────────┘   └────────┬────────┘   └────────┬───────┘
         │                     │                      │
         └──────────────┬──────────────┬─────────────┘
                        ▼              ▼
                ┌──────────────────────────┐
                │   LLM Fusion Engine      │
                │ • Combine evidence       │
                │ • Generate verdict       │
                │ • Natural language report│
                └────────────┬─────────────┘
                             ▼
                      Final Report
                    (JSON Response)
```
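The diagram mentions 2-second intervals and 5 frames per interval. As a rough illustration of how such a timeline could be built, here is a minimal sketch. The `Interval` class and `build_timeline` function are hypothetical names, and sampling frames at evenly spaced midpoints is an assumption, not necessarily what `timeline_generator.py` does.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    start: float              # interval start, in seconds
    end: float                # interval end, in seconds
    frame_indices: list[int]  # frames sampled from this interval


def build_timeline(duration: float, fps: float,
                   interval_s: float = 2.0,
                   frames_per_interval: int = 5) -> list[Interval]:
    """Split a video into fixed-length intervals and pick evenly spaced frames.

    Hypothetical helper: the real project may sample frames differently.
    """
    intervals = []
    start = 0.0
    while start < duration:
        # The last interval is shorter when duration is not a multiple of interval_s.
        end = min(start + interval_s, duration)
        step = (end - start) / frames_per_interval
        # Sample at the midpoint of each sub-slice so frames stay inside the interval.
        times = [start + (i + 0.5) * step for i in range(frames_per_interval)]
        frame_indices = [round(t * fps) for t in times]
        intervals.append(Interval(start, end, frame_indices))
        start += interval_s
    return intervals


timeline = build_timeline(duration=12.5, fps=30.0)
for iv in timeline:
    print(f"{iv.start:.1f}-{iv.end:.1f}s -> frames {iv.frame_indices}")
```

Each interval can then be scored independently by the video and audio analysers, which is what makes the per-interval breakdown in the final report possible.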

## Demo

### Live Demo
**API**: [https://deepdefend-api.hf.space](https://deepdefend-api.hf.space)  
**Docs**: [https://deepdefend-api.hf.space/docs](https://deepdefend-api.hf.space/docs)

### Example Analysis

<details>
<summary>Click to see sample output</summary>

```json
{
  "verdict": "DEEPFAKE",
  "confidence": 87.5,
  "overall_scores": {
    "overall_video_score": 0.823,
    "overall_audio_score": 0.756,
    "overall_combined_score": 0.789
  },
  "detailed_analysis": "This video shows strong indicators of deepfake manipulation...",
  "suspicious_intervals": [
    {
      "interval": "4.0-6.0",
      "video_score": 0.891,
      "audio_score": 0.834,
      "video_regions": ["eyes", "mouth"],
      "audio_regions": ["voice_synthesis_artifacts"]
    }
  ],
  "total_intervals_analyzed": 15,
  "video_info": {
    "duration": 12.498711111111112,
    "fps": 29.923085402583734,
    "total_frames": 374,
    "file_size_mb": 31.36
  },
  "analysis_id": "4cd98ea5-8c14-4cae-8da4-689345b0aabc",
  "timestamp": "2025-10-10T23:34:35.724916"
}
```
</details>
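A client would typically pull the verdict and the flagged time ranges out of this response. The sketch below works on a trimmed copy of the sample output above; the `flag_intervals` helper and the 0.8 threshold are illustrative assumptions, not part of the API.

```python
import json

# Trimmed copy of the sample response shown above.
report = json.loads("""
{
  "verdict": "DEEPFAKE",
  "confidence": 87.5,
  "suspicious_intervals": [
    {
      "interval": "4.0-6.0",
      "video_score": 0.891,
      "audio_score": 0.834,
      "video_regions": ["eyes", "mouth"],
      "audio_regions": ["voice_synthesis_artifacts"]
    }
  ]
}
""")


def flag_intervals(report: dict, threshold: float = 0.8) -> list[tuple[float, float, float]]:
    """Return (start, end, peak_score) for intervals at or above the threshold.

    Hypothetical helper: the threshold is an arbitrary choice for this example.
    """
    flagged = []
    for item in report.get("suspicious_intervals", []):
        # "interval" is a "start-end" string in seconds, e.g. "4.0-6.0".
        start, end = (float(x) for x in item["interval"].split("-"))
        peak = max(item["video_score"], item["audio_score"])
        if peak >= threshold:
            flagged.append((start, end, peak))
    return flagged


print(report["verdict"], f'{report["confidence"]}%')
for start, end, peak in flag_intervals(report):
    print(f"{start:.1f}s to {end:.1f}s, peak score {peak:.3f}")
```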

## Installation

### Prerequisites

- Python 3.10 or higher
- FFmpeg installed on your system
- Google Gemini API key 

### Local Setup

1. **Clone the repository**
```bash
git clone https://github.com/yourusername/deepdefend.git
cd deepdefend
```

2. **Create virtual environment**
```bash
python -m venv venv

# On Linux/Mac
source venv/bin/activate

# On Windows
venv\Scripts\activate
```

3. **Install dependencies**
```bash
pip install -r requirements.txt
```

4. **Download ML models**
```bash
python models/download_model.py
```
*This downloads about 2 GB of models from Hugging Face.*

5. **Configure environment**
```bash
cp .env.example .env
# Edit .env and add your GOOGLE_API_KEY
```

6. **Run the server**
```bash
uvicorn main:app --reload
```

The API will be available at `http://127.0.0.1:8000`, with interactive docs at `/docs`.
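This README does not list the endpoint paths, so one way to discover them is to read the OpenAPI schema that FastAPI serves at `/openapi.json` by default. The sketch below assumes the app keeps that default route; `fetch_openapi` and `list_routes` are hypothetical helper names.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def fetch_openapi(base_url: str = "http://127.0.0.1:8000") -> dict:
    """Download the OpenAPI schema (assumes FastAPI's default /openapi.json route)."""
    with urlopen(f"{base_url}/openapi.json", timeout=10) as resp:
        return json.load(resp)


def list_routes(schema: dict) -> list[str]:
    """Return 'METHOD /path' strings for every operation in an OpenAPI schema."""
    routes = []
    for path, operations in sorted(schema.get("paths", {}).items()):
        for method in sorted(operations):
            # Path items can also hold non-method keys such as "parameters".
            if method in HTTP_METHODS:
                routes.append(f"{method.upper()} {path}")
    return routes


# With the server running:
#     for route in list_routes(fetch_openapi()):
#         print(route)
```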

### Docker Setup

```bash
# Build image
docker build -t deepdefend .

# Run container
docker run -p 8000:8000 -e GOOGLE_API_KEY=your_key deepdefend
```
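Both setups supply the Gemini key through the `GOOGLE_API_KEY` environment variable. How the application reads it is not shown here; a minimal sketch, assuming it is read from the process environment at startup and should fail fast when missing (`load_api_key` is a hypothetical name):

```python
import os
from collections.abc import Mapping


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Gemini API key, or raise if it is missing or blank.

    Hypothetical helper: the project may load its configuration differently.
    """
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GOOGLE_API_KEY is not set; add it to .env or pass it with `docker run -e`"
        )
    return key
```

Note that Python does not read a `.env` file on its own, so for the local setup the variable must be loaded into the environment by the application or exported in the shell.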

## Tech Stack

### Backend
- **Framework**: FastAPI 0.109.0
- **Server**: Uvicorn
- **ML Framework**: PyTorch 2.3.1
- **Transformers**: Hugging Face Transformers 4.36.2

### ML Models
- **Video Detection**: [dima806/deepfake_vs_real_image_detection](https://huggingface.co/dima806/deepfake_vs_real_image_detection)
- **Audio Detection**: [mo-thecreator/Deepfake-audio-detection](https://huggingface.co/mo-thecreator/Deepfake-audio-detection)
- **LLM Fusion**: Google Gemini 2.5 Flash

### Processing
- **Computer Vision**: OpenCV, Pillow
- **Audio Processing**: Librosa, SoundFile
- **Video Processing**: FFmpeg

### Deployment
- **Container**: Docker
- **Platforms**: Hugging Face Spaces

## Project Structure

```
deepdefend/
│
├── extraction/
│   ├── media_extractor.py     # Frame & audio extraction
│   └── timeline_generator.py  # Timeline creation
│
├── analysis/
│   ├── video_analyser.py      # Video deepfake detection
│   ├── audio_analyser.py      # Audio deepfake detection
│   ├── llm_analyser.py        # LLM-based fusion
│   └── prompt.py              # LLM prompts
│
├── models/
│   ├── download_model.py      # Model downloader
│   ├── load_models.py         # Model loader
│   ├── video_model/           # (Downloaded)
│   └── audio_model/           # (Downloaded)
│
├── main.py                    # FastAPI application
├── pipeline.py                # Main detection pipeline
├── requirements.txt           # Python dependencies
├── Dockerfile                 # Container configuration
├── .gitignore
└── README.md
```