| # DeepDefend | |
| > **Multi-Modal Deepfake Detection System** | |
| > Detect AI-generated deepfakes in videos using computer vision and audio analysis | |
| [Python](https://www.python.org/downloads/) | |
| [FastAPI](https://fastapi.tiangolo.com) | |
| ## Overview | |
| DeepDefend is a deepfake detection system that combines **video frame analysis** and **audio analysis** to identify AI-generated synthetic media. Pre-trained machine learning models score the video and audio tracks interval by interval, and an LLM fuses that evidence into an explainable report. | |
| ### Why DeepDefend? | |
| - **Multi-Modal Analysis**: Combines video and audio detection for higher accuracy | |
| - **AI-Powered Fusion**: Uses an LLM to fuse the video and audio evidence into a human-readable report | |
| - **Interval Breakdown**: Shows exactly which parts of the video are suspicious | |
| - **REST API**: Easy integration with any frontend or application | |
| ## Features | |
| ### Core Detection Capabilities | |
| - **Video Analysis** | |
|   - Frame-by-frame deepfake detection using pre-trained models | |
|   - Face detection and region-specific analysis | |
|   - Suspicious region identification (eyes, mouth, face boundaries) | |
|   - Confidence scoring per frame | |
| - **Audio Analysis** | |
|   - Voice synthesis detection | |
|   - Spectrogram analysis for audio artifacts | |
|   - Frequency pattern recognition | |
|   - Audio splicing detection | |
| - **AI-Powered Reporting** | |
|   - LLM-based evidence fusion (Google Gemini) | |
|   - Natural language explanation of findings | |
|   - Verdict with confidence percentage | |
|   - Timestamped suspicious intervals | |
| ### Processing Pipeline | |
| ``` | |
|                          Video Input | |
|                               ↓ | |
|                     ┌───────────────────┐ | |
|                     │ Media Extraction  │ ← Extract frames (5 per interval) | |
|                     │                   │ ← Extract audio chunks | |
|                     └─────────┬─────────┘ | |
|                               │ | |
|          ┌────────────────────┬────────────────────┐ | |
|          ▼                    ▼                    ▼ | |
| ┌─────────────────┐  ┌─────────────────┐  ┌────────────────┐ | |
| │ Video Analysis  │  │ Audio Analysis  │  │ Timeline Gen   │ | |
| │ • Face detect   │  │ • Spectrogram   │  │ • 2s intervals │ | |
| │ • Region scan   │  │ • Voice synth   │  │ • Metadata     │ | |
| │ • Fake score    │  │ • Artifacts     │  │                │ | |
| └────────┬────────┘  └────────┬────────┘  └────────┬───────┘ | |
|          │                    │                    │ | |
|          └──────────────┬───────────┬──────────────┘ | |
|                         ▼           ▼ | |
|                  ┌──────────────────────────┐ | |
|                  │ LLM Fusion Engine        │ | |
|                  │ • Combine evidence       │ | |
|                  │ • Generate verdict       │ | |
|                  │ • Natural language report│ | |
|                  └────────────┬─────────────┘ | |
|                               ▼ | |
|                         Final Report | |
|                        (JSON Response) | |
| ``` | |
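The timeline step in the diagram (2-second intervals, 5 frames per interval) can be illustrated with a minimal sketch. The names `Interval` and `build_timeline` are hypothetical and do not come from `timeline_generator.py`; sampling frames at evenly spaced midpoints is an assumption, not necessarily how DeepDefend picks frames.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    start: float               # interval start, in seconds
    end: float                 # interval end, in seconds
    frame_times: list[float]   # timestamps of the frames to analyse


def build_timeline(duration: float, interval_len: float = 2.0,
                   frames_per_interval: int = 5) -> list[Interval]:
    """Split [0, duration) into fixed-length intervals (the last one may be
    shorter) and pick evenly spaced frame timestamps inside each interval."""
    intervals = []
    start = 0.0
    while start < duration:
        end = min(start + interval_len, duration)
        step = (end - start) / frames_per_interval
        # Sample the midpoint of each sub-slice so no timestamp sits on a boundary.
        times = [round(start + step * (i + 0.5), 3)
                 for i in range(frames_per_interval)]
        intervals.append(Interval(start, end, times))
        start += interval_len
    return intervals


timeline = build_timeline(7.0)
print(len(timeline))            # 4
print(timeline[0].frame_times)  # [0.2, 0.6, 1.0, 1.4, 1.8]
```

Keeping the interval boundaries in one place like this means the video and audio scores can later be joined on the same `start`/`end` pair.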
| ## Demo | |
| ### Live Demo | |
| **API**: [https://deepdefend-api.hf.space](https://deepdefend-api.hf.space) | |
| **Docs**: [https://deepdefend-api.hf.space/docs](https://deepdefend-api.hf.space/docs) | |
| ### Example Analysis | |
| <details> | |
| <summary>Click to see sample output</summary> | |
| ```json | |
| { | |
|   "verdict": "DEEPFAKE", | |
|   "confidence": 87.5, | |
|   "overall_scores": { | |
|     "overall_video_score": 0.823, | |
|     "overall_audio_score": 0.756, | |
|     "overall_combined_score": 0.789 | |
|   }, | |
|   "detailed_analysis": "This video shows strong indicators of deepfake manipulation...", | |
|   "suspicious_intervals": [ | |
|     { | |
|       "interval": "4.0-6.0", | |
|       "video_score": 0.891, | |
|       "audio_score": 0.834, | |
|       "video_regions": ["eyes", "mouth"], | |
|       "audio_regions": ["voice_synthesis_artifacts"] | |
|     } | |
|   ], | |
|   "total_intervals_analyzed": 15, | |
|   "video_info": { | |
|     "duration": 12.498711111111112, | |
|     "fps": 29.923085402583734, | |
|     "total_frames": 374, | |
|     "file_size_mb": 31.36 | |
|   }, | |
|   "analysis_id": "4cd98ea5-8c14-4cae-8da4-689345b0aabc", | |
|   "timestamp": "2025-10-10T23:34:35.724916" | |
| } | |
| ``` | |
| </details> | |
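A client will typically want to turn this response into a short summary. The sketch below uses only field names that appear in the sample output; the `summarize` helper and its `threshold` of 0.8 are hypothetical choices for illustration, not part of the API.

```python
def summarize(report: dict, threshold: float = 0.8) -> list[str]:
    """Build one headline line plus one line per suspicious interval."""
    lines = [f"{report['verdict']} ({report['confidence']:.1f}% confidence)"]
    for item in report.get("suspicious_intervals", []):
        # "interval" is a "start-end" string in seconds, e.g. "4.0-6.0".
        start, end = (float(x) for x in item["interval"].split("-"))
        worst = max(item["video_score"], item["audio_score"])
        flag = "HIGH" if worst >= threshold else "review"
        regions = ", ".join(item["video_regions"] + item["audio_regions"])
        lines.append(f"{start:.1f}s-{end:.1f}s [{flag}] {regions}")
    return lines


# A trimmed copy of the sample response.
sample = {
    "verdict": "DEEPFAKE",
    "confidence": 87.5,
    "suspicious_intervals": [
        {
            "interval": "4.0-6.0",
            "video_score": 0.891,
            "audio_score": 0.834,
            "video_regions": ["eyes", "mouth"],
            "audio_regions": ["voice_synthesis_artifacts"],
        }
    ],
}

for line in summarize(sample):
    print(line)
# DEEPFAKE (87.5% confidence)
# 4.0s-6.0s [HIGH] eyes, mouth, voice_synthesis_artifacts
```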
| ## Installation | |
| ### Prerequisites | |
| - Python 3.10 or higher | |
| - FFmpeg installed on your system | |
| - Google Gemini API key | |
| ### Local Setup | |
| 1. **Clone the repository** | |
| ```bash | |
| git clone https://github.com/yourusername/deepdefend.git | |
| cd deepdefend | |
| ``` | |
| 2. **Create virtual environment** | |
| ```bash | |
| python -m venv venv | |
| # On Linux/Mac | |
| source venv/bin/activate | |
| # On Windows | |
| venv\Scripts\activate | |
| ``` | |
| 3. **Install dependencies** | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
| 4. **Download ML models** | |
| ```bash | |
| python models/download_model.py | |
| ``` | |
| *This downloads about 2 GB of models from Hugging Face.* | |
| 5. **Configure environment** | |
| ```bash | |
| cp .env.example .env | |
| # Edit .env and add your GOOGLE_API_KEY | |
| ``` | |
| 6. **Run the server** | |
| ```bash | |
| uvicorn main:app --reload | |
| ``` | |
| The API will be available at `http://127.0.0.1:8000` | |
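Once the server is running, one way to see which routes it exposes is to read the OpenAPI schema that FastAPI serves at `/openapi.json` by default (this assumes the app has not disabled or moved it). The helpers `list_endpoints` and `fetch_schema` below are hypothetical names for this sketch, which uses only the standard library.

```python
import json
from urllib.request import urlopen


def list_endpoints(schema: dict) -> list[tuple[str, str]]:
    """Return sorted (METHOD, path) pairs from a parsed OpenAPI schema."""
    http_methods = {"get", "post", "put", "patch", "delete", "head", "options"}
    pairs = []
    for path, operations in schema.get("paths", {}).items():
        for method in operations:
            # Path items can also hold non-method keys such as "parameters".
            if method in http_methods:
                pairs.append((method.upper(), path))
    return sorted(pairs)


def fetch_schema(base_url: str = "http://127.0.0.1:8000") -> dict:
    """Download the OpenAPI schema from a running server."""
    with urlopen(f"{base_url}/openapi.json", timeout=10) as resp:
        return json.load(resp)


# Usage against a running local server:
#   for method, path in list_endpoints(fetch_schema()):
#       print(method, path)
```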
| ### Docker Setup | |
| ```bash | |
| # Build image | |
| docker build -t deepdefend . | |
| # Run container | |
| docker run -p 8000:8000 -e GOOGLE_API_KEY=your_key deepdefend | |
| ``` | |
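Because the Gemini key reaches the container only through the `GOOGLE_API_KEY` environment variable, a startup check can fail fast when it is missing. This is a minimal sketch; `require_api_key` is a hypothetical helper, and treating the placeholder `your_key` as unset is an assumption made for illustration.

```python
import os


def require_api_key(env=os.environ, name: str = "GOOGLE_API_KEY") -> str:
    """Return the API key from the environment, or raise with a clear message."""
    key = env.get(name, "").strip()
    # Reject an empty value and the placeholder from the docker example.
    if not key or key == "your_key":
        raise RuntimeError(
            f"{name} is not set; add it to .env or pass it with `docker run -e`"
        )
    return key


# Example with an explicit mapping instead of the real environment:
print(require_api_key({"GOOGLE_API_KEY": "abc123"}))  # abc123
```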
| ## Tech Stack | |
| ### Backend | |
| - **Framework**: FastAPI 0.109.0 | |
| - **Server**: Uvicorn | |
| - **ML Framework**: PyTorch 2.3.1 | |
| - **Transformers**: Hugging Face Transformers 4.36.2 | |
| ### ML Models | |
| - **Video Detection**: [dima806/deepfake_vs_real_image_detection](https://huggingface.co/dima806/deepfake_vs_real_image_detection) | |
| - **Audio Detection**: [mo-thecreator/Deepfake-audio-detection](https://huggingface.co/mo-thecreator/Deepfake-audio-detection) | |
| - **LLM Fusion**: Google Gemini 2.5 Flash | |
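As a rough illustration of how the video model's output could become a per-interval score, the sketch below loads the model through the Transformers `image-classification` pipeline and averages the "fake" probability over frames. The helper names are hypothetical, the label name `fake` is an assumption (check the model's `id2label` config), and averaging is one plausible aggregation, not necessarily what `video_analyser.py` does.

```python
def fake_probability(predictions: list[dict], fake_label: str = "fake") -> float:
    """Pick the score of the fake class from one pipeline result, which is
    a list of {"label": ..., "score": ...} dicts."""
    for pred in predictions:
        if pred["label"].lower() == fake_label:
            return float(pred["score"])
    return 0.0


def interval_score(per_frame: list[list[dict]]) -> float:
    """Average the fake probability over all frames of one interval."""
    scores = [fake_probability(p) for p in per_frame]
    return sum(scores) / len(scores) if scores else 0.0


def load_video_detector():
    """Load the classifier; imported lazily because it needs transformers
    and downloads the model on first use."""
    from transformers import pipeline
    return pipeline("image-classification",
                    model="dima806/deepfake_vs_real_image_detection")


# Hand-written results standing in for two analysed frames:
frames = [
    [{"label": "Fake", "score": 0.9}, {"label": "Real", "score": 0.1}],
    [{"label": "Real", "score": 0.7}, {"label": "Fake", "score": 0.3}],
]
print(round(interval_score(frames), 3))  # 0.6
```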
| ### Processing | |
| - **Computer Vision**: OpenCV, Pillow | |
| - **Audio Processing**: Librosa, SoundFile | |
| - **Video Processing**: FFmpeg | |
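For the FFmpeg step, audio can be pulled out of a video with a single command. The sketch below builds that command in Python; the function names are hypothetical, and 16 kHz mono WAV is an assumed target format that should be matched to whatever the audio model expects.

```python
import subprocess
from pathlib import Path


def audio_extract_cmd(video: str, out_wav: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that writes the audio track as mono WAV."""
    return [
        "ffmpeg", "-y",            # overwrite the output without prompting
        "-i", video,               # input video file
        "-vn",                     # drop the video stream
        "-ac", "1",                # downmix to one channel
        "-ar", str(sample_rate),   # resample to the target rate
        out_wav,
    ]


def extract_audio(video: str, out_wav: str) -> Path:
    """Run ffmpeg and return the output path; raises CalledProcessError on failure."""
    subprocess.run(audio_extract_cmd(video, out_wav), check=True, capture_output=True)
    return Path(out_wav)


print(" ".join(audio_extract_cmd("clip.mp4", "clip.wav")))
# ffmpeg -y -i clip.mp4 -vn -ac 1 -ar 16000 clip.wav
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.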
| ### Deployment | |
| - **Container**: Docker | |
| - **Platform**: Hugging Face Spaces | |
| ## Project Structure | |
| ``` | |
| deepdefend/ | |
| │ | |
| ├── extraction/ | |
| │   ├── media_extractor.py      # Frame & audio extraction | |
| │   └── timeline_generator.py   # Timeline creation | |
| │ | |
| ├── analysis/ | |
| │   ├── video_analyser.py       # Video deepfake detection | |
| │   ├── audio_analyser.py       # Audio deepfake detection | |
| │   ├── llm_analyser.py         # LLM-based fusion | |
| │   └── prompt.py               # LLM prompts | |
| │ | |
| ├── models/ | |
| │   ├── download_model.py       # Model downloader | |
| │   ├── load_models.py          # Model loader | |
| │   ├── video_model/            # (Downloaded) | |
| │   └── audio_model/            # (Downloaded) | |
| │ | |
| ├── main.py                     # FastAPI application | |
| ├── pipeline.py                 # Main detection pipeline | |
| ├── requirements.txt            # Python dependencies | |
| ├── Dockerfile                  # Container configuration | |
| ├── .gitignore | |
| └── README.md | |
| ``` | |