supraptin committed on
Commit bd2c5ca · 0 Parent(s)

Initial deployment to Hugging Face Spaces
.gitignore ADDED
@@ -0,0 +1,39 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# Virtual environment
venv/
.venv/
ENV/

# Environment files
.env
.env.local

# IDE
.idea/
.vscode/
*.swp
*.swo

# OS
.DS_Store
Thumbs.db

# Models (downloaded during build)
models/
Silent-Face-Anti-Spoofing/

# Logs
*.log

# Test files
test_images/
*.jpg
*.png
*.jpeg

# HuggingFace cache
.cache/
Dockerfile ADDED
@@ -0,0 +1,49 @@
# Hugging Face Spaces Dockerfile for KYC POC Backend
FROM python:3.10-slim

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV DEBIAN_FRONTEND=noninteractive

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    git \
    libgl1-mesa-glx \
    libglib2.0-0 \
    libsm6 \
    libxext6 \
    libxrender-dev \
    libgomp1 \
    && rm -rf /var/lib/apt/lists/*

# Create app user for HF Spaces (required)
RUN useradd -m -u 1000 user
WORKDIR /home/user/app

# Copy requirements first for better caching
COPY --chown=user:user requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir --upgrade pip && \
    pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY --chown=user:user . .

# Download models during build
RUN python setup_models.py

# Switch to non-root user (required for HF Spaces)
USER user

# Expose port (HF Spaces uses 7860 by default)
EXPOSE 7860

# Set environment variables for production
ENV DEBUG=False
ENV USE_GPU=False
ENV DEVICE_ID=-1

# Run the application
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "7860"]
README.md ADDED
@@ -0,0 +1,321 @@
---
title: KYC POC Backend
emoji: 🔐
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: mit
---

# KYC POC API

A proof-of-concept API for KYC (Know Your Customer) verification using:
- **AuraFace** for face recognition and matching
- **Silent-Face-Anti-Spoofing** for liveness detection

## Features

- Face matching between KTP (ID card) and selfie
- Liveness detection to prevent spoofing attacks
- Face quality analysis (blur, brightness, pose)
- Age and gender estimation
- Automatic face extraction from KTP images
- Rejection of images containing multiple faces

## Requirements

- Python 3.9+
- Git (for cloning Silent-Face-Anti-Spoofing)

## Installation

### 1. Create Virtual Environment

```bash
# Windows
python -m venv venv
venv\Scripts\activate

# Linux/Mac
python -m venv venv
source venv/bin/activate
```

### 2. Install Dependencies

```bash
pip install -r requirements.txt
```

### 3. Download ML Models

Run the setup script to download the required models:

```bash
python setup_models.py
```

This will:
- Download the AuraFace model from HuggingFace
- Clone the Silent-Face-Anti-Spoofing repository
- Copy model files to the correct locations

### 4. Run the Application

```bash
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```

The API will be available at: http://localhost:8000

## API Documentation

- **Swagger UI**: http://localhost:8000/docs
- **ReDoc**: http://localhost:8000/redoc

## API Endpoints

### Health Check

```
GET /health
```

### File Upload Endpoints

These endpoints accept `multipart/form-data`:

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/v1/kyc/verify` | POST | Full KYC verification |
| `/api/v1/kyc/face-match` | POST | Face matching only |
| `/api/v1/kyc/liveness` | POST | Liveness detection only |
| `/api/v1/kyc/quality` | POST | Face quality check only |

### Base64 Endpoints

These endpoints accept `application/json` with base64 encoded images:

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/v1/kyc/base64/verify` | POST | Full KYC verification |
| `/api/v1/kyc/base64/face-match` | POST | Face matching only |
| `/api/v1/kyc/base64/liveness` | POST | Liveness detection only |
| `/api/v1/kyc/base64/quality` | POST | Face quality check only |

## Usage Examples

### Using curl (File Upload)

**Full KYC Verification:**
```bash
curl -X POST "http://localhost:8000/api/v1/kyc/verify" \
  -F "ktp_image=@/path/to/ktp.jpg" \
  -F "selfie_image=@/path/to/selfie.jpg" \
  -F "threshold=0.5"
```

**Face Match Only:**
```bash
curl -X POST "http://localhost:8000/api/v1/kyc/face-match" \
  -F "ktp_image=@/path/to/ktp.jpg" \
  -F "selfie_image=@/path/to/selfie.jpg"
```

**Liveness Check:**
```bash
curl -X POST "http://localhost:8000/api/v1/kyc/liveness" \
  -F "image=@/path/to/selfie.jpg"
```

### Using Insomnia/Postman (Base64)

**Full KYC Verification:**

```http
POST /api/v1/kyc/base64/verify
Content-Type: application/json

{
  "ktp_image": "base64_encoded_ktp_image_here...",
  "selfie_image": "base64_encoded_selfie_image_here...",
  "threshold": 0.5
}
```

**Face Match Only:**

```http
POST /api/v1/kyc/base64/face-match
Content-Type: application/json

{
  "image1": "base64_encoded_image1_here...",
  "image2": "base64_encoded_image2_here...",
  "threshold": 0.5
}
```

**Liveness Check:**

```http
POST /api/v1/kyc/base64/liveness
Content-Type: application/json

{
  "image": "base64_encoded_image_here..."
}
```

**Quality Check:**

```http
POST /api/v1/kyc/base64/quality
Content-Type: application/json

{
  "image": "base64_encoded_image_here..."
}
```
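In Python, the base64 request bodies shown above can be assembled with the standard library. A minimal sketch (the endpoint path is taken from the tables above; `build_verify_payload` and the server URL are illustrative, not part of the API):

```python
import base64

def build_verify_payload(ktp_bytes: bytes, selfie_bytes: bytes,
                         threshold: float = 0.5) -> dict:
    """Build the JSON body for POST /api/v1/kyc/base64/verify."""
    return {
        "ktp_image": base64.b64encode(ktp_bytes).decode("ascii"),
        "selfie_image": base64.b64encode(selfie_bytes).decode("ascii"),
        "threshold": threshold,
    }

# Example: read the two images and POST the payload (requires `requests`):
# payload = build_verify_payload(open("ktp.jpg", "rb").read(),
#                                open("selfie.jpg", "rb").read())
# r = requests.post("http://localhost:8000/api/v1/kyc/base64/verify",
#                   json=payload)
```

The same pattern applies to the single-image endpoints; only the field names (`image`, or `image1`/`image2`) change.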
## Response Examples

### Successful Verification

```json
{
  "success": true,
  "face_match": {
    "is_match": true,
    "similarity_score": 0.87,
    "threshold": 0.5
  },
  "liveness": {
    "is_real": true,
    "confidence": 0.95,
    "label": "Real Face",
    "prediction_class": 1,
    "models_used": 2
  },
  "quality": {
    "ktp": {
      "blur_score": 125.5,
      "is_blurry": false,
      "brightness": 0.65,
      "is_too_dark": false,
      "is_too_bright": false,
      "is_good_quality": true
    },
    "selfie": {
      "blur_score": 200.3,
      "is_blurry": false,
      "brightness": 0.58,
      "is_too_dark": false,
      "is_too_bright": false,
      "pose": {
        "yaw": 5.2,
        "pitch": -3.1,
        "roll": 1.5,
        "is_frontal": true
      },
      "is_good_quality": true
    }
  },
  "demographics": {
    "ktp": { "age": 28, "gender": "Male" },
    "selfie": { "age": 29, "gender": "Male" }
  },
  "face_boxes": {
    "ktp": { "x": 120, "y": 80, "width": 150, "height": 180 },
    "selfie": { "x": 200, "y": 100, "width": 250, "height": 300 }
  },
  "message": "KYC verification successful"
}
```

### Error Response

```json
{
  "error_code": "FACE_NOT_DETECTED",
  "message": "No face detected in image"
}
```

## Error Codes

| Code | HTTP | Description |
|------|------|-------------|
| `FACE_NOT_DETECTED` | 400 | No face found in uploaded image |
| `MULTIPLE_FACES_DETECTED` | 400 | Multiple faces detected - rejected |
| `LIVENESS_FAILED` | 400 | Spoofing attempt detected |
| `IMAGE_INVALID` | 400 | Invalid or corrupt image file |
| `IMAGE_TOO_LARGE` | 413 | Image exceeds size limit |
| `UNSUPPORTED_FORMAT` | 415 | Image format not JPEG/PNG |
| `MODEL_NOT_LOADED` | 503 | ML models not initialized |
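Since `error_code` is machine-readable while `message` is free text, clients can branch on the code rather than the message. A hypothetical sketch of such a mapping (the function name and retry strategies are illustrative, not part of this API):

```python
def classify_kyc_error(status: int, body: dict) -> str:
    """Map an error response body to a coarse retry strategy."""
    code = body.get("error_code", "")
    if code in ("FACE_NOT_DETECTED", "MULTIPLE_FACES_DETECTED", "LIVENESS_FAILED"):
        return "retake_photo"   # user must submit a different image
    if code in ("IMAGE_INVALID", "IMAGE_TOO_LARGE", "UNSUPPORTED_FORMAT"):
        return "fix_upload"     # re-encode or shrink before retrying
    if code == "MODEL_NOT_LOADED" or status == 503:
        return "retry_later"    # models are still initializing
    return "fail"               # unknown error; surface to the user
```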

## Configuration

Configuration can be set via environment variables or a `.env` file:

| Variable | Default | Description |
|----------|---------|-------------|
| `DEBUG` | `true` | Enable debug mode |
| `FACE_MATCH_THRESHOLD` | `0.5` | Face similarity threshold |
| `LIVENESS_THRESHOLD` | `0.5` | Liveness confidence threshold |
| `BLUR_THRESHOLD` | `100.0` | Blur detection threshold |
| `BRIGHTNESS_MIN` | `0.2` | Minimum brightness |
| `BRIGHTNESS_MAX` | `0.8` | Maximum brightness |
| `USE_GPU` | `false` | Enable GPU acceleration |
| `MAX_IMAGE_SIZE_MB` | `10.0` | Maximum upload size |
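For reference, a `.env` populated with the documented defaults would look like this (values taken directly from the table above):

```
DEBUG=true
FACE_MATCH_THRESHOLD=0.5
LIVENESS_THRESHOLD=0.5
BLUR_THRESHOLD=100.0
BRIGHTNESS_MIN=0.2
BRIGHTNESS_MAX=0.8
USE_GPU=false
MAX_IMAGE_SIZE_MB=10.0
```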

## Project Structure

```
sentinel/
├── app/
│   ├── __init__.py
│   ├── main.py                   # FastAPI application entry point
│   ├── config.py                 # Configuration settings
│   ├── api/
│   │   ├── __init__.py
│   │   ├── dependencies.py       # Shared dependencies
│   │   └── routes/
│   │       ├── __init__.py
│   │       ├── health.py         # Health check endpoint
│   │       ├── kyc.py            # File upload endpoints
│   │       └── kyc_base64.py     # Base64 endpoints
│   ├── services/
│   │   ├── __init__.py
│   │   ├── face_recognition.py   # AuraFace service
│   │   ├── face_quality.py       # Quality analysis service
│   │   └── liveness_detection.py # Anti-spoofing service
│   ├── models/
│   │   ├── __init__.py
│   │   └── schemas.py            # Pydantic models
│   └── utils/
│       ├── __init__.py
│       ├── image_utils.py        # Image processing
│       └── ktp_extractor.py      # KTP face extraction
├── models/                       # ML model files
│   ├── auraface/                 # AuraFace model
│   └── anti_spoof/               # Anti-spoofing models
├── Silent-Face-Anti-Spoofing/    # Cloned repository
├── requirements.txt
├── setup_models.py               # Model download script
└── README.md
```

## Notes

- AuraFace produces 512-dimensional face embeddings
- A similarity threshold of 0.5 is balanced; increase it for higher security
- Silent-Face-Anti-Spoofing fuses two models (MiniFASNetV1SE + MiniFASNetV2)
- The first request may be slow due to model warm-up
- CPU mode is used by default; set `USE_GPU=true` for GPU acceleration
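The threshold note above reduces to a thresholded similarity over the 512-dimensional embeddings. A minimal sketch, assuming cosine similarity is the metric (the service's actual `compare_faces` implementation may normalize differently):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def is_match(emb1, emb2, threshold: float = 0.5) -> bool:
    """Apply the FACE_MATCH_THRESHOLD decision rule."""
    return cosine_similarity(emb1, emb2) >= threshold
```

Raising the threshold rejects more impostor pairs at the cost of more false rejections of genuine pairs.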

## License

This is a proof-of-concept for educational purposes.
app/__init__.py ADDED
@@ -0,0 +1 @@
# KYC POC Application
app/api/__init__.py ADDED
@@ -0,0 +1 @@
# API Package
app/api/dependencies.py ADDED
@@ -0,0 +1,100 @@
"""
Shared dependencies for API routes.
"""

from fastapi import UploadFile, HTTPException
from typing import Tuple
import numpy as np

from ..config import settings
from ..utils.image_utils import (
    read_image_from_upload,
    validate_content_type,
    decode_base64_image
)
from ..services.face_recognition import face_recognition_service
from ..services.liveness_detection import liveness_detection_service
from ..services.face_quality import face_quality_service
from ..services.ktp_ocr import ktp_ocr_service


async def get_validated_image(file: UploadFile) -> np.ndarray:
    """
    Validate and read an uploaded image file.

    Args:
        file: Uploaded file

    Returns:
        Image as numpy array (BGR format)
    """
    # Validate content type
    validate_content_type(file.content_type, settings.ALLOWED_IMAGE_TYPES)

    # Read and decode image
    image = await read_image_from_upload(file)

    return image


async def get_validated_images(
    file1: UploadFile,
    file2: UploadFile
) -> Tuple[np.ndarray, np.ndarray]:
    """
    Validate and read two uploaded image files.

    Args:
        file1: First uploaded file
        file2: Second uploaded file

    Returns:
        Tuple of images as numpy arrays (BGR format)
    """
    image1 = await get_validated_image(file1)
    image2 = await get_validated_image(file2)
    return image1, image2


def get_face_recognition_service():
    """Get the face recognition service instance."""
    if not face_recognition_service.initialized:
        raise HTTPException(
            status_code=503,
            detail={
                "error_code": "MODEL_NOT_LOADED",
                "message": "Face recognition model not loaded. Please wait for initialization."
            }
        )
    return face_recognition_service


def get_liveness_service():
    """Get the liveness detection service instance."""
    if not liveness_detection_service.initialized:
        raise HTTPException(
            status_code=503,
            detail={
                "error_code": "MODEL_NOT_LOADED",
                "message": "Liveness detection model not loaded. Please wait for initialization."
            }
        )
    return liveness_detection_service


def get_quality_service():
    """Get the face quality service instance."""
    return face_quality_service


def get_ocr_service():
    """Get the KTP OCR service instance."""
    if not ktp_ocr_service.initialized:
        raise HTTPException(
            status_code=503,
            detail={
                "error_code": "MODEL_NOT_LOADED",
                "message": "OCR model not loaded. Please wait for initialization."
            }
        )
    return ktp_ocr_service
app/api/routes/__init__.py ADDED
@@ -0,0 +1 @@
# Routes Package
app/api/routes/health.py ADDED
@@ -0,0 +1,57 @@
"""
Health check endpoints.
"""

from fastapi import APIRouter

from ...config import settings
from ...models.schemas import HealthResponse
from ...services.face_recognition import face_recognition_service
from ...services.liveness_detection import liveness_detection_service
from ...services.ktp_ocr import ktp_ocr_service

router = APIRouter()


@router.get(
    "/health",
    response_model=HealthResponse,
    summary="Health Check",
    description="Check the health status of the API and its models."
)
async def health_check() -> HealthResponse:
    """
    Check the health status of the API.

    Returns:
        Health status including model loading states.
    """
    models_loaded = {
        "face_recognition": face_recognition_service.initialized,
        "liveness_detection": liveness_detection_service.initialized,
        "ktp_ocr": ktp_ocr_service.initialized
    }

    # Determine overall status
    all_loaded = all(models_loaded.values())
    status = "healthy" if all_loaded else "degraded"

    return HealthResponse(
        status=status,
        models_loaded=models_loaded,
        version=settings.APP_VERSION
    )


@router.get(
    "/",
    summary="Root",
    description="API root endpoint."
)
async def root():
    """API root endpoint."""
    return {
        "name": settings.APP_NAME,
        "version": settings.APP_VERSION,
        "docs": "/docs"
    }
app/api/routes/kyc.py ADDED
@@ -0,0 +1,440 @@
"""
KYC verification endpoints (file upload).

These endpoints accept multipart/form-data file uploads.
"""

from fastapi import APIRouter, File, UploadFile, Form, HTTPException
from typing import Optional
import logging

from ...config import settings
from ...models.schemas import (
    VerifyResponse,
    FaceMatchResponse,
    LivenessResponse,
    QualityResponse,
    FaceMatchResult,
    LivenessResult,
    QualityAnalysis,
    BoundingBox,
    Demographics,
    FaceInfo,
    FacePose
)
from ..dependencies import (
    get_validated_image,
    get_validated_images,
    get_face_recognition_service,
    get_liveness_service,
    get_quality_service
)
from ...utils.ktp_extractor import KTPFaceExtractor

logger = logging.getLogger(__name__)
router = APIRouter(prefix="/kyc", tags=["KYC - File Upload"])


@router.post(
    "/verify",
    response_model=VerifyResponse,
    summary="Full KYC Verification",
    description="Perform complete KYC verification: face matching, liveness detection, and quality analysis."
)
async def verify_kyc(
    ktp_image: UploadFile = File(..., description="KTP/ID card photo"),
    selfie_image: UploadFile = File(..., description="Selfie photo"),
    threshold: float = Form(default=0.5, ge=0.0, le=1.0, description="Face match threshold")
) -> VerifyResponse:
    """
    Perform complete KYC verification.

    This endpoint:
    1. Extracts face from KTP image
    2. Checks liveness of selfie
    3. Compares faces between KTP and selfie
    4. Analyzes image quality
    5. Extracts demographics (age, gender)

    Args:
        ktp_image: KTP/ID card image file
        selfie_image: Selfie image file
        threshold: Similarity threshold for face matching

    Returns:
        Complete verification results
    """
    # Get services
    face_service = get_face_recognition_service()
    liveness_service = get_liveness_service()
    quality_service = get_quality_service()

    # Read and validate images
    ktp_img, selfie_img = await get_validated_images(ktp_image, selfie_image)

    # Setup KTP extractor
    ktp_extractor = KTPFaceExtractor()
    ktp_extractor.set_detector(face_service.face_app)

    try:
        # Extract face from KTP
        try:
            ktp_face_img, ktp_face_info = ktp_extractor.extract_face(ktp_img, padding=0.3)
        except ValueError as e:
            raise HTTPException(
                status_code=400,
                detail={
                    "error_code": "FACE_NOT_DETECTED",
                    "message": f"KTP image: {str(e)}"
                }
            )

        # Extract face info from selfie
        try:
            selfie_face_info = face_service.extract_face_info(selfie_img, allow_multiple=False)
        except ValueError as e:
            if "Multiple faces" in str(e):
                raise HTTPException(
                    status_code=400,
                    detail={
                        "error_code": "MULTIPLE_FACES_DETECTED",
                        "message": f"Selfie: {str(e)}"
                    }
                )
            raise HTTPException(
                status_code=400,
                detail={
                    "error_code": "FACE_NOT_DETECTED",
                    "message": f"Selfie: {str(e)}"
                }
            )

        # Extract face info from KTP (cropped face)
        try:
            ktp_embedding_info = face_service.extract_face_info(ktp_face_img, allow_multiple=False)
        except ValueError as e:
            raise HTTPException(
                status_code=400,
                detail={
                    "error_code": "FACE_NOT_DETECTED",
                    "message": f"Could not extract embedding from KTP face: {str(e)}"
                }
            )

        # Compare faces
        face_match = face_service.compare_faces(
            ktp_embedding_info["embedding"],
            selfie_face_info["embedding"],
            threshold
        )

        # Check liveness on selfie
        liveness = liveness_service.check_liveness(selfie_img)

        # Quality analysis
        ktp_quality = quality_service.analyze_quality(ktp_face_img, ktp_embedding_info)
        selfie_quality = quality_service.analyze_quality(selfie_img, selfie_face_info)

        # Build response
        return VerifyResponse(
            success=face_match["is_match"] and liveness.get("is_real", False),
            face_match=FaceMatchResult(**face_match),
            liveness=LivenessResult(**liveness),
            quality={
                "ktp": _build_quality_analysis(ktp_quality),
                "selfie": _build_quality_analysis(selfie_quality)
            },
            demographics={
                "ktp": Demographics(
                    age=ktp_embedding_info.get("age"),
                    gender=ktp_embedding_info.get("gender")
                ),
                "selfie": Demographics(
                    age=selfie_face_info.get("age"),
                    gender=selfie_face_info.get("gender")
                )
            },
            face_boxes={
                "ktp": BoundingBox(**ktp_face_info["bbox"]),
                "selfie": BoundingBox(**selfie_face_info["bbox"])
            },
            message=_build_verification_message(face_match, liveness)
        )

    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"Verification error: {e}", exc_info=True)
        raise HTTPException(
            status_code=500,
            detail={
                "error_code": "VERIFICATION_ERROR",
                "message": f"Verification failed: {str(e)}"
            }
        )


@router.post(
    "/face-match",
    response_model=FaceMatchResponse,
    summary="Face Matching Only",
    description="Compare faces between two images without liveness check."
)
async def face_match(
    ktp_image: UploadFile = File(..., description="KTP/ID card photo"),
    selfie_image: UploadFile = File(..., description="Selfie photo"),
    threshold: float = Form(default=0.5, ge=0.0, le=1.0, description="Face match threshold")
) -> FaceMatchResponse:
    """
    Compare faces between KTP and selfie images.

    Args:
        ktp_image: KTP/ID card image file
        selfie_image: Selfie image file
        threshold: Similarity threshold for face matching

    Returns:
        Face matching results
    """
    face_service = get_face_recognition_service()

    # Read and validate images
    ktp_img, selfie_img = await get_validated_images(ktp_image, selfie_image)

    # Setup KTP extractor
    ktp_extractor = KTPFaceExtractor()
    ktp_extractor.set_detector(face_service.face_app)

    try:
        # Extract face from KTP
        try:
            ktp_face_img, ktp_face_info = ktp_extractor.extract_face(ktp_img, padding=0.3)
            ktp_embedding_info = face_service.extract_face_info(ktp_face_img, allow_multiple=False)
        except ValueError as e:
            raise HTTPException(
                status_code=400,
                detail={
                    "error_code": "FACE_NOT_DETECTED",
                    "message": f"KTP image: {str(e)}"
                }
            )

        # Extract face from selfie
        try:
            selfie_face_info = face_service.extract_face_info(selfie_img, allow_multiple=False)
        except ValueError as e:
            if "Multiple faces" in str(e):
                raise HTTPException(
                    status_code=400,
                    detail={
                        "error_code": "MULTIPLE_FACES_DETECTED",
                        "message": f"Selfie: {str(e)}"
                    }
                )
            raise HTTPException(
                status_code=400,
                detail={
                    "error_code": "FACE_NOT_DETECTED",
                    "message": f"Selfie: {str(e)}"
                }
            )

        # Compare faces
        face_match_result = face_service.compare_faces(
            ktp_embedding_info["embedding"],
            selfie_face_info["embedding"],
            threshold
        )

        return FaceMatchResponse(
            success=face_match_result["is_match"],
            face_match=FaceMatchResult(**face_match_result),
            face1=FaceInfo(
                bbox=BoundingBox(**ktp_face_info["bbox"]),
                demographics=Demographics(
                    age=ktp_embedding_info.get("age"),
                    gender=ktp_embedding_info.get("gender")
                ),
                det_score=ktp_embedding_info.get("det_score")
            ),
            face2=FaceInfo(
                bbox=BoundingBox(**selfie_face_info["bbox"]),
                demographics=Demographics(
                    age=selfie_face_info.get("age"),
                    gender=selfie_face_info.get("gender")
                ),
                det_score=selfie_face_info.get("det_score")
            ),
            message="Faces match" if face_match_result["is_match"] else "Faces do not match"
        )

    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"Face match error: {e}", exc_info=True)
        raise HTTPException(
            status_code=500,
            detail={
                "error_code": "FACE_MATCH_ERROR",
                "message": f"Face matching failed: {str(e)}"
            }
        )


@router.post(
    "/liveness",
    response_model=LivenessResponse,
    summary="Liveness Detection Only",
    description="Check if a face image is from a real person."
)
async def check_liveness(
    image: UploadFile = File(..., description="Face image to check")
) -> LivenessResponse:
    """
    Check liveness of a face image.

    Args:
        image: Face image file

    Returns:
        Liveness detection results
    """
    liveness_service = get_liveness_service()

    # Read and validate image
    img = await get_validated_image(image)

    try:
        liveness = liveness_service.check_liveness(img)

        return LivenessResponse(
            success=liveness.get("is_real", False),
            liveness=LivenessResult(**liveness),
            message="Real face detected" if liveness.get("is_real") else "Possible spoofing detected"
        )

    except Exception as e:
        logger.error(f"Liveness check error: {e}", exc_info=True)
        raise HTTPException(
            status_code=500,
            detail={
                "error_code": "LIVENESS_ERROR",
                "message": f"Liveness check failed: {str(e)}"
            }
        )


@router.post(
    "/quality",
    response_model=QualityResponse,
    summary="Face Quality Check Only",
    description="Analyze the quality of a face image."
)
async def check_quality(
    image: UploadFile = File(..., description="Face image to analyze")
) -> QualityResponse:
    """
    Analyze the quality of a face image.

    Args:
        image: Face image file

    Returns:
        Quality analysis results
    """
    face_service = get_face_recognition_service()
    quality_service = get_quality_service()

    # Read and validate image
    img = await get_validated_image(image)

    try:
        # Extract face info
        try:
            face_info = face_service.extract_face_info(img, allow_multiple=False)
        except ValueError as e:
            if "Multiple faces" in str(e):
                raise HTTPException(
                    status_code=400,
                    detail={
                        "error_code": "MULTIPLE_FACES_DETECTED",
                        "message": str(e)
                    }
                )
            raise HTTPException(
                status_code=400,
                detail={
                    "error_code": "FACE_NOT_DETECTED",
                    "message": str(e)
                }
            )

        # Analyze quality
        quality = quality_service.analyze_quality(img, face_info)

        return QualityResponse(
            success=quality.get("is_good_quality", False),
            quality=_build_quality_analysis(quality),
            face_box=BoundingBox(**face_info["bbox"]),
            demographics=Demographics(
                age=face_info.get("age"),
                gender=face_info.get("gender")
            ),
            message="Good quality" if quality.get("is_good_quality") else "Quality issues detected"
        )

    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"Quality check error: {e}", exc_info=True)
        raise HTTPException(
            status_code=500,
            detail={
                "error_code": "QUALITY_ERROR",
                "message": f"Quality check failed: {str(e)}"
            }
        )


# ============================================================================
# Helper Functions
# ============================================================================

def _build_quality_analysis(quality: dict) -> QualityAnalysis:
    """Build QualityAnalysis from quality dict."""
    pose = None
    if "pose" in quality:
        pose = FacePose(
            yaw=quality["pose"].get("yaw", 0),
            pitch=quality["pose"].get("pitch", 0),
            roll=quality["pose"].get("roll", 0),
            is_frontal=quality["pose"].get("is_frontal", True)
        )

    return QualityAnalysis(
        blur_score=quality.get("blur_score", 0),
        blur_threshold=quality.get("blur_threshold", settings.BLUR_THRESHOLD),
        is_blurry=quality.get("is_blurry", False),
        brightness=quality.get("brightness", 0.5),
        brightness_min=quality.get("brightness_min", settings.BRIGHTNESS_MIN),
        brightness_max=quality.get("brightness_max", settings.BRIGHTNESS_MAX),
        is_too_dark=quality.get("is_too_dark", False),
        is_too_bright=quality.get("is_too_bright", False),
        pose=pose,
        is_good_quality=quality.get("is_good_quality", True)
    )


def _build_verification_message(face_match: dict, liveness: dict) -> str:
    """Build verification result message."""
    is_match = face_match.get("is_match", False)
    is_real = liveness.get("is_real", False)

    if is_match and is_real:
        return "KYC verification successful"
    elif not is_real:
        return "Liveness check failed - possible spoofing attempt"
    elif not is_match:
        return "Face matching failed - faces do not match"
    else:
        return "Verification failed"
app/api/routes/kyc_base64.py ADDED
@@ -0,0 +1,465 @@
"""
KYC verification endpoints (Base64 input).

These endpoints accept JSON with base64 encoded images.
Useful for testing with Insomnia, Postman, or similar tools.
"""

from fastapi import APIRouter, HTTPException
import logging

from ...config import settings
from ...models.schemas import (
    VerifyResponse,
    FaceMatchResponse,
    LivenessResponse,
    QualityResponse,
    Base64VerifyRequest,
    Base64FaceMatchRequest,
    Base64SingleImageRequest,
    FaceMatchResult,
    LivenessResult,
    QualityAnalysis,
    BoundingBox,
    Demographics,
    FaceInfo,
    FacePose
)
from ..dependencies import (
    get_face_recognition_service,
    get_liveness_service,
    get_quality_service
)
from ...utils.image_utils import decode_base64_image
from ...utils.ktp_extractor import KTPFaceExtractor

logger = logging.getLogger(__name__)
router = APIRouter(prefix="/kyc/base64", tags=["KYC - Base64"])


@router.post(
    "/verify",
    response_model=VerifyResponse,
    summary="Full KYC Verification (Base64)",
    description="Perform complete KYC verification with base64 encoded images."
)
async def verify_kyc_base64(request: Base64VerifyRequest) -> VerifyResponse:
    """
    Perform complete KYC verification with base64 images.

    Args:
        request: Request containing base64 encoded KTP and selfie images

    Returns:
        Complete verification results
    """
    # Get services
    face_service = get_face_recognition_service()
    liveness_service = get_liveness_service()
    quality_service = get_quality_service()

    # Decode images
    try:
        ktp_img = decode_base64_image(request.ktp_image)
        selfie_img = decode_base64_image(request.selfie_image)
    except HTTPException:
        raise
    except Exception as e:
        raise HTTPException(
            status_code=400,
            detail={
                "error_code": "IMAGE_INVALID",
+ "error_code": "IMAGE_INVALID",
72
+ "message": f"Failed to decode base64 image: {str(e)}"
73
+ }
74
+ )
75
+
76
+ # Setup KTP extractor
77
+ ktp_extractor = KTPFaceExtractor()
78
+ ktp_extractor.set_detector(face_service.face_app)
79
+
80
+ try:
81
+ # Extract face from KTP
82
+ try:
83
+ ktp_face_img, ktp_face_info = ktp_extractor.extract_face(ktp_img, padding=0.3)
84
+ except ValueError as e:
85
+ raise HTTPException(
86
+ status_code=400,
87
+ detail={
88
+ "error_code": "FACE_NOT_DETECTED",
89
+ "message": f"KTP image: {str(e)}"
90
+ }
91
+ )
92
+
93
+ # Extract face info from selfie
94
+ try:
95
+ selfie_face_info = face_service.extract_face_info(selfie_img, allow_multiple=False)
96
+ except ValueError as e:
97
+ if "Multiple faces" in str(e):
98
+ raise HTTPException(
99
+ status_code=400,
100
+ detail={
101
+ "error_code": "MULTIPLE_FACES_DETECTED",
102
+ "message": f"Selfie: {str(e)}"
103
+ }
104
+ )
105
+ raise HTTPException(
106
+ status_code=400,
107
+ detail={
108
+ "error_code": "FACE_NOT_DETECTED",
109
+ "message": f"Selfie: {str(e)}"
110
+ }
111
+ )
112
+
113
+ # Extract face info from KTP (cropped face)
114
+ try:
115
+ ktp_embedding_info = face_service.extract_face_info(ktp_face_img, allow_multiple=False)
116
+ except ValueError as e:
117
+ raise HTTPException(
118
+ status_code=400,
119
+ detail={
120
+ "error_code": "FACE_NOT_DETECTED",
121
+ "message": f"Could not extract embedding from KTP face: {str(e)}"
122
+ }
123
+ )
124
+
125
+ # Compare faces
126
+ face_match = face_service.compare_faces(
127
+ ktp_embedding_info["embedding"],
128
+ selfie_face_info["embedding"],
129
+ request.threshold
130
+ )
131
+
132
+ # Check liveness on selfie
133
+ liveness = liveness_service.check_liveness(selfie_img)
134
+
135
+ # Quality analysis
136
+ ktp_quality = quality_service.analyze_quality(ktp_face_img, ktp_embedding_info)
137
+ selfie_quality = quality_service.analyze_quality(selfie_img, selfie_face_info)
138
+
139
+ # Build response
140
+ return VerifyResponse(
141
+ success=face_match["is_match"] and liveness.get("is_real", False),
142
+ face_match=FaceMatchResult(**face_match),
143
+ liveness=LivenessResult(**liveness),
144
+ quality={
145
+ "ktp": _build_quality_analysis(ktp_quality),
146
+ "selfie": _build_quality_analysis(selfie_quality)
147
+ },
148
+ demographics={
149
+ "ktp": Demographics(
150
+ age=ktp_embedding_info.get("age"),
151
+ gender=ktp_embedding_info.get("gender")
152
+ ),
153
+ "selfie": Demographics(
154
+ age=selfie_face_info.get("age"),
155
+ gender=selfie_face_info.get("gender")
156
+ )
157
+ },
158
+ face_boxes={
159
+ "ktp": BoundingBox(**ktp_face_info["bbox"]),
160
+ "selfie": BoundingBox(**selfie_face_info["bbox"])
161
+ },
162
+ message=_build_verification_message(face_match, liveness)
163
+ )
164
+
165
+ except HTTPException:
166
+ raise
167
+ except Exception as e:
168
+ logger.error(f"Verification error: {e}", exc_info=True)
169
+ raise HTTPException(
170
+ status_code=500,
171
+ detail={
172
+ "error_code": "VERIFICATION_ERROR",
173
+ "message": f"Verification failed: {str(e)}"
174
+ }
175
+ )
176
+
177
+
178
+ @router.post(
179
+ "/face-match",
180
+ response_model=FaceMatchResponse,
181
+ summary="Face Matching Only (Base64)",
182
+ description="Compare faces between two base64 encoded images."
183
+ )
184
+ async def face_match_base64(request: Base64FaceMatchRequest) -> FaceMatchResponse:
185
+ """
186
+ Compare faces between two base64 encoded images.
187
+
188
+ Args:
189
+ request: Request containing base64 encoded images
190
+
191
+ Returns:
192
+ Face matching results
193
+ """
194
+ face_service = get_face_recognition_service()
195
+
196
+ # Decode images
197
+ try:
198
+ img1 = decode_base64_image(request.image1)
199
+ img2 = decode_base64_image(request.image2)
200
+ except HTTPException:
201
+ raise
202
+ except Exception as e:
203
+ raise HTTPException(
204
+ status_code=400,
205
+ detail={
206
+ "error_code": "IMAGE_INVALID",
207
+ "message": f"Failed to decode base64 image: {str(e)}"
208
+ }
209
+ )
210
+
211
+ # Setup KTP extractor for first image
212
+ ktp_extractor = KTPFaceExtractor()
213
+ ktp_extractor.set_detector(face_service.face_app)
214
+
215
+ try:
216
+ # Extract face from first image (treated as KTP)
217
+ try:
218
+ img1_face, img1_face_info = ktp_extractor.extract_face(img1, padding=0.3)
219
+ img1_embedding_info = face_service.extract_face_info(img1_face, allow_multiple=False)
220
+ except ValueError as e:
221
+ raise HTTPException(
222
+ status_code=400,
223
+ detail={
224
+ "error_code": "FACE_NOT_DETECTED",
225
+ "message": f"Image 1: {str(e)}"
226
+ }
227
+ )
228
+
229
+ # Extract face from second image
230
+ try:
231
+ img2_face_info = face_service.extract_face_info(img2, allow_multiple=False)
232
+ except ValueError as e:
233
+ if "Multiple faces" in str(e):
234
+ raise HTTPException(
235
+ status_code=400,
236
+ detail={
237
+ "error_code": "MULTIPLE_FACES_DETECTED",
238
+ "message": f"Image 2: {str(e)}"
239
+ }
240
+ )
241
+ raise HTTPException(
242
+ status_code=400,
243
+ detail={
244
+ "error_code": "FACE_NOT_DETECTED",
245
+ "message": f"Image 2: {str(e)}"
246
+ }
247
+ )
248
+
249
+ # Compare faces
250
+ face_match_result = face_service.compare_faces(
251
+ img1_embedding_info["embedding"],
252
+ img2_face_info["embedding"],
253
+ request.threshold
254
+ )
255
+
256
+ return FaceMatchResponse(
257
+ success=face_match_result["is_match"],
258
+ face_match=FaceMatchResult(**face_match_result),
259
+ face1=FaceInfo(
260
+ bbox=BoundingBox(**img1_face_info["bbox"]),
261
+ demographics=Demographics(
262
+ age=img1_embedding_info.get("age"),
263
+ gender=img1_embedding_info.get("gender")
264
+ ),
265
+ det_score=img1_embedding_info.get("det_score")
266
+ ),
267
+ face2=FaceInfo(
268
+ bbox=BoundingBox(**img2_face_info["bbox"]),
269
+ demographics=Demographics(
270
+ age=img2_face_info.get("age"),
271
+ gender=img2_face_info.get("gender")
272
+ ),
273
+ det_score=img2_face_info.get("det_score")
274
+ ),
275
+ message="Faces match" if face_match_result["is_match"] else "Faces do not match"
276
+ )
277
+
278
+ except HTTPException:
279
+ raise
280
+ except Exception as e:
281
+ logger.error(f"Face match error: {e}", exc_info=True)
282
+ raise HTTPException(
283
+ status_code=500,
284
+ detail={
285
+ "error_code": "FACE_MATCH_ERROR",
286
+ "message": f"Face matching failed: {str(e)}"
287
+ }
288
+ )
289
+
290
+
291
+ @router.post(
292
+ "/liveness",
293
+ response_model=LivenessResponse,
294
+ summary="Liveness Detection Only (Base64)",
295
+ description="Check if a base64 encoded face image is from a real person."
296
+ )
297
+ async def check_liveness_base64(request: Base64SingleImageRequest) -> LivenessResponse:
298
+ """
299
+ Check liveness of a base64 encoded face image.
300
+
301
+ Args:
302
+ request: Request containing base64 encoded image
303
+
304
+ Returns:
305
+ Liveness detection results
306
+ """
307
+ liveness_service = get_liveness_service()
308
+
309
+ # Decode image
310
+ try:
311
+ img = decode_base64_image(request.image)
312
+ except HTTPException:
313
+ raise
314
+ except Exception as e:
315
+ raise HTTPException(
316
+ status_code=400,
317
+ detail={
318
+ "error_code": "IMAGE_INVALID",
319
+ "message": f"Failed to decode base64 image: {str(e)}"
320
+ }
321
+ )
322
+
323
+ try:
324
+ liveness = liveness_service.check_liveness(img)
325
+
326
+ return LivenessResponse(
327
+ success=liveness.get("is_real", False),
328
+ liveness=LivenessResult(**liveness),
329
+ message="Real face detected" if liveness.get("is_real") else "Possible spoofing detected"
330
+ )
331
+
332
+ except Exception as e:
333
+ logger.error(f"Liveness check error: {e}", exc_info=True)
334
+ raise HTTPException(
335
+ status_code=500,
336
+ detail={
337
+ "error_code": "LIVENESS_ERROR",
338
+ "message": f"Liveness check failed: {str(e)}"
339
+ }
340
+ )
341
+
342
+
343
+ @router.post(
344
+ "/quality",
345
+ response_model=QualityResponse,
346
+ summary="Face Quality Check Only (Base64)",
347
+ description="Analyze the quality of a base64 encoded face image."
348
+ )
349
+ async def check_quality_base64(request: Base64SingleImageRequest) -> QualityResponse:
350
+ """
351
+ Analyze the quality of a base64 encoded face image.
352
+
353
+ Args:
354
+ request: Request containing base64 encoded image
355
+
356
+ Returns:
357
+ Quality analysis results
358
+ """
359
+ face_service = get_face_recognition_service()
360
+ quality_service = get_quality_service()
361
+
362
+ # Decode image
363
+ try:
364
+ img = decode_base64_image(request.image)
365
+ except HTTPException:
366
+ raise
367
+ except Exception as e:
368
+ raise HTTPException(
369
+ status_code=400,
370
+ detail={
371
+ "error_code": "IMAGE_INVALID",
372
+ "message": f"Failed to decode base64 image: {str(e)}"
373
+ }
374
+ )
375
+
376
+ try:
377
+ # Extract face info
378
+ try:
379
+ face_info = face_service.extract_face_info(img, allow_multiple=False)
380
+ except ValueError as e:
381
+ if "Multiple faces" in str(e):
382
+ raise HTTPException(
383
+ status_code=400,
384
+ detail={
385
+ "error_code": "MULTIPLE_FACES_DETECTED",
386
+ "message": str(e)
387
+ }
388
+ )
389
+ raise HTTPException(
390
+ status_code=400,
391
+ detail={
392
+ "error_code": "FACE_NOT_DETECTED",
393
+ "message": str(e)
394
+ }
395
+ )
396
+
397
+ # Analyze quality
398
+ quality = quality_service.analyze_quality(img, face_info)
399
+
400
+ return QualityResponse(
401
+ success=quality.get("is_good_quality", False),
402
+ quality=_build_quality_analysis(quality),
403
+ face_box=BoundingBox(**face_info["bbox"]),
404
+ demographics=Demographics(
405
+ age=face_info.get("age"),
406
+ gender=face_info.get("gender")
407
+ ),
408
+ message="Good quality" if quality.get("is_good_quality") else "Quality issues detected"
409
+ )
410
+
411
+ except HTTPException:
412
+ raise
413
+ except Exception as e:
414
+ logger.error(f"Quality check error: {e}", exc_info=True)
415
+ raise HTTPException(
416
+ status_code=500,
417
+ detail={
418
+ "error_code": "QUALITY_ERROR",
419
+ "message": f"Quality check failed: {str(e)}"
420
+ }
421
+ )
422
+
423
+
424
+ # ============================================================================
425
+ # Helper Functions
426
+ # ============================================================================
427
+
428
+ def _build_quality_analysis(quality: dict) -> QualityAnalysis:
429
+ """Build QualityAnalysis from quality dict."""
430
+ pose = None
431
+ if "pose" in quality:
432
+ pose = FacePose(
433
+ yaw=quality["pose"].get("yaw", 0),
434
+ pitch=quality["pose"].get("pitch", 0),
435
+ roll=quality["pose"].get("roll", 0),
436
+ is_frontal=quality["pose"].get("is_frontal", True)
437
+ )
438
+
439
+ return QualityAnalysis(
440
+ blur_score=quality.get("blur_score", 0),
441
+ blur_threshold=quality.get("blur_threshold", settings.BLUR_THRESHOLD),
442
+ is_blurry=quality.get("is_blurry", False),
443
+ brightness=quality.get("brightness", 0.5),
444
+ brightness_min=quality.get("brightness_min", settings.BRIGHTNESS_MIN),
445
+ brightness_max=quality.get("brightness_max", settings.BRIGHTNESS_MAX),
446
+ is_too_dark=quality.get("is_too_dark", False),
447
+ is_too_bright=quality.get("is_too_bright", False),
448
+ pose=pose,
449
+ is_good_quality=quality.get("is_good_quality", True)
450
+ )
451
+
452
+
453
+ def _build_verification_message(face_match: dict, liveness: dict) -> str:
454
+ """Build verification result message."""
455
+ is_match = face_match.get("is_match", False)
456
+ is_real = liveness.get("is_real", False)
457
+
458
+ if is_match and is_real:
459
+ return "KYC verification successful"
460
+ elif not is_real:
461
+ return "Liveness check failed - possible spoofing attempt"
462
+ elif not is_match:
463
+ return "Face matching failed - faces do not match"
464
+ else:
465
+ return "Verification failed"
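The Base64 endpoints above expect raw image bytes as base64 strings in a JSON body. As a minimal client-side sketch, the payload for `POST /api/v1/kyc/base64/verify` can be built with the standard library (field names `ktp_image`, `selfie_image`, and `threshold` are taken from the handler above; whether a `data:` URI prefix is also accepted depends on `decode_base64_image`, which is not shown in this commit):

```python
import base64
import json

def build_verify_payload(ktp_bytes: bytes, selfie_bytes: bytes, threshold: float = 0.5) -> str:
    """JSON body for POST /api/v1/kyc/base64/verify."""
    return json.dumps({
        "ktp_image": base64.b64encode(ktp_bytes).decode("ascii"),
        "selfie_image": base64.b64encode(selfie_bytes).decode("ascii"),
        "threshold": threshold,
    })

payload = build_verify_payload(b"\xff\xd8fake-jpeg", b"\xff\xd8fake-selfie")
decoded = json.loads(payload)
# The base64 round-trips to the original bytes:
print(base64.b64decode(decoded["ktp_image"]) == b"\xff\xd8fake-jpeg")  # True
```

In practice the two byte strings would come from reading the KTP photo and selfie files.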
app/api/routes/ocr.py ADDED
@@ -0,0 +1,272 @@
+ """
+ KTP OCR endpoints (File upload and Base64).
+
+ These endpoints extract text from Indonesian KTP (ID card) images
+ and return structured, sanitized data.
+ """
+
+ from fastapi import APIRouter, HTTPException, UploadFile, File, Query
+ import logging
+
+ from ...models.schemas import (
+     OCRResponse,
+     Base64OCRRequest,
+     KTPOCRData,
+     OCRFieldResult,
+     OCRTextBlock,
+     KTPValidation,
+     NIKValidation
+ )
+ from ..dependencies import get_ocr_service, get_validated_image
+ from ...utils.image_utils import decode_base64_image
+
+ logger = logging.getLogger(__name__)
+ router = APIRouter(prefix="/kyc/ocr", tags=["KTP OCR"])
+
+
+ def _build_ocr_response(result: dict) -> OCRResponse:
+     """Build OCRResponse from service result."""
+     # Build KTPOCRData from extracted data
+     data_dict = result.get('data', {})
+     ktp_data = KTPOCRData(
+         provinsi=_build_field_result(data_dict.get('provinsi')),
+         kabupaten_kota=_build_field_result(data_dict.get('kabupaten_kota')),
+         nik=_build_field_result(data_dict.get('nik')),
+         nama=_build_field_result(data_dict.get('nama')),
+         tempat_lahir=_build_field_result(data_dict.get('tempat_lahir')),
+         tanggal_lahir=_build_field_result(data_dict.get('tanggal_lahir')),
+         jenis_kelamin=_build_field_result(data_dict.get('jenis_kelamin')),
+         golongan_darah=_build_field_result(data_dict.get('golongan_darah')),
+         alamat=_build_field_result(data_dict.get('alamat')),
+         rt_rw=_build_field_result(data_dict.get('rt_rw')),
+         kelurahan_desa=_build_field_result(data_dict.get('kelurahan_desa')),
+         kecamatan=_build_field_result(data_dict.get('kecamatan')),
+         agama=_build_field_result(data_dict.get('agama')),
+         status_perkawinan=_build_field_result(data_dict.get('status_perkawinan')),
+         pekerjaan=_build_field_result(data_dict.get('pekerjaan')),
+         kewarganegaraan=_build_field_result(data_dict.get('kewarganegaraan')),
+         berlaku_hingga=_build_field_result(data_dict.get('berlaku_hingga'))
+     )
+
+     # Build raw text blocks (convert numpy types to native Python types)
+     raw_text = []
+     for item in result.get('raw_text', []):
+         bbox = item.get('bbox', [])
+         # Convert numpy arrays/values to native Python lists/ints
+         if bbox:
+             bbox = [[int(coord) for coord in point] for point in bbox]
+         raw_text.append(
+             OCRTextBlock(
+                 text=item.get('text', ''),
+                 confidence=float(item.get('confidence', 0.0)),
+                 bbox=bbox
+             )
+         )
+
+     # Build validation result
+     validation = None
+     if result.get('validation'):
+         nik_validation = result['validation'].get('nik')
+         if nik_validation:
+             validation = KTPValidation(
+                 nik=NIKValidation(
+                     is_valid=nik_validation.get('is_valid', False),
+                     errors=nik_validation.get('errors', []),
+                     extracted=nik_validation.get('extracted', {})
+                 )
+             )
+
+     # Determine success based on whether any fields were extracted
+     fields_extracted = sum(1 for v in data_dict.values() if v is not None)
+     success = fields_extracted > 0
+
+     return OCRResponse(
+         success=success,
+         data=ktp_data,
+         raw_text=raw_text,
+         validation=validation,
+         message=f"Extracted {fields_extracted} fields from KTP" if success else "No fields could be extracted"
+     )
+
+
+ def _build_field_result(field_data: dict | None) -> OCRFieldResult | None:
+     """Build OCRFieldResult from field data dict."""
+     if not field_data:
+         return None
+     return OCRFieldResult(
+         value=field_data.get('value', ''),
+         confidence=field_data.get('confidence', 0.0),
+         raw_value=field_data.get('raw_value', '')
+     )
+
+
+ # ============================================================================
+ # File Upload Endpoints
+ # ============================================================================
+
+ @router.post(
+     "/extract",
+     response_model=OCRResponse,
+     summary="Extract KTP Data (File Upload)",
+     description="""
+     Extract and parse data from a KTP (Indonesian ID card) image.
+
+     This endpoint performs OCR on the uploaded KTP image and returns:
+     - Structured data (NIK, name, address, birth date, etc.)
+     - Raw OCR text with confidence scores and bounding boxes
+     - NIK validation (optional)
+
+     Supported image formats: JPEG, PNG
+     Max file size: 10MB
+     """
+ )
+ async def extract_ktp_data(
+     ktp_image: UploadFile = File(..., description="KTP image file"),
+     validate: bool = Query(default=True, description="Validate extracted data (e.g., NIK)")
+ ) -> OCRResponse:
+     """
+     Extract data from KTP image (file upload).
+
+     Args:
+         ktp_image: Uploaded KTP image file
+         validate: Whether to validate extracted data
+
+     Returns:
+         Structured KTP data with validation results
+     """
+     ocr_service = get_ocr_service()
+
+     # Validate and read image
+     try:
+         image = await get_validated_image(ktp_image)
+     except HTTPException:
+         raise
+     except Exception as e:
+         raise HTTPException(
+             status_code=400,
+             detail={
+                 "error_code": "IMAGE_INVALID",
+                 "message": f"Failed to read image: {str(e)}"
+             }
+         )
+
+     try:
+         # Extract KTP data
+         result = ocr_service.extract_ktp_data(image, validate=validate)
+         return _build_ocr_response(result)
+
+     except Exception as e:
+         logger.error(f"OCR extraction error: {e}", exc_info=True)
+         raise HTTPException(
+             status_code=500,
+             detail={
+                 "error_code": "OCR_ERROR",
+                 "message": f"OCR extraction failed: {str(e)}"
+             }
+         )
+
+
+ # ============================================================================
+ # Base64 Endpoints
+ # ============================================================================
+
+ @router.post(
+     "/base64/extract",
+     response_model=OCRResponse,
+     summary="Extract KTP Data (Base64)",
+     description="""
+     Extract and parse data from a base64-encoded KTP image.
+
+     This endpoint performs OCR on the KTP image and returns:
+     - Structured data (NIK, name, address, birth date, etc.)
+     - Raw OCR text with confidence scores and bounding boxes
+     - NIK validation (optional)
+     """
+ )
+ async def extract_ktp_data_base64(request: Base64OCRRequest) -> OCRResponse:
+     """
+     Extract data from KTP image (base64).
+
+     Args:
+         request: Request containing base64 encoded KTP image
+
+     Returns:
+         Structured KTP data with validation results
+     """
+     ocr_service = get_ocr_service()
+
+     # Decode base64 image
+     try:
+         image = decode_base64_image(request.image)
+     except HTTPException:
+         raise
+     except Exception as e:
+         raise HTTPException(
+             status_code=400,
+             detail={
+                 "error_code": "IMAGE_INVALID",
+                 "message": f"Failed to decode base64 image: {str(e)}"
+             }
+         )
+
+     try:
+         # Extract KTP data
+         result = ocr_service.extract_ktp_data(image, validate=request.validate)
+         return _build_ocr_response(result)
+
+     except Exception as e:
+         logger.error(f"OCR extraction error: {e}", exc_info=True)
+         raise HTTPException(
+             status_code=500,
+             detail={
+                 "error_code": "OCR_ERROR",
+                 "message": f"OCR extraction failed: {str(e)}"
+             }
+         )
+
+
+ @router.post(
+     "/validate-nik",
+     summary="Validate NIK",
+     description="""
+     Validate a 16-digit Indonesian NIK (Nomor Induk Kependudukan).
+
+     Returns validation status and extracted information:
+     - Province code
+     - City/Regency code
+     - District code
+     - Birth date
+     - Gender
+     - Sequence number
+     """
+ )
+ async def validate_nik(
+     nik: str = Query(..., description="16-digit NIK to validate", min_length=16, max_length=16)
+ ) -> NIKValidation:
+     """
+     Validate a NIK string.
+
+     Args:
+         nik: 16-digit NIK string
+
+     Returns:
+         Validation result with extracted information
+     """
+     ocr_service = get_ocr_service()
+
+     try:
+         result = ocr_service.validate_nik(nik)
+         return NIKValidation(
+             is_valid=result.get('is_valid', False),
+             errors=result.get('errors', []),
+             extracted=result.get('extracted', {})
+         )
+     except Exception as e:
+         logger.error(f"NIK validation error: {e}", exc_info=True)
+         raise HTTPException(
+             status_code=500,
+             detail={
+                 "error_code": "VALIDATION_ERROR",
+                 "message": f"NIK validation failed: {str(e)}"
+             }
+         )
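`validate_nik` delegates to `ocr_service.validate_nik`, whose implementation is not part of this chunk. For reference, the conventional 16-digit NIK layout that such a validator decodes can be sketched as follows (a hypothetical standalone decoder, not the service's actual code; the notable quirk is that a woman's birth day is stored with a +40 offset):

```python
def decode_nik(nik: str) -> dict:
    """Decode the conventional NIK layout: PP KK DD DDMMYY SSSS."""
    if len(nik) != 16 or not nik.isdigit():
        raise ValueError("NIK must be exactly 16 digits")
    day = int(nik[6:8])
    gender = "FEMALE" if day > 40 else "MALE"  # women: birth day + 40
    if day > 40:
        day -= 40
    return {
        "province_code": nik[0:2],
        "city_code": nik[2:4],
        "district_code": nik[4:6],
        "birth_date": f"{day:02d}-{nik[8:10]}-{nik[10:12]}",  # DD-MM-YY
        "gender": gender,
        "sequence": nik[12:16],
    }

info = decode_nik("3171014510900001")  # synthetic example NIK
print(info["gender"], info["birth_date"])  # FEMALE 05-10-90
```

A full validator would additionally check the province/city/district codes against official region tables, which is presumably what populates the `errors` list above.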
app/config.py ADDED
@@ -0,0 +1,69 @@
+ """
+ Configuration settings for KYC POC application.
+ """
+
+ from pydantic_settings import BaseSettings
+ from typing import List
+ from pathlib import Path
+
+
+ class Settings(BaseSettings):
+     """Application settings."""
+
+     # Application
+     APP_NAME: str = "KYC POC API"
+     APP_VERSION: str = "1.0.0"
+     DEBUG: bool = True
+
+     # Model paths
+     AURAFACE_MODEL_DIR: str = "models/auraface"
+     ANTISPOOF_MODEL_DIR: str = "models/anti_spoof"
+     SILENT_FACE_REPO_DIR: str = "Silent-Face-Anti-Spoofing"
+
+     # Face matching
+     FACE_MATCH_THRESHOLD: float = 0.5
+
+     # Liveness detection
+     LIVENESS_THRESHOLD: float = 0.5
+
+     # Face quality thresholds
+     BLUR_THRESHOLD: float = 100.0  # Below this = blurry
+     BRIGHTNESS_MIN: float = 0.2    # Below this = too dark
+     BRIGHTNESS_MAX: float = 0.8    # Above this = too bright
+     POSE_MAX_YAW: float = 30.0     # Max yaw angle for frontal face
+     POSE_MAX_PITCH: float = 30.0   # Max pitch angle for frontal face
+     POSE_MAX_ROLL: float = 30.0    # Max roll angle for frontal face
+
+     # Device settings
+     USE_GPU: bool = False  # CPU mode for POC
+     DEVICE_ID: int = -1    # -1 for CPU, 0+ for GPU
+
+     # API settings
+     MAX_IMAGE_SIZE_MB: float = 10.0
+     ALLOWED_IMAGE_TYPES: List[str] = ["image/jpeg", "image/png", "image/jpg"]
+
+     # Face detection settings
+     DET_SIZE: tuple = (640, 640)  # Detection input size
+
+     class Config:
+         env_file = ".env"
+         env_file_encoding = "utf-8"
+
+     @property
+     def max_image_size_bytes(self) -> int:
+         """Get max image size in bytes."""
+         return int(self.MAX_IMAGE_SIZE_MB * 1024 * 1024)
+
+     @property
+     def auraface_path(self) -> Path:
+         """Get AuraFace model path."""
+         return Path(self.AURAFACE_MODEL_DIR)
+
+     @property
+     def antispoof_path(self) -> Path:
+         """Get anti-spoof model path."""
+         return Path(self.ANTISPOOF_MODEL_DIR)
+
+
+ # Global settings instance
+ settings = Settings()
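Because `Settings` inherits from pydantic's `BaseSettings` with `env_file = ".env"`, any field can be overridden per deployment through environment variables or a dotenv file, with no code changes. An illustrative `.env` (the values here are examples, not recommended defaults):

```ini
# .env — overrides the defaults declared in app/config.py
DEBUG=false
FACE_MATCH_THRESHOLD=0.6
MAX_IMAGE_SIZE_MB=5
USE_GPU=true
DEVICE_ID=0
```

Pydantic coerces each string to the declared field type (`bool`, `float`, `int`) when `Settings()` is instantiated.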
app/main.py ADDED
@@ -0,0 +1,183 @@
+ """
+ KYC POC API - Main Application Entry Point
+
+ This is a FastAPI application for KYC (Know Your Customer) verification
+ using face matching (AuraFace) and liveness detection (Silent-Face-Anti-Spoofing).
+
+ Run with:
+     uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
+ """
+
+ import logging
+ from contextlib import asynccontextmanager
+ from concurrent.futures import ThreadPoolExecutor
+ import asyncio
+
+ from fastapi import FastAPI, Request
+ from fastapi.middleware.cors import CORSMiddleware
+ from fastapi.responses import JSONResponse
+ from fastapi.exceptions import RequestValidationError
+
+ from .config import settings
+ from .api.routes import health, kyc, kyc_base64, ocr
+ from .services.face_recognition import face_recognition_service
+ from .services.liveness_detection import liveness_detection_service
+ from .services.ktp_ocr import ktp_ocr_service
+
+ # Configure logging
+ logging.basicConfig(
+     level=logging.INFO,
+     format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
+ )
+ logger = logging.getLogger(__name__)
+
+ # Thread pool for ML model initialization
+ executor = ThreadPoolExecutor(max_workers=3)
+
+
+ @asynccontextmanager
+ async def lifespan(app: FastAPI):
+     """
+     Application lifespan manager.
+     Initializes ML models on startup and cleans up on shutdown.
+     """
+     logger.info("Starting KYC POC API...")
+
+     # Initialize ML models in background threads
+     loop = asyncio.get_event_loop()
+
+     try:
+         # Initialize face recognition service
+         logger.info("Initializing face recognition service...")
+         await loop.run_in_executor(executor, face_recognition_service.initialize)
+         logger.info("Face recognition service ready")
+     except Exception as e:
+         logger.error(f"Failed to initialize face recognition: {e}")
+
+     try:
+         # Initialize liveness detection service
+         logger.info("Initializing liveness detection service...")
+         await loop.run_in_executor(executor, liveness_detection_service.initialize)
+         logger.info("Liveness detection service ready")
+     except Exception as e:
+         logger.error(f"Failed to initialize liveness detection: {e}")
+
+     try:
+         # Initialize KTP OCR service
+         logger.info("Initializing KTP OCR service...")
+         await loop.run_in_executor(executor, ktp_ocr_service.initialize)
+         logger.info("KTP OCR service ready")
+     except Exception as e:
+         logger.error(f"Failed to initialize KTP OCR: {e}")
+
+     logger.info("KYC POC API started successfully")
+
+     yield
+
+     # Cleanup on shutdown
+     logger.info("Shutting down KYC POC API...")
+     executor.shutdown(wait=True)
+     logger.info("Shutdown complete")
+
+
+ # Create FastAPI application
+ app = FastAPI(
+     title=settings.APP_NAME,
+     version=settings.APP_VERSION,
+     description="""
+     ## KYC POC API
+
+     A proof-of-concept API for KYC (Know Your Customer) verification using:
+     - **AuraFace** for face recognition and matching
+     - **Silent-Face-Anti-Spoofing** for liveness detection
+     - **EasyOCR** for KTP text extraction
+
+     ### Features
+     - Face matching between KTP (ID card) and selfie
+     - Liveness detection to prevent spoofing
+     - Face quality analysis (blur, brightness, pose)
+     - Age and gender estimation
+     - **KTP OCR**: Extract and parse Indonesian ID card data (NIK, name, address, etc.)
+     - **NIK Validation**: Validate and decode NIK information
+
+     ### Endpoints
+     - **File Upload**: `/api/v1/kyc/*` - Accepts multipart/form-data
+     - **Base64**: `/api/v1/kyc/base64/*` - Accepts JSON with base64 images
+     - **OCR**: `/api/v1/kyc/ocr/*` - KTP text extraction and NIK validation
+     """,
+     docs_url="/docs",
+     redoc_url="/redoc",
+     lifespan=lifespan
+ )
+
+ # Add CORS middleware
+ app.add_middleware(
+     CORSMiddleware,
+     allow_origins=["*"],
+     allow_credentials=True,
+     allow_methods=["*"],
+     allow_headers=["*"],
+ )
+
+
+ # ============================================================================
+ # Exception Handlers
+ # ============================================================================
+
+ @app.exception_handler(RequestValidationError)
+ async def validation_exception_handler(request: Request, exc: RequestValidationError):
+     """Handle request validation errors."""
+     errors = exc.errors()
+     return JSONResponse(
+         status_code=422,
+         content={
+             "error_code": "VALIDATION_ERROR",
+             "message": "Request validation failed",
+             "detail": errors
+         }
+     )
+
+
+ @app.exception_handler(Exception)
+ async def general_exception_handler(request: Request, exc: Exception):
+     """Handle unexpected errors."""
+     logger.error(f"Unexpected error: {exc}", exc_info=True)
+     return JSONResponse(
+         status_code=500,
+         content={
+             "error_code": "INTERNAL_ERROR",
+             "message": "An unexpected error occurred",
+             "detail": str(exc) if settings.DEBUG else None
+         }
+     )
+
+
+ # ============================================================================
+ # Register Routes
+ # ============================================================================
+
+ # Health check routes (no prefix)
+ app.include_router(health.router)
+
+ # KYC routes (file upload)
+ app.include_router(kyc.router, prefix="/api/v1")
+
+ # KYC routes (base64)
+ app.include_router(kyc_base64.router, prefix="/api/v1")
+
+ # OCR routes
+ app.include_router(ocr.router, prefix="/api/v1")
+
+
+ # ============================================================================
+ # Main Entry Point
+ # ============================================================================
+
+ if __name__ == "__main__":
+     import uvicorn
+     uvicorn.run(
+         "app.main:app",
+         host="0.0.0.0",
+         port=8000,
+         reload=settings.DEBUG
+     )
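The lifespan above offloads blocking model loads to a `ThreadPoolExecutor` via `run_in_executor`, so the event loop stays responsive while weights are read from disk. The pattern in isolation (the `SlowService` class is a hypothetical stand-in for the real ML services, which are not defined in this file):

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=3)

class SlowService:
    """Stand-in for a blocking ML model loader (hypothetical)."""
    def __init__(self):
        self.ready = False

    def initialize(self):
        time.sleep(0.05)  # simulate loading model weights
        self.ready = True

async def startup(services):
    loop = asyncio.get_event_loop()
    # Each blocking initialize() runs on the pool, not the event loop,
    # so other coroutines can keep running during startup.
    for svc in services:
        await loop.run_in_executor(executor, svc.initialize)

services = [SlowService() for _ in range(3)]
asyncio.run(startup(services))
print(all(s.ready for s in services))  # True
```

Note that `main.py` wraps each service in its own `try/except`, so one failed model does not abort startup; the corresponding endpoints would then fail at request time instead.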
app/services/__init__.py ADDED
@@ -0,0 +1 @@
+ # Services Package
app/services/face_quality.py ADDED
@@ -0,0 +1,233 @@
1
+ """
2
+ Face Quality Analysis Service.
3
+
4
+ This service provides face quality assessment including:
5
+ - Blur detection (Laplacian variance)
6
+ - Brightness analysis
7
+ - Face pose estimation
8
+ """
9
+
10
+ import cv2
11
+ import numpy as np
12
+ from typing import Dict, Any, Optional
13
+ import logging
14
+
15
+ from ..config import settings
16
+
17
+ logger = logging.getLogger(__name__)
18
+
19
+
20
+ class FaceQualityService:
21
+ """Service for analyzing face image quality."""
22
+
23
+ def __init__(self):
24
+ """Initialize the face quality service."""
25
+ pass
26
+
27
+ def analyze_quality(
28
+ self,
29
+ image: np.ndarray,
30
+ face_info: Optional[Dict[str, Any]] = None
31
+ ) -> Dict[str, Any]:
32
+ """
33
+ Analyze the quality of a face image.
34
+
35
+ Args:
36
+ image: Input image (BGR format)
37
+ face_info: Optional face info dict containing pose data from face detection
38
+
39
+ Returns:
40
+ Dictionary containing quality metrics
41
+ """
42
+ result = {}
43
+
44
+ # Analyze blur
45
+ blur_result = self.analyze_blur(image)
46
+ result.update(blur_result)
47
+
48
+ # Analyze brightness
49
+ brightness_result = self.analyze_brightness(image)
50
+ result.update(brightness_result)
51
+
52
+ # Add pose analysis if face_info provided
53
+ if face_info and "pose" in face_info:
54
+ pose_result = self.analyze_pose(face_info["pose"])
55
+ result["pose"] = pose_result
56
+
57
+ # Overall quality assessment
58
+ result["is_good_quality"] = self._assess_overall_quality(result)
59
+
60
+ return result
61
+
62
+ def analyze_blur(self, image: np.ndarray) -> Dict[str, Any]:
63
+ """
64
+ Analyze image blur using Laplacian variance method.
65
+
66
+ Higher variance = sharper image
67
+ Lower variance = blurrier image
68
+
69
+ Args:
70
+ image: Input image (BGR format)
71
+
72
+ Returns:
73
+ Dictionary with blur metrics
74
+ """
75
+ # Convert to grayscale
76
+ if len(image.shape) == 3:
77
+ gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
78
+ else:
79
+ gray = image
80
+
81
+ # Calculate Laplacian variance
82
+ laplacian = cv2.Laplacian(gray, cv2.CV_64F)
83
+ variance = laplacian.var()
84
+
85
+ is_blurry = variance < settings.BLUR_THRESHOLD
86
+
87
+ return {
88
+ "blur_score": round(float(variance), 2),
89
+ "blur_threshold": settings.BLUR_THRESHOLD,
90
+ "is_blurry": is_blurry
91
+ }
92
+
93
+ def analyze_brightness(self, image: np.ndarray) -> Dict[str, Any]:
94
+ """
95
+ Analyze image brightness.
96
+
97
+ Args:
98
+ image: Input image (BGR format)
99
+
100
+ Returns:
101
+ Dictionary with brightness metrics
102
+ """
103
+ # Convert to grayscale
104
+ if len(image.shape) == 3:
105
+ gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
106
+ else:
107
+ gray = image
108
+
109
+ # Calculate mean brightness (normalized to 0-1)
110
+ mean_brightness = np.mean(gray) / 255.0
111
+
112
+ is_too_dark = mean_brightness < settings.BRIGHTNESS_MIN
113
+ is_too_bright = mean_brightness > settings.BRIGHTNESS_MAX
114
+
115
+ return {
116
+ "brightness": round(float(mean_brightness), 3),
117
+ "brightness_min": settings.BRIGHTNESS_MIN,
118
+ "brightness_max": settings.BRIGHTNESS_MAX,
119
+ "is_too_dark": is_too_dark,
120
+ "is_too_bright": is_too_bright
121
+ }
122
+
123
+ def analyze_pose(self, pose: Dict[str, float]) -> Dict[str, Any]:
124
+ """
125
+ Analyze face pose angles.
126
+
127
+ Args:
128
+ pose: Dictionary with yaw, pitch, roll angles
129
+
130
+ Returns:
131
+ Dictionary with pose analysis
132
+ """
133
+ yaw = abs(pose.get("yaw", 0))
134
+ pitch = abs(pose.get("pitch", 0))
135
+ roll = abs(pose.get("roll", 0))
136
+
137
+ is_frontal = (
138
+ yaw <= settings.POSE_MAX_YAW and
139
+ pitch <= settings.POSE_MAX_PITCH and
140
+ roll <= settings.POSE_MAX_ROLL
141
+ )
142
+
143
+ return {
144
+ "yaw": round(pose.get("yaw", 0), 2),
145
+ "pitch": round(pose.get("pitch", 0), 2),
146
+ "roll": round(pose.get("roll", 0), 2),
147
+ "max_yaw": settings.POSE_MAX_YAW,
148
+ "max_pitch": settings.POSE_MAX_PITCH,
149
+ "max_roll": settings.POSE_MAX_ROLL,
150
+ "is_frontal": is_frontal
151
+ }
152
+
153
+ def analyze_contrast(self, image: np.ndarray) -> Dict[str, Any]:
154
+ """
155
+ Analyze image contrast.
156
+
157
+ Args:
158
+ image: Input image (BGR format)
159
+
160
+ Returns:
161
+ Dictionary with contrast metrics
162
+ """
163
+ # Convert to grayscale
164
+ if len(image.shape) == 3:
165
+ gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
166
+ else:
167
+ gray = image
168
+
169
+ # Calculate standard deviation as contrast measure
170
+ contrast = np.std(gray) / 255.0
171
+
172
+ return {
173
+ "contrast": round(float(contrast), 3),
174
+ "is_low_contrast": contrast < 0.1
175
+ }
176
+
177
+ def analyze_face_size(
178
+ self,
179
+ image: np.ndarray,
180
+ bbox: Dict[str, int],
181
+ min_face_ratio: float = 0.1
182
+ ) -> Dict[str, Any]:
183
+ """
184
+ Analyze face size relative to image.
185
+
186
+ Args:
187
+ image: Input image
188
+ bbox: Face bounding box
189
+ min_face_ratio: Minimum acceptable face to image ratio
190
+
191
+ Returns:
192
+ Dictionary with face size metrics
193
+ """
194
+ img_height, img_width = image.shape[:2]
195
+ img_area = img_height * img_width
196
+
197
+ face_area = bbox["width"] * bbox["height"]
198
+ face_ratio = face_area / img_area
199
+
200
+ return {
201
+ "face_area": face_area,
202
+ "image_area": img_area,
203
+ "face_ratio": round(face_ratio, 4),
204
+ "is_face_too_small": face_ratio < min_face_ratio
205
+ }
206
+
207
+ def _assess_overall_quality(self, metrics: Dict[str, Any]) -> bool:
208
+ """
209
+ Assess overall image quality based on metrics.
210
+
211
+ Args:
212
+ metrics: Dictionary of quality metrics
213
+
214
+ Returns:
215
+ True if image passes quality checks
216
+ """
217
+ # Check blur
218
+ if metrics.get("is_blurry", False):
219
+ return False
220
+
221
+ # Check brightness
222
+ if metrics.get("is_too_dark", False) or metrics.get("is_too_bright", False):
223
+ return False
224
+
225
+ # Check pose if available
226
+ if "pose" in metrics and not metrics["pose"].get("is_frontal", True):
227
+ return False
228
+
229
+ return True
230
+
231
+
232
+ # Global service instance
233
+ face_quality_service = FaceQualityService()
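
The blur check in `analyze_blur` above rests on one idea: the variance of the Laplacian response is large for sharp images and near zero for flat or blurred ones. A minimal NumPy-only sketch of that metric (the service itself uses `cv2.Laplacian`; the `> 100` cut-off here is an illustrative stand-in for `settings.BLUR_THRESHOLD`, not the project's real value):

```python
import numpy as np

def laplacian_variance(gray: np.ndarray) -> float:
    # 3x3 Laplacian kernel -- the same operator cv2.Laplacian applies with
    # its default aperture; variance of the response measures sharpness.
    kernel = np.array([[0, 1, 0],
                       [1, -4, 1],
                       [0, 1, 0]], dtype=np.float64)
    h, w = gray.shape
    out = np.zeros((h - 2, w - 2), dtype=np.float64)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * gray[i:i + h - 2, j:j + w - 2]
    return float(out.var())

# A perfectly flat patch has zero response; random noise has a huge one.
flat = np.full((32, 32), 128.0)
noisy = np.random.default_rng(0).uniform(0, 255, size=(32, 32))
print(laplacian_variance(flat))         # 0.0
print(laplacian_variance(noisy) > 100)  # True
```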
app/services/face_recognition.py ADDED
@@ -0,0 +1,228 @@
1
+ """
2
+ Face Recognition Service using AuraFace.
3
+
4
+ This service provides face detection, embedding extraction,
5
+ and face comparison functionality using InsightFace with AuraFace model.
6
+ """
7
+
8
+ import numpy as np
9
+ from typing import Dict, Any, Optional, List, Tuple
10
+ from pathlib import Path
11
+ import logging
12
+
13
+ from ..config import settings
14
+
15
+ logger = logging.getLogger(__name__)
16
+
17
+
18
+ class FaceRecognitionService:
19
+ """Service for face recognition using AuraFace model."""
20
+
21
+ def __init__(self):
22
+ """Initialize the face recognition service."""
23
+ self.face_app = None
24
+ self.initialized = False
25
+
26
+ def initialize(self) -> None:
27
+ """
28
+ Initialize the face recognition model.
29
+ Should be called on application startup.
30
+ """
31
+ if self.initialized:
32
+ logger.info("Face recognition service already initialized")
33
+ return
34
+
35
+ try:
36
+ from insightface.app import FaceAnalysis
37
+
38
+ logger.info("Initializing AuraFace model...")
39
+
40
+ # Determine provider based on GPU setting
41
+ if settings.USE_GPU:
42
+ providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
43
+ else:
44
+ providers = ["CPUExecutionProvider"]
45
+
46
+ # Initialize FaceAnalysis with AuraFace
47
+ self.face_app = FaceAnalysis(
48
+ name="auraface",
49
+ root=str(Path(settings.AURAFACE_MODEL_DIR).parent),
50
+ providers=providers
51
+ )
52
+
53
+ # Prepare the model
54
+ ctx_id = settings.DEVICE_ID if settings.USE_GPU else -1
55
+ self.face_app.prepare(ctx_id=ctx_id, det_size=settings.DET_SIZE)
56
+
57
+ self.initialized = True
58
+ logger.info("Face recognition service initialized successfully")
59
+
60
+ except Exception as e:
61
+ logger.error(f"Failed to initialize face recognition service: {e}")
62
+ raise RuntimeError(f"Face recognition initialization failed: {e}")
63
+
64
+ def get_faces(self, image: np.ndarray) -> List[Any]:
65
+ """
66
+ Detect faces in image and return face objects.
67
+
68
+ Args:
69
+ image: Input image (BGR format)
70
+
71
+ Returns:
72
+ List of detected face objects
73
+ """
74
+ self._ensure_initialized()
75
+ return self.face_app.get(image)
76
+
77
+ def extract_face_info(
78
+ self,
79
+ image: np.ndarray,
80
+ allow_multiple: bool = False
81
+ ) -> Dict[str, Any]:
82
+ """
83
+ Extract face information from image.
84
+
85
+ Args:
86
+ image: Input image (BGR format)
87
+ allow_multiple: If False, raises error when multiple faces detected
88
+
89
+ Returns:
90
+ Dictionary containing face information
91
+
92
+ Raises:
93
+ ValueError: If no face detected or multiple faces detected (when not allowed)
94
+ """
95
+ self._ensure_initialized()
96
+
97
+ faces = self.face_app.get(image)
98
+
99
+ if not faces:
100
+ raise ValueError("No face detected in image")
101
+
102
+ if len(faces) > 1 and not allow_multiple:
103
+ raise ValueError(f"Multiple faces detected ({len(faces)}). Expected single face.")
104
+
105
+ face = faces[0]
106
+
107
+ # Extract bounding box
108
+ bbox = face.bbox.astype(int)
109
+ x1, y1, x2, y2 = bbox
110
+
111
+ # Build result dictionary
112
+ result = {
113
+ "embedding": face.normed_embedding,
114
+ "bbox": {
115
+ "x": int(x1),
116
+ "y": int(y1),
117
+ "width": int(x2 - x1),
118
+ "height": int(y2 - y1)
119
+ },
120
+ "det_score": float(face.det_score) if hasattr(face, 'det_score') else None,
121
+ "face_count": len(faces)
122
+ }
123
+
124
+ # Add age if available
125
+ if hasattr(face, 'age') and face.age is not None:
126
+ result["age"] = int(face.age)
127
+
128
+ # Add gender if available
129
+ if hasattr(face, 'gender') and face.gender is not None:
130
+ # Gender: 0 = Female, 1 = Male
131
+ result["gender"] = "Male" if face.gender == 1 else "Female"
132
+
133
+ # Add pose if available (yaw, pitch, roll)
134
+ if hasattr(face, 'pose') and face.pose is not None:
135
+ result["pose"] = {
136
+ "yaw": float(face.pose[1]) if len(face.pose) > 1 else 0.0,
137
+ "pitch": float(face.pose[0]) if len(face.pose) > 0 else 0.0,
138
+ "roll": float(face.pose[2]) if len(face.pose) > 2 else 0.0
139
+ }
140
+
141
+ # Add landmarks if available
142
+ if hasattr(face, 'landmark_2d_106') and face.landmark_2d_106 is not None:
143
+ result["has_landmarks"] = True
144
+ elif hasattr(face, 'kps') and face.kps is not None:
145
+ result["has_landmarks"] = True
146
+ else:
147
+ result["has_landmarks"] = False
148
+
149
+ return result
150
+
151
+ def compare_faces(
152
+ self,
153
+ embedding1: np.ndarray,
154
+ embedding2: np.ndarray,
155
+ threshold: Optional[float] = None
156
+ ) -> Dict[str, Any]:
157
+ """
158
+ Compare two face embeddings.
159
+
160
+ Args:
161
+ embedding1: First face embedding
162
+ embedding2: Second face embedding
163
+ threshold: Similarity threshold (uses default from config if not provided)
164
+
165
+ Returns:
166
+ Dictionary with comparison results
167
+ """
168
+ if threshold is None:
169
+ threshold = settings.FACE_MATCH_THRESHOLD
170
+
171
+ # Calculate cosine similarity (embeddings are already normalized)
172
+ similarity = float(np.dot(embedding1, embedding2))
173
+
174
+ return {
175
+ "is_match": similarity >= threshold,
176
+ "similarity_score": round(similarity, 4),
177
+ "threshold": threshold
178
+ }
179
+
180
+ def verify_faces(
181
+ self,
182
+ image1: np.ndarray,
183
+ image2: np.ndarray,
184
+ threshold: Optional[float] = None
185
+ ) -> Dict[str, Any]:
186
+ """
187
+ Verify if two images contain the same person.
188
+
189
+ Args:
190
+ image1: First image (BGR format)
191
+ image2: Second image (BGR format)
192
+ threshold: Similarity threshold
193
+
194
+ Returns:
195
+ Dictionary with verification results and face info
196
+ """
197
+ # Extract face info from both images
198
+ face1_info = self.extract_face_info(image1, allow_multiple=False)
199
+ face2_info = self.extract_face_info(image2, allow_multiple=False)
200
+
201
+ # Compare embeddings
202
+ comparison = self.compare_faces(
203
+ face1_info["embedding"],
204
+ face2_info["embedding"],
205
+ threshold
206
+ )
207
+
208
+ # Remove embeddings from result (they're large arrays)
209
+ face1_info.pop("embedding")
210
+ face2_info.pop("embedding")
211
+
212
+ return {
213
+ "face_match": comparison,
214
+ "face1": face1_info,
215
+ "face2": face2_info
216
+ }
217
+
218
+ def _ensure_initialized(self) -> None:
219
+ """Ensure the service is initialized."""
220
+ if not self.initialized:
221
+ raise RuntimeError(
222
+ "Face recognition service not initialized. "
223
+ "Call initialize() first or wait for app startup."
224
+ )
225
+
226
+
227
+ # Global service instance
228
+ face_recognition_service = FaceRecognitionService()
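
Because `extract_face_info` returns insightface's `normed_embedding` (already L2-normalized), the `compare_faces` step above reduces to a dot product. A self-contained sketch with synthetic 512-dim vectors — the 0.35 threshold and embedding size are assumptions for illustration, not the project's `settings.FACE_MATCH_THRESHOLD`:

```python
import numpy as np

def cosine_match(e1: np.ndarray, e2: np.ndarray, threshold: float = 0.35) -> dict:
    # For L2-normalized embeddings, dot product equals cosine similarity.
    sim = float(np.dot(e1, e2))
    return {"is_match": sim >= threshold,
            "similarity_score": round(sim, 4),
            "threshold": threshold}

rng = np.random.default_rng(42)
a = rng.normal(size=512); a /= np.linalg.norm(a)
b = a + rng.normal(scale=0.02, size=512); b /= np.linalg.norm(b)  # slightly perturbed copy
c = rng.normal(size=512); c /= np.linalg.norm(c)                  # unrelated vector

print(cosine_match(a, b)["is_match"])  # True
print(cosine_match(a, c)["is_match"])  # False
```

In high dimensions, independent unit vectors are nearly orthogonal (similarity near 0), which is why a perturbed copy clears the threshold while an unrelated vector does not.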
app/services/ktp_ocr.py ADDED
@@ -0,0 +1,775 @@
1
+ """
2
+ KTP OCR Service for extracting and parsing Indonesian ID card data.
3
+
4
+ This service uses PaddleOCR to extract text from KTP images and parses
5
+ the extracted text into structured fields with sanitization.
6
+ """
7
+
8
+ import re
9
+ import logging
10
+ from typing import Dict, Any, Optional, List, Tuple
11
+ from dataclasses import dataclass, field
12
+ from datetime import datetime
13
+
14
+ import cv2
15
+ import numpy as np
16
+
17
+ logger = logging.getLogger(__name__)
18
+
19
+
20
+ @dataclass
21
+ class KTPField:
22
+ """Represents a single KTP field with confidence score."""
23
+ value: str
24
+ confidence: float
25
+ raw_value: str = ""
26
+
27
+
28
+ @dataclass
29
+ class KTPData:
30
+ """Structured KTP data extracted from OCR."""
31
+ provinsi: Optional[KTPField] = None
32
+ kabupaten_kota: Optional[KTPField] = None
33
+ nik: Optional[KTPField] = None
34
+ nama: Optional[KTPField] = None
35
+ tempat_lahir: Optional[KTPField] = None
36
+ tanggal_lahir: Optional[KTPField] = None
37
+ jenis_kelamin: Optional[KTPField] = None
38
+ golongan_darah: Optional[KTPField] = None
39
+ alamat: Optional[KTPField] = None
40
+ rt_rw: Optional[KTPField] = None
41
+ kelurahan_desa: Optional[KTPField] = None
42
+ kecamatan: Optional[KTPField] = None
43
+ agama: Optional[KTPField] = None
44
+ status_perkawinan: Optional[KTPField] = None
45
+ pekerjaan: Optional[KTPField] = None
46
+ kewarganegaraan: Optional[KTPField] = None
47
+ berlaku_hingga: Optional[KTPField] = None
48
+
49
+ def to_dict(self) -> Dict[str, Any]:
50
+ """Convert to dictionary for API response."""
51
+ result = {}
52
+ for field_name in [
53
+ 'provinsi', 'kabupaten_kota', 'nik', 'nama', 'tempat_lahir',
54
+ 'tanggal_lahir', 'jenis_kelamin', 'golongan_darah', 'alamat',
55
+ 'rt_rw', 'kelurahan_desa', 'kecamatan', 'agama', 'status_perkawinan',
56
+ 'pekerjaan', 'kewarganegaraan', 'berlaku_hingga'
57
+ ]:
58
+ field_value = getattr(self, field_name)
59
+ if field_value:
60
+ result[field_name] = {
61
+ 'value': field_value.value,
62
+ 'confidence': field_value.confidence,
63
+ 'raw_value': field_value.raw_value
64
+ }
65
+ else:
66
+ result[field_name] = None
67
+ return result
68
+
69
+
70
+ class KTPOCRService:
71
+ """
72
+ Service for performing OCR on Indonesian KTP (ID card) images.
73
+
74
+ Features:
75
+ - Text extraction using PaddleOCR
76
+ - Field parsing and validation
77
+ - NIK validation
78
+ - Data sanitization
79
+ """
80
+
81
+ def __init__(self):
82
+ self.reader = None
83
+ self.initialized = False
84
+
85
+ # KTP field labels for matching
86
+ self.field_labels = {
87
+ 'nik': ['NIK', 'N I K', 'NlK'],
88
+ 'nama': ['Nama', 'NAMA', 'Name'],
89
+ 'tempat_tanggal_lahir': ['Tempat/Tgl Lahir', 'Tempat/TglLahir', 'Tempat / Tgl Lahir', 'Tempat/Tgl.Lahir'],
90
+ 'jenis_kelamin': ['Jenis Kelamin', 'Jenis kelamin', 'JenisKelamin', 'JENIS KELAMIN'],
91
+ 'golongan_darah': ['Gol. Darah', 'Gol.Darah', 'Gol Darah', 'GOL. DARAH'],
92
+ 'alamat': ['Alamat', 'ALAMAT', 'Address'],
93
+ 'rt_rw': ['RT/RW', 'RT / RW', 'RTRW'],
94
+ 'kelurahan_desa': ['Kel/Desa', 'Kel / Desa', 'Kelurahan/Desa', 'KEL/DESA'],
95
+ 'kecamatan': ['Kecamatan', 'KECAMATAN', 'Kec'],
96
+ 'agama': ['Agama', 'AGAMA', 'Religion'],
97
+ 'status_perkawinan': ['Status Perkawinan', 'Status perkawinan', 'STATUS PERKAWINAN'],
98
+ 'pekerjaan': ['Pekerjaan', 'PEKERJAAN', 'Occupation'],
99
+ 'kewarganegaraan': ['Kewarganegaraan', 'KEWARGANEGARAAN', 'Nationality'],
100
+ 'berlaku_hingga': ['Berlaku Hingga', 'Berlaku hingga', 'BERLAKU HINGGA', 'Valid Until']
101
+ }
102
+
103
+ # Valid values for certain fields
104
+ self.valid_genders = ['LAKI-LAKI', 'PEREMPUAN']
105
+ self.valid_religions = ['ISLAM', 'KRISTEN', 'KATOLIK', 'HINDU', 'BUDDHA', 'KONGHUCU']
106
+ self.valid_marital_status = ['BELUM KAWIN', 'KAWIN', 'CERAI HIDUP', 'CERAI MATI']
107
+ self.valid_blood_types = ['A', 'B', 'AB', 'O', 'A+', 'A-', 'B+', 'B-', 'AB+', 'AB-', 'O+', 'O-', '-']
108
+ self.valid_nationalities = ['WNI', 'WNA', 'INDONESIA']
109
+
110
+ def initialize(self) -> None:
111
+ """Initialize PaddleOCR reader."""
112
+ if self.initialized:
113
+ return
114
+
115
+ try:
116
+ from paddleocr import PaddleOCR
117
+ logger.info("Initializing PaddleOCR reader...")
118
+ self.reader = PaddleOCR(
119
+ lang='en', # Use English (includes Latin characters for Indonesian KTP)
120
+ )
121
+ self.initialized = True
122
+ logger.info("PaddleOCR reader initialized successfully")
123
+ except Exception as e:
124
+ logger.error(f"Failed to initialize PaddleOCR: {e}")
125
+ raise
126
+
127
+ def preprocess_image(self, image: np.ndarray) -> np.ndarray:
128
+ """
129
+ Preprocess KTP image for better OCR results.
130
+
131
+ Args:
132
+ image: Input image (BGR format)
133
+
134
+ Returns:
135
+ Preprocessed image
136
+ """
137
+ # Convert to grayscale
138
+ if len(image.shape) == 3:
139
+ gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
140
+ else:
141
+ gray = image.copy()
142
+
143
+ # Apply CLAHE for contrast enhancement
144
+ clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
145
+ enhanced = clahe.apply(gray)
146
+
147
+ # Denoise
148
+ denoised = cv2.fastNlMeansDenoising(enhanced, None, 10, 7, 21)
149
+
150
+ # Adaptive thresholding for better text contrast
151
+ binary = cv2.adaptiveThreshold(
152
+ denoised, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
153
+ cv2.THRESH_BINARY, 11, 2
154
+ )
155
+
156
+ # Convert back to BGR for PaddleOCR
157
+ result = cv2.cvtColor(binary, cv2.COLOR_GRAY2BGR)
158
+
159
+ return result
160
+
161
+ def extract_text(
162
+ self,
163
+ image: np.ndarray,
164
+ preprocess: bool = True
165
+ ) -> List[Tuple[List[List[int]], str, float]]:
166
+ """
167
+ Extract text from KTP image using PaddleOCR.
168
+
169
+ Args:
170
+ image: Input image (BGR format)
171
+ preprocess: Whether to preprocess the image
172
+
173
+ Returns:
174
+ List of (bounding_box, text, confidence) tuples
175
+ """
176
+ if not self.initialized:
177
+ raise RuntimeError("KTP OCR service not initialized")
178
+
179
+ # Preprocess image if requested
180
+ if preprocess:
181
+ processed = self.preprocess_image(image)
182
+ else:
183
+ processed = image
184
+
185
+ # Run PaddleOCR
186
+ result = self.reader.ocr(processed)
187
+
188
+ # Convert PaddleOCR format to expected format
189
+ # New PaddleOCR returns: [{'rec_texts': [...], 'rec_scores': [...], 'rec_polys': [...]}]
190
+ results = []
191
+ if result and len(result) > 0:
192
+ ocr_result = result[0]
193
+ texts = ocr_result.get('rec_texts', [])
194
+ scores = ocr_result.get('rec_scores', [])
195
+ polys = ocr_result.get('rec_polys', [])
196
+
197
+ for i, text in enumerate(texts):
198
+ bbox = polys[i].tolist() if i < len(polys) else []
199
+ confidence = scores[i] if i < len(scores) else 0.0
200
+ results.append((bbox, text, confidence))
201
+
202
+ # Also try on original image and merge results
203
+ if preprocess:
204
+ original_result = self.reader.ocr(image)
205
+ original_results = []
206
+ if original_result and len(original_result) > 0:
207
+ ocr_result = original_result[0]
208
+ texts = ocr_result.get('rec_texts', [])
209
+ scores = ocr_result.get('rec_scores', [])
210
+ polys = ocr_result.get('rec_polys', [])
211
+
212
+ for i, text in enumerate(texts):
213
+ bbox = polys[i].tolist() if i < len(polys) else []
214
+ confidence = scores[i] if i < len(scores) else 0.0
215
+ original_results.append((bbox, text, confidence))
216
+ # Merge results, preferring higher confidence
217
+ results = self._merge_ocr_results(results, original_results)
218
+
219
+ return results
220
+
221
+ def _merge_ocr_results(
222
+ self,
223
+ results1: List[Tuple],
224
+ results2: List[Tuple]
225
+ ) -> List[Tuple]:
226
+ """Merge OCR results from two runs, keeping higher confidence."""
227
+ all_results = results1 + results2
228
+
229
+ # Group by similar text and keep highest confidence
230
+ text_map = {}
231
+ for bbox, text, conf in all_results:
232
+ normalized_text = text.upper().strip()
233
+ if normalized_text not in text_map or text_map[normalized_text][2] < conf:
234
+ text_map[normalized_text] = (bbox, text, conf)
235
+
236
+ return list(text_map.values())
237
+
238
+ def parse_ktp_data(
239
+ self,
240
+ ocr_results: List[Tuple[List[List[int]], str, float]]
241
+ ) -> KTPData:
242
+ """
243
+ Parse OCR results into structured KTP data.
244
+
245
+ Args:
246
+ ocr_results: List of (bounding_box, text, confidence) tuples
247
+
248
+ Returns:
249
+ Structured KTP data
250
+ """
251
+ ktp_data = KTPData()
252
+
253
+ # Sort results by vertical position (y-coordinate)
254
+ sorted_results = sorted(ocr_results, key=lambda x: x[0][0][1] if x[0] else 0)
255
+
256
+ # Extract all text lines
257
+ lines = [(text.strip(), conf) for _, text, conf in sorted_results if text.strip()]
258
+
259
+ # Join all text for regex-based extraction
260
+ full_text = ' '.join([line[0] for line in lines])
261
+
262
+ # Extract NIK (16 digits)
263
+ ktp_data.nik = self._extract_nik(lines, full_text)
264
+
265
+ # Extract province and city from header
266
+ ktp_data.provinsi, ktp_data.kabupaten_kota = self._extract_location(lines)
267
+
268
+ # Extract other fields
269
+ ktp_data.nama = self._extract_field_value(lines, full_text, 'nama')
270
+
271
+ # Extract birth place and date
272
+ birth_info = self._extract_birth_info(lines, full_text)
273
+ ktp_data.tempat_lahir = birth_info[0]
274
+ ktp_data.tanggal_lahir = birth_info[1]
275
+
276
+ ktp_data.jenis_kelamin = self._extract_gender(lines, full_text)
277
+ ktp_data.golongan_darah = self._extract_blood_type(lines, full_text)
278
+ ktp_data.alamat = self._extract_address(lines, full_text)
279
+ ktp_data.rt_rw = self._extract_rt_rw(lines, full_text)
280
+ ktp_data.kelurahan_desa = self._extract_field_value(lines, full_text, 'kelurahan_desa')
281
+ ktp_data.kecamatan = self._extract_field_value(lines, full_text, 'kecamatan')
282
+ ktp_data.agama = self._extract_religion(lines, full_text)
283
+ ktp_data.status_perkawinan = self._extract_marital_status(lines, full_text)
284
+ ktp_data.pekerjaan = self._extract_field_value(lines, full_text, 'pekerjaan')
285
+ ktp_data.kewarganegaraan = self._extract_nationality(lines, full_text)
286
+ ktp_data.berlaku_hingga = self._extract_validity(lines, full_text)
287
+
288
+ return ktp_data
289
+
290
+ def _extract_nik(
291
+ self,
292
+ lines: List[Tuple[str, float]],
293
+ full_text: str
294
+ ) -> Optional[KTPField]:
295
+ """Extract NIK (16-digit ID number)."""
296
+ # Pattern for NIK: 16 consecutive digits
297
+ nik_pattern = r'\b(\d{16})\b'
298
+
299
+ for line_text, conf in lines:
300
+ # Clean the text
301
+ cleaned = re.sub(r'[^\d]', '', line_text)
302
+ if len(cleaned) == 16:
303
+ return KTPField(
304
+ value=cleaned,
305
+ confidence=conf,
306
+ raw_value=line_text
307
+ )
308
+
309
+ # Try from full text
310
+ match = re.search(nik_pattern, re.sub(r'\s', '', full_text))
311
+ if match:
312
+ return KTPField(
313
+ value=match.group(1),
314
+ confidence=0.7, # Lower confidence for pattern match
315
+ raw_value=match.group(1)
316
+ )
317
+
318
+ return None
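
The per-line path of `_extract_nik` above comes down to one normalization step: strip every non-digit and accept the line only if exactly 16 digits survive. A standalone sketch (the NIK value below is synthetic, for illustration only):

```python
import re
from typing import Optional

def extract_nik(line: str) -> Optional[str]:
    # OCR often injects spaces, dots, or dashes into the NIK; strip every
    # non-digit and accept only if exactly 16 digits remain.
    digits = re.sub(r'[^\d]', '', line)
    return digits if len(digits) == 16 else None

print(extract_nik("NIK : 3171 0101 0101 0001"))  # '3171010101010001'
print(extract_nik("Nama : BUDI"))                # None
```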
319
+
320
+ def _extract_location(
321
+ self,
322
+ lines: List[Tuple[str, float]]
323
+ ) -> Tuple[Optional[KTPField], Optional[KTPField]]:
324
+ """Extract province and city from KTP header."""
325
+ provinsi = None
326
+ kab_kota = None
327
+
328
+ for i, (line_text, conf) in enumerate(lines[:5]): # Check first 5 lines
329
+ upper_text = line_text.upper()
330
+
331
+ # Look for "PROVINSI" keyword
332
+ if 'PROVINSI' in upper_text:
333
+ # Extract province name
334
+ prov_match = re.search(r'PROVINSI\s*[:\.]?\s*(.+)', upper_text)
335
+ if prov_match:
336
+ provinsi = KTPField(
337
+ value=self._sanitize_text(prov_match.group(1)),
338
+ confidence=conf,
339
+ raw_value=line_text
340
+ )
341
+ elif i + 1 < len(lines):
342
+ # Province name might be on next line
343
+ provinsi = KTPField(
344
+ value=self._sanitize_text(lines[i + 1][0]),
345
+ confidence=lines[i + 1][1],
346
+ raw_value=lines[i + 1][0]
347
+ )
348
+
349
+ # Look for "KABUPATEN" or "KOTA"
350
+ if 'KABUPATEN' in upper_text or 'KOTA' in upper_text:
351
+ kab_match = re.search(r'(KABUPATEN|KOTA)\s*[:\.]?\s*(.+)', upper_text)
352
+ if kab_match:
353
+ kab_kota = KTPField(
354
+ value=self._sanitize_text(kab_match.group(0)),
355
+ confidence=conf,
356
+ raw_value=line_text
357
+ )
358
+
359
+ return provinsi, kab_kota
360
+
361
+ def _extract_birth_info(
362
+ self,
363
+ lines: List[Tuple[str, float]],
364
+ full_text: str
365
+ ) -> Tuple[Optional[KTPField], Optional[KTPField]]:
366
+ """Extract birth place and date."""
367
+ tempat_lahir = None
368
+ tanggal_lahir = None
369
+
370
+ # Date pattern: DD-MM-YYYY or DD/MM/YYYY
371
+ date_pattern = r'(\d{2}[-/]\d{2}[-/]\d{4})'
372
+
373
+ for line_text, conf in lines:
374
+ upper_text = line_text.upper()
375
+
376
+ # Look for birth info line
377
+ if any(label.upper() in upper_text for label in self.field_labels.get('tempat_tanggal_lahir', [])):
378
+ # Extract after the label
379
+ for label in self.field_labels['tempat_tanggal_lahir']:
380
+ if label.upper() in upper_text:
381
+ rest = upper_text.split(label.upper())[-1].strip()
382
+ rest = re.sub(r'^[:\s]+', '', rest)
383
+
384
+ # Find date in the rest
385
+ date_match = re.search(date_pattern, rest)
386
+ if date_match:
387
+ date_str = date_match.group(1)
388
+ place = rest[:date_match.start()].strip().rstrip(',')
389
+
390
+ tempat_lahir = KTPField(
391
+ value=self._sanitize_text(place),
392
+ confidence=conf,
393
+ raw_value=place
394
+ )
395
+ tanggal_lahir = KTPField(
396
+ value=self._sanitize_date(date_str),
397
+ confidence=conf,
398
+ raw_value=date_str
399
+ )
400
+ break
401
+
402
+ # Also check for standalone date
403
+ if not tanggal_lahir:
404
+ date_match = re.search(date_pattern, line_text)
405
+ if date_match:
406
+ tanggal_lahir = KTPField(
407
+ value=self._sanitize_date(date_match.group(1)),
408
+ confidence=conf,
409
+ raw_value=date_match.group(1)
410
+ )
411
+
412
+ return tempat_lahir, tanggal_lahir
413
+
414
+ def _extract_gender(
415
+ self,
416
+ lines: List[Tuple[str, float]],
417
+ full_text: str
418
+ ) -> Optional[KTPField]:
419
+ """Extract gender (Jenis Kelamin)."""
420
+ for line_text, conf in lines:
421
+ upper_text = line_text.upper()
422
+
423
+ for valid_gender in self.valid_genders:
424
+ if valid_gender in upper_text:
425
+ return KTPField(
426
+ value=valid_gender,
427
+ confidence=conf,
428
+ raw_value=line_text
429
+ )
430
+
431
+ return None
432
+
433
+ def _extract_blood_type(
434
+ self,
435
+         lines: List[Tuple[str, float]],
+         full_text: str
+     ) -> Optional[KTPField]:
+         """Extract blood type (Golongan Darah)."""
+         for line_text, conf in lines:
+             upper_text = line_text.upper()
+ 
+             # Look for blood type field
+             if any(label.upper() in upper_text for label in self.field_labels.get('golongan_darah', [])):
+                 for blood_type in self.valid_blood_types:
+                     if blood_type in upper_text:
+                         return KTPField(
+                             value=blood_type,
+                             confidence=conf,
+                             raw_value=line_text
+                         )
+ 
+         return None
+ 
+     def _extract_address(
+         self,
+         lines: List[Tuple[str, float]],
+         full_text: str
+     ) -> Optional[KTPField]:
+         """Extract address (Alamat)."""
+         for i, (line_text, conf) in enumerate(lines):
+             upper_text = line_text.upper()
+ 
+             if any(label.upper() in upper_text for label in self.field_labels.get('alamat', [])):
+                 # Get the address part after the label
+                 for label in self.field_labels['alamat']:
+                     if label.upper() in upper_text:
+                         rest = upper_text.split(label.upper())[-1].strip()
+                         rest = re.sub(r'^[:\s]+', '', rest)
+ 
+                         if rest:
+                             return KTPField(
+                                 value=self._sanitize_text(rest),
+                                 confidence=conf,
+                                 raw_value=line_text
+                             )
+                         # Address might be on next line
+                         elif i + 1 < len(lines):
+                             next_line = lines[i + 1]
+                             return KTPField(
+                                 value=self._sanitize_text(next_line[0]),
+                                 confidence=next_line[1],
+                                 raw_value=next_line[0]
+                             )
+ 
+         return None
+ 
+     def _extract_rt_rw(
+         self,
+         lines: List[Tuple[str, float]],
+         full_text: str
+     ) -> Optional[KTPField]:
+         """Extract RT/RW."""
+         rt_rw_pattern = r'(\d{3})\s*/\s*(\d{3})'
+ 
+         for line_text, conf in lines:
+             match = re.search(rt_rw_pattern, line_text)
+             if match:
+                 value = f"{match.group(1)}/{match.group(2)}"
+                 return KTPField(
+                     value=value,
+                     confidence=conf,
+                     raw_value=line_text
+                 )
+ 
+         return None
+ 
+     def _extract_religion(
+         self,
+         lines: List[Tuple[str, float]],
+         full_text: str
+     ) -> Optional[KTPField]:
+         """Extract religion (Agama)."""
+         for line_text, conf in lines:
+             upper_text = line_text.upper()
+ 
+             for religion in self.valid_religions:
+                 if religion in upper_text:
+                     return KTPField(
+                         value=religion,
+                         confidence=conf,
+                         raw_value=line_text
+                     )
+ 
+         return None
+ 
+     def _extract_marital_status(
+         self,
+         lines: List[Tuple[str, float]],
+         full_text: str
+     ) -> Optional[KTPField]:
+         """Extract marital status (Status Perkawinan)."""
+         for line_text, conf in lines:
+             upper_text = line_text.upper()
+ 
+             for status in self.valid_marital_status:
+                 if status in upper_text:
+                     return KTPField(
+                         value=status,
+                         confidence=conf,
+                         raw_value=line_text
+                     )
+ 
+         return None
+ 
+     def _extract_nationality(
+         self,
+         lines: List[Tuple[str, float]],
+         full_text: str
+     ) -> Optional[KTPField]:
+         """Extract nationality (Kewarganegaraan)."""
+         for line_text, conf in lines:
+             upper_text = line_text.upper()
+ 
+             for nationality in self.valid_nationalities:
+                 if nationality in upper_text:
+                     return KTPField(
+                         value=nationality if nationality != 'INDONESIA' else 'WNI',
+                         confidence=conf,
+                         raw_value=line_text
+                     )
+ 
+         return None
+ 
+     def _extract_validity(
+         self,
+         lines: List[Tuple[str, float]],
+         full_text: str
+     ) -> Optional[KTPField]:
+         """Extract validity period (Berlaku Hingga)."""
+         for line_text, conf in lines:
+             upper_text = line_text.upper()
+ 
+             if any(label.upper() in upper_text for label in self.field_labels.get('berlaku_hingga', [])):
+                 # Check for "SEUMUR HIDUP"
+                 if 'SEUMUR HIDUP' in upper_text:
+                     return KTPField(
+                         value='SEUMUR HIDUP',
+                         confidence=conf,
+                         raw_value=line_text
+                     )
+ 
+                 # Check for date
+                 date_pattern = r'(\d{2}[-/]\d{2}[-/]\d{4})'
+                 date_match = re.search(date_pattern, line_text)
+                 if date_match:
+                     return KTPField(
+                         value=self._sanitize_date(date_match.group(1)),
+                         confidence=conf,
+                         raw_value=line_text
+                     )
+ 
+         return None
+ 
+     def _extract_field_value(
+         self,
+         lines: List[Tuple[str, float]],
+         full_text: str,
+         field_name: str
+     ) -> Optional[KTPField]:
+         """Generic field value extraction."""
+         labels = self.field_labels.get(field_name, [])
+ 
+         for i, (line_text, conf) in enumerate(lines):
+             for label in labels:
+                 if label.upper() in line_text.upper():
+                     # Get value after label
+                     rest = line_text.upper().split(label.upper())[-1].strip()
+                     rest = re.sub(r'^[:\s]+', '', rest)
+ 
+                     if rest:
+                         return KTPField(
+                             value=self._sanitize_text(rest),
+                             confidence=conf,
+                             raw_value=line_text
+                         )
+                     # Value might be on next line
+                     elif i + 1 < len(lines):
+                         next_line = lines[i + 1]
+                         return KTPField(
+                             value=self._sanitize_text(next_line[0]),
+                             confidence=next_line[1],
+                             raw_value=next_line[0]
+                         )
+ 
+         return None
+ 
+     def _sanitize_text(self, text: str) -> str:
+         """Sanitize extracted text."""
+         if not text:
+             return ""
+ 
+         # Remove extra whitespace
+         text = ' '.join(text.split())
+ 
+         # Remove leading/trailing punctuation
+         text = text.strip('.:,;-_')
+ 
+         # Final whitespace trim
+         text = text.strip()
+ 
+         return text
+ 
+     def _sanitize_date(self, date_str: str) -> str:
+         """Sanitize and standardize date format to DD-MM-YYYY."""
+         if not date_str:
+             return ""
+ 
+         # Replace / with -
+         date_str = date_str.replace('/', '-')
+ 
+         return date_str
+ 
+     def validate_nik(self, nik: str) -> Dict[str, Any]:
+         """
+         Validate NIK and extract encoded information.
+ 
+         NIK Format: PPKKCC-DDMMYY-XXXX
+         - PP: Province code (2 digits)
+         - KK: City/Regency code (2 digits)
+         - CC: District code (2 digits)
+         - DD: Birth day (01-31; 40 is added for females)
+         - MM: Birth month (01-12)
+         - YY: Birth year (last 2 digits)
+         - XXXX: Sequence number (4 digits)
+ 
+         Args:
+             nik: NIK string (16 digits)
+ 
+         Returns:
+             Validation result with extracted info
+         """
+         result = {
+             'is_valid': False,
+             'errors': [],
+             'extracted': {}
+         }
+ 
+         # Clean NIK
+         nik = re.sub(r'[^\d]', '', nik)
+ 
+         # Check length
+         if len(nik) != 16:
+             result['errors'].append(f"Invalid length: {len(nik)} (expected 16)")
+             return result
+ 
+         try:
+             # Extract components
+             province_code = nik[0:2]
+             city_code = nik[2:4]
+             district_code = nik[4:6]
+             birth_day = int(nik[6:8])
+             birth_month = int(nik[8:10])
+             birth_year = int(nik[10:12])
+             sequence = nik[12:16]
+ 
+             # Determine gender from birth day (40 is added for females)
+             gender = 'PEREMPUAN' if birth_day > 40 else 'LAKI-LAKI'
+             actual_day = birth_day - 40 if birth_day > 40 else birth_day
+ 
+             # Validate birth date
+             if actual_day < 1 or actual_day > 31:
+                 result['errors'].append(f"Invalid birth day: {actual_day}")
+ 
+             if birth_month < 1 or birth_month > 12:
+                 result['errors'].append(f"Invalid birth month: {birth_month}")
+ 
+             # Determine full birth year (assume 19xx if YY is later than
+             # the current two-digit year, otherwise 20xx)
+             current_year = datetime.now().year % 100
+             if birth_year > current_year:
+                 full_year = 1900 + birth_year
+             else:
+                 full_year = 2000 + birth_year
+ 
+             result['extracted'] = {
+                 'province_code': province_code,
+                 'city_code': city_code,
+                 'district_code': district_code,
+                 'birth_date': f"{actual_day:02d}-{birth_month:02d}-{full_year}",
+                 'gender': gender,
+                 'sequence': sequence
+             }
+ 
+             result['is_valid'] = len(result['errors']) == 0
+ 
+         except Exception as e:
+             result['errors'].append(f"Parsing error: {str(e)}")
+ 
+         return result
+ 
+     def extract_ktp_data(
+         self,
+         image: np.ndarray,
+         validate: bool = True
+     ) -> Dict[str, Any]:
+         """
+         Extract and parse all KTP data from image.
+ 
+         Args:
+             image: Input image (BGR format)
+             validate: Whether to validate extracted data
+ 
+         Returns:
+             Dictionary with extracted data, raw OCR results, and validation
+         """
+         # Run OCR
+         ocr_results = self.extract_text(image)
+ 
+         # Parse into structured data
+         ktp_data = self.parse_ktp_data(ocr_results)
+ 
+         # Build response
+         response = {
+             'data': ktp_data.to_dict(),
+             'raw_text': [
+                 {
+                     'text': text,
+                     'confidence': conf,
+                     'bbox': bbox
+                 }
+                 for bbox, text, conf in ocr_results
+             ],
+             'validation': None
+         }
+ 
+         # Validate NIK if found and validation requested
+         if validate and ktp_data.nik:
+             response['validation'] = {
+                 'nik': self.validate_nik(ktp_data.nik.value)
+             }
+ 
+         return response
+ 
+ 
+ # Global service instance
+ ktp_ocr_service = KTPOCRService()
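The NIK decoding rules above can be exercised in isolation. A minimal standalone sketch of the same parsing (the sample NIK and its region codes are made up for illustration, not real registry data):

```python
# Standalone sketch of the NIK decoding rules used by validate_nik.
def decode_nik(nik: str) -> dict:
    digits = "".join(ch for ch in nik if ch.isdigit())
    if len(digits) != 16:
        raise ValueError(f"NIK must be 16 digits, got {len(digits)}")
    day = int(digits[6:8])
    # Females have 40 added to the birth day.
    gender = "PEREMPUAN" if day > 40 else "LAKI-LAKI"
    actual_day = day - 40 if day > 40 else day
    return {
        "province_code": digits[0:2],
        "city_code": digits[2:4],
        "district_code": digits[4:6],
        "birth_day": actual_day,
        "birth_month": int(digits[8:10]),
        "birth_year_2digit": int(digits[10:12]),
        "gender": gender,
        "sequence": digits[12:16],
    }

# Example: day field 52 -> female born on the 12th.
info = decode_nik("3171025212900001")
print(info["gender"], info["birth_day"])  # PEREMPUAN 12
```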
app/services/liveness_detection.py ADDED
@@ -0,0 +1,204 @@
+ """
+ Liveness Detection Service using Silent-Face-Anti-Spoofing.
+ 
+ This service detects whether a face image is from a real person
+ or a spoofing attempt (photo, video, mask, etc.).
+ """
+ 
+ import os
+ import sys
+ import numpy as np
+ import cv2
+ from typing import Dict, Any, Optional
+ from pathlib import Path
+ import logging
+ 
+ from ..config import settings
+ 
+ logger = logging.getLogger(__name__)
+ 
+ 
+ class LivenessDetectionService:
+     """Service for detecting face liveness (anti-spoofing)."""
+ 
+     def __init__(self):
+         """Initialize the liveness detection service."""
+         self.model = None
+         self.image_cropper = None
+         self.initialized = False
+         self._models_info = []
+ 
+     def initialize(self) -> None:
+         """
+         Initialize the liveness detection models.
+         Should be called on application startup.
+         """
+         if self.initialized:
+             logger.info("Liveness detection service already initialized")
+             return
+ 
+         try:
+             # Add Silent-Face-Anti-Spoofing to path
+             repo_path = Path(settings.SILENT_FACE_REPO_DIR)
+             if repo_path.exists():
+                 sys.path.insert(0, str(repo_path))
+ 
+             from src.anti_spoof_predict import AntiSpoofPredict
+             from src.generate_patches import CropImage
+ 
+             logger.info("Initializing liveness detection models...")
+ 
+             # Initialize predictor
+             device_id = settings.DEVICE_ID if settings.USE_GPU else 0
+             self.model = AntiSpoofPredict(device_id)
+             self.image_cropper = CropImage()
+ 
+             # Verify model files exist
+             model_dir = Path(settings.ANTISPOOF_MODEL_DIR) / "anti_spoof_models"
+             if model_dir.exists():
+                 self._models_info = list(model_dir.glob("*.pth"))
+                 logger.info(f"Found {len(self._models_info)} anti-spoof models")
+             else:
+                 logger.warning(f"Anti-spoof models directory not found: {model_dir}")
+ 
+             self.initialized = True
+             logger.info("Liveness detection service initialized successfully")
+ 
+         except ImportError as e:
+             logger.error(f"Failed to import Silent-Face-Anti-Spoofing: {e}")
+             logger.warning("Liveness detection will not be available")
+             self.initialized = False
+         except Exception as e:
+             logger.error(f"Failed to initialize liveness detection: {e}")
+             raise RuntimeError(f"Liveness detection initialization failed: {e}")
+ 
+     def check_liveness(
+         self,
+         image: np.ndarray,
+         threshold: Optional[float] = None
+     ) -> Dict[str, Any]:
+         """
+         Check if the face in the image is real or spoofed.
+ 
+         Args:
+             image: Input image (BGR format)
+             threshold: Confidence threshold for classification
+ 
+         Returns:
+             Dictionary with liveness detection results
+         """
+         self._ensure_initialized()
+ 
+         if threshold is None:
+             threshold = settings.LIVENESS_THRESHOLD
+ 
+         try:
+             # Import utilities
+             from src.utility import parse_model_name
+ 
+             # Get face bounding box
+             image_bbox = self.model.get_bbox(image)
+ 
+             if image_bbox is None:
+                 return {
+                     "is_real": False,
+                     "confidence": 0.0,
+                     "label": "No Face Detected",
+                     "error": "No face detected in image"
+                 }
+ 
+             # Get model directory
+             model_dir = Path(settings.ANTISPOOF_MODEL_DIR) / "anti_spoof_models"
+ 
+             # Accumulate predictions from all models
+             prediction = np.zeros((1, 3))
+             model_count = 0
+ 
+             for model_name in os.listdir(model_dir):
+                 if not model_name.endswith(".pth"):
+                     continue
+ 
+                 try:
+                     # Parse model parameters from filename
+                     h_input, w_input, model_type, scale = parse_model_name(model_name)
+ 
+                     # Crop face patch according to model requirements
+                     param = {
+                         "org_img": image,
+                         "bbox": image_bbox,
+                         "scale": scale,
+                         "out_w": w_input,
+                         "out_h": h_input,
+                         "crop": True,
+                     }
+ 
+                     if scale is not None:
+                         img_patch = self.image_cropper.crop(**param)
+                     else:
+                         img_patch = image
+ 
+                     # Run prediction
+                     model_path = os.path.join(str(model_dir), model_name)
+                     prediction += self.model.predict(img_patch, model_path)
+                     model_count += 1
+ 
+                 except Exception as e:
+                     logger.warning(f"Error processing model {model_name}: {e}")
+                     continue
+ 
+             if model_count == 0:
+                 return {
+                     "is_real": False,
+                     "confidence": 0.0,
+                     "label": "Model Error",
+                     "error": "No models could process the image"
+                 }
+ 
+             # Get final prediction
+             # Label: 1 = Real, 0 or 2 = Fake
+             label = np.argmax(prediction)
+             confidence = float(prediction[0][label] / model_count)
+ 
+             # Require both the real-face class and sufficient averaged confidence
+             is_real = label == 1 and confidence >= threshold
+ 
+             return {
+                 "is_real": is_real,
+                 "confidence": round(confidence, 4),
+                 "label": "Real Face" if is_real else "Fake Face",
+                 "prediction_class": int(label),
+                 "models_used": model_count
+             }
+ 
+         except Exception as e:
+             logger.error(f"Liveness detection error: {e}")
+             return {
+                 "is_real": False,
+                 "confidence": 0.0,
+                 "label": "Error",
+                 "error": str(e)
+             }
+ 
+     def check_liveness_simple(self, image: np.ndarray) -> bool:
+         """
+         Simple liveness check returning only boolean.
+ 
+         Args:
+             image: Input image (BGR format)
+ 
+         Returns:
+             True if face is real, False otherwise
+         """
+         result = self.check_liveness(image)
+         return result.get("is_real", False)
+ 
+     def _ensure_initialized(self) -> None:
+         """Ensure the service is initialized."""
+         if not self.initialized:
+             raise RuntimeError(
+                 "Liveness detection service not initialized. "
+                 "Call initialize() first or wait for app startup."
+             )
+ 
+ 
+ # Global service instance
+ liveness_detection_service = LivenessDetectionService()
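The ensemble step in `check_liveness` reduces to a few lines of array arithmetic: per-model class scores are summed, the argmax class is taken, and confidence is averaged over the models used. A standalone sketch (the two score vectors below are hypothetical model outputs):

```python
import numpy as np

# Minimal sketch of the ensemble logic: class 1 means "real",
# classes 0 and 2 are spoof types.
def combine_predictions(per_model_scores: list) -> dict:
    prediction = np.zeros((1, 3))
    for scores in per_model_scores:
        prediction += np.asarray(scores).reshape(1, 3)
    label = int(np.argmax(prediction))
    confidence = float(prediction[0][label] / len(per_model_scores))
    return {
        "is_real": label == 1,
        "confidence": round(confidence, 4),
        "prediction_class": label,
    }

# Two hypothetical models that both favour class 1 (real).
result = combine_predictions([[0.1, 0.8, 0.1], [0.2, 0.7, 0.1]])
print(result)  # {'is_real': True, 'confidence': 0.75, 'prediction_class': 1}
```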
app/utils/__init__.py ADDED
@@ -0,0 +1 @@
+ # Utils Package
app/utils/image_utils.py ADDED
@@ -0,0 +1,214 @@
+ """
+ Image processing utilities for KYC POC.
+ """
+ 
+ import cv2
+ import numpy as np
+ import base64
+ from typing import Optional, Tuple
+ from fastapi import UploadFile, HTTPException
+ 
+ 
+ async def read_image_from_upload(file: UploadFile) -> np.ndarray:
+     """
+     Read uploaded image file into numpy array (OpenCV BGR format).
+ 
+     Args:
+         file: FastAPI UploadFile object
+ 
+     Returns:
+         numpy array in BGR format (OpenCV)
+ 
+     Raises:
+         HTTPException: If image is invalid or cannot be decoded
+     """
+     contents = await file.read()
+     return decode_image_bytes(contents)
+ 
+ 
+ def decode_image_bytes(image_bytes: bytes) -> np.ndarray:
+     """
+     Decode image bytes to numpy array.
+ 
+     Args:
+         image_bytes: Raw image bytes
+ 
+     Returns:
+         numpy array in BGR format (OpenCV)
+ 
+     Raises:
+         HTTPException: If image cannot be decoded
+     """
+     nparr = np.frombuffer(image_bytes, np.uint8)
+     image = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
+ 
+     if image is None:
+         raise HTTPException(
+             status_code=400,
+             detail={
+                 "error_code": "IMAGE_INVALID",
+                 "message": "Failed to decode image. Please ensure the file is a valid image."
+             }
+         )
+ 
+     return image
+ 
+ 
+ def decode_base64_image(base64_string: str) -> np.ndarray:
+     """
+     Decode base64 encoded image string to numpy array.
+ 
+     Args:
+         base64_string: Base64 encoded image string (with or without data URI prefix)
+ 
+     Returns:
+         numpy array in BGR format (OpenCV)
+ 
+     Raises:
+         HTTPException: If base64 string is invalid or image cannot be decoded
+     """
+     try:
+         # Remove data URI prefix if present
+         if "," in base64_string:
+             base64_string = base64_string.split(",")[1]
+ 
+         # Decode base64
+         image_bytes = base64.b64decode(base64_string)
+         return decode_image_bytes(image_bytes)
+ 
+     except Exception as e:
+         raise HTTPException(
+             status_code=400,
+             detail={
+                 "error_code": "IMAGE_INVALID",
+                 "message": f"Failed to decode base64 image: {str(e)}"
+             }
+         )
+ 
+ 
+ def encode_image_to_base64(image: np.ndarray, format: str = ".jpg") -> str:
+     """
+     Encode numpy array image to base64 string.
+ 
+     Args:
+         image: numpy array in BGR format
+         format: Image format (.jpg, .png)
+ 
+     Returns:
+         Base64 encoded string
+     """
+     _, buffer = cv2.imencode(format, image)
+     return base64.b64encode(buffer).decode("utf-8")
+ 
+ 
+ def resize_image(
+     image: np.ndarray,
+     max_size: int = 1024,
+     keep_aspect: bool = True
+ ) -> np.ndarray:
+     """
+     Resize image if it exceeds max size.
+ 
+     Args:
+         image: Input image
+         max_size: Maximum dimension size
+         keep_aspect: Whether to keep aspect ratio
+ 
+     Returns:
+         Resized image
+     """
+     height, width = image.shape[:2]
+ 
+     if max(height, width) <= max_size:
+         return image
+ 
+     if keep_aspect:
+         if width > height:
+             new_width = max_size
+             new_height = int(height * max_size / width)
+         else:
+             new_height = max_size
+             new_width = int(width * max_size / height)
+     else:
+         new_width = max_size
+         new_height = max_size
+ 
+     return cv2.resize(image, (new_width, new_height), interpolation=cv2.INTER_AREA)
+ 
+ 
+ def crop_face_region(
+     image: np.ndarray,
+     bbox: Tuple[int, int, int, int],
+     padding: float = 0.2
+ ) -> np.ndarray:
+     """
+     Crop face region from image with padding.
+ 
+     Args:
+         image: Input image
+         bbox: Face bounding box (x1, y1, x2, y2)
+         padding: Padding ratio to add around face
+ 
+     Returns:
+         Cropped face image
+     """
+     height, width = image.shape[:2]
+     x1, y1, x2, y2 = bbox
+ 
+     # Calculate padding
+     face_width = x2 - x1
+     face_height = y2 - y1
+     pad_x = int(face_width * padding)
+     pad_y = int(face_height * padding)
+ 
+     # Apply padding with bounds checking
+     x1 = max(0, x1 - pad_x)
+     y1 = max(0, y1 - pad_y)
+     x2 = min(width, x2 + pad_x)
+     y2 = min(height, y2 + pad_y)
+ 
+     return image[y1:y2, x1:x2]
+ 
+ 
+ def validate_image_size(image_bytes: bytes, max_size_bytes: int) -> None:
+     """
+     Validate image size doesn't exceed maximum.
+ 
+     Args:
+         image_bytes: Image bytes
+         max_size_bytes: Maximum allowed size in bytes
+ 
+     Raises:
+         HTTPException: If image exceeds size limit
+     """
+     if len(image_bytes) > max_size_bytes:
+         max_mb = max_size_bytes / (1024 * 1024)
+         actual_mb = len(image_bytes) / (1024 * 1024)
+         raise HTTPException(
+             status_code=413,
+             detail={
+                 "error_code": "IMAGE_TOO_LARGE",
+                 "message": f"Image size ({actual_mb:.2f}MB) exceeds maximum allowed ({max_mb:.2f}MB)"
+             }
+         )
+ 
+ 
+ def validate_content_type(content_type: Optional[str], allowed_types: list) -> None:
+     """
+     Validate image content type.
+ 
+     Args:
+         content_type: MIME type of the file
+         allowed_types: List of allowed MIME types
+ 
+     Raises:
+         HTTPException: If content type is not allowed
+     """
+     if content_type not in allowed_types:
+         raise HTTPException(
+             status_code=415,
+             detail={
+                 "error_code": "UNSUPPORTED_FORMAT",
+                 "message": f"Unsupported image format: {content_type}. Allowed: {allowed_types}"
+             }
+         )
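The aspect-ratio arithmetic in `resize_image` can be checked on its own, independent of OpenCV. A minimal sketch of the same computation:

```python
# Sketch of the aspect-preserving resize math used in resize_image:
# the longer side is clamped to max_size and the other side scaled.
def target_size(width: int, height: int, max_size: int = 1024) -> tuple:
    if max(width, height) <= max_size:
        return width, height
    if width > height:
        return max_size, int(height * max_size / width)
    return int(width * max_size / height), max_size

print(target_size(2048, 1536))  # (1024, 768)
print(target_size(800, 600))    # (800, 600) -- already small enough
```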
app/utils/ktp_extractor.py ADDED
@@ -0,0 +1,154 @@
+ """
+ KTP (Indonesian ID Card) face extraction utility.
+ 
+ This module provides functionality to detect and extract the face
+ photo from a KTP card image.
+ """
+ 
+ import cv2
+ import numpy as np
+ from typing import Optional, Tuple, Dict, Any
+ 
+ 
+ class KTPFaceExtractor:
+     """Extracts face from KTP (Indonesian ID card) images."""
+ 
+     def __init__(self, face_detector=None):
+         """
+         Initialize KTP face extractor.
+ 
+         Args:
+             face_detector: Face detector instance (from face_recognition service)
+         """
+         self.face_detector = face_detector
+ 
+     def set_detector(self, face_detector):
+         """Set the face detector instance."""
+         self.face_detector = face_detector
+ 
+     def extract_face(
+         self,
+         ktp_image: np.ndarray,
+         padding: float = 0.3
+     ) -> Tuple[np.ndarray, Dict[str, Any]]:
+         """
+         Extract face from KTP image.
+ 
+         Args:
+             ktp_image: KTP image as numpy array (BGR)
+             padding: Padding ratio around detected face
+ 
+         Returns:
+             Tuple of (cropped_face_image, face_info_dict)
+ 
+         Raises:
+             ValueError: If no face detected or multiple faces found
+         """
+         if self.face_detector is None:
+             raise RuntimeError("Face detector not initialized. Call set_detector first.")
+ 
+         # Detect faces in KTP image
+         faces = self.face_detector.get(ktp_image)
+ 
+         if not faces:
+             raise ValueError("No face detected in KTP image")
+ 
+         if len(faces) > 1:
+             raise ValueError(f"Multiple faces ({len(faces)}) detected in KTP image")
+ 
+         face = faces[0]
+ 
+         # Get bounding box
+         bbox = face.bbox.astype(int)
+         x1, y1, x2, y2 = bbox
+ 
+         # Apply padding
+         height, width = ktp_image.shape[:2]
+         face_width = x2 - x1
+         face_height = y2 - y1
+         pad_x = int(face_width * padding)
+         pad_y = int(face_height * padding)
+ 
+         # Expand bounding box with padding (with bounds checking)
+         x1_padded = max(0, x1 - pad_x)
+         y1_padded = max(0, y1 - pad_y)
+         x2_padded = min(width, x2 + pad_x)
+         y2_padded = min(height, y2 + pad_y)
+ 
+         # Crop face region
+         cropped_face = ktp_image[y1_padded:y2_padded, x1_padded:x2_padded]
+ 
+         # Build face info
+         face_info = {
+             "bbox": {
+                 "x": int(x1),
+                 "y": int(y1),
+                 "width": int(x2 - x1),
+                 "height": int(y2 - y1)
+             },
+             "bbox_padded": {
+                 "x": int(x1_padded),
+                 "y": int(y1_padded),
+                 "width": int(x2_padded - x1_padded),
+                 "height": int(y2_padded - y1_padded)
+             },
+             "det_score": float(face.det_score) if hasattr(face, 'det_score') else None
+         }
+ 
+         return cropped_face, face_info
+ 
+     def extract_face_with_fallback(
+         self,
+         ktp_image: np.ndarray,
+         padding: float = 0.3
+     ) -> Tuple[np.ndarray, Dict[str, Any], bool]:
+         """
+         Extract face from KTP with fallback to full image if detection fails.
+ 
+         Args:
+             ktp_image: KTP image as numpy array (BGR)
+             padding: Padding ratio around detected face
+ 
+         Returns:
+             Tuple of (image, face_info, is_face_detected)
+         """
+         try:
+             cropped, info = self.extract_face(ktp_image, padding)
+             return cropped, info, True
+         except ValueError:
+             # Fallback: return the whole image
+             height, width = ktp_image.shape[:2]
+             info = {
+                 "bbox": {"x": 0, "y": 0, "width": width, "height": height},
+                 "bbox_padded": {"x": 0, "y": 0, "width": width, "height": height},
+                 "det_score": None,
+                 "warning": "Face not detected, using full image"
+             }
+             return ktp_image, info, False
+ 
+ 
+ def preprocess_ktp_image(image: np.ndarray) -> np.ndarray:
+     """
+     Preprocess KTP image for better face detection.
+ 
+     Args:
+         image: Input KTP image
+ 
+     Returns:
+         Preprocessed image
+     """
+     # Convert to grayscale for processing
+     if len(image.shape) == 3:
+         gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
+     else:
+         gray = image
+ 
+     # Apply CLAHE (Contrast Limited Adaptive Histogram Equalization)
+     clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
+     enhanced = clahe.apply(gray)
+ 
+     # Convert back to BGR if original was color
+     if len(image.shape) == 3:
+         enhanced = cv2.cvtColor(enhanced, cv2.COLOR_GRAY2BGR)
+ 
+     return enhanced
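Both `crop_face_region` and `KTPFaceExtractor.extract_face` share the same pad-then-clamp arithmetic: grow the box by a ratio of its size, then clamp to the image bounds. A standalone sketch:

```python
# Sketch of the padded-crop arithmetic: expand the bounding box by a
# padding ratio of its width/height, clamped to the image dimensions.
def pad_bbox(bbox, img_w, img_h, padding=0.3):
    x1, y1, x2, y2 = bbox
    pad_x = int((x2 - x1) * padding)
    pad_y = int((y2 - y1) * padding)
    return (max(0, x1 - pad_x), max(0, y1 - pad_y),
            min(img_w, x2 + pad_x), min(img_h, y2 + pad_y))

# A face box near the top-left corner gets clamped at 0.
print(pad_bbox((10, 10, 110, 110), 640, 480))  # (0, 0, 140, 140)
```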
requirements.txt ADDED
@@ -0,0 +1,16 @@
+ fastapi>=0.104.0
+ uvicorn[standard]>=0.24.0
+ python-multipart>=0.0.6
+ opencv-python>=4.8.0
+ numpy>=1.26.0,<2.0.0
+ insightface>=0.7.3
+ huggingface-hub>=0.19.0
+ onnxruntime>=1.18.0
+ torch>=2.2.0
+ torchvision>=0.17.0
+ scikit-learn>=1.3.0
+ pydantic>=2.0.0
+ pydantic-settings>=2.0.0
+ paddlepaddle>=2.6.0
+ paddleocr>=2.7.0
+ rapidfuzz>=3.0.0
setup_models.py ADDED
@@ -0,0 +1,168 @@
+ """
+ Model Setup Script for KYC POC
+ 
+ This script downloads and sets up the required ML models:
+ 1. AuraFace - Face recognition model from HuggingFace
+ 2. Silent-Face-Anti-Spoofing - Liveness detection models from GitHub
+ 
+ Run this script before starting the application:
+     python setup_models.py
+ """
+ 
+ import os
+ import sys
+ import shutil
+ import subprocess
+ from pathlib import Path
+ 
+ 
+ def setup_auraface():
+     """Download AuraFace model from HuggingFace."""
+     print("=" * 50)
+     print("Setting up AuraFace model...")
+     print("=" * 50)
+ 
+     try:
+         from huggingface_hub import snapshot_download
+ 
+         model_dir = Path("models/auraface")
+         model_dir.mkdir(parents=True, exist_ok=True)
+ 
+         print("Downloading AuraFace-v1 from HuggingFace...")
+         snapshot_download(
+             repo_id="fal/AuraFace-v1",
+             local_dir=str(model_dir),
+             local_dir_use_symlinks=False
+         )
+         print(f"AuraFace model downloaded to: {model_dir}")
+         return True
+ 
+     except ImportError:
+         print("ERROR: huggingface_hub not installed. Run: pip install huggingface-hub")
+         return False
+     except Exception as e:
+         print(f"ERROR downloading AuraFace: {e}")
+         return False
+ 
+ 
+ def setup_silent_face_anti_spoofing():
+     """Clone Silent-Face-Anti-Spoofing repository and copy models."""
+     print("\n" + "=" * 50)
+     print("Setting up Silent-Face-Anti-Spoofing...")
+     print("=" * 50)
+ 
+     repo_dir = Path("Silent-Face-Anti-Spoofing")
+     models_dir = Path("models/anti_spoof")
+ 
+     # Clone repository if not exists
+     if not repo_dir.exists():
+         print("Cloning Silent-Face-Anti-Spoofing repository...")
+         try:
+             result = subprocess.run(
+                 ["git", "clone", "https://github.com/minivision-ai/Silent-Face-Anti-Spoofing.git"],
+                 capture_output=True,
+                 text=True
+             )
+             if result.returncode != 0:
+                 print(f"ERROR cloning repository: {result.stderr}")
+                 return False
+             print("Repository cloned successfully.")
+         except FileNotFoundError:
+             print("ERROR: git not found. Please install git and try again.")
+             return False
+     else:
+         print("Repository already exists, skipping clone.")
+ 
+     # Copy model files
+     models_dir.mkdir(parents=True, exist_ok=True)
+ 
+     # Copy anti_spoof_models
+     src_anti_spoof = repo_dir / "resources" / "anti_spoof_models"
+     dst_anti_spoof = models_dir / "anti_spoof_models"
+ 
+     if src_anti_spoof.exists():
+         if dst_anti_spoof.exists():
+             shutil.rmtree(dst_anti_spoof)
+         shutil.copytree(src_anti_spoof, dst_anti_spoof)
+         print(f"Copied anti_spoof_models to: {dst_anti_spoof}")
+     else:
+         print(f"WARNING: {src_anti_spoof} not found")
+ 
+     # Copy detection_model
+     src_detection = repo_dir / "resources" / "detection_model"
+     dst_detection = models_dir / "detection_model"
+ 
+     if src_detection.exists():
+         if dst_detection.exists():
+             shutil.rmtree(dst_detection)
+         shutil.copytree(src_detection, dst_detection)
+         print(f"Copied detection_model to: {dst_detection}")
+     else:
+         print(f"WARNING: {src_detection} not found")
+ 
+     return True
+ 
+ 
+ def verify_models():
+     """Verify all required model files exist."""
+     print("\n" + "=" * 50)
+     print("Verifying model files...")
+     print("=" * 50)
+ 
+     required_files = [
+         # AuraFace models (these are in the auraface directory after download)
+         "models/auraface",
+         # Anti-spoofing models
+         "models/anti_spoof/anti_spoof_models",
+         "models/anti_spoof/detection_model",
+     ]
+ 
+     all_exist = True
+     for file_path in required_files:
+         path = Path(file_path)
+         exists = path.exists()
+         status = "OK" if exists else "MISSING"
+         print(f"  [{status}] {file_path}")
+         if not exists:
+             all_exist = False
+ 
+     return all_exist
+ 
+ 
+ def main():
+     """Main setup function."""
+     print("\n" + "#" * 60)
+     print("# KYC POC - Model Setup")
+     print("#" * 60 + "\n")
+ 
+     # Change to script directory
+     script_dir = Path(__file__).parent
+     os.chdir(script_dir)
+ 
+     success = True
+ 
+     # Setup AuraFace
+     if not setup_auraface():
+         success = False
+ 
+     # Setup Silent-Face-Anti-Spoofing
+     if not setup_silent_face_anti_spoofing():
+         success = False
+ 
+     # Verify all models
+     if not verify_models():
+         success = False
+ 
+     print("\n" + "#" * 60)
+     if success:
+         print("# Setup completed successfully!")
+         print("# You can now run the application with: uvicorn app.main:app --reload")
+     else:
+         print("# Setup completed with errors. Please check the messages above.")
+     print("#" * 60 + "\n")
+ 
+     return 0 if success else 1
+ 
+ 
+ if __name__ == "__main__":
+     sys.exit(main())
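The `verify_models` step boils down to `Path.exists` checks over a list of required paths. A self-contained sketch of the same pattern, using a temporary directory so the paths are illustrative rather than the real model layout:

```python
from pathlib import Path
import tempfile

# Sketch of the verify_models pattern: report each required path and
# return an overall flag. Paths here are illustrative, not the real ones.
def verify_paths(base: Path, required: list) -> bool:
    all_exist = True
    for rel in required:
        exists = (base / rel).exists()
        print(f"  [{'OK' if exists else 'MISSING'}] {rel}")
        all_exist = all_exist and exists
    return all_exist

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "models/auraface").mkdir(parents=True)
    ok = verify_paths(base, ["models/auraface", "models/anti_spoof"])
    print(ok)  # False -- anti_spoof is missing
```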