narugo1992 committed on
Commit
e7bd63d
·
verified ·
1 Parent(s): 941bf7c

Auto-update README.md via abstractor, on 2025-11-17 20:43:39 CST

Files changed (1)
  1. README.md +127 -2
README.md CHANGED
@@ -1,9 +1,134 @@
1
  ---
 
2
  tags:
3
- - art
4
  - ocr
 
 
 
 
 
5
  license: other
6
  license_name: model-distribution-disclaimer-license
7
  license_link: https://huggingface.co/spaces/deepghs/RDLicence
 
8
  ---
9
- Onnx version of [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR).
1
  ---
2
+ pipeline_tag: image-to-text
3
  tags:
 
4
  - ocr
5
+ - text-detection
6
+ - text-recognition
7
+ - onnx
8
+ - computer-vision
9
+ - multilingual
10
  license: other
11
  license_name: model-distribution-disclaimer-license
12
  license_link: https://huggingface.co/spaces/deepghs/RDLicence
13
+ library_name: dghs-imgutils
14
  ---
15
+
16
+ # PaddleOCR ONNX Models
17
+
18
+ ## Summary
19
+
20
+ This repository provides **ONNX-format** exports of **PaddleOCR** models for multilingual **optical character recognition**, covering both text detection and text recognition. The models are exported from the original [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR) framework for inference in ONNX-based deployments.
21
+
22
+ The repository contains two main types of models: **text detection** models that identify text regions in images, and **text recognition** models that convert detected text regions into text. The detection models are segmentation-based: they predict a heat map of text pixels, which is post-processed with contour detection, polygon approximation, and score thresholding to localize text regions.
23
+
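The thresholding step described above can be sketched in plain Python. This is a simplified illustration, not the code used by PaddleOCR or `dghs-imgutils`: it swaps contour detection and polygon approximation for a 4-connected flood fill with axis-aligned bounding boxes, and the function name `detect_boxes` is hypothetical. The parameter names `heat_threshold` and `box_threshold` mirror the configuration options listed later in this README.

```python
from collections import deque


def detect_boxes(heat, heat_threshold=0.3, box_threshold=0.7):
    """Return [(x0, y0, x1, y1, score)] for connected regions of a heat map.

    `heat` is a list of rows of probabilities in [0, 1]. Hypothetical helper:
    flood fill stands in for real contour detection.
    """
    h, w = len(heat), len(heat[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if seen[sy][sx] or heat[sy][sx] <= heat_threshold:
                continue
            # Collect one 4-connected region of above-threshold pixels.
            queue = deque([(sx, sy)])
            seen[sy][sx] = True
            pixels = []
            while queue:
                x, y = queue.popleft()
                pixels.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < w and 0 <= ny < h and not seen[ny][nx]
                            and heat[ny][nx] > heat_threshold):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            # Score the region by its mean probability and drop weak regions.
            score = sum(heat[y][x] for x, y in pixels) / len(pixels)
            if score >= box_threshold:
                xs = [p[0] for p in pixels]
                ys = [p[1] for p in pixels]
                boxes.append((min(xs), min(ys), max(xs) + 1, max(ys) + 1, score))
    return boxes
```

Lowering `heat_threshold` grows the candidate regions, while raising `box_threshold` discards regions whose average probability is low.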
24
+ For text recognition, the models support **multiple languages** including Chinese, English, Japanese, Korean, Arabic, Cyrillic, Devanagari, and several other scripts. Each recognition model comes with its own character dictionary (`dict.txt`) tailored to the specific language or script family. The recognition pipeline handles text normalization, feature extraction, and sequence decoding to produce accurate text transcriptions with confidence scores.
25
+
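As an illustration of the sequence decoding step, here is a minimal greedy CTC decoder. It is a sketch under stated assumptions rather than the library's implementation: index 0 is assumed to be the CTC blank, `charset` is assumed to hold the characters read from a `dict.txt` (one per line), and the function name `ctc_greedy_decode` is hypothetical.

```python
def ctc_greedy_decode(steps, charset, is_remove_duplicate=True):
    """Decode per-timestep probabilities into (text, confidence).

    `steps` is a list of probability lists, one per timestep.
    Assumption: class 0 is the blank, class i maps to charset[i - 1].
    """
    chars, confs = [], []
    prev = None
    for probs in steps:
        # Greedy choice: the most probable class at this timestep.
        idx = max(range(len(probs)), key=probs.__getitem__)
        repeated = is_remove_duplicate and idx == prev
        if idx != 0 and not repeated:
            chars.append(charset[idx - 1])
            confs.append(probs[idx])
        prev = idx
    text = ''.join(chars)
    # Confidence is the mean probability of the kept characters.
    confidence = sum(confs) / len(confs) if confs else 0.0
    return text, confidence
```

With duplicate removal enabled, consecutive identical predictions collapse into one character, and a blank between two identical characters keeps both.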
26
+ The models are designed with **practical deployment** in mind: the repository includes variants for different targets, from lightweight mobile models to higher-accuracy server models. The pipeline also supports rotated text detection, optional duplicate character removal, and configurable confidence thresholds for trading off precision against recall.
27
+
28
+ ## Usage
29
+
30
+ The models can be used through the `dghs-imgutils` library:
31
+
32
+ ```bash
33
+ # Installation
34
+ pip install dghs-imgutils
35
+ ```
36
+
37
+ ```python
38
+ # Basic OCR usage
39
+ from imgutils.ocr import ocr, list_det_models, list_rec_models
40
+
41
+ # List available models
42
+ print("Detection models:", list_det_models())
43
+ print("Recognition models:", list_rec_models())
44
+
45
+ # Perform OCR on an image
46
+ results = ocr('your_image.jpg')
47
+ for bbox, text, confidence in results:
48
+     print(f"Text: {text}, Confidence: {confidence:.4f}, BBox: {bbox}")
49
+
50
+ # Custom model selection
51
+ results = ocr('your_image.jpg',
52
+               detect_model='ch_PP-OCRv4_det',
53
+               recognize_model='japan_PP-OCRv3_rec')
54
+ ```
55
+
56
+ ```python
57
+ # Text detection only
58
+ from imgutils.ocr import detect_text_with_ocr
59
+
60
+ # Detect text regions without recognition
61
+ detections = detect_text_with_ocr('your_image.jpg')
62
+ for bbox, label, confidence in detections:
63
+     print(f"BBox: {bbox}, Confidence: {confidence:.4f}")
64
+ ```
65
+
66
+ ## Available Models
67
+
68
+ ### Text Detection Models
69
+ - `ch_PP-OCRv2_det` - Chinese text detection v2
70
+ - `ch_PP-OCRv3_det` - Chinese text detection v3
71
+ - `ch_PP-OCRv4_det` - Chinese text detection v4
72
+ - `ch_PP-OCRv4_server_det` - Server-optimized Chinese detection v4
73
+ - `ch_ppocr_mobile_slim_v2.0_det` - Lightweight mobile detection
74
+ - `ch_ppocr_mobile_v2.0_det` - Mobile-optimized detection
75
+ - `ch_ppocr_server_v2.0_det` - Server-optimized detection
76
+ - `en_PP-OCRv3_det` - English text detection
77
+
78
+ ### Text Recognition Models
79
+ - `arabic_PP-OCRv3_rec` - Arabic text recognition
80
+ - `ch_PP-OCRv2_rec` - Chinese text recognition v2
81
+ - `ch_PP-OCRv3_rec` - Chinese text recognition v3
82
+ - `ch_PP-OCRv4_rec` - Chinese text recognition v4
83
+ - `ch_PP-OCRv4_server_rec` - Server-optimized Chinese recognition v4
84
+ - `ch_ppocr_mobile_v2.0_rec` - Mobile-optimized Chinese recognition
85
+ - `ch_ppocr_server_v2.0_rec` - Server-optimized Chinese recognition
86
+ - `chinese_cht_PP-OCRv3_rec` - Traditional Chinese recognition
87
+ - `cyrillic_PP-OCRv3_rec` - Cyrillic script recognition
88
+ - `devanagari_PP-OCRv3_rec` - Devanagari script recognition
89
+ - `en_PP-OCRv3_rec` - English text recognition v3
90
+ - `en_PP-OCRv4_rec` - English text recognition v4
91
+ - `en_number_mobile_v2.0_rec` - Mobile-optimized English and number recognition
92
+ - `japan_PP-OCRv3_rec` - Japanese text recognition
93
+ - `ka_PP-OCRv3_rec` - Kannada text recognition
94
+ - `korean_PP-OCRv3_rec` - Korean text recognition
95
+ - `latin_PP-OCRv3_rec` - Latin script recognition
96
+ - `ta_PP-OCRv3_rec` - Tamil text recognition
97
+ - `te_PP-OCRv3_rec` - Telugu text recognition
98
+
99
+ ## Model Configuration
100
+
101
+ The OCR pipeline supports several configurable parameters:
102
+
103
+ - `heat_threshold`: Heat map threshold for text detection (default: 0.3)
104
+ - `box_threshold`: Box confidence threshold (default: 0.7)
105
+ - `max_candidates`: Maximum number of text candidates (default: 1000)
106
+ - `unclip_ratio`: Expansion ratio for detected boxes (default: 2.0)
107
+ - `rotation_threshold`: Aspect ratio threshold for rotation detection (default: 1.5)
108
+ - `is_remove_duplicate`: Whether to remove duplicate characters (default: False)
109
+
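To give a feel for `unclip_ratio`, the sketch below expands an axis-aligned box outward. It assumes the DB-style rule in which the offset distance equals `area * unclip_ratio / perimeter`; whether `dghs-imgutils` applies exactly this rule is an assumption, and real pipelines offset arbitrary polygons rather than rectangles. The helper name `unclip_rect` is hypothetical.

```python
def unclip_rect(box, unclip_ratio=2.0):
    """Expand (x0, y0, x1, y1) outward by area * unclip_ratio / perimeter.

    Hypothetical helper, rectangle-only simplification of polygon offsetting.
    """
    x0, y0, x1, y1 = box
    width, height = x1 - x0, y1 - y0
    perimeter = 2 * (width + height)
    if perimeter == 0:
        # Degenerate box: nothing to expand.
        return box
    distance = width * height * unclip_ratio / perimeter
    return (x0 - distance, y0 - distance, x1 + distance, y1 + distance)
```

Under this rule a larger `unclip_ratio` pads each detected region more generously, which helps when the heat map hugs the glyphs too tightly, at the risk of merging nearby lines.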
110
+ ## Performance Notes
111
+
112
+ - The default detection model `ch_PP-OCRv4_det` provides a good balance of accuracy and speed
113
+ - The default recognition model `ch_PP-OCRv4_rec` supports both Chinese and English with high accuracy
114
+ - For specific languages, choose the corresponding recognition model for optimal results
115
+ - Server versions generally offer higher accuracy at the cost of increased computational requirements
116
+ - Mobile versions are optimized for speed and resource efficiency
117
+
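One way to apply the language-specific advice above is a small lookup from language to recognition model. The table and the `pick_rec_model` helper are hypothetical conveniences, not part of `dghs-imgutils`; the model names are copied from the recognition model list in this README, and the fallback is the default model named in the notes above.

```python
# Hypothetical lookup table; values are model names listed in this README.
REC_MODEL_BY_LANGUAGE = {
    'chinese': 'ch_PP-OCRv4_rec',
    'traditional_chinese': 'chinese_cht_PP-OCRv3_rec',
    'english': 'en_PP-OCRv4_rec',
    'japanese': 'japan_PP-OCRv3_rec',
    'korean': 'korean_PP-OCRv3_rec',
    'arabic': 'arabic_PP-OCRv3_rec',
    'cyrillic': 'cyrillic_PP-OCRv3_rec',
    'devanagari': 'devanagari_PP-OCRv3_rec',
    'latin': 'latin_PP-OCRv3_rec',
    'kannada': 'ka_PP-OCRv3_rec',
    'tamil': 'ta_PP-OCRv3_rec',
    'telugu': 'te_PP-OCRv3_rec',
}
DEFAULT_REC_MODEL = 'ch_PP-OCRv4_rec'


def pick_rec_model(language: str) -> str:
    """Return the recognition model for a language, or the default model."""
    return REC_MODEL_BY_LANGUAGE.get(language.strip().lower(), DEFAULT_REC_MODEL)
```

The returned name can then be passed as the `recognize_model` argument of `ocr`, as in the custom model selection example in the Usage section.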
118
+ ## Original Content
119
+
120
+ Onnx version of [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR).
121
+
122
+ ## Citation
123
+
124
+ ```bibtex
125
+ @misc{paddleocr_onnx,
126
+ title = {{PaddleOCR ONNX Models}},
127
+ author = {PaddlePaddle and Repository Contributors},
128
+ howpublished = {\url{https://huggingface.co/deepghs/paddleocr}},
129
+ year = {2023},
130
+ note = {ONNX-format implementations of PaddleOCR models for multilingual text detection and recognition},
131
+ abstract = {This repository provides ONNX-format implementations of PaddleOCR models, offering comprehensive optical character recognition capabilities for multilingual text detection and recognition. The models are exported from the original PaddleOCR framework and optimized for efficient inference across various deployment scenarios. The repository contains text detection models that identify text regions in images and text recognition models that convert detected text regions into actual text content, supporting multiple languages including Chinese, English, Japanese, Korean, Arabic, Cyrillic, Devanagari, and several other scripts.},
132
+ keywords = {OCR, text-detection, text-recognition, multilingual, ONNX}
133
+ }
134
+ ```