Image-Text-to-Text
Transformers
Safetensors
English
Chinese
qwen2_5_vl
image-to-text
Document
VLM
OCR
VL
Camel
Openpdf
text-generation-inference
Extraction
Linking
Markdown
Document Digitization
Intelligent Document Processing (IDP)
Intelligent Word Recognition (IWR)
Optical Mark Recognition (OMR)
conversational
Update README.md
README.md
@@ -10,6 +10,7 @@ library_name: transformers
 tags:
 - trl
 - Document
+- VLM
 - KIE
 - OCR
 - VL
@@ -51,7 +52,7 @@ This model shows significant improvements in [LaTeX rendering and Markdown rende
 
 * **Visually-Grounded Device Interaction**: Enables mobile/robotic device operation via visual inputs and text-based instructions using contextual understanding and decision-making logic.
 
-# Quick Start with Transformers
+# Quick Start with Transformers
 
 ```python
 from transformers import Qwen2_5_VLForConditionalGeneration, AutoTokenizer, AutoProcessor