Update README_zh.md
README_zh.md (+10 −3)
````diff
@@ -6,16 +6,25 @@
 <img src=https://raw.githubusercontent.com/THUDM/CogVLM2/cf9cb3c60a871e0c8e5bde7feaf642e3021153e6/resources/logo.svg>
 </div>
 
+[Code](https://github.com/THUDM/CogVideo/tree/main/tools/caption) | 🤗 [Hugging Face](https://huggingface.co/THUDM/cogvlm2-llama3-caption) | 🤖 [ModelScope](https://modelscope.cn/models/ZhipuAI/cogvlm2-llama3-caption/)
+
 Typically, most video data does not come with corresponding descriptive text, so it is necessary to convert video data into textual descriptions in order to provide the training data that text-to-video models require.
+CogVLM2-Caption is the video captioning model used to generate the training data for the CogVideoX model.
+
+<div align="center">
+<img width="600px" height="auto" src="./CogVLM2-Caption-example.png">
+</div>
+
 
 ## Usage
 ```python
 import io
+
+import argparse
 import numpy as np
 import torch
 from decord import cpu, VideoReader, bridge
 from transformers import AutoModelForCausalLM, AutoTokenizer
-import argparse
 
 MODEL_PATH = "THUDM/cogvlm2-llama3-caption"
 
@@ -63,7 +72,6 @@ def load_video(video_data, strategy='chat'):
 tokenizer = AutoTokenizer.from_pretrained(
     MODEL_PATH,
     trust_remote_code=True,
-    # padding_side="left"
 )
 
 model = AutoModelForCausalLM.from_pretrained(
@@ -118,7 +126,6 @@ def test():
 
 if __name__ == '__main__':
     test()
-
 ```
 
 ## Model License
````
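The usage script touched by the diff reads a clip with decord's `VideoReader` and passes it through a `load_video(video_data, strategy='chat')` helper before captioning; the hunks show only that helper's signature, not its body. As a rough sketch of the uniform frame sampling such a helper typically performs (the `sample_frame_indices` name and the 24-frame count are illustrative assumptions, not taken from the diff):

```python
import numpy as np


def sample_frame_indices(total_frames: int, num_frames: int = 24) -> np.ndarray:
    """Pick `num_frames` indices spread evenly across a clip.

    Sketches the kind of uniform sampling a load_video(video_data,
    strategy='chat') helper might do before handing frames to the
    captioning model. num_frames=24 is an assumed default.
    """
    # Evenly spaced positions over [0, total_frames - 1], truncated
    # to valid integer frame indices.
    indices = np.linspace(0, total_frames - 1, num_frames).astype(int)
    return indices


# Example: a 10-second clip at 30 fps has 300 frames.
idx = sample_frame_indices(300)
print(len(idx), idx[0], idx[-1])  # → 24 0 299
```

The resulting index array would then be fed to something like `VideoReader.get_batch(indices)` to decode only the sampled frames instead of the whole clip.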
|