DeepSeek-OCR Model Usage Examples

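DeepSeek-OCR is an advanced OCR model that recognizes text in images and converts it into a specified text format. This document describes how to call the model, gives sample requests, and lists the points to watch for, so that developers can integrate it quickly.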

Request Endpoint

To use the DeepSeek-OCR model, send requests to the following endpoint:

https://api.modelverse.cn/v1/chat/completions

Key Information

Parameter limit: DeepSeek-OCR accepts a max_tokens value of at most 8192.
Pricing: the model is currently free to use; no payment is required.
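The max_tokens limit above can be enforced on the client side before a request is sent. The following is a minimal sketch, not an official helper: the function name build_ocr_request and the constant MAX_TOKENS_LIMIT are hypothetical, and image_data_uri is assumed to be a base64 "data:image/..." string.

```python
MAX_TOKENS_LIMIT = 8192  # upper bound stated in the parameter limit above


def build_ocr_request(image_data_uri, prompt="convert to markdown",
                      max_tokens=MAX_TOKENS_LIMIT):
    """Build a chat-completions request body (hypothetical helper).

    Raises ValueError if max_tokens falls outside 1..MAX_TOKENS_LIMIT,
    so an oversized value is caught before the request is sent.
    """
    if not 1 <= max_tokens <= MAX_TOKENS_LIMIT:
        raise ValueError(
            f"max_tokens must be between 1 and {MAX_TOKENS_LIMIT}, got {max_tokens}"
        )
    return {
        "model": "deepseek-ai/DeepSeek-OCR",
        "max_tokens": max_tokens,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_data_uri}},
                ],
            }
        ],
    }
```

With the OpenAI Python client, the returned dictionary can be unpacked into the call, for example client.chat.completions.create(**build_ocr_request(data_uri, max_tokens=4096)).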

Notes

The model accepts only base64-encoded images as input (that is, a "data:image/..." data URI). Passing a remote image address directly in image_url is not supported.

If your image is hosted at a remote address, the following command fetches it and prints its base64 string in one step:

curl -s https://umodelverse-inference.cn-wlcb.ufileos.com/ucloud-maxcot.jpg | base64 | tr -d '\n'
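The same conversion can be done in Python. The sketch below uses only the standard library; the helper names to_data_uri, guess_image_mime, and fetch_as_data_uri are hypothetical, and falling back to image/jpeg when the extension is unknown is an assumption rather than documented behavior.

```python
import base64
import mimetypes
import urllib.request


def to_data_uri(image_bytes, mime_type):
    """Encode raw image bytes as a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def guess_image_mime(path_or_url, default="image/jpeg"):
    """Guess the image MIME type from the file extension.

    Falls back to `default` (an assumption) when the extension is
    missing or does not map to an image type.
    """
    mime_type, _ = mimetypes.guess_type(path_or_url)
    if mime_type and mime_type.startswith("image/"):
        return mime_type
    return default


def fetch_as_data_uri(url, timeout=30):
    """Download a remote image and return it as a data URI."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return to_data_uri(resp.read(), guess_image_mime(url))
```

The string returned by fetch_as_data_uri can be used directly as the "url" value of the image_url content part.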

Non-Streaming Requests

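Suited to scenarios that do not need results in real time: the complete response is returned in one piece once the request finishes.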

cURL Example

curl https://api.modelverse.cn/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $YOUR_API_KEY" \
  -d '{
    "model": "deepseek-ai/DeepSeek-OCR",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "convert to markdown"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "data:image/jpeg;base64,'$(curl -s https://umodelverse-inference.cn-wlcb.ufileos.com/ucloud-maxcot.jpg | base64 | tr -d '\n')'"
            }
          }
        ]
      }
    ]
  }'

Python Example

import base64
import os
from openai import OpenAI

# Function to encode the image
def encode_image(image_path):
  with open(image_path, "rb") as image_file:
    return base64.b64encode(image_file.read()).decode('utf-8')

# Path to your image
image_path = os.path.expanduser("ucloud.png")

# Getting the base64 string
base64_image = encode_image(image_path)

client = OpenAI(
    api_key=os.getenv("MODELVERSE_API_KEY", "<YOUR_MODELVERSE_API_KEY>"),
    base_url="https://api.modelverse.cn/v1/",
)

response = client.chat.completions.create(
  model="deepseek-ai/DeepSeek-OCR",
  messages=[
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "convert to markdown"
        },
        {
          "type": "image_url",
          "image_url": {
            # The MIME type should match the file; ucloud.png is a PNG
            "url": f"data:image/png;base64,{base64_image}"
          }
        }
      ]
    }
  ]
)

print(response.choices[0].message.content)

Streaming Requests

Setting the stream parameter to true enables a streaming response, so recognition results arrive in real time as they are generated.

cURL Example

curl https://api.modelverse.cn/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $YOUR_API_KEY" \
  -d '{
    "model": "deepseek-ai/DeepSeek-OCR",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "convert to markdown"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "data:image/jpeg;base64,'$(curl -s https://umodelverse-inference.cn-wlcb.ufileos.com/ucloud-maxcot.jpg | base64 | tr -d '\n')'"
            }
          }
        ]
      }
    ],
    "stream": true
  }'

Python Example

import base64
import os
from openai import OpenAI

# Function to encode the image
def encode_image(image_path):
  with open(image_path, "rb") as image_file:
    return base64.b64encode(image_file.read()).decode('utf-8')

# Path to your image
image_path = os.path.expanduser("ucloud.png")

# Getting the base64 string
base64_image = encode_image(image_path)

client = OpenAI(
    api_key=os.getenv("MODELVERSE_API_KEY", "<YOUR_MODELVERSE_API_KEY>"),
    base_url="https://api.modelverse.cn/v1/",
)

stream = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-OCR",
    messages=[
        {
          "role": "user",
          "content": [
            {
              "type": "text",
              "text": "convert to markdown"
            },
            {
              "type": "image_url",
              "image_url": {
                # The MIME type should match the file; ucloud.png is a PNG
                "url": f"data:image/png;base64,{base64_image}"
              }
            }
          ]
        }
    ],
    stream=True,
)

for chunk in stream:
    # Skip chunks that carry no choices (some servers send such chunks)
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)