
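Structured outputs with Mistral and Instructor: a complete guide

This guide shows how to use Mistral with Instructor to generate structured outputs. You will learn how to use Mistral Large's function calling capability to create type-safe responses.

Mistral Large is Mistral AI's flagship model, with a 32k context window and support for function calling. Function calling is what makes it possible to obtain structured outputs described by a JSON schema.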
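
To make the JSON schema connection concrete, here is a minimal sketch of the schema that Pydantic derives from a model. It assumes Pydantic v2 and uses only `model_json_schema()`; the `UserDetails` model mirrors the one used later in this guide. How Instructor wraps this schema into a Mistral tool definition is not shown here.

```python
from pydantic import BaseModel


class UserDetails(BaseModel):
    name: str
    age: int


# Pydantic v2 derives a JSON schema from the type annotations.
schema = UserDetails.model_json_schema()

print(schema["properties"]["name"]["type"])  # string
print(schema["properties"]["age"]["type"])   # integer
print(schema["required"])                    # ['name', 'age']
```

A schema like this tells the model which fields to return and with which types.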

Quick start

To get started with Instructor and Mistral, install the required package:

pip install "instructor[mistral]"

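⚠️ Important: you must configure your Mistral API key by setting it explicitly on the client:

import os
from mistralai import Mistral

# Read the key from the environment rather than hard-coding it
client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])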
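
Rather than hard-coding the key, one option is to read it from an environment variable and fail early with a clear message when it is missing. The sketch below is an illustration only: `load_mistral_api_key` is a hypothetical helper, not part of Instructor or the Mistral SDK, and it assumes the `MISTRAL_API_KEY` variable name used by the examples in this guide.

```python
import os
from typing import Mapping


def load_mistral_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key, or raise a clear error if it is missing (hypothetical helper)."""
    key = env.get("MISTRAL_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "MISTRAL_API_KEY is not set; export it before creating the client."
        )
    return key
```

The result can then be passed to the client, for example `Mistral(api_key=load_mistral_api_key())`.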

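Available modes

Instructor provides two modes for working with Mistral:

  1. instructor.Mode.MISTRAL_TOOLS: uses Mistral's function calling API to return structured outputs (default)
  2. instructor.Mode.MISTRAL_STRUCTURED_OUTPUTS: uses Mistral's structured output capability

To set the mode for your Mistral client, use the snippet below: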

import os
from pydantic import BaseModel
from mistralai import Mistral
from instructor import from_mistral, Mode


# Initialize with API key
client = Mistral(api_key=os.environ.get("MISTRAL_API_KEY"))

# Enable instructor patches for Mistral client
instructor_client = from_mistral(
    client=client,
    # Set the mode here
    mode=Mode.MISTRAL_TOOLS,
)

Simple user example (sync)

import os
from pydantic import BaseModel
from mistralai import Mistral
from instructor import from_mistral, Mode


class UserDetails(BaseModel):
    name: str
    age: int


# Initialize with API key
client = Mistral(api_key=os.environ.get("MISTRAL_API_KEY"))

# Enable instructor patches for Mistral client
instructor_client = from_mistral(
    client=client,
    mode=Mode.MISTRAL_TOOLS,
)

# Extract a single user
user = instructor_client.chat.completions.create(
    response_model=UserDetails,
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "Jason is 25 years old"}],
    temperature=0,
)

print(user)
# Output: name='Jason' age=25
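
The type safety in this example comes from Pydantic validation. The sketch below runs offline and shows only that validation step, assuming Pydantic v2; the JSON strings are made-up stand-ins for what a model might return, not real API responses.

```python
from pydantic import BaseModel, ValidationError


class UserDetails(BaseModel):
    name: str
    age: int


# A well-formed payload parses into a typed object.
user = UserDetails.model_validate_json('{"name": "Jason", "age": 25}')
print(user.name, user.age)

# A payload with the wrong type is rejected instead of passing through silently.
try:
    UserDetails.model_validate_json('{"name": "Jason", "age": "twenty-five"}')
except ValidationError as exc:
    error_count = exc.error_count()
    print(f"rejected with {error_count} validation error(s)")
```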

Async example

For asynchronous operations, pass use_async=True when creating the client:

import os
import asyncio
from pydantic import BaseModel
from mistralai import Mistral
from instructor import from_mistral, Mode


class User(BaseModel):
    name: str
    age: int


# Initialize with API key
client = Mistral(api_key=os.environ.get("MISTRAL_API_KEY"))

# Enable instructor patches for async Mistral client
instructor_client = from_mistral(
    client=client,
    mode=Mode.MISTRAL_TOOLS,
    use_async=True,
)

async def extract_user():
    user = await instructor_client.chat.completions.create(
        response_model=User,
        messages=[{"role": "user", "content": "Jack is 28 years old."}],
        temperature=0,
        model="mistral-large-latest",
    )
    return user

# Run async function
user = asyncio.run(extract_user())
print(user)
# Output: name='Jack' age=28
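
One common reason to go async is to run several extractions at once. The following is a sketch under stated assumptions, not part of Instructor: `extract_many` is a hypothetical helper built on `asyncio.gather`, and `fake_extract` is a stand-in for a coroutine such as `extract_user` so that the example runs without an API key.

```python
import asyncio
from typing import Awaitable, Callable, List, TypeVar

T = TypeVar("T")


async def extract_many(
    extract_one: Callable[[str], Awaitable[T]],
    texts: List[str],
    max_concurrency: int = 5,
) -> List[T]:
    """Run extract_one over texts concurrently, preserving input order."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(text: str) -> T:
        # Limit how many calls are in flight at any one time.
        async with semaphore:
            return await extract_one(text)

    return await asyncio.gather(*(guarded(text) for text in texts))


# Stand-in for a real extraction coroutine so the sketch runs offline.
async def fake_extract(text: str) -> str:
    await asyncio.sleep(0)
    return text.split()[0]


names = asyncio.run(extract_many(fake_extract, ["Jack is 28", "Jill is 31"]))
print(names)  # ['Jack', 'Jill']
```

The semaphore caps concurrency, which may help when the API enforces rate limits; the default of 5 is an arbitrary choice.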

Nested example

You can also use nested models:

from pydantic import BaseModel
from typing import List
import os
from mistralai import Mistral
from instructor import from_mistral, Mode

class Address(BaseModel):
    street: str
    city: str
    country: str

class User(BaseModel):
    name: str
    age: int
    addresses: List[Address]

# Initialize with API key
client = Mistral(api_key=os.environ.get("MISTRAL_API_KEY"))

# Enable instructor patches for Mistral client
instructor_client = from_mistral(
    client=client,
    mode=Mode.MISTRAL_TOOLS,
)

# Create structured output with nested objects
user = instructor_client.chat.completions.create(
    response_model=User,
    messages=[
        {"role": "user", "content": """
            Extract: Jason is 25 years old.
            He lives at 123 Main St, New York, USA
            and has a summer house at 456 Beach Rd, Miami, USA
        """}
    ],
    model="mistral-large-latest",
    temperature=0,
)

print(user)
# Output (wrapped here for readability):
# name='Jason' age=25 addresses=[
#     Address(street='123 Main St', city='New York', country='USA'),
#     Address(street='456 Beach Rd', city='Miami', country='USA')
# ]
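
Nested models are validated recursively, and errors point at the exact nested field. The offline sketch below assumes Pydantic v2 and reuses the `Address` and `User` models from the example above; the dictionaries are hand-written stand-ins for model output.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Address(BaseModel):
    street: str
    city: str
    country: str


class User(BaseModel):
    name: str
    age: int
    addresses: List[Address]


payload = {
    "name": "Jason",
    "age": 25,
    "addresses": [
        {"street": "123 Main St", "city": "New York", "country": "USA"},
        {"street": "456 Beach Rd", "city": "Miami", "country": "USA"},
    ],
}
user = User.model_validate(payload)
print(user.addresses[1].city)  # Miami

# A missing field inside a nested item is reported with its full path.
incomplete = {
    "name": "Jason",
    "age": 25,
    "addresses": [{"street": "123 Main St", "city": "New York"}],
}
try:
    User.model_validate(incomplete)
except ValidationError as exc:
    error_location = exc.errors()[0]["loc"]
    print(error_location)  # ('addresses', 0, 'country')
```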

Streaming support

Instructor now supports streaming with Mistral. You can use create_partial to build a model incrementally, or create_iterable to stream a collection.

Streaming partial responses

import os
from pydantic import BaseModel
import instructor
from mistralai import Mistral
from instructor.dsl.partial import Partial

class UserExtract(BaseModel):
    name: str
    age: int

# Initialize with API key
client = Mistral(api_key=os.environ.get("MISTRAL_API_KEY"))

# Enable instructor patches for Mistral client
instructor_client = instructor.from_mistral(client)

# Stream partial responses
model = instructor_client.chat.completions.create(
    model="mistral-large-latest",
    response_model=Partial[UserExtract],
    stream=True,
    messages=[
        {"role": "user", "content": "Jason Liu is 25 years old"},
    ],
)

for partial_user in model:
    print(f"Received update: {partial_user}")
# Output might show:
# Received update: name='Jason' age=None
# Received update: name='Jason Liu' age=None
# Received update: name='Jason Liu' age=25
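
When consuming partial updates, it can be convenient to react only when something has changed. The sketch below assumes that a partial stream may emit the same snapshot more than once; `iter_changes` is a hypothetical helper, and plain dictionaries stand in for the partial model objects so that the example runs offline.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def iter_changes(updates: Iterable[T]) -> Iterator[T]:
    """Yield an update only when it differs from the previous one."""
    sentinel = object()
    previous: object = sentinel
    for update in updates:
        if previous is sentinel or update != previous:
            yield update
        previous = update


# Hand-written snapshots standing in for streamed partial objects.
snapshots = [
    {"name": "Jason", "age": None},
    {"name": "Jason", "age": None},
    {"name": "Jason Liu", "age": None},
    {"name": "Jason Liu", "age": 25},
]
changed = list(iter_changes(snapshots))
print(len(changed))  # 3
```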

Streaming iterable collections

import os
from pydantic import BaseModel
import instructor
from mistralai import Mistral

class UserExtract(BaseModel):
    name: str
    age: int

# Initialize with API key
client = Mistral(api_key=os.environ.get("MISTRAL_API_KEY"))

# Enable instructor patches for Mistral client
instructor_client = instructor.from_mistral(client)

# Stream iterable responses
users = instructor_client.chat.completions.create_iterable(
    model="mistral-large-latest",
    response_model=UserExtract,
    messages=[
        {"role": "user", "content": "Make up two people"},
    ],
)

for user in users:
    print(f"Generated user: {user}")
# Output:
# Generated user: name='Emily Johnson' age=32
# Generated user: name='Michael Chen' age=28

Async streaming

Both streaming methods are also available in async form:

import asyncio
import os
from pydantic import BaseModel
import instructor
from mistralai import Mistral
from instructor.dsl.partial import Partial

class UserExtract(BaseModel):
    name: str
    age: int

# Initialize client with async support
client = Mistral(api_key=os.environ.get("MISTRAL_API_KEY"))
instructor_client = instructor.from_mistral(client, use_async=True)

async def stream_partial():
    model = await instructor_client.chat.completions.create(
        model="mistral-large-latest",
        response_model=Partial[UserExtract],
        stream=True,
        messages=[
            {"role": "user", "content": "Jason Liu is 25 years old"},
        ],
    )

    async for partial_user in model:
        print(f"Received update: {partial_user}")

async def stream_iterable():
    users = instructor_client.chat.completions.create_iterable(
        model="mistral-large-latest",
        response_model=UserExtract,
        messages=[
            {"role": "user", "content": "Make up two people"},
        ],
    )

    async for user in users:
        print(f"Generated user: {user}")

# Run async functions
asyncio.run(stream_partial())
asyncio.run(stream_iterable())
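
If you need every streamed item in a list rather than handling each one as it arrives, an async comprehension is enough. The sketch below is illustrative only: `collect` is a hypothetical helper, and `fake_users` is a stand-in async generator so that the example runs without calling the API.

```python
import asyncio
from typing import AsyncIterator, List, TypeVar

T = TypeVar("T")


async def collect(stream: AsyncIterator[T]) -> List[T]:
    """Drain an async iterator into a list."""
    return [item async for item in stream]


# Stand-in async generator so the sketch runs offline.
async def fake_users() -> AsyncIterator[str]:
    for name in ("Emily Johnson", "Michael Chen"):
        await asyncio.sleep(0)
        yield name


users = asyncio.run(collect(fake_users()))
print(users)  # ['Emily Johnson', 'Michael Chen']
```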

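Updates and compatibility

Instructor stays compatible with the latest Mistral API versions and models. Check the changelog for updates to the Mistral integration.

Multimodal

Instructor makes it easy to analyze PDFs and extract semantic information from them with Mistral models. The example below loads a sample PDF with the from_url method. Note that Mistral currently supports only document URLs.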

from instructor.multimodal import PDF
from pydantic import BaseModel
import instructor
from mistralai import Mistral
import os


class Receipt(BaseModel):
    total: int
    items: list[str]


client = instructor.from_mistral(Mistral(api_key=os.environ["MISTRAL_API_KEY"]))

url = "https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/invoice.pdf"

response = client.chat.completions.create(
    model="ministral-8b-latest",
    response_model=Receipt,
    max_tokens=1000,
    messages=[
        {
            "role": "user",
            "content": [
                "Extract out the total and line items from the invoice",
                # PDF also offers from_path() and from_base64(), but per the
                # note above Mistral currently accepts document URLs only
                PDF.from_url(url),
            ],
        },
    ],
)

print(response)
# > total=220 items=['English Tea', 'Tofu']
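
Because Mistral currently accepts only document URLs, it may be worth checking the source before building the request. The sketch below uses only the standard library; `require_document_url` is a hypothetical helper, not part of Instructor, and it checks only the shape of the URL, not whether the document is reachable.

```python
from urllib.parse import urlparse


def require_document_url(source: str) -> str:
    """Return source unchanged if it looks like an http(s) URL, else raise."""
    parsed = urlparse(source)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Expected an http(s) document URL, got: {source!r}")
    return source


url = require_document_url(
    "https://raw.githubusercontent.com/instructor-ai/instructor/main/tests/assets/invoice.pdf"
)
```

A local path such as `invoice.pdf` would raise `ValueError` here instead of failing later inside the API call.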