Response Model

LLM output schemas are defined in Pydantic with pydantic.BaseModel. To learn more about Pydantic models, see the Pydantic documentation.

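After defining a Pydantic model, we can pass it as the response_model parameter in the client's create call to OpenAI or any other supported model. The response_model parameter does three things:

Defines the schema and prompt for the language model
Validates the response from the API
Returns a Pydantic model instance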
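
The three responsibilities listed here can be approximated with plain Pydantic, without calling any API. This is only a rough sketch: the `raw_response` string is a made-up stand-in for what a language model might return, and the real client does more work (such as building the request) than is shown.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    age: int


# 1. The schema that can be sent to the language model.
schema = User.model_json_schema()

# 2. Validate a response; `raw_response` is a hypothetical model output.
raw_response = '{"name": "Jason", "age": 25}'
user = User.model_validate_json(raw_response)

# 3. The result is a regular Pydantic model instance.
print(user)
#> name='Jason' age=25

# A malformed response raises an error instead of passing through silently.
try:
    User.model_validate_json('{"name": "Jason", "age": "unknown"}')
except ValidationError as exc:
    error_count = exc.error_count()
    print(error_count)
    #> 1
```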

Prompting

When defining a response model, we can use docstrings and field annotations to define the prompt that will be used to generate the response.

from pydantic import BaseModel, Field


class User(BaseModel):
    """
    This is the prompt that will be used to generate the response.
    Any instructions here will be passed to the language model.
    """

    name: str = Field(description="The name of the user.")
    age: int = Field(description="The age of the user.")

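Here, all docstrings, types, and field annotations are used to generate the prompt. The prompt is built by the client's create method and is then used to generate the response.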
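
One way to see what the docstring and field descriptions contribute is to inspect the JSON schema that Pydantic generates for the model. The sketch below assumes the prompt is derived from this schema; the docstring text is an invented example, and how the client turns the schema into a request is not shown.

```python
from pydantic import BaseModel, Field


class User(BaseModel):
    """Extract the user described in the text."""

    name: str = Field(description="The name of the user.")
    age: int = Field(description="The age of the user.")


schema = User.model_json_schema()

# The class docstring becomes the top-level description.
print(schema["description"])
#> Extract the user described in the text.

# Each Field description is attached to its property.
print(schema["properties"]["age"])
#> {'description': 'The age of the user.', 'title': 'Age', 'type': 'integer'}
```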

Optional Values

If we use Optional together with a default, the field is treated as not required when it is sent to the language model.

from pydantic import BaseModel, Field
from typing import Optional


class User(BaseModel):
    name: str = Field(description="The name of the user.")
    age: int = Field(description="The age of the user.")
    email: Optional[str] = Field(description="The email of the user.", default=None)
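
A quick way to check this, sketched with plain Pydantic, is to look at the `required` list in the generated JSON schema. The sample values `"Jason"` and `25` are placeholders.

```python
from typing import Optional

from pydantic import BaseModel, Field


class User(BaseModel):
    name: str = Field(description="The name of the user.")
    age: int = Field(description="The age of the user.")
    email: Optional[str] = Field(description="The email of the user.", default=None)


schema = User.model_json_schema()

# Only fields without a default are listed as required.
print(schema["required"])
#> ['name', 'age']

# Omitting the optional field falls back to its default.
user = User(name="Jason", age=25)
print(user.email)
#> None
```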

Note that a field can also be left out of what is sent to the language model entirely by using Pydantic's SkipJsonSchema annotation. See Fields for more details.

Dynamic Model Creation

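In some cases we may want to create a model whose fields are specified using runtime information. For this, Pydantic provides the create_model function, which allows models to be created on the fly.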

from pydantic import BaseModel, create_model


class FooModel(BaseModel):
    foo: str
    bar: int = 123


BarModel = create_model(
    'BarModel',
    apple=(str, 'russet'),
    banana=(str, 'yellow'),
    __base__=FooModel,
)
print(BarModel)
#> <class '__main__.BarModel'>
print(BarModel.model_fields.keys())
#> dict_keys(['foo', 'bar', 'apple', 'banana'])

When would I use this?

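Consider a situation where the model is defined dynamically from some configuration or a database. For example, we could have a database table that stores the properties of a model for a given model name or ID. We could then query the database for those properties and use them to create the model.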

SELECT property_name, property_type, description
FROM prompt
WHERE model_name = {model_name}
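
When running a query like this from Python, the model name is usually bound as a parameter rather than interpolated into the SQL string. The sketch below uses the standard-library sqlite3 module with an in-memory database; the table layout and the sample rows are assumptions made for illustration, not part of the original setup.

```python
import sqlite3

# Hypothetical in-memory table mirroring the columns used in the query.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prompt ("
    "model_name TEXT, property_name TEXT, property_type TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO prompt VALUES (?, ?, ?, ?)",
    [
        ("User", "name", "string", "The name of the user."),
        ("User", "age", "integer", "The age of the user."),
        ("Other", "title", "string", "An unrelated property."),
    ],
)

model_name = "User"
# The `?` placeholder lets the driver bind the value instead of formatting it in.
rows = conn.execute(
    "SELECT property_name, property_type, description "
    "FROM prompt WHERE model_name = ? ORDER BY rowid",
    (model_name,),
).fetchall()
conn.close()

print(rows)
#> [('name', 'string', 'The name of the user.'), ('age', 'integer', 'The age of the user.')]
```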

We can then use this information to create the model.

from pydantic import BaseModel, create_model
from typing import List

types = {
    'string': str,
    'integer': int,
    'boolean': bool,
    'number': float,
    'List[str]': List[str],
}

# Mocked cursor.fetchall()
cursor = [
    ('name', 'string', 'The name of the user.'),
    ('age', 'integer', 'The age of the user.'),
    ('email', 'string', 'The email of the user.'),
]

BarModel = create_model(
    'User',
    **{
        property_name: (types[property_type], description)
        for property_name, property_type, description in cursor
    },
    __base__=BaseModel,
)

print(BarModel.model_json_schema())
"""
{
    'properties': {
        'name': {'default': 'The name of the user.', 'title': 'Name', 'type': 'string'},
        'age': {'default': 'The age of the user.', 'title': 'Age', 'type': 'integer'},
        'email': {
            'default': 'The email of the user.',
            'title': 'Email',
            'type': 'string',
        },
    },
    'title': 'User',
    'type': 'object',
}
"""

This is useful when different users have different descriptions for the same model. We can use the same model but give each user a different prompt.
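
In create_model, a `(type, value)` tuple treats the second element as the field's default, which is why the schema printed above lists each text under `default`. If the goal is to surface the text as a field description instead, one possible variant passes `Field(description=...)`. This is a sketch rather than the original example's code, and the name `DescribedUser` is introduced here only for illustration.

```python
from pydantic import BaseModel, Field, create_model

types = {'string': str, 'integer': int}

# Mocked cursor.fetchall(), in the same shape as the rows used earlier
cursor = [
    ('name', 'string', 'The name of the user.'),
    ('age', 'integer', 'The age of the user.'),
]

DescribedUser = create_model(
    'User',
    **{
        # Field(description=...) without a default keeps the field required
        property_name: (types[property_type], Field(description=description))
        for property_name, property_type, description in cursor
    },
    __base__=BaseModel,
)

schema = DescribedUser.model_json_schema()
print(schema['properties']['name'])
#> {'description': 'The name of the user.', 'title': 'Name', 'type': 'string'}
print(schema['required'])
#> ['name', 'age']
```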

Adding Behavior

We can add methods to our Pydantic models just as we would with any plain Python class. We might do this to attach custom logic to the model.

from pydantic import BaseModel
from typing import Literal

from openai import OpenAI

import instructor

client = instructor.from_openai(OpenAI())


class SearchQuery(BaseModel):
    query: str
    query_type: Literal["web", "image", "video"]

    def execute(self):
        print(f"Searching for {self.query} of type {self.query_type}")
        #> Searching for cat of type image
        return "Results for cat"


query = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Search for a picture of a cat"}],
    response_model=SearchQuery,
)

results = query.execute()
print(results)
#> Results for cat

Now we can call execute on the model instance once it has been extracted from the language model. For more examples of this pattern, see our post "RAG is more than embeddings".
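
To try a method like execute without making an API call, the model can be built from a hand-written JSON payload. In the sketch below the payload is a hypothetical stand-in for a language model response, and the body of execute is placeholder logic rather than a real search.

```python
from typing import Literal

from pydantic import BaseModel


class SearchQuery(BaseModel):
    query: str
    query_type: Literal["web", "image", "video"]

    def execute(self) -> str:
        # Placeholder logic; a real implementation would call a search backend.
        return f"Results for {self.query} ({self.query_type})"


# Hypothetical JSON payload standing in for a language model response.
query = SearchQuery.model_validate_json('{"query": "cat", "query_type": "image"}')
results = query.execute()
print(results)
#> Results for cat (image)
```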