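Simple Object Extraction
This guide shows how to extract simple objects with defined fields from text, the most common pattern in structured data extraction.
Basic Example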
from pydantic import BaseModel
import instructor
from openai import OpenAI

# Define the structure you want to extract
class Person(BaseModel):
    name: str
    age: int
    occupation: str

# Extract the structured data
client = instructor.from_openai(OpenAI())
person = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "John Smith is a 35-year-old software engineer."}
    ],
    response_model=Person
)

print(f"Name: {person.name}")
print(f"Age: {person.age}")
print(f"Occupation: {person.occupation}")
┌─────────────────┐              ┌─────────────────┐
│ Define Model    │              │ Extracted       │
│ name: str       │   Extract    │ name: "John"    │
│ age: int        │ ──────────>  │ age: 35         │
│ occupation: str │              │ occupation:     │
└─────────────────┘              │ "software..."   │
                                 └─────────────────┘
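The value returned by the extraction call is an ordinary Pydantic model instance, so it can be inspected and serialized like any other. The sketch below builds a `Person` by hand as a stand-in for the API response (so no network call is needed) and assumes Pydantic v2 for `model_dump`.

```python
from pydantic import BaseModel


class Person(BaseModel):
    name: str
    age: int
    occupation: str


# Built by hand here; in the basic example this instance comes from the client call.
person = Person(name="John Smith", age=35, occupation="software engineer")

# Fields are typed attributes, and the whole object converts to a plain dict.
person_dict = person.model_dump()
print(person_dict)
```

Because the result is already typed, downstream code can use `person.age` as an `int` without any manual parsing.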
Using Field Descriptions
Adding descriptions helps the model understand what to extract:
from pydantic import BaseModel, Field

class Book(BaseModel):
    title: str = Field(description="The full title of the book")
    author: str = Field(description="The author's full name")
    publication_year: int = Field(description="The year the book was published")

Field descriptions act as instructions for the extraction process.
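One way to see why descriptions behave like instructions is to look at the JSON schema that Pydantic generates for the model, since a schema of this kind is what gets handed to the LLM. This is a minimal sketch that assumes Pydantic v2 (`model_json_schema`); the exact payload instructor sends may differ by provider and mode.

```python
from pydantic import BaseModel, Field


class Book(BaseModel):
    title: str = Field(description="The full title of the book")
    author: str = Field(description="The author's full name")
    publication_year: int = Field(description="The year the book was published")


# Pydantic v2: generate the JSON schema for the model.
schema = Book.model_json_schema()

# Each description is attached to its field in the schema.
descriptions = {
    name: prop["description"] for name, prop in schema["properties"].items()
}
for name, text in descriptions.items():
    print(f"{name}: {text}")
```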
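Handling Optional Fields
Sometimes the text does not contain all of the information:
from typing import Optional
from pydantic import BaseModel

class MovieReview(BaseModel):
    title: str
    director: Optional[str] = None  # Optional field
    rating: float

Because director is declared as Optional with a default of None, a review that never mentions the director still validates instead of raising an error.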
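The difference between optional and required fields can be checked locally, without calling a model. In this sketch the movie title and rating are illustrative values, not taken from the guide.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class MovieReview(BaseModel):
    title: str
    director: Optional[str] = None  # Optional field
    rating: float


# No director given: the default None is used.
review = MovieReview(title="Inception", rating=4.5)
print(review.director)

# A required field (rating) is missing: validation fails.
try:
    MovieReview(title="Inception")
    missing_rating_ok = True
except ValidationError:
    missing_rating_ok = False
print(missing_rating_ok)
```

Only fields that the source text may genuinely omit should be optional; required fields keep the guarantee that the value is present.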
Adding Simple Validation
You can add basic validation rules:
from pydantic import BaseModel, Field

class Product(BaseModel):
    name: str
    price: float = Field(gt=0, description="The product price in USD")
    in_stock: bool

This example ensures that price is greater than zero.
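To illustrate what the `gt=0` constraint does, the sketch below validates two hand-written products locally. The product name and prices are made up for the example.

```python
from pydantic import BaseModel, Field, ValidationError


class Product(BaseModel):
    name: str
    price: float = Field(gt=0, description="The product price in USD")
    in_stock: bool


# A positive price passes validation.
valid = Product(name="Keyboard", price=49.99, in_stock=True)

# A negative price violates gt=0 and raises ValidationError.
failed_fields = []
try:
    Product(name="Keyboard", price=-5, in_stock=True)
    rejected = False
except ValidationError as exc:
    rejected = True
    # Each error records the location of the offending field.
    failed_fields = [err["loc"][0] for err in exc.errors()]
print(rejected, failed_fields)
```

When the same error occurs during extraction, instructor can send it back to the model and ask again; the `max_retries` argument of the create call controls how many attempts are made.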
A Practical Example
Here is a more realistic example:
from typing import Optional

import instructor
from openai import OpenAI
from pydantic import BaseModel

class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str

class ContactInfo(BaseModel):
    name: str
    email: str
    phone: Optional[str] = None
    address: Optional[Address] = None  # Nested model, also optional

# Extract structured data
client = instructor.from_openai(OpenAI())
contact = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": """
            Contact information:
            Name: Sarah Johnson
            Email: sarah.j@example.com
            Phone: (555) 123-4567
            Address: 123 Main St, Boston, MA 02108
        """}
    ],
    response_model=ContactInfo
)

print(f"Name: {contact.name}")
print(f"Email: {contact.email}")
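Since `address` is optional, code that reads the nested fields should handle the case where no address was extracted. The sketch below uses hand-built `ContactInfo` objects in place of API responses; `format_city` is a hypothetical helper introduced only for this illustration.

```python
from typing import Optional

from pydantic import BaseModel


class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str


class ContactInfo(BaseModel):
    name: str
    email: str
    phone: Optional[str] = None
    address: Optional[Address] = None


def format_city(contact: ContactInfo) -> str:
    """Return 'City, ST', or a placeholder when no address was extracted."""
    if contact.address is None:
        return "(no address)"
    return f"{contact.address.city}, {contact.address.state}"


# Hand-built stand-ins for extraction results.
with_address = ContactInfo(
    name="Sarah Johnson",
    email="sarah.j@example.com",
    phone="(555) 123-4567",
    address=Address(street="123 Main St", city="Boston", state="MA", zip_code="02108"),
)
without_address = ContactInfo(name="Sarah Johnson", email="sarah.j@example.com")

print(format_city(with_address))
print(format_city(without_address))
```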