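Optional Fields
This guide explains how to use optional fields in your data models. Optional fields let the model skip a value when the information is unavailable or uncertain.
Why Use Optional Fields?
Optional fields are useful when:
- Some information is missing from the input text
- Certain fields are only relevant in specific contexts
- The large language model (LLM) cannot reliably extract every field
- You want to allow partial success rather than complete failure
Basic Optional Fields
To make a field optional, use Python's Optional type and provide a default value: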
from typing import Optional
from pydantic import BaseModel
import instructor
from openai import OpenAI

client = instructor.from_openai(OpenAI())

class Person(BaseModel):
    name: str  # Required field
    age: Optional[int] = None  # Optional field with None default
    occupation: Optional[str] = None  # Optional field with None default
Here, name is required, while age and occupation are optional and default to None when no value is found.
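As a quick check of this behavior, the sketch below validates plain dictionaries against the same Person model without calling an LLM. It assumes Pydantic v2 (model_validate and model_json_schema); the sample data is made up for illustration.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    name: str  # Required field
    age: Optional[int] = None
    occupation: Optional[str] = None


# Only the required field is supplied; the optional ones fall back to None.
partial = Person.model_validate({"name": "Alice"})
print(partial.age, partial.occupation)

# The generated JSON schema lists only `name` as required.
required_fields = Person.model_json_schema()["required"]
print(required_fields)

# Omitting a required field still fails validation.
try:
    Person.model_validate({"age": 30})
    missing_name_rejected = False
except ValidationError:
    missing_name_rejected = True
print(missing_name_rejected)
```

The schema check is worth a look because the schema is what describes the expected output shape: only name is marked as required, so a response without age or occupation still validates.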
Using Default Values
You can give optional fields meaningful default values:
from typing import List
from pydantic import BaseModel
import instructor
from openai import OpenAI

client = instructor.from_openai(OpenAI())

class Product(BaseModel):
    name: str
    price: float
    currency: str = "USD"  # Default value
    in_stock: bool = True  # Default value
    tags: List[str] = []  # Default empty list
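The sketch below shows how such defaults behave when fields are omitted or overridden. It assumes Pydantic v2 and, as one possible variation on the model above, uses Field(default_factory=list) to make the per-instance list explicit; the product data is invented for illustration.

```python
from typing import List

from pydantic import BaseModel, Field


class Product(BaseModel):
    name: str
    price: float
    currency: str = "USD"
    in_stock: bool = True
    tags: List[str] = Field(default_factory=list)  # fresh list per instance


widget = Product(name="Widget", price=9.99)
gadget = Product(name="Gadget", price=19.5, currency="EUR", tags=["new"])

# Omitted fields take their defaults; supplied values override them.
print(widget.currency, widget.in_stock, widget.tags)
print(gadget.currency, gadget.tags)

# Mutating one instance's list does not affect a later instance.
widget.tags.append("sale")
other = Product(name="Other", price=1.0)
print(other.tags)
```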
Optional Fields with Validation
You can use Pydantic's Field function for more control and validation:
from typing import Optional
from pydantic import BaseModel, Field
import instructor
from openai import OpenAI

client = instructor.from_openai(OpenAI())

class UserProfile(BaseModel):
    username: str
    email: str
    bio: Optional[str] = Field(
        None,  # Default value
        max_length=200,  # Validation applies if present
        description="User's biography, limited to 200 characters"
    )
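To illustrate that the constraint only matters when a value is present, here is a small sketch that validates the same kind of model locally. It assumes Pydantic v2; the usernames and addresses are placeholders.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class UserProfile(BaseModel):
    username: str
    email: str
    bio: Optional[str] = Field(None, max_length=200)


# A missing bio is accepted and stays None.
no_bio = UserProfile(username="ada", email="ada@example.com")

# A bio that is present must respect the length limit.
try:
    UserProfile(username="ada", email="ada@example.com", bio="x" * 201)
    long_bio_rejected = False
except ValidationError:
    long_bio_rejected = True

print(no_bio.bio, long_bio_rejected)
```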
Optional Nested Structures
An entire nested structure can be optional:
from typing import Optional
from pydantic import BaseModel
import instructor
from openai import OpenAI

client = instructor.from_openai(OpenAI())

class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str

class Contact(BaseModel):
    email: str
    phone: Optional[str] = None
    address: Optional[Address] = None  # Optional nested structure

class Person(BaseModel):
    name: str
    contact: Contact
When working with optional nested structures, check that they exist before accessing them:
# Access nested data safely
if person.contact.address:
    print(f"Address: {person.contact.address.city}")
else:
    print("No address information available")
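The same check can be wrapped in a function and exercised against locally built objects. The sketch below assumes Pydantic v2; describe_city is a hypothetical helper and the contact data is invented.

```python
from typing import Optional

from pydantic import BaseModel


class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str


class Contact(BaseModel):
    email: str
    phone: Optional[str] = None
    address: Optional[Address] = None


class Person(BaseModel):
    name: str
    contact: Contact


def describe_city(person: Person) -> str:
    """Hypothetical helper: read the city only if an address exists."""
    address = person.contact.address
    if address is None:
        return "No address information available"
    return f"Address: {address.city}"


with_address = Person.model_validate({
    "name": "Alice",
    "contact": {
        "email": "alice@example.com",
        "address": {
            "street": "1 Main St",
            "city": "Springfield",
            "state": "IL",
            "zip_code": "62701",
        },
    },
})
without_address = Person.model_validate(
    {"name": "Bob", "contact": {"email": "bob@example.com"}}
)

print(describe_city(with_address))
print(describe_city(without_address))
```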
Using Maybe for Uncertain Fields
Instructor provides a Maybe type for representing uncertain or ambiguous fields:
from pydantic import BaseModel
import instructor
from openai import OpenAI
from instructor.types import Maybe

client = instructor.from_openai(OpenAI())

class PersonInfo(BaseModel):
    name: str
    age: Maybe[int] = None  # Maybe type for uncertain fields
Check whether a Maybe field contains uncertain information:
if person.age and person.age.is_uncertain:
    print(f"Uncertain age: approximately {person.age.value}")
elif person.age:
    print(f"Age: {person.age.value}")
else:
    print("Age: Unknown")
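The branching above relies on two attributes, value and is_uncertain. To exercise that logic without an LLM call, the sketch below uses a hypothetical stand-in class, UncertainInt, that exposes the same two attributes. It is not Instructor's Maybe implementation, and the helper name format_age is likewise made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UncertainInt:
    """Hypothetical stand-in exposing `value` and `is_uncertain`."""

    value: int
    is_uncertain: bool = False


def format_age(age: Optional[UncertainInt]) -> str:
    # Three cases: no value at all, an uncertain value, a confident value.
    if age is None:
        return "Age: Unknown"
    if age.is_uncertain:
        return f"Uncertain age: approximately {age.value}"
    return f"Age: {age.value}"


print(format_age(UncertainInt(30, is_uncertain=True)))
print(format_age(UncertainInt(42)))
print(format_age(None))
```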
For more information about the Maybe type, see the Missing concepts page.
Handling Optional Values
Always handle the possibility of None values in your code:
# Check for None before using
if person.age is not None:
    drinking_age = "Legal" if person.age >= 21 else "Underage"
else:
    drinking_age = "Unknown"

# Use conditional expressions
price_display = f"${product.price}" if product.price is not None else "Price unavailable"

# Provide defaults with 'or'
display_name = user.nickname or user.username
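One detail worth keeping in mind is that `or` falls back on any falsy value, not only on None. The sketch below contrasts the two styles using plain Python; the function names are hypothetical and chosen only for this illustration.

```python
from typing import Optional


def display_age(age: Optional[int]) -> str:
    # `is not None` keeps legitimate falsy values such as 0.
    return str(age) if age is not None else "Unknown"


def display_age_with_or(age: Optional[int]) -> str:
    # `or` treats 0 like a missing value, which is usually not intended.
    return str(age or "Unknown")


def display_name(nickname: Optional[str], username: str) -> str:
    # Falling back on an empty nickname is the desired behavior here.
    return nickname or username


print(display_age(0), display_age_with_or(0))
print(display_name("", "ada"), display_name("Ace", "ada"))
```

A reasonable rule of thumb is to use `or` for strings where an empty value should count as missing, and an explicit `is not None` check for numbers and booleans.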
Validating Optional Fields
Optional fields can still be validated when a value is present:
from typing import Optional
from pydantic import BaseModel, field_validator
import instructor
from openai import OpenAI
import re

client = instructor.from_openai(OpenAI())

class ContactInfo(BaseModel):
    email: str
    phone: Optional[str] = None

    @field_validator('phone')
    @classmethod
    def validate_phone(cls, v):
        if v is not None and not re.match(r'^\+?[1-9]\d{1,14}$', v):
            raise ValueError("Invalid phone format")
        return v
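To see the validator in action, the sketch below runs the same model against a few hand-written inputs. It assumes Pydantic v2; is_valid is a hypothetical helper and the phone numbers are sample values.

```python
import re
from typing import Optional

from pydantic import BaseModel, ValidationError, field_validator


class ContactInfo(BaseModel):
    email: str
    phone: Optional[str] = None

    @field_validator("phone")
    @classmethod
    def validate_phone(cls, v: Optional[str]) -> Optional[str]:
        # None passes through untouched; only real values are checked.
        if v is not None and not re.match(r"^\+?[1-9]\d{1,14}$", v):
            raise ValueError("Invalid phone format")
        return v


def is_valid(data: dict) -> bool:
    """Hypothetical helper: True if `data` validates as ContactInfo."""
    try:
        ContactInfo.model_validate(data)
        return True
    except ValidationError:
        return False


print(is_valid({"email": "a@example.com"}))
print(is_valid({"email": "a@example.com", "phone": "+14155552671"}))
print(is_valid({"email": "a@example.com", "phone": "555-1234"}))
```

The `v is not None` guard matters because the validator also runs when phone is passed explicitly as None.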