Self-Verify Responses

We want to verify that a large language model (LLM)'s responses are correct. How can we automate this process?

The Self-Verification framework [1] generates multiple candidate responses and then uses the LLM itself to verify those candidates. The process consists of two stages:

  1. Forward reasoning
  2. Backward verification

Forward Reasoning

In the forward reasoning stage, we use chain-of-thought (CoT) prompting to generate multiple candidate solutions.
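For instance, forward reasoning can be as simple as sampling the same chain-of-thought prompt several times. The sketch below is a minimal illustration assuming the plain OpenAI Python client; the prompt wording and the temperature value are illustrative choices, with temperature kept above zero so that repeated sampling yields diverse candidates.

from openai import OpenAI

client = OpenAI()


def generate_candidates(question: str, n: int = 3) -> list[str]:
    # Sample n chain-of-thought answers; temperature > 0 diversifies them
    return [
        client.chat.completions.create(
            model="gpt-4o",
            temperature=0.7,  # assumed value; any setting > 0 varies the samples
            messages=[
                {"role": "user", "content": f"Think step by step: {question}"}
            ],
        )
        .choices[0]
        .message.content
        for _ in range(n)
    ]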

Backward Verification

Backward verification consists of three steps.

Rewrite into a Declarative Statement

Rewrite the original question and its solution into a single declarative statement.

Example of a Rewritten Declarative Statement

Original question: Jackie has 10 apples. Adam has 8 apples. How many more apples does Jackie have than Adam?

Candidate response: Jackie has 10 apples. So Jackie has 10-8=2 more apples than Adam. The answer is 2.

Rewritten declarative statement: Jackie has 10 apples. Adam has 8 apples. Jackie has 2 more apples than Adam.
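This rewriting step can itself be delegated to the LLM. The sketch below mirrors the rewrite helper in the implementation further down, but as a standalone snippet; the Declarative model name and the prompt wording are illustrative assumptions.

import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(OpenAI())


class Declarative(BaseModel):
    sentence: str


def to_declarative(question: str, answer: str) -> Declarative:
    # Ask the LLM to merge a question and its answer into one declarative sentence
    return client.chat.completions.create(
        model="gpt-4o",
        response_model=Declarative,
        messages=[
            {
                "role": "user",
                "content": (
                    "Please change the question and answer into a complete "
                    f"declarative sentence.\nQuestion: {question}\nAnswer: {answer}"
                ),
            }
        ],
    )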

Construct New Questions

Construct a new question and prompt the LLM to verify it. There are two possible methods:

  1. True-False Item Verification (TFV)
  2. Condition Mask Verification (CMV)

True-False Item Verification (TFV) asks the LLM whether the rewritten declarative statement is correct. Condition Mask Verification (CMV) masks out a condition provided in the original question and asks the LLM to predict the masked condition (a sketch of CMV scoring follows the example prompts below).

Example TFV Prompt

Jackie has 10 apples. Adam has 8 apples. Jackie has 2 more apples than Adam. Is this correct?

Example CMV Prompt

Jackie has X apples. Adam has 8 apples. Jackie has 2 more apples than Adam. What is X?
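The implementation at the end of this page exercises only TFV, so here is a minimal sketch of how CMV scoring could look with instructor. The Masked model, the exact-match string comparison, and the prompt are assumptions for illustration, not part of the original recipe.

import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(OpenAI())


class Masked(BaseModel):
    value: str  # the LLM's prediction for the masked condition X


def cmv_score(masked_question: str, true_value: str, k: int = 5) -> int:
    # Query k times and count how often the prediction matches the true condition
    predictions = [
        client.chat.completions.create(
            model="gpt-4o",
            response_model=Masked,
            messages=[{"role": "user", "content": masked_question}],
        )
        for _ in range(k)
    ]
    return sum(1 for p in predictions if p.value.strip() == true_value)


score = cmv_score(
    "Jackie has X apples. Adam has 8 apples. "
    "Jackie has 2 more apples than Adam. What is X?",
    true_value="10",
)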

Compute Verification Scores

Then, for each candidate, query the LLM k times with the new question. With TFV, the verification score is the number of times the LLM outputs True; with CMV, it is the number of times the predicted masked value matches the true value.

The candidate response with the highest verification score is selected as the final answer.
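Selection is then a simple argmax over the scores. A short sketch, assuming candidates and their verification scores are kept in parallel lists:

def select_best(candidates: list[str], scores: list[int]) -> str:
    # Return the candidate with the highest verification score (ties keep the first)
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return candidates[best_index]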

Implementation

The complete pipeline, covering both forward reasoning and backward verification, can be implemented using instructor as seen below.

import instructor
from pydantic import BaseModel
from openai import OpenAI
from typing import Literal

client = instructor.from_openai(OpenAI())

n = 3  # Number of candidates to generate
k = 5  # Number of times to verify



class Candidate(BaseModel):
    reasoning_steps: list[str]
    month: str


class Rewritten(BaseModel):
    declarative: str


class Verification(BaseModel):
    correct: Literal["True", "False"]


def query_llm(query, model):
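    # Forward reasoning helper: prompt with CoT ("Think step by step") and parse the reply into `model`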
    return client.chat.completions.create(
        model="gpt-4o",
        response_model=model,
        messages=[
            {
                "role": "user",
                "content": f"Think step by step: {query}",
            }
        ],
    )


def rewrite(query, candidate):
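    # Step 2a helper: merge the question and the candidate's answer into one declarative sentence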
    return client.chat.completions.create(
        model="gpt-4o",
        response_model=Rewritten,
        messages=[
            {
                "role": "user",
                "content": f"""
                    Please change the questions and answers into complete declarative sentences
                    {query}
                    The answer is {candidate.month}.
                """,
            }
        ],
    )


def verify(question):
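    # Step 2b/2c helper: ask the LLM to judge the rewritten statement True or False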
    return client.chat.completions.create(
        model="gpt-4o",
        response_model=Verification,
        messages=[{"role": "user", "content": question}],
    )


if __name__ == "__main__":
    query = "What month is it now if it has been 3 weeks, 10 days, and 2 hours since May 1, 2024 6pm?"

    # Step 1: Forward Reasoning
    candidates = [query_llm(query, Candidate) for _ in range(n)]

    # Step 2: Backward Verification
    for candidate in candidates:
        # 2.a Rewrite
        rewritten = rewrite(query, candidate)
        # 2.b Construct new questions
        question = f"{rewritten.declarative} Is this correct (True or False)?"
        # 2.c Compute verification score
        scores = [verify(question).correct for _ in range(k)]
        verification_score = sum(1 for s in scores if s == "True")

        print(f"Candidate: {candidate.month}, Verification Score: {verification_score}")
        #> Candidate: May, Verification Score: 0
        #> Candidate: June, Verification Score: 2
        #> Candidate: May, Verification Score: 1

References

1: Large Language Models are Better Reasoners with Self-Verification

*: The Prompt Report: A Systematic Survey of Prompting Techniques