跳到内容

确定推理链的不确定性

我们希望我们的语言模型能够输出它们对预测的置信程度。为此,我们可以让语言模型使用一种称为自校准 (Self Calibration) 的技术来评估它们对给定提示的响应 1

原始论文使用了对语言模型最终输出进行微调的回归头部。然而,由于我们无法访问模型的最终隐藏状态,我们可以用函数调用来代替它,以达到类似的效果。

我们可以使用以下模板要求语言模型评估其输出

我们可以使用 instructor 实现这一点,如下所示

import instructor
from openai import OpenAI
from pydantic import BaseModel, Field

client = instructor.from_openai(OpenAI())


class SelfCalibration(BaseModel):
    chain_of_thought: str
    is_valid_answer: bool = Field(description="Whether the answer is correct or not")


def evaluate_model_output(original_prompt: str, model_response: str):
    return client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": f"""
                Question: {original_prompt}

                {model_response}

                Is this a valid answer to the question?
                Make sure to examine the question
                thoroughly and generate a complete
                reasoning for why the answer is correct
                or not before responding.
                """,
            }
        ],
        response_model=SelfCalibration,
        model="gpt-4o",
    )


if __name__ == "__main__":
    original_prompt = """
    Question: Who was the third president of the
    United States?
    """
    model_response = """
    Here are some brainstormed ideas: James Monroe
    Thomas Jefferson
    Jefferson
    Thomas Jefferson
    George Washington
    """
    response = evaluate_model_output(original_prompt, model_response)
    print(response.model_dump_json(indent=2))
    """
    {
      "chain_of_thought": "Let's examine the question
      carefully: 'Who was the third president of the
      United States?'\n\nThe brainstormed ideas are:
      \n1. James Monroe\n2. Thomas Jefferson\n3.
      Jefferson\n4. Thomas Jefferson\n5. George
      Washington.\n\nTo determine the validity of these
      answers, I'll cross-check with historical
      records.\n\n1. James Monroe was not the third
      president; he was the fifth president.\n2. Thomas
      Jefferson was indeed the third president of the
      United States.\n3. 'Jefferson' is a correct but
      incomplete answer; it lacks the first name, though
      it is commonly understood.\n4. 'Thomas Jefferson'
      is the full name and correct answer.\n5. George
      Washington was the first president, not the
      third.\n\nTherefore, the correct, valid answer to
      the question 'Who was the third president of the
      United States?' is 'Thomas Jefferson,' and this
      answer is correct.",
      "is_valid_answer": true
    }
    """

参考文献

1: 语言模型(大多)知道它们知道什么