OpenAI官方吴达恩《ChatGPT Prompt Engineering 提示词工程师》(1）指南：提示LLM的原则

简介

教学目标

在这门课程中，我们将与您分享一些可能性，以及如何实现这些可能性的最佳实践。

首先，您将学习一些用提示词做一个app开发的最佳实践。
然后我们将介绍一些常见用例，例如总结、推断、转换、扩展。
最后并带您使用LLM构建聊天机器人。

Two Types of large language models (LLMs)

base LLMs 基础大模型：基于文本训练数据来预测做“文字接龙”，通常是在互联网和其他来源的大量数据上进行训练，以确定下一个最有可能的单词是什么
instruction tuned LLMs 指令调整型模型：是LLM研究和实践的巨大动力所在，指令调整型LLM接受了遵循指示的培训。你首先使用已经大量文本数据上训练过的基本LLM，然后使用输入和输出的指令来进行微调，让它更好地遵循这些指令。然后，通常使用一种叫做DLUF（人类反馈强化学习）的技术进一步优化以使系統能够更好地提供帮助和遵循指令，所以它们更有可能输出有益、诚实和无害的文本

指南/Guidelines

提示LLM的原则

编写明确和具体的指令
给LLM足够的时间思考

环境准备

使用OpenAI Python库来访问OpenAI API

pip install openai

import openai
import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

openai.api_key = os.getenv('OPENAI_API_KEY')

不需要执行任何此操作。只需运行此代码，因为在环境中设置了API密钥。

定义帮助函数

定义帮助函数，以便更轻松地使用提示并查看生成的输出。
函数getCompletion接收提示并返回该提示的完成结果。

def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0, # this is the degree of randomness of the model's output
    )
    return response.choices[0].message["content"]

原则1：Write clear and specific instructions

Tactic 1：Use delimiters 使用区分符

使用分隔符，明确指示输入的不同部分
这些分隔符可以是任何明确的标点符号，例如三个反引号，引号，XML标记，部分标题
让模型清楚地知道哪些是独立的部分，以避免提示注入。
提示注入（Prompt Injection）：是指用户添加输入到提示中，可能会与我们的指令相矛盾，导致模型遵循用户的指令而不是我们的指令。

text = f"""
You should express what you want a model to do by \ 
providing instructions that are as clear and \ 
specific as you can possibly make them. \ 
This will guide the model towards the desired output, \ 
and reduce the chances of receiving irrelevant \ 
or incorrect responses. Don't confuse writing a \ 
clear prompt with writing a short prompt. \ 
In many cases, longer prompts provide more clarity \ 
and context for the model, which can lead to \ 
more detailed and relevant outputs.
"""
prompt = f"""
Summarize the text delimited by triple backticks \ 
into a single sentence.
``{text}`` #这里用三个单引号
"""
response = get_completion(prompt)
print(response)
"""
To guide a model towards the desired output and reduce the chances of irrelevant or incorrect responses, it is important to provide clear and specific instructions, which may require longer prompts for more clarity and context.
"""

Tactic 2: Ask for structured output结构化输出

为了更容易解析模型输出，要求模型以HTML或JSON等结构化格式提供输出
我们要求模型生成三本虚构图书的书名、作者和类型，并使用JSON格式以特定键的形式提供输出。

prompt = f"""
Generate a list of three made-up book titles along \ 
with their authors and genres. 
Provide them in JSON format with the following keys: 
book_id, title, author, genre.
"""
response = get_completion(prompt)
print(response)
"""
[
  {
    "book_id": 1,
    "title": "The Lost City of Zorath",
    "author": "Aria Blackwood",
    "genre": "Fantasy"
  },
  {
    "book_id": 2,
    "title": "The Last Survivors",
    "author": "Ethan Stone",
    "genre": "Science Fiction"
  },
  {
    "book_id": 3,
    "title": "The Secret Life of Bees",
    "author": "Lila Rose",
    "genre": "Romance"
  }
]
"""

这种输出方式的好处是，我们可以在Python中将其读入字典或列表中。

Tactic 3: Check whether conditions are satisfied条件是否满足

如果任务有一些假设并不一定满足，我们可以告诉模型先检查这些假设，如果不满足，则指出并停止任务
举例：我们给出了一段描述泡茶步骤的段落，然后我们用一个 prompt 让模型提取这些指令。如果它在文本中找不到任何指令，我们让它输出“no steps provided”。

text_1 = f"""
Making a cup of tea is easy! First, you need to get some \ 
water boiling. While that's happening, \ 
grab a cup and put a tea bag in it. Once the water is \ 
hot enough, just pour it over the tea bag. \ 
Let it sit for a bit so the tea can steep. After a \ 
few minutes, take out the tea bag. If you \ 
like, you can add some sugar or milk to taste. \ 
And that's it! You've got yourself a delicious \ 
cup of tea to enjoy.
"""
prompt = f"""
You will be provided with text delimited by triple quotes. 
If it contains a sequence of instructions, \ 
re-write those instructions in the following format:

Step 1 - ...
Step 2 - …
…
Step N - …

If the text does not contain a sequence of instructions, \ 
then simply write "No steps provided."

"""{text_1}"""
"""
response = get_completion(prompt)
print("Completion for Text 1:")
print(response)
Completion for Text 1:
Step 1 - Get some water boiling.
Step 2 - Grab a cup and put a tea bag in it.
Step 3 - Once the water is hot enough, pour it over the tea bag.
Step 4 - Let it sit for a bit so the tea can steep.
Step 5 - After a few minutes, take out the tea bag.
Step 6 - Add some sugar or milk to taste.
Step 7 - Enjoy your delicious cup of tea!

在下面中，模型判断没有找到任何指令。

text_2 = f"""
The sun is shining brightly today, and the birds are \
singing. It's a beautiful day to go for a \ 
walk in the park. The flowers are blooming, and the \ 
trees are swaying gently in the breeze. People \ 
are out and about, enjoying the lovely weather. \ 
Some are having picnics, while others are playing \ 
games or simply relaxing on the grass. It's a \ 
perfect day to spend time outdoors and appreciate the \ 
beauty of nature.
"""
prompt = f"""
You will be provided with text delimited by triple quotes. 
If it contains a sequence of instructions, \ 
re-write those instructions in the following format:

Step 1 - ...
Step 2 - …
…
Step N - …

If the text does not contain a sequence of instructions, \ 
then simply write "No steps provided."

"""{text_2}"""
"""
response = get_completion(prompt)
print("Completion for Text 2:")
print(response)
"""
Completion for Text 2:
No steps provided.
"""

Tactic 4: Few-shot prompting 少样本提示

这种方法是在让模型执行实际任务之前，提供已经成功执行所需任务的示例。

prompt = f"""
Your task is to answer in a consistent style.

: Teach me about patience.

: The river that carves the deepest \ 
valley flows from a modest spring; the \ 
grandest symphony originates from a single note; \ 
the most intricate tapestry begins with a solitary thread.

: Teach me about resilience.
"""
response = get_completion(prompt)
print(response)
"""
: Resilience is like a tree that bends with the wind but never breaks. It's the ability to bounce back from adversity and keep moving forward, even when things get tough. Just like a tree needs strong roots to withstand the storm, we need to cultivate inner strength and perseverance to overcome life's challenges.
"""

原则2：Give the model time to think

Tactic 1: Specify the steps to complete a task 给定步骤来补全任务

Step 1:
Step 2:
Step N:
举例：
我们给模型提供了一个包含Jack and Jill故事概述的段落，并且使用明确的步骤指示模型完成四个任务：
首先，用一句话来概括文本
其次将概述翻译成法语
然后列出法语概述中的每个名称
并且输出一个JSON对象包含"French summary"和"num names"两个key。

text = f"""
In a charming village, siblings Jack and Jill set out on \ 
a quest to fetch water from a hilltop \ 
well. As they climbed, singing joyfully, misfortune \ 
struck—Jack tripped on a stone and tumbled \ 
down the hill, with Jill following suit. \ 
Though slightly battered, the pair returned home to \ 
comforting embraces. Despite the mishap, \ 
their adventurous spirits remained undimmed, and they \ 
continued exploring with delight.
"""
# example 1
prompt_1 = f"""
Perform the following actions: 
1 - Summarize the following text delimited by triple \
backticks with 1 sentence.
2 - Translate the summary into French.
3 - List each name in the French summary.
4 - Output a json object that contains the following \
keys: french_summary, num_names.

Separate your answers with line breaks.

Text:
`{text}`
"""
response = get_completion(prompt_1)
print("Completion for prompt 1:")
print(response)
"""
Completion for prompt 1:
Two siblings, Jack and Jill, go on a quest to fetch water from a well on a hilltop, but misfortune strikes and they both tumble down the hill, returning home slightly battered but with their adventurous spirits undimmed.

Deux frères et sœurs, Jack et Jill, partent en quête d'eau d'un puits sur une colline, mais un malheur frappe et ils tombent tous les deux de la colline, rentrant chez eux légèrement meurtris mais avec leurs esprits aventureux intacts. 
Noms: Jack, Jill.

{
  "french_summary": "Deux frères et sœurs, Jack et Jill, partent en quête d'eau d'un puits sur une colline, mais un malheur frappe et ils tombent tous les deux de la colline, rentrant chez eux légèrement meurtris mais avec leurs esprits aventureux intacts.",
  "num_names": 2
}
"""

下面的例子做一点修改，我们要求类似的内容，要求相同的步骤，然后要求模型使用以下格式：文本、摘要、翻译、名称和输出JSON。具有更加标准化的格式

prompt_2 = f"""
Your task is to perform the following actions: 
1 - Summarize the following text delimited by 
  <> with 1 sentence.
2 - Translate the summary into French.
3 - List each name in the French summary.
4 - Output a json object that contains the 
  following keys: french_summary, num_names.

Use the following format: #这部分更加的标准化
Text: 
Summary:

Translation: Names: Output JSON: Text: <{text}> """ response = get_completion(prompt_2) print("\nCompletion for prompt 2:") print(response) """ Completion for prompt 2: Summary: Jack and Jill go on a quest to fetch water, but misfortune strikes and they tumble down the hill, returning home slightly battered but with their adventurous spirits undimmed. Translation: Jack et Jill partent en quête d'eau, mais un malheur frappe et ils tombent de la colline, rentrant chez eux légèrement meurtris mais avec leurs esprits aventureux intacts. Names: Jack, Jill Output JSON: {"french_summary": "Jack et Jill partent en quête d'eau, mais un malheur frappe et ils tombent de la colline, rentrant chez eux légèrement meurtris mais avec leurs esprits aventureux intacts.", "num_names": 2} """

Tactic 2: Instruct the model to work out its own solution before rushing to a conclusion让模型先梳理计算再给结论

在线例子中，要求模型判断学生的解答是否正确。
首先，我们有这个数学问题，然后是学生的解答。
事实上，学生的解答是错误的，但是如果我们运行这个单元格代码，模型会说学生的解答是正确的。

prompt = f"""
Determine if the student's solution is correct or not.

Question:
I'm building a solar power installation and I need \
 help working out the financials. 
Land costs $100 / square foot
I can buy solar panels for $250 / square foot
I negotiated a contract for maintenance that will cost \ 
me a flat $100k per year, and an additional $10 / square \
foot
What is the total cost for the first year of operations 
as a function of the number of square feet.

Student's Solution:
Let x be the size of the installation in square feet.
Costs:
Land cost: 100x
Solar panel cost: 250x
Maintenance cost: 100,000 + 100x
Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
"""
response = get_completion(prompt)
print(response)
"""
The student's solution is correct.
"""

模型只是按照我的思考方式匆匆看过它，然后同意了学生的解决方案，因此，我们可以通过指示模型先计算自己的解答，然后再将其与学生的解答进行比较来解决这个问题。
我们告诉模型如下：
您的任务是确定学生的解答是否正确。要解决这个问题，请执行以下步骤。
首先，计算出您自己的解答。
然后将您的解答与学生的解答进行比较，并评估学生的解答是否正确。
在你自己解题之前，不要决定学生的解答是否正确。确保您自己解决了这个问题。

prompt = f"""
Your task is to determine if the student's solution \
is correct or not.
To solve the problem do the following:
First, work out your own solution to the problem. 
Then compare your solution to the student's solution \ 
and evaluate if the student's solution is correct or not. 
Don't decide if the student's solution is correct until 
you have done the problem yourself.

Use the following format:
Question:
...
question here
...
Student's solution:
...
student's solution here
...
Actual solution:
...
steps to work out the solution and your solution here
...
Is the student's solution the same as actual solution \
just calculated:
...
yes or no
...
Student grade:
...
correct or incorrect
...

Question:
...
I'm building a solar power installation and I need help \
working out the financials. 
Land costs $100 / square foot
I can buy solar panels for $250 / square foot
I negotiated a contract for maintenance that will cost \
me a flat $100k per year, and an additional $10 / square \
foot
What is the total cost for the first year of operations \
as a function of the number of square feet.

Student's solution:
...
Let x be the size of the installation in square feet.
Costs:
Land cost: 100x
Solar panel cost: 250x
Maintenance cost: 100,000 + 100x
Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
...
Actual solution:
"""
response = get_completion(prompt)
print(response)
"""
Let x be the size of the installation in square feet.

Costs:
Land cost: 100x
Solar panel cost: 250x
Maintenance cost: 100,000 + 10x

Total cost: 100x + 250x + 100,000 + 10x = 360x + 100,000

Is the student's solution the same as actual solution just calculated:
No

Student grade:
Incorrect
"""

Model Limitations局限：模型幻觉

什么是模型幻觉：模型在训练过程中接触到大量的知识，它并不是完美地记住了它看到的信息，因此它并不很好地知道它的知识边界。这意味着它可能会尝试回答关于晦涩话题的问题，并且可能会编造听起来合理但实际上不正确的内容。我们将这些编造的想法称为幻觉。
举例：下列中模型从一个真正的牙刷公司创造了一个虚构的产品名字。因此，提示是“告诉我关于Boy的AeroGlide Ultra Slim Smart Toothbrush

prompt = f"""
Tell me about AeroGlide UltraSlim Smart Toothbrush by Boie
"""
response = get_completion(prompt)
print(response)
"""
The AeroGlide UltraSlim Smart Toothbrush by Boie is a high-tech toothbrush that uses advanced sonic technology to provide a deep and thorough clean. It features a slim and sleek design that makes it easy to hold and maneuver, and it comes with a range of smart features that help you optimize your brushing routine.

One of the key features of the AeroGlide UltraSlim Smart Toothbrush is its advanced sonic technology, which uses high-frequency vibrations to break up plaque and bacteria on your teeth and gums. This technology is highly effective at removing even the toughest stains and buildup, leaving your teeth feeling clean and fresh.

In addition to its sonic technology, the AeroGlide UltraSlim Smart Toothbrush also comes with a range of smart features that help you optimize your brushing routine. These include a built-in timer that ensures you brush for the recommended two minutes, as well as a pressure sensor that alerts you if you're brushing too hard.

Overall, the AeroGlide UltraSlim Smart Toothbrush by Boie is a highly advanced and effective toothbrush that is perfect for anyone looking to take their oral hygiene to the next level. With its advanced sonic technology and smart features, it provides a deep and thorough clean that leaves your teeth feeling fresh and healthy.
"""

在希望模型基干文本生成答案的情况下，减少幻觉的一种额外策略是要求模型首先从文本中找到任何相关的引用，然后要求它使用这些引用来回答。并且可以追溯答案回源文档通常可
以帮助减少这些幻觉。

李又懂