Prompt Engineering with LLMs
This Blog will discuss the basics of prompt engineering using LLM.
There are broadly two types of Large Language Models:
- Base LLM — They are the LLMs which we most of us had studied to build. Here we train our model on large corpus of data from web or other sources in such a way that the model predicts the next words given a set of words. For example, if I write, “Paris is famed for” then the model might complete the sentence as “Paris is famed for its fashion houses.” But if I give a prompt to this model such as “What is the capital of France?” then instead of answering with Paris, this model would suggest more questions like “What is France’s largest city?”, “What is France’s population?”, etc. This is because of the nature of training of base LLMs. we did not train it to answer queries as such but to complete sentences with next most probable token.
- Instruction Tuned LLM — In contrast, Instruction Tuned LLMs are trained to follow instruction. This is done in order to get answers of our prompts like in above example, when prompted with “What is the capital of France?”, this model should answer “The capital of France is Paris.” These model are trained on top of our base LLMs. We put two more layers of training on our base LLMs. First we fine tune it to lots of inputs (instructions) — outputs (attempt to follow the instruction) pairs. Then we apply Reinforcement Learning with Human feedback (RLHF) to make sure our model is logical and is following the ethical aspects of AI.
Intuition behind Instruction Tuning
Suppose you are running a startup in healthcare domain and you hire a smart employee well versed in programming. But that employee is not useful to you untill you do not provide all the necessary intructions, specific to your startup, to him. Once he has understood the instructions then only he can apply his skillset to business logic. This is what we do in instruction tuning. Base LLM is that smart employee whom we need to provide instruction to perform tasks for us.
Guidelines for Instructions (Prompting)
So basically we as prompt engineer act as the channel between user application and chatGPT. Our main role is to develop, refine and optimize LLM generated texts to ensure they are inline and accurate to the application standards. We take the input text, feed that text to the LLM with our prompts (instructions) . Our instructions are something which ask the LLM to do the optimization and refining on the given text. Based on the input text and our instructions, we receive response from LLM. Then we can return that response to the end user. We can use any LLM but say we are using chatGPT.
First and foremost, we should have access to chatGPT API. We need to install openAI python library and set the openAI API secret key (can be obtained on openAI website). Then we just need to set up the API gateway using openAI chat completion. Ex:
import openai
openai.ChatCompletion.create(
model="gpt-3.5-turbo",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Who won the world series in 2020?"},
{"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
{"role": "user", "content": "Where was it played?"}
]
)
We can get responses from the chatGPT using this set up. But before that, we need to perform instruction tuning using prompts. There are mainly two main principles of instruction tuning:
Writing clear and specific instructions — To make our instruction clear and specific, you should use delimiters such as quotes, backticks, dashes, brackets, xml tags, etc. These delimiters helps us to avoid prompt injection issues. Prompt Injection is when user is able to add input in our prompt which can cause conflicting instructions for our model. This also helps in separating the prompt from rest of the text. Also, we should ask for structured responses from our models. For example, in json or xml. This will help us to use various handy data structures like dictionary or list in python. We can also ask the model to check whether the conditions are satisfied. It is like exception handling but with prompt. For example, if you have a paragraph in which recipe of a food is described, you can ask the chatGPT to write them in steps such as:
prompt = f"""
You will be provided with text delimited by triple quotes.
If it contains a sequence of instructions, \
re-write those instructions in the following format:
Step 1 - ...
Step 2 - …
…
Step N - …
If the text does not contain a sequence of instructions, \
then simply write \"No steps provided.\"
\"\"\"{text}\"\"\"
"""
messages = [{"role": "user", "content": prompt}]
response = openai.ChatCompletion.create(
model=model,
messages=messages,
)
Give the model time to think — If some complex problems are there in query, we can instruct our model to take time to think propoerly and not jump on wrong answers in hurry. Suppose if we want to validate the answer of a maths problem then we can write instructions such as “solve the question by yourself then validate”. If we want to ask description about any product, we can add instruction to take time and check your knowledge before replying or so.
Limitations
Even though the LLMs are exposed to vast amount of knowledge during training processes they not always remember everything correctly. They often lead to hallucinations where they give plausible but false information. For example, previously if you asked chat gpt about some fake product names in an original industry, it could have given lots of false information which seemed plausible.
Iterative Prompt Development
Just like machine learning model development, where we try various parameters and test the model iteratively, here also we test the model with prompts and then iteratively improve. We analyze the results with the current prompt (instructions), refine the idea and the prompt and then repeat.
Different Applications
We can use prompt engineering for different applications such as summarisation, inferring, transforming, expanding, chatbot, etc. Suppose we have a big textual review of some product and you want to summarise the content in 30 words, then you can give prompt such as
prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site.
Summarize the review below, delimited by triple
backticks, in at most 30 words.
Review: ```{review}```
"""
messages = [{"role": "user", "content": prompt}]
response = openai.ChatCompletion.create(
model=model,
messages=messages,
)
Similarly if you some text, you can simply ask in the prompt “What is the sentiment of the following product review, delimited by {whatever delimiter you used}?”. If we go trivial way, we need a lot of data and label them with positive and negative classes and then train for multiple epochs to build a sentiment classifier. But with advancement in LLMs and prompt engineering, you build this app in no time. We can also use prompt engineering for developing translation systems where we can ask the LLM API to generate the translation and return in desired format of user. This can also be used to build chatbots. We can also create document simplification applications and many more.
Reference: https://www.deeplearning.ai