Contents
- Introduction
- Chat Completions API Overview
- Interacting with the API
- Token Counting
- Truncating Conversation History to Limit Tokens
- MinChatGPT
- Demo
- Command Line Interface
- Limitations
- Conclusion
Introduction
In this blog post, we will explore how to implement a minimalist ChatGPT-style app in a Jupyter Notebook or command line. The goal is to provide an understanding of the important concepts, components, and techniques required to create a chat app on top of a large language model (LLM), specifically OpenAI’s GPT. The resulting chat app can serve as a foundation for creating your own customised conversational AI applications.
The code in this blog post can be found in a notebook here. The script for the command line version can be found here.
Chat Completions API Overview
Let us begin with a quick overview of the Chat Completions endpoint of the OpenAI API, which enables you to interact with OpenAI’s large language models to generate text-based responses in a conversational manner. It’s designed for both single-turn tasks and multi-turn conversations.
Example API Call:
The provided code snippet demonstrates how to make an API call for chat completions. In this example, the chat model used is “gpt-3.5-turbo,” and a conversation is created with system, user, and assistant messages:
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who wrote 'A Tale of Two Cities'?"},
        {"role": "assistant", "content": "Charles Dickens wrote 'A Tale of Two Cities'."},
        {"role": "user", "content": "When was it first published?"}
    ]
)
Message Structure:
- The main input for the API call is the `messages` parameter, which is an array of message objects.
- Each message object has two properties: `role` (either "system", "user", or "assistant") and `content` (the text of the message).
- Conversations can be as short as one message or include many back-and-forth turns.
Typical Conversation Format:
A typical conversation format starts with a system message, followed by alternating user and assistant messages. The system message helps set the behavior of the assistant, but it’s optional. If omitted, the model’s behavior will be similar to using a generic message like “You are a helpful assistant.”
Importance of Conversation History:
Including conversation history is crucial when user instructions refer to prior messages. The model has no memory of past requests, so all relevant information must be supplied within the conversation history. If a conversation exceeds the model’s token limit, it needs to be shortened.
Interacting with the API
Now let us write a set of functions for communicating with OpenAI’s Chat Completions API. These functions will serve as the backbone of our minimalist chat app. Each function plays a specific role in managing the conversation, formatting messages, and handling responses.
import openai
import json
import os
import sys
# Uncomment and replace with your api key or api key path
# openai.api_key = YOUR_API_KEY
# openai.api_key_path = YOUR_API_KEY_PATH
def get_system_message(system=None):
    """
    Generate a system message for the conversation.

    Args:
        system (str, optional): The system message content. Defaults to None.

    Returns:
        dict: A message object with 'role' set to 'system' and 'content' containing the system message.
    """
    if system is None:
        system = "You are a helpful assistant."
    return {"role": "system", "content": system}
`get_system_message` is responsible for creating a system message. This message is optional but can be used to set the behavior of the assistant. If no system message is provided, it defaults to “You are a helpful assistant.” The function returns a message object with ‘role’ set to ‘system’ and ‘content’ containing the system message.
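For example:

get_system_message("You are a poetry expert.")
# {'role': 'system', 'content': 'You are a poetry expert.'}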
def get_response(msg,
                 system_msg=None,
                 msgs=[], model='gpt-4',
                 return_incomplete=False):
    """
    Get a response from the Chat Completions API.

    Args:
        msg (str): The user's message.
        system_msg (dict, optional): The system message object. Defaults to None.
        msgs (list, optional): Previous conversation messages. Defaults to an empty list.
        model (str, optional): The chat model to use. Defaults to 'gpt-4'.
        return_incomplete (bool, optional): Whether to return incomplete responses. Defaults to False.

    Returns:
        list or tuple: A list of response chunks if not returning incomplete, or a tuple
        containing the list of chunks and a completion status.
    """
    _stream_response = openai.ChatCompletion.create(
        model=model,
        messages=[
            system_msg if system_msg is not None else get_system_message(),
            *msgs,
            {"role": "user", "content": msg}
        ],
        stream=True
    )
    _chunks = []
    complete = False
    try:
        for _chunk in _stream_response:
            _delta = _chunk['choices'][0]['delta']
            # The last chunk's delta is empty
            if 'content' in _delta:
                sys.stdout.write(_delta['content'])
            _chunks.append(_chunk)
        complete = True
    except KeyboardInterrupt:
        # Re-raise the KeyboardInterrupt if incomplete responses are not wanted
        if not return_incomplete:
            raise
    return _chunks if not return_incomplete else (_chunks, complete)
- `get_response` is the core function for obtaining a response from the Chat Completions API. It takes the user’s message, an optional system message, previous messages, the model to use, and a flag to indicate whether incomplete responses should be returned.
- The function first creates an API call with the specified parameters, including the system message (or a default one) and the user’s message. It uses `stream=True` to stream the response chunks.
- It then processes the response chunks, extracting the content of each chunk and storing it in `_chunks`.
- If `return_incomplete` is set to `True`, the function returns a result even if the stream is interrupted; in this case it returns a tuple containing the list of chunks and a completion status. If `return_incomplete` is `False`, it only returns a result once the full stream has been processed, and returns only the list of chunks.
def stream2msg(stream):
    """
    Convert a stream of response chunks into a single message.

    Args:
        stream (list): A list of response chunks.

    Returns:
        str: A single message containing the concatenated content of the response chunks.
    """
    return "".join([i["choices"][0]["delta"].get("content", "") for i in stream])
`stream2msg` is a utility function that converts a stream of response chunks into a single message. It takes a list of response chunks as input and concatenates the content of each chunk to form a complete message.
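For example, assuming a valid API key has been configured, the two functions can be combined as follows:

# Stream a response to stdout and collect the chunks
chunks = get_response("What spices go well with chocolate?")
# Convert the streamed chunks into a single string
answer = stream2msg(chunks)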
def format_msgs(inp, ans):
    """
    Format user input and model's response into message objects.

    Args:
        inp (str): User input message.
        ans (str or list): Model's response message as a string or a list of response stream chunks.

    Returns:
        list: A list containing user and assistant message objects.
    """
    msg_inp = {"role": "user", "content": inp}
    msg_ans = {"role": "assistant", "content": stream2msg(ans) if not isinstance(ans, str) else ans}
    return [msg_inp, msg_ans]
`format_msgs` takes the user’s input and the model’s response (which can be a message string or a list of response chunks) and creates a list containing message objects for both the user and the assistant, which can subsequently be used in the conversation history.
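For example:

format_msgs("Who wrote 'A Tale of Two Cities'?", "Charles Dickens.")
# [{'role': 'user', 'content': "Who wrote 'A Tale of Two Cities'?"},
#  {'role': 'assistant', 'content': 'Charles Dickens.'}]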
Token Counting
Before we delve into the implementation details, let us briefly discuss token counting. Tokens are chunks of text that language models use to process and generate responses. It’s crucial to keep track of token counts, as they impact the cost and feasibility of using the API. Token counting includes both input and output tokens. This means that not only the messages you send to the model but also the responses you receive contribute to the total token count.
The exact way tokens are counted can vary between different model versions. The function below for counting tokens is adapted from the OpenAI API documentation (as of 05.09.2023). It was written for `gpt-3.5-turbo-0613` and serves as a reference. The documentation adds this caveat:
The exact way that messages are converted into tokens may change from model to model. So when future model versions are released, the answers returned by this function may be only approximate.
Depending on the model, the value returned by the function might not be exact but it will be a decent estimate that suffices for this simple example.
It’s also worth noting that each model has a maximum token limit. Exact details for each model are available in the Models section of the documentation. For example, it is 8192 for `gpt-4` and 4097 for `gpt-3.5-turbo`. In our example, we are using the model’s maximum token limit, but in practice, you may want to use a lower value to ensure that both input and output tokens are within the limit.
import tiktoken

def num_tokens_from_messages(messages, model="gpt-4"):
    """Returns the number of tokens used by a list of messages."""
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        encoding = tiktoken.get_encoding("cl100k_base")
    num_tokens = 0
    for message in messages:
        num_tokens += 4  # every message follows <im_start>{role/name}\n{content}<im_end>\n
        for key, value in message.items():
            num_tokens += len(encoding.encode(value))
            if key == "name":  # if there's a name, the role is omitted
                num_tokens += -1  # role is always required and always 1 token
    num_tokens += 2  # every reply is primed with <im_start>assistant
    return num_tokens
- `num_tokens_from_messages` is a function that takes a list of messages as input and returns the estimated number of tokens used by those messages.
- It uses the `tiktoken` library to calculate token counts. The function attempts to get the token encoding for the specified model. If it encounters a `KeyError` (indicating an unsupported model), it falls back to the `cl100k_base` encoding, which is a reasonable default.
- The function initializes `num_tokens` to 0, which is used to accumulate the token count.
- For each message in the input list of messages:
  - It adds 4 tokens to account for the message structure, including `<im_start>`, role or name, content, and `<im_end>` tags.
  - It then iterates through the message items (e.g., role, content).
  - For each item, it calculates the token count by encoding the value using the specified encoding and adding the length of the encoded value.
  - If the item is the “name”, it subtracts 1 token, because when a name is present the role is omitted (the role always counts as 1 token).
- Finally, it adds 2 tokens to account for the reply being primed with `<im_start>assistant`.
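For example, to estimate the size of a short conversation (the exact count may differ slightly across model versions):

msgs = [
    get_system_message(),
    {"role": "user", "content": "Who wrote 'A Tale of Two Cities'?"}
]
print(num_tokens_from_messages(msgs, model="gpt-4"))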
Truncating Conversation History to Limit Tokens
In the context of managing conversations with language models, it’s crucial to ensure that the conversation history remains within the model’s token limit. To achieve this, we have a function called `maybe_truncate_history`, which helps truncate the conversation history when it approaches or exceeds the maximum token limit.
Here’s an overview of this function and its purpose:
def maybe_truncate_history(msgs, max_tokens, model='gpt-4', includes_input=True):
    """
    Truncate the conversation history so that it fits within the token limit.

    Args:
        msgs (list): Conversation history, optionally starting with a system message.
        max_tokens (int): The maximum number of tokens allowed.
        model (str, optional): The chat model to use. Defaults to 'gpt-4'.
        includes_input (bool, optional): Whether the last message is the user's input. Defaults to True.

    Returns:
        tuple: A validity flag, the token count, and the truncated list of messages.
    """
    msgs_new = []
    if msgs[0]['role'] == 'system':
        # The system message should never be truncated
        msgs_new.append(msgs[0])
        msgs = msgs[1:]
    if includes_input:
        # At least the last message should be included if it is the input
        msgs_new.append(msgs[-1])
        msgs = msgs[:-1]
    # First ensure that input (and maybe system) messages don't exceed the token limit
    tkns = num_tokens_from_messages(msgs_new, model=model)
    if tkns > max_tokens:
        return False, tkns, []
    # Then retain the latest messages that fit within the token limit
    for msg in msgs[::-1]:
        msgs_tmp = msgs_new[:1] + [msg] + msgs_new[1:]
        tkns = num_tokens_from_messages(msgs_tmp, model=model)
        if tkns <= max_tokens:
            msgs_new = msgs_tmp
        else:
            break
    return True, tkns, msgs_new
- `maybe_truncate_history` is designed to keep the length of the conversation history within the token limit of the model. It takes as input the current list of messages (`msgs`), the maximum token limit (`max_tokens`), and the model name (defaulting to `gpt-4`). There is also a flag indicating whether the user’s input is present in the messages, to ensure it is not dropped.
- If the first message in the conversation history is a system message, it is added to `msgs_new` and removed from `msgs`. This step is necessary because system messages should not be truncated.
- If `includes_input` is `True`, the last message (usually the user’s input) is added to `msgs_new` and removed from the `msgs` list.
- The function first checks whether the token count of the messages in `msgs_new` already exceeds `max_tokens`. If it does, it returns `False`, the token count (`tkns`), and an empty list to indicate that the conversation history cannot be accommodated within the token limit.
- Next, the function attempts to retain the latest messages that fit within the token limit. It iterates through the `msgs` list in reverse order, gradually adding messages to `msgs_tmp`. If the token count of `msgs_tmp` is within `max_tokens`, it updates `msgs_new` with `msgs_tmp`. This ensures that the conversation history retains as much context as possible while staying within the token limit.
- The function returns `True` to indicate that the conversation history has been successfully truncated to fit within the token limit, together with the updated token count (`tkns`) and the modified `msgs_new`.
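As a quick illustration, with a deliberately small token budget (the exact counts are illustrative):

msgs = [
    get_system_message(),
    {"role": "user", "content": "First question"},
    {"role": "assistant", "content": "First answer"},
    {"role": "user", "content": "Follow-up question"}
]
valid, tkns, trimmed = maybe_truncate_history(msgs, max_tokens=30)
# The system message and the latest input are always retained;
# earlier turns are dropped from the oldest onwards if they don't fit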
This is a simple approach to managing token counts, which drops entire messages to stay within the token limit, but there are more sophisticated approaches you could try, such as summarising or filtering earlier parts of the conversation.
MinChatGPT
Finally we are in a position to implement our minimalist chat app. The `MinChatGPT` class defines its foundation. First let us set up the class and add some helper functions; subsequently we will implement the conversation functionality.
class MinChatGPT(object):
    """
    A simplified ChatGPT chatbot implementation.

    Parameters:
        system: The system message content (optional).
        model: The OpenAI model to use; restricted to 'gpt-3.5-turbo' and 'gpt-4'.
        log: Boolean that decides whether logging is required.
        logfile: The location of the file where chat logs will be stored.
        errfile: The location of the file where error logs will be stored.
        include_incomplete: Boolean that decides whether incomplete responses are included in the history.
        num_retries: The number of times to retry if there is a connection error.
        mock: Boolean that decides if the system is in testing mode.
        debug: Boolean that decides if the system should go into debug mode.
        max_tokens: Maximum number of tokens the model can handle while generating responses.
    """

    def __init__(self,
                 system=None,
                 model='gpt-4',
                 log=True,
                 logfile='./chatgpt.log',
                 errfile='./chatgpt.error',
                 include_incomplete=True,  # whether to include incomplete responses in history
                 num_retries=3,
                 mock=False,
                 debug=False,
                 max_tokens=None):
        """
        Initializes a MinChatGPT instance with the provided parameters.
        """
        # For simplicity restrict to these two models
        assert model in ['gpt-3.5-turbo', 'gpt-4']  # Ensures the model parameter is valid
        # System & GPT model related parameters
        self.system = system
        self.system_msg = get_system_message(system)  # Build the system message object
        self.model = model
        # Logging related parameters
        self.log = log
        self.logfile = logfile
        self.errfile = errfile
        # Behavioural flags
        self.include_incomplete = include_incomplete
        self.num_retries = num_retries
        self.mock = mock
        self.debug = debug
        # History and error storage
        self.history = []
        self.history_size = []
        self.errors = []
        # Maximum tokens the model can handle; defaults are provided for the two supported models
        self.max_tokens = {'gpt-4': 8192, 'gpt-3.5-turbo': 4097}[model] if max_tokens is None else max_tokens

    def _logerr(self):
        with open(self.errfile, 'w') as f:
            f.write('\n'.join(self.errors))

    def _logchat(self):
        with open(self.logfile, 'w') as f:
            json.dump(fp=f, obj={'history': self.history, 'history_size': self.history_size}, indent=4)

    def _chatgpt_response(self, msg="", newline=True):
        sys.stdout.write(f'\nMinChatGPT: {msg}' + ('\n' if newline else ''))
- The `__init__` method initializes our chatbot instance with parameters such as the system message, the OpenAI model to be used, and several behavioural flags for logging, debugging, or testing (mock).
- It also initializes history-related containers, namely `self.history`, `self.history_size`, and `self.errors`, for tracking the chat history and potential errors.
- The `max_tokens` parameter sets the limit for tokens the model can handle, defaulting to the limit of the chosen model.
- We have two logging functions, `_logerr` and `_logchat`, saving error logs and chat logs respectively to the specified locations.
- We also have a helper function, `_chatgpt_response`, for printing the bot’s response to the console.
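For instance, the class can be instantiated in mock mode to exercise the helpers without touching the API:

bot = MinChatGPT(mock=True, log=False)
bot._chatgpt_response("Hello!")  # prints "MinChatGPT: Hello!"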
However, we have not yet implemented the main functionality of the chatbot, which is to manage the conversation. Let us go ahead and implement a `chat` method that enables the user to interact with the model.
Implementing the conversation functionality
The `chat` method is the main entry point for initiating a conversation with the MinChatGPT chatbot. It manages user interaction, input processing, handling special cases like an ‘exit’ command or an empty message, generating responses, and logging information if desired.
Here is a detailed walkthrough of the `chat` method.
def chat(self):
    """
    Initiates a chat session with the user. During the chat session, the chatbot will receive user message inputs,
    process them and generate appropriate responses.

    The chat session will continue indefinitely until the user enters a termination command like "Bye", "Goodbye",
    "exit", or "quit". The function also logs the chat session, and any errors that occur during the session.
    """
    # maybe_exit flag for controlling the exit prompt
    maybe_exit = False
    # Welcome message for the user
    print('Welcome to MinChatGPT! Type "Bye", "Goodbye", "exit" or "quit", to quit.')
User Interaction
The core of the `chat` method is an infinite while loop that simulates a conversation. The user is prompted for an input message, which is then handled in the loop. To allow the user to end the conversation at any point, the code checks for certain phrases such as “bye”, “goodbye”, “exit”, or “quit”.
    # Main chat loop
    while True:
        # Capture user input
        inp = input('\nYour message: ')
Handling Empty Messages
If the user input is an empty string, the method reminds the user to enter a message and goes back to the start of the loop to ask again.
        try:
            # Handle empty input from the user
            if len(inp) == 0:
                print('Please enter a message')
                continue
Exiting
If the previous input appeared to have indicated an intention to exit (`maybe_exit == True`), the user is asked for confirmation.
If the user gives an affirmative response, the bot replies with a goodbye and breaks the loop to end the conversation.
If the user does not want to exit, the bot continues to chat.
            # Normalise the user input for case-insensitive matching
            stripped_lowered_inp = inp.strip().lower()
            # Handle the user's confirmation on exit
            if maybe_exit:
                if stripped_lowered_inp in ['y', 'yes', 'yes!', 'yes.']:
                    self._chatgpt_response('Goodbye!')
                    break
                else:
                    self._chatgpt_response("Ok. Let's keep chatting.")
                    maybe_exit = False
                    continue
Intention to exit
This simple approach determines whether the user input matches any of the exit signals.
If it does, the `maybe_exit` flag is set to `True`, and in the next interaction the user is asked for confirmation.
You could also try more sophisticated approaches that get the model to infer whether the user wishes to end the conversation.
            # Check if the user wants to exit
            if stripped_lowered_inp in [
                    'exit', 'exit()', 'exit!', 'exit.',
                    'quit', 'quit()', 'quit!', 'quit.',
                    'bye', 'bye!', 'bye.',
                    'goodbye', 'goodbye!', 'goodbye.'
            ]:
                maybe_exit = True
                self._chatgpt_response('Are you sure you want to quit? Enter Yes or No.')
                continue
Process User Inputs
The code next deals with non-empty, non-exit user inputs. It prepares the message history to be sent to the OpenAI model by appending the user’s new message. The history is then checked to ensure it doesn’t exceed the max token limit of the model. If the history is too long, we inform the user, skip producing a response, and loop back to the start for a new input.
            # Prepare the message history before calling the model
            msgs = [self.system_msg, *self.history, {'role': 'user', 'content': inp}]
            # Check that the conversation history does not exceed the max token limit
            valid, tkns, trimmed = maybe_truncate_history(msgs, max_tokens=self.max_tokens, model=self.model)
            # Drop the system message and the latest input; both are passed to get_response separately
            msgs_to_send = trimmed[1:-1] if valid else []
Generate Response and Update History
If the length of the input is within limits, then the bot produces a response. If the system is in mock mode, it just returns a test message. Otherwise, an actual response is generated and delivered to the user.
If there is a connection error in getting a response from the API, it retries up to `num_retries` times.
Incomplete messages are handled as per the `include_incomplete` flag, which determines whether or not to add incomplete responses to the history.
The code also saves the length of the history used for this response generation.
            # Handle valid and invalid token scenarios
            if valid:
                # Inform the user if history was truncated
                if len(trimmed) < len(msgs):
                    print(f'\nDropping earliest {len(msgs) - len(trimmed)} messages from history to keep within token limits')
                num_api_calls = 0
                if self.mock:
                    # For testing response functionality
                    msg = 'Test message'
                    complete = True  # mock responses are always complete
                    self._chatgpt_response(msg)
                else:
                    # Generate a response from the model
                    self._chatgpt_response(newline=False)
                    while True:
                        try:
                            msg, complete = get_response(inp, system_msg=self.system_msg, msgs=msgs_to_send,
                                                         model=self.model, return_incomplete=True)
                            break
                        except ConnectionResetError:
                            if num_api_calls < self.num_retries:
                                num_api_calls += 1
                            else:
                                raise
                # Skip to the next iteration if incomplete messages are not included in history
                if not complete and not self.include_incomplete:
                    continue
            else:
                # If the message exceeds the token limit, ask the user to reduce the message length
                print(f'\nTotal number of {tkns} tokens exceeds max number of tokens allowed. Please try again after reducing message length.')
                continue
            # Keep track of the history size
            self.history_size.append(len(msgs_to_send))
Logging and Debugging
Log details are printed if the system is in debug mode. Then a new pair of messages is created from the user input and the generated response and added to the message history. If the log flag is set, the chat history is saved.
            # Debug information for development and troubleshooting
            if self.debug:
                print(f"\n\nLast {self.history_size[-1]} message(s) used as history / Num tokens sent: {tkns} / Num retries: {num_api_calls + 1}")
                print("Messages sent:")
                print("=" * 100)
                for i in trimmed:
                    print(f'{i["role"]}: {i["content"]}')
                print("=" * 100)
            # Add the user message and the model's response to the chat history
            self.history.extend(format_msgs(inp, msg))
            # Save the chat history if logging is enabled
            if self.log:
                self._logchat()
Handling Errors
Any exceptions that occur during the above process are caught, added to the bot’s error log, and displayed to the user, who is then invited to try again.
        except Exception as e:  # Catch unexpected inputs or system errors
            self.errors.append(str(e))
            # Log error details if logging is enabled
            if self.log:
                self._logerr()
            print(f'\nThere was the following error:\n\n{e}.\n\nPlease try again.')
            continue
Finally, make this a method of the `MinChatGPT` class:
MinChatGPT.chat = chat
Demo
Let us now take a look at a simple demo in `debug` mode to see what input is given to the model each time. We can also see how it behaves when given an empty input, how it handles exit signals, and what happens when you interrupt it mid-message.
minchat = MinChatGPT(log=True, debug=True)
minchat.chat()
Welcome to MinChatGPT! Type "Bye", "Goodbye", "exit" or "quit", to quit.
Your message:
Please enter a message
Your message: Bye
MinChatGPT: Are you sure you want to quit? Enter Yes or No.
Your message: No
MinChatGPT: Ok. Let's keep chatting.
Your message: What spices and herbs go well with chocolate? Answer as a comma separated list.
MinChatGPT: Cinnamon, nutmeg, chili powder, cardamom, ginger, vanilla, peppermint, lavender, rosemary, star anise, sea salt, cloves, espresso powder.
Last 0 message(s) used as history / Num tokens sent: 34
Messages sent:
====================================================================================================
system: You are a helpful assistant.
user: What spices and herbs go well with chocolate? Answer as a comma separated list.
====================================================================================================
Your message: Why does cinnamon go well?
MinChatGPT: Cinnamon adds a warmth and complexity to the flavor of chocolate, enhancing its richness and depth. The sweet-spicy character of cinnamon can complement both milk and dark chocolate, and it's often used in various chocolate dishes, such as hot cocoa, truffles, and cakes, to create a more intriguing taste profile.
Last 2 message(s) used as history / Num tokens sent: 87
Messages sent:
====================================================================================================
system: You are a helpful assistant.
user: What spices and herbs go well with chocolate? Answer as a comma separated list.
assistant: Cinnamon, nutmeg, chili powder, cardamom, ginger, vanilla, peppermint, lavender, rosemary, star anise, sea salt, cloves, espresso powder.
user: Why does cinnamon go well?
====================================================================================================
Your message: Can you give some examples of these dishes?
MinChatGPT: Certainly, here are some examples of chocolate dishes where cinnamon can shine:
1. Cinnamon Hot Chocolate: This beverage combines the richness of chocolate with the warmth of cinnamon, creating a comforting drink.
2. Cinnamon Chocolate Truffles: These desserts blend the two flavors in a sweet, bite-size treat.
3. Mexican Mole Sauce: This traditional dish uses both chocolate and cinnamon (among other ingredients) to create a unique, rich sauce often served over meats.
4. Chocolate and Cinnamon Swirl Bread: A sweet bread where both flavors
Last 4 message(s) used as history / Num tokens sent: 169
Messages sent:
====================================================================================================
system: You are a helpful assistant.
user: What spices and herbs go well with chocolate? Answer as a comma separated list.
assistant: Cinnamon, nutmeg, chili powder, cardamom, ginger, vanilla, peppermint, lavender, rosemary, star anise, sea salt, cloves, espresso powder.
user: Why does cinnamon go well?
assistant: Cinnamon adds a warmth and complexity to the flavor of chocolate, enhancing its richness and depth. The sweet-spicy character of cinnamon can complement both milk and dark chocolate, and it's often used in various chocolate dishes, such as hot cocoa, truffles, and cakes, to create a more intriguing taste profile.
user: Can you give some examples of these dishes?
====================================================================================================
Your message: Ok got the idea.
MinChatGPT: Great! If you have any other questions or need further information, feel free to ask. Enjoy your culinary adventures with chocolate and cinnamon!
Last 6 message(s) used as history / Num tokens sent: 294
Messages sent:
====================================================================================================
system: You are a helpful assistant.
user: What spices and herbs go well with chocolate? Answer as a comma separated list.
assistant: Cinnamon, nutmeg, chili powder, cardamom, ginger, vanilla, peppermint, lavender, rosemary, star anise, sea salt, cloves, espresso powder.
user: Why does cinnamon go well?
assistant: Cinnamon adds a warmth and complexity to the flavor of chocolate, enhancing its richness and depth. The sweet-spicy character of cinnamon can complement both milk and dark chocolate, and it's often used in various chocolate dishes, such as hot cocoa, truffles, and cakes, to create a more intriguing taste profile.
user: Can you give some examples of these dishes?
assistant: Certainly, here are some examples of chocolate dishes where cinnamon can shine:
1. Cinnamon Hot Chocolate: This beverage combines the richness of chocolate with the warmth of cinnamon, creating a comforting drink.
2. Cinnamon Chocolate Truffles: These desserts blend the two flavors in a sweet, bite-size treat.
3. Mexican Mole Sauce: This traditional dish uses both chocolate and cinnamon (among other ingredients) to create a unique, rich sauce often served over meats.
4. Chocolate and Cinnamon Swirl Bread: A sweet bread where both flavors
user: Ok got the idea.
====================================================================================================
Your message: Goodbye!
MinChatGPT: Are you sure you want to quit? Enter Yes or No.
Your message: Yes
MinChatGPT: Goodbye!
Command Line Interface
To run this as a command line application, copy all the code from this notebook into a Python file called `minchatgpt.py`. Then add this code to the end of the file.
if __name__ == '__main__':
    import argparse
    import os

    # Get the key from the environment instead of assigning it
    openai.api_key = os.environ.get("API_KEY")
    # alternatively
    # openai.api_key_path = os.environ.get("API_KEY_PATH")

    # Define a function to parse boolean arguments
    def bool_arg(s):
        if s.lower() in ['true', 't', 'yes', 'y', '1']:
            return True
        elif s.lower() in ['false', 'f', 'no', 'n', '0']:
            return False
        else:
            raise ValueError('Boolean value expected.')

    parser = argparse.ArgumentParser(
        description='MinChatGPT: A minimalist chat app based on OpenAI\'s GPT model')
    parser.add_argument(
        '--debug', help='Run in debug mode', type=bool_arg, default=False)
    parser.add_argument(
        '--mock', help='Run in mock mode', type=bool_arg, default=False)
    parser.add_argument(
        '--log', help='Log chat history', type=bool_arg, default=True)
    parser.add_argument(
        '--logfile', type=str, default='./chatgpt.log', help='Location of chat history log file')
    parser.add_argument(
        '--errfile', type=str, default='./chatgpt.error', help='Location of error log file')
    parser.add_argument(
        '--model', type=str, default='gpt-4', help='OpenAI model to use')
    parser.add_argument(
        '--include_incomplete', type=bool_arg, default=True,
        help='Include incomplete responses in history')
    parser.add_argument(
        '--num_retries', type=int, default=3,
        help='Number of times to retry if there is a connection error')
    parser.add_argument(
        '--max_tokens', type=int, default=None,
        help='Maximum number of tokens the model can handle while generating responses')

    args = parser.parse_args()
    kwargs = vars(args)
    minchat = MinChatGPT(**kwargs)
    minchat.chat()
To run the application, assign your API key to the `API_KEY` environment variable (or the path to your key file to the `API_KEY_PATH` environment variable). Then run `python minchatgpt.py` with arguments as required. For example, to run in debug mode with logging enabled, you can use the following command:
export API_KEY=YOUR_API_KEY; python minchatgpt.py --debug True --log True
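Similarly, to try out the interface without making any API calls, you can run in mock mode:

python minchatgpt.py --mock True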
Limitations
The goal of MinChatGPT is to demonstrate how a chat app can be implemented on top of a conversational language model. Whilst it serves as a useful starting point for engaging with LLMs, it has several limitations at this stage, including:
- Lack of Input Moderation: MinChatGPT doesn’t filter or restrict the type of content that users can input. This can potentially lead to inappropriate or offensive messages that might violate the API’s rules (see the sketch after this list).
- Inability to Resume Chats or Start New Ones: The app does not provide features for resuming previous conversations from saved history or starting entirely new chats. Users are limited to a single, continuous conversation session. However, it would be fairly straightforward to incorporate these features.
- Limited Testing of Chat Logic: MinChatGPT’s chat logic has not been comprehensively tested with a wide range of input combinations. As a result, there may be scenarios where the chat logic behaves unexpectedly, encounters errors, or does not properly handle errors.
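To address the first limitation, one option would be to screen each user input with OpenAI’s moderation endpoint before sending it to the chat model. A minimal sketch, assuming the same pre-v1 openai library used throughout this post:

def is_flagged(text):
    # Ask the moderation endpoint whether the text violates the content policy
    result = openai.Moderation.create(input=text)
    return result["results"][0]["flagged"]

This check could be called on `inp` at the top of the chat loop, skipping the API call and warning the user whenever it returns True.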
Conclusion
In this blog post, we explored the building blocks for creating a minimalist chat-style application based on OpenAI’s GPT model within a Jupyter notebook (or command line). We discussed API interaction, token counting, conversation history truncation, and building a chat interface. You can use MinChatGPT as a starting point for building more complex and sophisticated applications. You can also modify it to make it compatible with other LLMs. I encourage you to experiment by adding features, making it more robust, extending its capabilities and adapting it to suit your requirements.