
Creating an AI Assistant Using OpenAI like ChatGPT

Detailed Instructions for Utilizing the Assistants API and Fine-Tuning

Albert Einstein: “Imagination is more important than knowledge. For knowledge is limited, whereas imagination embraces the entire world, stimulating progress, giving birth to evolution.”

Understanding the Difference: Chatbot and Assistant

Before we explore the coding examples, let’s clarify the difference between an AI chatbot and an assistant. Although these terms are frequently used as if they’re the same, in this context, they represent distinct concepts.

A chatbot is an AI you can engage in conversation with. An AI assistant, on the other hand, is essentially a chatbot enhanced with the ability to use tools. These tools might include web browsing, calculators, Python interpreters, or any other feature that broadens a chatbot’s utility.

Take, for instance, the basic version of ChatGPT: it qualifies as a chatbot because its sole focus is chat interaction. By contrast, the premium version of ChatGPT exemplifies an assistant, equipped with additional features such as web browsing, information retrieval, and image creation.

Exploring the Assistants API

The concept of constructing AI assistants — or AI agents — has been around for some time. OpenAI’s Assistants API now offers a simplified approach to developing these AI systems. In this discussion, I’ll leverage the API to create an AI that can respond to YouTube comments, utilizing knowledge retrieval capabilities similar to RAG, based on content from one of my Medium articles.

Setting Up the Vanilla Assistant

The initial step involves importing necessary Python libraries and establishing a connection with the OpenAI API.

import os
from openai import OpenAI

# Read the API key from an environment variable (recommended)
# rather than hard-coding it in the script
api_key = os.getenv("OPENAI_API_KEY", "your_openai_api_key")
client = OpenAI(api_key=api_key)

Acquiring an OpenAI API key is crucial for this process. If you’re unsure about how to obtain one, it’s advisable to look into resources or guides that explain the registration and API key generation process on the OpenAI platform. This example assumes the API key is either directly included in the code (not recommended for production) or managed through secure means, such as environment variables.

We’re now ready to set up a simple assistant (though it’s more accurately described as a chatbot at this stage, as it doesn’t utilize any tools). While this could technically be accomplished with a single line of code, I’ve chosen to use several lines instead to enhance readability.

instructions_text = "As a virtual data science consultant on YouTube, MyGPT \
delivers explanations in straightforward, easily understandable language, \
delving into technical specifics as requested. It thoughtfully responds to feedback \
and signs off with its distinct '–MyGPT'. MyGPT also adjusts the length of its \
replies to mirror the viewer's comment, offering succinct acknowledgments for short \
thanks or feedback, thereby maintaining a natural and engaging dialogue."

assistant = client.beta.assistants.create(
    name="MyGPT",
    description="Data science-oriented GPT for managing YouTube comments",
    instructions=instructions_text,
    model="gpt-4-0125-preview"
)

The code snippet above demonstrates how to specify the assistant’s name, description, instructions, and the model it should use. The key factors that influence the performance of the assistant are the instructions provided and the choice of model. Crafting effective instructions, also known as prompt engineering, requires iteration and refinement but is a crucial investment of time. For this example, I opt for the most recent version of GPT-4, though older (and potentially more cost-effective) models are available as well.

Once the assistant is configured, we can initiate communication by sending it a message, as illustrated in the following code block.

# Initialize a conversation thread to manage interactions between the user and the assistant
conversation_thread = client.beta.threads.create()

# Incorporate a user's message into the conversation
user_message = client.beta.threads.messages.create(
    thread_id=conversation_thread.id,
    role="user",
    content="Great content, thank you!"
)

# Prompt the assistant to respond to the user's message
assistant_response = client.beta.threads.runs.create(
    thread_id=conversation_thread.id,
    assistant_id=assistant.id,
)

In the code block provided earlier, several steps are executed to facilitate communication. Initially, a thread object is established, which orchestrates the exchange of messages between the user and the assistant, eliminating the necessity for custom code to manage this interaction. Subsequently, a user’s message is incorporated into this thread, representing the YouTube comments relevant to our scenario. Lastly, this thread is forwarded to the assistant, prompting it to craft a response through the creation of a run object.
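
Note that runs execute asynchronously, so the reply isn’t returned directly by the call above. Below is a minimal sketch of polling the run and reading back the assistant’s response (assuming the conversation_thread and assistant_response objects created earlier):

import time

# Poll the run until it reaches a terminal state
while assistant_response.status not in ("completed", "failed", "cancelled", "expired"):
    time.sleep(1)
    assistant_response = client.beta.threads.runs.retrieve(
        thread_id=conversation_thread.id,
        run_id=assistant_response.id,
    )

# Messages are listed newest-first, so the reply is the first entry
messages = client.beta.threads.messages.list(thread_id=conversation_thread.id)
print(messages.data[0].content[0].text.value)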

Within a brief period, the assistant delivers the following response:

Happy to hear it was useful to you! 
If there's anything else you're wondering about or any other questions you have, 
don't hesitate to reach out. –MyGPT

Although the response appears satisfactory, it doesn’t quite match what I would typically say. Let’s explore enhancing the assistant’s output through the technique known as few-shot prompting.

Understanding Few-Shot Prompting

Few-shot prompting involves incorporating specific input-output examples into the assistant’s instructions, enabling it to learn from these examples. In this approach, I’ve added three authentic comments and corresponding responses to the existing instruction set.

instructions_text_few_shot = """As a virtual data science consultant on YouTube, MyGPT engages in dialogue using clear and accessible language, delving into technical details as needed. It responds effectively to feedback and signs off with its unique '–MyGPT'. MyGPT adjusts its response length to match the viewer's comment, offering succinct acknowledgments for brief thanks or feedback, maintaining a natural and captivating interaction.

Below are examples of MyGPT's replies to viewer comments.

Viewer comment: This was a very thorough introduction to LLMs and answered many questions I had. Thank you.
MyGPT Response: Delighted to hear that it was helpful :) -MyGPT

Viewer comment: Epic, very useful for my BCI class
MyGPT Response: Appreciate it, glad to assist! -MyGPT

Viewer comment: Honestly the most straightforward explanation I've ever watched. Super excellent work. Thank you. It's so rare to find good communicators like you!
MyGPT Response: Thank you, pleased it was easy to understand -MyGPT"""

Observe that I’ve included the signature “-MyGPT” at the end of each example, even though these weren’t part of my original responses. Incorporating this signature is crucial for achieving the intended sign-off behavior from the assistant.

With these revised instructions in place, we’re ready to set up a new assistant following the same procedure as previously described.

assistant = client.beta.assistants.create(
    name="MyGPT",
    description="Data scientist GPT for YouTube comments",
    instructions=instructions_text_few_shot,
    model="gpt-4-0125-preview"
)

Submitting the same test comment (“Great content, thank you!”) to this assistant as before, we now receive the response:

You're welcome, happy to hear you found it useful! -MyGPT

While this response is still somewhat longer than my usual replies, it represents a significant improvement over the initial vanilla assistant’s output.

Now, let’s pose a more technical query. Using the same method as previously to convey the user’s message to the assistant, we now inquire, “What is fat-tailedness?”

Fat-tailedness refers to a statistical concept describing the characteristics of a specific kind of probability distribution. In these distributions, the tails (the distribution's extreme ends, highlighting outlier values far from the average) are heavier or "fatter" than those in a standard (Gaussian) distribution. This indicates a greater probability of encountering extreme events than would be anticipated with a normal distribution.

Practically, when a dataset is said to have a fat-tailed distribution, it suggests that uncommon, extreme events (those within the tails) occur more often than models based on normal distributions would forecast. This principle holds significant importance in areas such as finance, economics, and risk management, where overlooking the frequency of extreme events can lead to substantial implications.

Fat-tailed distributions stand in contrast to thin-tailed distributions, such as the normal distribution, where extreme outcomes are less common, and the majority of data points cluster around the average. Recognizing if a dataset exhibits a fat-tailed distribution is crucial for modeling and predicting events accurately, particularly in evaluating risks and preparing for infrequent yet consequential occurrences. -MyGPT

Implementing Knowledge Retrieval

Configuring a Retrieval-Augmented Generation (RAG) system typically involves several intricate steps, such as segmenting documents, establishing a vector database, creating a semantic search mechanism, and integrating search outcomes into a context window. However, the Assistants API simplifies this process into just two steps.

Initially, we upload the documents we wish to retrieve information from. Following this, we enable the retrieval function within our assistant.

In the following example, I demonstrate uploading an article about quantifying fat tails to my OpenAI account. This requires specifying the file’s path/name and its intended use. For this instance, we designate the file’s purpose as “assistants”. The alternative use, “fine-tune”, will be explored shortly.

file_upload = client.files.create(
    file=open("articles/4 Ways to Quantify Fat Tails with Python _ by MyGPT _ Towards Data Science.pdf", "rb"),
    purpose="assistants"
)

Following this, we proceed to establish a new assistant, this time incorporating retrieval capabilities and indicating which documents it can access. Although I’m employing just one document for retrieval in this instance, up to 20 documents can be specified by adding their file IDs to the file_ids list. 

assistant_creation = client.beta.assistants.create(
    name="MyGPT",
    description="Data scientist GPT for YouTube comments",
    instructions=instructions_text_few_shot,
    tools=[{"type": "retrieval"}],
    file_ids=[file_upload.id],
    model="gpt-4-0125-preview"
)

When we pose the same technical query to the assistant, it generates the response below.

Fat-tailedness refers to the extent to which outlier events significantly impact the overall statistics of a distribution. This concept underscores that, in some distributions, a handful of extreme observations can skew the total analysis. A well-known instance is the Pareto distribution, commonly mentioned in discussions about wealth distribution, where a minor fraction of the population may control the vast majority of resources. Statistically, distributions vary from thin-tailed (like Gaussian distributions, where outliers are extremely rare) to fat-tailed (such as Pareto distributions, where outliers are more common than in Gaussian distributions).
Understanding fat-tailedness goes beyond simply labeling distributions as "Power Law" or not; it involves recognizing a range of tail thickness. This nuanced perspective enables more accurate classification of data distributions. Identifying the degree of fat-tailedness in a dataset is critical in areas such as finance, insurance, risk management, and the social sciences, where assessing the frequency and impact of rare events is key.
To quantify fat-tailedness, various methodologies can be employed, including mathematical approaches like the Power Law Tail Index, Kurtosis, the standard deviation in Log-normal distributions, and Taleb's kappa metric. Each metric provides a unique viewpoint for evaluating the fat-tailed characteristic of a distribution, offering valuable insights into the dynamics of outliers in the data. -MyGPT

Utilizing the Fine-tuning API

While shaping an assistant through prompt engineering offers a straightforward method to program desired behaviors, finding the optimal way to guide the model can sometimes be challenging. In such instances, fine-tuning the model becomes a practical solution.

Fine-tuning involves adapting a pre-trained model by training it further with specific examples tailored to a particular task. With the OpenAI Fine-tuning API, this process entails supplying pairs of example interactions between users and the assistant [3].

For a use case like responding to YouTube comments, this requires compiling viewer comments (representing user messages) along with their corresponding replies (representing assistant messages).

Though collecting this extra data for fine-tuning demands more effort initially, the potential for significantly enhanced model performance is substantial [3]. In the following sections, I will guide you through the fine-tuning process tailored to this specific application.

Preparing Data for Fine-tuning

I began by manually reviewing past YouTube comments and transferring them into a spreadsheet, which I later exported as a .csv file.
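
For reference, the spreadsheet is assumed to have two columns, with comments in the first and my replies in the second, plus a header row. A few hypothetical rows:

Comment,Response
"This was a very thorough introduction to LLMs and answered many questions I had. Thank you.","Delighted to hear that it was helpful :)"
"Epic, very useful for my BCI class","Appreciate it, glad to assist!"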

Although this .csv file contains all the necessary information for fine-tuning, it’s not directly usable in its current form. We need to convert it into a specific format that the OpenAI API can process.

We aim to create a .jsonl file, which is a text file where each line represents a training example in JSON format. For those familiar with Python but not JSON, it’s akin to a dictionary containing key-value pairs.
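
For illustration, a single training example on one line of the .jsonl file might look like this (a shortened, hypothetical example following OpenAI’s chat fine-tuning format):

{"messages": [{"role": "system", "content": "As a virtual data science consultant on YouTube..."}, {"role": "user", "content": "Epic, very useful for my BCI class"}, {"role": "assistant", "content": "Appreciate it, glad to assist! -MyGPT"}]}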

The conversion to .jsonl format involves reading the .csv file to segregate comments into Python lists, as shown below:

import csv
import json
import random
comment_list = []
response_list = []
with open('data/YT-comments.csv', 'r') as file:
    csv_reader = csv.reader(file)
    
    for line in csv_reader:
        if line[0] == 'Comment':  # Skip the header
            continue
        
        comment_list.append(line[0])
        response_list.append(line[1] + " -MyGPT")

Then, we create a list of dictionaries for training examples, with each dictionary containing “messages” as a key and a list of dictionaries representing system, user, and assistant messages as the value.

The Python code below structures our comments and responses into this training format.

example_list = []
for i in range(len(comment_list)):
    system_dict = {"role": "system", "content": instructions_text_few_shot}
    user_dict = {"role": "user", "content": comment_list[i]}
    assistant_dict = {"role": "assistant", "content": response_list[i]}
    
    messages_list = [system_dict, user_dict, assistant_dict]
    example_list.append({"messages": messages_list})

After organizing 59 user-assistant pairs, we split them into training and validation datasets for model evaluation.

# Randomly select examples for validation
validation_indices = random.sample(range(len(example_list)), 9)
validation_data_list = [example_list[i] for i in validation_indices]
for index in sorted(validation_indices, reverse=True):
    del example_list[index]

Next, we save these datasets to .jsonl files, preparing them for the fine-tuning process.

# Save the data to .jsonl files
with open('data/training-data.jsonl', 'w') as train_file, open('data/validation-data.jsonl', 'w') as valid_file:
    for example in example_list:
        json.dump(example, train_file)
        train_file.write('\n')
    for example in validation_data_list:
        json.dump(example, valid_file)
        valid_file.write('\n')

Initiating the Fine-tuning Process

With the datasets ready, fine-tuning involves two primary steps: uploading the training and validation files to OpenAI and initiating the training job. The files are uploaded with the “fine-tune” purpose, and the job uses gpt-3.5-turbo (the most capable model available for fine-tuning at the time of writing) with a custom suffix to identify the fine-tuned model.

# Upload datasets for fine-tuning
training_data = client.files.create(file=open("data/training-data.jsonl", "rb"), purpose="fine-tune")
validation_data = client.files.create(file=open("data/validation-data.jsonl", "rb"), purpose="fine-tune")
# Start the fine-tuning job
fine_tune_job = client.fine_tuning.jobs.create(
    training_file=training_data.id,
    validation_file=validation_data.id,
    model="gpt-3.5-turbo",
    suffix="MyGPT"
)
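
While the job runs, you can optionally check on it from Python. A minimal sketch of polling the job status (using the fine_tune_job object created above):

import time

# Poll the fine-tuning job until it reaches a terminal state
job = client.fine_tuning.jobs.retrieve(fine_tune_job.id)
while job.status not in ("succeeded", "failed", "cancelled"):
    time.sleep(60)
    job = client.fine_tuning.jobs.retrieve(fine_tune_job.id)

# The fine-tuned model name is what we'll pass to the completions API
print(job.status, job.fine_tuned_model)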

The fine-tuning job typically completes in about 15 minutes. Upon completion, the fine-tuned model becomes accessible through the chat completions API, as demonstrated below:

test_comment = "Great content, thank you!"
response = client.chat.completions.create(
    model="ft:gpt-3.5-turbo-0613:personal:mygpt:8mUeVreo",
    messages=[
        {"role": "system", "content": instructions_text_few_shot},
        {"role": "user", "content": test_comment}
    ]
)

# Print the assistant's reply
print(response.choices[0].message.content)

Note that generating a response with the fine-tuned model differs from the earlier approach: the Assistants API does not support fine-tuned models, so we call the chat completions API directly.

A notable challenge is enhancing the fine-tuned model with additional functionalities (transforming it into a more comprehensive assistant), which requires integrating external libraries such as LangChain or LlamaIndex.
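
As a minimal sketch of that direction, a fine-tuned model can be dropped into LangChain as an ordinary chat model (this assumes the langchain-openai package is installed and reuses the fine-tuned model name from above):

from langchain_openai import ChatOpenAI

# Wrap the fine-tuned model as a LangChain chat model, which can
# then be combined with tools, retrievers, and agents
llm = ChatOpenAI(model="ft:gpt-3.5-turbo-0613:personal:mygpt:8mUeVreo")
reply = llm.invoke("Great content, thank you!")
print(reply.content)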

Although integrating these features into a fine-tuned assistant demands extra effort, the immediate responses it provides are more aligned with my natural way of communicating. Here are several examples of responses to the test comment:

  • “Thanks, I appreciate it!” -MyGPT

  • “Thanks, glad you liked it!” -MyGPT

  • “Glad it was helpful!” -MyGPT

Testing the model with a simple comment shows much closer alignment with the desired response style. Posing the technical question “What is fat-tailedness?” again (response below) suggests the fine-tuned model also answers in a more natural register, and its replies could be made richer still by combining fine-tuning with the RAG capabilities shown earlier.

Great query! The phenomenon of heavy tails illustrates the prominence of outlier (or extreme) events compared to a normal (Gaussian) distribution. Essentially, it means that extreme occurrences are more likely than what would be expected in a normal distribution. -MyGPT

Looking Ahead

Creating an AI assistant is now more accessible than ever. In this guide, we explored how to effortlessly create an AI assistant using OpenAI’s Assistants API and enhance its capabilities through the Fine-tuning API.

Although OpenAI provides some of the most sophisticated models for constructing the kind of AI assistant we’ve discussed, access to these models is gated by their API, presenting limitations on the scope and manner of our developments.

This leads us to the pivotal question: How can we build comparable systems leveraging open-source tools? The forthcoming articles in this series will delve into this topic, detailing the process of fine-tuning models with QLoRA and expanding a chatbot’s functionality using RAG.

Thank you for reading this far. You can also get the free prompts from here.

Also, discover the best AI tools with us below.

What Will You Get?

  • Access to my Premium Prompts Library.

  • Access to our Newsletters to get help along your journey.

  • Access to our Upcoming Premium Tools for free.

Check out discounted digital content at https://www.solan-ai.com/

Subscribe to our FREE Newsletter now!

Bonus

Notion is a platform for creating wikis and documents and for managing projects. It features an AI assistant and a variety of templates, suits teams of all sizes, and caters to diverse professional groups with an emphasis on community engagement and global events. For more details, visit their Notion page.
