Introduction
In the world of AI development, pre-built frameworks often simplify the process of creating AI agents. However, building an agent from scratch can provide greater flexibility and a deeper understanding of the underlying mechanisms. This blog post will guide you through the process of creating an AI agent without using any existing frameworks, focusing on implementing a model context protocol for seamless interactions with language models.
Understanding the Model Context Protocol
The model context protocol is a method of maintaining conversation history and relevant information throughout an AI agent's interactions, allowing the agent to provide more coherent and contextually appropriate responses. Its key components are listed below, and a short sketch after the list shows how they fit together:
- Message history
- Context window
- Tokenization
- Prompt engineering
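Here is a minimal sketch of how these four pieces relate in an OpenAI-style chat payload. The token budget and message contents are illustrative assumptions, not values from any specification:

```python
# Context window: the model's token budget (varies by model; 4096 is illustrative)
MAX_CONTEXT_TOKENS = 4096

# Message history: an ordered list of role/content messages.
# The system message is where prompt engineering happens.
history = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "What is tokenization?"},
    {"role": "assistant", "content": "Splitting text into units the model can process."},
]

def rough_token_count(text):
    # Tokenization: whitespace splitting is a crude stand-in for a real tokenizer
    return len(text.split())

# Context management: the history must fit within the token budget
used = sum(rough_token_count(m["content"]) for m in history)
print(f"{used} of {MAX_CONTEXT_TOKENS} tokens used")
```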
Setting Up the Project
First, let's set up a basic project structure. Create a new directory for your project and initialize a Python virtual environment:
```bash
mkdir ai-agent-from-scratch
cd ai-agent-from-scratch
python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
```
Install the required dependencies:
```bash
pip install requests
```
Implementing the AI Agent
Let's start by creating a basic structure for our AI agent. Create a new file called `ai_agent.py`:
```python
import json

import requests


class AIAgent:
    def __init__(self, api_key):
        self.api_key = api_key
        self.conversation_history = []
        self.max_tokens = 4096
        self.api_url = "https://api.openai.com/v1/chat/completions"

    def send_message(self, message):
        self.conversation_history.append({"role": "user", "content": message})
        response = self._get_model_response()
        self.conversation_history.append({"role": "assistant", "content": response})
        return response

    def _get_model_response(self):
        headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {self.api_key}",
        }
        data = {
            "model": "gpt-3.5-turbo",
            "messages": self._prepare_context(),
            "max_tokens": 150,
        }
        response = requests.post(self.api_url, headers=headers, data=json.dumps(data))
        response.raise_for_status()  # surface HTTP errors instead of failing on a missing key
        return response.json()["choices"][0]["message"]["content"]

    def _prepare_context(self):
        # Walk the history from newest to oldest, keeping messages
        # until the token budget is exhausted.
        context = []
        total_tokens = 0
        for message in reversed(self.conversation_history):
            message_tokens = len(message["content"].split())
            if total_tokens + message_tokens > self.max_tokens:
                break
            context.insert(0, message)
            total_tokens += message_tokens
        return context
```
This basic implementation includes the core functionality of our AI agent. Let's break down the key components:
- `__init__`: Initializes the agent with an API key and sets up the conversation history.
- `send_message`: Adds the user's message to the conversation history, gets a response from the model, and adds the response to the history.
- `_get_model_response`: Sends a request to the language model API and retrieves the response.
- `_prepare_context`: Prepares the context for the model by selecting relevant messages from the conversation history.
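Before adding more features, it's worth a quick smoke test of this flow. A minimal sketch, assuming you substitute a valid API key:

```python
agent = AIAgent("your-api-key-here")

reply = agent.send_message("In one sentence, what is a context window?")
print(reply)

# Both turns are now stored, so follow-up questions stay in context.
print(len(agent.conversation_history))  # -> 2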
Enhancing the Agent with Advanced Features
Now that we have a basic implementation, let's add some advanced features to make our agent more powerful and flexible.
Implementing Tokenization
To manage the context window more accurately, we can implement proper tokenization. Add the following method to the `AIAgent` class:
```python
def _count_tokens(self, text):
    # This is a simple approximation. For more accurate results,
    # consider using a proper tokenizer like the GPT-2 tokenizer.
    return len(text.split())
```
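For exact counts against OpenAI models, one option (an addition of mine, not something the approximation above requires) is the `tiktoken` library, installed with `pip install tiktoken`:

```python
def _count_tokens(self, text):
    # Exact token count using tiktoken's tokenizer for gpt-3.5-turbo;
    # falls back to the whitespace approximation if tiktoken isn't installed.
    try:
        import tiktoken
        encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
        return len(encoding.encode(text))
    except ImportError:
        return len(text.split())
```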
Update the `_prepare_context` method to use this new tokenization:
```python
def _prepare_context(self):
    context = []
    total_tokens = 0
    for message in reversed(self.conversation_history):
        message_tokens = self._count_tokens(message["content"])
        if total_tokens + message_tokens > self.max_tokens:
            break
        context.insert(0, message)
        total_tokens += message_tokens
    return context
```
Adding Memory Management
To help our agent maintain long-term memory, we can implement a simple key-value store for important information. Add the following methods to the `AIAgent` class:
```python
def __init__(self, api_key):
    # ... (previous init code) ...
    self.memory = {}

def remember(self, key, value):
    self.memory[key] = value

def recall(self, key):
    return self.memory.get(key, None)

def _prepare_context(self):
    context = []
    total_tokens = 0

    # Add relevant memories to the context
    for key, value in self.memory.items():
        memory_message = {"role": "system", "content": f"Remember: {key} = {value}"}
        memory_tokens = self._count_tokens(memory_message["content"])
        if total_tokens + memory_tokens <= self.max_tokens:
            context.append(memory_message)
            total_tokens += memory_tokens

    # ... (previous context preparation code) ...
```
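With these methods in place, storing and retrieving facts is a one-liner each. A quick illustration (the key and value are made up for the example):

```python
agent = AIAgent("your-api-key-here")

agent.remember("user_name", "Alice")
print(agent.recall("user_name"))   # -> "Alice"
print(agent.recall("user_email"))  # -> None for unknown keys
```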
Implementing Prompt Engineering
To guide the model's responses more effectively, we can add a system message that defines the agent's role and behavior. Update the `_prepare_context` method:
```python
def _prepare_context(self):
    context = [
        {"role": "system", "content": "You are a helpful AI assistant. Provide clear and concise answers to user queries."}
    ]
    total_tokens = self._count_tokens(context[0]["content"])
    # ... (previous context preparation code) ...
```
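The system message doesn't have to be hard-coded. One possible variation (the `system_prompt` parameter here is a hypothetical addition, not part of the code above) is to accept a persona at construction time and have `_prepare_context` read it from the instance:

```python
def __init__(self, api_key, system_prompt=None):
    # ... (previous init code) ...
    # Hypothetical parameter: lets each deployment override the default persona.
    self.system_prompt = system_prompt or (
        "You are a helpful AI assistant. "
        "Provide clear and concise answers to user queries."
    )
```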
Using the AI Agent
Now that we have implemented our AI agent, let's create a simple example to demonstrate its usage. Create a new file called `main.py`:
```python
from ai_agent import AIAgent


def main():
    api_key = "your-api-key-here"
    agent = AIAgent(api_key)

    print("AI Agent: Hello! How can I assist you today?")

    while True:
        user_input = input("You: ")
        if user_input.lower() in ["exit", "quit", "bye"]:
            print("AI Agent: Goodbye! Have a great day!")
            break

        response = agent.send_message(user_input)
        print(f"AI Agent: {response}")

        # Example of using memory, expecting input like "remember <key> as <value>"
        if "remember" in user_input.lower() and " as " in user_input:
            key, value = user_input.split("remember", 1)[1].strip().split(" as ", 1)
            agent.remember(key.strip(), value.strip())
            print(f"AI Agent: I'll remember that {key.strip()} is {value.strip()}.")


if __name__ == "__main__":
    main()
```
This example creates an interactive loop where users can chat with the AI agent. It also demonstrates how to use the memory feature by allowing users to ask the agent to remember specific information.
Conclusion
By building an AI agent from scratch using the model context protocol, we've gained a deeper understanding of how these systems work. This custom implementation allows for greater flexibility and control over the agent's behavior, making it easier to adapt to specific use cases and requirements.