Introduction
In the world of AI development, pre-built frameworks often simplify the process of creating AI agents. However, building an agent from scratch can provide greater flexibility and a deeper understanding of the underlying mechanisms. This blog post will guide you through the process of creating an AI agent without using any existing frameworks, focusing on implementing a model context protocol for seamless interactions with language models.
Understanding the Model Context Protocol
The model context protocol is a method of maintaining conversation history and relevant information throughout an AI agent's interactions, allowing the agent to provide more coherent and contextually appropriate responses. Its key components are listed below, and a short sketch after the list shows how they fit together:
- Message history
- Context window
- Tokenization
- Prompt engineering
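Here is a minimal sketch of how these four pieces relate in an OpenAI-style chat payload. The token budget and message contents are illustrative assumptions, not values from any specification:

```python
# Context window: the model's token budget (varies by model; 4096 is illustrative)
MAX_CONTEXT_TOKENS = 4096

# Message history: an ordered list of role/content messages.
# The system message is where prompt engineering happens.
history = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "What is tokenization?"},
    {"role": "assistant", "content": "Splitting text into units the model can process."},
]

def rough_token_count(text):
    # Tokenization: whitespace splitting is a crude stand-in for a real tokenizer
    return len(text.split())

# Context management: the history must fit within the token budget
used = sum(rough_token_count(m["content"]) for m in history)
print(f"{used} of {MAX_CONTEXT_TOKENS} tokens used")
```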
Setting Up the Project
First, let's set up a basic project structure. Create a new directory for your project and initialize a Python virtual environment:
```bash
mkdir ai-agent-from-scratch
cd ai-agent-from-scratch
python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
```
Install the required dependencies:
```bash
pip install requests
```
Implementing the AI Agent
Let's start by creating a basic structure for our AI agent. Create a new file called `ai_agent.py`:
```python
import json

import requests


class AIAgent:
    def __init__(self, api_key):
        self.api_key = api_key
        self.conversation_history = []
        self.max_tokens = 4096
        self.api_url = "https://api.openai.com/v1/chat/completions"

    def send_message(self, message):
        self.conversation_history.append({"role": "user", "content": message})
        response = self._get_model_response()
        self.conversation_history.append({"role": "assistant", "content": response})
        return response

    def _get_model_response(self):
        headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {self.api_key}",
        }
        data = {
            "model": "gpt-3.5-turbo",
            "messages": self._prepare_context(),
            "max_tokens": 150,
        }
        response = requests.post(self.api_url, headers=headers, data=json.dumps(data))
        response.raise_for_status()  # surface HTTP errors instead of failing on a missing key
        return response.json()["choices"][0]["message"]["content"]

    def _prepare_context(self):
        # Walk the history from newest to oldest, keeping messages
        # until the token budget is exhausted.
        context = []
        total_tokens = 0
        for message in reversed(self.conversation_history):
            message_tokens = len(message["content"].split())
            if total_tokens + message_tokens > self.max_tokens:
                break
            context.insert(0, message)
            total_tokens += message_tokens
        return context
```
This basic implementation includes the core functionality of our AI agent. Let's break down the key components:
- `__init__`: Initializes the agent with an API key and sets up the conversation history.
- `send_message`: Adds the user's message to the conversation history, gets a response from the model, and adds the response to the history.
- `_get_model_response`: Sends a request to the language model API and retrieves the response.
- `_prepare_context`: Prepares the context for the model by selecting relevant messages from the conversation history.
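Before adding more features, it's worth a quick smoke test of this flow. A minimal sketch, assuming you substitute a valid API key:

```python
agent = AIAgent("your-api-key-here")

reply = agent.send_message("In one sentence, what is a context window?")
print(reply)

# Both turns are now stored, so follow-up questions stay in context.
print(len(agent.conversation_history))  # -> 2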
Enhancing the Agent with Advanced Features
Now that we have a basic implementation, let's add some advanced features to make our agent more powerful and flexible.
Implementing Tokenization
To manage the context window more accurately, we can implement proper tokenization. Add the following method to the `AIAgent` class:
```python
def _count_tokens(self, text):
    # This is a simple approximation. For more accurate results,
    # consider using a proper tokenizer like the GPT-2 tokenizer.
    return len(text.split())
```
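For exact counts against OpenAI models, one option (an addition of mine, not something the approximation above requires) is the `tiktoken` library, installed with `pip install tiktoken`:

```python
def _count_tokens(self, text):
    # Exact token count using tiktoken's tokenizer for gpt-3.5-turbo;
    # falls back to the whitespace approximation if tiktoken isn't installed.
    try:
        import tiktoken
        encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
        return len(encoding.encode(text))
    except ImportError:
        return len(text.split())
```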
Update the `_prepare_context` method to use this new tokenization:
```python
def _prepare_context(self):
    context = []
    total_tokens = 0
    for message in reversed(self.conversation_history):
        message_tokens = self._count_tokens(message["content"])
        if total_tokens + message_tokens > self.max_tokens:
            break
        context.insert(0, message)
        total_tokens += message_tokens
    return context
```
Adding Memory Management
To help our agent maintain long-term memory, we can implement a simple key-value store for important information. Add the following methods to the `AIAgent` class:
```python
def __init__(self, api_key):
    # ... (previous init code) ...
    self.memory = {}

def remember(self, key, value):
    self.memory[key] = value

def recall(self, key):
    return self.memory.get(key, None)

def _prepare_context(self):
    context = []
    total_tokens = 0

    # Add relevant memories to the context
    for key, value in self.memory.items():
        memory_message = {"role": "system", "content": f"Remember: {key} = {value}"}
        memory_tokens = self._count_tokens(memory_message["content"])
        if total_tokens + memory_tokens <= self.max_tokens:
            context.append(memory_message)
            total_tokens += memory_tokens

    # ... (previous context preparation code) ...
```
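With these methods in place, storing and retrieving facts is a one-liner each. A quick illustration (the key and value are made up for the example):

```python
agent = AIAgent("your-api-key-here")

agent.remember("user_name", "Alice")
print(agent.recall("user_name"))   # -> "Alice"
print(agent.recall("user_email"))  # -> None for unknown keys
```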
Implementing Prompt Engineering
To guide the model's responses more effectively, we can add a system message that defines the agent's role and behavior. Update the `_prepare_context` method:
```python
def _prepare_context(self):
    context = [
        {"role": "system", "content": "You are a helpful AI assistant. Provide clear and concise answers to user queries."}
    ]
    total_tokens = self._count_tokens(context[0]["content"])
    # ... (previous context preparation code) ...
```
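The system message doesn't have to be hard-coded. One possible variation (the `system_prompt` parameter here is a hypothetical addition, not part of the code above) is to accept a persona at construction time and have `_prepare_context` read it from the instance:

```python
def __init__(self, api_key, system_prompt=None):
    # ... (previous init code) ...
    # Hypothetical parameter: lets each deployment override the default persona.
    self.system_prompt = system_prompt or (
        "You are a helpful AI assistant. "
        "Provide clear and concise answers to user queries."
    )
```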
Using the AI Agent
Now that we have implemented our AI agent, let's create a simple example to demonstrate its usage. Create a new file called `main.py`:
```python
from ai_agent import AIAgent


def main():
    api_key = "your-api-key-here"
    agent = AIAgent(api_key)

    print("AI Agent: Hello! How can I assist you today?")

    while True:
        user_input = input("You: ")
        if user_input.lower() in ["exit", "quit", "bye"]:
            print("AI Agent: Goodbye! Have a great day!")
            break

        response = agent.send_message(user_input)
        print(f"AI Agent: {response}")

        # Example of using memory, expecting input like "remember <key> as <value>"
        if "remember" in user_input.lower() and " as " in user_input:
            key, value = user_input.split("remember", 1)[1].strip().split(" as ", 1)
            agent.remember(key.strip(), value.strip())
            print(f"AI Agent: I'll remember that {key.strip()} is {value.strip()}.")


if __name__ == "__main__":
    main()
```
This example creates an interactive loop where users can chat with the AI agent. It also demonstrates how to use the memory feature by allowing users to ask the agent to remember specific information.
Conclusion
By building an AI agent from scratch using the model context protocol, we've gained a deeper understanding of how these systems work. This custom implementation allows for greater flexibility and control over the agent's behavior, making it easier to adapt to specific use cases and requirements.