
Building a Coding Agent in Rust: Implementing Chat Feature


In the last article, we set up our project, added dependencies, and verified everything worked by loading our API key.

In this article, we will build a CLI where we can chat with Gemini and get contextual replies. We will also focus on having a nice interface and do some basic error handling.

Let’s get started.

Get the source code here → https://github.com/0xshadow-dev/coding-agent

Setting Up Module Structure

Right now everything’s in main.rs. Let’s make the folder structure more modular, because think about it: we are going to have types for API requests/responses, HTTP client logic, conversation management, tool implementations, and so on.

If we keep all of that in a single file, it’ll be a complete mess. So let’s organize it properly from the start.

Create the Folder Structure

Run these commands in your terminal:

mkdir src/gemini
touch src/gemini/mod.rs
touch src/gemini/types.rs
touch src/gemini/client.rs

Your project should now look like this:

src/
├── main.rs
└── gemini/
    ├── mod.rs      # Module declaration
    ├── types.rs    # API types (requests/responses)
    └── client.rs   # HTTP client logic

Of course, all the new files are empty for now.

Set Up the Module

Open src/gemini/mod.rs and add:

mod types;
mod client;

pub use client::GeminiClient;
pub use types::*;

Update main.rs

Now open src/main.rs and add this at the top, right before the use statements:

mod gemini;

With this, we are just informing Rust that there’s a module called gemini. This is standard Rust stuff that you might already know.

Your main.rs should now look like this:

mod gemini;

use anyhow::Result;

#[tokio::main]
async fn main() -> Result<()> {
    dotenvy::dotenv().ok();

    let api_key = std::env::var("GEMINI_API_KEY")?;

    println!("Environment loaded");
    println!("✓ API key found: {}...", &api_key[..20]);
    println!("Async runtime working");

    Ok(())
}

With this, the setup is done. Now let’s start defining the types for Gemini’s API requests and responses.

Defining Gemini API Types

Before writing any code, let’s first understand the structure of Gemini’s API requests and responses. We’ll define these as Rust types so the compiler can help us not mess up.

Understanding the Gemini API

Gemini’s REST API is pretty straightforward. You send a POST request with:

  1. The model name and your API key in the URL
  2. A JSON body containing all previous messages

When I say all previous messages, I mean all the prompts you sent and all the replies the AI gave back. Gemini then responds with JSON containing the generated reply.

Let’s build these types.

Defining Types

Open src/gemini/types.rs and let’s start building.

First, add the imports we’ll need:

use serde::{Deserialize, Serialize};

We’re using serde because we need to convert our Rust structs to JSON (for the request) and convert JSON back to Rust structs (for the response).

The Message Type

Every conversation is just a list of messages. Each message has two things: who said it (the role) and what they said (the parts).

Add this in your types file:

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Message {
    pub role: String,
    pub parts: Vec<Part>,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Part {
    pub text: String,
}

Let me explain this structure:

  1. role — who sent the message: "user" for your messages, "model" for Gemini’s replies
  2. parts — what was said, as a list of Part objects, each holding some text

Why Part instead of just a String? Because that’s how Gemini’s API works. It’s designed to be multimodal, so text is wrapped in a “part” object.
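
To see what that wrapping looks like on the wire, here’s a quick sketch that serializes a single message (this assumes serde_json is added as a dependency; reqwest already uses it under the hood, but you’d need it in Cargo.toml to call it directly):

use crate::gemini::{Message, Part};

fn main() {
    // A single user message, shaped the way Gemini expects it
    let msg = Message {
        role: "user".to_string(),
        parts: vec![Part {
            text: "Hello!".to_string(),
        }],
    };

    // Prints: {"role":"user","parts":[{"text":"Hello!"}]}
    println!("{}", serde_json::to_string(&msg).unwrap());
}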

The Request Type

Now let’s define what we send to Gemini:

#[derive(Debug, Serialize)]
pub struct GenerateContentRequest {
    pub contents: Vec<Message>,
}

We just send the conversation history. Gemini will look at all previous messages and generate the next response.
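
Concretely, a two-turn history serializes to a body like this (same serde_json assumption as above):

use crate::gemini::{GenerateContentRequest, Message, Part};

fn main() {
    // History with one user turn and one model turn
    let request = GenerateContentRequest {
        contents: vec![
            Message {
                role: "user".to_string(),
                parts: vec![Part { text: "What's 2 + 2?".to_string() }],
            },
            Message {
                role: "model".to_string(),
                parts: vec![Part { text: "2 + 2 = 4.".to_string() }],
            },
        ],
    };

    // {"contents":[{"role":"user","parts":[{"text":"What's 2 + 2?"}]},
    //              {"role":"model","parts":[{"text":"2 + 2 = 4."}]}]}
    println!("{}", serde_json::to_string(&request).unwrap());
}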

The Response Type

Now the response. This is a bit more nested because Gemini wraps things:

#[derive(Debug, Deserialize)]
pub struct GenerateContentResponse {
    pub candidates: Vec<Candidate>,
}

#[derive(Debug, Deserialize)]
pub struct Candidate {
    pub content: Message,
}

Gemini can return multiple candidates (different possible responses), but we’ll just use the first one.

The structure is: Response → Candidates → Content → Parts → Text

Yeah, it’s nested. That’s just how their API works.
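
And here’s a sketch going the other way, deserializing a trimmed-down response body (illustrative only; real responses include extra fields, which serde skips by default):

use crate::gemini::GenerateContentResponse;

fn main() {
    // A trimmed-down example of Gemini's response shape
    let body = r#"{
        "candidates": [
            {
                "content": {
                    "role": "model",
                    "parts": [{ "text": "Hi there!" }]
                }
            }
        ]
    }"#;

    let response: GenerateContentResponse =
        serde_json::from_str(body).expect("JSON should match our types");

    // Response -> Candidates -> Content -> Parts -> Text
    println!("{}", response.candidates[0].content.parts[0].text);
}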

Helper Methods

Let’s add some helper methods to make our life easier. Add these implementations:

impl Message {
    pub fn user(text: impl Into<String>) -> Self {
        Self {
            role: "user".to_string(),
            parts: vec![Part {
                text: text.into(),
            }],
        }
    }

    pub fn model(text: impl Into<String>) -> Self {
        Self {
            role: "model".to_string(),
            parts: vec![Part {
                text: text.into(),
            }],
        }
    }

    pub fn text(&self) -> String {
        self.parts
            .first()
            .map(|p| p.text.clone())
            .unwrap_or_default()
    }
}

These helpers let us create user messages, create model messages, and pull the text out of a message without building structs by hand or unwrapping parts every time. We just call a method with some text to create a user or model message, and we get Gemini’s reply by calling text().
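
Here’s a quick taste of how they’re used:

use crate::gemini::Message;

fn main() {
    // No manual struct building or unwrapping needed
    let question = Message::user("What's 2 + 2?");
    let answer = Message::model("2 + 2 = 4");

    assert_eq!(question.role, "user");
    assert_eq!(answer.role, "model");
    assert_eq!(answer.text(), "2 + 2 = 4");
}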

Building The HTTP Client

Ok, we have our types. Now let’s build the Gemini client that actually talks to Gemini’s API.

The client needs to do a few things:

  1. Hold the reqwest HTTP client, our API key, and the model name
  2. Build the request URL for the chosen model
  3. Send the conversation as a JSON POST request
  4. Check for errors and parse the response into our types

Let’s build it.

Adding Imports to client.rs

Open src/gemini/client.rs and let’s start.

First, the imports:

use anyhow::{Context, Result};
use reqwest::Client;

use super::types::{GenerateContentRequest, GenerateContentResponse, Message};

We’re using:

  1. anyhow for error handling — Context lets us attach helpful messages to errors
  2. reqwest’s Client to make the HTTP requests
  3. our own request, response, and message types from the types module

The GeminiClient Struct

Let’s define our client:

pub struct GeminiClient {
    client: Client,
    api_key: String,
    model: String,
}

Let me explain:

  1. client — a reusable reqwest client (it maintains an internal connection pool, so we create it once and reuse it)
  2. api_key — your Gemini API key, sent with every request
  3. model — the model name, stored as a string so it’s easy to swap

The Constructor

Now let’s add a way to create a new client:

impl GeminiClient {
    pub fn new(api_key: String) -> Self {
        Self {
            client: Client::new(),
            api_key,
            model: "gemini-2.5-flash".to_string(),
        }
    }
}

We’re using gemini-2.5-flash by default because it’s fast and free. You can change this later if you want.
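
If you’d rather not edit the constructor every time, a small builder-style method works too. This is a hypothetical extension, not something we’ll use in this series; it has to live in client.rs, since model is a private field:

impl GeminiClient {
    // Hypothetical helper (not part of the article's client): override
    // the default model after construction.
    pub fn with_model(mut self, model: impl Into<String>) -> Self {
        self.model = model.into();
        self
    }
}

// Usage:
// let client = GeminiClient::new(api_key).with_model("gemini-2.5-pro");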

The Main Method: generate

Now the important part: actually talking to Gemini. Add this method inside the impl GeminiClient block:

    pub async fn generate(&self, messages: &[Message]) -> Result<String> {
        // Build the URL
        let url = format!(
            "https://generativelanguage.googleapis.com/v1beta/models/{}:generateContent?key={}",
            self.model, self.api_key
        );

        // Create the request body
        let request = GenerateContentRequest {
            contents: messages.to_vec(),
        };

        // Make the HTTP request
        let response = self
            .client
            .post(&url)
            .json(&request)
            .send()
            .await
            .context("Failed to send request to Gemini")?;

        // Check if the request was successful
        if !response.status().is_success() {
            let status = response.status();
            let error_text = response.text().await.unwrap_or_default();
            anyhow::bail!("Gemini API error ({}): {}", status, error_text);
        }

        // Parse the response
        let response_data: GenerateContentResponse = response
            .json()
            .await
            .context("Failed to parse Gemini response")?;

        // Extract the text from the first candidate
        let text = response_data
            .candidates
            .first()
            .context("No candidates in response")?
            .content
            .text();

        Ok(text)
    }

Let me walk you through what’s happening here. First, we build the URL. Gemini’s API URL includes the model name and your API key as a query parameter. We’re using string formatting to insert those values. Then we create the request body. We take our messages and convert them into the GenerateContentRequest format that Gemini expects.

Next, we actually make the HTTP request: we POST to the URL with our JSON body and wait for the response. Then we check whether the request was successful. If the status code isn’t in the 2xx range (meaning something went wrong), we grab the status and error text and bail out with an error.

If everything’s good, we parse the JSON response and convert it back into our Rust types.

Finally, we get the actual text from the nested response structure. Remember how Gemini’s response is Response → Candidates → Content → Parts → Text? We go through all that and pull out the text.

The Complete client.rs

Your complete src/gemini/client.rs should look like this:

use anyhow::{Context, Result};
use reqwest::Client;

use super::types::{GenerateContentRequest, GenerateContentResponse, Message};

pub struct GeminiClient {
    client: Client,
    api_key: String,
    model: String,
}

impl GeminiClient {
    pub fn new(api_key: String) -> Self {
        Self {
            client: Client::new(),
            api_key,
            model: "gemini-1.5-flash".to_string(),
        }
    }

    pub async fn generate(&self, messages: &[Message]) -> Result<String> {
        // Build the URL
        let url = format!(
            "https://generativelanguage.googleapis.com/v1beta/models/{}:generateContent?key={}",
            self.model, self.api_key
        );

        // Create the request body
        let request = GenerateContentRequest {
            contents: messages.to_vec(),
        };

        // Make the HTTP request
        let response = self
            .client
            .post(&url)
            .json(&request)
            .send()
            .await
            .context("Failed to send request to Gemini")?;

        // Check if the request was successful
        if !response.status().is_success() {
            let status = response.status();
            let error_text = response.text().await.unwrap_or_default();
            anyhow::bail!("Gemini API error ({}): {}", status, error_text);
        }

        // Parse the response
        let response_data: GenerateContentResponse = response
            .json()
            .await
            .context("Failed to parse Gemini response")?;

        // Extract the text from the first candidate
        let text = response_data
            .candidates
            .first()
            .context("No candidates in response")?
            .content
            .text();

        Ok(text)
    }
}

Test It

Let’s quickly test if our client works. Open src/main.rs and replace it with:

mod gemini;

use anyhow::Result;
use gemini::{GeminiClient, Message};

#[tokio::main]
async fn main() -> Result<()> {
    dotenvy::dotenv().ok();

    let api_key = std::env::var("GEMINI_API_KEY")?;
    let client = GeminiClient::new(api_key);

    // Test with a simple message
    let messages = vec![Message::user("Hello! Can you say hi back?")];

    println!("Sending message to Gemini...");
    let response = client.generate(&messages).await?;

    println!("Gemini: {}", response);

    Ok(())
}

Run it:

cargo run

You should see Gemini respond with something like “Hello! 👋 How can I help you today?” (It might be different, since it’s an LLM and won’t reply with the same thing every time for the same request.)

Adding Conversation State

Right now our Gemini client works, but it has no memory. If you ask “What’s 2+2?” and then ask “What did I just ask?”, it won’t remember.

Why? Because we’re only sending one message at a time. Gemini (like all LLMs) is stateless: it only knows what you send it in the current request.

So if we want conversation context, we need to:

  1. Keep track of all messages (user and model)
  2. Send the entire conversation history with each request

Let’s build that.
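
To make the point concrete, here’s a sketch of doing that history bookkeeping by hand with just our client — exactly the busywork our Chat struct is about to automate:

use crate::gemini::{GeminiClient, Message};

async fn demo(client: GeminiClient) -> anyhow::Result<()> {
    let mut history = vec![Message::user("What's 2 + 2?")];

    // First turn: send the history, record the reply
    let reply = client.generate(&history).await?;
    history.push(Message::model(&reply));

    // Second turn: the follow-up question plus everything before it
    history.push(Message::user("What did I just ask?"));
    let reply = client.generate(&history).await?;
    println!("{}", reply);

    Ok(())
}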

Create the Chat Session

We’ll create a “Chat” struct that wraps our client and manages the conversation. The client handles HTTP, the chat handles state.

Create a new file:

touch src/chat.rs

Open src/chat.rs and add:

use anyhow::Result;

use crate::gemini::{GeminiClient, Message};

pub struct Chat {
    client: GeminiClient,
    conversation: Vec<Message>,
}

Here, we are creating a Chat struct that holds a client (our GeminiClient) and a conversation (our conversation state).

The Constructor

Add the implementation:

impl Chat {
    pub fn new(client: GeminiClient) -> Self {
        Self {
            client,
            conversation: Vec::new(),
        }
    }
}

We start with an empty conversation.

The Send Method

Now the important part: the method that handles sending messages and tracking history. Add it inside the impl Chat block:

    pub async fn send(&mut self, user_message: &str) -> Result<String> {
        // Add user message to conversation
        self.conversation.push(Message::user(user_message));

        // Get response from Gemini
        let response = self.client.generate(&self.conversation).await?;

        // Add model response to conversation
        self.conversation.push(Message::model(&response));

        Ok(response)
    }

This is the core of our chat:

  1. Add the user’s message to history
  2. Send the entire history to Gemini
  3. Get the response
  4. Add the response to history
  5. Return the response

Notice &mut self? That’s because we’re modifying the conversation state.

Helper Method

Let’s add one more helper so we have a way to inspect the conversation:

    pub fn conversation(&self) -> &[Message] {
        &self.conversation
    }

This is useful for debugging or if we want to save conversations later.
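
For instance, here’s a sketch that dumps the whole history after a couple of exchanges:

use crate::chat::Chat;

async fn demo(mut chat: Chat) -> anyhow::Result<()> {
    chat.send("Hello!").await?;
    chat.send("What did I just say?").await?;

    // Two exchanges = four messages: user, model, user, model
    assert_eq!(chat.conversation().len(), 4);
    for msg in chat.conversation() {
        println!("[{}] {}", msg.role, msg.text());
    }

    Ok(())
}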

Update main.rs

Let’s declare the chat module. Open src/main.rs and add at the top:

mod gemini;
mod chat;

use anyhow::Result;
use chat::Chat;
use gemini::GeminiClient;

Now update the main function to test the chat:

#[tokio::main]
async fn main() -> Result<()> {
    dotenvy::dotenv().ok();

    let api_key = std::env::var("GEMINI_API_KEY")?;
    let client = GeminiClient::new(api_key);
    let mut chat = Chat::new(client);

    // First message
    println!("You: Hello!");
    let response = chat.send("Hello!").await?;
    println!("Assistant: {}\n", response);

    // Second message - testing context
    println!("You: What did I just say?");
    let response = chat.send("What did I just say?").await?;
    println!("Assistant: {}", response);

    Ok(())
}

Run It

cargo run

You should see something like:

You: Hello!
Assistant: Hello! How can I help you today?

You: What did I just say?
Assistant: You said "Hello!"

Ok, we have conversation memory working. But typing messages in code isn’t great. Let’s build an interactive chat loop with proper input handling.

Building the Interactive Chat Loop

Our chat works, but hardcoding messages in main.rs isn’t exactly user friendly. We need an interactive loop where you can type messages and get responses, just like ChatGPT or Claude.

We need to:

  1. Read user input in a loop
  2. Handle exit commands and Ctrl+C/Ctrl+D cleanly
  3. Skip empty input
  4. Send each message through our Chat and print the response

Let’s build it.

Add rustyline

We could use basic stdin for input, but rustyline is way better. It gives us:

  1. Line editing (arrow keys and backspace behave properly)
  2. Input history (press the up arrow to recall previous messages)
  3. Clean handling of Ctrl+C and Ctrl+D

Open Cargo.toml and add it to dependencies:

rustyline = "14.0"

The Chat Loop

Now let’s build the actual loop. Open src/main.rs and replace everything with:

mod gemini;
mod chat;

use anyhow::Result;
use chat::Chat;
use gemini::GeminiClient;
use rustyline::DefaultEditor;

#[tokio::main]
async fn main() -> Result<()> {
    dotenvy::dotenv().ok();

    let api_key = std::env::var("GEMINI_API_KEY")?;
    let client = GeminiClient::new(api_key);
    let mut chat = Chat::new(client);

    // Create readline editor
    let mut rl = DefaultEditor::new()?;

    println!("Chat started! Type 'exit' or 'quit' to end the conversation.\n");

    loop {
        // Read user input
        let readline = rl.readline("You: ");

        match readline {
            Ok(line) => {
                let input = line.trim();

                // Check for exit commands
                if input.eq_ignore_ascii_case("exit") || input.eq_ignore_ascii_case("quit") {
                    println!("Goodbye!");
                    break;
                }

                // Skip empty lines
                if input.is_empty() {
                    continue;
                }

                // Add to history
                rl.add_history_entry(input)?;

                // Send message and get response
                match chat.send(input).await {
                    Ok(response) => {
                        println!("Assistant: {}\n", response);
                    }
                    Err(e) => {
                        eprintln!("Error: {}\n", e);
                    }
                }
            }
            Err(_) => {
                // Ctrl+C or Ctrl+D
                println!("\nGoodbye!");
                break;
            }
        }
    }

    Ok(())
}

Ok, let me walk you through this. First, we create the readline editor with DefaultEditor::new(). This gives us that nice interactive prompt where arrow keys actually work.

Then we start an infinite loop; the program keeps running until the user decides to quit. Inside the loop, we call rl.readline("You: "), which shows the “You: ” prompt and waits for the user to type something. When they press Enter, we get their input.

First, we check if they typed “exit” or “quit”. If they did, we break out of the loop and the program ends. Simple. We also skip empty lines: if someone just hits Enter without typing anything, we don’t bother sending that to Gemini. That would be a waste of an API call.

If they actually typed something, we add it to the history with rl.add_history_entry(), which lets us press the up arrow to recall previous messages. Then we send the message to our chat and wait for a response. If it works, we print the response. If something goes wrong (like network issues or API errors), we print the error but don’t crash; the loop just continues and the user can try again.

There’s also that Err(_) case at the end. It handles the user hitting Ctrl+C or Ctrl+D: instead of just crashing, we print “Goodbye!” and exit cleanly.
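
If you ever want to tell Ctrl+C and Ctrl+D apart, rustyline exposes distinct error variants. Here’s a sketch of the same loop with a more granular match:

use rustyline::error::ReadlineError;

loop {
    match rl.readline("You: ") {
        Ok(line) => {
            // ... handle input exactly as before ...
        }
        Err(ReadlineError::Interrupted) => {
            // Ctrl+C
            println!("\nInterrupted. Goodbye!");
            break;
        }
        Err(ReadlineError::Eof) => {
            // Ctrl+D
            println!("\nGoodbye!");
            break;
        }
        Err(e) => {
            // Any other readline error: report and stop
            eprintln!("Input error: {}", e);
            break;
        }
    }
}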

Run It

cargo run

You should see:

Chat started! Type 'exit' or 'quit' to end the conversation.

You:

Now try having a conversation:

You: Hello!
Assistant: Hello! How can I help you today?

You: What's the capital of France?
Assistant: The capital of France is Paris.

You: What did I just ask you?
Assistant: You just asked me what the capital of France is.

You: exit
Goodbye!

Conclusion

This is awesome: we finished implementing basic chat functionality. Yes, we could polish this by adding a “thinking…” state while waiting for a response, or by enabling streaming responses, but those aren’t important at this point. Instead, in the next article we will focus on building tools and start using them. I’m really excited for the next one. See you soon.

This post is part of the "Building a Coding Agent in Rust" series. View all posts in this series

