In the last article, we set up our project, added dependencies, and verified everything works by loading our API key.
In this article, we will build a CLI where we can chat with Gemini and get contextually aware replies. We will also focus on having a nice interface and do some basic error handling.
Let’s get started.
Get the source code here —> https://github.com/0xshadow-dev/coding-agent
Setting Up Module Structure
Right now everything’s in main.rs. Let’s make our folder structure more modular, because think about it: we are going to have types for API requests/responses, HTTP client logic, conversation management, tool implementations, and so on.
If we keep all of this in a single file, it’ll be a complete mess. So let’s organize it properly from the start.
Create the Folder Structure
Run these commands in your terminal:
mkdir src/gemini
touch src/gemini/mod.rs
touch src/gemini/types.rs
touch src/gemini/client.rs
Your project should now look like this:
src/
├── main.rs
└── gemini/
├── mod.rs # Module declaration
├── types.rs # API types (requests/responses)
└── client.rs # HTTP client logic
Of course, all the new files will be empty for now.
Set Up the Module
Open src/gemini/mod.rs and add:
mod types;
mod client;
pub use client::GeminiClient;
pub use types::*;
Update main.rs
Now open src/main.rs and add this at the top, right before the use statements:
mod gemini;
With this, we are just informing Rust that there’s a module called gemini. This is standard Rust stuff that you might already know.
Your main.rs should now look like this:
mod gemini;
use anyhow::Result;
#[tokio::main]
async fn main() -> Result<()> {
dotenvy::dotenv().ok();
let api_key = std::env::var("GEMINI_API_KEY")?;
println!("Environment loaded");
println!("✓ API key found: {}...", &api_key[..20]);
println!("Async runtime working");
Ok(())
}
With this, the setup is done. Now, let’s start defining the types for Gemini’s API requests and responses.
Defining Gemini API Types
Before writing any code, let’s first understand the structure of the Gemini API (its request and response formats). We need to define these as Rust types so the compiler can help us not mess up.
Understanding the Gemini API
Gemini’s REST API is pretty straightforward. You send a POST request with:
- The conversation history (all previous messages)
- Some configuration like temperature, model name, etc.
And you get back:
- The AI’s response
- Some metadata
When I say all previous messages, I mean all the prompts you sent and all the replies the AI gave back.
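Concretely, a minimal request body looks roughly like this (the shape matches the types we’re about to define; the text is just an example):

```json
{
  "contents": [
    { "role": "user", "parts": [{ "text": "What's 2+2?" }] },
    { "role": "model", "parts": [{ "text": "2 + 2 = 4." }] },
    { "role": "user", "parts": [{ "text": "What did I just ask?" }] }
  ]
}
```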
Let’s build these types.
Defining Types
Open src/gemini/types.rs and let’s start building.
First, add the imports we’ll need:
use serde::{Deserialize, Serialize};
We’re using serde because we need to convert our Rust structs to JSON (for the request) and convert JSON back to Rust structs (for the response).
The Message Type
Every conversation is just a list of messages. Each message has two things: who said it (role) and what they said (content).
Add this in your types file:
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Message {
pub role: String,
pub parts: Vec<Part>,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Part {
pub text: String,
}
Let me explain this structure:
- role is either “user” or “model” (Gemini’s terminology for assistant)
- parts is a list because Gemini supports multimodal content (text, images, etc.). For now, we’re just doing text, so we’ll always have one part.
Why Part instead of just a String? Because that’s how Gemini’s API works. It’s designed for multimodal content, so text is wrapped in a “part” object.
The Request Type
Now let’s define what we send to Gemini:
#[derive(Debug, Serialize)]
pub struct GenerateContentRequest {
pub contents: Vec<Message>,
}
We just send the conversation history. Gemini will look at all previous messages and generate the next response.
The Response Type
Now the response. This is a bit more nested because Gemini wraps things:
#[derive(Debug, Deserialize)]
pub struct GenerateContentResponse {
pub candidates: Vec<Candidate>,
}
#[derive(Debug, Deserialize)]
pub struct Candidate {
pub content: Message,
}
Gemini can return multiple candidates (different possible responses), but we’ll just use the first one.
The structure is: Response → Candidates → Content → Parts → Text
Yeah, it’s nested. That’s just how their API works.
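For reference, a successful response body looks roughly like this (trimmed to the fields we actually deserialize; real responses carry extra metadata that serde will simply ignore):

```json
{
  "candidates": [
    {
      "content": {
        "role": "model",
        "parts": [{ "text": "Hello! How can I help you today?" }]
      }
    }
  ]
}
```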
Helper Methods
Let’s add some helper methods to make our life easier. Add these implementations:
impl Message {
pub fn user(text: impl Into<String>) -> Self {
Self {
role: "user".to_string(),
parts: vec![Part {
text: text.into(),
}],
}
}
pub fn model(text: impl Into<String>) -> Self {
Self {
role: "model".to_string(),
parts: vec![Part {
text: text.into(),
}],
}
}
pub fn text(&self) -> String {
self.parts
.first()
.map(|p| p.text.clone())
.unwrap_or_default()
}
}
These helpers let us create user messages, create model messages, and get the text out of a message without building structs by hand or unwrapping the response every time. We just call a method with some text to create a user or model message, and we can get Gemini’s reply just by calling the text() method.
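Here’s a quick standalone sketch of how these helpers behave (the serde derives are dropped so it compiles on its own; otherwise the types match the ones above):

```rust
#[derive(Debug, Clone)]
pub struct Part {
    pub text: String,
}

#[derive(Debug, Clone)]
pub struct Message {
    pub role: String,
    pub parts: Vec<Part>,
}

impl Message {
    pub fn user(text: impl Into<String>) -> Self {
        Self { role: "user".to_string(), parts: vec![Part { text: text.into() }] }
    }

    pub fn model(text: impl Into<String>) -> Self {
        Self { role: "model".to_string(), parts: vec![Part { text: text.into() }] }
    }

    pub fn text(&self) -> String {
        self.parts.first().map(|p| p.text.clone()).unwrap_or_default()
    }
}

fn main() {
    // One line per message instead of hand-building structs
    let question = Message::user("What's 2+2?");
    let answer = Message::model("2 + 2 = 4.");

    println!("{}: {}", question.role, question.text());
    println!("{}: {}", answer.role, answer.text());
}
```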
Building The HTTP Client
Ok, we have our types. Now let’s build the Gemini client that actually talks to Gemini’s API.
The client needs to do a few things:
- Store our API key
- Make HTTP POST requests to Gemini
- Send our conversation history
- Get back responses
- Handle errors
Let’s build it.
Adding Imports to client.rs
Open src/gemini/client.rs and let’s start.
First, the imports:
use anyhow::{Context, Result};
use reqwest::Client;
use super::types::{GenerateContentRequest, GenerateContentResponse, Message};
We’re using:
- anyhow for error handling (gives us nice error messages)
- reqwest for making HTTP requests
- Our types from the previous section
The GeminiClient Struct
Let’s define our client:
pub struct GeminiClient {
client: Client,
api_key: String,
model: String,
}
Let me explain:
- client is the HTTP client from reqwest
- api_key is your Gemini API key
- model is which model to use (like gemini-2.5-flash)
The Constructor
Now let’s add a way to create a new client:
impl GeminiClient {
pub fn new(api_key: String) -> Self {
Self {
client: Client::new(),
api_key,
model: "gemini-2.5-flash".to_string(),
}
}
}
We’re using gemini-2.5-flash by default because it’s fast and free. You can change this later if you want.
The Main Method: generate
Now the important part, actually talking to Gemini. Add this method inside the impl GeminiClient block:
pub async fn generate(&self, messages: &[Message]) -> Result<String> {
// Build the URL
let url = format!(
"https://generativelanguage.googleapis.com/v1beta/models/{}:generateContent?key={}",
self.model, self.api_key
);
// Create the request body
let request = GenerateContentRequest {
contents: messages.to_vec(),
};
// Make the HTTP request
let response = self
.client
.post(&url)
.json(&request)
.send()
.await
.context("Failed to send request to Gemini")?;
// Check if the request was successful
if !response.status().is_success() {
let status = response.status();
let error_text = response.text().await.unwrap_or_default();
anyhow::bail!("Gemini API error ({}): {}", status, error_text);
}
// Parse the response
let response_data: GenerateContentResponse = response
.json()
.await
.context("Failed to parse Gemini response")?;
// Extract the text from the first candidate
let text = response_data
.candidates
.first()
.context("No candidates in response")?
.content
.text();
Ok(text)
}
Let me walk you through what’s happening here.
First, we build the URL. Gemini’s API URL includes the model name and your API key as a query parameter. We’re using string formatting to insert those values. Then we create the request body. We take our messages and convert them into the GenerateContentRequest format that Gemini expects.
Next, we actually make the HTTP request. We POST to the URL with our JSON body and wait for the response. Then we check whether the request was successful. If the status code isn’t in the 2xx range (meaning something went wrong), we grab the status code and error text and return them as an error.
If everything’s good, we parse the JSON response and convert it back into our Rust types.
Finally, we get the actual text from the nested response structure. Remember how Gemini’s response is Response → Candidates → Content → Parts → Text? We go through all that and pull out the text.
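Isolated as a (hypothetical) helper, the URL step is plain string formatting; FAKE_KEY below is a placeholder, not a real key:

```rust
// Hypothetical helper isolating the URL-building step from generate()
fn build_url(model: &str, api_key: &str) -> String {
    format!(
        "https://generativelanguage.googleapis.com/v1beta/models/{}:generateContent?key={}",
        model, api_key
    )
}

fn main() {
    let url = build_url("gemini-2.5-flash", "FAKE_KEY");
    println!("{url}");
}
```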
The Complete client.rs
Your complete src/gemini/client.rs should look like this:
use anyhow::{Context, Result};
use reqwest::Client;
use super::types::{GenerateContentRequest, GenerateContentResponse, Message};
pub struct GeminiClient {
client: Client,
api_key: String,
model: String,
}
impl GeminiClient {
pub fn new(api_key: String) -> Self {
Self {
client: Client::new(),
api_key,
model: "gemini-2.5-flash".to_string(),
}
}
pub async fn generate(&self, messages: &[Message]) -> Result<String> {
// Build the URL
let url = format!(
"https://generativelanguage.googleapis.com/v1beta/models/{}:generateContent?key={}",
self.model, self.api_key
);
// Create the request body
let request = GenerateContentRequest {
contents: messages.to_vec(),
};
// Make the HTTP request
let response = self
.client
.post(&url)
.json(&request)
.send()
.await
.context("Failed to send request to Gemini")?;
// Check if the request was successful
if !response.status().is_success() {
let status = response.status();
let error_text = response.text().await.unwrap_or_default();
anyhow::bail!("Gemini API error ({}): {}", status, error_text);
}
// Parse the response
let response_data: GenerateContentResponse = response
.json()
.await
.context("Failed to parse Gemini response")?;
// Extract the text from the first candidate
let text = response_data
.candidates
.first()
.context("No candidates in response")?
.content
.text();
Ok(text)
}
}
Test It
Let’s quickly test if our client works. Open src/main.rs and replace it with:
mod gemini;
use anyhow::Result;
use gemini::{GeminiClient, Message};
#[tokio::main]
async fn main() -> Result<()> {
dotenvy::dotenv().ok();
let api_key = std::env::var("GEMINI_API_KEY")?;
let client = GeminiClient::new(api_key);
// Test with a simple message
let messages = vec![Message::user("Hello! Can you say hi back?")];
println!("Sending message to Gemini...");
let response = client.generate(&messages).await?;
println!("Gemini: {}", response);
Ok(())
}
Run it:
cargo run
You should see Gemini respond with something like “Hello! 👋 How can I help you today?” (It might differ, since it’s an LLM and won’t reply with exactly the same thing every time for the same request.)
Adding Conversation State
Right now our Gemini client works, but it has no memory. If you ask “What’s 2+2?” and then ask “What did I just ask?”, it won’t remember.
Why? Because we’re only sending one message at a time. Gemini (like all LLMs) is stateless, it only knows what you send it in the current request.
So if we want conversation context, we need to:
- Keep track of all messages (user and model)
- Send the entire conversation history with each request
Let’s build that.
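The two steps above can be sketched with a stand-in for the real client (no network involved; the echo reply is made up just to show the bookkeeping):

```rust
// Simplified message type: one text field per message instead of parts
#[derive(Debug, Clone)]
struct Message {
    role: String,
    text: String,
}

struct FakeChat {
    conversation: Vec<Message>,
}

impl FakeChat {
    fn send(&mut self, user_text: &str) -> String {
        // Step 1: record the user's message in the history
        self.conversation.push(Message { role: "user".into(), text: user_text.into() });

        // Step 2: the real Chat would send the WHOLE history to Gemini here;
        // we fake the reply instead
        let reply = format!("echo: {user_text}");

        // Record the model's reply too, so the next turn sees it
        self.conversation.push(Message { role: "model".into(), text: reply.clone() });
        reply
    }
}

fn main() {
    let mut chat = FakeChat { conversation: Vec::new() };
    chat.send("Hello!");
    chat.send("What did I just say?");

    // Two turns -> four messages in history: user, model, user, model
    for m in &chat.conversation {
        println!("{}: {}", m.role, m.text);
    }
}
```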
Create the Chat Session
We’ll create a “Chat” struct that wraps our client and manages the conversation. The client handles HTTP, the chat handles state.
Create a new file:
touch src/chat.rs
Open src/chat.rs and add:
use anyhow::Result;
use crate::gemini::{GeminiClient, Message};
pub struct Chat {
client: GeminiClient,
conversation: Vec<Message>,
}
Here, we are creating a Chat struct that holds a client (our GeminiClient) and a conversation (our conversation state).
The Constructor
Add the implementation:
impl Chat {
pub fn new(client: GeminiClient) -> Self {
Self {
client,
conversation: Vec::new(),
}
}
}
We start with an empty conversation.
The Send Method
Now the important part, the method that handles sending messages and tracking history:
pub async fn send(&mut self, user_message: &str) -> Result<String> {
// Add user message to conversation
self.conversation.push(Message::user(user_message));
// Get response from Gemini
let response = self.client.generate(&self.conversation).await?;
// Add model response to conversation
self.conversation.push(Message::model(&response));
Ok(response)
}
This is the core of our chat:
- Add the user’s message to history
- Send the entire history to Gemini
- Get the response
- Add the response to history
- Return the response
Notice the &mut self? That’s because we’re modifying the conversation state.
Helper Method
Let’s add one more helper to have a way to see the conversation:
pub fn conversation(&self) -> &[Message] {
&self.conversation
}
This is useful for debugging or if we want to save conversations later.
Update main.rs
Let’s declare the chat module. Open src/main.rs and add at the top:
mod gemini;
mod chat;
use anyhow::Result;
use chat::Chat;
use gemini::GeminiClient;
Now update the main function to test the chat:
#[tokio::main]
async fn main() -> Result<()> {
dotenvy::dotenv().ok();
let api_key = std::env::var("GEMINI_API_KEY")?;
let client = GeminiClient::new(api_key);
let mut chat = Chat::new(client);
// First message
println!("You: Hello!");
let response = chat.send("Hello!").await?;
println!("Assistant: {}\n", response);
// Second message - testing context
println!("You: What did I just say?");
let response = chat.send("What did I just say?").await?;
println!("Assistant: {}", response);
Ok(())
}
Run It
cargo run
You should see something like:
You: Hello!
Assistant: Hello! How can I help you today?
You: What did I just say?
Assistant: You said "Hello!"
Ok, we have conversation memory working. But typing messages in code isn’t great. Let’s build an interactive chat loop with proper input handling.
Building the Interactive Chat Loop
Our chat works, but hardcoding messages in main.rs isn’t exactly user friendly. We need an interactive loop where you can type messages and get responses, just like ChatGPT or Claude.
We need to:
- Read user input
- Send it to Gemini
- Print the response
- Repeat until the user wants to quit
- Handle Ctrl+C gracefully
Let’s build it.
Add rustyline
We could use basic stdin for input, but rustyline is way better. It gives us:
- Line editing (arrow keys work!)
- History (up arrow to see previous messages)
- Better UX overall
Open Cargo.toml and add it to dependencies:
rustyline = "14.0"
The Chat Loop
Now let’s build the actual loop. Open src/main.rs and replace everything with:
mod gemini;
mod chat;
use anyhow::Result;
use chat::Chat;
use gemini::GeminiClient;
use rustyline::DefaultEditor;
#[tokio::main]
async fn main() -> Result<()> {
dotenvy::dotenv().ok();
let api_key = std::env::var("GEMINI_API_KEY")?;
let client = GeminiClient::new(api_key);
let mut chat = Chat::new(client);
// Create readline editor
let mut rl = DefaultEditor::new()?;
println!("Chat started! Type 'exit' or 'quit' to end the conversation.\n");
loop {
// Read user input
let readline = rl.readline("You: ");
match readline {
Ok(line) => {
let input = line.trim();
// Check for exit commands
if input.eq_ignore_ascii_case("exit") || input.eq_ignore_ascii_case("quit") {
println!("Goodbye!");
break;
}
// Skip empty lines
if input.is_empty() {
continue;
}
// Add to history
rl.add_history_entry(input)?;
// Send message and get response
match chat.send(input).await {
Ok(response) => {
println!("Assistant: {}\n", response);
}
Err(e) => {
eprintln!("Error: {}\n", e);
}
}
}
Err(_) => {
// Ctrl+C or Ctrl+D
println!("\nGoodbye!");
break;
}
}
}
Ok(())
}
Ok, let me walk you through this.
First, we create the readline editor with DefaultEditor::new(). This gives us that nice interactive prompt where arrow keys actually work. Then we start an infinite loop; the program keeps running until the user decides to quit.
Inside the loop, we call rl.readline("You: "), which shows the “You: ” prompt and waits for the user to type something. When they press Enter, we get their input.
First, we check if they typed “exit” or “quit”. If they did, we break out of the loop and the program ends. Simple. We also skip empty lines: if someone just hits Enter without typing anything, we don’t bother sending that to Gemini. That would be a waste of an API call.
If they actually typed something, we add it to the history with rl.add_history_entry(). This lets us use the up arrow key to see previous messages.
Then we send the message to our chat and wait for a response. If it works, we print the response. If something goes wrong (like network issues or API errors), we print the error but we don’t crash. The loop just continues and the user can try again.
There’s that Err(_) case at the end. That handles when the user hits Ctrl+C or Ctrl+D. Instead of just crashing, we print “Goodbye!” and exit cleanly.
Run It
cargo run
You should see:
Chat started! Type 'exit' or 'quit' to end the conversation.
You:
Now try having a conversation:
You: Hello!
Assistant: Hello! How can I help you today?
You: What's the capital of France?
Assistant: The capital of France is Paris.
You: What did I just ask you?
Assistant: You just asked me what the capital of France is.
You: exit
Goodbye!
Conclusion
This is awesome, we’ve finished implementing basic chat functionality. Yes, we could polish it by adding a “thinking…” state while waiting for a response, or by streaming responses, but those aren’t important at this point. Instead, in the next article, we will focus on building tools and start using them. I’m really excited for the next one. See you soon.