03 - Talking to LLMs with Go
The Ollama API
Ollama exposes a REST API on localhost:11434. You send a JSON request, you get a JSON response. No SDK needed, just net/http and encoding/json.
# Make sure Ollama is running and you have a model
ollama pull llama3.2
ollama serve   # starts the API server (may already be running)

The endpoint we care about is /api/chat. It takes a model name and an array of messages, just like OpenAI's API.
Your First API Call
A complete Go program that sends a message to Ollama and prints the response.
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
)

type Message struct {
    Role    string `json:"role"`
    Content string `json:"content"`
}

type ChatRequest struct {
    Model    string    `json:"model"`
    Messages []Message `json:"messages"`
    Stream   bool      `json:"stream"`
}

type ChatResponse struct {
    Message Message `json:"message"`
}

func main() {
    req := ChatRequest{
        Model: "llama3.2", // OpenAI: "gpt-4o"
        Messages: []Message{
            {Role: "user", Content: "What is Go good at? One sentence."},
        },
        Stream: false,
    }

    body, _ := json.Marshal(req)
    resp, err := http.Post("http://localhost:11434/api/chat", "application/json", bytes.NewReader(body))
    if err != nil {
        fmt.Println("Error:", err)
        return
    }
    defer resp.Body.Close()

    data, _ := io.ReadAll(resp.Body)
    var result ChatResponse
    json.Unmarshal(data, &result)
    fmt.Println(result.Message.Content)
    // "Go excels at building concurrent, high-performance network services and CLI tools."
}

That's it. No SDK, no dependencies. Ollama's message format matches OpenAI's, so the same messages array works with almost any provider.
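The response carries more than the message, too. If you want token counts, extend the struct; a sketch with a few of the fields Ollama reports in its non-streaming /api/chat response (unknown JSON fields are simply ignored by encoding/json, so this is a drop-in swap for ChatResponse):

// Optional: capture token counts from the same response.
type ChatResponseFull struct {
    Message         Message `json:"message"`
    Done            bool    `json:"done"`              // generation finished?
    PromptEvalCount int     `json:"prompt_eval_count"` // tokens in the prompt
    EvalCount       int     `json:"eval_count"`        // tokens generated
}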
The Message Roles
Every message has a role. Three roles cover everything.
messages := []Message{
    {Role: "system", Content: "You are a Go expert. Be concise."},
    {Role: "user", Content: "What are goroutines?"},
}

// system: sets the model's behavior and personality
// user: the human's input
// assistant: the model's previous responses (for multi-turn conversations)

The system message is optional but powerful. It shapes every response the model generates. We'll cover this in depth in the prompt engineering lesson.
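To see the effect, send the same question with and without a system message. A quick sketch using the chat helper defined later in this lesson (outputs are illustrative and vary by model):

curt, _ := chat([]Message{
    {Role: "system", Content: "Answer in five words or fewer."},
    {Role: "user", Content: "What are goroutines?"},
})
full, _ := chat([]Message{
    {Role: "user", Content: "What are goroutines?"},
})
fmt.Println(curt) // e.g. "Lightweight concurrent functions."
fmt.Println(full) // typically a multi-sentence explanation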
Multi-Turn Conversations
LLMs are stateless. They don't remember previous messages. To have a conversation, you send the entire history with every request.
messages := []Message{
    {Role: "system", Content: "You are a helpful Go tutor."},
    {Role: "user", Content: "What is an interface?"},
    {Role: "assistant", Content: "An interface in Go defines a set of method signatures..."},
    {Role: "user", Content: "Give me an example."},
}

// The model sees all four messages and responds to the last one
// with full context of the conversation

Each turn adds tokens. A 20-message conversation means sending all 20 messages every time. Context windows and token budgeting matter because of this.
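In code, the usual pattern is to keep one slice and append both sides of each turn to it. A sketch using the chat helper defined in the next section:

history := []Message{
    {Role: "system", Content: "You are a helpful Go tutor."},
}
for _, q := range []string{"What is an interface?", "Give me an example."} {
    // Add the user's turn, send the whole history, then add the reply
    // so the next request carries full context.
    history = append(history, Message{Role: "user", Content: q})
    reply, err := chat(history)
    if err != nil {
        fmt.Println("Error:", err)
        return
    }
    history = append(history, Message{Role: "assistant", Content: reply})
    fmt.Println(reply)
}

When the history grows past the model's context window, a common tactic is to drop or summarize the oldest non-system messages before sending.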
Wrapping It in a Function
A reusable function that calls Ollama and returns the response text.
func chat(messages []Message) (string, error) {
    req := ChatRequest{
        Model:    "llama3.2",
        Messages: messages,
        Stream:   false,
    }

    body, err := json.Marshal(req)
    if err != nil {
        return "", err
    }

    resp, err := http.Post(
        "http://localhost:11434/api/chat",
        "application/json",
        bytes.NewReader(body),
    )
    if err != nil {
        return "", fmt.Errorf("ollama request failed: %w", err)
    }
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusOK {
        data, _ := io.ReadAll(resp.Body)
        return "", fmt.Errorf("ollama error %d: %s", resp.StatusCode, data)
    }

    var result ChatResponse
    if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
        return "", fmt.Errorf("failed to parse response: %w", err)
    }
    return result.Message.Content, nil
}

Call it:
func main() {
    reply, err := chat([]Message{
        {Role: "user", Content: "Explain channels in Go. Two sentences."},
    })
    if err != nil {
        fmt.Println("Error:", err)
        return
    }
    fmt.Println(reply)
    // "Channels are typed conduits for sending and receiving values between goroutines.
    // They provide synchronization, ensuring one goroutine waits until another is ready."
}

Switching to OpenAI
The OpenAI API uses the same message format. Change the URL, add an API key header, and adjust the response shape slightly.
// Ollama: no auth needed
url := "http://localhost:11434/api/chat"

// OpenAI: same message format, different endpoint + auth header
url = "https://api.openai.com/v1/chat/completions"

// With Ollama, http.Post is enough.
// With OpenAI, you need to set headers, so use http.NewRequest:
req, _ := http.NewRequest("POST", url, bytes.NewReader(body))
req.Header.Set("Content-Type", "application/json")
req.Header.Set("Authorization", "Bearer "+os.Getenv("OPENAI_API_KEY"))
resp, err := http.DefaultClient.Do(req)

The messages array is identical for both. Only the URL, the auth header, and the shape of the response change.
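That last difference is worth pinning down: OpenAI nests the reply in a choices array instead of a top-level message field. A minimal struct for decoding it:

type OpenAIChatResponse struct {
    Choices []struct {
        Message Message `json:"message"`
    } `json:"choices"`
}

// The reply text is result.Choices[0].Message.Content
// (check len(result.Choices) > 0 before indexing).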
Error Handling
LLM APIs fail. The model might not be loaded, Ollama might not be running, or the request might be malformed. Always check errors.
resp, err := http.Post(url, "application/json", bytes.NewReader(body))
if err != nil {
    // Network error: Ollama not running, connection refused
    return "", fmt.Errorf("connection failed: %w", err)
}
defer resp.Body.Close()

if resp.StatusCode != 200 {
    // API error: model not found, invalid request
    errBody, _ := io.ReadAll(resp.Body)
    return "", fmt.Errorf("API error %d: %s", resp.StatusCode, errBody)
}

Common errors:
- connection refused: Ollama isn't running. Start it with ollama serve
- model not found: You haven't pulled the model. Run ollama pull llama3.2
- context length exceeded: Your input is too long for the model's context window
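You can catch the first of these before sending any chat request. A small preflight sketch, assuming the default address (Ollama's root path responds when the server is up):

// Preflight: fail fast with a clear message if Ollama is down.
resp, err := http.Get("http://localhost:11434/")
if err != nil {
    fmt.Println("Ollama isn't running. Start it with: ollama serve")
    return
}
resp.Body.Close()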
Key Takeaways
- Ollama's /api/chat endpoint takes a model name and a messages array
- No SDK needed: net/http and encoding/json are all you need
- Three message roles: system (behavior), user (input), assistant (the model's prior responses)
- LLMs are stateless. Send the full conversation history with every request
- The same message format works for Ollama, OpenAI, and most other providers
- Always handle errors: connection failures, API errors, context length limits