The fastest way to learn how AI systems actually work is to build one. Reading about embeddings is useful. Debugging why your embedding search returns the wrong results at 11pm on a Saturday is how you actually learn.
These 10 projects are all buildable in a weekend. They use real APIs, produce something you can demo, and each one teaches a distinct concept that will compound into everything you build next.
Difficulty ratings: Beginner (few hours, no prior AI experience), Intermediate (full day or two, comfortable with APIs and Python).
1. Sentiment Analyzer for Product Reviews
Difficulty: Beginner | Time: 2–3 hours | Stack: Python, Anthropic API
What it does: Takes a batch of product reviews and classifies each one as positive, negative, or neutral — plus extracts the main topic of the complaint or praise.
Why it is useful: Structured output extraction from LLMs is one of the most common real-world use cases. This project teaches you how to get consistent, machine-readable responses from a model that naturally wants to write prose.
Starting point:
from anthropic import Anthropic
import json

client = Anthropic()

def analyze_review(review: str) -> dict:
    response = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=256,
        messages=[{
            "role": "user",
            "content": f"""Analyze this product review. Return JSON only.

Review: {review}

Return: {{"sentiment": "positive|negative|neutral", "score": 1-5, "main_topic": "string", "summary": "one sentence"}}"""
        }]
    )
    return json.loads(response.content[0].text)

reviews = [
    "Battery life is incredible, lasts all day easily. UI is a bit clunky.",
    "Broke after two weeks. Customer service was useless.",
    "Exactly what I expected. Does the job.",
]

for review in reviews:
    result = analyze_review(review)
    print(f"Sentiment: {result['sentiment']} ({result['score']}/5)")
    print(f"Topic: {result['main_topic']}")
    print(f"Summary: {result['summary']}\n")
Extend it: Process a CSV of real Amazon reviews, calculate aggregate sentiment by product, and generate a summary report.
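The aggregation half of that extension needs no model calls at all. One possible shape for it, assuming you have already joined each review's product name with the sentiment label that analyze_review returned (the sample rows below are made up for illustration):

```python
from collections import Counter, defaultdict

def aggregate_sentiment(labeled_rows):
    """labeled_rows: (product, sentiment) pairs, e.g. the CSV's product
    column zipped with each review's classified sentiment."""
    by_product = defaultdict(Counter)
    for product, sentiment in labeled_rows:
        by_product[product][sentiment] += 1
    # Summarize each product as total reviews plus share of positives
    report = {}
    for product, counts in by_product.items():
        total = sum(counts.values())
        report[product] = {
            "reviews": total,
            "positive_pct": round(100 * counts["positive"] / total, 1),
        }
    return report

rows = [("widget", "positive"), ("widget", "negative"), ("gadget", "positive")]
print(aggregate_sentiment(rows))
```

Keeping the LLM step and the arithmetic step separate like this also makes the pipeline cheap to re-run when you only change the reporting.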
2. PDF Question-Answering Chatbot
Difficulty: Beginner | Time: 3–4 hours | Stack: Python, Anthropic API, PyMuPDF
What it does: Upload any PDF — a research paper, a legal document, a user manual — and ask questions about it in plain English.
Why it is useful: This is RAG in its simplest form. You learn how to extract text from documents, split it into chunks, and pass the right context to an LLM.
Starting point:
import fitz  # PyMuPDF
from anthropic import Anthropic

client = Anthropic()

def extract_text(pdf_path: str) -> str:
    doc = fitz.open(pdf_path)
    return "\n".join(page.get_text() for page in doc)

def ask_pdf(pdf_path: str, question: str) -> str:
    text = extract_text(pdf_path)
    # For large PDFs, chunk and retrieve instead of including everything
    context = text[:12000]  # roughly 3k tokens; adjust per model limits
    response = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=512,
        system="Answer questions using only the document provided. Say 'The document does not cover this' if the answer is not present.",
        messages=[{
            "role": "user",
            "content": f"Document:\n{context}\n\nQuestion: {question}"
        }]
    )
    return response.content[0].text

# Usage
answer = ask_pdf("annual_report.pdf", "What was the revenue growth last year?")
print(answer)
Extend it: Add embeddings and proper vector search so it handles 100-page documents correctly.
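Before reaching for embeddings, the chunking half of that extension is plain string work. A minimal sketch of fixed-size chunking with overlap (the sizes are arbitrary starting points, not tuned values):

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping character windows so a sentence cut at
    one chunk boundary still appears whole in the neighboring chunk."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

chunks = chunk_text("x" * 2500, chunk_size=1000, overlap=200)
print(len(chunks))  # 4 windows, starting at offsets 0, 800, 1600, 2400
```

Character windows are crude but good enough to get retrieval working; splitting on paragraph or sentence boundaries is the usual next refinement.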
3. CLI Writing Assistant
Difficulty: Beginner | Time: 2 hours | Stack: Python, Anthropic API
What it does: A command-line tool that takes draft text and improves it for a specified purpose — email, blog post, Slack message, technical doc. You pipe text in, improved text comes out.
Why it is useful: Streaming responses, prompt templates, and CLI tooling — three fundamental patterns in one small project.
Starting point:
import sys
import anthropic

client = anthropic.Anthropic()

modes = {
    "email": "Rewrite this as a professional, concise email. Keep it under 150 words.",
    "slack": "Rewrite this as a casual but clear Slack message. Keep it brief.",
    "blog": "Rewrite this as an engaging blog paragraph with a strong opening sentence.",
    "technical": "Rewrite this as clear technical documentation. Use active voice.",
}

mode = sys.argv[1] if len(sys.argv) > 1 else "email"
text = sys.stdin.read().strip()
instruction = modes.get(mode, modes["email"])

with client.messages.stream(
    model="claude-haiku-4-5-20251001",
    max_tokens=512,
    messages=[{"role": "user", "content": f"{instruction}\n\nText: {text}"}]
) as stream:
    for chunk in stream.text_stream:
        print(chunk, end="", flush=True)
print()
Run it: echo "we need to discuss the project timeline" | python assistant.py email
4. GitHub PR Summarizer
Difficulty: Beginner | Time: 3 hours | Stack: Python, GitHub API, Anthropic API
What it does: Takes a GitHub PR URL and generates a concise summary of what changed, why it matters, and any potential concerns — without reading every line of the diff yourself.
Why it is useful: API integration, prompt engineering for structured summarization, and a tool you will actually use every week.
Starting point:
import requests
from anthropic import Anthropic

client = Anthropic()

def get_pr_diff(owner: str, repo: str, pr_number: int, token: str) -> tuple[str, str, str]:
    url = f"https://api.github.com/repos/{owner}/{repo}/pulls/{pr_number}"
    headers = {"Authorization": f"token {token}"}
    # First request: PR metadata as JSON (default Accept header)
    pr_data = requests.get(url, headers=headers).json()
    # Second request: the raw diff, via GitHub's diff media type
    diff_response = requests.get(
        url, headers={**headers, "Accept": "application/vnd.github.v3.diff"})
    return pr_data["title"], pr_data["body"] or "", diff_response.text[:8000]

def summarize_pr(owner, repo, pr_number, token):
    title, description, diff = get_pr_diff(owner, repo, pr_number, token)
    response = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=512,
        messages=[{
            "role": "user",
            "content": f"""Summarize this pull request for a developer who needs to review it quickly.

Title: {title}
Description: {description}

Diff (truncated):
{diff}

Provide: 1) What changed (2-3 sentences), 2) Why it matters, 3) Any concerns worth reviewing carefully."""
        }]
    )
    return response.content[0].text
5. Personal Knowledge Base with Semantic Search
Difficulty: Intermediate | Time: 4–6 hours | Stack: Python, sentence-transformers, SQLite, Anthropic API
What it does: Save notes, articles, and snippets to a local database. Search them semantically — find notes about "deployment problems" even if the note says "production outage in Kubernetes."
Why it is useful: Embeddings, vector similarity, and building a retrieval system from scratch. The exact same pattern that powers enterprise RAG systems.
Starting point:
import sqlite3
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def init_db():
    conn = sqlite3.connect("knowledge.db")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS notes (
            id INTEGER PRIMARY KEY,
            content TEXT,
            embedding BLOB,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
    """)
    conn.commit()
    return conn

def add_note(conn, content: str):
    embedding = embedder.encode([content])[0]
    conn.execute("INSERT INTO notes (content, embedding) VALUES (?, ?)",
                 (content, embedding.tobytes()))
    conn.commit()

def search(conn, query: str, top_k: int = 5) -> list[str]:
    query_embedding = embedder.encode([query])[0]
    rows = conn.execute("SELECT content, embedding FROM notes").fetchall()
    if not rows:
        return []
    contents = [r[0] for r in rows]
    embeddings = np.array([np.frombuffer(r[1], dtype=np.float32) for r in rows])
    # Cosine similarity between the query and every stored note
    similarities = np.dot(embeddings, query_embedding) / (
        np.linalg.norm(embeddings, axis=1) * np.linalg.norm(query_embedding)
    )
    top_indices = np.argsort(similarities)[-top_k:][::-1]
    return [contents[i] for i in top_indices]

# Usage
conn = init_db()
add_note(conn, "Fixed Kubernetes OOMKilled errors by increasing memory limits in deployment.yaml")
add_note(conn, "Redis connection pooling: set max_connections=20 to avoid timeout spikes under load")

results = search(conn, "production memory problems")
for r in results:
    print(r)
6. Email Auto-Classifier and Drafter
Difficulty: Intermediate | Time: 4–5 hours | Stack: Python, Gmail API, Anthropic API
What it does: Connects to your Gmail, reads unread emails, classifies them by category (support request, sales inquiry, newsletter, etc.), and drafts responses for the ones that need replies.
Why it is useful: Real API integration, multi-step workflows, and handling unstructured real-world data. Immediately useful.
Extend it: Add a FastAPI endpoint and a simple web UI to approve/reject the drafted replies before they send.
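The classification step can be sketched before touching the Gmail API at all. A minimal version in the same JSON-only style as Project 1; the category list and the needs_reply field here are illustrative choices, not a fixed schema:

```python
import json

CATEGORIES = ["support request", "sales inquiry", "newsletter", "personal", "other"]

def build_prompt(subject: str, body: str) -> str:
    """Prompt for single-label classification against a fixed category list."""
    return (
        "Classify this email into exactly one category from: "
        + ", ".join(CATEGORIES)
        + '. Return JSON only: {"category": "...", "needs_reply": true|false}'
        + f"\n\nSubject: {subject}\n\nBody: {body}"
    )

def classify_email(subject: str, body: str) -> dict:
    # Imported here so build_prompt stays testable without the SDK installed
    from anthropic import Anthropic
    client = Anthropic()
    response = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=128,
        messages=[{"role": "user", "content": build_prompt(subject, body)}],
    )
    return json.loads(response.content[0].text)
```

Getting this loop solid on a handful of copy-pasted emails first makes the Gmail OAuth plumbing a separate, contained problem.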
7. Code Review Agent
Difficulty: Intermediate | Time: 6–8 hours | Stack: Python, Anthropic API, subprocess
What it does: Reads a directory of code files, reviews them for common issues (security vulnerabilities, performance problems, code smell), and generates a structured review report in Markdown.
Why it is useful: Tool use patterns, multi-file context management, and structured output. This is a real agent — it reads files, decides what to examine more closely, and synthesizes findings.
Starting point:
import os
from pathlib import Path
from anthropic import Anthropic

client = Anthropic()

TOOL_SCHEMAS = [{
    "name": "read_file",
    "description": "Read the contents of a source file for review",
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "File path relative to project root"}
        },
        "required": ["path"]
    }
}]

def read_file(path: str) -> str:
    try:
        return Path(path).read_text()
    except Exception as e:
        return f"Error reading file: {e}"

def review_project(project_path: str) -> str:
    # List all Python files
    py_files = [str(p.relative_to(project_path))
                for p in Path(project_path).rglob("*.py")]
    messages = [{
        "role": "user",
        "content": f"""Review the Python project at {project_path}.

Files available: {py_files}

Read the most important files and identify:
1. Security issues (input validation, SQL injection, hardcoded secrets)
2. Performance problems
3. Code quality issues
4. Missing error handling

Use the read_file tool to examine files. Focus on the highest-risk files first."""
    }]

    # Agent loop: keep calling the model until it stops requesting tools
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=2048,
            tools=TOOL_SCHEMAS,
            messages=messages
        )
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason == "end_turn":
            # Done: collect the final review text
            return "\n".join(block.text for block in response.content
                             if hasattr(block, "text"))
        # Execute each requested tool call and feed the results back
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = read_file(os.path.join(project_path, block.input["path"]))
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result
                })
        messages.append({"role": "user", "content": tool_results})
8. Meeting Transcript Summarizer
Difficulty: Beginner | Time: 2–3 hours | Stack: Python, Anthropic API, Whisper (optional)
What it does: Takes a meeting transcript (text or audio file) and outputs: key decisions made, action items with owners, unresolved questions, and a 3-sentence summary.
Why it is useful: Prompt engineering for structured extraction, handling long documents, and an immediately practical tool. If you add Whisper, you handle audio → text → structured output in one pipeline.
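One snag you will hit with structured extraction: models sometimes wrap their JSON in markdown fences even when told not to. A small helper worth writing once (the extraction schema below is an illustrative choice, not a required format):

```python
import json

EXTRACTION_PROMPT = """Extract from this meeting transcript. Return JSON only:
{"decisions": [...], "action_items": [{"task": "...", "owner": "..."}],
 "open_questions": [...], "summary": "three sentences"}"""

def parse_model_json(text: str) -> dict:
    """Strip optional markdown code fences before parsing JSON output."""
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # Drop the opening fence line (e.g. ```json) and the closing fence
        cleaned = cleaned.split("\n", 1)[1]
        cleaned = cleaned.rsplit("```", 1)[0]
    return json.loads(cleaned)

print(parse_model_json('```json\n{"decisions": ["ship Friday"]}\n```'))
# {'decisions': ['ship Friday']}
```

The same helper slots straight into Projects 1 and 6 as well.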
9. Personalized Learning Path Generator
Difficulty: Intermediate | Time: 5–6 hours | Stack: Python, FastAPI, Anthropic API
What it does: A web API where users input their current skills, their goal skill, and their time budget. The system generates a week-by-week learning plan with specific resources.
Why it is useful: Building an API around an LLM, system prompt engineering, and multi-turn conversation state. A genuine product you could ship.
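The multi-turn state piece is the part most tutorials skip. A minimal in-memory session store, stdlib only; the message dicts mirror the shape the Anthropic API expects, but the store itself is an assumption you would replace with Redis or SQLite in anything real:

```python
from collections import defaultdict

class SessionStore:
    """Keeps per-user message history so each API call can replay the
    full conversation. In-memory only; lost on restart."""
    def __init__(self, max_turns: int = 20):
        self.max_turns = max_turns
        self.sessions = defaultdict(list)

    def add(self, user_id: str, role: str, content: str):
        self.sessions[user_id].append({"role": role, "content": content})
        # Trim oldest turns to keep the replayed prompt within context limits
        self.sessions[user_id] = self.sessions[user_id][-self.max_turns:]

    def history(self, user_id: str) -> list[dict]:
        return list(self.sessions[user_id])

store = SessionStore(max_turns=4)
store.add("u1", "user", "I know Python, want to learn ML in 6 weeks")
store.add("u1", "assistant", "Week 1: linear regression...")
print(len(store.history("u1")))  # 2
```

Each FastAPI endpoint then just appends the incoming message, replays history() to the model, and appends the reply.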
10. Automated Test Case Generator
Difficulty: Intermediate | Time: 4–6 hours | Stack: Python, Anthropic API
What it does: Takes a Python function or class, analyzes its behavior, and generates pytest test cases covering happy path, edge cases, and error conditions. Writes the test file directly to disk.
Why it is useful: Code analysis with LLMs, structured code generation, and a tool that immediately makes your development workflow better.
Starting point:
from anthropic import Anthropic
from pathlib import Path

client = Anthropic()

def generate_tests(source_file: str, function_name: str) -> str:
    source = Path(source_file).read_text()
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=2048,
        messages=[{
            "role": "user",
            "content": f"""Generate pytest test cases for the function '{function_name}' in this file.

Source code:
{source}

Generate tests for:
1. Normal/happy path inputs
2. Edge cases (empty input, zero, None, boundary values)
3. Expected exceptions and error conditions

Return only valid Python code with proper imports. Use descriptive test function names."""
        }]
    )
    test_code = response.content[0].text
    output_path = Path(source_file).parent / f"test_{Path(source_file).name}"
    output_path.write_text(test_code)
    print(f"Tests written to {output_path}")
    return test_code
How to Pick Your First Project
If you have never called an LLM API before: start with Project 1 (sentiment analyzer) or Project 3 (CLI writing assistant). They are short, immediate, and teach the core pattern.
If you are comfortable with APIs and want to learn RAG: Project 2 (PDF chatbot) or Project 5 (knowledge base) are the right next steps.
If you want to build something agent-based: Project 7 (code review agent) is the best introduction to tool-calling loops.
If you want something you will actually use at work this week: Project 4 (PR summarizer) or Project 8 (meeting summarizer).
From Projects to Real Systems
These projects teach the foundational patterns — structured output, RAG, tool use, streaming, multi-step agents. Once you understand how these patterns work, you can build production-grade AI systems.
MindloomHQ's Agentic AI course covers all of these patterns systematically, starting from the foundations and building up to multi-agent systems and production deployment. Phases 0 and 1 are completely free — no credit card, no trial period.
Start building this weekend.