Unlock Your Next Read: AI Agent-Powered Book Recommendations From Your Bookshelf

Transform a bookshelf photo into your personalized reading guide.

Patrick Kalkman

Jan 21, 2025 — 14 min read

Take a picture of your bookshelf to generate recommendations, image by Midjourney

Have you ever stared at your bookshelf, wondering why the recommendations never match your taste? I used to spend hours logging my books, hoping for brilliant suggestions. Instead, I got more of the same or things that felt random. I knew there had to be a better way.

Imagine standing in front of your bookshelf, each title whispering secrets of worlds unknown, yet feeling utterly stuck on what to read next.

Your book recommendations should be more like keys — they should unlock worlds you never knew existed or ideas you never considered. So often, the recommendations miss the mark, sending you down paths you never wanted to explore.

That’s when it hit me: what if your bookshelf could become a reading guide?

With my background in creating AI agents, I started building Shelf Genius. This AI Agent turns a simple photo of your bookshelf into a trove of tailored recommendations.

In this article, I’ll show you how it works. You’ll see how a simple photo can unlock a world of resonant, personalized recommendations.

Ready to experience Shelf Genius in action? A complete walkthrough is in the “Seeing Shelf Genius in Action” section below.

The full implementation, including source code and documentation, is available in the Shelf Genius GitHub repository.

The Shelf Genius approach

Imagine if your bookshelf could talk to you. That’s the idea behind Shelf Genius. You snap a photo, and it becomes your guide to your next reading adventure.

It might sound simple — take a photo, get some ideas — but the technical side of it was pretty complex. It required solving some really interesting problems.

Shelf Genius works by using four mini-systems: one to get the photo ready, one to figure out which books are there, one to learn about each book, and finally, one that recommends your next great read. It’s all about bridging the gap between your bookshelf and your reading preferences.

Flowchart showing book recommendation process: User uploads image, which goes through Pre Processing, GPT-4o-mini Vision, Google Books API, and GPT-4o-mini Recommendation to produce final recommendation. — The core process of Shelf Genius, image by author

Let me walk you through how it works.

Image preprocessing

Okay, before any of the AI magic happens, we need to get your bookshelf photo ready. Think of it like giving your photo a quick makeover.

First, I make sure the photo is easy for the AI to read: the pipeline converts it to grayscale, boosts the contrast, and resizes it so it’s not too big.

This isn’t just about ensuring the AI can see the books better. It’s also about making things efficient and not wasting money on expensive AI processing. That means Shelf Genius stays useful for everyone every day.

Identifying books

Okay, now for the real challenge: teaching the AI to see and read the titles on your bookshelf. I tried several different methods — some local AI models and even those OCR tools that read text from images. However, it turned out that the vision skills of GPT-4o-mini worked the best.

I tried faster, cheaper options, but nothing could quite match how well GPT-4o-mini pulled titles and authors from the photos. Sometimes, the most straightforward answer isn’t what you expect.

Gather metadata

But finding the books is just the first step. Shelf Genius needs to know your books inside and out to make excellent recommendations.

That’s where the Google Books API comes in. For each book I find, I collect many extra details— like genres, summaries, and publication dates — so I can build a profile of your reading tastes.

Generate recommendation

Finally, with all that information, GPT-4o-mini acts as a recommendation engine. Unlike those simple algorithms that match books by basic categories, our AI looks at the complete picture.

It considers themes, writing styles, and all those subtle connections between different books to suggest something you’ll love.

Technical architecture

Our AI agent follows a workflow-based architecture with limited agency. We built it using LangGraph and LangChain as the core frameworks.

The major advantage of the workflow approach is predictability: the steps run in a fixed order, the data moves through one shared state, and each step has a single job, whether that is calling the AI or an outside information source.

Flowchart diagram showing data flow between four processing nodes (image, recognition, lookup, and recommendation) and central Agent State, with outputs at each stage including base64 image, book data, and final recommendation. — LangGraph nodes of Shelf Genius storing information in shared state, image by author

Node-based architecture

We implemented those four specialized systems we discussed earlier as individual LangGraph nodes.

process_image_node: This step gets your photo ready for analysis
book_recognition_node: This step uses the AI to find the book titles and authors
book_lookup_node: This step collects additional information about each book from Google
book_recommendation_node: This step uses the AI to give you an excellent recommendation.

Each step works independently but shares information through a central “data hub,” which keeps everything organized. This keeps the design modular while still letting each step build on the results of the previous one.
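To make the shared-state idea concrete, here is a minimal plain-Python sketch of four nodes that each read the state and return a partial update. It deliberately does not use LangGraph, the node bodies are placeholders, and the state keys (`image_base64`, `recognized_books`, `book_metadata`, `recommendation`) are my assumptions for illustration rather than the exact names in the repository.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Node = Callable[[State], State]


def process_image_node(state: State) -> State:
    # Placeholder: the real node validates, resizes, and encodes the photo.
    return {"image_base64": "<encoded image>"}


def book_recognition_node(state: State) -> State:
    # Placeholder: the real node sends the encoded image to the vision model.
    return {"recognized_books": [{"title": "The Great Gatsby", "author": "F Scott Fitzgerald"}]}


def book_lookup_node(state: State) -> State:
    # Placeholder: the real node queries the Google Books API once per book.
    return {"book_metadata": [{"categories": ["Fiction"]} for _ in state["recognized_books"]]}


def book_recommendation_node(state: State) -> State:
    # Placeholder: the real node asks the model for one recommendation.
    return {"recommendation": {"title": "The Design of Everyday Things", "author": "Don Norman"}}


def run_pipeline(nodes: List[Node], state: State) -> State:
    """Run the nodes in order, merging each partial update into the shared state."""
    for node in nodes:
        state = {**state, **node(state)}
    return state


final_state = run_pipeline(
    [process_image_node, book_recognition_node, book_lookup_node, book_recommendation_node],
    {"image_path": "bookshelf.jpg"},
)
print(sorted(final_state))
```

Because every node only returns the keys it is responsible for, a node can be swapped or tested on its own without touching the others.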

The central data hub

This “data hub” — we call it the AgentState — keeps track of all the important stuff. It contains:

Your prepared photo
The list of books it recognized
All the book details
Your final recommendation

This data hub is key to ensuring all the data is consistent and allows me to monitor and fix things if something goes wrong.
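A state like this could be described with a `TypedDict`. The sketch below is only an illustration: the field names and the extra `error` field are my assumptions, not the exact definition from the repository. The small helper shows how a shared state makes it easy to see which step has not produced its output yet.

```python
from typing import Any, Dict, List, Optional, TypedDict


class AgentState(TypedDict, total=False):
    image_base64: str                        # prepared photo, base64-encoded
    recognized_books: List[Dict[str, str]]   # title/author pairs from the vision step
    book_metadata: List[Dict[str, Any]]      # details from the Google Books API
    recommendation: Dict[str, str]           # final pick with its reasoning
    error: Optional[str]                     # hypothetical field for debugging


def missing_fields(state: AgentState) -> List[str]:
    """Return the names of pipeline outputs that are not in the state yet."""
    expected = ["image_base64", "recognized_books", "book_metadata", "recommendation"]
    return [name for name in expected if name not in state]


state: AgentState = {"image_base64": "abc", "recognized_books": []}
print(missing_fields(state))  # ['book_metadata', 'recommendation']
```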

In the following sections, we’ll dive deeper into how each step works, starting with the part that processes the image.

Process image node

The first step is ensuring my AI agent can see the spines on your books. Let’s dive into the “image prep” step — it’s like giving my AI a new pair of reading glasses, ensuring it’s looking at things just right.

The process_image_node takes your bookshelf photo and optimizes it in three key ways:

1. Checking the photo: First, I need to be sure your photo is in a format I can use, like a JPEG or PNG. This is what that looks like in the code:

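from PIL import Image, ImageOps  # Pillow; used by the snippets in this section

def validate_image_format(image_path: str) -> bool:
    valid_formats = ["JPEG", "PNG"]
    try:
        with Image.open(image_path) as img:
            return img.format in valid_formats
    except Exception:
        # Unreadable or non-image files count as invalid
        return False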

2. Resizing Smartly: If your photo is huge, I’ll make it smaller while keeping everything in proportion. This helps keep processing fast while ensuring we don’t lose any critical details.

def resize_image(img: Image.Image, max_dimension: int = 1024) -> Image.Image: 
    width, height = img.size 
    if width > max_dimension or height > max_dimension: 
        scale = max_dimension / max(width, height) 
        new_size = (int(width * scale), int(height * scale)) 
        return img.resize(new_size, Image.Resampling.LANCZOS) 
    return img

3. Making text pop: Now for the magic part! I convert the image to grayscale (which can remove some distractions) and boost the contrast, so the text on the book spines is much easier for the AI to read. I do all that with the Pillow (PIL) image library.

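def optimize_image_for_recognition(img: Image.Image) -> Image.Image:
    # Drop the color information
    img = ImageOps.grayscale(img)
    # Stretch the contrast, ignoring the darkest and lightest 0.5% of pixels
    img = ImageOps.autocontrast(img, cutoff=0.5)
    return img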
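To make the resizing arithmetic from step 2 concrete, here is a small standalone sketch that computes only the target size, without touching any image. The helper name `scaled_size` is hypothetical; it mirrors the scaling logic shown above.

```python
from typing import Tuple


def scaled_size(width: int, height: int, max_dimension: int = 1024) -> Tuple[int, int]:
    """Return the size after shrinking so the longest side fits max_dimension."""
    if width <= max_dimension and height <= max_dimension:
        return width, height  # small images are left untouched
    scale = max_dimension / max(width, height)
    return int(width * scale), int(height * scale)


print(scaled_size(2048, 1536))  # (1024, 768)
print(scaled_size(800, 600))    # (800, 600)
```

A 2048x1536 photo is scaled by 0.5, so the 4:3 proportions survive and the longest side lands exactly on the limit.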

Keeping track of things

This step also keeps track of all the essential image info: the original size and format, the processed image data, and a base64-encoded version of the image. All of this is stored in the shared state and passed to the next node in our pipeline, which reads the book titles.
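The encoding itself is plain base64. As a minimal sketch (the helper name `to_data_url` is my own, and the three bytes are just a stand-in for a real JPEG file), this is how image bytes can be turned into the kind of data URL a vision model accepts:

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Stand-in bytes: the first three bytes of a JPEG file.
url = to_data_url(b"\xff\xd8\xff")
print(url)  # data:image/jpeg;base64,/9j/
```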

So now that we’ve made your photo easy to read, how does the AI find those books? The following section will dive into how the ‘book spotter’ step works its magic.

Book recognition node

Let me tell you, this was one of the trickiest parts of this entire project — teaching an AI to “read” the spines of your books. I spent days banging my head against the wall until I found an approach that worked.

The magic behind book spotting

The secret sauce here is GPT-4o-mini. It’s a multi-talented AI that can understand both pictures and text. Think of it like an AI that can both see and read, which is precisely what we need for scanning bookshelves. It is also a cost-effective model.

Giving the AI instructions

Here’s how we tell the AI what we need:

BOOK_RECOGNITION_TEMPLATE = """You are a book recognition system analyzing bookshelf images. Your task is to identify books from the image and return ONLY a valid JSON response. 
 
RESPONSE FORMAT REQUIREMENTS: 
1. You must return ONLY valid JSON 
2. The JSON must contain a "books" array 
3. Each book in the array MUST have both "title" and "author" fields 
4. Use simple double quotes for strings, never nested quotes 
5. Remove special characters from titles and authors 
6. Return empty books array if no books are clearly visible 
"""

Note: We need to be super specific with the AI; without these rules, it goes off and does all kinds of things we don’t need.

Showing the AI your bookshelf

To make this work, we need to send both your photo and these special instructions together in just the right way.

messages = [ 
    { 
        "role": "user", 
        "content": [ 
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{state['image_base64']}"}}, 
            {"type": "text", "text": BOOK_RECOGNITION_TEMPLATE}, 
        ], 
    } 
]

Note: Sending the photo and these instructions together is crucial. If we send them separately, it doesn’t work nearly as well.

What success looks like

When everything works right, the AI will return a clean set of results like this:

{ 
    "books": [ 
        { 
            "title": "The Great Gatsby", 
            "author": "F Scott Fitzgerald" 
        } 
    ] 
}

Note: No extra stuff, just the data we need.

When things go wrong

To keep things safe, I built in a few checks.

try: 
    result = JsonOutputParser().parse(response.choices[0].message.content) 
     
    if not isinstance(result, dict) or "books" not in result: 
        raise ValueError("Invalid response format from LLM") 
except Exception as e: 
    logger.error(f"Error in book recognition node: {str(e)}")

Note: This code checks that the AI gave us valid results. If it didn’t, the error is logged so I can see what went wrong.
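If you want to go one step further and check every entry, a stricter validation could look like the sketch below. It uses only the standard library, the function name `parse_books` is hypothetical, and dropping entries without an author is a design choice that follows rule 3 of the prompt, not necessarily what the repository does.

```python
import json
from typing import Dict, List


def parse_books(raw: str) -> List[Dict[str, str]]:
    """Parse the model reply and keep only entries with a title and an author."""
    result = json.loads(raw)
    if not isinstance(result, dict) or not isinstance(result.get("books"), list):
        raise ValueError("Invalid response format from LLM")
    books = []
    for entry in result["books"]:
        if not isinstance(entry, dict):
            continue  # skip anything that is not a JSON object
        title = str(entry.get("title") or "").strip()
        author = str(entry.get("author") or "").strip()
        if title and author:
            books.append({"title": title, "author": author})
    return books


reply = '{"books": [{"title": "The Great Gatsby", "author": "F Scott Fitzgerald"}, {"title": "Untitled"}]}'
print(parse_books(reply))
```

The second entry has no author, so only the first book survives.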

A few things I learned along the way

I had to set the AI’s “creativity level” low (temperature=0.1) so it would read the titles and not try to make things up.
I also had to give it plenty of room (max_tokens=4096), because some bookshelves are packed!
Finally, I always had to validate the results with the JSON parser, even when the output looked perfect.

Now that we have a list of the books on your shelf, how do we turn that into great recommendations? The next step is to dive into each book and gather all the extra details using the Google Books API.

Book details: digging deeper with the Google Books API

Have you ever heard about a book and instantly wanted to know everything? That’s exactly what this step does. Now that my AI knows your book titles and authors, it’s time to find all the details. Let me show you how we get that info.

The heart of the book’s story

Whenever someone snaps a photo of their bookshelf, we end up with a list of titles and authors. But that’s just the beginning. To make those great recommendations, we need to know everything: genres, publication dates, summaries — all the good stuff!

Here’s how we dig deeper using the free Google Books API:

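import requests

def get_book_metadata(title: str, author: str) -> dict:
    # Join search terms with a space; requests encodes it as "+" in the URL
    query = f"intitle:{title}"
    if author:
        query += f" inauthor:{author}"
    response = requests.get(
        "https://www.googleapis.com/books/v1/volumes",
        params={"q": query},
        timeout=10,  # don't hang forever on a slow API
    )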

Note: We create a specific search query for each book using the title and author. Including the author when we have it makes a huge difference in finding the right book.
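To see what actually goes over the wire, here is a standard-library sketch that builds the search URL by hand. The helper name `build_search_url` is hypothetical. Note that it joins the search terms with a space, because URL encoding turns spaces into `+`.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.googleapis.com/books/v1/volumes"


def build_search_url(title: str, author: str = "") -> str:
    """Build a Google Books volumes search URL for a title and optional author."""
    terms = [f"intitle:{title}"]
    if author:
        terms.append(f"inauthor:{author}")
    # Terms are joined with a space; urlencode turns spaces into '+'.
    return f"{BASE_URL}?{urlencode({'q': ' '.join(terms)})}"


print(build_search_url("The Great Gatsby", "F Scott Fitzgerald"))
# https://www.googleapis.com/books/v1/volumes?q=intitle%3AThe+Great+Gatsby+inauthor%3AF+Scott+Fitzgerald
```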

Handling the API’s response

The Google Books API sends us back a treasure chest of information. But just like a real treasure hunt, sometimes we come back empty-handed, so we must be careful. We built in some safety nets to manage those times:

items = response.json().get("items", []) 
if items: 
    return items[0].get("volumeInfo", {}) 
return {}

Note: This checks if the API found information about a book. If not, we will still move on, but log what happened.

I learned this the hard way: We can’t assume we’ll get every piece of information for every book. If the API doesn’t have all the details, that’s okay. It is better to have some information than none.
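One way to live with incomplete data is to read every field with a default. The sketch below is an illustration with a hypothetical helper name. The keys on the right (`title`, `authors`, `categories`, `description`, `publishedDate`) are fields of the API’s `volumeInfo` object, while the output keys are my own choice.

```python
from typing import Any, Dict


def summarize_volume(volume_info: Dict[str, Any]) -> Dict[str, Any]:
    """Pick the fields used for recommendations, with defaults for missing ones."""
    return {
        "title": volume_info.get("title", ""),
        "authors": volume_info.get("authors", []),
        "categories": volume_info.get("categories", []),
        "description": volume_info.get("description", ""),
        "published_date": volume_info.get("publishedDate", ""),
    }


partial = {"title": "The Great Gatsby", "authors": ["F Scott Fitzgerald"]}
summary = summarize_volume(partial)
print(summary["categories"])  # []
```

Downstream code can then rely on every key being present, even when the API returned only a title.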

Putting it all together: building the book profiles

For each book we found, we’ll now use the Google API to grab those extra details, building a complete profile, like a detective gathering clues:

for book in state.get("recognized_books", []): 
    title = book.title 
    author = book.author 
    if author: 
        logger.info(f"Looking up book: {title} by {author}") 
    else: 
        logger.info(f"Looking up book: {title}") 
    metadata = get_book_metadata(title, author)

Note: This code runs the Google API search for each book on your shelf.

When things go wrong (API edition)

Let’s face it — APIs can be a little unreliable. Sometimes they’re slow, sometimes down, and sometimes they don’t have the information we need. That’s why:

We log everything (trust me, you’ll thank yourself later).
We never let one failed lookup stop the entire process. We keep going!
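Both habits fit in a few lines. In this sketch the real API call is replaced by a hypothetical `flaky_lookup` that fails for one title, and `lookup_all` is an illustrative name, not a function from the repository.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger("shelf_genius_sketch")


def lookup_all(books: List[Dict[str, str]], lookup: Callable[[str, str], dict]) -> List[dict]:
    """Look up every book; a failure is logged and recorded as empty metadata."""
    results = []
    for book in books:
        try:
            results.append(lookup(book["title"], book.get("author", "")))
        except Exception as exc:  # broad on purpose: one bad lookup must not stop the run
            logger.warning("Lookup failed for %r: %s", book["title"], exc)
            results.append({})
    return results


def flaky_lookup(title: str, author: str) -> dict:
    # Hypothetical stand-in for the real API call, failing for one title.
    if title == "Broken":
        raise TimeoutError("simulated timeout")
    return {"title": title}


found = lookup_all([{"title": "The Great Gatsby"}, {"title": "Broken"}, {"title": "Emma"}], flaky_lookup)
print(found)  # [{'title': 'The Great Gatsby'}, {}, {'title': 'Emma'}]
```

The failed lookup leaves an empty placeholder, so the results still line up with the list of recognized books.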

Now that we’ve gathered all those extra details about your books, the real magic begins: transforming that information into a book recommendation that’s perfect for you.

Recommendation time: Picking your next great read

Have you ever wondered how those great bookstore owners seem to know exactly what you’ll love next? That’s what we’re teaching our AI to do here. Trust me; making great recommendations is more art than science!

The secret ingredient: Finding unexpected connections

Here’s the thing about recommendations: it’s not just about matching genres. It’s about finding those unexpected connections that make you go, “Huh, I never thought of that!”

Let me show you the special instructions I give my AI for recommendations:

RECOMMENDATION_TEMPLATE = """You are a book recommendation system analyzing a user's bookshelf. 
Based on the books they own and the detailed metadata about these books, recommend ONE NEW book they might enjoy. 
 
IMPORTANT: 
- Do NOT recommend any books that are already on their shelf 
- Do NOT recommend books that are too similar to what they already have 
- Look for interesting connections between their diverse interests 
- Try to recommend something that bridges multiple interests 
"""

Note: This specific set of instructions tells the AI what it should and should not do.

See, we’re not just looking for more of the same. If someone has a bunch of books on Python, we won’t just recommend another Python book. Instead, we’re looking for interesting links, such as:

Technical books + business books = Maybe a book about tech startups
Programming + design books = How about some creative coding?

Preparing the AI: Building Your Book Profile

To make an excellent recommendation, we must first help the AI see your bookshelf as you do. That means giving the AI a clear summary of your books:

def format_book_list(state: ShelfGeniusState) -> str:
    books = state.get("recognized_books", [])
    return "\n".join(f"- {book.title} by {book.author}"
                     for book in books if book.title and book.author)

Note: First, we list your books, noting the titles and authors.

And then, we go deeper and feed it the metadata we previously collected.

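def format_metadata(state: ShelfGeniusState) -> str:
    # Assumption: the looked-up details are stored under "book_metadata"
    metadata = state.get("book_metadata", [])
    sections = []
    for book in metadata:
        info = []
        if "categories" in book:
            info.append(f"Categories: {', '.join(book['categories'])}")
        if "description" in book:
            # A little trick I learned: truncate long descriptions
            desc = book["description"]
            if len(desc) > 200:
                desc = desc[:200] + "..."
            info.append(f"Description: {desc}")
        sections.append("\n".join(info))
    return "\n\n".join(sections)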

Note: This is where we add all those juicy details we grabbed from the Google Books API, like categories and descriptions.

Setting the AI’s creativity level

Now for the fun part. We set our AI’s “creativity level” just right:

response = client.chat.completions.create( 
    model="gpt-4o-mini", 
    messages=messages, 
    max_tokens=1000, 
    temperature=0.7,  # The secret sauce for creative recommendations 
)

Why 0.7 temperature? Too low, and you’ll get boring, obvious recommendations. Too high, and the AI might suggest a cookbook to someone who only reads sci-fi!

It’s all about finding that sweet spot where the recommendations are interesting and relevant to your reading tastes.

When things get a little creative

AIs can sometimes get creative, so I’ve built a few safety nets. Here’s how I catch any unexpected results:

try: 
    # First try direct JSON parsing 
    result = json.loads(response_content) 
except json.JSONDecodeError: 
    # If that fails, try the langchain parser 
    result = JsonOutputParser().parse(response_content)

Note: This code ensures the AI’s response is valid JSON, which we can process.
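If you’d rather not depend on the langchain parser for the fallback, a standard-library stand-in could look like this. It is a simplified sketch with a hypothetical function name, not the repository’s code: it tries direct parsing first and then falls back to the outermost `{...}` span in the reply.

```python
import json
import re


def parse_json_reply(text: str) -> dict:
    """Parse JSON directly, falling back to the outermost {...} span in the text."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        match = re.search(r"\{.*\}", text, re.DOTALL)
        if match is None:
            raise ValueError("No JSON object found in model reply")
        return json.loads(match.group(0))


reply = 'Sure! Here is my pick: {"recommendation": {"title": "The Design of Everyday Things"}} Enjoy!'
print(parse_json_reply(reply))
```

This covers the common case where the model wraps otherwise valid JSON in a friendly sentence. Anything more broken than that still raises an error.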

The result

In the end, you get one thoughtful recommendation like this:

{ 
    "recommendation": { 
        "title": "The Design of Everyday Things", 
        "author": "Don Norman", 
        "reasoning": "Given your interest in both programming and user experience..." 
    } 
}

This isn’t a list of ten books you might like, but one book carefully chosen based on everything we know about your reading taste.

Want to know the best part? When it works just right, it feels like magic! You get a recommendation that makes you think, “How did it know I would be interested in that?” That’s when you know you’ve built something special.

Seeing Shelf Genius in action

You know that feeling when you’re staring at your bookshelf, wishing it could tell you what to read next? Let me show you how to make that happen.

I’ll walk you through setting up Shelf Genius on your machine. Don’t worry if you’re not a tech wizard — I’ll guide you through each step.

Getting your environment ready

First things first: we need to install uv. It’s this fantastic tool that makes managing Python packages a breeze. Just one command and you’re set:

curl -LsSf https://astral.sh/uv/install.sh | sh

Now, let’s grab Shelf Genius from GitHub. Open your terminal and type:

git clone https://github.com/PatrickKalkman/shelf-genius.git 
cd shelf-genius

The secret sauce: your OpenAI key

Here’s something important: Shelf Genius needs an OpenAI API key to work its magic. Think of it as the key that unlocks all that AI goodness. Here’s how to set it up:

Head over to OpenAI’s platform and grab your API key
Create a file called .env in your Shelf Genius folder
Add your key like this:

echo "OPENAI_API_KEY=your-key-goes-here" > .env

Let’s make some magic!

Now for the fun part. Got a photo of your bookshelf? Great! Here’s how to get your personalized recommendations:

uv run python ./src/shelf_genius/workflow.py --image-path /path/to/your/bookshelf/photo.jpg

Pro tip: Make sure your bookshelf is well lit and the photo isn’t too chaotic — mine had a plant photobombing the AI!

Here’s what will happen:

The AI will scan your photo
It’ll identify the books it can see
It’ll think about what you might like based on your collection
And finally… it’ll suggest your next great read!

Terminal output showing Shelf Genius processing a bookshelf image, recognizing 4 technical books, and recommending ‘Creative Code’ by Golan Levin based on the user’s interest in programming and design. — Shelf Genius in action, image by author

Troubleshooting

If something’s not working quite right, here are a few things to check:

Did you include the full path to your image?
Is your API key correctly set in the .env file?
Is your photo clear enough? (The AI isn’t great with blurry shots)

Want to see the AI’s thought process? Add --verbose to your command:

uv run python ./src/shelf_genius/workflow.py --image-path /path/to/photo.jpg --verbose

And remember: if you run into trouble, the error messages are helpful. They’ll tell you what went wrong and point you toward a fix.

Transform your bookshelf into a personalized reading guide today: snap a photo and unlock your next favorite book!

Taking Shelf Genius to the next level: It can be even better!

After working with Shelf Genius for a while now, I’ve got some ideas for making it even better, cheaper, and smarter. Let me tell you what I’m thinking.

Bringing the smarts home (and cutting costs)

You know what’s wild? AI that runs on your own computer is getting really good. Like, scarily good! Instead of sending every photo to OpenAI (and watching those API costs add up!), we could:

Use LLaVA or similar local AI models to find your books
Run Mixtral locally to understand your books and make recommendations
Keep using the Google Books API calls (because, hey, that part’s free!)

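Think about it: no per-photo API costs, potentially faster responses, and book recognition and recommendations that keep working even if your internet is acting up (only the Google Books lookup still needs a connection).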

Giving the AI Agent more control

But here’s what gets me excited. What if we let the AI take more control over the process? Instead of hard-coding every step, we could have a “director” AI that:

Decides when to try a different approach to recognizing books
Figures out which book details are most helpful
Gets more creative with recommendations when it spots interesting patterns

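Here’s a rough idea of what that could look like in code:

class ShelfGeniusDirector:
    def __init__(self, local_vision_model, google_books_api, local_llm):
        # Tools the director can choose between (all placeholders for now)
        self.tools = {
            "detect_books": local_vision_model,
            "get_metadata": google_books_api,
            "analyze_collection": local_llm,
        }

    def orchestrate(self, image):
        # Let the AI decide what to do next!
        raise NotImplementedError("The director logic is still an idea")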
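To give a feel for the control loop, here is a runnable toy version. Everything in it is hypothetical: the tools are lambdas returning canned data, and the “decision” is a simple rule (`rule_based_choice`) standing in for what would be a call to a model in a real agent.

```python
from typing import Any, Callable, Dict, Optional

State = Dict[str, Any]
Tool = Callable[[State], State]


class DirectorSketch:
    def __init__(self, tools: Dict[str, Tool], choose: Callable[[State], Optional[str]]):
        self.tools = tools
        self.choose = choose  # in a real agent, this decision would come from an LLM

    def orchestrate(self, state: State, max_steps: int = 10) -> State:
        """Repeatedly ask the chooser for the next tool until it returns None."""
        for _ in range(max_steps):
            name = self.choose(state)
            if name is None:
                break
            state = {**state, **self.tools[name](state)}
        return state


def rule_based_choice(state: State) -> Optional[str]:
    # Hypothetical stand-in for the "director" model: pick the first missing output.
    steps = [("books", "detect_books"), ("metadata", "get_metadata"), ("analysis", "analyze_collection")]
    for key, tool in steps:
        if key not in state:
            return tool
    return None


tools: Dict[str, Tool] = {
    "detect_books": lambda s: {"books": ["The Great Gatsby"]},
    "get_metadata": lambda s: {"metadata": [{"title": b} for b in s["books"]]},
    "analyze_collection": lambda s: {"analysis": f"{len(s['books'])} book(s) analyzed"},
}

result = DirectorSketch(tools, rule_based_choice).orchestrate({"image": "bookshelf.jpg"})
print(result["analysis"])  # 1 book(s) analyzed
```

The `max_steps` cap is a small safety net: if the chooser never returns `None`, the loop still ends.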

Why this matters (and why it’s exciting!)

Look, I’ve learned that keeping API costs down is essential. You shouldn’t have to worry about those costs! By making these changes, Shelf Genius could be:

More accessible (run it as much as you want!)
More flexible (the AI can adapt to your specific bookshelf)
More future-proof (we can easily swap in better local models as they come out)

Want to help make this happen? The code’s on GitHub, and I’d love to see what you can do with these ideas. Because everyone deserves an innovative book recommendation system — without breaking the bank.