Sarah Glasmacher

Kicking Off 2025 with Cautious Goals and a Local RAG Project 🧠


JAN 09



Happy new year everyone! 🎆 (How long into January are we still allowed to say that?)

New Year’s resolutions and I have a complicated relationship. On one hand, I love setting new goals. On the other hand, I have disappointed myself too many times by not reaching them - and it gets even more embarrassing to share those failures online. So if you also have a hard time with all these posts celebrating last year’s accomplishments and setting super high goals that feel out of reach, you’re not alone. 🤗

I hope by sharing my thoughts on my tentative goals, it might help someone reflect on what "right" goals look like for them personally.

Old Resolutions: What Worked and What Didn’t

Last year, I set myself the goal of reaching my first pull-up and becoming fit. Today, I’ve probably never been less fit in my entire life. Clearly, that didn’t work out (literally). 💪❌😅

However, I don’t fail at every yearly goal I set. For example, I’ve had a yearly reading goal for six years now, and I’ve stuck to it almost every time. In 2024, I read 24 books as planned. Did I read eight of them in December? Absolutely. 🫣

A similar success story was when I challenged myself to paint one acrylic artwork every month for a year. I often left it until the last second and had to catch up, but I finished the challenge. 🤷🏻‍♀️

The lesson? I need goals that I can catch up on. I need wiggle room. While I haven’t fully figured out how to apply this to every area I want to improve, I’ve set one goal for 2025 that incorporates these insights:

Goal: I will work on one personal coding project every month of 2025. 🚀💻📈

💡📖 ✅ Why This Goal Works for Me (Hopefully)

This goal is inspired by three things that have worked for me in the past:

  1. Challenges Motivate Me: I love challenges. The idea of taking on something new and exciting gives me an initial energy boost. While I don’t always finish challenges, I wanted to leverage this enthusiasm to kickstart progress each month.
  2. Wiggle Room: I’ve structured this goal so that every small step counts as success. These are hobby projects - the goal isn’t to build a groundbreaking app every month but to learn something along the way. There are quick wins I can aim for at the last minute if needed, as well as stretch goals for when things go well.
  3. Monthly Resets: Even if one month doesn’t go as planned, I can start fresh the next month with a new project. Quarterly goals didn’t work for me in the past because I’d lose motivation midway through and waste weeks (because the streak was lost, the goal was no longer reachable and then why bother?). A monthly reset option fits my attention span better. 🎯🔄

January Project: A Local RAG System

Since it’s already January 7th as I write this, I’ve started my first project - and I’m happy to report some progress! 🛠️✨

What’s the Project?

I’m building a fully local Retrieval-Augmented Generation (RAG) system. This is inspired by the "build from scratch" learning approach I experienced in university and admired in popular ML educators like Sebastian Raschka (with his book "Build a Large Language Model (From Scratch)"). While most people shouldn’t reinvent the wheel by building a custom RAG system (or pretraining an LLM from scratch) for daily projects at work, I think this approach is invaluable for learning. It helps you understand how a system works, troubleshoot effectively, and experiment with different components.

🔍 For this project, for example, I want to explore how embedding models, chunking strategies, and vector databases influence retrieval and generation results.


Progress So Far: Building the Foundation 📋🧱

Here’s what I’ve done so far:

  • Created a script to process local markdown files, extract content, generate embeddings using OpenAI’s text-embedding-3-large, and store results in a Pandas DataFrame.
  • Set up a local Postgres database with pgvector in Docker for storing and querying embeddings (not connected to Python yet).
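As a rough illustration of the first step (not the actual project code), here is a minimal sketch of a markdown-to-embeddings script. The names `split_markdown`, `build_records`, and `fake_embed` are hypothetical, and splitting at blank lines is just one simple chunking choice. `fake_embed` is an offline placeholder; in a real run it would be swapped for a function that calls an embedding model such as text-embedding-3-large.

```python
import re
from pathlib import Path
from typing import Callable

# Any function that turns a piece of text into a vector of floats.
Embedder = Callable[[str], list[float]]


def split_markdown(text: str) -> list[str]:
    """Split markdown into chunks at blank lines, dropping empty pieces."""
    return [block.strip() for block in re.split(r"\n\s*\n", text) if block.strip()]


def build_records(folder: Path, embed: Embedder) -> list[dict]:
    """Read every .md file in `folder` and return one record per chunk."""
    records = []
    for path in sorted(folder.glob("*.md")):
        text = path.read_text(encoding="utf-8")
        for chunk_id, chunk in enumerate(split_markdown(text)):
            records.append(
                {
                    "file": path.name,
                    "chunk_id": chunk_id,
                    "text": chunk,
                    "embedding": embed(chunk),
                }
            )
    return records


def fake_embed(text: str) -> list[float]:
    """Placeholder so the sketch runs offline. NOT a real embedding."""
    return [float(len(text)), float(text.count(" "))]
```

Because each record is a plain dict, the list can be passed straight to `pandas.DataFrame(records)` to get one row per chunk. Passing the embedding function in as a parameter also makes it easy to compare different embedding models later without touching the rest of the script.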

What’s Next? 🚧🔍

Once the database connection is running, I’ll explore how to retrieve results and get a small working prototype going before I start replacing cloud components with local models. Then - if there is still time - I will start experimenting with techniques like chunking, adding context to embeddings, search techniques, and indices. These experiments will help me better understand the strengths and weaknesses of different approaches.
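To make "retrieving results" a bit more concrete: at its core, retrieval means ranking stored chunks by how similar their embeddings are to the embedding of the query. Below is a minimal brute-force sketch in plain Python, assuming records shaped like dicts with an `embedding` key (a hypothetical layout for illustration). A vector database like pgvector does this ranking inside the database instead, for example via its cosine distance operator `<=>`, optionally sped up by an index.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], records: list[dict], k: int = 3) -> list[tuple[float, dict]]:
    """Return the k records most similar to the query, best match first."""
    scored = [(cosine_similarity(query_vec, r["embedding"]), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

A brute-force baseline like this compares the query against every chunk, which is fine for a handful of notes and gives a reference point for judging whether an index returns the same results faster.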

Why I’m Sharing This & Where to Find More

I believe learning is best done out loud. By sharing my progress (and inevitable mistakes), I hope to encourage you to start your own intimidating projects, even if the path isn’t perfect. Every small step counts, and progress is progress - no matter how tiny. 💬🌱💡

You can find up-to-date information and the GitHub link to the code on my website, in the launch blog post and on the project page: https://sarahglasmacher.com/january-project-local-rag-system/ (I updated my website theme, I think it's so cute and cozy now 😍)

See you in next week's update (if I keep up with some of my ... more ambitious goals 😅🫣) 👋🏻
