Introducing ConvoDigest ✨💬

Hi, I’m Kedar, and today, I’m excited to share the story behind ConvoDigest, an app I built to tackle a problem I encountered in my everyday life. While exploring AI technologies like large language models (LLMs), retrieval-augmented generation (RAG), and embeddings, I realized that these tools could help me deal with a common issue.

As a committee member of a large housing society, I’m part of group chats that are flooded with messages every day. It became nearly impossible to stay on top of everything without spending hours sifting through the conversations. This is the problem ConvoDigest solves: it is a local-first tool that lets me (and others) summarize and query long, unstructured group chats while keeping the data entirely private.

The Problem

In large communities, group chats are a vital means of communication, but the sheer volume of messages—often scattered across multiple topics—makes it easy to lose track of important information. From minor maintenance issues to significant resident concerns, these chats provide valuable insights, but accessing that information amidst the clutter is challenging.

That’s where ConvoDigest comes in. It transforms raw conversations into actionable insights by generating summaries, identifying key points, and answering specific questions. For example, I can now ask, “What were the main issues discussed this week?” or “How many times was a certain issue mentioned?”—and get precise, contextually aware responses.

A Local-First, Privacy-Focused Solution

Privacy is a significant concern for many people, particularly when it comes to sensitive chat data. I knew that offloading this data to external servers would be off-putting for most people. Instead, ConvoDigest processes everything locally on your device, ensuring that your data never leaves your control.

How It Works

At its core, ConvoDigest combines Vectra (a vector database) with AI models like Llama 3.1 (8B parameters), served through Ollama, to process and analyze chat data. The chat is converted into embeddings, so the app can retrieve the messages most relevant to a question and pass them to the model to produce a meaningful response.
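To make the retrieval step concrete, here is a minimal sketch of ranking stored chunks by cosine similarity to a question. It is an in-memory stand-in for what a vector database does, not Vectra's API and not necessarily how ConvoDigest is implemented. The names `EmbeddedChunk`, `cosineSimilarity`, and `topK` are hypothetical, and the vectors are assumed to come from an embedding model.

```typescript
// Hypothetical shape: a piece of chat text plus its embedding vector.
interface EmbeddedChunk {
  text: string;
  vector: number[];
}

// Cosine similarity: 1 means same direction, 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // A zero vector has no direction, so treat it as unrelated.
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks whose vectors are closest to the query vector.
function topK(query: number[], chunks: EmbeddedChunk[], k: number): EmbeddedChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

In a setup like this, the text of the top-ranked chunks would be placed into the prompt alongside the user's question before it is sent to the model.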

The backend is built with Node.js and Express and the frontend with React, with an interface inspired by the minimal design of ChatGPT. Users upload their exported WhatsApp chat, and the app processes it locally, turning it into searchable summaries and insights.
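As an illustration of the first processing step, the sketch below turns an exported chat into structured messages. It assumes lines shaped like `12/31/23, 9:15 PM - Asha: The lift is stuck`, but the real export format varies by platform and locale, so the patterns are an assumption rather than a specification. `ChatMessage` and `parseChatExport` are hypothetical names, not taken from ConvoDigest.

```typescript
// Hypothetical message shape; the timestamp stays as raw text because
// date formats in exports depend on the phone's locale.
interface ChatMessage {
  timestamp: string;
  sender: string;
  text: string;
}

// Assumed header: "<date>, <time> - <rest of line>".
const HEADER_PATTERN = /^(\d{1,2}\/\d{1,2}\/\d{2,4}, \d{1,2}:\d{2}(?:\s?[AP]M)?) - (.*)$/;
// Assumed body: "<sender>: <text>". System notices usually lack the "sender:" part.
const SENDER_PATTERN = /^([^:]+): ([\s\S]*)$/;

function parseChatExport(raw: string): ChatMessage[] {
  const messages: ChatMessage[] = [];
  for (const line of raw.split(/\r?\n/)) {
    const header = HEADER_PATTERN.exec(line);
    if (header) {
      const [, timestamp = "", rest = ""] = header;
      const body = SENDER_PATTERN.exec(rest);
      if (body) {
        const [, sender = "", text = ""] = body;
        messages.push({ timestamp, sender, text });
      }
      // Header without a sender: treated as a system notice and skipped.
      continue;
    }
    // No header: treat the line as a continuation of the previous message.
    const last = messages[messages.length - 1];
    if (last && line.trim() !== "") {
      last.text += "\n" + line;
    }
  }
  return messages;
}
```

This is a heuristic: a system notice that happens to contain a colon would be misread as a message, so a real parser would need more careful rules.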

Challenges

Embedding Nuanced Conversations: WhatsApp chats are typically unstructured, with abrupt topic changes, slang, and even code-mixed content. Properly embedding these into a vector database required refining the embedding process to capture the meaning behind the words.

Contextual Chunking: Breaking long chats into manageable chunks for embedding meant dealing with context loss across chunks.
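One common way to soften context loss at chunk boundaries is to let neighbouring chunks share a few messages. The sketch below shows that idea under assumed parameters. `chunkWithOverlap`, `size`, and `overlap` are hypothetical names, and this is not necessarily the strategy ConvoDigest uses.

```typescript
// Split items into windows of `size`, where each window repeats the last
// `overlap` items of the previous one so context carries across the boundary.
function chunkWithOverlap<T>(items: T[], size: number, overlap: number): T[][] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("Require size > 0 and 0 <= overlap < size");
  }
  const chunks: T[][] = [];
  const step = size - overlap;
  for (let start = 0; start < items.length; start += step) {
    chunks.push(items.slice(start, start + size));
    // Stop once a window has reached the end, to avoid a redundant tail chunk.
    if (start + size >= items.length) {
      break;
    }
  }
  return chunks;
}
```

For example, seven messages with `size` 3 and `overlap` 1 give three chunks, with the third and fifth messages each appearing in two of them. Larger overlaps preserve more context but store and embed more duplicate text.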

Temporal and Cross-Chunk Dependencies: Chat conversations are often temporal in nature, meaning the sequence and timing of messages matter. Capturing these subtleties using static embeddings is a significant technical hurdle.
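One simple heuristic for keeping some of the timing signal is to group messages into bursts separated by long silences before embedding them, so each chunk roughly matches one conversation. The sketch below illustrates that idea only. `TimedMessage`, `splitByTimeGap`, and `maxGapMinutes` are hypothetical, and timestamps are assumed to be already parsed into epoch milliseconds.

```typescript
// Hypothetical message shape with a parsed timestamp (epoch milliseconds).
interface TimedMessage {
  sentAt: number;
  sender: string;
  text: string;
}

// Start a new burst whenever the silence between two consecutive
// messages is longer than maxGapMinutes.
function splitByTimeGap(messages: TimedMessage[], maxGapMinutes: number): TimedMessage[][] {
  const maxGapMs = maxGapMinutes * 60 * 1000;
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...messages].sort((a, b) => a.sentAt - b.sentAt);
  const bursts: TimedMessage[][] = [];
  let current: TimedMessage[] = [];
  for (const message of sorted) {
    const previous = current[current.length - 1];
    if (previous && message.sentAt - previous.sentAt > maxGapMs) {
      bursts.push(current);
      current = [];
    }
    current.push(message);
  }
  if (current.length > 0) {
    bursts.push(current);
  }
  return bursts;
}
```

Storing each burst's start and end time as metadata next to its embedding would also allow filtering by date for questions like "this week", though a time gap alone cannot separate topics that overlap in time.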

Non-Textual Data: Conversations often contain non-textual elements like images, videos, and PDFs. Although these are not processed by the app yet, handling mixed media will be an important step in future iterations.

Future Plans

At this stage, ConvoDigest is an MVP, and there’s much more I want to explore. One of the immediate next steps is to containerize the app so users can easily deploy it with a single Docker command. This would make installation and setup far simpler.

I’m also excited about experimenting with more advanced embedding techniques and possibly integrating other LLMs to further improve the app’s accuracy and performance. Refining how context is handled across conversations and exploring ways to incorporate non-textual data into the summarization process are key areas I’ll be focusing on moving forward.

Final Thoughts

Building ConvoDigest has been a rewarding journey that allowed me to bridge AI concepts like LLMs, RAG, and embeddings with a practical use case. It’s been a deep dive into how these technologies can simplify complex, overwhelming data (in this case, lengthy group chats) and transform it into something much more digestible and actionable.

I’m looking forward to improving the app and helping others with similar issues in the future.

Check out the app here.