From Messy Emails to Clean Structured Data Using CrewAI
How I use the Python library CrewAI to parse an email into a clear, structured object with a simple script, and how you can too.
Introduction
For those of you who don’t know, other than being a software developer, I am also a professional photographer. I specialize mainly in corporate events, conferences, and portraiture.
Every week, I receive multiple emails from potential clients asking for details about my services and a quotation so they can evaluate if they want to work with me.
I only need a few pieces of information to prepare a quote. However, that information is usually scattered throughout the email, and it takes me a while to translate each request into a quotation, mainly because everyone, naturally, writes differently.
As I am only human, sometimes I make mistakes when I prepare the quote for clients because I may read something wrong or miss an important detail.
Beyond that, manually reading an email packed with details about a photo shoot and preparing a quote from it is rather painful.
I want to focus on other things, like taking photos, instead of spending time going through these emails.
So, I decided to use Python to automate turning the data from these emails into quotes.
A Simple Solution
The idea was quite simple. Take an email from a potential client, parse the data, and create a structured object out of it so I can use it elsewhere.
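To make "structured object" concrete, here is a minimal sketch of what such an object could look like. The class name `QuoteRequest` and every field in it are my own illustrative assumptions, not the exact fields used later in the post; the point is only that each detail gets a named slot, and that missing details become easy to spot.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class QuoteRequest:
    """Hypothetical shape of a parsed email; field names are illustrative only."""

    client_name: Optional[str] = None
    event_type: Optional[str] = None
    event_date: Optional[str] = None  # e.g. an ISO date string
    duration_hours: Optional[float] = None
    location: Optional[str] = None

    def missing_fields(self) -> List[str]:
        # Fields still set to None are details the client did not provide.
        return [name for name, value in asdict(self).items() if value is None]


# A request where the client gave no date and no location.
request = QuoteRequest(
    client_name="Acme Corp",
    event_type="conference",
    duration_hours=6,
)
print(request.missing_fields())
```

With an object like this, the rest of a quoting script can work with named fields instead of raw email text, and can ask the client a follow-up question about whatever is still missing.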
It sounds quite straightforward, but we do need to have a powerful tool to take care of this parsing job.
Normally, we could use regular expressions (regex) to parse information out of an email, but not all emails are formatted the same way or contain the same information. As I mentioned before, everyone writes differently.
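A small sketch shows why regex falls short here. The pattern and the two sample emails below are made up for illustration: the pattern works when the client happens to write a neatly labeled date, and silently finds nothing when the same information is phrased in plain language.

```python
import re

# Illustrative pattern: only matches a labeled ISO date such as "Date: 2025-03-14".
DATE_PATTERN = re.compile(r"Date:\s*(\d{4}-\d{2}-\d{2})")


def extract_date(email_body):
    """Return the labeled date if the email follows the expected format, else None."""
    match = DATE_PATTERN.search(email_body)
    return match.group(1) if match else None


# Two hypothetical emails asking for the same thing in different ways.
formatted_email = "Hello, we need a photographer.\nDate: 2025-03-14\nLocation: Berlin"
free_form_email = "Hi! Our conference is on the 14th of March in Berlin, could you send a quote?"

print(extract_date(formatted_email))  # the labeled date is found
print(extract_date(free_form_email))  # nothing matches, so None is returned
```

Covering every way a person might write a date, a location, or a duration would mean maintaining an ever-growing pile of patterns, which is the problem the rest of this post tries to avoid.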
That’s when using a specialized tool comes in handy.
Let’s leverage Large Language Models (LLMs), aka AI, for our task, and more specifically, the Python CrewAI library, which helps us interact with LLMs (OpenAI, Gemini, Claude, etc.) to execute tasks like parsing text.
But first, let’s take a brief look at what CrewAI is and how it can help us.
CrewAI
CrewAI is a Python library for integrating LLMs into our codebase through autonomous AI agents that execute tasks. That’s a mouthful. A simple take is the following:
CrewAI allows you to easily access large language models through autonomous (independent of each other) agents that work as a team (crew) to complete complex workflows or tasks.
The idea behind this approach is to have multiple independent agents, each with a specific role within the crew, working together to achieve a greater goal. We can then divide a complex task into smaller subtasks, assign them to different agents, and through collaboration, they will work together to achieve the common goal.
There are multiple benefits to using CrewAI to tackle a complex task, and our example of parsing data from an email into a clean, structured object is perhaps a bit of overkill.
Nevertheless, the way CrewAI works and how it is built make it a no-brainer to use for any kind of repetitive task where we want to leverage the power of LLMs to parse data.
Without getting too much into the details on how CrewAI works, let’s see the code in action and the amazing results of parsing data.
For more information about CrewAI, check out their website: www.crewai.com
Building the CrewAI Agent