Building AI Task Suggestions: Teaching a Machine to Read My Emails
My inbox was out of control. Every morning: 50+ unread emails with buried action items, meeting notes with scattered to-dos, and the constant fear of missing something important.
I thought: what if AI could just... read all this and tell me what needs to be done?
Three weeks later, I had a working system. It wasn't easy.
Why I Built This
Our team at Divami uses email heavily. Product managers send requirements, clients request changes, meeting notes contain action items. Every email potentially has tasks hidden inside.
There had to be a better way.
The Initial (Bad) Idea
My first thought: "I'll write regex patterns to extract tasks from emails."
# Don't do this
if "please review" in email_body or "action item" in email_body:
    create_task(email_subject)
This worked for exactly 3 emails. Then I got:
- "Can you take a look at the dashboard when you get a chance?"
- "FYI - the client wants changes to the homepage"
- "Following up on our discussion about the Q1 timeline"
Regex couldn't handle natural language variations. I needed something smarter.
The Breakthrough: Use an LLM
The solution hit me while reading about ChatGPT: LLMs are really good at understanding text and extracting structured information.
Here's the flow I designed:
User clicks "AI Task Suggestions"
↓
Fetch 50 recent emails from Gmail
↓
Fetch 8 recent meeting notes from Google Drive
↓
For each email/doc → Send to Gemini LLM
↓
LLM returns: {"tasks": [{"title": "Review Q1 budget"}, ...]}
↓
Show suggestions to user
↓
User accepts/rejects each suggestion
↓
Accepted tasks → Created in database
↓
Rejected tasks → Never shown again
The key insight: I don't need to predict what tasks look like. The LLM already knows.
Part 1: Getting Emails from Gmail (OAuth Hell)
First challenge: accessing Gmail programmatically.
OAuth 2.0: The Necessary Evil
Gmail requires OAuth 2.0. No simple API keys. The flow looks like this (a code sketch follows the list):
- User clicks "Connect Gmail"
- Redirect to Google login
- User grants permissions
- Google redirects back with auth code
- Exchange code for access token
- Store token, refresh it when expired
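For orientation, here's a minimal sketch of the redirect-and-exchange steps using google-auth-oauthlib. The route URL and the `build_flow` helper are illustrative assumptions; `settings` is the same config object used in the code further down.

from google_auth_oauthlib.flow import Flow

SCOPES = [
    "https://www.googleapis.com/auth/gmail.readonly",
    "https://www.googleapis.com/auth/drive.readonly",
]

def build_flow():
    # Client ID/secret come from the Google Cloud console OAuth credentials
    return Flow.from_client_config(
        {
            "web": {
                "client_id": settings.GOOGLE_CLIENT_ID,
                "client_secret": settings.GOOGLE_CLIENT_SECRET,
                "auth_uri": "https://accounts.google.com/o/oauth2/auth",
                "token_uri": "https://oauth2.googleapis.com/token",
            }
        },
        scopes=SCOPES,
        redirect_uri="https://myapp.example.com/oauth/callback",  # hypothetical URL
    )

# Steps 2-3: send the user to Google's consent screen
auth_url, state = build_flow().authorization_url(
    access_type="offline",  # required to receive a refresh token
    prompt="consent",
)

# Steps 4-5: Google redirects back with ?code=...; exchange it for tokens
def handle_callback(code: str):
    flow = build_flow()
    flow.fetch_token(code=code)
    creds = flow.credentials
    # Step 6: persist creds.token and creds.refresh_token for this user
    return creds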
I spent two days on this. The Google OAuth docs are... dense.
def get_user_credentials(user: User, db: Session):
    """
    Retrieve and refresh user's Google OAuth credentials
    This took me forever to get right. Token refresh is tricky.
    """
    # Check for existing credentials
    oauth_row = db.query(GoogleAuth).filter(
        GoogleAuth.user_id == user.id
    ).first()

    if not oauth_row:
        raise HTTPException(401, "No Gmail connected")

    # Build credentials object
    creds = Credentials(
        token=oauth_row.access_token,
        refresh_token=oauth_row.refresh_token,
        token_uri="https://oauth2.googleapis.com/token",
        client_id=settings.GOOGLE_CLIENT_ID,
        client_secret=settings.GOOGLE_CLIENT_SECRET
    )

    # Refresh if expired
    if creds.expired and creds.refresh_token:
        creds.refresh(Request())
        # Save new token
        oauth_row.access_token = creds.token
        db.commit()

    return creds
What confused me: Access tokens expire after 1 hour. Refresh tokens last longer. You need to handle both.
Fetching 50 Emails (And Why 50?)
Once authenticated, I fetch emails using the Gmail API:
def _fetch_recent_emails(service, limit: int = 50):
    """
    Fetch recent emails with full thread context
    Why 50? Trial and error. More than 50 takes too long.
    Less than 50 misses too many tasks.
    """
    # List message IDs
    resp = service.users().messages().list(
        userId="me",
        maxResults=limit
    ).execute()
    messages = resp.get('messages', [])

    emails = []
    for msg_meta in messages:
        msg_id = msg_meta['id']
        thread_id = msg_meta['threadId']

        # Fetch full message details
        detail = service.users().messages().get(
            userId="me",
            id=msg_id,
            format="full"
        ).execute()

        # Extract subject, sender, body
        headers = detail['payload']['headers']
        subject = next(
            (h['value'] for h in headers if h['name'] == 'Subject'),
            'No Subject'
        )

        # Get email body (this is surprisingly complex)
        # extract_body (see Challenge 1 below) returns {"plain": ..., "html": ...}
        body = extract_body(detail['payload'])["plain"]

        # IMPORTANT: Fetch entire thread (all replies)
        thread = service.users().threads().get(
            userId="me",
            id=thread_id,
            format="full"
        ).execute()

        # Combine all messages in thread
        thread_body = combine_thread_messages(thread)

        emails.append({
            'id': msg_id,
            'subject': subject,
            'body': body,
            'thread_body': thread_body  # Full context for LLM
        })

    return emails
Key learning: Email bodies are nested MIME parts (plain text, HTML, attachments). Extracting text requires recursively walking the structure. This took me a full day to debug.
Why process full threads? A task might be mentioned in the 3rd reply, not the original email. The LLM needs full context.
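The combine_thread_messages helper isn't shown above. A minimal sketch of what it could look like, reusing extract_body (defined in Challenge 1 further down); the separator string is just an assumption that makes reply boundaries obvious to the LLM:

def combine_thread_messages(thread: dict) -> str:
    """Concatenate the plain-text bodies of every message in a Gmail thread."""
    parts = []
    for msg in thread.get("messages", []):
        headers = msg.get("payload", {}).get("headers", [])
        sender = next(
            (h["value"] for h in headers if h["name"] == "From"),
            "Unknown sender",
        )
        body = extract_body(msg.get("payload", {}))["plain"]
        parts.append(f"From: {sender}\n{body}")
    # Separator marks where one reply ends and the next begins
    return "\n\n--- NEXT MESSAGE IN THREAD ---\n\n".join(parts)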
Part 2: Getting Meeting Notes from Google Drive
Our team uses Google Meet, which auto-generates "Notes by Gemini" documents. These contain action items from meetings.
Finding the Right Folder
Google Drive API requires searching for files:
def _find_notes_by_gemini_folder(drive_service):
    """
    Find the 'Notes by Gemini' folder
    Pro tip: Google creates this folder automatically for Meet notes
    """
    query = (
        "name contains 'notes by gemini' "
        "and mimeType='application/vnd.google-apps.folder'"
    )
    resp = drive_service.files().list(
        q=query,
        pageSize=10,
        fields="files(id,name)"
    ).execute()

    folders = resp.get('files', [])
    if not folders:
        return None
    return folders[0]['id']
Exporting Google Docs as Plain Text
Google Docs need to be exported to a readable format:
def _fetch_drive_notes_bodies(drive_service, limit_files: int = 8):
    """
    Fetch recent meeting notes
    Why 8? Most teams have 2-3 meetings per week.
    8 docs = ~3 weeks of history.
    """
    folder_id = _find_notes_by_gemini_folder(drive_service)
    if not folder_id:
        return []

    # List recent docs in folder
    query = (
        f"'{folder_id}' in parents "
        "and mimeType='application/vnd.google-apps.document' "
        "and trashed=false"
    )
    resp = drive_service.files().list(
        q=query,
        orderBy="modifiedTime desc",
        pageSize=limit_files,
        fields="files(id,name,modifiedTime)"
    ).execute()

    docs = []
    for doc_file in resp.get('files', []):
        doc_id = doc_file['id']

        # Export as plain text
        content_bytes = drive_service.files().export(
            fileId=doc_id,
            mimeType="text/plain"
        ).execute()
        content = content_bytes.decode('utf-8', errors='ignore')

        docs.append({
            'id': doc_id,
            'name': doc_file['name'],
            'content': content
        })

    return docs
What surprised me: Meeting notes are incredibly structured. They have sections like "Action Items:", "Discussion:", "Attendees:". The LLM extracts tasks really well from these.
Part 3: The LLM Processing Pipeline (The Magic Part)
Now I have 50 emails + 8 docs. How do I extract tasks?
Using Gemini via LiteLLM
I use Google's Gemini model through LiteLLM (an OpenAI-compatible proxy):
from openai import OpenAI

LITELLM_BASE_URL = "http://localhost:4000"

_llm_client = OpenAI(
    api_key="dummy-key-for-litellm-proxy",
    base_url=LITELLM_BASE_URL
)
Why LiteLLM? It provides a unified API for different LLM providers. I can switch from Gemini to GPT-4 without changing code.
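For example, because the proxy routes on the model name, switching providers should just mean a different model string in the same call. A sketch, assuming both models are configured in the LiteLLM proxy's model list:

# Same OpenAI-compatible client, different backing model.
response = _llm_client.chat.completions.create(
    model="gemini/gemini-1.5-flash",   # today
    # model="gpt-4o",                  # hypothetical switch, no code changes
    messages=[{"role": "user", "content": "ping"}],
)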
The System Prompt (Took Me 10 Iterations to Get Right)
LLM_SYSTEM_PROMPT = """
You are given the FULL BODY of ONE EMAIL THREAD (all messages combined).

Your job: extract actionable WORK TASKS ONLY from that thread.

STRICT OUTPUT CONTRACT (NO EXCEPTIONS): Return ONLY raw JSON:
{
  "tasks": [
    {
      "title": string,
      "description": string,
      "category": "TASK_ASSIGNMENT" | "MEETING_SCHEDULED" | "DEADLINE_MENTIONED"
    }
  ]
}

RULES:
- Use the ENTIRE thread body (all replies), not just the first message
- Extract bullet lists, "Next steps", "Action items", imperatives
- Task titles should be short (max 120 characters)
- Descriptions should be concise (max 5000 characters)
- If no tasks found → return {"tasks": []}
- DO NOT include: newsletters, automated emails, social updates
- Focus on: work assignments, deadlines, meeting action items

Examples:

Email: "Hi John, please review the Q1 budget proposal by Friday. Thanks!"
Output: {"tasks": [{"title": "Review Q1 budget proposal", "description": "Due by Friday", "category": "TASK_ASSIGNMENT"}]}

Email: "Meeting scheduled for tomorrow at 2 PM to discuss project timeline"
Output: {"tasks": [{"title": "Attend project timeline meeting", "description": "Tomorrow at 2 PM", "category": "MEETING_SCHEDULED"}]}
"""
What I learned:
- Be extremely specific. "Return JSON" isn't enough. Show examples.
- Emphasize "ENTIRE thread body" because LLMs sometimes only read the start.
- Setting max character limits prevents the LLM from writing novels.
Parallel Processing (Because 50 Emails Takes Time)
Processing 50 emails sequentially would take 5+ minutes. I process them in parallel:
from concurrent.futures import ThreadPoolExecutor
def process_emails_parallel(emails, max_workers=6):
    """
    Process emails in parallel batches
    Why 6 workers? Trial and error. More than 6 causes rate limiting.
    """
    all_tasks = []

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [
            executor.submit(_call_llm_single_email, email)
            for email in emails
        ]
        for future in futures:
            try:
                tasks = future.result()
                all_tasks.extend(tasks)
            except Exception as e:
                print(f"Error processing email: {e}")
                continue

    return all_tasks
Performance:
- Sequential: ~5 minutes for 50 emails
- Parallel (6 workers): ~20-30 seconds
Why not more workers? LiteLLM/Gemini has rate limits. More than 6 concurrent requests gets throttled.
Part 4: Calling the LLM (With Retry Logic)
import json
import time

def _call_llm_single_email(email: dict):
    """
    Process a single email through the LLM
    Returns list of extracted tasks
    """
    # Build user prompt
    full_body = email.get('thread_body') or email.get('body', '')
    user_prompt = f"""EMAIL_ID: {email['id']}
SUBJECT: {email['subject']}
THREAD_BODY:
{full_body[:10000]}
"""  # Limit to 10k chars to avoid token limits

    # Call LLM with retry logic
    raw_response = _litellm_generate(
        LLM_SYSTEM_PROMPT,
        user_prompt,
        max_attempts=3
    )
    if not raw_response:
        return []

    # Parse JSON from response
    try:
        # LLMs sometimes wrap JSON in markdown
        start = raw_response.find("{")
        end = raw_response.rfind("}")
        if start != -1 and end != -1:
            json_str = raw_response[start:end+1]
            data = json.loads(json_str)
            tasks = data.get('tasks', [])
            # Add email ID to each task
            for task in tasks:
                task['source_email_id'] = email['id']
            return tasks
    except json.JSONDecodeError:
        print(f"Failed to parse LLM response: {raw_response[:200]}")
        return []

    return []

def _litellm_generate(system_prompt, user_prompt, max_attempts=3):
    """
    Call LLM with exponential backoff retry
    Why retry? Rate limits and temporary failures.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            response = _llm_client.chat.completions.create(
                model="gemini/gemini-1.5-flash",
                messages=[
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": user_prompt}
                ],
                temperature=0  # Deterministic output
            )
            return response.choices[0].message.content.strip()
        except Exception as e:
            if "429" in str(e):  # Rate limit
                wait_time = min(2 ** attempt, 10)  # Max 10 seconds
                time.sleep(wait_time)
                continue
            else:
                print(f"LLM error: {e}")
                return None
    return None
What I learned:
- LLMs don't always return pure JSON. Sometimes they add "Here's the JSON: ```json..."
- Extracting JSON with find("{") and rfind("}") is hacky, but it works (a more defensive version is sketched below).
- Rate limiting (429 errors) happens. Exponential backoff is essential.
- temperature=0 gives consistent results. No creativity needed here.
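If the find/rfind trick ever bites you, a slightly more defensive parser can prefer a fenced ```json block before falling back to the outermost braces. This is a sketch, not the exact code in my app:

import json
import re

def parse_llm_json(raw: str) -> dict:
    """Best-effort extraction of a JSON object from an LLM response."""
    # 1. Prefer a fenced ```json ... ``` block if one exists
    fenced = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", raw, re.DOTALL)
    if fenced:
        raw = fenced.group(1)
    else:
        # 2. Fall back to the outermost braces
        start, end = raw.find("{"), raw.rfind("}")
        if start == -1 or end == -1:
            return {"tasks": []}
        raw = raw[start:end + 1]
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return {"tasks": []}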
Part 5: The Accept/Reject Workflow (Preventing Duplicates)
Once I have task suggestions, users need to accept or reject them.
The Problem: Users might reject a suggestion, but it keeps appearing every time they fetch new suggestions.
The Solution: Track decisions in a database table.
# Database model
class AiEmailSuggestionDecision(Base):
    __tablename__ = "ai_email_suggestion_decisions"

    id = Column(UUID, primary_key=True)
    user_id = Column(UUID, ForeignKey("users.id"))
    source_email_id = Column(String(255))
    title_norm = Column(String(400))  # Normalized title for matching
    decision = Column(String(20))  # 'ACCEPT' or 'REJECT'
    task_created_id = Column(UUID, ForeignKey("tasks.id"), nullable=True)  # set on ACCEPT
    created_at = Column(DateTime, default=datetime.utcnow)
Accepting a Suggestion
@app.post("/ai/task-suggestions/accept-email")
def accept_email_suggestion(payload: AcceptPayload, request: Request, db: Session):
"""
Accept a suggestion and create a real task
"""
user = request.state.user
# Create the task
new_task = Task(
title=payload.title.strip(),
description=payload.description,
priority=payload.priority or "MEDIUM",
due_date=payload.due_date,
assignee_id=payload.assignee_id or user.id,
project_id=payload.project_id,
status="todo",
is_ai_extracted=True,
source_email_id=payload.source_email_id
)
db.add(new_task)
db.commit()
# Record ACCEPT decision
decision = AiEmailSuggestionDecision(
user_id=user.id,
source_email_id=payload.source_email_id,
title_norm=payload.title.strip().lower(),
decision="ACCEPT",
task_created_id=new_task.id
)
db.add(decision)
db.commit()
return {"success": True, "task_id": str(new_task.id)}
Rejecting a Suggestion
@app.post("/ai/task-suggestions/reject-email")
def reject_email_suggestion(payload: RejectPayload, request: Request, db: Session):
"""
Reject a suggestion - it won't appear again
"""
user = request.state.user
# Record REJECT decision
decision = AiEmailSuggestionDecision(
user_id=user.id,
source_email_id=payload.source_email_id,
title_norm=payload.title.strip().lower(),
decision="REJECT"
)
db.add(decision)
db.commit()
return {"success": True}
Filtering Out Already Decided Suggestions
def filter_already_decided(suggestions, user_id, db):
    """
    Remove suggestions that were already accepted or rejected
    """
    # Get all decisions for this user
    decisions = db.query(AiEmailSuggestionDecision).filter(
        AiEmailSuggestionDecision.user_id == user_id
    ).all()

    decided_map = {
        (d.source_email_id, d.title_norm): d.decision
        for d in decisions
    }

    filtered = []
    for suggestion in suggestions:
        key = (
            suggestion['source_email_id'],
            suggestion['title'].strip().lower()
        )
        if key not in decided_map:
            filtered.append(suggestion)

    return filtered
Why this works: Matching on (source_email_id, normalized_title) means the same suggestion from the same email never resurfaces, even if capitalization or surrounding whitespace differs.
Part 6: Project Prediction (Heuristic Matching)
Users have multiple projects. I try to predict which project each task belongs to:
def predict_project_for_title(db, user, title, description=""):
    """
    Predict project based on keyword matching
    This is simple but works surprisingly well.
    """
    # Get user's projects
    projects = db.query(Project).filter(
        Project.user_id == user.id
    ).all()

    text = f"{title} {description}".lower()

    # Score each project
    scores = []
    for proj in projects:
        score = 0
        proj_keywords = proj.name.lower().split()
        for keyword in proj_keywords:
            if keyword in text:
                score += 1
        if score > 0:
            scores.append((proj.id, proj.name, score))

    if not scores:
        return None

    # Return highest scoring project
    scores.sort(key=lambda x: x[2], reverse=True)
    return scores[0][0]  # Project ID
Example:
- Task: "Review marketing campaign budget"
- User has projects: ["Marketing Q1", "Engineering Infrastructure", "HR Onboarding"]
- Match: "marketing" appears in task → Predict "Marketing Q1"
Accuracy: ~70% correct in my testing. Not perfect, but better than nothing.
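The post doesn't show where this prediction gets wired in; a plausible place (a sketch, with a hypothetical helper name) is right before the suggestions are returned to the user:

def attach_project_predictions(db, user, suggestions):
    """Annotate each suggestion with a predicted project (may be None)."""
    for s in suggestions:
        s["predicted_project_id"] = predict_project_for_title(
            db, user, s["title"], s.get("description", "")
        )
    return suggestions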
The Challenges I Faced
Challenge 1: Email Body Extraction Is Hard
Emails have complex MIME structures. Plain text, HTML, attachments, inline images.
I spent a full day writing this recursive function:
import base64

def extract_body(payload):
    """
    Recursively extract text from email payload
    This handles multipart MIME messages.
    """
    plain_body = ""
    html_body = ""

    def walk_parts(part):
        nonlocal plain_body, html_body
        mime_type = part.get("mimeType", "")

        if mime_type == "text/plain":
            data = part.get("body", {}).get("data")
            if data:
                decoded = base64.urlsafe_b64decode(data).decode('utf-8', errors='ignore')
                plain_body += decoded
        elif mime_type == "text/html":
            data = part.get("body", {}).get("data")
            if data:
                decoded = base64.urlsafe_b64decode(data).decode('utf-8', errors='ignore')
                html_body += decoded

        # Recurse into parts
        for sub_part in part.get("parts", []):
            walk_parts(sub_part)

    walk_parts(payload)

    # Convert HTML to plain text if needed
    if not plain_body and html_body:
        from bs4 import BeautifulSoup
        plain_body = BeautifulSoup(html_body, 'html.parser').get_text()

    return {"plain": plain_body, "html": html_body}
What confused me: Email bodies are Base64-encoded. HTML needs to be converted to plain text. Nested MIME parts require recursion.
Challenge 2: LLM Token Limits
Gemini has a token limit (~30k tokens for Flash). Long email threads exceed this.
My fix: Truncate thread body to 10,000 characters.
user_prompt = f"""EMAIL_ID: {email['id']}
SUBJECT: {email['subject']}
THREAD_BODY:
{full_body[:10000]}
"""  # ← Truncate here
Trade-off: Might miss tasks mentioned later in long threads. But 10k chars = ~2000 words, which covers most cases.
Challenge 3: Duplicate Suggestions
Sometimes the LLM extracts the same task from multiple emails.
Example:
- Email 1: "Please review the budget"
- Email 2: "Following up on the budget review"
Both produce: "Review budget"
My fix: Deduplicate by normalized title before showing to user.
def deduplicate_tasks(tasks):
    """
    Remove duplicate tasks by normalized title
    """
    seen = set()
    unique = []
    for task in tasks:
        norm_title = task['title'].strip().lower()
        if norm_title not in seen:
            seen.add(norm_title)
            unique.append(task)
    return unique
Challenge 4: Rate Limiting
Gmail API has quotas:
- 250 quota units per user per second
- Reading a message = 5 units
- 50 emails = 250 units
I hit this limit during testing. Solution: Add delays between requests.
for email in emails:
    process_email(email)
    time.sleep(0.1)  # Avoid rate limits
The Results
After three weeks of development, here's what works:
Performance:
- Fetches 50 emails + 8 docs in ~5 seconds
- LLM processing (parallel): ~20-30 seconds
- Total time: ~30-35 seconds
- Cache duration: 60 minutes (no need to re-fetch every time)
Accuracy:
- ~80% of extracted tasks are actually actionable
- ~20% are false positives (newsletters, social updates)
- Project prediction: ~70% accurate
User Impact:
- Average 25-55 task suggestions per request
- Users accept ~40% of suggestions
- Saves ~15 minutes per day (no manual scanning)
What I Learned
1. LLMs Are Really Good at This
Gemini consistently extracts tasks correctly. It understands:
- Imperatives: "Please review..."
- Deadlines: "by Friday", "before EOD"
- Action items: "Next steps:", "Follow up:"
The system prompt is critical. Show examples.
2. OAuth Is A Pain But Necessary
There's no API-key shortcut for reading a user's Gmail, and OAuth is the better model anyway:
- Users grant permissions themselves
- Scoped access (read-only)
- Automatic token refresh
Worth the initial complexity.
3. Parallel Processing Is Essential
Sequential processing: 5+ minutes
Parallel processing (6 workers): ~30 seconds
For 50 emails, parallel is non-negotiable.
4. Users Need Accept/Reject Workflow
Early version auto-created tasks. Users hated it.
Showing suggestions first lets users:
- Review before committing
- Reject false positives
- Edit details (project, due date)
Much better UX.
5. Caching Saves API Quota
Fetching 50 emails every request burns through Gmail quota fast.
Caching for 60 minutes:
- Reduces API calls by 90%
- Faster response times
- Happy Google
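The caching code isn't shown above; a minimal per-user, in-memory TTL cache along these lines would do (assuming a single server process; a real deployment might use Redis instead):

import time

_CACHE: dict = {}            # user_id -> (timestamp, suggestions)
CACHE_TTL_SECONDS = 60 * 60  # 60 minutes

def get_cached_suggestions(user_id: str):
    entry = _CACHE.get(user_id)
    if not entry:
        return None
    ts, suggestions = entry
    if time.time() - ts > CACHE_TTL_SECONDS:
        del _CACHE[user_id]  # stale, force a re-fetch
        return None
    return suggestions

def cache_suggestions(user_id: str, suggestions: list):
    _CACHE[user_id] = (time.time(), suggestions)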
When Would I Use This?
After building this, I'd recommend AI task suggestions for:
✅ Teams with heavy email usage
✅ Projects with scattered action items
✅ Users who attend many meetings with notes
✅ Anyone drowning in inbox tasks
I wouldn't use it for:
❌ Teams that already use issue trackers exclusively
❌ Low email volume (<10 emails/day)
❌ Projects with strict data privacy (emails go to LLM)
My Verdict
This system saves me ~15 minutes every morning. I click "AI Task Suggestions", review ~30 tasks, accept the real ones, reject noise.
No more manually scanning emails. No more missing action items buried in threads.
Is it perfect? No. The LLM sometimes hallucinates tasks. Project prediction is only 70% accurate. But it's good enough to be useful.
Would I build this again? Absolutely.
Resources That Helped Me
- Gmail API Docs: developers.google.com/gmail/api - Essential reading
- Google Drive API Docs: developers.google.com/drive/api - For meeting notes
- LiteLLM Docs: docs.litellm.ai - Unified LLM interface
- Google OAuth 2.0 Guide: developers.google.com/identity/protocols/oauth2 - Painful but necessary
- Gemini API: ai.google.dev - The LLM doing the heavy lifting
Thanks for reading. For more of my blogs, filter by my name, "Harshith Rao", and browse from there. I'd love to hear from you if you enjoyed this one and the way it's written.