How I Gave Apple Intelligence 20 Years of Context It Didn't Have
The Promise and the Gap
Apple calls it a “personal intelligence system.” But ask Siri “Where did I eat in Nashville last year?” and you’ll get a web search. Ask “When did I start my current job?” and it shrugs.
Apple Intelligence reads your emails, scans your photos, indexes your messages — but it only knows what’s already in your apps. And most of your life history isn’t there. The intelligence is there. The personal isn’t.
I decided to fix that. Using Claude Code and my own data exports, I built a pipeline that extracts, normalizes, deduplicates, and imports over 12,000 life events into Apple Calendar — spanning 2004 to 2026. Dining, travel, career milestones, health events, entertainment, financial transactions, auto maintenance — all of it, organized across 10 color-coded iCloud calendars that sync to every device.
The full Apple Intelligence integration with calendar data is still rolling out — but the infrastructure is ready. Even before AI features land, having 12,742 searchable life events across 10 calendars has already changed how I interact with my own history. And once Apple Intelligence gains full calendar awareness, the payoff will be immediate: every event, every location, every pattern — already indexed and waiting.
This post explains how I built it, what I learned, and how you can do it too.
What Goes In
Your life leaves a data trail. Credit card statements, email receipts, Google Calendar history, Amazon orders, ride-sharing logs — collectively, these sources paint a remarkably complete picture of where you’ve been, what you’ve done, and when. The challenge is getting that data out of silos and into a format that’s structured, searchable, and ready for AI.
The pipeline processes these into 33,595 raw events, curates them down to 12,742 approved events, and lands 12,267 calendar entries across 10 color-coded iCloud calendars — synced to every device, searchable, and ready for Apple Intelligence.
The 10-Calendar Organization
Rather than dumping everything into one calendar, I created 10 dedicated iCloud calendars, each with a distinct color:
| Calendar | What It Captures |
|---|---|
| Dining | Restaurant visits, takeout, food delivery |
| Entertainment | Concerts, movies, theater, streaming |
| Shopping | Retail purchases, online orders |
| Technology | Software, hardware, subscriptions |
| Travel | Flights, hotels, road trips, Airbnb |
| Health | Doctor visits, labs, pharmacy, fitness |
| Career | Work milestones, conferences, PTO |
| Auto | Gas, maintenance, car washes, parking |
| Social | Events, gatherings, meetups |
| Financial | Banking, insurance, tax milestones |
This structure means I can glance at any week in Calendar and immediately see the texture of my life that week — orange dots for dining, teal for travel, green for health.
What This Actually Looks Like: An Alaska Cruise
Before diving into how each data source works, let me show you what the finished product looks like for a single trip — because this is where the pipeline stops being abstract and starts feeling like magic.
In May 2018, my friends celebrated their 5th wedding anniversary with a 7-day Alaska cruise on the ms Noordam with Holland America Line. Another dear friend and I tagged along. None of us logged anything during the trip. I didn’t journal. I didn’t even take that many photos. But eight years later, the pipeline reconstructed the entire vacation — not just the cruise itself, but the full arc of getting there, the days at sea, and the week that followed — from data I already had.
Months before departure, the timeline picks up the thread. A January 2018 email confirmation from Holland America. An April excursion booking for a Mendenhall Glacier & Whale Quest tour in Juneau. A Delta confirmation email for ATL to SEA.
The journey starts May 4. A Delta boarding pass confirmation surfaces from email mining — Atlanta to Seattle. An Uber receipt shows a 2:17 PM pickup from home to Hartsfield-Jackson Airport — $24.17, 15.1 miles, 41 minutes.
May 5 in Seattle. Credit card charges reconstruct an entire day I’d nearly forgotten: Elliott’s Oyster House on the waterfront. Nine Pies Pizzeria. A $25 Sound Transit charge — riding the light rail around the city. Wine tasting at Kerloo Cellars and Rotie Cellars. Molly Moon’s ice cream ($5.95) and a screening of the RBG documentary at SIFF Cinema Egyptian ($10) — a full day exploring Seattle before the cruise.
May 6: Seattle to Vancouver. The Amtrak Cascades north to Vancouver — a ticket purchased back in February, surfaced by a $32 credit card charge. Four hours through the Pacific Northwest — Puget Sound on one side, the Cascades on the other — watching the landscape shift from Seattle’s urban sprawl to Bellingham’s farmland to the Canadian border. The observation car on the Cascades route is one of the great train experiences in North America: floor-to-ceiling windows, no assigned seats, just the coastline unrolling. It’s the kind of moment a credit card charge can’t capture — but the Amtrak confirmation email timestamps the journey, and the memory fills in the rest.
That evening, dinner at Cactus Club in Coal Harbour before boarding the ship. Embarkation: ms Noordam.
The port days are where it gets vivid. Credit card charges paint an hour-by-hour picture of life ashore.
Here’s what the cruise week looks like in Apple Calendar — teal travel events, orange dining, pink entertainment, and purple shopping all weaving together across the port days:

But the trip didn’t end at debarkation. The Airbnb export picks up the story: a studio in Yaletown, Vancouver, May 12–15. A travel companion’s trip invitation email surfaces from a separate pipeline, cross-confirming the stay.
And now the credit cards take over again, painting three full days in Vancouver. Cactus Club Coal Harbour for the second time — a bookend from embarkation night. A Polar Treats charge ($33.16) from Ketchikan finally posts two days delayed — a ghost from the cruise appearing in the Vancouver timeline. Ramen Koika and Tongue & Cheek from the debit card — two restaurants that wouldn’t appear on the credit card, showing why multiple statement sources matter. The Fountainhead Pub. Pulse Night Club on Davie Street. London Drugs for Tylenol. And a last meal at Canucks Bar & Grill in Richmond before flying home.
That’s one vacation — an Alaska cruise that became a Vancouver food tour — automatically reconstructed from six different data sources (email, credit cards, debit card, Uber, Airbnb, Delta), spread across five different calendars (Travel, Dining, Entertainment, Shopping, Technology). None of these sources alone tells the story. The Uber receipt doesn’t know it’s the start of an Alaska cruise. The credit card charge at Elliott’s Oyster House doesn’t know it’s a pre-cruise Seattle layover. The Airbnb export doesn’t know the Yaletown apartment was a post-cruise extension. But layered together, they reconstruct a two-week arc I’d half-forgotten — from the $5.95 ice cream cone at Molly Moon’s the day before catching the train to Vancouver, to the debit card charges at two Vancouver restaurants the credit card never saw, to a lunch at the airport before heading home.
When I scroll to May 2018 in Calendar now, the two weeks light up: teal flights and Uber rides getting there, orange dinner dots tracing a path from Seattle’s waterfront to Alaskan fishing towns to Vancouver’s Davie Street, pink entertainment from the tramway above Juneau to a nightclub on the last evening, purple shopping souvenirs from port towns and a Canadian pharmacy. It’s not a journal entry — it’s better. It’s the receipts.
Data Sources Deep Dive
No single source tells the whole story, but together they’re remarkably complete — as that Alaska cruise demonstrates, six sources wove together a two-week narrative that no single export could have produced.
Why Apple Calendar?
I considered other approaches — a custom app, a database with a Shortcuts integration, even a Siri intent extension. But Apple Calendar turned out to be the best target: each event has structured fields (title, date, location, description, category) that are searchable and AI-ready. Import once and it syncs to every device via iCloud. And critically, calendar events are a first-party data source for Apple Intelligence — as features roll out, this data will be immediately accessible. The iCalendar spec supports a CATEGORIES property, and Apple’s proprietary X-APPLE-STRUCTURED-LOCATION extension embeds coordinates that integrate with Maps.
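For illustration, here is roughly what a single imported event could look like in iCalendar form. The field values are invented for this example, and the X-APPLE-STRUCTURED-LOCATION line is a sketch based on the format Calendar’s own exports use:

```text
BEGIN:VEVENT
SUMMARY:Dining: Elliott's Oyster House
DTSTART;VALUE=DATE:20180505
CATEGORIES:Dining
LOCATION:Pier 56\, Seattle\, WA
X-APPLE-STRUCTURED-LOCATION;VALUE=URI;X-TITLE=Elliott's Oyster House:
 geo:47.6054,-122.3426
END:VEVENT
```

The CATEGORIES line is what makes per-calendar classification survive a round trip, and the structured location is what lets a tap on the event open Maps.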
The Pipeline
Here’s how each source contributes.
Credit and Debit Card Statements
Three card sources form the backbone: a daily driver credit card, a debit/checking card, and Apple Card. Combined, they capture nearly every transaction — and every transaction tells a story.
The extraction scripts parse PDF statements and CSV exports into standardized timeline events. Each transaction gets categorized, and an out-of-town dining boost flags meals in unfamiliar cities as higher-signal events. A $22 charge at Alaska Fish House in Ketchikan during a cruise is more interesting than your Tuesday Starbucks — the pipeline knows the difference.
What makes card statements uniquely valuable: they capture things you’d never think to log. The tramway ticket in Juneau. The souvenir shop in a port town. The pharmacy pickup that correlates with a health event. Even “boring” transactions like gas station stops reconstruct road trip routes when you plot them chronologically.
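A minimal sketch of how the out-of-town dining boost could work, assuming a simple city-based heuristic. The HOME_CITIES set and the score values are illustrative, not the pipeline’s actual rules:

```python
# Sketch of the out-of-town dining boost; HOME_CITIES and the score
# values are illustrative, not the pipeline's actual rules.
HOME_CITIES = {"atlanta"}

def signal_score(event: dict) -> int:
    """Higher scores flag events worth surfacing during review."""
    score = 1
    city = event.get("city", "").lower()
    if event.get("category") == "dining" and city and city not in HOME_CITIES:
        score += 2  # a meal in an unfamiliar city is higher-signal
    return score

signal_score({"category": "dining", "city": "Ketchikan"})  # boosted: 3
signal_score({"category": "dining", "city": "Atlanta"})    # routine: 1
```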
Google Takeout
Google knows a lot about you, and it’ll give you that data if you ask. My personal Google Takeout included Calendar history (years of events I’d long forgotten), YouTube watch history, and more.
But my richest Google source was unexpected: a decade of Google Workspace data from my own domain. I ran rickylee.com on Google Workspace from 2012 to 2023 — email, calendar, contacts, the full suite. When I migrated to iCloud custom domain in 2023, I did a full Google Takeout export before leaving. That archive — eleven years of email confirmations, calendar events, and service notifications routed through my primary address — became one of the pipeline’s highest-value inputs. Receipts, reservations, appointment confirmations, shipping notifications — all the transactional email that services send to your primary address, preserved in a structured export.
The migration itself created an interesting data-archaeology challenge. Services that sent to my rickylee.com address before 2023 show up in the Google Workspace export. The same services sending to the same address after 2023 show up in the iCloud email corpus. The pipeline needed to treat these as a continuous stream from one identity, not two separate sources — same person, same address, different mail servers. The source_record_id provenance field tracks which email database each event was extracted from, so cross-referencing across the migration boundary is straightforward.
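As a sketch of the idea, the provenance field might be used like this. The source_record_id prefixes below are hypothetical (the post doesn’t specify the exact format), but they show how one field lets events from two mail servers be treated as a single stream:

```python
# Illustrative only: the source_record_id prefixes are hypothetical,
# but show how one provenance field lets events from two mail servers
# be treated as a single continuous stream.
events = [
    {"title": "Delta receipt", "date": "2019-03-02",
     "source_record_id": "gworkspace-mail:48213"},
    {"title": "Delta receipt", "date": "2024-06-11",
     "source_record_id": "icloud-mail:9107"},
]

def mail_source(event: dict) -> str:
    # Everything before the ':' names the email database of origin
    return event["source_record_id"].split(":", 1)[0]

# Same identity, different mail servers, one timeline
assert {mail_source(e) for e in events} == {"gworkspace-mail", "icloud-mail"}
```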
Service Exports
Individual services contribute their own slices:
- Amazon: Nearly a thousand orders over 20 years. Each order becomes a Shopping or Technology event depending on the product category.
- Lyft & Uber: Every ride, with pickup/dropoff locations and timestamps. These are particularly good for reconstructing travel patterns.
- Delta Air Lines: Flight receipts with route, date, and confirmation numbers.
- Airbnb: Stay history with property names and cities.
- Etsy & eBay: Purchase history for those niche buys that credit cards just show as “ETSY.COM/BILL”.
Email Mining
Perhaps the richest source of all: email. Using an AI-powered summarization pipeline, I processed 15,603 emails into structured summaries stored in a SQLite database. Receipt emails, reservation confirmations, appointment reminders, shipping notifications — email is where services confirm the things you did, and mining it systematically fills gaps that no other source covers.
The email corpus spans four accounts and two decades — among them a personal Gmail account, the Google Workspace archive from rickylee.com, and the current iCloud mailbox. The summarization pipeline uses Claude to extract structured data from each email: service name, event type, date, amount, and a one-line summary. The results feed into the timeline extraction phase, where receipt emails become Dining events, booking confirmations become Travel events, and appointment reminders become Health events.
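A minimal sketch of what that SQLite store could look like. The table and column names here are assumptions, not the project’s actual schema:

```python
import sqlite3

# Sketch of the email-summary store described above; the table and
# column names are assumptions, not the project's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE email_summaries (
        id INTEGER PRIMARY KEY,
        service TEXT,
        event_type TEXT,
        event_date TEXT,
        amount REAL,
        summary TEXT
    )
""")
conn.execute(
    "INSERT INTO email_summaries (service, event_type, event_date, amount, summary) "
    "VALUES (?, ?, ?, ?, ?)",
    ("OpenTable", "dining", "2018-05-05", None,
     "Reservation confirmation: Elliott's Oyster House, Seattle"),
)

# Downstream, the timeline extractor reads these rows back out
row = conn.execute(
    "SELECT service, event_type, event_date FROM email_summaries"
).fetchone()
assert row == ("OpenTable", "dining", "2018-05-05")
```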
The Signal Policy
Early on, I faced a philosophical question: what’s worth recording? My answer was aggressive inclusivity. Gas stations, pharmacy runs, grocery trips, coffee shops, gym check-ins — I keep them all. The only transactions filtered out are financial noise: credit card payments, account transfers, Zelle to a friend, investment transactions, and Amazon purchases that already have their own dedicated pipeline.
Why keep “boring” events? Because in aggregate, they’re not boring. A cluster of pharmacy visits tells a health story. Weekly grocery runs at the same store establish a routine — and the week you didn’t go becomes notable. Gas stations trace road trips. The signal emerges from the pattern, not the individual event.
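A sketch of that filter, using skip rules drawn from the examples above (the exact patterns are assumptions; the real pipeline’s rules are more extensive):

```python
import re

# Sketch of the "aggressive inclusivity" filter: keep everything except
# financial noise. The skip patterns are examples from the post, not the
# pipeline's exact rules.
SKIP_PATTERNS = [
    r"\bpayment\s+thank\s+you\b",  # credit card payments
    r"\bonline\s+transfer\b",      # account transfers
    r"\bzelle\b",                  # peer-to-peer payments
    r"\bamazon\b",                 # handled by its own dedicated pipeline
]

def keep_event(description: str) -> bool:
    d = description.lower()
    return not any(re.search(p, d) for p in SKIP_PATTERNS)

keep_event("SHELL OIL 5744 MACON GA")   # gas station: kept
keep_event("ZELLE TO JOHN SMITH")       # transfer: skipped
```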
The Normalization Challenge
Raw credit card data is, to put it politely, garbage. Here’s what a single Starbucks visit might look like across different card statements:
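A few reconstructed examples of the kind of raw strings involved (illustrative, not actual statement lines):

```text
STARBUCKS #08742 ATLANTA GA
SQ *STARBUCKS COFFEE
STARBU3746 SEATTLE WA
TST* STARBUCKS - MIDTOWN
```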
That’s one merchant. Now multiply by every restaurant, retailer, and service provider you’ve ever patronized.
The normalization layer handles this with two weapons: 1,308 exact merchant mappings and 285 regex patterns. The exact mappings handle the truly bizarre strings that credit card processors generate — things like 2LEVY@MBS-FP1 (that’s Mercedes-Benz Stadium) or BLVD *RUDY'S PONCE CITY (a barbershop in Ponce City Market). The regex patterns handle families of variations:
```python
# One pattern to catch every Starbucks permutation
r'\bstarbucks?\b.*|\bstarbu\d+'

# Mobile ordering platforms that obscure the actual vendor
r'\blevelup\*?\s*tindrumasi\b.*'  # LEVELUP*TINDRUMASI → Tin Drum Asia Cafe

# Google services fragment into a dozen statement variations
r'\bgoogle\*gsuite\b.*'   # Google Workspace
r'\bgoogle\*adws\b.*'     # Google Ads
r'\bgoogle\*project\b.*'  # Google Cloud

# Cruise lines: charges appear under multiple entities and ports
r'\bholland\s*america\b.*'  # HAL cruises, onboard charges, excursions
```
Beyond merchant names, the pipeline normalizes categories. A transaction at a restaurant might be coded as “Shopping” or “Financial” or “Travel” by the card issuer — the normalization layer reclassifies it as Dining based on the merchant name. Verbose reservation titles from OpenTable or Resy get cleaned from “Restaurant reservation at The Capital Grille - Atlanta - Buckhead, Table for 2, 7:30 PM” down to a clean “Dining: The Capital Grille”.
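That title cleanup can be sketched with a single pattern (illustrative, not the pipeline’s actual rule):

```python
import re

# Sketch of the reservation-title cleanup described above; the pattern
# is illustrative, not the pipeline's actual rule.
def clean_reservation_title(title: str) -> str:
    # Capture the restaurant name: everything up to the first comma or dash
    m = re.match(r"Restaurant reservation at ([^,-]+)", title)
    if m:
        return f"Dining: {m.group(1).strip()}"
    return title

raw = ("Restaurant reservation at The Capital Grille - Atlanta - Buckhead, "
       "Table for 2, 7:30 PM")
clean_reservation_title(raw)  # "Dining: The Capital Grille"
```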
The normalization script is the largest in the pipeline at 5,881 lines. It’s the kind of thing that grows organically — and it grew through conversation.
The Conversational Refinement Loop
This is where the AI CLI workflow becomes something you couldn’t replicate with a traditional script-writing approach. Normalization and categorization aren’t problems you solve once — they’re problems you solve through iteration, and the iteration happens in dialogue.
Here’s what that actually looks like. After the first pipeline run, I’d review the output and spot problems: “Rudy’s at Ponce City Market is categorized as dining — it’s actually a barbershop.” Claude Code would update the merchant mapping from ("Rudy's", 'dining') to ("Rudy's Barbershop", 'shopping'), fix the three affected events in the CSV, and re-run the pipeline. One sentence of feedback, three files corrected.
But the real power is when you ask questions that surface patterns you didn’t know existed. “Show me everything categorized as dining that costs less than $10” reveals that half your coffee shop charges are miscategorized. “Find all events with ‘LEVELUP’ in the raw merchant name” exposes that a mobile ordering platform is obscuring the actual restaurants behind it. “What merchants appear more than 20 times but don’t have a normalization mapping yet?” identifies the gaps in your dictionary.
The feedback loop is tight: review → ask → fix → re-run → review again. Claude Code maintains context across the conversation, so corrections compound. When I said “Sweetgreen charges from Boston and LA aren’t real visits — they’re LEVELUP corporate billing artifacts,” it didn’t just fix those events — it added the geographic filter to the extraction script so future pipeline runs would catch them automatically.
Claude Code also has a tool called AskUserQuestion — a structured prompt that presents multiple-choice options for the user to select from. During this project, Claude Code used it to ask me over 50 targeted refinement questions, often batching 3-4 related questions into a single prompt.
What makes this tool powerful isn’t the questions themselves — it’s that Claude Code knows when to ask. As it processes the data, it recognizes patterns that require human judgment: ambiguous categories, inconsistent naming, signals that could go either way. Rather than guessing or applying a generic rule, it stops and asks. And critically, it presents options informed by what it’s already seen in the data — not abstract possibilities, but concrete choices grounded in the actual events it found.
Here’s what that looks like in practice, organized by the kinds of decisions the tool surfaces:
Category ambiguity — when the same merchant could belong to multiple categories:
The first example: what category does a charge like 2LEVY@MBS-FP1 belong to? This is a genuinely hard classification problem. A $12 charge at a stadium could be a hot dog or a concert ticket. Claude Code surfaced the ambiguity rather than defaulting to one category, and the “split by amount threshold” option showed it was thinking about the problem structurally. My answer of “Entertainment” was a simplification — but the right one for a personal timeline where the venue matters more than the line item.
A related question asked whether every pharmacy charge should be classified as Health. The third option — “depends on the pharmacy” — is what a careful human classifier would suggest. It’s technically correct: buying shampoo at CVS isn’t a health event. But for a personal timeline, the signal matters more than the precision. If I’m at a pharmacy, it’s almost certainly for a prescription. Claude Code implemented the blanket rule and moved on — 23 events fixed, one regex pattern, done.
Data noise — when a single real-world event produces multiple pipeline events:
This is the kind of outlier you’d never catch by scanning a spreadsheet. One Airbnb stay was producing four or five events. The question only surfaced because Claude Code noticed the pattern across multiple stays — a cluster of Airbnb-sourced events on the same dates, with slightly different titles. The fix wasn’t just filtering: it was a new consolidation phase in cleanup-master-csv.py that recognizes Airbnb’s email lifecycle (booking → reminder → check-in → review request) and collapses it to a single stay event.
Design decisions that propagate across the entire pipeline:
Here the question asked whether dollar amounts should move out of event_title and into the description. This wasn’t an outlier fix — it was an architectural decision disguised as a formatting question. Claude Code asked it because it noticed inconsistency: some extraction scripts put amounts in titles, others didn’t. The answer affected every pipeline stage, every data source, and the visual density of every calendar view. One tap of a button, and Claude Code modified 11 extraction scripts to enforce the new convention.
Inclusion boundaries — what belongs in a life timeline at all:
This question gets at the philosophical heart of the project: what’s worth remembering? A strict dollar threshold would have been simpler to implement, but it would have thrown away cheap books, quirky gifts, and first purchases in new categories. The keyword-based approach let me have it both ways — filter the noise without losing the signal.
Each answer took seconds. But collectively, these 50+ questions resolved ambiguities that would have taken hours to discover through manual CSV review. The tool turned curation from a tedious spreadsheet audit into something closer to a conversation — Claude Code would surface the pattern, present the options, and immediately implement whichever direction I chose. The feedback loop between question, answer, and pipeline update was often under 30 seconds.
What emerged was something like an active learning pipeline. The automated classifier handles the obvious cases — credit card payments are always skipped, Delta flights are always Travel. The AskUserQuestion tool handles the boundary cases — the events where reasonable people would disagree, or where the right answer depends on personal context the system can’t infer. And each answer feeds back into the classification rules, so the boundary shrinks over time. By the end of the project, the questions had shifted from “what category is this?” (solved by 1,308 mappings) to “should this entire class of events exist?” (design decisions about what a life timeline means).
This conversational approach scaled the normalization dictionary from a handful of mappings to over a thousand entries. No one sits down and writes that many merchant mappings from scratch. You build them one correction at a time, and having an AI that can immediately implement each correction and show you the result makes the process almost enjoyable.
From Pipeline to Skill: Living with the System
Once the pipeline was built, a new pattern emerged: I’d be scrolling through Calendar, notice a wrong title or category, and want to fix it immediately — without re-running the entire pipeline from scratch. “That’s a barbershop, not a restaurant.” “That SIFF Cinema screening was the RBG documentary.” Small corrections, each touching two CSV files and maybe a normalization rule.
So I built a custom Claude Code skill — a reusable command called /fix-event — that handles single-event corrections in about 60 seconds:
- Search both CSVs for the event
- Update the title, category, or description
- Add a normalization rule if the fix applies to future pipeline runs
- Re-extract the approved events and re-import to Apple Calendar via EventKit
The skill definition is just a Markdown file that tells Claude Code what to do:
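A minimal sketch of what a .claude/commands/fix-event.md file could contain. The steps mirror the list above, though the exact wording here is an assumption ($ARGUMENTS is Claude Code’s placeholder for whatever text follows the command):

```markdown
Fix a single timeline event based on this correction: $ARGUMENTS

1. Search both master CSVs for events matching the description.
2. Update the title, category, and/or description as requested.
3. If the fix generalizes, add a merchant mapping or regex pattern to the
   normalization script so future pipeline runs apply it automatically.
4. Re-extract approved events and re-import to Apple Calendar via EventKit.
5. Report which events were changed.
```

Invoking `/fix-event Rudy's at Ponce City Market is a barbershop, not a restaurant` hands that correction to the workflow above.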
Claude Code finds the three Rudy’s events, changes the title from “Dining: Rudy’s” to “Rudy’s Barbershop,” updates the category from dining to shopping, adds the merchant mapping to the normalization script, and re-imports all approved events to Apple Calendar. One sentence in, corrected timeline out.
This is the difference between a pipeline you built and a system you live with. The pipeline is a batch process that runs periodically. The skills layer turns it into something you interact with daily — a living document that gets more accurate every time you notice something wrong. Claude Code’s custom commands feature (a Markdown file in .claude/commands/) makes it trivial to codify these workflows. I have two: /update-calendar for full pipeline runs and /fix-event for surgical corrections.
Deduplication — The Hard Problem
When you layer 11 data sources, overlap is inevitable. A single dinner out might generate:
- An OpenTable reservation email → email mining pipeline
- A credit card charge → statement extraction
- A Google Calendar event → Takeout import
- A Lyft ride to the restaurant → ride-sharing import
That’s four events for one dinner. The pipeline needs to recognize they’re the same event and keep only the richest record.
Deduplication happens in two layers:
Layer 1: Prefix stripping. Before any comparison, the pipeline strips 24 common prefixes from event titles — “Dining at”, “Restaurant reservation at”, “Receipt from”, “Amazon purchase:”, “Watched:”, and more. This normalizes titles so that “Dining at Capital Grille” and “Receipt from The Capital Grille” can be compared apples-to-apples.
Layer 2: Fuzzy matching. Within each date bucket, every pair of events is compared using word overlap. If two events on the same day share more than 60% of their words (after normalization), the less information-rich one is dropped:
```python
import re

def normalize_for_dedup(title: str) -> str:
    t = title.lower().strip()
    for prefix in ['facebook event: ', 'dining at ', 'dining: ',
                   'restaurant reservation at ', 'receipt from ',
                   'amazon purchase: ', 'watched: ', ...]:  # 24 prefixes
        if t.startswith(prefix):
            t = t[len(prefix):]
    t = re.sub(r'[^\w\s]', '', t)
    # Cap length to prevent long descriptions from diluting word overlap
    return re.sub(r'\s+', ' ', t).strip()[:50]

def event_richness(row: dict) -> int:
    score = 0
    if row.get('description'):
        score += len(row['description'])  # longer descriptions = richer
    if row.get('location'):
        score += 50  # having a location is valuable
    if row.get('time') not in ('', '00:00'):
        score += 30  # specific time beats all-day
    return score
```
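Putting the pieces together, the within-bucket comparison might look like the sketch below. It is self-contained, so `norm` and `richness` are simplified stand-ins for the fuller normalize_for_dedup and event_richness functions, and the bucketing logic is an assumption:

```python
import re

# Sketch of the Layer-2 comparison: within one date bucket, drop the
# less information-rich of any pair whose normalized titles share more
# than 60% of their words. `norm` and `richness` are simplified
# stand-ins for normalize_for_dedup and event_richness.
PREFIXES = ("dining at ", "receipt from ", "restaurant reservation at ")

def norm(title: str) -> str:
    t = title.lower().strip()
    for p in PREFIXES:
        if t.startswith(p):
            t = t[len(p):]
    t = re.sub(r"[^\w\s]", "", t)
    return re.sub(r"\s+", " ", t).strip()

def richness(ev: dict) -> int:
    return len(ev.get("description", "")) + (50 if ev.get("location") else 0)

def word_overlap(a: str, b: str) -> float:
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / min(len(wa), len(wb)) if wa and wb else 0.0

def dedup_bucket(events: list[dict]) -> list[dict]:
    kept: list[dict] = []
    # Richest first, so the record we keep is always the most informative
    for ev in sorted(events, key=richness, reverse=True):
        if all(word_overlap(norm(ev["title"]), norm(k["title"])) <= 0.6
               for k in kept):
            kept.append(ev)
    return kept

bucket = [
    {"title": "Dining at Capital Grille", "description": ""},
    {"title": "Receipt from The Capital Grille",
     "description": "OpenTable confirmation", "location": "Atlanta"},
]
deduped = dedup_bucket(bucket)
assert len(deduped) == 1
assert deduped[0]["location"] == "Atlanta"  # richer record wins
```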
This two-layer approach catches the vast majority of cross-source duplicates while preserving genuinely distinct events on the same day. It’s not perfect — occasionally a lunch and dinner at the same restaurant on the same day will collide — but the error rate is low enough that manual review catches the edge cases.
The same conversational refinement loop from normalization applies here. “Show me events where the dedup dropped the record with a location in favor of one without” surfaces cases where the richness scoring needs tuning. “Find dates with more than 5 approved events” highlights days where the dedup may have been too permissive. Each question refines the algorithm — the 24 prefix strings in the stripping list, for instance, accumulated one by one as new source formats entered the pipeline and their title patterns collided with existing events.
The Rejection Rate
Of the 33,595 raw events extracted, only 12,742 survived curation, a rejection rate of roughly 62%. This is by design: the extraction layer optimizes for recall (capture everything), and the curation layer optimizes for precision (keep only what matters). Most rejections are financial noise (credit card payments, transfers) and low-value duplicates. But a meaningful portion required manual review: is this pharmacy visit a routine pickup or a notable health event? Is this Amazon order worth recording or is it toilet paper?
The implicit error budget: I’d rather have 100 events that should have been rejected than miss 1 event that mattered. False positives are cheap to fix (change review_status to rejected). False negatives — events that never entered the pipeline — are invisible and permanent. Curation is where judgment lives, and no amount of automation replaces it.
The Result
12,742 events spanning 2004–2026, organized in 10 calendars, synced to every Apple device. My personal history is now searchable and structured:

The personal timeline is more than a search tool. It’s a digital autobiography — every restaurant, every trip, every milestone, preserved in a format that both humans and AI can reason about. The events themselves are facts, but the patterns they reveal tell the story of a life.
But a timeline is only as good as the decisions behind it. In Part 2, I’ll cover the engineering tradeoffs — CSV vs. SQLite, EventKit vs. CalDAV, why Advanced Data Protection changes everything — plus an Apple Notes strategy that gives Siri narrative context, and a practical guide for building your own.
The preparation is done. Now we wait for the performance.
Ricky Lee is a Staff Engineer at Fueled (formerly 10up), where he’s spent 11+ years building enterprise content management platforms. This project — a personal data pipeline that turned scattered life data into a structured, AI-ready timeline — grew out of curiosity about what happens when you point modern AI tooling at your own digital footprint. He writes at rickylee.com and is @rickalee on GitHub.