CBApp Integration Notes¶
Integration patterns for cbmodels with the CampaignBrain Chat engine.
Context¶
The cbmodels EAV rotation pipeline provides schema-agnostic lookup across heterogeneous campaign data sources (donors, volunteers, event attendees, petition signers, etc.). This complements the i360 voter data by adding campaign engagement context.
Key difference from i360:
- i360 uses SVID (State Voter ID) as the primary key
- Campaign data uses email/phone as the linkage key
- Not all campaign contacts are registered voters
Integration Architecture¶
┌─────────────────────────────────────────────────────────────┐
│                     CampaignBrain Chat                      │
│                  (Claude + QueryIR Engine)                  │
└─────────────────────────┬───────────────────────────────────┘
                          │
           ┌──────────────┼──────────────┐
           ▼              ▼              ▼
    ┌────────────┐  ┌──────────┐  ┌────────────┐
    │    i360    │  │ cbmodels │  │  Segments  │
    │   Voters   │  │ Campaign │  │  (saved)   │
    │   (SVID)   │  │ (email)  │  │            │
    └────────────┘  └──────────┘  └────────────┘
Proposed Endpoints¶
1. Contact Lookup¶
Find campaign engagement records by contact info.
Request:
Response:
{
  "matches": [
    {
      "source_name": "2024_Donors.csv",
      "source_type": "donor",
      "record": {
        "first_name": "John",
        "last_name": "Smith",
        "email": "john@example.com",
        "amount": "250.00",
        "donation_date": "2024-03-15"
      },
      "match_field": "email"
    },
    {
      "source_name": "Rally_Signups.xlsx",
      "source_type": "event",
      "record": {
        "first_name": "John",
        "last_name": "Smith",
        "email": "john@example.com",
        "signup_date": "2024-02-20",
        "event_name": "Detroit Rally"
      },
      "match_field": "email"
    }
  ],
  "source_count": 2,
  "total_matches": 2
}
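A minimal sketch of how a lookup could reassemble records from the rotated (EAV) table. The schema `rotated_data(source_name, row_id, field_name, value)` is an assumption, and an in-memory SQLite database stands in for the real store; `make_demo_db` and `lookup_contact` are hypothetical names, not part of cbmodels.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory stand-in for the rotated table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rotated_data ("
        "source_name TEXT, row_id INTEGER, field_name TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO rotated_data VALUES (?, ?, ?, ?)",
        [
            ("2024_Donors.csv", 1, "email", "john@example.com"),
            ("2024_Donors.csv", 1, "amount", "250.00"),
            ("Rally_Signups.xlsx", 7, "email", "John@Example.com"),
            ("Rally_Signups.xlsx", 7, "event_name", "Detroit Rally"),
            ("Rally_Signups.xlsx", 8, "email", "jane@example.com"),
        ],
    )
    return conn


def lookup_contact(conn: sqlite3.Connection, field: str, value: str) -> dict:
    """Find every (source, row) whose `field` matches, then rebuild each row."""
    hits = conn.execute(
        "SELECT DISTINCT source_name, row_id FROM rotated_data "
        "WHERE field_name = ? AND LOWER(value) = LOWER(?) "
        "ORDER BY source_name, row_id",
        (field, value),
    ).fetchall()
    matches = []
    for source_name, row_id in hits:
        # Pivot the row's cells back into a flat record.
        cells = conn.execute(
            "SELECT field_name, value FROM rotated_data "
            "WHERE source_name = ? AND row_id = ?",
            (source_name, row_id),
        ).fetchall()
        matches.append(
            {"source_name": source_name, "record": dict(cells), "match_field": field}
        )
    return {
        "matches": matches,
        "source_count": len({m["source_name"] for m in matches}),
        "total_matches": len(matches),
    }
```

Calling `lookup_contact(make_demo_db(), "email", "john@example.com")` returns one match per source in the demo data. The sketch omits `source_type`, since how sources are classified is still an open question.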
2. Cross-Source Duplicates¶
Find contacts appearing in multiple campaign sources.
Request:
Response:
{
  "duplicates": [
    {
      "value": "superfan@example.com",
      "source_count": 9,
      "sources": ["Donors", "Volunteers", "Rally_Signup", "Petition", ...]
    }
  ],
  "total": 47
}
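One way the duplicate detection could work is to group linkage values by the set of sources they appear in. The sketch below assumes the same hypothetical `rotated_data(source_name, row_id, field_name, value)` schema and uses SQLite as a stand-in; `find_cross_source_duplicates` is an illustrative name. Grouping happens in Python for clarity; on a large table the same filter would more likely be pushed into SQL with `GROUP BY` and `HAVING COUNT(DISTINCT source_name) >= ?`.

```python
import sqlite3
from collections import defaultdict


def find_cross_source_duplicates(
    conn: sqlite3.Connection, field: str = "email", min_sources: int = 2
) -> dict:
    """Return linkage values that appear in at least `min_sources` sources."""
    sources_by_value = defaultdict(set)
    cursor = conn.execute(
        "SELECT DISTINCT LOWER(value), source_name FROM rotated_data "
        "WHERE field_name = ?",
        (field,),
    )
    for value, source_name in cursor:
        sources_by_value[value].add(source_name)

    duplicates = [
        {"value": value, "source_count": len(names), "sources": sorted(names)}
        for value, names in sources_by_value.items()
        if len(names) >= min_sources
    ]
    # Most widely shared contacts first, ties broken alphabetically.
    duplicates.sort(key=lambda d: (-d["source_count"], d["value"]))
    return {"duplicates": duplicates, "total": len(duplicates)}
```

Raising `min_sources` to 5 would express the "super-supporters" style of query without any other change.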
3. Source Statistics¶
List all ingested campaign data sources with stats.
Response:
{
  "sources": [
    {
      "name": "2024_Donors.csv",
      "row_count": 15000,
      "col_count": 12,
      "fields": ["email", "first_name", "last_name", "amount", ...],
      "ingested_at": "2024-03-01T10:30:00Z"
    }
  ],
  "total_sources": 46,
  "total_cells": 788136
}
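As a rough illustration, most of these statistics could be derived directly from the rotated table. The sketch assumes one row per stored cell in a hypothetical `rotated_data(source_name, row_id, field_name, value)` table, so `total_cells` is simply the row count; `ingested_at` is left out because it would have to come from separate ingestion metadata. `source_statistics` is an illustrative name.

```python
import sqlite3


def source_statistics(conn: sqlite3.Connection) -> dict:
    """Summarize each source: distinct rows, distinct fields, and cell count."""
    summary = conn.execute(
        "SELECT source_name, COUNT(DISTINCT row_id), "
        "COUNT(DISTINCT field_name), COUNT(*) "
        "FROM rotated_data GROUP BY source_name ORDER BY source_name"
    ).fetchall()

    sources = []
    total_cells = 0
    for name, row_count, col_count, cell_count in summary:
        fields = [
            row[0]
            for row in conn.execute(
                "SELECT DISTINCT field_name FROM rotated_data "
                "WHERE source_name = ? ORDER BY field_name",
                (name,),
            )
        ]
        sources.append(
            {"name": name, "row_count": row_count,
             "col_count": col_count, "fields": fields}
        )
        total_cells += cell_count

    return {
        "sources": sources,
        "total_sources": len(sources),
        "total_cells": total_cells,
    }
```

If empty cells are not stored during rotation, `col_count` here counts only fields with at least one value, which may be lower than the column count of the original file.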
4. Enrichment Endpoint (for i360 results)¶
Enrich a list of contacts with campaign engagement data.
Request:
{
  "contacts": [
    {"email": "john@example.com"},
    {"email": "jane@example.com"},
    {"phone": "555-987-6543"}
  ]
}
Response:
{
  "enriched": [
    {
      "email": "john@example.com",
      "campaign_sources": 3,
      "source_types": ["donor", "volunteer", "event"],
      "first_seen": "2023-06-15",
      "last_seen": "2024-03-20",
      "total_donated": 750.00,
      "events_attended": 2
    }
  ],
  "match_rate": 0.67
}
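The aggregation behind this response could look roughly like the sketch below. It assumes engagement records have already been fetched and indexed by lowercased contact key, and that each record carries `source_name`, `source_type`, an ISO `date`, and an optional numeric `amount`. Those field names, and `enrich_contacts` itself, are illustrative assumptions.

```python
def enrich_contacts(contacts: list[dict], records_by_key: dict[str, list[dict]]) -> dict:
    """Summarize campaign engagement for each contact that has any records."""
    enriched = []
    for contact in contacts:
        key_field = "email" if "email" in contact else "phone"
        records = records_by_key.get(contact[key_field].strip().lower(), [])
        if not records:
            continue  # unmatched contacts only affect match_rate
        dates = sorted(r["date"] for r in records)  # ISO dates sort as strings
        enriched.append(
            {
                key_field: contact[key_field],
                "campaign_sources": len({r["source_name"] for r in records}),
                "source_types": sorted({r["source_type"] for r in records}),
                "first_seen": dates[0],
                "last_seen": dates[-1],
                "total_donated": sum(r.get("amount", 0.0) for r in records),
                "events_attended": sum(
                    1 for r in records if r["source_type"] == "event"
                ),
            }
        )
    match_rate = round(len(enriched) / len(contacts), 2) if contacts else 0.0
    return {"enriched": enriched, "match_rate": match_rate}
```

Here `match_rate` is the share of requested contacts with at least one campaign record, so two matches out of three contacts gives 0.67.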
Chat Integration Ideas¶
Extend QueryIR with Campaign Operations¶
Add new operation types that the chat can invoke:
# In cb_query_ir.py, add operations:
CAMPAIGN_OPERATIONS = [
    "campaign_lookup",      # Find campaign engagement by email/phone
    "campaign_enrich",      # Add campaign data to i360 results
    "campaign_duplicates",  # Find multi-source contacts
    "campaign_sources",     # List available sources
]
Natural Language Examples¶
The chat could handle queries like:
| User Query | Action |
|---|---|
| "Show donors who are also volunteers" | Cross-source duplicate lookup |
| "Find all campaign records for john@example.com" | Contact lookup |
| "Which voters in this segment have donated?" | i360 + campaign enrichment |
| "Show me super-supporters (5+ sources)" | High-engagement filter |
| "List all campaign data sources" | Source inventory |
System Prompt Addition¶
Add to cb_prompts.py:
CAMPAIGN_DATA_CONTEXT = """
## Campaign Data Lookup
In addition to i360 voter data, you can access campaign engagement data from
various sources (donors, volunteers, event attendees, petition signers, etc.).
Campaign data is keyed by email/phone, not SVID. Use these operations:
- campaign_lookup: Find all campaign records for an email or phone
- campaign_enrich: Add campaign engagement to voter results
- campaign_duplicates: Find contacts appearing in multiple sources
Example: "Find donors who are also volunteers"
→ Uses campaign_duplicates with sources filter
Example: "Show campaign history for john@example.com"
→ Uses campaign_lookup by email
"""
Response Actions¶
Add campaign-specific actions to chat responses:
CAMPAIGN_ACTIONS = [
    {
        "label": "View Campaign History",
        "action": "campaign_lookup",
        "description": "See all campaign engagement for this contact"
    },
    {
        "label": "Find Similar Supporters",
        "action": "campaign_duplicates",
        "description": "Find contacts in multiple sources"
    },
    {
        "label": "Enrich with Campaign Data",
        "action": "campaign_enrich",
        "description": "Add campaign engagement to results"
    }
]
Data Flow: Chat → cbmodels → Response¶
User: "Show me donors who volunteered and attended rallies"
1. Claude parses intent → campaign_duplicates operation
2. Chat calls: GET /api/cbmodels/duplicates?min_sources=3&source_types=donor,volunteer,event
3. cbmodels queries rotated_data table
4. Returns contacts in 3+ source types
5. Chat formats response with action buttons
6. User can save as segment or view details
Implementation Priority¶
Phase 1: Core Endpoints¶
- /lookup - Single contact lookup
- /sources - Source inventory
- /duplicates - Cross-source detection
Phase 2: Enrichment¶
- /enrich - Batch enrichment for i360 results
- Integration with existing search results
Phase 3: Chat Integration¶
- Add campaign operations to QueryIR
- Extend system prompt
- Add response actions
Phase 4: Advanced¶
- Source type classification (donor, volunteer, event, etc.)
- Engagement scoring (recency, frequency, breadth)
- Timeline view of contact engagement
- Linkage to i360 via voter matching
Technical Notes¶
Database Location¶
cbmodels rotated data: ./data/rotated.db
- Should be accessible from cbapp API
- Consider shared mount or API-only access
Field Normalization¶
The normalizer maps raw column names to canonical fields:
- email, phone, mobile - Contact fields (primary lookup keys)
- first_name, last_name - Identity fields
- amount, donation_date - Donor-specific
- signup_date, event_name - Event-specific
Source Type Detection¶
Could auto-classify sources based on field presence:
- Has amount → donor
- Has event_name or rally → event
- Has volunteer in filename → volunteer
- Has petition → petition signer
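The heuristics above could be expressed as an ordered rule list, as in this sketch. It reads "rally" and "petition" as substrings of the filename (with "petition" also accepted in a field name), which is one interpretation of the rules rather than a settled design. The `"petition"` and `"unknown"` labels and the function name `classify_source` are assumptions.

```python
def classify_source(filename: str, fields: list[str]) -> str:
    """Guess a source type from canonical field names and the filename.

    Rules are checked in order, so a file with an `amount` field is
    classified as a donor source even if its name mentions a rally.
    """
    name = filename.lower()
    field_set = {f.lower() for f in fields}

    if "amount" in field_set:
        return "donor"
    if "event_name" in field_set or "rally" in name:
        return "event"
    if "volunteer" in name:
        return "volunteer"
    if "petition" in name or any("petition" in f for f in field_set):
        return "petition"
    return "unknown"  # fall back to manual classification
```

An `"unknown"` result is a natural hook for the manual-config option raised in the questions below: auto-detect first, then let a config file override or fill in the gaps.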
Performance Considerations¶
- Rotated table has indexes on (field_name, value) for fast lookups
- Email lookups are case-insensitive
- Phone lookups strip non-digits
- Consider caching frequent lookups
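A minimal sketch of the key normalization implied by the email and phone rules. The function names are hypothetical. Normalizing values once at ingest and again at query time would let lookups use an exact match against the (field_name, value) index instead of a per-row case conversion.

```python
import re


def normalize_email(raw: str) -> str:
    """Trim whitespace and lowercase so email matching is case-insensitive."""
    return raw.strip().lower()


def normalize_phone(raw: str) -> str:
    """Keep digits only, so '(555) 987-6543' and '555.987.6543' compare equal."""
    return re.sub(r"\D", "", raw)
```

This sketch does not handle country codes, so `+1 555 987 6543` and `555-987-6543` would still differ; whether to strip a leading `1` is a policy decision left open here.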
Questions for cbapp Team¶
- API Integration: Should cbmodels expose FastAPI endpoints, or should cbapp query DuckDB directly?
- Source Classification: Should source types be manual (in a config) or auto-detected from field names?
- Enrichment Depth: For batch enrichment, how much detail? Just source counts, or full engagement history?
- Chat Priority: Which queries should hit cbmodels first vs. i360?
- Linkage Strategy: For contacts with email but no SVID, should we attempt voter matching?
Files to Modify in cbapp¶
| File | Changes |
|---|---|
| cb_query_ir.py | Add campaign operation types |
| cb_prompts.py | Add campaign data context |
| cb_query_engine.py | Handle campaign operations |
| cb_chat.py | Route campaign queries to cbmodels |
| integration_settings.py | Add cbmodels integration config |
Example: Unified Search Flow¶
User: "Find Republicans in Wayne County who have donated"
1. Claude identifies:
   - i360 filter: party=R, county=Wayne
   - Campaign filter: source_type=donor
2. Query flow:
   a. Query i360_voters → Get SVIDs + emails
   b. Query cbmodels with emails → Get donor status
   c. Filter to donors only
   d. Return enriched results
3. Response:
   "Found 1,234 Republican voters in Wayne County.
   847 (69%) have donated to campaigns.
   Average donation: $125
   [View All] [View Donors Only] [Save Segment]"
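The join in step 2 could be sketched as follows, assuming the i360 query has already returned voter dicts with `svid` and `email`, and the cbmodels query has returned donation amounts indexed by lowercased email. Both shapes, and the name `summarize_donor_overlap`, are assumptions. The average is taken over individual donations rather than per-donor totals, which is one of two reasonable readings of "Average donation".

```python
from statistics import mean


def summarize_donor_overlap(
    voters: list[dict], donations_by_email: dict[str, list[float]]
) -> dict:
    """Filter voters to those with donations and compute summary figures."""
    donors = [
        voter
        for voter in voters
        if voter.get("email") and voter["email"].lower() in donations_by_email
    ]
    amounts = [
        amount
        for voter in donors
        for amount in donations_by_email[voter["email"].lower()]
    ]
    total = len(voters)
    return {
        "total_voters": total,
        "donor_count": len(donors),
        "donor_pct": round(100 * len(donors) / total) if total else 0,
        "average_donation": mean(amounts) if amounts else 0.0,
        "donors": donors,
    }
```

Voters without an email are counted in `total_voters` but can never match, so the donor percentage is a lower bound whenever i360 email coverage is incomplete.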
This unified approach gives the chat engine visibility into both voter data AND campaign engagement.