CBApp Integration Notes

Integration patterns for cbmodels with the CampaignBrain Chat engine.

Context

The cbmodels EAV rotation pipeline provides schema-agnostic lookup across heterogeneous campaign data sources (donors, volunteers, event attendees, petition signers, etc.). This complements the i360 voter data by adding campaign engagement context.

Key differences from i360:

  • i360 uses SVID (State Voter ID) as the primary key
  • Campaign data uses email/phone as the linkage key
  • Not all campaign contacts are registered voters

Integration Architecture

┌─────────────────────────────────────────────────────────────┐
│                    CampaignBrain Chat                        │
│                  (Claude + QueryIR Engine)                   │
└─────────────────────────┬───────────────────────────────────┘
           ┌──────────────┼──────────────┐
           ▼              ▼              ▼
    ┌────────────┐  ┌──────────┐  ┌────────────┐
    │  i360      │  │ cbmodels │  │  Segments  │
    │  Voters    │  │ Campaign │  │  (saved)   │
    │  (SVID)    │  │ (email)  │  │            │
    └────────────┘  └──────────┘  └────────────┘

Proposed Endpoints

1. Contact Lookup

GET /api/cbmodels/lookup

Find campaign engagement records by contact info.

Request (sent as query parameters, shown here as JSON; supply at least one of email or phone):

{
  "email": "john@example.com",
  "phone": "555-123-4567"
}

Response:

{
  "matches": [
    {
      "source_name": "2024_Donors.csv",
      "source_type": "donor",
      "record": {
        "first_name": "John",
        "last_name": "Smith",
        "email": "john@example.com",
        "amount": "250.00",
        "donation_date": "2024-03-15"
      },
      "match_field": "email"
    },
    {
      "source_name": "Rally_Signups.xlsx",
      "source_type": "event",
      "record": {
        "first_name": "John",
        "last_name": "Smith",
        "email": "john@example.com",
        "signup_date": "2024-02-20",
        "event_name": "Detroit Rally"
      },
      "match_field": "email"
    }
  ],
  "source_count": 2,
  "total_matches": 2
}
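A minimal sketch of how such a lookup might run against the rotated (EAV) table. The table name rotated_data comes from the data flow notes; the column layout (source_name, row_id, field_name, value) and the use of sqlite3 as a stand-in database are assumptions for illustration, not the actual cbmodels schema.

```python
import sqlite3

def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the rotated table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rotated_data ("
        "source_name TEXT, row_id INTEGER, field_name TEXT, value TEXT)"
    )
    rows = [
        ("2024_Donors.csv", 1, "email", "john@example.com"),
        ("2024_Donors.csv", 1, "first_name", "John"),
        ("2024_Donors.csv", 1, "amount", "250.00"),
        ("Rally_Signups.xlsx", 7, "email", "john@example.com"),
        ("Rally_Signups.xlsx", 7, "event_name", "Detroit Rally"),
        ("Rally_Signups.xlsx", 8, "email", "jane@example.com"),
    ]
    conn.executemany("INSERT INTO rotated_data VALUES (?, ?, ?, ?)", rows)
    return conn

def lookup(conn: sqlite3.Connection, field: str, value: str) -> dict:
    """Find rows whose `field` matches `value`, then reassemble each full record."""
    hits = conn.execute(
        "SELECT DISTINCT source_name, row_id FROM rotated_data "
        "WHERE field_name = ? AND lower(value) = lower(?) "
        "ORDER BY source_name, row_id",
        (field, value),
    ).fetchall()
    matches = []
    for source_name, row_id in hits:
        # Pivot the EAV cells of this row back into a flat record.
        cells = conn.execute(
            "SELECT field_name, value FROM rotated_data "
            "WHERE source_name = ? AND row_id = ?",
            (source_name, row_id),
        ).fetchall()
        matches.append(
            {"source_name": source_name, "record": dict(cells), "match_field": field}
        )
    return {
        "matches": matches,
        "source_count": len({m["source_name"] for m in matches}),
        "total_matches": len(matches),
    }

result = lookup(build_demo_db(), "email", "John@Example.com")
```

The two-step shape (find matching rows, then pivot each row's cells back into a record) is what makes the lookup schema-agnostic: no source-specific columns are named in the query.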

2. Cross-Source Duplicates

GET /api/cbmodels/duplicates

Find contacts appearing in multiple campaign sources.

Request (query parameters, shown as JSON):

{
  "field": "email",
  "min_sources": 3,
  "limit": 100
}

Response:

{
  "duplicates": [
    {
      "value": "superfan@example.com",
      "source_count": 9,
      "sources": ["Donors", "Volunteers", "Rally_Signup", "Petition", ...]
    }
  ],
  "total": 47
}
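One way the duplicate detection could work is to collect the distinct sources per value and keep values seen in at least min_sources of them. The sketch below assumes the same hypothetical column layout (source_name, row_id, field_name, value) and uses sqlite3 purely as a stand-in.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rotated_data ("
    "source_name TEXT, row_id INTEGER, field_name TEXT, value TEXT)"
)
conn.executemany(
    "INSERT INTO rotated_data VALUES (?, ?, ?, ?)",
    [
        ("Donors", 1, "email", "superfan@example.com"),
        ("Volunteers", 4, "email", "SuperFan@example.com"),
        ("Petition", 9, "email", "superfan@example.com"),
        ("Donors", 2, "email", "once@example.com"),
        ("Donors", 3, "email", "once@example.com"),  # same source twice
    ],
)

def find_duplicates(conn, field: str, min_sources: int = 2, limit: int = 100) -> dict:
    """Return values of `field` that appear in at least `min_sources` distinct sources."""
    pairs = conn.execute(
        "SELECT DISTINCT lower(value), source_name FROM rotated_data "
        "WHERE field_name = ?",
        (field,),
    ).fetchall()
    sources_by_value = defaultdict(set)
    for value, source_name in pairs:
        sources_by_value[value].add(source_name)
    dupes = [
        {"value": v, "source_count": len(s), "sources": sorted(s)}
        for v, s in sources_by_value.items()
        if len(s) >= min_sources
    ]
    # Most widely seen contacts first; ties broken alphabetically.
    dupes.sort(key=lambda d: (-d["source_count"], d["value"]))
    return {"duplicates": dupes[:limit], "total": len(dupes)}

report = find_duplicates(conn, "email", min_sources=3)
```

Note that a contact repeated within a single source (once@example.com above) does not count as a cross-source duplicate.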

3. Source Statistics

GET /api/cbmodels/sources

List all ingested campaign data sources with stats.

Response:

{
  "sources": [
    {
      "name": "2024_Donors.csv",
      "row_count": 15000,
      "col_count": 12,
      "fields": ["email", "first_name", "last_name", "amount", ...],
      "ingested_at": "2024-03-01T10:30:00Z"
    }
  ],
  "total_sources": 46,
  "total_cells": 788136
}

4. Enrichment Endpoint (for i360 results)

POST /api/cbmodels/enrich

Enrich a list of contacts with campaign engagement data.

Request:

{
  "contacts": [
    {"email": "john@example.com"},
    {"email": "jane@example.com"},
    {"phone": "555-987-6543"}
  ]
}

Response:

{
  "enriched": [
    {
      "email": "john@example.com",
      "campaign_sources": 3,
      "source_types": ["donor", "volunteer", "event"],
      "first_seen": "2023-06-15",
      "last_seen": "2024-03-20",
      "total_donated": 750.00,
      "events_attended": 2
    }
  ],
  "match_rate": 0.67
}
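The aggregation behind a response like this might look as follows. This is a sketch only: the flattened RECORDS list and its keys (source_type, date, amount) are hypothetical, and campaign_sources is omitted because it would need per-source names.

```python
from collections import defaultdict

# Hypothetical engagement records, one per matched source row.
RECORDS = [
    {"email": "john@example.com", "source_type": "donor", "date": "2023-06-15", "amount": "250.00"},
    {"email": "john@example.com", "source_type": "donor", "date": "2024-03-20", "amount": "500.00"},
    {"email": "john@example.com", "source_type": "event", "date": "2024-02-20"},
    {"phone": "5559876543", "source_type": "volunteer", "date": "2024-01-10"},
]

def enrich(contacts: list, records: list) -> dict:
    """Summarize engagement per contact and report the share of contacts matched."""
    by_key = defaultdict(list)
    for rec in records:
        for field in ("email", "phone"):
            if field in rec:
                by_key[(field, rec[field])].append(rec)

    enriched = []
    for contact in contacts:
        field, value = next(iter(contact.items()))
        if field == "phone":
            value = "".join(ch for ch in value if ch.isdigit())
        else:
            value = value.lower()
        hits = by_key.get((field, value))
        if not hits:
            continue  # unmatched contacts only lower the match rate
        dates = sorted(r["date"] for r in hits)  # ISO dates sort as strings
        enriched.append({
            field: value,
            "source_types": sorted({r["source_type"] for r in hits}),
            "first_seen": dates[0],
            "last_seen": dates[-1],
            "total_donated": sum(float(r.get("amount", 0)) for r in hits),
            "events_attended": sum(r["source_type"] == "event" for r in hits),
        })
    rate = round(len(enriched) / len(contacts), 2) if contacts else 0.0
    return {"enriched": enriched, "match_rate": rate}

response = enrich(
    [{"email": "john@example.com"}, {"email": "jane@example.com"}, {"phone": "555-987-6543"}],
    RECORDS,
)
```

With two of the three requested contacts matched, match_rate comes out at 0.67, which is consistent with the sample response.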

Chat Integration Ideas

Extend QueryIR with Campaign Operations

Add new operation types that the chat can invoke:

# In cb_query_ir.py, add operations:
CAMPAIGN_OPERATIONS = [
    "campaign_lookup",      # Find campaign engagement by email/phone
    "campaign_enrich",      # Add campaign data to i360 results
    "campaign_duplicates",  # Find multi-source contacts
    "campaign_sources",     # List available sources
]

Natural Language Examples

The chat could handle queries like:

User Query                                          Action
"Show donors who are also volunteers"               Cross-source duplicate lookup
"Find all campaign records for john@example.com"    Contact lookup
"Which voters in this segment have donated?"        i360 + campaign enrichment
"Show me super-supporters (5+ sources)"             High-engagement filter
"List all campaign data sources"                    Source inventory

System Prompt Addition

Add to cb_prompts.py:

CAMPAIGN_DATA_CONTEXT = """
## Campaign Data Lookup

In addition to i360 voter data, you can access campaign engagement data from
various sources (donors, volunteers, event attendees, petition signers, etc.).

Campaign data is keyed by email/phone, not SVID. Use these operations:

- campaign_lookup: Find all campaign records for an email or phone
- campaign_enrich: Add campaign engagement to voter results
- campaign_duplicates: Find contacts appearing in multiple sources

Example: "Find donors who are also volunteers"
→ Uses campaign_duplicates with sources filter

Example: "Show campaign history for john@example.com"
→ Uses campaign_lookup by email
"""

Response Actions

Add campaign-specific actions to chat responses:

CAMPAIGN_ACTIONS = [
    {
        "label": "View Campaign History",
        "action": "campaign_lookup",
        "description": "See all campaign engagement for this contact"
    },
    {
        "label": "Find Similar Supporters",
        "action": "campaign_duplicates",
        "description": "Find contacts in multiple sources"
    },
    {
        "label": "Enrich with Campaign Data",
        "action": "campaign_enrich",
        "description": "Add campaign engagement to results"
    }
]

Data Flow: Chat → cbmodels → Response

User: "Show me donors who volunteered and attended rallies"

1. Claude parses intent → campaign_duplicates operation
2. Chat calls: GET /api/cbmodels/duplicates?min_sources=3&source_types=donor,volunteer,event
3. cbmodels queries rotated_data table
4. Returns contacts present in all three requested source types
5. Chat formats response with action buttons
6. User can save as segment or view details
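Step 2 of this flow, turning a parsed operation into an HTTP request, could be sketched like this. The ENDPOINTS mapping and build_request helper are hypothetical names; the paths and methods are the ones proposed in these notes.

```python
from urllib.parse import urlencode

# Operation name -> (HTTP method, path), taken from the proposed endpoints.
ENDPOINTS = {
    "campaign_lookup": ("GET", "/api/cbmodels/lookup"),
    "campaign_duplicates": ("GET", "/api/cbmodels/duplicates"),
    "campaign_sources": ("GET", "/api/cbmodels/sources"),
    "campaign_enrich": ("POST", "/api/cbmodels/enrich"),
}

def build_request(operation: str, params: dict):
    """Return (method, url, json_body) for a campaign operation."""
    method, path = ENDPOINTS[operation]  # KeyError for unknown operations
    if method == "POST":
        return method, path, params
    # GET: flatten list values into comma-separated query parameters.
    flat = {
        key: ",".join(value) if isinstance(value, (list, tuple)) else value
        for key, value in params.items()
    }
    query = urlencode(flat, safe=",")
    url = f"{path}?{query}" if query else path
    return method, url, None

method, url, body = build_request(
    "campaign_duplicates",
    {"min_sources": 3, "source_types": ["donor", "volunteer", "event"]},
)
```

Keeping the mapping in one table means the chat layer only deals in operation names, and a later move from HTTP to direct database access would touch a single function.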

Implementation Priority

Phase 1: Core Endpoints

  1. /lookup - Single contact lookup
  2. /sources - Source inventory
  3. /duplicates - Cross-source detection

Phase 2: Enrichment

  1. /enrich - Batch enrichment for i360 results
  2. Integration with existing search results

Phase 3: Chat Integration

  1. Add campaign operations to QueryIR
  2. Extend system prompt
  3. Add response actions

Phase 4: Advanced

  1. Source type classification (donor, volunteer, event, etc.)
  2. Engagement scoring (recency, frequency, breadth)
  3. Timeline view of contact engagement
  4. Linkage to i360 via voter matching

Technical Notes

Database Location

cbmodels rotated data: ./data/rotated.db

  • Should be accessible from cbapp API
  • Consider shared mount or API-only access

Field Normalization

The normalizer maps raw column names to canonical fields:

  • email, phone, mobile - Contact fields (primary lookup keys)
  • first_name, last_name - Identity fields
  • amount, donation_date - Donor-specific
  • signup_date, event_name - Event-specific
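As an illustration of this mapping, a normalizer could reduce each raw header to a snake_case key and look it up in an alias table. The alias lists below are invented examples, not the real cbmodels mapping.

```python
import re

# Canonical field -> raw header variants (hypothetical examples).
ALIASES = {
    "email": ["email", "e_mail", "email_address"],
    "phone": ["phone", "phone_number"],
    "mobile": ["mobile", "cell", "cell_phone"],
    "first_name": ["first_name", "firstname", "first"],
    "last_name": ["last_name", "lastname", "last"],
    "amount": ["amount", "donation_amount"],
    "donation_date": ["donation_date", "date_of_donation"],
    "signup_date": ["signup_date", "sign_up_date"],
    "event_name": ["event_name", "event"],
}
ALIAS_TO_CANONICAL = {
    alias: canonical for canonical, aliases in ALIASES.items() for alias in aliases
}

def canonical_field(raw: str) -> str:
    """Map a raw column header to its canonical field name.

    Unknown headers are kept in snake_case form rather than dropped.
    """
    key = re.sub(r"[^a-z0-9]+", "_", raw.strip().lower()).strip("_")
    return ALIAS_TO_CANONICAL.get(key, key)

def normalize_row(row: dict) -> dict:
    """Rename every key of a raw row to its canonical field."""
    return {canonical_field(name): value for name, value in row.items()}

row = normalize_row(
    {"Email Address": "john@example.com", "First Name": "John", "Donation Amount ($)": "250.00"}
)
```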

Source Type Detection

Could auto-classify sources based on field presence:

  • Has amount → donor
  • Has event_name or rally → event
  • Has volunteer in filename → volunteer
  • Has petition → petition signer
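These rules could be expressed as a small ordered classifier. The sketch assumes that "rally" and "petition" are matched against the filename (the notes do not say which), that rules are checked in the order listed, and that unmatched sources fall back to "unknown".

```python
def classify_source(filename: str, fields: list) -> str:
    """Guess a source type from its canonical field names and filename."""
    name = filename.lower()
    field_set = {f.lower() for f in fields}
    if "amount" in field_set:
        return "donor"
    if "event_name" in field_set or "rally" in name:  # filename match is an assumption
        return "event"
    if "volunteer" in name:
        return "volunteer"
    if "petition" in name:  # filename match is an assumption
        return "petition"
    return "unknown"  # hypothetical fallback for manual review

labels = {
    name: classify_source(name, fields)
    for name, fields in {
        "2024_Donors.csv": ["email", "amount", "donation_date"],
        "Rally_Signups.xlsx": ["email", "signup_date", "event_name"],
        "Volunteer_List.csv": ["email", "phone"],
        "Petition_Q1.csv": ["email"],
        "misc_export.csv": ["email"],
    }.items()
}
```

Because the first matching rule wins, a volunteer sheet that happens to contain an amount column would be labeled donor, which is one argument for allowing a manual override in config.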

Performance Considerations

  • Rotated table has indexes on (field_name, value) for fast lookups
  • Email lookups are case-insensitive
  • Phone lookups strip non-digits
  • Consider caching frequent lookups
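The lookup-key handling and caching described above might be combined as in this sketch. FAKE_INDEX stands in for the indexed rotated table, and BACKEND_CALLS exists only to show when the cache is bypassed; both are hypothetical.

```python
import re
from functools import lru_cache

BACKEND_CALLS = []  # records each uncached query, for demonstration only

# Stand-in for the (field_name, value) index on the rotated table.
FAKE_INDEX = {
    ("email", "john@example.com"): ["2024_Donors.csv", "Rally_Signups.xlsx"],
    ("phone", "5551234567"): ["2024_Donors.csv"],
}

def normalize(field: str, value: str) -> str:
    """Lower-case emails and strip non-digits from phone numbers."""
    if field == "email":
        return value.strip().lower()
    if field in ("phone", "mobile"):
        return re.sub(r"\D", "", value)
    return value.strip()

@lru_cache(maxsize=1024)
def _cached_sources(field: str, key: str) -> tuple:
    BACKEND_CALLS.append((field, key))
    return tuple(FAKE_INDEX.get((field, key), ()))

def lookup_sources(field: str, value: str) -> tuple:
    """Normalize first so that equivalent inputs share one cache entry."""
    return _cached_sources(field, normalize(field, value))

first = lookup_sources("email", " John@Example.COM ")
second = lookup_sources("email", "john@example.com")
phone_hit = lookup_sources("phone", "(555) 123-4567")
```

Normalizing before the cache boundary matters: caching on the raw input would store " John@Example.COM " and "john@example.com" as separate entries. Stripping non-digits alone does not reconcile numbers that differ by a leading country code, so that case would need an extra rule.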

Questions for cbapp Team

  1. API Integration: Should cbmodels expose FastAPI endpoints, or should cbapp query DuckDB directly?

  2. Source Classification: Should source types be manual (in a config) or auto-detected from field names?

  3. Enrichment Depth: For batch enrichment, how much detail? Just source counts, or full engagement history?

  4. Chat Priority: Which queries should hit cbmodels first vs. i360?

  5. Linkage Strategy: For contacts with email but no SVID, should we attempt voter matching?

Files to Modify in cbapp

File                     Changes
cb_query_ir.py           Add campaign operation types
cb_prompts.py            Add campaign data context
cb_query_engine.py       Handle campaign operations
cb_chat.py               Route campaign queries to cbmodels
integration_settings.py  Add cbmodels integration config

Example: Unified Search Flow

User: "Find Republicans in Wayne County who have donated"

1. Claude identifies:
   - i360 filter: party=R, county=Wayne
   - Campaign filter: source_type=donor

2. Query flow:
   a. Query i360_voters → Get SVIDs + emails
   b. Query cbmodels with emails → Get donor status
   c. Filter to donors only
   d. Return enriched results

3. Response:
   "Found 1,234 Republican voters in Wayne County.
    847 (69%) have donated to campaigns.
    Average donation: $125

    [View All] [View Donors Only] [Save Segment]"

This unified approach gives the chat engine visibility into both voter data AND campaign engagement.
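The query flow in this example can be sketched as a join on email between i360 results and campaign donor totals. The record shapes, the SVID values, and the unified_search helper are hypothetical; a real implementation would call i360 and cbmodels rather than take in-memory inputs.

```python
def unified_search(voters: list, donor_totals: dict) -> dict:
    """Join i360 voter rows to campaign donor totals by lower-cased email."""
    matched = []
    for voter in voters:
        email = (voter.get("email") or "").lower()  # some voters have no email
        if email in donor_totals:
            matched.append({**voter, "total_donated": donor_totals[email]})
    total = len(voters)
    donor_pct = round(100 * len(matched) / total) if total else 0
    avg = sum(m["total_donated"] for m in matched) / len(matched) if matched else 0.0
    return {
        "voters_found": total,
        "donors": matched,
        "donor_pct": donor_pct,
        "avg_donation": avg,
    }

# Step (a): rows as they might come back from the i360 filter.
voters = [
    {"svid": "MI-001", "email": "john@example.com"},
    {"svid": "MI-002", "email": None},
    {"svid": "MI-003", "email": "Jane@Example.com"},
    {"svid": "MI-004", "email": "sam@example.com"},
]
# Step (b): donor totals keyed by normalized email, as cbmodels might return them.
donor_totals = {"john@example.com": 200.0, "jane@example.com": 50.0}

result = unified_search(voters, donor_totals)
```

Voters without an email cannot be linked this way, so the donor percentage is a lower bound on actual donors among the matched voters.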