Skip to content

AI Chat Ecosystem - Architecture & Roadmap

Version: 1.0 Last Updated: 2026-01-02 Status: Living Document

Executive Summary

The CampaignBrain AI Chat ecosystem is a modular, extensible natural language interface for querying voter data, campaign engagement, and field operations. It orchestrates Claude API calls with multiple data sources through a plugin-like dataset architecture.

Key Components

Service URL Purpose
cbapp tenant.nominate.ai Main application + AI Chat UI
CBModels models.nominate.ai Campaign data analysis API
CBDistricts districts.nominate.ai Congressional district data & boundaries
i360 Local DuckDB Voter file (8.2M FL records)

Architecture Overview

flowchart TB
    subgraph User["User Interface"]
        UI[Chat UI]
        Debug[Debug Panel]
    end

    subgraph ChatEngine["AI Chat Engine"]
        Engine[ChatEngine]
        Prompts[System Prompts]
        QueryIR[Query IR Parser]
    end

    subgraph Claude["Claude API"]
        Anthropic[Claude Sonnet 4]
    end

    subgraph Datasets["Dataset Registry"]
        Registry[DatasetRegistry]
        Campaign[Campaign Dataset]
        Events[Events Dataset]
        Contacts[Contacts Dataset]
        Bridge[Data Bridge]
    end

    subgraph DataSources["Data Sources"]
        i360[(i360 Voter DB)]
        LocalDB[(App DuckDB)]
        CBModels[CBModels API]
        Districts[Districts API]
    end

    UI --> Engine
    Debug --> Engine
    Engine --> Prompts
    Engine <--> Anthropic
    Anthropic --> |Tool Use| Registry
    Anthropic --> |QueryIR JSON| QueryIR

    Registry --> Campaign
    Registry --> Events
    Registry --> Contacts
    Registry --> Bridge

    Campaign --> CBModels
    Events --> LocalDB
    Contacts --> LocalDB
    Bridge --> i360
    Bridge --> CBModels
    QueryIR --> i360

    Districts -.-> |Future| Engine

    style Claude fill:#f9f,stroke:#333
    style CBModels fill:#bbf,stroke:#333
    style Districts fill:#bfb,stroke:#333
    style i360 fill:#fbb,stroke:#333

Detailed Component Architecture

Chat Engine Core

flowchart LR
    subgraph Input
        Request[ChatRequest]
    end

    subgraph Engine["ChatEngine (engine.py)"]
        Process[process_message]
        Build[Build System Prompt]
        Call[Claude API Call]
        Parse[Parse Response]
    end

    subgraph ResponseHandling
        ToolUse{Tool Use?}
        QueryParse{QueryIR?}
        Execute[Execute Tool]
        RunQuery[Run SQL Query]
        Format[Format Response]
    end

    subgraph Output
        Response[ChatResponse]
    end

    Request --> Process
    Process --> Build
    Build --> Call
    Call --> Parse
    Parse --> ToolUse
    ToolUse -->|Yes| Execute
    ToolUse -->|No| QueryParse
    QueryParse -->|Yes| RunQuery
    QueryParse -->|No| Format
    Execute --> Format
    RunQuery --> Format
    Format --> Response

Dataset Plugin System

classDiagram
    class Dataset {
        <<abstract>>
        +get_tools() List~ToolSchema~
        +get_schema_context() str
        +execute_tool(name, input) dict
        +health_check() HealthCheckResult
        +get_suggestions() List~Suggestion~
        +is_available() bool
    }

    class DatasetRegistry {
        -datasets: Dict
        +register(dataset)
        +get_all_tools() List
        +execute_tool(name, input)
        +get_combined_schema_context()
        +run_health_checks()
    }

    class CampaignDataset {
        +campaign_list_sources()
        +campaign_lookup_contact()
        +campaign_find_super_supporters()
        +campaign_enrich_contacts()
    }

    class EventsDataset {
        +events_list()
        +events_get_stats()
        +events_find_attendees()
        +events_search()
    }

    class ContactsDataset {
        +contacts_get_stats()
        +contacts_get_person_history()
        +contacts_find_uncontacted()
        +contacts_list_recent()
        +contacts_get_user_activity()
    }

    Dataset <|-- CampaignDataset
    Dataset <|-- EventsDataset
    Dataset <|-- ContactsDataset
    DatasetRegistry o-- Dataset

Current Functionality

1. Voter Data Queries (i360)

Source: Local DuckDB (8.2M Florida voter records)

Query Capabilities: - Natural language to SQL translation via QueryIR - Demographic filters (age, gender, race, ethnicity, religion) - Geographic filters (city, county, zip, congressional district) - Voting scores (turnout, Trump support, Biden oppose, issue scores) - Contact info (cell phone, email when available) - Voting history (last 4 general/primary elections)

Example Queries:

"Find Republican women over 65 in Miami-Dade County"
"Show me high-turnout voters in Congressional District 13"
"Count voters with cell phones and email addresses"

2. Campaign Data (CBModels API)

Source: models.nominate.ai/api/v1

Tools Available:

Tool Purpose Parameters
campaign_list_sources List all imported campaign files None
campaign_lookup_contact Find engagement by email/phone email, phone
campaign_find_super_supporters Contacts in 2+ sources field, min_sources, limit
campaign_enrich_contacts Batch enrich with campaign data emails[], phones[]

Example Queries:

"What campaign data do we have?"
"Look up john@example.com in our campaign records"
"Find our most engaged supporters who appear in multiple lists"

3. Event Management (Local DuckDB)

Source: Local app.duckdb

Tools Available:

Tool Purpose Parameters
events_list List events with registration counts limit, include_past
events_get_stats Event metrics summary None
events_find_attendees People registered for events event_id, status
events_search Search by title/location query, limit

Example Queries:

"How many events have we held?"
"Who's registered for the town hall next week?"
"Show me events in Tampa"

4. Contact Activity (Local DuckDB)

Source: Local app.duckdb

Tools Available:

Tool Purpose Parameters
contacts_get_stats Contact activity summary days
contacts_get_person_history Individual contact history person_id
contacts_find_uncontacted People needing outreach limit, segment_id
contacts_list_recent Recent contact actions limit, action_type
contacts_get_user_activity Staff performance metrics user_id, days

Example Queries:

"How many doors have we knocked this week?"
"Show me people we haven't contacted yet"
"What's our team's activity for the last 7 days?"

5. Cross-Source Data Bridge

Purpose: Match and merge data between i360 voters and campaign engagement

Bridge Operations:

flowchart LR
    subgraph Direction1["Campaign → i360"]
        C1[Campaign Contacts] --> |Extract emails| M1[Match in i360]
        M1 --> |Merge| R1[Enriched Voters]
    end

    subgraph Direction2["i360 → Campaign"]
        V1[Voter Query] --> |Extract emails/phones| M2[Match in Campaign]
        M2 --> |Merge| R2[Enriched Results]
    end

Example Queries:

"Find voters who donated and live in Hillsborough County"
"Which of our volunteers are registered Republicans?"
"Match our donor list against the voter file"


CBModels API Reference

Base URL

Production: https://models.nominate.ai/api/v1
Development: http://localhost:32411/api/v1

Authentication

Header: X-Tenant-ID: {tenant_id}

Endpoints

GET /campaign/sources

List all campaign data sources.

Response:

{
  "source_count": 5,
  "total_rows": 15000,
  "total_cells": 120000,
  "sources": [
    {
      "source_name": "2024_Donors.csv",
      "row_count": 8000,
      "fields": ["email", "first_name", "last_name", "amount"]
    }
  ]
}

POST /campaign/lookup

Find all records for a contact.

Request:

{
  "email": "john@example.com"
}

Response:

{
  "match_count": 2,
  "source_count": 2,
  "records": [
    {
      "source_name": "Donors.csv",
      "fields": {"email": "john@example.com", "amount": "250"}
    }
  ]
}

POST /campaign/duplicates

Find contacts appearing in multiple sources.

Request:

{
  "field": "email",
  "min_sources": 2,
  "limit": 50
}

POST /campaign/enrich

Batch enrich contacts with campaign data.

Request:

{
  "emails": ["a@example.com", "b@example.com"],
  "phones": []
}


Districts API Reference

Base URL

https://districts.nominate.ai/api/v1

Endpoints

GET /districts/{geoid}

Full district information.

Response:

{
  "geoid": "1213",
  "state": "FL",
  "district": 13,
  "representative": "Anna Paulina Luna",
  "party": "Republican",
  "cook_pvi": "R+6",
  "population": 769000,
  "median_income": 62000,
  "counties": ["Pinellas"]
}

GET /districts/{geoid}/geometry

GeoJSON boundary for mapping.

GET /districts/{geoid}/census

Census demographic data.

GET /districts/{geoid}/wiki

Representative and political data.


Data Flow Diagrams

Query Execution Flow

sequenceDiagram
    participant U as User
    participant E as ChatEngine
    participant C as Claude API
    participant R as DatasetRegistry
    participant D as Data Source

    U->>E: ChatRequest(message, history)
    E->>E: Build system prompt + schema context
    E->>C: messages + tools

    alt Tool Use Response
        C->>E: tool_use block
        E->>R: execute_tool(name, input)
        R->>D: API call or DB query
        D->>R: Raw results
        R->>E: Formatted response
        E->>C: tool_result
        C->>E: Final text response
    else QueryIR Response
        C->>E: JSON QueryIR
        E->>D: Execute SQL
        D->>E: Query results
    end

    E->>U: ChatResponse(message, data, actions)

Conversation Persistence

flowchart TB
    subgraph Request
        Msg[User Message]
        History[Conversation History]
    end

    subgraph Processing
        Load[Load from DB]
        Process[Process Message]
        Save[Save to DB]
    end

    subgraph Storage["DuckDB Tables"]
        Conv[conversation]
        ConvMsg[conversation_message]
    end

    Msg --> Load
    History --> Load
    Load --> |conversation_id| Conv
    Conv --> Process
    Process --> Save
    Save --> ConvMsg

Configuration

Environment Variables

# Claude API
ANTHROPIC_API_KEY=sk-ant-...
CB_CHAT_MODEL=claude-sonnet-4-20250514

# i360 Database
I360_DB_PATH=/path/to/i360.db

# CBModels API
CBMODELS_BASE_URL=https://models.nominate.ai
CBMODELS_TENANT_ID=your-tenant-id

# Local Database (events, contacts)
DB_PATH=/path/to/app.duckdb

# Debug Features
CB_CHAT_DEBUG=true
CB_CHAT_BRIDGE_ENABLED=true

Integration Settings (Database)

# Set via API or direct DB
set_setting("cbmodels", "base_url", "https://models.nominate.ai")
set_setting("cbmodels", "tenant_id", "mi20")

Roadmap

Phase 1: Foundation Enhancement (Current)

Feature Status Description
Dataset plugin system Done Extensible tool composition
Campaign data integration Done CBModels API tools
Events dataset Done Local event queries
Contacts dataset Done Field activity queries
Conversation persistence Done DB-backed history
Debug panel Done Real-time tool inspection

Phase 2: Geographic Intelligence (Q1 2026)

gantt
    title Phase 2: Geographic Intelligence
    dateFormat  YYYY-MM-DD
    section Districts
    District context tool     :a1, 2026-01-15, 3d
    Geo-fencing filters       :a2, after a1, 5d
    Boundary visualization    :a3, after a2, 4d
    section Integration
    Radio coverage context    :b1, 2026-01-20, 4d
    Media buy optimization    :b2, after b1, 5d

Planned Features:

  1. District Context Tool
  2. Natural language district lookup
  3. Representative and demographics in responses
  4. "Tell me about Florida's 13th district"

  5. Geo-Fencing Filters

  6. Multi-district targeting
  7. Radius-based searches
  8. County/city boundary awareness

  9. Radio Coverage Integration

  10. District-based media planning
  11. Overlapping coverage analysis
  12. Cost optimization queries

Phase 3: Advanced Analytics (Q2 2026)

Feature Description Complexity
Segment builder assistant AI creates segments from natural language High
Predictive targeting "Find voters likely to support us" High
A/B test analysis Campaign experiment results Medium
Donor modeling Propensity scoring for fundraising High

Phase 4: Multi-Modal & Real-Time (Q3 2026)

Feature Description
Map visualization Interactive district/voter maps
Voice interface Speech-to-text queries
Real-time updates Live field activity dashboards
Mobile optimization Touch-friendly chat UI

Future Integration Opportunities

Districts Service Enhancement

flowchart TB
    subgraph Current
        Query[Voter Query]
        District[District Filter]
        Results[Query Results]
    end

    subgraph Future["Future: Geo-Aware AI"]
        Context[District Context]
        Boundaries[Boundary Awareness]
        Geocode[Auto-Geocoding]
        Validation[Geographic Validation]
    end

    Query --> District
    District --> Results

    Context -.-> Query
    Boundaries -.-> District
    Geocode -.-> |Address → District| Query
    Validation -.-> |Flag Inconsistencies| Results

    style Future fill:#f0f0f0,stroke:#333,stroke-dasharray: 5 5

Opportunities:

  1. District Context in Responses
  2. Automatically include representative info
  3. Add demographic context to geographic queries
  4. "The 13th district is represented by Anna Paulina Luna (R)"

  5. Geo-Fencing Segment Creation

  6. "Create a segment of high-turnout voters in districts 10-15"
  7. Multiple boundary types (district, county, radius)
  8. Visual boundary preview before saving

  9. Geographic Contradiction Detection

  10. Flag voter file inconsistencies
  11. "Voter claims CD-3 but zip code is in CD-5"
  12. Data quality alerts

  13. Location-Aware Suggestions

  14. Suggest relevant districts based on campaign
  15. Auto-complete district names
  16. Nearest event recommendations

CBModels Expansion

flowchart LR
    subgraph Current["Current"]
        Sources[List Sources]
        Lookup[Contact Lookup]
        Dupes[Find Duplicates]
        Enrich[Enrich Contacts]
    end

    subgraph Future["Future"]
        Model[Predictive Models]
        Score[Propensity Scoring]
        Cluster[Donor Clustering]
        Timeline[Engagement Timeline]
    end

    Current --> |Expand| Future

Planned Endpoints:

Endpoint Purpose AI Chat Use
/models/propensity Donor likelihood scores "Who's most likely to donate?"
/models/churn Supporter churn prediction "Who might we be losing?"
/analytics/timeline Engagement over time "Show engagement trends"
/analytics/segments Auto-segment discovery "Find natural voter groups"

Tool Inventory

Current Tools (15 total)

Category Tool Status
Campaign campaign_list_sources Active
campaign_lookup_contact Active
campaign_find_super_supporters Active
campaign_enrich_contacts Active
Events events_list Active
events_get_stats Active
events_find_attendees Active
events_search Active
Contacts contacts_get_stats Active
contacts_get_person_history Active
contacts_find_uncontacted Active
contacts_list_recent Active
contacts_get_user_activity Active
Bridge bridge_campaign_to_i360 Active
bridge_i360_to_campaign Active

Planned Tools

Category Tool Phase
Districts district_context Phase 2
district_compare Phase 2
geo_fence_search Phase 2
Analytics propensity_score Phase 3
segment_suggest Phase 3
trend_analysis Phase 3

Performance Characteristics

Operation Typical Latency Max Records
QueryIR execution 50-500ms 100 samples
Claude API call 1-3s N/A
Campaign API call 500ms-2s 1000 records
District API call 100-500ms 1 district
Health check warmup 2-5s N/A

Token Usage

Metric Typical Maximum
System prompt 800-1200 tokens 2000
User message 50-200 tokens 500
Tool results 200-500 tokens 2000
Response 200-500 tokens 4096

Files Reference

Core Library (src/lib/cbchat/)

File Purpose
engine.py Main ChatEngine orchestrator
config.py Configuration management
models.py Pydantic data models
query.py QueryIR execution
bridge.py Cross-source data bridge
prompts.py System prompt templates
warmup.py Health check system
logger.py Debug logging

Datasets (src/lib/cbchat/datasets/)

File Purpose
__init__.py DatasetRegistry
base.py Abstract Dataset class
campaign.py CBModels integration
events.py Event management
contacts.py Field activity

API Routes (src/api/routes/)

File Purpose
cb_chat.py Chat endpoints
conversations.py Conversation persistence

Services (src/api/services/)

File Purpose
campaign_data_service.py CBModels client
district_service.py Districts client

Adding a New Dataset

To add a new data source to AI Chat:

# 1. Create dataset file: src/lib/cbchat/datasets/my_dataset.py

from .base import Dataset, ToolSchema, HealthCheckResult

class MyDataset(Dataset):
    name = "my_data"
    description = "My custom data source"

    def get_tools(self) -> list[ToolSchema]:
        return [
            {
                "name": "my_tool",
                "description": "Does something useful",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "param": {"type": "string"}
                    }
                }
            }
        ]

    def get_schema_context(self) -> str:
        return "## My Data\nDescription of available data..."

    async def execute_tool(self, name: str, input: dict) -> dict:
        if name == "my_tool":
            return {"result": "..."}
        raise ValueError(f"Unknown tool: {name}")

    async def health_check(self) -> HealthCheckResult:
        return HealthCheckResult(healthy=True, message="OK")

# 2. Register in datasets/__init__.py

from .my_dataset import MyDataset

# In DatasetRegistry.__init__:
self.register(MyDataset())

Glossary

Term Definition
QueryIR Query Intermediate Representation - JSON format for voter queries
SVID State Voter ID - Unique identifier in i360 data
PID Person ID - Human-readable 6-char hash for contacts
Dataset Plugin that provides tools and schema context
Bridge Cross-source matching between i360 and campaign data
GEOID Census geographic identifier (4-digit for CDs)
Cook PVI Partisan Voting Index (e.g., R+6, D+10)