AI Chat Ecosystem - Architecture & Roadmap¶
Version: 1.0 Last Updated: 2026-01-02 Status: Living Document
Executive Summary¶
The CampaignBrain AI Chat ecosystem is a modular, extensible natural language interface for querying voter data, campaign engagement, and field operations. It orchestrates Claude API calls with multiple data sources through a plugin-like dataset architecture.
Key Components¶
| Service | URL | Purpose |
|---|---|---|
| cbapp | tenant.nominate.ai | Main application + AI Chat UI |
| CBModels | models.nominate.ai | Campaign data analysis API |
| CBDistricts | districts.nominate.ai | Congressional district data & boundaries |
| i360 | Local DuckDB | Voter file (8.2M FL records) |
Architecture Overview¶
flowchart TB
subgraph User["User Interface"]
UI[Chat UI]
Debug[Debug Panel]
end
subgraph ChatEngine["AI Chat Engine"]
Engine[ChatEngine]
Prompts[System Prompts]
QueryIR[Query IR Parser]
end
subgraph Claude["Claude API"]
Anthropic[Claude Sonnet 4]
end
subgraph Datasets["Dataset Registry"]
Registry[DatasetRegistry]
Campaign[Campaign Dataset]
Events[Events Dataset]
Contacts[Contacts Dataset]
Bridge[Data Bridge]
end
subgraph DataSources["Data Sources"]
i360[(i360 Voter DB)]
LocalDB[(App DuckDB)]
CBModels[CBModels API]
Districts[Districts API]
end
UI --> Engine
Debug --> Engine
Engine --> Prompts
Engine <--> Anthropic
Anthropic --> |Tool Use| Registry
Anthropic --> |QueryIR JSON| QueryIR
Registry --> Campaign
Registry --> Events
Registry --> Contacts
Registry --> Bridge
Campaign --> CBModels
Events --> LocalDB
Contacts --> LocalDB
Bridge --> i360
Bridge --> CBModels
QueryIR --> i360
Districts -.-> |Future| Engine
style Claude fill:#f9f,stroke:#333
style CBModels fill:#bbf,stroke:#333
style Districts fill:#bfb,stroke:#333
style i360 fill:#fbb,stroke:#333
Detailed Component Architecture¶
Chat Engine Core¶
flowchart LR
subgraph Input
Request[ChatRequest]
end
subgraph Engine["ChatEngine (engine.py)"]
Process[process_message]
Build[Build System Prompt]
Call[Claude API Call]
Parse[Parse Response]
end
subgraph ResponseHandling
ToolUse{Tool Use?}
QueryParse{QueryIR?}
Execute[Execute Tool]
RunQuery[Run SQL Query]
Format[Format Response]
end
subgraph Output
Response[ChatResponse]
end
Request --> Process
Process --> Build
Build --> Call
Call --> Parse
Parse --> ToolUse
ToolUse -->|Yes| Execute
ToolUse -->|No| QueryParse
QueryParse -->|Yes| RunQuery
QueryParse -->|No| Format
Execute --> Format
RunQuery --> Format
Format --> Response
Dataset Plugin System¶
classDiagram
class Dataset {
<<abstract>>
+get_tools() List~ToolSchema~
+get_schema_context() str
+execute_tool(name, input) dict
+health_check() HealthCheckResult
+get_suggestions() List~Suggestion~
+is_available() bool
}
class DatasetRegistry {
-datasets: Dict
+register(dataset)
+get_all_tools() List
+execute_tool(name, input)
+get_combined_schema_context()
+run_health_checks()
}
class CampaignDataset {
+campaign_list_sources()
+campaign_lookup_contact()
+campaign_find_super_supporters()
+campaign_enrich_contacts()
}
class EventsDataset {
+events_list()
+events_get_stats()
+events_find_attendees()
+events_search()
}
class ContactsDataset {
+contacts_get_stats()
+contacts_get_person_history()
+contacts_find_uncontacted()
+contacts_list_recent()
+contacts_get_user_activity()
}
Dataset <|-- CampaignDataset
Dataset <|-- EventsDataset
Dataset <|-- ContactsDataset
DatasetRegistry o-- Dataset
Current Functionality¶
1. Voter Data Queries (i360)¶
Source: Local DuckDB (8.2M Florida voter records)
Query Capabilities: - Natural language to SQL translation via QueryIR - Demographic filters (age, gender, race, ethnicity, religion) - Geographic filters (city, county, zip, congressional district) - Voting scores (turnout, Trump support, Biden oppose, issue scores) - Contact info (cell phone, email when available) - Voting history (last 4 general/primary elections)
Example Queries:
"Find Republican women over 65 in Miami-Dade County"
"Show me high-turnout voters in Congressional District 13"
"Count voters with cell phones and email addresses"
2. Campaign Data (CBModels API)¶
Source: models.nominate.ai/api/v1
Tools Available:
| Tool | Purpose | Parameters |
|---|---|---|
campaign_list_sources |
List all imported campaign files | None |
campaign_lookup_contact |
Find engagement by email/phone | email, phone |
campaign_find_super_supporters |
Contacts in 2+ sources | field, min_sources, limit |
campaign_enrich_contacts |
Batch enrich with campaign data | emails[], phones[] |
Example Queries:
"What campaign data do we have?"
"Look up john@example.com in our campaign records"
"Find our most engaged supporters who appear in multiple lists"
3. Event Management (Local DuckDB)¶
Source: Local app.duckdb
Tools Available:
| Tool | Purpose | Parameters |
|---|---|---|
events_list |
List events with registration counts | limit, include_past |
events_get_stats |
Event metrics summary | None |
events_find_attendees |
People registered for events | event_id, status |
events_search |
Search by title/location | query, limit |
Example Queries:
"How many events have we held?"
"Who's registered for the town hall next week?"
"Show me events in Tampa"
4. Contact Activity (Local DuckDB)¶
Source: Local app.duckdb
Tools Available:
| Tool | Purpose | Parameters |
|---|---|---|
contacts_get_stats |
Contact activity summary | days |
contacts_get_person_history |
Individual contact history | person_id |
contacts_find_uncontacted |
People needing outreach | limit, segment_id |
contacts_list_recent |
Recent contact actions | limit, action_type |
contacts_get_user_activity |
Staff performance metrics | user_id, days |
Example Queries:
"How many doors have we knocked this week?"
"Show me people we haven't contacted yet"
"What's our team's activity for the last 7 days?"
5. Cross-Source Data Bridge¶
Purpose: Match and merge data between i360 voters and campaign engagement
Bridge Operations:
flowchart LR
subgraph Direction1["Campaign → i360"]
C1[Campaign Contacts] --> |Extract emails| M1[Match in i360]
M1 --> |Merge| R1[Enriched Voters]
end
subgraph Direction2["i360 → Campaign"]
V1[Voter Query] --> |Extract emails/phones| M2[Match in Campaign]
M2 --> |Merge| R2[Enriched Results]
end
Example Queries:
"Find voters who donated and live in Hillsborough County"
"Which of our volunteers are registered Republicans?"
"Match our donor list against the voter file"
CBModels API Reference¶
Base URL¶
Authentication¶
Endpoints¶
GET /campaign/sources¶
List all campaign data sources.
Response:
{
"source_count": 5,
"total_rows": 15000,
"total_cells": 120000,
"sources": [
{
"source_name": "2024_Donors.csv",
"row_count": 8000,
"fields": ["email", "first_name", "last_name", "amount"]
}
]
}
POST /campaign/lookup¶
Find all records for a contact.
Request:
Response:
{
"match_count": 2,
"source_count": 2,
"records": [
{
"source_name": "Donors.csv",
"fields": {"email": "john@example.com", "amount": "250"}
}
]
}
POST /campaign/duplicates¶
Find contacts appearing in multiple sources.
Request:
POST /campaign/enrich¶
Batch enrich contacts with campaign data.
Request:
Districts API Reference¶
Base URL¶
Endpoints¶
GET /districts/{geoid}¶
Full district information.
Response:
{
"geoid": "1213",
"state": "FL",
"district": 13,
"representative": "Anna Paulina Luna",
"party": "Republican",
"cook_pvi": "R+6",
"population": 769000,
"median_income": 62000,
"counties": ["Pinellas"]
}
GET /districts/{geoid}/geometry¶
GeoJSON boundary for mapping.
GET /districts/{geoid}/census¶
Census demographic data.
GET /districts/{geoid}/wiki¶
Representative and political data.
Data Flow Diagrams¶
Query Execution Flow¶
sequenceDiagram
participant U as User
participant E as ChatEngine
participant C as Claude API
participant R as DatasetRegistry
participant D as Data Source
U->>E: ChatRequest(message, history)
E->>E: Build system prompt + schema context
E->>C: messages + tools
alt Tool Use Response
C->>E: tool_use block
E->>R: execute_tool(name, input)
R->>D: API call or DB query
D->>R: Raw results
R->>E: Formatted response
E->>C: tool_result
C->>E: Final text response
else QueryIR Response
C->>E: JSON QueryIR
E->>D: Execute SQL
D->>E: Query results
end
E->>U: ChatResponse(message, data, actions)
Conversation Persistence¶
flowchart TB
subgraph Request
Msg[User Message]
History[Conversation History]
end
subgraph Processing
Load[Load from DB]
Process[Process Message]
Save[Save to DB]
end
subgraph Storage["DuckDB Tables"]
Conv[conversation]
ConvMsg[conversation_message]
end
Msg --> Load
History --> Load
Load --> |conversation_id| Conv
Conv --> Process
Process --> Save
Save --> ConvMsg
Configuration¶
Environment Variables¶
# Claude API
ANTHROPIC_API_KEY=sk-ant-...
CB_CHAT_MODEL=claude-sonnet-4-20250514
# i360 Database
I360_DB_PATH=/path/to/i360.db
# CBModels API
CBMODELS_BASE_URL=https://models.nominate.ai
CBMODELS_TENANT_ID=your-tenant-id
# Local Database (events, contacts)
DB_PATH=/path/to/app.duckdb
# Debug Features
CB_CHAT_DEBUG=true
CB_CHAT_BRIDGE_ENABLED=true
Integration Settings (Database)¶
# Set via API or direct DB
set_setting("cbmodels", "base_url", "https://models.nominate.ai")
set_setting("cbmodels", "tenant_id", "mi20")
Roadmap¶
Phase 1: Foundation Enhancement (Current)¶
| Feature | Status | Description |
|---|---|---|
| Dataset plugin system | Done | Extensible tool composition |
| Campaign data integration | Done | CBModels API tools |
| Events dataset | Done | Local event queries |
| Contacts dataset | Done | Field activity queries |
| Conversation persistence | Done | DB-backed history |
| Debug panel | Done | Real-time tool inspection |
Phase 2: Geographic Intelligence (Q1 2026)¶
gantt
title Phase 2: Geographic Intelligence
dateFormat YYYY-MM-DD
section Districts
District context tool :a1, 2026-01-15, 3d
Geo-fencing filters :a2, after a1, 5d
Boundary visualization :a3, after a2, 4d
section Integration
Radio coverage context :b1, 2026-01-20, 4d
Media buy optimization :b2, after b1, 5d
Planned Features:
- District Context Tool
- Natural language district lookup
- Representative and demographics in responses
-
"Tell me about Florida's 13th district"
-
Geo-Fencing Filters
- Multi-district targeting
- Radius-based searches
-
County/city boundary awareness
-
Radio Coverage Integration
- District-based media planning
- Overlapping coverage analysis
- Cost optimization queries
Phase 3: Advanced Analytics (Q2 2026)¶
| Feature | Description | Complexity |
|---|---|---|
| Segment builder assistant | AI creates segments from natural language | High |
| Predictive targeting | "Find voters likely to support us" | High |
| A/B test analysis | Campaign experiment results | Medium |
| Donor modeling | Propensity scoring for fundraising | High |
Phase 4: Multi-Modal & Real-Time (Q3 2026)¶
| Feature | Description |
|---|---|
| Map visualization | Interactive district/voter maps |
| Voice interface | Speech-to-text queries |
| Real-time updates | Live field activity dashboards |
| Mobile optimization | Touch-friendly chat UI |
Future Integration Opportunities¶
Districts Service Enhancement¶
flowchart TB
subgraph Current
Query[Voter Query]
District[District Filter]
Results[Query Results]
end
subgraph Future["Future: Geo-Aware AI"]
Context[District Context]
Boundaries[Boundary Awareness]
Geocode[Auto-Geocoding]
Validation[Geographic Validation]
end
Query --> District
District --> Results
Context -.-> Query
Boundaries -.-> District
Geocode -.-> |Address → District| Query
Validation -.-> |Flag Inconsistencies| Results
style Future fill:#f0f0f0,stroke:#333,stroke-dasharray: 5 5
Opportunities:
- District Context in Responses
- Automatically include representative info
- Add demographic context to geographic queries
-
"The 13th district is represented by Anna Paulina Luna (R)"
-
Geo-Fencing Segment Creation
- "Create a segment of high-turnout voters in districts 10-15"
- Multiple boundary types (district, county, radius)
-
Visual boundary preview before saving
-
Geographic Contradiction Detection
- Flag voter file inconsistencies
- "Voter claims CD-3 but zip code is in CD-5"
-
Data quality alerts
-
Location-Aware Suggestions
- Suggest relevant districts based on campaign
- Auto-complete district names
- Nearest event recommendations
CBModels Expansion¶
flowchart LR
subgraph Current["Current"]
Sources[List Sources]
Lookup[Contact Lookup]
Dupes[Find Duplicates]
Enrich[Enrich Contacts]
end
subgraph Future["Future"]
Model[Predictive Models]
Score[Propensity Scoring]
Cluster[Donor Clustering]
Timeline[Engagement Timeline]
end
Current --> |Expand| Future
Planned Endpoints:
| Endpoint | Purpose | AI Chat Use |
|---|---|---|
/models/propensity |
Donor likelihood scores | "Who's most likely to donate?" |
/models/churn |
Supporter churn prediction | "Who might we be losing?" |
/analytics/timeline |
Engagement over time | "Show engagement trends" |
/analytics/segments |
Auto-segment discovery | "Find natural voter groups" |
Tool Inventory¶
Current Tools (15 total)¶
| Category | Tool | Status |
|---|---|---|
| Campaign | campaign_list_sources | Active |
| campaign_lookup_contact | Active | |
| campaign_find_super_supporters | Active | |
| campaign_enrich_contacts | Active | |
| Events | events_list | Active |
| events_get_stats | Active | |
| events_find_attendees | Active | |
| events_search | Active | |
| Contacts | contacts_get_stats | Active |
| contacts_get_person_history | Active | |
| contacts_find_uncontacted | Active | |
| contacts_list_recent | Active | |
| contacts_get_user_activity | Active | |
| Bridge | bridge_campaign_to_i360 | Active |
| bridge_i360_to_campaign | Active |
Planned Tools¶
| Category | Tool | Phase |
|---|---|---|
| Districts | district_context | Phase 2 |
| district_compare | Phase 2 | |
| geo_fence_search | Phase 2 | |
| Analytics | propensity_score | Phase 3 |
| segment_suggest | Phase 3 | |
| trend_analysis | Phase 3 |
Performance Characteristics¶
| Operation | Typical Latency | Max Records |
|---|---|---|
| QueryIR execution | 50-500ms | 100 samples |
| Claude API call | 1-3s | N/A |
| Campaign API call | 500ms-2s | 1000 records |
| District API call | 100-500ms | 1 district |
| Health check warmup | 2-5s | N/A |
Token Usage¶
| Metric | Typical | Maximum |
|---|---|---|
| System prompt | 800-1200 tokens | 2000 |
| User message | 50-200 tokens | 500 |
| Tool results | 200-500 tokens | 2000 |
| Response | 200-500 tokens | 4096 |
Files Reference¶
Core Library (src/lib/cbchat/)¶
| File | Purpose |
|---|---|
engine.py |
Main ChatEngine orchestrator |
config.py |
Configuration management |
models.py |
Pydantic data models |
query.py |
QueryIR execution |
bridge.py |
Cross-source data bridge |
prompts.py |
System prompt templates |
warmup.py |
Health check system |
logger.py |
Debug logging |
Datasets (src/lib/cbchat/datasets/)¶
| File | Purpose |
|---|---|
__init__.py |
DatasetRegistry |
base.py |
Abstract Dataset class |
campaign.py |
CBModels integration |
events.py |
Event management |
contacts.py |
Field activity |
API Routes (src/api/routes/)¶
| File | Purpose |
|---|---|
cb_chat.py |
Chat endpoints |
conversations.py |
Conversation persistence |
Services (src/api/services/)¶
| File | Purpose |
|---|---|
campaign_data_service.py |
CBModels client |
district_service.py |
Districts client |
Adding a New Dataset¶
To add a new data source to AI Chat:
# 1. Create dataset file: src/lib/cbchat/datasets/my_dataset.py
from .base import Dataset, ToolSchema, HealthCheckResult
class MyDataset(Dataset):
name = "my_data"
description = "My custom data source"
def get_tools(self) -> list[ToolSchema]:
return [
{
"name": "my_tool",
"description": "Does something useful",
"input_schema": {
"type": "object",
"properties": {
"param": {"type": "string"}
}
}
}
]
def get_schema_context(self) -> str:
return "## My Data\nDescription of available data..."
async def execute_tool(self, name: str, input: dict) -> dict:
if name == "my_tool":
return {"result": "..."}
raise ValueError(f"Unknown tool: {name}")
async def health_check(self) -> HealthCheckResult:
return HealthCheckResult(healthy=True, message="OK")
# 2. Register in datasets/__init__.py
from .my_dataset import MyDataset
# In DatasetRegistry.__init__:
self.register(MyDataset())
Glossary¶
| Term | Definition |
|---|---|
| QueryIR | Query Intermediate Representation - JSON format for voter queries |
| SVID | State Voter ID - Unique identifier in i360 data |
| PID | Person ID - Human-readable 6-char hash for contacts |
| Dataset | Plugin that provides tools and schema context |
| Bridge | Cross-source matching between i360 and campaign data |
| GEOID | Census geographic identifier (4-digit for CDs) |
| Cook PVI | Partisan Voting Index (e.g., R+6, D+10) |