CBOS Intelligence Layer - THE PLAN¶
Vision¶
Transform CBOS from a passive session monitor into an intelligent orchestration system that actively helps manage multiple Claude Code sessions through AI-powered analysis, prioritization, and response assistance.
Architecture Overview¶
┌─────────────────────────────────────────────────────────────────────────────┐
│ CBOS TUI │
│ ┌─────────────┐ ┌──────────────────┐ ┌─────────────────────────────────┐ │
│ │ Sessions │ │ AI Suggestions │ │ Priority Queue / Smart View │ │
│ │ ● AUTH │ │ ┌────────────┐ │ │ 1. 🔴 AUTH - needs decision │ │
│ │ ◐ INTEL │ │ │ Suggested │ │ │ 2. 🟡 DOCS - clarification │ │
│ │ ○ DOCS │ │ │ Response: │ │ │ 3. 🟢 INTEL - routine │ │
│ │ │ │ │ "yes, ..." │ │ │ │ │
│ └─────────────┘ │ └────────────┘ │ └─────────────────────────────────┘ │
└────────────────────────────┬────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ CBOS API (port 32205) │
│ ┌─────────────────────────────────────────────────────────────────────────┐│
│ │ Intelligence Service ││
│ │ ┌───────────────┐ ┌───────────────┐ ┌───────────────┐ ┌─────────────┐ ││
│ │ │ ResponseDraft │ │ Summarizer │ │ Prioritizer │ │ Embeddings │ ││
│ │ │ Generator │ │ │ │ │ │ Store │ ││
│ │ └───────────────┘ └───────────────┘ └───────────────┘ └─────────────┘ ││
│ └─────────────────────────────────────────────────────────────────────────┘│
└────────────────────────────┬────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ CBAI Service (ai.nominate.ai) │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │
│ │ /chat │ │ /summarize │ │ /topics │ │ /embed │ │
│ │ mistral/ │ │ │ │ │ │ nomic-embed-text │ │
│ │ claude │ │ │ │ │ │ 768-dim vectors │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
Feature 1: Auto-Suggest Responses¶
Purpose¶
When a Claude session is waiting for input, analyze the question and generate a suggested response that the user can accept, edit, or reject.
Implementation¶
New Endpoint: POST /sessions/{slug}/suggest¶
@app.post("/sessions/{slug}/suggest")
async def suggest_response(slug: str) -> SuggestionResponse:
"""Generate AI-suggested response for a waiting session"""
session = store.get(slug)
if not session or session.state != SessionState.WAITING:
raise HTTPException(400, "Session not waiting for input")
# Get context: last N lines of buffer + the question
buffer = store.get_buffer(slug, lines=50)
question = session.last_question
# Call CBAI to generate suggestion
suggestion = await intelligence.generate_suggestion(
question=question,
context=buffer,
session_slug=slug
)
return SuggestionResponse(
slug=slug,
question=question,
suggested_response=suggestion.response,
confidence=suggestion.confidence,
reasoning=suggestion.reasoning
)
Intelligence Service¶
# cbos/intelligence/suggestions.py
class SuggestionGenerator:
"""Generate response suggestions using CBAI"""
SYSTEM_PROMPT = """You are an assistant helping a developer respond to Claude Code.
Claude Code is asking a question and waiting for input. Based on the context
and question, suggest a helpful response.
Guidelines:
- Be concise - most responses are short confirmations or brief instructions
- If Claude is asking for permission, usually "yes" or "y" is appropriate
- If Claude needs clarification, provide specific guidance
- If unsure, say so and offer options
Respond with JSON:
{
"response": "your suggested response",
"confidence": 0.0-1.0,
"reasoning": "brief explanation"
}"""
async def generate(self, question: str, context: str) -> Suggestion:
response = await self.cbai.chat(
messages=[
{"role": "system", "content": self.SYSTEM_PROMPT},
{"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"}
],
provider="ollama", # Fast local inference
model="mistral-small3.2:latest"
)
return Suggestion.parse(response)
TUI Integration¶
- New keybinding:
s- Show suggestion for selected session - Display suggestion in a panel below the buffer
Enterto accept,eto edit,Escto dismiss
Feature 2: Session Summarization¶
Purpose¶
Provide quick summaries of what each session is working on, visible in the session list and detailed view.
Implementation¶
New Endpoint: GET /sessions/{slug}/summary¶
@app.get("/sessions/{slug}/summary")
async def get_session_summary(slug: str) -> SummaryResponse:
"""Get AI-generated summary of session activity"""
buffer = store.get_buffer(slug, lines=200)
summary = await intelligence.summarize_session(buffer)
return SummaryResponse(
slug=slug,
summary=summary.short, # 1-line for list view
detailed=summary.detailed, # 2-3 sentences for detail view
topics=summary.topics, # Key themes
last_action=summary.last_action
)
Caching Strategy¶
class SummaryCache:
"""Cache summaries to avoid redundant API calls"""
def __init__(self, ttl_seconds: int = 30):
self._cache: dict[str, tuple[Summary, float]] = {}
self.ttl = ttl_seconds
async def get(self, slug: str, buffer_hash: str) -> Optional[Summary]:
cached = self._cache.get(slug)
if cached and cached[1] > time.time() - self.ttl:
if cached[0].buffer_hash == buffer_hash:
return cached[0]
return None
TUI Integration¶
- Show 1-line summary next to session name in list
- Full summary in content header when session selected
- Refresh summary on demand with
Skeybinding
Feature 3: Priority Queue¶
Purpose¶
Rank waiting sessions by urgency and importance, helping users focus on what matters most.
Implementation¶
Priority Factors¶
| Factor | Weight | Detection Method |
|---|---|---|
| Time waiting | 0.3 | now - last_activity |
| Question type | 0.3 | AI classification |
| Error severity | 0.2 | Pattern matching |
| User preference | 0.2 | Configured priorities |
Question Types (AI-classified)¶
class QuestionType(Enum):
PERMISSION = "permission" # "Should I proceed?", "Run this command?"
DECISION = "decision" # "Which approach?", "Option A or B?"
CLARIFICATION = "clarification" # "What do you mean by...?"
ERROR = "error" # "Failed to...", "Error occurred"
INFORMATION = "information" # "What is the...?", "Where should...?"
New Endpoint: GET /sessions/prioritized¶
@app.get("/sessions/prioritized")
async def get_prioritized_sessions() -> list[PrioritizedSession]:
"""Get waiting sessions ranked by priority"""
waiting = store.waiting()
prioritized = []
for session in waiting:
priority = await intelligence.calculate_priority(session)
prioritized.append(PrioritizedSession(
session=session,
priority_score=priority.score,
priority_reason=priority.reason,
question_type=priority.question_type,
suggested_action=priority.suggested_action
))
return sorted(prioritized, key=lambda p: p.priority_score, reverse=True)
TUI Integration¶
- New view mode:
ptoggles priority view - Shows waiting sessions sorted by priority
- Color-coded urgency indicators
- Priority reason shown on hover/select
Feature 4: Cross-Session Context (Embeddings)¶
Purpose¶
Detect when sessions are working on related tasks, enabling context sharing and conflict detection.
Implementation¶
Embedding Storage¶
# cbos/intelligence/embeddings.py
class SessionEmbeddingStore:
"""Store and query session context embeddings"""
def __init__(self, cbai_url: str):
self.cbai = CBAIClient(cbai_url)
self._embeddings: dict[str, np.ndarray] = {}
self._contexts: dict[str, str] = {}
async def update(self, slug: str, buffer: str) -> None:
"""Update embedding for a session's current context"""
# Summarize first to reduce token count
summary = await self.cbai.summarize(buffer, max_length=500)
# Generate embedding
embedding = await self.cbai.embed(summary)
self._embeddings[slug] = np.array(embedding)
self._contexts[slug] = summary
def find_related(self, slug: str, threshold: float = 0.7) -> list[RelatedSession]:
"""Find sessions with similar context"""
if slug not in self._embeddings:
return []
target = self._embeddings[slug]
related = []
for other_slug, other_embed in self._embeddings.items():
if other_slug == slug:
continue
similarity = cosine_similarity(target, other_embed)
if similarity >= threshold:
related.append(RelatedSession(
slug=other_slug,
similarity=similarity,
context_summary=self._contexts[other_slug]
))
return sorted(related, key=lambda r: r.similarity, reverse=True)
New Endpoint: GET /sessions/{slug}/related¶
@app.get("/sessions/{slug}/related")
async def get_related_sessions(slug: str) -> list[RelatedSession]:
"""Find sessions working on similar tasks"""
return intelligence.embeddings.find_related(slug)
Use Cases¶
- Context Sharing: "AUTH and TENANT are both working on user permissions"
- Conflict Detection: "DOCS and API are modifying the same files"
- Task Routing: "This task is similar to what INTEL is doing"
Feature 5: Smart Routing¶
Purpose¶
When starting a new task, suggest which existing session should handle it or recommend creating a new one.
Implementation¶
New Endpoint: POST /sessions/route¶
@app.post("/sessions/route")
async def route_task(request: RouteRequest) -> RouteResponse:
"""Suggest which session should handle a task"""
task_description = request.task
# Embed the task
task_embedding = await intelligence.embed(task_description)
# Find best matching session
sessions = store.all()
candidates = []
for session in sessions:
if session.state == SessionState.ERROR:
continue
similarity = intelligence.embeddings.similarity(
task_embedding,
session.slug
)
candidates.append(RoutingCandidate(
slug=session.slug,
match_score=similarity,
current_state=session.state,
summary=await intelligence.get_summary(session.slug)
))
# Rank and recommend
ranked = sorted(candidates, key=lambda c: c.match_score, reverse=True)
best = ranked[0] if ranked and ranked[0].match_score > 0.6 else None
return RouteResponse(
recommended_session=best.slug if best else None,
recommendation_reason=generate_reason(best, task_description),
alternatives=ranked[1:3],
suggest_new=best is None or best.match_score < 0.6
)
TUI Integration¶
- New command:
n- New task routing - Prompt for task description
- Show recommended session with explanation
- Quick action to send task to selected session
Data Models¶
# cbos/intelligence/models.py
class Suggestion(BaseModel):
response: str
confidence: float
reasoning: str
class Summary(BaseModel):
short: str # "Implementing auth middleware"
detailed: str # 2-3 sentence description
topics: list[str] # ["authentication", "middleware", "FastAPI"]
last_action: str # "Writing test cases"
buffer_hash: str # For cache invalidation
class Priority(BaseModel):
score: float # 0.0 - 1.0
reason: str # "Error requiring immediate attention"
question_type: QuestionType
wait_time: int # Seconds waiting
suggested_action: str
class RelatedSession(BaseModel):
slug: str
similarity: float
context_summary: str
shared_topics: list[str]
class RoutingCandidate(BaseModel):
slug: str
match_score: float
current_state: SessionState
summary: str
availability: str # "idle", "busy", "waiting"
Configuration¶
# cbos/core/config.py
class IntelligenceSettings(BaseSettings):
cbai_url: str = "https://ai.nominate.ai"
# Model selection
suggestion_model: str = "mistral-small3.2:latest"
suggestion_provider: str = "ollama"
summary_model: str = "mistral-small3.2:latest"
priority_model: str = "mistral-small3.2:latest"
# Use Claude for complex reasoning
complex_reasoning_model: str = "claude-sonnet-4-5-20250929"
complex_reasoning_provider: str = "claude"
# Caching
summary_cache_ttl: int = 30
embedding_update_interval: int = 60
# Thresholds
suggestion_confidence_threshold: float = 0.7
related_session_threshold: float = 0.7
routing_match_threshold: float = 0.6
Implementation Phases¶
Phase 1: Foundation (Core Intelligence Service)¶
- Create
cbos/intelligence/module - Implement CBAI client wrapper
- Add configuration for AI settings
- Basic suggestion generation endpoint
Phase 2: Suggestions¶
- Implement
SuggestionGenerator - Add
/sessions/{slug}/suggestendpoint - TUI integration with suggestion panel
- Accept/edit/reject flow
Phase 3: Summarization¶
- Implement
SessionSummarizer - Add summary caching
- Add
/sessions/{slug}/summaryendpoint - TUI integration in session list and detail
Phase 4: Priority Queue¶
- Implement
PriorityCalculator - Question type classification
- Add
/sessions/prioritizedendpoint - TUI priority view mode
Phase 5: Cross-Session Context¶
- Implement
SessionEmbeddingStore - Background embedding updates
- Add
/sessions/{slug}/relatedendpoint - TUI related sessions indicator
Phase 6: Smart Routing¶
- Implement routing logic
- Add
/sessions/routeendpoint - TUI new task flow
- Integration with session creation
API Summary¶
| Endpoint | Method | Purpose |
|---|---|---|
/sessions/{slug}/suggest |
POST | Generate response suggestion |
/sessions/{slug}/summary |
GET | Get session summary |
/sessions/prioritized |
GET | Get priority-ranked waiting sessions |
/sessions/{slug}/related |
GET | Find related sessions |
/sessions/route |
POST | Suggest session for new task |
/intelligence/health |
GET | AI service health check |
TUI Keybindings (New)¶
| Key | Action |
|---|---|
s |
Show AI suggestion for selected session |
S |
Refresh summary |
p |
Toggle priority view |
n |
New task routing |
x |
Show related sessions |
Success Metrics¶
- Response Time: Suggestions generated in < 2 seconds
- Suggestion Accuracy: > 70% of suggestions accepted without edit
- Priority Accuracy: User agrees with top priority > 80% of time
- Related Detection: False positive rate < 20%
- Routing Accuracy: Recommended session is correct > 75% of time
Dependencies¶
Add to pyproject.toml:
dependencies = [
# ... existing ...
"numpy>=1.24", # For embedding operations
"httpx>=0.24", # Already present, for CBAI calls
]
Security Considerations¶
- API Key Management: CBAI credentials stored in environment, not code
- Rate Limiting: Implement backoff for CBAI calls
- Data Sanitization: Strip sensitive content before sending to AI
- Local First: Use Ollama (local) by default, Claude for complex cases
Next Steps¶
- Review and approve this plan
- Create
cbos/intelligence/module structure - Implement CBAI client wrapper
- Start with Phase 1: Basic suggestion generation