API Documentation Standard for Nominate.AI¶

This document defines the standard for FastAPI/OpenAPI documentation across all Nominate.AI services. The gold standard reference is models.nominate.ai (/docs | /redoc).

Document Structure Overview¶

OpenAPI Specification
├── info                    # API metadata (title, description, contact)
├── servers                 # Environment URLs
├── tags                    # Endpoint groupings with descriptions
├── paths                   # Endpoint definitions
│   └── /api/v1/endpoint
│       ├── summary         # One-line description
│       ├── description     # Detailed markdown documentation
│       ├── tags            # Grouping reference
│       ├── parameters      # Query/path params
│       ├── requestBody     # Input schema
│       └── responses       # Output schemas + examples
└── components
    └── schemas             # Reusable data models

1. API Info Section¶

Every API MUST include a complete info section in FastAPI:

from fastapi import FastAPI

app = FastAPI(
    title="Service Name API",
    description="""
## Overview

Brief description of what this API does and who it's for.

## Key Capabilities

### Feature 1
Description of first major capability.

### Feature 2
Description of second major capability.

## How to Use This API

Step-by-step guidance for consumers.

### Typical Workflow

User: "Example query" API: 1. First step 2. Second step 3. Return results

## Data Sources

- **Source 1**: Description
- **Source 2**: Description

## Rate Limits

- Endpoint X: Limit per request
- Recommended batch size: X-Y items
""",
    version="1.0.0",
    contact={
        "name": "Team Name",
        "url": "https://github.com/Nominate-AI/repo-name",
        "email": "team@nominate.ai"
    },
    license_info={
        "name": "Proprietary",
        "url": "https://nominate.ai/terms"
    },
    docs_url="/docs",
    redoc_url="/redoc",
    openapi_url="/openapi.json"
)

Required Info Fields¶

Field	Required	Description
`title`	Yes	Clear, descriptive API name
`description`	Yes	Markdown overview with sections
`version`	Yes	Semantic version (1.0.0)
`contact.name`	Yes	Team or owner name
`contact.url`	Yes	GitHub repo URL
`contact.email`	Yes	Team email
`license_info`	Yes	License type and URL

Description Section Headers¶

Use these markdown headers in the API description:

Overview - What the API does (2-3 sentences)
Key Capabilities - Major features with subsections
How to Use This API - Integration guidance
Typical Workflow - Code block showing usage flow
Data Sources - Backing data with record counts
Rate Limits - Throughput constraints

2. Tags (Endpoint Groupings)¶

Tags organize endpoints into logical categories. Each tag MUST have a description.

from fastapi import FastAPI

app = FastAPI(...)

# Define tags with descriptions
tags_metadata = [
    {
        "name": "segment",
        "description": """
**Segment Analysis Operations**

Analyze voter segments defined by State Voter IDs (SVIDs). These endpoints
compare segment characteristics against the pre-computed baseline.

Use these endpoints when a user asks questions like:
- "How do these voters differ from the general population?"
- "What's unique about this donor segment?"
"""
    },
    {
        "name": "health",
        "description": """
**System Health & Status**

Monitor API health and configuration. Use before making analysis
requests to ensure the system is ready.
"""
    }
]

app = FastAPI(
    ...,
    openapi_tags=tags_metadata
)

Tag Naming Convention¶

Tag Name	Use Case
`health`	Health checks, status endpoints
`segment`	Segment analysis operations
`baseline`	Reference data access
`behavioral`	Behavioral/engagement data
`campaign`	Campaign-specific operations
`admin`	Administrative operations

Tag Description Format¶

**Bold Title**

One paragraph explaining the category purpose.

Use these endpoints when a user asks questions like:
- "Example question 1?"
- "Example question 2?"
- "Example question 3?"

3. Endpoint Documentation¶

Every endpoint MUST include summary, description, tags, and documented responses.

3.1 Summary (One-Liner)¶

Short, action-oriented description (< 60 chars):

@app.post("/api/v1/segment/analyze", tags=["segment"])
async def analyze_segment(...):
    """Analyze voter segment against baseline"""  # This becomes the summary

Good summaries: - "Analyze voter segment against baseline" - "Get baseline model summary" - "Check API health status" - "Enrich segment with behavioral data"

Bad summaries: - "This endpoint analyzes segments" (verbose) - "POST segment analysis" (redundant) - "Segment" (too short)

3.2 Description (Detailed)¶

Use markdown with structured sections:

@app.post("/api/v1/segment/analyze", tags=["segment"])
async def analyze_segment(request: SegmentAnalysisRequest):
    """
    ## Analyze a Voter Segment

    Compare a segment of voters (defined by State Voter IDs) against the full
    voter baseline to identify statistically significant differences.

    ### What This Endpoint Does

    1. **Resolves SVIDs** - Converts State Voter IDs to internal records
    2. **Computes statistics** - Calculates mean, median, distributions
    3. **Compares to baseline** - Identifies deviations from baseline
    4. **Flags significance** - Marks columns with significant differences

    ### Use Cases

    - **Donor Analysis**: "How do Trump donors differ from the population?"
    - **Geographic Targeting**: "What's unique about Miami-Dade voters?"
    - **Behavioral Insights**: "Do email clickers have different demographics?"

    ### Response Interpretation

    The consumer should:
    1. **Start with the summary** - Report segment size and deviations
    2. **Highlight top deviations** - Focus on `summary.top_deviations`
    3. **Explain categorical shifts** - Compare segment vs baseline values
    4. **Note numeric differences** - Use `mean_deviation` percentages

    ### Example Narrative

    ```
    "This segment of 823 voters represents 1% of the baseline.
    They show 8 significant deviations:

    - County: 40% in Miami-Dade (vs 15% baseline) - 2.7x over
    - Age: Average 58.3 (vs 45.2 baseline) - 29% older
    - Party: 72% Republican (vs 45% baseline)"
    ```
    """

Description Section Template¶

## [Action] [Object]

One paragraph explaining what this endpoint does.

### What This Endpoint Does

1. **Step 1** - Description
2. **Step 2** - Description
3. **Step 3** - Description

### Use Cases

- **Use Case 1**: "Example question"
- **Use Case 2**: "Example question"
- **Use Case 3**: "Example question"

### Response Interpretation

How to interpret the response data.

### Example Narrative

"Plain English interpretation of typical response..."

4. Request/Response Schemas¶

4.1 Schema Definitions¶

Every Pydantic model MUST include docstrings and field descriptions:

from pydantic import BaseModel, Field
from typing import List, Optional

class SegmentAnalysisRequest(BaseModel):
    """
    Request for segment analysis.

    Provide a list of State Voter IDs to analyze against the baseline.
    """

    svids: List[str] = Field(
        ...,
        min_length=1,
        max_length=100000,
        description="List of State Voter IDs to analyze. Maximum 100,000 per request.",
        json_schema_extra={"examples": [["FL-123456789", "FL-987654321"]]}
    )

    include_patterns: bool = Field(
        default=True,
        description="Include data quality patterns (missing data, imbalance) in results."
    )

    significance_threshold: float = Field(
        default=0.05,
        ge=0.001,
        le=0.5,
        description="P-value threshold for flagging significant deviations. Lower = stricter."
    )

    class Config:
        json_schema_extra = {
            "example": {
                "svids": ["FL-123456789", "FL-987654321", "FL-456789123"],
                "include_patterns": True,
                "significance_threshold": 0.05
            }
        }

4.2 Response Schema¶

class SegmentAnalysisResponse(BaseModel):
    """
    Response from segment analysis.

    Contains summary statistics and detailed column-by-column deviations.
    """

    request_id: str = Field(
        description="Unique identifier for this analysis request."
    )

    analyzed_at: datetime = Field(
        description="Timestamp when analysis was performed."
    )

    segment_svids_provided: int = Field(
        description="Number of SVIDs in the original request."
    )

    segment_svids_matched: int = Field(
        description="Number of SVIDs successfully matched to voter records."
    )

    summary: AnalysisSummary = Field(
        description="High-level summary of analysis results."
    )

    columns: List[ColumnDeviation] = Field(
        description="Detailed deviation analysis for each column."
    )

    class Config:
        json_schema_extra = {
            "example": {
                "request_id": "550e8400-e29b-41d4-a716-446655440000",
                "analyzed_at": "2025-12-26T10:30:00Z",
                "segment_svids_provided": 1000,
                "segment_svids_matched": 823,
                "summary": {
                    "segment_size": 823,
                    "baseline_size": 81000,
                    "significant_deviations": 8,
                    "top_deviations": ["county", "age", "party_affiliation"]
                },
                "columns": []
            }
        }

Field Description Requirements¶

Element	Required	Description
`description`	Yes	Clear explanation of what this field contains
`examples`	Yes	Realistic example values
Constraints	When applicable	`min_length`, `max_length`, `ge`, `le`, `regex`
Default	When applicable	Sensible default with explanation

5. Response Status Codes¶

Document all possible response codes with examples:

from fastapi import HTTPException
from fastapi.responses import JSONResponse

@app.post(
    "/api/v1/segment/analyze",
    responses={
        200: {
            "description": "Successful analysis",
            "content": {
                "application/json": {
                    "example": {
                        "request_id": "550e8400-e29b-41d4-a716-446655440000",
                        "segment_svids_matched": 823,
                        "summary": {"significant_deviations": 8}
                    }
                }
            }
        },
        404: {
            "description": "No matching voters found for provided SVIDs",
            "content": {
                "application/json": {
                    "example": {"detail": "No matching persons found for provided SVIDs"}
                }
            }
        },
        422: {
            "description": "Validation error - invalid input format",
            "content": {
                "application/json": {
                    "example": {"detail": [{"loc": ["body", "svids"], "msg": "field required"}]}
                }
            }
        },
        503: {
            "description": "Service unavailable - model or database not loaded",
            "content": {
                "application/json": {
                    "example": {"detail": "Model not loaded"}
                }
            }
        }
    }
)
async def analyze_segment(request: SegmentAnalysisRequest):
    ...

Standard Response Codes¶

Code	Use Case	Required
200	Success	Always
201	Resource created	POST creating resources
400	Bad request (client error)	When applicable
401	Unauthorized	Protected endpoints
403	Forbidden	Role-based access
404	Not found	Resource lookups
422	Validation error	FastAPI default
500	Internal server error	Optional (implicit)
503	Service unavailable	Dependency failures

6. Health Check Endpoint¶

Every API MUST expose a health check:

class HealthResponse(BaseModel):
    """Health check response with component status."""

    status: str = Field(
        description="Overall health status: 'healthy' or 'degraded'"
    )
    model_loaded: bool = Field(
        description="Whether the ML model is loaded and ready"
    )
    database_connected: bool = Field(
        description="Whether the database connection is active"
    )
    version: str = Field(
        description="API version number"
    )

@app.get(
    "/api/v1/health",
    tags=["health"],
    response_model=HealthResponse,
    responses={
        200: {
            "description": "Current health status of all API components",
            "content": {
                "application/json": {
                    "example": {
                        "status": "healthy",
                        "model_loaded": True,
                        "database_connected": True,
                        "version": "1.0.0"
                    }
                }
            }
        }
    }
)
async def health_check():
    """
    ## Health Check

    Returns the current health status of the API, including:

    - **Model status** - Is the baseline model loaded?
    - **Database status** - Is the database connection active?
    - **Version** - Current API version

    ### Health States

    - **healthy** - All systems operational
    - **degraded** - Some components unavailable (analysis may fail)
    """

7. Documentation Quality Checklist¶

API Level¶

Title is descriptive and matches service name
Description includes Overview, Capabilities, Workflow sections
Contact info with name, URL, email
Version follows semantic versioning
All tags have descriptions with use cases

Endpoint Level¶

Summary is < 60 characters, action-oriented
Description has What/Use Cases/Interpretation sections
All parameters documented with descriptions
Request body has example
All response codes documented with examples
Tags assigned for grouping

Schema Level¶

Model has docstring explaining purpose
Every field has description
Fields have realistic examples
Constraints documented (min_length, etc.)
Response model has complete example

8. Writing Style Guide¶

Do's¶

Use active voice: "Analyzes voter segments" not "Voter segments are analyzed"
Start descriptions with action verbs: Analyze, Get, Create, Update, Delete
Include realistic examples: Use plausible data, not "foo", "bar"
Document edge cases: What happens with empty input?
Explain why, not just what: "Use this to identify targeting opportunities"

Don'ts¶

Don't use jargon without explanation
Don't assume prior knowledge of internal systems
Don't leave fields without descriptions
Don't use placeholder examples ("string", 0, null)
Don't skip error response documentation

Markdown in Descriptions¶

## Heading (main sections)
### Subheading (subsections)
**Bold** for emphasis
`code` for field names, values
- Bullet lists for options
1. Numbered lists for steps
```code blocks``` for examples
| Tables | for | structured data |

9. Example: Complete Endpoint¶

from fastapi import FastAPI, HTTPException, Header
from pydantic import BaseModel, Field
from typing import List, Optional
from datetime import datetime

class DonorPropensityRequest(BaseModel):
    """
    Request for donor propensity scoring.

    Score voters for likelihood of making a donation based on behavioral signals.
    """

    svids: List[str] = Field(
        ...,
        min_length=1,
        max_length=10000,
        description="State Voter IDs to score for donation propensity.",
        json_schema_extra={"examples": [["FL-123456789", "FL-987654321"]]}
    )

    include_factors: bool = Field(
        default=True,
        description="Include breakdown of scoring factors in response."
    )

    class Config:
        json_schema_extra = {
            "example": {
                "svids": ["FL-123456789", "FL-987654321"],
                "include_factors": True
            }
        }


class DonorPropensityResponse(BaseModel):
    """Donor propensity scores with tier classification."""

    request_id: str = Field(description="Unique request identifier.")
    scored_at: datetime = Field(description="When scoring was performed.")
    svids_provided: int = Field(description="Input SVID count.")
    svids_scored: int = Field(description="Successfully scored count.")

    high_propensity_count: int = Field(
        description="Voters with score > 0.7 (strong donation candidates)."
    )
    medium_propensity_count: int = Field(
        description="Voters with score 0.4-0.7 (cultivation candidates)."
    )
    low_propensity_count: int = Field(
        description="Voters with score < 0.4 (focus on engagement first)."
    )

    class Config:
        json_schema_extra = {
            "example": {
                "request_id": "abc-123",
                "scored_at": "2025-12-26T10:30:00Z",
                "svids_provided": 1000,
                "svids_scored": 950,
                "high_propensity_count": 52,
                "medium_propensity_count": 180,
                "low_propensity_count": 718
            }
        }


@app.post(
    "/api/v1/behavioral/propensity",
    tags=["behavioral"],
    response_model=DonorPropensityResponse,
    summary="Score donor propensity",
    responses={
        200: {
            "description": "Donor propensity scores with tier classification",
            "content": {
                "application/json": {
                    "schema": {"$ref": "#/components/schemas/DonorPropensityResponse"}
                }
            }
        },
        422: {
            "description": "Validation Error",
            "content": {
                "application/json": {
                    "example": {"detail": [{"loc": ["body", "svids"], "msg": "field required"}]}
                }
            }
        }
    }
)
async def score_donor_propensity(request: DonorPropensityRequest):
    """
    ## Donor Propensity Scoring

    Score voters for likelihood of making a donation based on behavioral signals.
    Uses a rule-based model that can be upgraded to ML in the future.

    ### Propensity Score Components

    | Factor | Weight | Rationale |
    |--------|--------|-----------|
    | Previous donor | +0.60 | Historical donors most likely to donate again |
    | Click engagement | +0.15 | Clickers show investment in content |
    | Campaign diversity | +0.10 | Engaged across campaigns = committed |
    | Base | +0.10 | Baseline probability |

    ### Score Tiers

    - **High (>0.7)** - Strong candidates for donation asks
    - **Medium (0.4-0.7)** - Cultivation candidates
    - **Low (<0.4)** - Focus on engagement first

    ### Use Cases

    - **Fundraising**: "Who should we ask for donations?"
    - **Prioritization**: "Rank this list by donation likelihood"
    - **Segmentation**: "Split into high/medium/low propensity groups"

    ### Example Interpretation

    ```
    "Of 1000 voters, 52 are high-propensity (>0.7), 180 are medium, 768 are low.
    Top candidate: SVID-123 (score 0.85, previous donor, high engagement)"
    ```
    """
    ...

References¶

models.nominate.ai/docs - Swagger UI reference
models.nominate.ai/redoc - ReDoc reference
models.nominate.ai/openapi.json - OpenAPI spec
FastAPI Documentation
OpenAPI 3.1 Specification