
FCC Data Integration - Post-Implementation Report

Date: December 2024
Status: Complete
Author: Data Engineering Team


Executive Summary

FCC station data has been integrated into the CB Radio platform, enriching 1,563 of 1,908 radio stations (82%) with official geographic coordinates, transmit power, station classification, and coverage contour polygons. The implementation includes a sync module for ongoing updates and a read-only Geo API with 8 endpoints for spatial queries.


Implementation Overview

What Was Built

Component          Description
FCC Sync Module    Python package (fcc/) to fetch data from FCC Contours API
Database Schema    12 new fcc_* columns on the station table
Coverage Contours  1,563 GeoJSON polygon files (37 MB total)
Geo API            8 read-only endpoints for spatial queries
Documentation      Data dictionary for geo team integration

Data Source

FCC Contours API: https://geo.fcc.gov/api/contours/entity.json

This public API provides:
- Station facility IDs (unique FCC identifiers)
- Antenna coordinates (WGS84)
- Transmit power and ERP
- Station classification (A, B, C, D)
- Hours of operation
- 360-point coverage contour polygons

Results

Metric                       Value
Total stations in database   1,908
Stations synced with FCC     1,563 (81.9%)
Coverage contours generated  1,563
Contour data size            37 MB
Sync duration                ~30 minutes
API endpoints created        8

Technical Learnings

1. DuckDB Foreign Key Limitations

Issue: DuckDB throws internal errors when updating rows that are referenced by foreign keys, even when not modifying the primary key.

Solution: Removed foreign key constraints from tables that frequently update:
- document.station_id
- rate.station_id
- rate_card.station_id

Lesson: DuckDB is excellent for analytics but has limitations for transactional workloads with complex FK relationships. Consider PostgreSQL for production if FK integrity is critical.

2. FCC API Behavior

Findings:
- API returns 302 redirects; must follow redirects
- Returns 400 for stations without ground conductivity data (can't calculate contours)
- Rate limiting is informal but ~1 req/sec is respectful
- Contour data uses x/y coordinates, not lat/lon
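The findings above could be combined in a client along the following lines. This is a minimal sketch, not the contents of fcc/client.py: the `Throttle` class, `build_url`, `fetch_contour`, and the `callsign` query parameter name are assumptions made for illustration, and only `serviceType` is named in this report.

```python
import time
import urllib.error
import urllib.parse
import urllib.request

FCC_CONTOURS_URL = "https://geo.fcc.gov/api/contours/entity.json"


class Throttle:
    """Enforce a minimum interval between calls (about 1 request/second)."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def build_url(callsign, service_type):
    # ASSUMPTION: the "callsign" parameter name is illustrative; check the FCC docs.
    query = urllib.parse.urlencode({"callsign": callsign, "serviceType": service_type})
    return f"{FCC_CONTOURS_URL}?{query}"


def fetch_contour(callsign, service_type, throttle, opener=urllib.request.urlopen):
    """Return the raw response body, or None when the API answers 400."""
    throttle.wait()
    try:
        # urllib.request.urlopen follows 302 redirects by default.
        with opener(build_url(callsign, service_type), timeout=30) as resp:
            return resp.read()
    except urllib.error.HTTPError as exc:
        if exc.code == 400:  # no contour can be calculated for this station
            return None
        raise
```

Passing the clock, sleep function, and opener as parameters keeps the pacing logic testable without touching the network.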

Code adaptation:

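# Handle x/y format from FCC. Compare against None explicitly:
# `point.get("x") or ...` would discard a valid coordinate of 0.0.
lon = point["x"] if point.get("x") is not None else point.get("lon")
lat = point["y"] if point.get("y") is not None else point.get("lat")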
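To show where this adaptation fits, here is a minimal sketch of turning a list of contour points into a GeoJSON Feature. The helper names `point_lon_lat` and `contour_to_feature`, and the choice of `callsign` as the only property, are hypothetical; the actual files under fcc/data/contours/ may carry more fields.

```python
def point_lon_lat(point):
    """Read (lon, lat) from a contour point, preferring x/y over lon/lat."""
    lon = point["x"] if point.get("x") is not None else point.get("lon")
    lat = point["y"] if point.get("y") is not None else point.get("lat")
    if lon is None or lat is None:
        raise ValueError(f"contour point has no coordinates: {point!r}")
    return float(lon), float(lat)


def contour_to_feature(callsign, points):
    """Build a GeoJSON Feature holding one closed Polygon ring."""
    ring = [list(point_lon_lat(p)) for p in points]
    if ring and ring[0] != ring[-1]:
        ring.append(list(ring[0]))  # GeoJSON rings must end where they start
    return {
        "type": "Feature",
        "properties": {"callsign": callsign},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }
```

GeoJSON positions are ordered longitude first, which is why x maps to the first element of each pair.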

3. Callsign Parsing

Station callsigns in our database include suffixes (WABC-AM, WXYZ-FM) but the FCC API expects bare callsigns with a separate serviceType parameter.

Solution: Parse callsign to extract service type, with fallback logic:

def parse_callsign(callsign: str) -> tuple[str, str]:
    if callsign.endswith('-AM'):
        return callsign[:-3], 'am'
    elif callsign.endswith('-FM'):
        return callsign[:-3], 'fm'
    return callsign, 'fm'  # Default FM, try AM as fallback
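
The fallback mentioned in the comment could look roughly like this. It is a sketch under assumptions: `lookup_with_fallback` and the `fetch` callable are hypothetical, and this variant of `parse_callsign` also normalizes case and whitespace, which the version above does not.

```python
def parse_callsign(callsign):
    """Split 'WABC-AM' into ('WABC', 'am'); unsuffixed callsigns default to FM."""
    upper = callsign.strip().upper()
    for suffix, service in (("-AM", "am"), ("-FM", "fm")):
        if upper.endswith(suffix):
            return upper[: -len(suffix)], service
    return upper, "fm"


def lookup_with_fallback(callsign, fetch):
    """Try the parsed service type; for unsuffixed callsigns, try FM then AM.

    `fetch(bare_callsign, service_type)` is a hypothetical callable that
    returns station data, or None when the FCC has no match.
    """
    bare, service = parse_callsign(callsign)
    has_suffix = bare != callsign.strip().upper()
    candidates = [service] if has_suffix else ["fm", "am"]
    for service_type in candidates:
        result = fetch(bare, service_type)
        if result is not None:
            return service_type, result
    return None
```

A callsign with an explicit suffix is only tried against its own service type, so a WXYZ-FM record is never silently matched to an AM station.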

4. Spatial Queries in DuckDB

DuckDB supports trigonometric functions needed for Haversine distance calculations, but requires subqueries instead of HAVING clauses for computed columns:

-- Works in DuckDB
SELECT * FROM (
    SELECT callsign, 6371 * acos(...) AS distance_km
    FROM station
) WHERE distance_km <= ?

-- Does NOT work
SELECT callsign, 6371 * acos(...) AS distance_km
FROM station
HAVING distance_km <= ?
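
For reference, the same distance filter can be sketched in plain Python. The SQL above elides the trigonometric expression, so this assumes the common spherical law of cosines form that matches the `6371 * acos(...)` shape; the function names and the row layout are illustrative, with column names taken from the fcc_* coordinate columns.

```python
import math

EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres (spherical law of cosines)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_angle = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(dlon))
    # Rounding can push the value slightly outside [-1, 1], where acos fails.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_KM * math.acos(cos_angle)


def nearby(stations, lat, lon, radius_km):
    """Return (callsign, distance_km) pairs within radius_km, nearest first."""
    hits = []
    for row in stations:
        d = great_circle_km(lat, lon, row["fcc_antenna_lat"], row["fcc_antenna_lon"])
        if d <= radius_km:
            hits.append((row["callsign"], d))
    return sorted(hits, key=lambda pair: pair[1])
```

The clamp matters in SQL as well: for two identical points the cosine can round to just above 1, and `acos` then returns NaN or raises, depending on the engine.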

5. Contour File Storage

Coverage contour files (37 MB) are stored on external NAS (/mnt/nom01/) with a symlink in the repo. This keeps the git repository small while allowing local development access.

fcc -> /mnt/nom01/campaignbrain/cbradio/fcc/

API Endpoints Delivered

Endpoint                               Use Case
GET /geo/stations                      List stations with FCC data, filterable
GET /geo/stations/{callsign}           Single station geo details
GET /geo/stations/{callsign}/contour   Coverage polygon GeoJSON
GET /geo/nearby?lat=&lon=&radius_km=   Proximity search
GET /geo/within?min_lat=&max_lat=&...  Bounding box search
GET /geo/points                        All stations as GeoJSON points
GET /geo/coverage                      All contours as FeatureCollection
GET /geo/stats                         FCC data statistics
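
As an illustration of the GeoJSON-oriented endpoints, the body of a response like GET /geo/points could be assembled as follows. This is a sketch only: the real response models live in src/api/models/geo.py and are not shown in this report, so the function name and the property set here are assumptions.

```python
def stations_to_feature_collection(rows):
    """Convert station rows into a GeoJSON FeatureCollection of Points."""
    features = []
    for row in rows:
        lat = row.get("fcc_antenna_lat")
        lon = row.get("fcc_antenna_lon")
        if lat is None or lon is None:
            continue  # station has no FCC coordinates yet
        features.append({
            "type": "Feature",
            "properties": {"callsign": row["callsign"]},
            # GeoJSON positions are [longitude, latitude]
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
        })
    return {"type": "FeatureCollection", "features": features}
```

Stations that were not matched with FCC data are skipped rather than emitted with null geometry, which keeps the output directly loadable by map clients.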

Next Steps

Immediate (Sprint)

  1. Tileserver Integration
     - Load GeoJSON contours into MapLibre/Mapbox tileserver
     - Create vector tiles for efficient rendering at scale
     - Configure zoom-level simplification for contours

  2. Frontend Map View
     - Add station map to dashboard
     - Display coverage areas on hover/click
     - Filter by state, power, class

Short-Term (Month)

  1. Ownership Data Integration
     - Download LMS database from FCC
     - Parse ownership/licensee information
     - Link to station records via facility_id

  2. Automated Sync
     - Schedule monthly FCC sync via cron
     - Alert on sync failures
     - Track changes over time

  3. Coverage Overlap Analysis
     - Calculate market coverage percentages
     - Identify stations with overlapping coverage
     - Build "reach" metrics for campaigns

Future Improvements

Performance Optimizations

Optimization                  Benefit                              Effort
Spatial index on coordinates  10x faster proximity queries         Low
Pre-computed vector tiles     Instant map rendering                Medium
Contour simplification        Smaller payloads (361 → ~50 points)  Low
Redis cache for geo queries   Sub-ms response times                Medium
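
The contour simplification row could be prototyped with uniform sampling, as in the sketch below. This is one simple option and not necessarily the approach the team will take: a shape-aware algorithm such as Douglas-Peucker would preserve detail better, and the function name and default target are assumptions.

```python
def decimate_ring(ring, target_points=50):
    """Thin a polygon ring to about target_points vertices by uniform sampling.

    Accepts a closed or open ring of [lon, lat] pairs and always returns a
    closed ring (first position repeated at the end).
    """
    open_ring = ring[:-1] if len(ring) > 1 and ring[0] == ring[-1] else list(ring)
    if len(open_ring) < 3:
        raise ValueError("a polygon ring needs at least 3 distinct positions")
    if len(open_ring) <= target_points:
        kept = list(open_ring)
    else:
        step = len(open_ring) / target_points
        kept = [open_ring[int(i * step)] for i in range(target_points)]
    return kept + [kept[0]]
```

A 361-position closed contour becomes 51 positions (50 sampled vertices plus the closing point). Because FCC contours are sampled at even bearings around the antenna, uniform sampling keeps the remaining vertices evenly spread.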

Spatial Index Example

-- Core DuckDB has no spatial index type (R-tree indexes require the optional
-- spatial extension), but plain indexes on the coordinates can approximate one:
CREATE INDEX idx_station_lat ON station(fcc_antenna_lat);
CREATE INDEX idx_station_lon ON station(fcc_antenna_lon);

-- Or migrate to PostGIS for true spatial indexing:
CREATE INDEX idx_station_geom ON station USING GIST(
    ST_SetSRID(ST_MakePoint(fcc_antenna_lon, fcc_antenna_lat), 4326)
);
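
Separate latitude and longitude indexes only help when the query carries range predicates on those columns. A minimal sketch of deriving such a bounding box from a search radius follows; the function name and the SQL string are illustrative, and the code ignores wrap-around at the ±180° meridian.

```python
import math

EARTH_RADIUS_KM = 6371.0

# Illustrative prefilter; the exact distance check still runs on the survivors.
PREFILTER_SQL = """
SELECT callsign, fcc_antenna_lat, fcc_antenna_lon
FROM station
WHERE fcc_antenna_lat BETWEEN ? AND ?
  AND fcc_antenna_lon BETWEEN ? AND ?
"""


def bounding_box(lat, lon, radius_km):
    """Return (min_lat, max_lat, min_lon, max_lon) enclosing the search circle."""
    ang = radius_km / EARTH_RADIUS_KM  # angular radius in radians
    lat_r = math.radians(lat)
    min_lat, max_lat = lat_r - ang, lat_r + ang
    if min_lat <= -math.pi / 2 or max_lat >= math.pi / 2:
        # The circle contains a pole, so every longitude qualifies.
        return (max(-90.0, math.degrees(min_lat)),
                min(90.0, math.degrees(max_lat)), -180.0, 180.0)
    # Longitude half-width grows with latitude; asin gives the exact extent.
    dlon = math.degrees(math.asin(math.sin(ang) / math.cos(lat_r)))
    return (math.degrees(min_lat), math.degrees(max_lat), lon - dlon, lon + dlon)
```

The box is a superset of the circle, so it is safe as a first-pass filter: rows outside it can never be within the radius.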

Data Quality Improvements

  1. Missing Station Resolution
     - 345 stations (18%) not found in FCC
     - Investigate: dark stations, incorrect callsigns, translators
     - Consider manual lookup for high-priority stations

  2. Contour Validation
     - Verify polygon closure (first point = last point)
     - Check for self-intersecting polygons
     - Validate coordinate bounds

  3. Historical Tracking
     - Store previous FCC data on resync
     - Track power/frequency changes over time
     - Alert on significant changes
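
The contour validation checks listed above could start from a sketch like this one. It treats longitude and latitude as planar coordinates, which is an approximation, and its crossing test only detects proper crossings (touching or collinear segments are ignored). All function names are hypothetical.

```python
def validate_ring(ring):
    """Return a list of problems found in a ring of [lon, lat] positions."""
    problems = []
    if len(ring) < 4:
        problems.append("fewer than 4 positions")
    if ring and ring[0] != ring[-1]:
        problems.append("ring is not closed")
    for i, (lon, lat) in enumerate(ring):
        if not (-180 <= lon <= 180 and -90 <= lat <= 90):
            problems.append(f"position {i} out of bounds: {lon}, {lat}")
    return problems


def _orient(a, b, c):
    """Signed area test: >0 if c is left of a->b, <0 if right, 0 if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _segments_cross(p1, p2, p3, p4):
    """True when segments p1-p2 and p3-p4 properly cross each other."""
    return (_orient(p3, p4, p1) * _orient(p3, p4, p2) < 0
            and _orient(p1, p2, p3) * _orient(p1, p2, p4) < 0)


def self_intersects(ring):
    """O(n^2) check for crossings between non-adjacent edges of a closed ring."""
    segments = list(zip(ring, ring[1:]))
    n = len(segments)
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last edges share the closing vertex
            if _segments_cross(*segments[i], *segments[j]):
                return True
    return False
```

At 361 positions per contour the quadratic check is about 65,000 segment pairs per polygon, which is acceptable for a batch job over 1,563 files.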

API Enhancements

  1. Polygon Intersection Queries

    GET /geo/intersects?polygon=[[lon,lat],...]
    
    Find stations whose coverage overlaps a given area.

  2. Distance Matrix

    GET /geo/distances?callsigns=WABC-AM,WXYZ-FM,...
    
    Return pairwise distances between stations.

  3. Coverage Statistics

    GET /geo/coverage-stats?state=TX
    
    Return aggregate coverage area, population reached, etc.

  4. Real-time Availability

    GET /geo/stations/{callsign}/status
    
    Integrate with FCC's real-time broadcast status.


Architecture Recommendations

Current State

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   FCC API   │────▶│  fcc/sync   │────▶│   DuckDB    │
└─────────────┘     └─────────────┘     └─────────────┘
                    ┌─────────────┐             │
                    │  GeoJSON    │◀────────────┘
                    │   Files     │
                    └─────────────┘

Recommended State

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   FCC API   │────▶│  fcc/sync   │────▶│  PostGIS    │
└─────────────┘     └─────────────┘     └─────────────┘
┌─────────────┐     ┌─────────────┐             │
│  Tileserver │◀────│  pg_tileserv│◀────────────┘
└─────────────┘     └─────────────┘
       │
       ▼
┌─────────────┐
│  MapLibre   │
│  Frontend   │
└─────────────┘

Why PostGIS?
- Native spatial indexes (R-tree via GiST)
- ST_Within, ST_Intersects, ST_Distance functions
- Direct vector tile generation via pg_tileserv
- Better concurrent write handling


Files Reference

File                         Purpose
fcc/__init__.py              Package exports
fcc/client.py                FCC API client with rate limiting
fcc/models.py                Pydantic models for FCC data
fcc/sync.py                  CLI sync script
fcc/README.md                Module documentation
fcc/data/contours/*.json     GeoJSON coverage polygons
src/api/routes/geo.py        Geo API endpoints
src/api/models/geo.py        Geo response models
docs/FCC-DATA-DICTIONARY.md  Geo team reference

Conclusion

The FCC data integration provides a solid foundation for geographic features in CB Radio. With 82% of stations now enriched with coordinates and coverage data, the platform can support location-based queries, coverage visualization, and market analysis.

Key wins:
- Clean API abstraction over FCC data
- GeoJSON-native output for tileserver compatibility
- Spatial query capabilities (nearby, bounding box)
- Documented and maintainable sync process

The recommended next phase focuses on tileserver integration and frontend visualization, followed by ownership data and coverage analytics.


Report generated: December 2024