FCC Data Integration - Post-Implementation Report¶
Date: December 2024
Status: Complete
Author: Data Engineering Team
Executive Summary¶
Successfully integrated FCC station data into the CB Radio platform, enriching 1,563 radio stations (82%) with official geographic coordinates, transmit power, station classification, and coverage contour polygons. The implementation includes a sync module for ongoing updates and a comprehensive Geo API for spatial queries.
Implementation Overview¶
What Was Built¶
| Component | Description |
|---|---|
| FCC Sync Module | Python package (fcc/) to fetch data from FCC Contours API |
| Database Schema | 12 new fcc_* columns on the station table |
| Coverage Contours | 1,563 GeoJSON polygon files (37MB total) |
| Geo API | 8 read-only endpoints for spatial queries |
| Documentation | Data dictionary for geo team integration |
Data Source¶
FCC Contours API
https://geo.fcc.gov/api/contours/entity.json
This public API provides:
- Station facility IDs (unique FCC identifiers)
- Antenna coordinates (WGS84)
- Transmit power and ERP
- Station classification (A, B, C, D)
- Hours of operation
- 360-point coverage contour polygons
Results¶
| Metric | Value |
|---|---|
| Total stations in database | 1,908 |
| Stations synced with FCC | 1,563 (81.9%) |
| Coverage contours generated | 1,563 |
| Contour data size | 37 MB |
| Sync duration | ~30 minutes |
| API endpoints created | 8 |
Technical Learnings¶
1. DuckDB Foreign Key Limitations¶
Issue: DuckDB throws internal errors when updating rows that are referenced by foreign keys, even when not modifying the primary key.
Solution: Removed the foreign key constraints that reference the frequently updated station table:
- document.station_id
- rate.station_id
- rate_card.station_id
Lesson: DuckDB is excellent for analytics but has limitations for transactional workloads with complex FK relationships. Consider PostgreSQL for production if FK integrity is critical.
2. FCC API Behavior¶
Findings:
- The API responds with 302 redirects, so the HTTP client must follow them
- It returns 400 for stations without ground conductivity data (contours cannot be calculated for them)
- Rate limiting is not formally documented, but ~1 req/sec is a respectful pace
- Contour data uses x/y coordinates, not lat/lon
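The findings above could translate into a client loop roughly like the following. This is a minimal sketch, not the code in fcc/client.py: `fetch_contours` and the `Fetch` signature are hypothetical names, and the HTTP call is injected so the throttling and 400 handling can be shown without assuming a particular HTTP library.

```python
import time
from typing import Callable, Iterable, Optional

# Hypothetical transport signature: (callsign, service_type) -> (status, json).
# A real transport would need to follow the 302 redirects mentioned above.
Fetch = Callable[[str, str], tuple[int, Optional[dict]]]


def fetch_contours(
    stations: Iterable[tuple[str, str]],
    fetch: Fetch,
    min_interval: float = 1.0,  # ~1 req/sec, per the finding above
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> tuple[dict[str, dict], list[str]]:
    """Return (payloads keyed by callsign, callsigns skipped on HTTP 400)."""
    results: dict[str, dict] = {}
    skipped: list[str] = []
    last_request: Optional[float] = None
    for callsign, service_type in stations:
        # Wait until at least min_interval has passed since the last request.
        if last_request is not None:
            wait = min_interval - (clock() - last_request)
            if wait > 0:
                sleep(wait)
        last_request = clock()
        status, payload = fetch(callsign, service_type)
        if status == 400:
            # No ground conductivity data, so no contour is available.
            skipped.append(callsign)
        elif status == 200 and payload is not None:
            results[callsign] = payload
        else:
            raise RuntimeError(f"unexpected status {status} for {callsign}")
    return results, skipped
```

Treating 400 as "skip and record" rather than as a failure keeps one station without contour data from aborting the whole sync run.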
Code adaptation:
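# Handle x/y format from FCC.
# Compare against None rather than using `or`, so a coordinate of 0 is not discarded.
lon = point["x"] if point.get("x") is not None else point.get("lon")
lat = point["y"] if point.get("y") is not None else point.get("lat")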
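To show where this adaptation fits, here is a hedged sketch of turning a list of FCC-style contour points into a GeoJSON Polygon geometry. The function names `_coord` and `contour_to_polygon` are hypothetical, and the input is assumed to be a list of dicts carrying either x/y or lon/lat keys; the real response structure may nest these points differently.

```python
from typing import Optional


def _coord(point: dict, primary: str, fallback: str) -> Optional[float]:
    """Read a coordinate, preferring `primary`; 0 is treated as a valid value."""
    value = point.get(primary)
    return value if value is not None else point.get(fallback)


def contour_to_polygon(points: list[dict]) -> dict:
    """Build a GeoJSON Polygon geometry from FCC-style contour points."""
    ring = []
    for point in points:
        lon = _coord(point, "x", "lon")
        lat = _coord(point, "y", "lat")
        if lon is None or lat is None:
            raise ValueError(f"point has no usable coordinates: {point!r}")
        ring.append([float(lon), float(lat)])  # GeoJSON order is [lon, lat]
    if len(ring) < 3:
        raise ValueError("a polygon ring needs at least 3 points")
    if ring[0] != ring[-1]:
        ring.append(list(ring[0]))  # GeoJSON rings must be closed
    return {"type": "Polygon", "coordinates": [ring]}
```

The explicit `None` check matters because a coordinate of exactly 0 is falsy in Python and would otherwise fall through to the fallback key.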
3. Callsign Parsing¶
Station callsigns in our database include suffixes (WABC-AM, WXYZ-FM) but the FCC API expects bare callsigns with a separate serviceType parameter.
Solution: Parse callsign to extract service type, with fallback logic:
def parse_callsign(callsign: str) -> tuple[str, str]:
    if callsign.endswith('-AM'):
        return callsign[:-3], 'am'
    elif callsign.endswith('-FM'):
        return callsign[:-3], 'fm'
    return callsign, 'fm'  # Default FM, try AM as fallback
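The "try AM as fallback" step is only described in a comment, so the following sketch shows one way it might be made explicit. `lookup_attempts` is a hypothetical helper, not part of the fcc/ package; it returns the (callsign, serviceType) pairs in the order a caller would try them, and it normalises case and whitespace, which is an added assumption.

```python
def parse_callsign(callsign: str) -> tuple[str, str]:
    """Split a suffixed callsign into (bare callsign, service type)."""
    if callsign.endswith("-AM"):
        return callsign[:-3], "am"
    elif callsign.endswith("-FM"):
        return callsign[:-3], "fm"
    return callsign, "fm"  # no suffix: default to FM


def lookup_attempts(callsign: str) -> list[tuple[str, str]]:
    """Return (bare_callsign, service_type) pairs in the order to try them."""
    normalized = callsign.strip().upper()
    bare, service = parse_callsign(normalized)
    if normalized.endswith(("-AM", "-FM")):
        # The suffix states the service explicitly, so there is one attempt.
        return [(bare, service)]
    # No suffix: try FM first, then fall back to AM.
    return [(bare, "fm"), (bare, "am")]
```

A sync loop would iterate over the returned pairs and stop at the first one the FCC API resolves.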
4. Spatial Queries in DuckDB¶
DuckDB supports trigonometric functions needed for Haversine distance calculations, but requires subqueries instead of HAVING clauses for computed columns:
-- Works in DuckDB
SELECT * FROM (
    SELECT callsign, 6371 * acos(...) AS distance_km
    FROM station
) WHERE distance_km <= ?

-- Does NOT work
SELECT callsign, 6371 * acos(...) AS distance_km
FROM station
HAVING distance_km <= ?
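For checking the SQL results outside the database, the distance can be reproduced in Python. The query above elides its acos-based expression, so this sketch does not claim to match it term for term; it uses the standard haversine form with the same 6371 km Earth radius, and `haversine_km` is a name chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # same radius constant as in the SQL above


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))
```

One degree of longitude along the equator should come out at roughly 111.19 km, which makes a convenient sanity check.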
5. Symlink Strategy for Large Data¶
Coverage contour files (37 MB) are stored on an external NAS (/mnt/nom01/) with a symlink in the repo. This keeps the Git repository small while still giving local development access to the files.
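One consequence of the symlink approach is that the link can dangle when the NAS is not mounted. A small guard such as the following could fail fast in that case; this is a sketch, `resolve_contour_dir` is a hypothetical helper, and the link path passed to it is whatever location the repo actually uses.

```python
import os
from pathlib import Path


def resolve_contour_dir(link: Path) -> Path:
    """Return the real contour directory behind `link`, or raise if unusable."""
    # Path.exists() follows symlinks, so a dangling link reports False here.
    if not link.exists():
        raise FileNotFoundError(
            f"{link} is missing or its target is not mounted"
        )
    resolved = link.resolve()
    if not resolved.is_dir():
        raise NotADirectoryError(f"{resolved} is not a directory")
    return resolved
```

Calling this once at startup gives a clear error message instead of a late failure on the first contour read.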
API Endpoints Delivered¶
| Endpoint | Use Case |
|---|---|
| `GET /geo/stations` | List stations with FCC data, filterable |
| `GET /geo/stations/{callsign}` | Single station geo details |
| `GET /geo/stations/{callsign}/contour` | Coverage polygon GeoJSON |
| `GET /geo/nearby?lat=&lon=&radius_km=` | Proximity search |
| `GET /geo/within?min_lat=&max_lat=&...` | Bounding box search |
| `GET /geo/points` | All stations as GeoJSON points |
| `GET /geo/coverage` | All contours as FeatureCollection |
| `GET /geo/stats` | FCC data statistics |
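As an illustration of how a consumer might call the proximity endpoint, the sketch below builds a request URL from the three query parameters listed in the table. The base URL is an assumption (the report does not state where the API is hosted), and `nearby_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def nearby_url(base_url: str, lat: float, lon: float, radius_km: float) -> str:
    """Build a GET /geo/nearby URL from the documented query parameters."""
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("lat/lon outside WGS84 bounds")
    if radius_km <= 0:
        raise ValueError("radius_km must be positive")
    query = urlencode({"lat": lat, "lon": lon, "radius_km": radius_km})
    return f"{base_url.rstrip('/')}/geo/nearby?{query}"


# Example with an assumed local development host:
print(nearby_url("http://localhost:8000", 40.7128, -74.006, 50))
```

Validating the coordinates on the client side avoids a round trip for requests the server would have to reject anyway.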
Next Steps¶
Immediate (Sprint)¶
- Tileserver Integration
  - Load GeoJSON contours into MapLibre/Mapbox tileserver
  - Create vector tiles for efficient rendering at scale
  - Configure zoom-level simplification for contours
- Frontend Map View
  - Add station map to dashboard
  - Display coverage areas on hover/click
  - Filter by state, power, class
Short-Term (Month)¶
- Ownership Data Integration
  - Download LMS database from FCC
  - Parse ownership/licensee information
  - Link to station records via facility_id
- Automated Sync
  - Schedule monthly FCC sync via cron
  - Alert on sync failures
  - Track changes over time
- Coverage Overlap Analysis
  - Calculate market coverage percentages
  - Identify stations with overlapping coverage
  - Build "reach" metrics for campaigns
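The "track changes over time" item could start from a simple field-level diff between the stored row and the freshly synced one. This is only a sketch: `diff_fcc_fields` is a hypothetical helper, `fcc_antenna_lat` and `fcc_antenna_lon` are column names taken from this report, and `fcc_erp_kw` is an invented placeholder for a power column.

```python
from typing import Any, Iterable


def diff_fcc_fields(
    previous: dict[str, Any],
    current: dict[str, Any],
    fields: Iterable[str],
) -> dict[str, tuple[Any, Any]]:
    """Return {field: (old, new)} for every tracked field whose value changed."""
    changes: dict[str, tuple[Any, Any]] = {}
    for field in fields:
        old, new = previous.get(field), current.get(field)
        if old != new:
            changes[field] = (old, new)
    return changes


# "fcc_erp_kw" is a hypothetical column name used only for illustration.
TRACKED = ["fcc_antenna_lat", "fcc_antenna_lon", "fcc_erp_kw"]
before = {"fcc_antenna_lat": 40.75, "fcc_antenna_lon": -73.99, "fcc_erp_kw": 6.0}
after = {"fcc_antenna_lat": 40.75, "fcc_antenna_lon": -73.99, "fcc_erp_kw": 5.4}
print(diff_fcc_fields(before, after, TRACKED))
```

A non-empty result could be written to a history table during each monthly sync and used to trigger alerts.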
Future Improvements¶
Performance Optimizations¶
| Optimization | Benefit | Effort |
|---|---|---|
| Spatial index on coordinates | 10x faster proximity queries | Low |
| Pre-computed vector tiles | Instant map rendering | Medium |
| Contour simplification | Smaller payloads (361 → ~50 points) | Low |
| Redis cache for geo queries | Sub-ms response times | Medium |
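To make the contour simplification row concrete, here is a minimal sketch that thins a closed ring by keeping every n-th point. This is uniform decimation, chosen for brevity; it is not a shape-aware algorithm such as Douglas-Peucker, and `decimate_ring` is a hypothetical name. Because the step is an integer, the output lands near the target rather than exactly on it.

```python
def decimate_ring(ring: list[list[float]], target_points: int = 50) -> list[list[float]]:
    """Thin a polygon ring to roughly `target_points`, keeping it closed."""
    if len(ring) < 4 or target_points < 3:
        return list(ring)
    # Drop the duplicated closing point before sampling, if present.
    closed = ring[0] == ring[-1]
    open_ring = ring[:-1] if closed else list(ring)
    step = max(1, len(open_ring) // target_points)
    kept = open_ring[::step]
    if len(kept) < 3:
        return list(ring)
    kept.append(list(kept[0]))  # re-close the ring
    return kept
```

For a 361-point closed contour and a target of 50, the step works out to 7 and the result has 53 points including the closing one.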
Spatial Index Example¶
-- Core DuckDB has no built-in spatial index (an R-tree index is available
-- through the separate spatial extension), but plain indexes approximate one:
CREATE INDEX idx_station_lat ON station(fcc_antenna_lat);
CREATE INDEX idx_station_lon ON station(fcc_antenna_lon);
-- Or migrate to PostGIS for true spatial indexing:
CREATE INDEX idx_station_geom ON station USING GIST(
    ST_SetSRID(ST_MakePoint(fcc_antenna_lon, fcc_antenna_lat), 4326)
);
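Plain indexes on latitude and longitude only help if the query first narrows the search to a bounding box, leaving the exact distance check for the surviving rows. The sketch below computes such a box for a search radius. `bounding_box` is a hypothetical helper, and it deliberately ignores wrap-around at the antimeridian.

```python
import math

KM_PER_DEGREE_LAT = 111.195  # 6371 km * pi / 180


def bounding_box(lat: float, lon: float, radius_km: float) -> tuple[float, float, float, float]:
    """Return (min_lat, max_lat, min_lon, max_lon) enclosing the search circle."""
    dlat = radius_km / KM_PER_DEGREE_LAT
    cos_lat = math.cos(math.radians(lat))
    if cos_lat < 1e-9:
        dlon = 180.0  # at the poles every longitude is within range
    else:
        # Degrees of longitude shrink with cos(latitude).
        dlon = min(180.0, radius_km / (KM_PER_DEGREE_LAT * cos_lat))
    return (
        max(-90.0, lat - dlat),
        min(90.0, lat + dlat),
        lon - dlon,
        lon + dlon,
    )
```

The four values map onto range predicates on fcc_antenna_lat and fcc_antenna_lon, which are the kind of filter the two indexes above can serve.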
Data Quality Improvements¶
- Missing Station Resolution
  - 345 stations (18%) not found in FCC
  - Investigate: dark stations, incorrect callsigns, translators
  - Consider manual lookup for high-priority stations
- Contour Validation
  - Verify polygon closure (first point = last point)
  - Check for self-intersecting polygons
  - Validate coordinate bounds
- Historical Tracking
  - Store previous FCC data on resync
  - Track power/frequency changes over time
  - Alert on significant changes
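Two of the contour validation checks listed above (closure and coordinate bounds) are cheap enough to sketch directly. The self-intersection check is left out because it needs a proper geometry routine. `validate_ring` is a hypothetical helper that takes one GeoJSON ring as a list of [lon, lat] pairs.

```python
def validate_ring(ring: list[list[float]]) -> list[str]:
    """Return a list of human-readable problems; an empty list means it passed."""
    problems: list[str] = []
    if len(ring) < 4:
        # A closed ring needs 3 distinct points plus the repeated first point.
        problems.append("ring has fewer than 4 positions")
    if ring and ring[0] != ring[-1]:
        problems.append("ring is not closed (first point != last point)")
    for index, (lon, lat) in enumerate(ring):
        if not (-180 <= lon <= 180 and -90 <= lat <= 90):
            problems.append(
                f"position {index} out of WGS84 bounds: ({lon}, {lat})"
            )
    return problems
```

Returning every problem instead of raising on the first one makes it easier to produce a single quality report across all contour files.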
API Enhancements¶
- Polygon Intersection Queries: Find stations whose coverage overlaps a given area.
- Distance Matrix: Return pairwise distances between stations.
- Coverage Statistics: Return aggregate coverage area, population reached, etc.
- Real-time Availability: Integrate with FCC's real-time broadcast status.
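The simplest case of a coverage query is asking whether a single point falls inside a station's contour. The sketch below uses the standard ray-casting test on one ring; it treats coordinates as planar, ignores holes, and is a much narrower operation than full polygon-polygon intersection. `point_in_ring` is a hypothetical name.

```python
def point_in_ring(lon: float, lat: float, ring: list[list[float]]) -> bool:
    """Ray-casting test: is (lon, lat) inside the ring of [lon, lat] pairs?"""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Only edges that straddle the point's latitude can be crossed.
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside  # each crossing to the east flips the state
        j = i
    return inside
```

Running this against each contour answers "which stations cover this location", which a later polygon intersection endpoint could generalise.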
Architecture Recommendations¶
Current State¶
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   FCC API   │────▶│  fcc/sync   │────▶│   DuckDB    │
└─────────────┘     └─────────────┘     └─────────────┘
                            │
┌─────────────┐             │
│   GeoJSON   │◀────────────┘
│    Files    │
└─────────────┘
Recommended Evolution¶
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   FCC API   │────▶│  fcc/sync   │────▶│   PostGIS   │
└─────────────┘     └─────────────┘     └─────────────┘
                                               │
┌─────────────┐     ┌─────────────┐            │
│ Tileserver  │◀────│ pg_tileserv │◀───────────┘
└─────────────┘     └─────────────┘
       │
       ▼
┌─────────────┐
│  MapLibre   │
│  Frontend   │
└─────────────┘
Why PostGIS?
- Native spatial indexes (GiST-based R-tree)
- ST_Within, ST_Intersects, ST_Distance functions
- Direct vector tile generation via pg_tileserv
- Better concurrent write handling
Files Reference¶
| File | Purpose |
|---|---|
| `fcc/__init__.py` | Package exports |
| `fcc/client.py` | FCC API client with rate limiting |
| `fcc/models.py` | Pydantic models for FCC data |
| `fcc/sync.py` | CLI sync script |
| `fcc/README.md` | Module documentation |
| `fcc/data/contours/*.json` | GeoJSON coverage polygons |
| `src/api/routes/geo.py` | Geo API endpoints |
| `src/api/models/geo.py` | Geo response models |
| `docs/FCC-DATA-DICTIONARY.md` | Geo team reference |
Conclusion¶
The FCC data integration provides a solid foundation for geographic features in CB Radio. With 82% of stations now enriched with coordinates and coverage data, the platform can support location-based queries, coverage visualization, and market analysis.
Key wins:
- Clean API abstraction over FCC data
- GeoJSON-native output for tileserver compatibility
- Spatial query capabilities (nearby, bounding box)
- Documented and maintainable sync process
The recommended next phase focuses on tileserver integration and frontend visualization, followed by ownership data and coverage analytics.
Report generated: December 2024