Seedbed - AI-Driven Knowledge Platform

Source: extern/seedbed/README.md Last updated: 2024-10

An AI-driven knowledge platform where ideas grow into understanding. Seedbed crawls, analyzes, and synthesizes web content to generate comprehensive reports on any topic.

Features

  • Intelligent Web Crawling: Multi-iteration crawling with smart content extraction
  • AI-Powered Analysis: Evaluates content quality and relevance
  • Knowledge Synthesis: Integrates findings into comprehensive reports
  • Real-time Progress Tracking: Web interface to monitor generation progress
  • Queue Management: Handles multiple research requests efficiently

Architecture

Seedbed consists of three main components:

  1. Web Frontend (Next.js): Interactive UI for submitting queries and viewing results
  2. Python Pipeline: Orchestrator and processing modules for content analysis
  3. Data Storage: Session-based storage for crawl results and reports
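
As a rough illustration of how these pieces fit together, the sketch below chains the documented npm scripts from Python. It is a minimal sketch only: the stage ordering, argument handling, and session handling are assumptions for illustration, not the actual bin/unified-orchestrator implementation.

# Illustrative only: the real orchestrator lives in bin/unified-orchestrator.
# This sketch simply chains the documented npm scripts with subprocess.
import subprocess

def run_stage(script, *args):
    # Each stage is assumed to be invocable as "npm run <script> <args>",
    # matching the commands listed under "Command Line" below.
    subprocess.run(["npm", "run", script, *args], check=True)

def research(query, session_id, max_iterations=2):
    # "Multi-iteration crawling" is assumed here to mean repeating the
    # crawl/evaluate steps before a final synthesis pass.
    for _ in range(max_iterations):
        run_stage("crawler", query)         # gather and extract web content
        run_stage("evaluator", session_id)  # score quality and relevance
    run_stage("integrator", session_id)     # synthesize findings into a report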

Quick Start

# Clone the repository
git clone https://github.com/yourusername/seedbed.git
cd seedbed

# Install all dependencies (Node.js and Python)
npm run setup

# Start the development server
npm run dev

The application will be available at http://localhost:3000.

Usage

Web Interface

  1. Navigate to http://localhost:3000
  2. Enter your research query
  3. Click "Generate" to start the research process
  4. Monitor progress in real-time
  5. View the generated report when complete

Command Line

# Run the orchestrator directly
npm run orchestrator "your research query"

# Or use individual components
npm run crawler "search query"
npm run evaluator session_id
npm run integrator session_id
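
The evaluator and integrator take a session_id, which presumably identifies a crawl session stored under data/sessions/. If you need to look one up, a small helper like the sketch below can help; the directory path comes from the project layout, everything else is an assumption for convenience.

# Convenience sketch: find the most recently modified session directory
# under data/sessions/ to use as the session_id for the evaluator/integrator.
from pathlib import Path

sessions = sorted(Path("data/sessions").glob("*"), key=lambda p: p.stat().st_mtime)
if sessions:
    print("Most recent session id:", sessions[-1].name)
else:
    print("No sessions yet - run the crawler first")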

Project Structure

seedbed/
├── pages/              # Next.js pages
├── components/         # React components
├── integrations/       # Backend service integrations
├── bin/                # Python pipeline scripts
│   ├── unified-orchestrator
│   ├── unified-crawler
│   ├── unified-evaluator
│   ├── unified-extender
│   └── unified-integrator
├── lib/                # Python libraries
├── data/sessions/      # Session data storage
├── scripts/            # Utility bash scripts
└── styles/             # CSS/Tailwind styles

Configuration

Environment variables can be set in a .env file:

# Maximum iterations for research
MAX_ITERATIONS=2

# Quality threshold for content
QUALITY_THRESHOLD=0.4

# Maximum concurrent crawls
MAX_CONCURRENT_CRAWLS=3
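
On the Python side, these values would typically be read from the environment once the .env file has been loaded. A minimal sketch, assuming plain os.environ access with the defaults shown above (the real pipeline may load its configuration differently):

# Read the documented settings from the environment, falling back to the
# defaults shown above. This is only a sketch of one way to consume them.
import os

MAX_ITERATIONS = int(os.environ.get("MAX_ITERATIONS", "2"))
QUALITY_THRESHOLD = float(os.environ.get("QUALITY_THRESHOLD", "0.4"))
MAX_CONCURRENT_CRAWLS = int(os.environ.get("MAX_CONCURRENT_CRAWLS", "3"))

print(f"iterations={MAX_ITERATIONS}, threshold={QUALITY_THRESHOLD}, "
      f"concurrent crawls={MAX_CONCURRENT_CRAWLS}")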