Ferret - Autonomous Web Interaction System¶
Source: extern/ferret/README.md
Last updated: 2024-10
An intelligent, self-learning web automation system that transforms natural language goals into successful web interactions through adaptive learning, visual understanding, and pattern recognition.
Overview¶
The SWRM Ferreting Tool is an advanced autonomous agent that can:
- Understand Goals: Convert natural language instructions into executable actions
- Learn from Experience: Improve performance through reinforcement learning and pattern mining
- See and Understand: Use computer vision to understand web pages visually
- Heal and Adapt: Recover from failures and adapt strategies in real-time
- Remember and Apply: Store successful patterns and reuse them intelligently
Architecture¶
┌─────────────────────────────────────────────────────────┐
│                  User Interface (API)                   │
├─────────────────────────────────────────────────────────┤
│                   Orchestration Layer                   │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐   │
│  │ Goal         │  │ Script       │  │ Learning     │   │
│  │ Achievement  │  │ Execution    │  │ Control      │   │
│  └──────────────┘  └──────────────┘  └──────────────┘   │
├─────────────────────────────────────────────────────────┤
│                   Intelligence Layer                    │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐   │
│  │ Perception   │  │ Strategy     │  │ RL           │   │
│  │ Engine       │  │ Generation   │  │ Learning     │   │
│  └──────────────┘  └──────────────┘  └──────────────┘   │
├─────────────────────────────────────────────────────────┤
│                     Execution Layer                     │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐   │
│  │ Script       │  │ Selector     │  │ Wait         │   │
│  │ Compiler     │  │ Generation   │  │ Strategy     │   │
│  └──────────────┘  └──────────────┘  └──────────────┘   │
├─────────────────────────────────────────────────────────┤
│                   SWRM Infrastructure                   │
│         (Playwright Cluster - Existing System)          │
└─────────────────────────────────────────────────────────┘
Components¶
Perception System (perception_system.py)¶
- Visual-DOM correlation
- Bounding box extraction
- Element type classification
- Interaction affordance detection
Script Engine (script_engine.py)¶
- AST-based action representation
- Natural language compilation
- Optimized execution
- Fallback strategies
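As a rough illustration of what an AST-based action representation could look like, the sketch below models a script as a small tree of nodes that serializes to the JSON shape shown under Script Execution. The class names `Action` and `Sequence` are hypothetical and are not taken from script_engine.py.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Action:
    """Leaf node: a single browser action such as navigate, fill, or click."""
    type: str
    params: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> Dict[str, Any]:
        # Flatten the parameters next to the action type.
        return {"type": self.type, **self.params}


@dataclass
class Sequence:
    """Interior node: child actions that run in order."""
    actions: List[Action] = field(default_factory=list)

    def to_dict(self) -> Dict[str, Any]:
        return {"type": "sequence", "actions": [a.to_dict() for a in self.actions]}


script = Sequence([
    Action("navigate", {"url": "https://example.com"}),
    Action("fill", {"selector": "#search", "value": "test"}),
    Action("click", {"selector": "button[type='submit']"}),
])
print(script.to_dict())
```

Keeping the script as a tree rather than a flat string makes it straightforward to rewrite individual nodes, for example swapping a selector, before serializing.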
Learning System (learning_system.py)¶
- Experience memory with semantic search
- Strategy generation (Memory/LLM/Hybrid)
- Reinforcement learning
- Adaptive goal achievement
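To make the idea of an experience memory with semantic search concrete, here is a minimal sketch. It assumes a toy word-count embedding and cosine similarity as a stand-in for whatever embedding model the real system uses. `ExperienceMemory`, `embed`, and `cosine` are hypothetical names, not the learning_system.py API.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ExperienceMemory:
    def __init__(self) -> None:
        self._items: List[Tuple[Counter, Dict]] = []

    def add(self, goal: str, script: Dict) -> None:
        """Store a successful script under the goal it achieved."""
        self._items.append((embed(goal), {"goal": goal, "script": script}))

    def search(self, goal: str, k: int = 1) -> List[Dict]:
        """Return the k stored experiences most similar to the new goal."""
        query = embed(goal)
        ranked = sorted(self._items, key=lambda item: cosine(query, item[0]), reverse=True)
        return [record for _, record in ranked[:k]]


memory = ExperienceMemory()
memory.add("search for nutrias on entireweb.com", {"type": "sequence", "actions": []})
memory.add("log in to the admin dashboard", {"type": "sequence", "actions": []})
best = memory.search("search for otters on entireweb.com")[0]
print(best["goal"])
```

A linear scan like this is only suitable for small memories; a production system would more likely use a vector index.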
Advanced Features (advanced_features.py)¶
- Multimodal perception (Vision + Language)
- Pattern mining from successes
- Smart selector generation
- Self-healing scripts
- Adaptive wait strategies
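Self-healing and adaptive waiting can be sketched together as a lookup that tries fallback selectors and waits progressively longer between rounds. This is an illustrative assumption about how such a loop might work, not the logic in advanced_features.py. The `find` callable stands in for a real browser lookup, and `find_with_healing` is a hypothetical name.

```python
import time
from typing import Callable, Optional, Sequence


def find_with_healing(
    find: Callable[[str], Optional[object]],
    selectors: Sequence[str],
    initial_wait: float = 0.05,
    backoff: float = 2.0,
    max_tries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
):
    """Return (selector, element) for the first selector that resolves, else None."""
    wait = initial_wait
    for _ in range(max_tries):
        for selector in selectors:
            element = find(selector)
            if element is not None:
                return selector, element
        sleep(wait)       # nothing matched yet: the page may still be loading
        wait *= backoff   # wait longer before the next round
    return None


# Simulated page: the primary selector is gone and the fallback
# only becomes available on the second round of attempts.
calls = {"n": 0}


def fake_find(selector: str):
    calls["n"] += 1
    if selector == "[name='q']" and calls["n"] > 2:
        return "<input>"
    return None


waits = []
result = find_with_healing(fake_find, ["#search", "[name='q']"], sleep=waits.append)
print(result, waits)
```

Injecting `sleep` keeps the example fast and testable; with the default it would call `time.sleep`.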
API Endpoints¶
DOM Retrieval¶
Script Execution¶
POST /browser/exec
{
  "url": "https://example.com",
  "script": {
    "type": "sequence",
    "actions": [
      {"type": "navigate", "url": "https://example.com"},
      {"type": "fill", "selector": "#search", "value": "test"},
      {"type": "click", "selector": "button[type='submit']"}
    ]
  }
}
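A client could assemble the request above with the Python standard library, as in the sketch below. The base URL and the JSON content type are assumptions, since the README does not state where the service listens or which headers it expects. `build_exec_request` is a hypothetical helper.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: replace with your deployment's address


def build_exec_request(url: str, actions: list) -> urllib.request.Request:
    """Build (but do not send) a POST /browser/exec request."""
    payload = {"url": url, "script": {"type": "sequence", "actions": actions}}
    return urllib.request.Request(
        f"{BASE_URL}/browser/exec",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


request = build_exec_request(
    "https://example.com",
    [
        {"type": "navigate", "url": "https://example.com"},
        {"type": "fill", "selector": "#search", "value": "test"},
        {"type": "click", "selector": "button[type='submit']"},
    ],
)

# To actually send it (requires a running service):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The response format is not documented in this README, so the commented-out call simply prints whatever JSON comes back.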
Goal Achievement¶
POST /ferreting/achieve
{
  "goal": "Search for 'nutrias' on entireweb.com",
  "starting_url": "https://entireweb.com",
  "max_attempts": 10,
  "learning_mode": "adaptive"
}
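The request body above could be modeled on the client side as a small typed structure, sketched below. `GoalRequest` is a hypothetical name, the defaults simply mirror the example values rather than any documented defaults, and the validation rules are client-side assumptions.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class GoalRequest:
    goal: str
    starting_url: str
    max_attempts: int = 10           # mirrors the example, not a documented default
    learning_mode: str = "adaptive"  # mirrors the example, not a documented default

    def __post_init__(self) -> None:
        # Assumed sanity checks; the server may apply different rules.
        if not self.goal.strip():
            raise ValueError("goal must not be empty")
        if self.max_attempts < 1:
            raise ValueError("max_attempts must be at least 1")

    def to_json(self) -> str:
        """Serialize to the JSON body for POST /ferreting/achieve."""
        return json.dumps(asdict(self))


body = GoalRequest(
    goal="Search for 'nutrias' on entireweb.com",
    starting_url="https://entireweb.com",
).to_json()
print(body)
```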
Learning Modes¶
- Memory Mode: Uses past successful experiences
- LLM Mode: Generates novel strategies using language models
- Hybrid Mode: Combines memory and LLM insights
- Adaptive Mode (Recommended): Uses all of the above strategies together with reinforcement learning (RL)
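One simple way to picture how a learner might pick among strategies is an epsilon-greedy bandit over observed success rates, sketched below. This is a swapped-in textbook technique for illustration only; the README does not say which RL algorithm Ferret uses. `StrategySelector` and the strategy name strings are hypothetical.

```python
import random
from typing import Dict, List


class StrategySelector:
    """Epsilon-greedy bandit over strategy names (illustrative, not Ferret's actual RL)."""

    def __init__(self, strategies: List[str], epsilon: float = 0.1, seed: int = 0) -> None:
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.attempts: Dict[str, int] = {s: 0 for s in strategies}
        self.successes: Dict[str, int] = {s: 0 for s in strategies}

    def success_rate(self, strategy: str) -> float:
        n = self.attempts[strategy]
        return self.successes[strategy] / n if n else 0.0

    def choose(self) -> str:
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.attempts))   # explore a random strategy
        return max(self.attempts, key=self.success_rate)  # exploit the best so far

    def record(self, strategy: str, succeeded: bool) -> None:
        """Update counts after an attempt finishes."""
        self.attempts[strategy] += 1
        self.successes[strategy] += int(succeeded)


selector = StrategySelector(["memory", "llm", "hybrid"], epsilon=0.0)
selector.record("memory", False)
selector.record("hybrid", True)
choice = selector.choose()
print(choice)
```

With `epsilon=0.0` the selector never explores, which makes the example deterministic; a nonzero value trades some short-term success for discovering better strategies.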
Performance¶
- Perception Speed: ~0.2s for 1000 elements
- Compilation Speed: ~0.001s per instruction
- Memory Search: ~0.004s per query (10k experiences)
- Success Rate: 85%+ after 100 training iterations
- Recovery Rate: 70% of failures auto-recovered