billxbf/AutoKG

AutoKG - Autonomous Kaggle Competitor 🥇

A long-running autonomous agent scaffold that competes in Kaggle competitions. AutoKG orchestrates AI agents through iterative research-experiment-review rounds, accumulating knowledge and improving submissions until the competition ends or a round budget is exhausted.

How It Works 🔥

Round 0: Competition Setup & Baseline
  Planner reads competition description, downloads data, does EDA, establishes baseline

Round 1-N: Iterative Research & Experimentation
  Planning    -> Research papers/forums, read user comments, review past rounds, write plan
  Development -> Implement experiments, train models, log metrics, optimize
  Review      -> Verify approach, submit to Kaggle, wait for LB score, write round review

  Loop continues until max rounds or user cancellation

Three AI agents rotate through each round:

Agent      Role                                                    Output
Planner    Research papers, forums, notebooks; decide what to try  rounds/round_N/plan.md
Developer  Implement experiments, train models, log CV scores      rounds/round_N/dev_log.md
Reviewer   Validate, submit to Kaggle, analyze CV-LB gap           rounds/round_N/review.md
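
The rotation above can be sketched in shell. This is a hypothetical illustration, not code from autokg.sh: artifact_for_phase and run_round are invented names, and the real orchestrator invokes an agent CLI where this sketch only prints the phase and its artifact path.

```shell
# Hypothetical sketch of one round; function names are not AutoKG's.

# Map a phase to the artifact its agent is documented to write.
artifact_for_phase() {
  local round
  local phase
  round="$1"
  phase="$2"
  case "$phase" in
    planning)    echo "rounds/round_${round}/plan.md" ;;
    development) echo "rounds/round_${round}/dev_log.md" ;;
    review)      echo "rounds/round_${round}/review.md" ;;
    *)           echo "unknown phase: ${phase}" >&2; return 1 ;;
  esac
}

# Walk the three phases in order; a real runner would call the agent here.
run_round() {
  local round
  local phase
  round="$1"
  for phase in planning development review; do
    echo "round ${round}: ${phase} -> $(artifact_for_phase "$round" "$phase")"
  done
}

run_round 1
```

Keeping the phase-to-artifact mapping in one function means the runner and any status display agree on where each round's files live.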

Design Principles

  • Never stop improving - Even at rank #1, keep iterating to maintain the lead
  • Knowledge accumulates - Every round builds on all previous findings
  • User steerable - Inject guidance via comments, or send a kill signal, at any time
  • Research-first - Deep investigation before coding, not just trying random things
  • Reproducible - Every experiment is logged and documented
  • Full hardware utilization - Agents use all available GPUs directly on the host
  • Clean environments - All Python packages managed via uv

Requirements

  • bash >= 4.0
  • jq - JSON processing
  • Claude Code - AI agent CLI backend (default)
  • uv - Python package manager
  • tmux (optional) - Monitor dashboard

Quick Start

Install

./install.sh

Recommended MCP setup:

"mcpServers": {
    "context7": {
      "url": "https://mcp.context7.com/mcp",
      "headers": {
        "CONTEXT7_API_KEY": "Your-Key"
      }
    },
    "kaggle": {
      "url": "https://www.kaggle.com/mcp",
      "type": "http",
      "headers": {
        "Authorization": "Bearer Your-Key"
      }
    },
    "arxiv-mcp-server": {
      "command": "uv",
      "args": [
        "tool",
        "run",
        "arxiv-mcp-server",
        "--storage-path",
        ".cache/paper"
      ]
    },
    "browser-use": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "https://api.browser-use.com/mcp",
        "--header",
        "X-Browser-Use-API-Key: Your-Key"
      ]
    },
    "hf-mcp-server": {
      "url": "https://huggingface.co/mcp?login",
      "headers": {}
    },
    "ssh-mcp": {
      "command": "npx",
      "args": [
        "ssh-mcp",
        "-y",
        "--",
        "--host=Your-Host",
        "--user=username",
        "--password=pass",
        "--timeout=30000",
        "--maxChars=none"
      ]
    } 
}

Fill in your own auth keys. The ssh-mcp entry gives the agents SSH access to a remote compute cluster for heavy model training.

Preflight Check

# Validate agent CLI, auth, model, and dependencies
autokg check

Sample output:

[1/5] Claude Code CLI
  ✓ Installed

[2/5] Authentication
  + Logged in as user@example.com

[3/5] MCP Servers
  ✓ kaggle: running
  ! context7: not loaded (needs approval)

[4/5] Model Configuration
  ✓ Model 'opus-4.5-thinking' is available

[5/5] Dependencies
  ✓ jq: jq-1.7.1
  ✓ uv: uv 0.9.11

Initialize a Competition

# Basic init
autokg init titanic

# With a competition spec file
autokg init titanic --spec MYSPEC.md

This creates the full project structure:

.autokg/                 # AutoKG state
  config.json            # Project configuration
  round_state.json       # Current round/phase tracking
  status.json            # Agent communication file
competition.json         # Competition metadata
leaderboard.json         # Submission & score history
knowledge_base.json      # Accumulated research knowledge
master_comments.json     # User-injected guidance
rounds/                  # Per-round artifacts (plan.md, dev_log.md, review.md)
data/raw/                # Competition data
src/                     # Experiment code
models/                  # Saved model artifacts
submissions/             # Submission CSV files

Run

# Start the agent loop
autokg run

# Run with tmux dashboard (3 panes: runner, status, scores)
autokg --monitor run

Interact While Running

# Inject guidance (highest priority - planner reads these first)
autokg comment "Try using LightGBM instead of XGBoost"
autokg comment "Focus on feature engineering for the next round"
autokg comment "Check arxiv paper 2401.xxxxx for feature selection ideas"

# Graceful stop (finishes current phase, then exits)
autokg --kill

Monitor Progress

# Current status (round, phase, scores, rate limits)
autokg status

# Score history table (CV, LB, gap, rank, trend)
autokg scores

# Accumulated knowledge base
autokg knowledge
autokg knowledge --category model

CLI Reference

autokg init <competition> [--spec <file>]   Initialize for a competition
autokg check                                 Preflight check (auth, MCP, deps)
autokg run                                   Run the agent loop
autokg status                                Show current status
autokg comment "<message>"                   Add user guidance
autokg scores                                Show score history & trends
autokg knowledge [--category <cat>]          View knowledge base

autokg --monitor run                         Run with tmux dashboard
autokg --model <model> run                   Use specific AI model
autokg --kill                                Graceful stop signal
autokg --reset-rate-limit                    Reset API rate limiter
autokg --help                                Show help
autokg --version                             Show version

Exit Codes

Code  Meaning
0     User --kill (graceful exit)
1     General error
21    Max rounds reached
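
A wrapper script can branch on these codes. The sketch below is an assumption about how one might consume them, not part of AutoKG; describe_exit is a hypothetical helper, and only the three codes listed above are treated as known.

```shell
# Hypothetical helper: translate an autokg exit code into a message.
describe_exit() {
  case "$1" in
    0)  echo "stopped gracefully via --kill" ;;
    1)  echo "general error" ;;
    21) echo "max rounds reached" ;;
    *)  echo "unrecognized exit code: $1" ;;
  esac
}

# Demo with a literal code; in a wrapper you would pass "$?" right after `autokg run`.
describe_exit 21
```

In a real wrapper the call would read `autokg run; describe_exit "$?"`, so that the code is captured before any other command overwrites it.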

Configuration

Edit .autokg/config.json after init:

{
  "agent": {
    "cli_tool": "claude",
    "model": "opus",
    "timeout_minutes": 1440,
    "models_by_role": {
      "planner": "opus-4.6-thinking",
      "developer": "gpt-5.3-codex-xhigh-fast",
      "reviewer": "opus-4.6-thinking"
    }
  },
  "round": {
    "max_rounds": 50,
    "planning_max_loops": 3,
    "development_max_loops": 10,
    "review_max_loops": 5
  },
  "rate_limiting": {
    "max_calls_per_hour": 100
  },
  "kaggle": {
    "daily_submission_limit": 5
  }
}
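
To make the rate_limiting.max_calls_per_hour setting concrete, here is a minimal fixed-window counter in shell. It is a sketch under assumptions: lib/rate_limiter.sh may use a different scheme, and the names can_make_call, record_call, call_count, and window_start are hypothetical.

```shell
# Hypothetical fixed-window limiter; not taken from lib/rate_limiter.sh.
MAX_CALLS_PER_HOUR=100        # mirrors rate_limiting.max_calls_per_hour
call_count=0
window_start=$(date +%s)

# Succeeds while the current one-hour window still has budget.
# Starts a fresh window once 3600 seconds have passed.
can_make_call() {
  local now
  now=$(date +%s)
  if [ $((now - window_start)) -ge 3600 ]; then
    window_start=$now
    call_count=0
  fi
  [ "$call_count" -lt "$MAX_CALLS_PER_HOUR" ]
}

# Count one agent call against the current window.
record_call() {
  call_count=$((call_count + 1))
}

if can_make_call; then
  record_call
  echo "calls used this hour: ${call_count}/${MAX_CALLS_PER_HOUR}"
fi
```

A fixed window is the simplest option; it allows short bursts across a window boundary, which a sliding window would smooth out at the cost of storing per-call timestamps.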

Architecture

autokg.sh                     Main CLI and orchestration loop
lib/
  utils.sh                    Logging, timestamps, JSON helpers
  rate_limiter.sh             API call rate + Kaggle submission tracking
  round_manager.sh            Round state, phase transitions, kill signal
  agent_adapter.sh            Agent CLI integration (Claude Code / cursor-agent), prompt generation
  metrics_collector.sh        Score tracking, CV-LB gap analysis, trends
  competition_manager.sh      Competition metadata management
  knowledge_manager.sh        Knowledge base CRUD
  comment_manager.sh          User comment management
prompts/
  planner.md                  Research-first planning prompt
  developer.md                ML engineering prompt
  reviewer.md                 Submission management prompt
templates/
  config.json                 Default configuration

Key Design Choices

No circuit breaker. Unlike Sprinty (the reference project), AutoKG never stops on agent errors. If an agent crashes, the orchestrator retries up to that phase's loop limit (planning_max_loops, development_max_loops, or review_max_loops in the config), then moves on to the next phase. For Kaggle, persistence beats correctness of any single round.
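
The retry-then-move-on behavior can be sketched as a small shell function. This is an illustration under assumptions, not the code in autokg.sh: run_phase_with_retries is a hypothetical name, and the command passed in stands in for an agent invocation.

```shell
# Hypothetical sketch: retry a phase command, but never abort the round.
# Usage: run_phase_with_retries <phase-name> <max-loops> <command...>
run_phase_with_retries() {
  local phase
  local max_loops
  local attempt
  phase="$1"
  max_loops="$2"
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_loops" ]; do
    if "$@"; then
      echo "${phase}: succeeded on attempt ${attempt}"
      return 0
    fi
    echo "${phase}: attempt ${attempt} failed" >&2
    attempt=$((attempt + 1))
  done
  # No circuit breaker: report the failure and let the next phase run anyway.
  echo "${phase}: gave up after ${max_loops} attempts, moving on"
  return 0
}

# Demo: `false` stands in for an agent that crashes every time.
run_phase_with_retries development 3 false
```

Returning 0 even after exhausting the retries is the point of the sketch: the caller cannot tell a failed phase from a successful one by exit status alone, so a real runner would need to log the outcome elsewhere.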

Knowledge accumulation. knowledge_base.json stores confirmed insights, failed approaches, reviewed papers, forum findings, and top notebook analyses. Every round's planner reads this to avoid repeating mistakes and build on what works.

User steering. master_comments.json lets users inject guidance at any time. The planner reads unread comments before anything else, and they take priority over all other research.

Kill signal. autokg --kill writes a .autokg/.kill file. The orchestrator checks for it between phases and exits gracefully, allowing the current phase to finish.
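
A file-based kill signal of this kind can be sketched in a few lines of shell. The function names below are hypothetical, the demo uses a temporary directory instead of a real project, and removing the file after it is seen is an assumption rather than documented AutoKG behavior.

```shell
# Hypothetical sketch of a file-based kill signal checked between phases.
KILL_FILE=".autokg/.kill"

# What `autokg --kill` is described as doing: create the marker file.
request_kill() {
  mkdir -p "$(dirname "$KILL_FILE")"
  : > "$KILL_FILE"
}

kill_requested() {
  [ -f "$KILL_FILE" ]
}

run_phases() {
  local phase
  for phase in planning development review; do
    echo "running ${phase}"
    # Simulate the user sending the signal while development is in progress.
    if [ "$phase" = "development" ]; then
      request_kill
    fi
    # The check happens only after the phase has finished.
    if kill_requested; then
      echo "kill signal seen after ${phase}; exiting gracefully"
      rm -f "$KILL_FILE"   # assumption: clear the marker so the next run starts clean
      return 0
    fi
  done
}

# Demo in a throwaway directory so no real project state is touched.
KILL_FILE="$(mktemp -d)/.autokg/.kill"
run_phases
```

Because the check sits between phases, a long development phase still runs to completion before the loop exits, which matches the graceful-stop behavior described above.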

State Files

File                      Purpose
competition.json          Competition name, metric, deadline, submission limits, best scores
leaderboard.json          Full submission history with CV/LB scores, gaps, ranks, trends
knowledge_base.json       Insights, failed approaches, papers, forums, notebooks reviewed
master_comments.json      User-injected guidance with read/unread status
.autokg/round_state.json  Current round number, phase, loop count, round history
.autokg/status.json       Agent communication - agents update this after each execution
.autokg/config.json       Project configuration (models, timeouts, limits)
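
The CV-LB gap tracked in leaderboard.json can be illustrated with a one-line calculation. This sketch assumes the gap is defined as the CV score minus the LB score; the actual sign convention and storage format in lib/metrics_collector.sh are not documented here, and cv_lb_gap is a hypothetical helper.

```shell
# Hypothetical helper: gap = CV score - LB score, printed to four decimals.
# awk is used because shell arithmetic is integer-only.
cv_lb_gap() {
  awk -v cv="$1" -v lb="$2" 'BEGIN { printf "%.4f\n", cv - lb }'
}

# Illustrative numbers only, not real competition results.
cv_lb_gap 0.85 0.80
```

Under this convention, and for a metric where higher is better, a large positive gap suggests the local validation is more optimistic than the leaderboard, a common sign of overfitting to the CV folds.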
