General-Compute
diff --git a/.env.example b/.env.example
Lines changed: 9 additions & 0 deletions
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
Lines changed: 34 additions & 0 deletions
diff --git a/.gitignore b/.gitignore
Lines changed: 66 additions & 0 deletions
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
Lines changed: 39 additions & 0 deletions
diff --git a/LICENSE b/LICENSE
Lines changed: 21 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 179 additions & 0 deletions
diff --git a/SECURITY.md b/SECURITY.md
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,9 @@
+# API Keys for inference providers
+# General Compute is served by SambaNova's cloud.
+SAMBANOVA_API_KEY=your_sambanova_api_key_here
+OPENROUTER_API_KEY=your_openrouter_api_key_here
+
+# Optional Configuration Overrides
+# DEFAULT_ITERATIONS=50
+# RESULTS_DIR=./results
+# CONFIG_FILE=config/config.yaml
@@ -0,0 +1,34 @@
+name: CI
+
+on:
+  pull_request:
+  push:
+    branches: [main]
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.10", "3.11", "3.12"]
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Install
+        run: |
+          python -m pip install --upgrade pip
+          python -m pip install -e ".[dev]"
+
+      - name: Test
+        run: pytest
+
+      - name: Lint
+        run: ruff check src tests
+
+      - name: Type check
+        run: mypy src
@@ -0,0 +1,66 @@
+# Environment variables
+.env
+.env.*
+!.env.example
+
+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+
+# Virtual environments
+venv/
+env/
+ENV/
+.venv
+
+# IDE
+.vscode/
+.idea/
+*.swp
+*.swo
+*~
+.DS_Store
+
+# Testing
+.pytest_cache/
+.ruff_cache/
+.coverage
+htmlcov/
+.tox/
+
+# Results and output
+results/*
+!results/.gitkeep
+*.csv
+*.html
+*.xls
+*.xlsx
+*.xlsm
+!src/benchmarking/reporting/templates/*.html
+
+# Logs
+*.log
+
+# Type checking
+.mypy_cache/
+.dmypy.json
+dmypy.json
@@ -0,0 +1,39 @@
+# Contributing
+
+Thanks for improving GC Benchmarking.
+
+## Development Setup
+
+```bash
+python3 -m venv venv
+source venv/bin/activate
+pip install -e ".[dev]"
+```
+
+Create `.env` from `.env.example` only when you need live provider calls. Unit
+tests should not require provider credentials.
+
+## Checks
+
+Run these before opening a pull request:
+
+```bash
+pytest
+ruff check src tests
+mypy src
+```
+
+Format changed Python files with:
+
+```bash
+black src tests
+```
+
+## Pull Request Guidance
+
+- Keep generated benchmark output out of commits.
+- Do not include API keys, provider account identifiers, or private benchmark
+  data.
+- Add or update tests for changes to metrics, config loading, CLI behavior, or
+  report generation.
+- When changing benchmark methodology, document the tradeoff in `README.md`.
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 General Compute
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
@@ -0,0 +1,179 @@
+# GC Benchmarking
+
+LLM inference benchmarking for OpenAI-compatible providers. The tool runs the
+same logical model across every enabled provider that has a configured model ID,
+then reports time to first token, end-to-end latency, output throughput, token
+counts, retry attempts, and error rate.
+
+The default configuration compares General Compute, served through SambaNova's
+cloud API, with OpenRouter for the same model families. Add or disable providers
+in `config/config.yaml` without changing code.
+
+## Features
+
+- Same-model provider comparisons
+- Provider interleaving within each iteration to reduce time-window bias
+- Warm-up requests that are discarded from metrics
+- Prompt variation to reduce provider-side cache effects
+- Streaming TTFT measurement, including reasoning-token streams
+- Incremental raw CSV writes so interrupted runs keep completed samples
+- CSV, HTML, and static-site JSON report generation
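
As an illustration of the provider-interleaving feature listed above, here is a minimal sketch of one possible request schedule. The helper name `interleaved_schedule` is hypothetical and not taken from this repository; the only point it makes is that the iteration loop is the outer loop, so every provider is sampled once per iteration instead of one provider finishing all of its iterations before the next one starts.

```python
from typing import Iterator


def interleaved_schedule(
    providers: list[str], iterations: int
) -> Iterator[tuple[int, str]]:
    """Yield (iteration, provider) pairs with providers interleaved.

    Hypothetical helper for illustration only. Iterations are the outer
    loop, so all providers are measured in roughly the same time window.
    """
    for iteration in range(iterations):
        for provider in providers:
            yield iteration, provider


# Provider names match the ones used in the README usage examples.
schedule = list(interleaved_schedule(["general_compute", "openrouter"], 2))
```

With this ordering, a transient slowdown that affects the network or the benchmarking host is more likely to hit all providers in the same iteration than to penalize only one of them.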
+
+## Installation
+
+Use Python 3.10 or newer.
+
+```bash
+python3 -m venv venv
+source venv/bin/activate
+pip install -e ".[dev]"
+```
+
+Or run the local setup script:
+
+```bash
+./setup.sh
+```
+
+## Configuration
+
+Create a local `.env` from the example and add provider keys:
+
+```bash
+cp .env.example .env
+```
+
+Required by the default config:
+
+```bash
+SAMBANOVA_API_KEY=your_sambanova_api_key_here
+OPENROUTER_API_KEY=your_openrouter_api_key_here
+```
+
+`.env` and benchmark outputs are intentionally ignored by Git. Do not commit
+real API keys or generated result files.
+
+By default, the CLI loads `config/config.yaml` from the current working
+directory when present. Otherwise, it falls back to the packaged default config.
+Set `CONFIG_FILE=/path/to/config.yaml` to use an explicit file.
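
The lookup order described above could be sketched as follows. This is a sketch under assumptions, not the project's actual loader: the function name `resolve_config_path` and the `packaged_default` parameter are hypothetical, and giving `CONFIG_FILE` precedence over a local `config/config.yaml` is an assumed reading of the text.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_config_path(
    cwd: Path,
    packaged_default: Path,
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    """Pick a config file path (hypothetical helper, for illustration).

    Assumed order: explicit CONFIG_FILE, then config/config.yaml under the
    working directory if it exists, then the packaged default.
    """
    environ = os.environ if env is None else env
    explicit = environ.get("CONFIG_FILE")
    if explicit:
        return Path(explicit)
    local = cwd / "config" / "config.yaml"
    if local.is_file():
        return local
    return packaged_default
```

Passing `env` explicitly keeps the function easy to unit test without touching the real process environment.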
+
+## Usage
+
+List configured providers, models, and workloads:
+
+```bash
+benchmark providers
+benchmark models
+benchmark workloads
+```
+
+Run a quick connectivity test:
+
+```bash
+benchmark test --provider general_compute --model gpt-oss-120b --workload ctx_256 --iterations 1
+```
+
+Run a benchmark:
+
+```bash
+benchmark run --providers general_compute,openrouter --models gpt-oss-120b --workloads ctx_256,ctx_1k --iterations 5
+```
+
+Run all enabled providers, models, and workloads:
+
+```bash
+benchmark run --iterations 50
+```
+
+Regenerate reports for an existing session:
+
+```bash
+benchmark report <session-id>
+```
+
+List local sessions:
+
+```bash
+benchmark list-sessions
+```
+
+## Workloads
+
+The default workloads are context-size sweeps:
+
+- `ctx_256`: 256 input tokens
+- `ctx_1k`: 1,024 input tokens
+- `ctx_4k`: 4,096 input tokens
+- `ctx_16k`: 16,384 input tokens
+- `ctx_64k`: 65,536 input tokens
+- `ctx_128k`: 131,072 input tokens
+
+Token counts are approximate because prompts are generated with the `tiktoken`
+`cl100k_base` encoding, not each model provider's native tokenizer.
+
+## Outputs
+
+Results are written under `results/`:
+
+- `session_<id>_raw.csv`: one row per request
+- `session_<id>_summary.csv`: aggregate statistics by model, provider, and workload
+- `session_<id>_report.html`: general HTML charts and tables
+- `session_<id>_provider_performance.html`: provider performance charts
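
To make the relationship between the raw and summary files concrete, here is a minimal sketch of grouping per-request rows by model, provider, and workload. The column names (`model`, `provider`, `workload`, `ttft_s`), the function name `summarize_ttft`, and the choice of statistics are assumptions for illustration; the actual CSV schema is not shown in this change.

```python
import statistics
from collections import defaultdict
from typing import Iterable, Mapping

GroupKey = tuple[str, str, str]


def summarize_ttft(
    rows: Iterable[Mapping[str, str]],
) -> dict[GroupKey, dict[str, float]]:
    """Aggregate per-request TTFT values by (model, provider, workload).

    Column names are hypothetical; rows are shaped like csv.DictReader output.
    """
    groups: dict[GroupKey, list[float]] = defaultdict(list)
    for row in rows:
        key = (row["model"], row["provider"], row["workload"])
        groups[key].append(float(row["ttft_s"]))
    return {
        key: {
            "count": len(values),
            "mean_ttft_s": statistics.mean(values),
            "median_ttft_s": statistics.median(values),
        }
        for key, values in groups.items()
    }
```

Rows read with `csv.DictReader` from a raw session file could be passed straight to this function, provided the file uses the assumed column names.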
+
+HTML reports load Plotly from the public CDN. Use the CSV outputs if you need a
+fully offline artifact.
+
+## Static Site Export
+
+Export a completed session as pre-aggregated JSON for a static site:
+
+```bash
+benchmark publish <session-id> --site-path ../my-site --label "June benchmark"
+```
+
+This writes files under `../my-site/public/benchmarks/`:
+
+- `manifest.json`
+- `<session-id>.json`
+- `<session-id>_raw.csv` unless `--no-copy-raw` is passed
+
+Remove a published session:
+
+```bash
+benchmark unpublish <session-id> --site-path ../my-site
+```
+
+## Methodology Notes
+
+Comparisons are meaningful only within the same logical model. OpenRouter is an
+aggregator, so its latency can include routing overhead and can vary by selected
+backend. Review provider routing settings in `config/config.yaml` before
+publishing benchmark claims.
+
+Output throughput is measured over the interval after the first token arrives,
+so decode speed is separated from queueing and prompt-processing overhead.
+Retries are limited to transient errors; failed attempts and backoff sleeps do
+not inflate successful-attempt latency metrics.
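
The separation of decode speed from TTFT can be illustrated with a small sketch. The names below (`StreamTiming`, `ttft`, `e2e_latency`, `output_throughput`) are hypothetical, and excluding the first token from the throughput numerator is an assumption made here, since that token arrives at the TTFT boundary; the project's own formula may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamTiming:
    """Timestamps in seconds for one successful attempt (hypothetical type)."""

    request_start: float
    first_token: float
    last_token: float
    output_tokens: int


def ttft(t: StreamTiming) -> float:
    """Time to first token: queueing plus prompt processing."""
    return t.first_token - t.request_start


def e2e_latency(t: StreamTiming) -> float:
    """End-to-end latency of the attempt."""
    return t.last_token - t.request_start


def output_throughput(t: StreamTiming) -> float:
    """Tokens per second during decode, measured after the first token.

    Assumption: the first token is excluded because it marks the start
    of the decode window rather than falling inside it.
    """
    decode_time = t.last_token - t.first_token
    if decode_time <= 0 or t.output_tokens <= 1:
        return 0.0
    return (t.output_tokens - 1) / decode_time
```

Because the timestamps belong to a single successful attempt, earlier failed attempts and backoff sleeps never enter these calculations, which matches the retry behavior described above.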
+
+## Development
+
+```bash
+pytest
+ruff check src tests
+mypy src
+```
+
+Format code:
+
+```bash
+black src tests
+```
+
+## Security
+
+Please do not open public issues with secrets, API keys, private benchmark data,
+or unpublished provider credentials. See `SECURITY.md` for reporting guidance.
+
+## License
+
+MIT. See `LICENSE`.
@@ -0,0 +1,28 @@
+# Security Policy
+
+## Reporting
+
+Please report suspected vulnerabilities privately to the maintainers instead of
+opening a public issue.
+
+Include:
+
+- A concise description of the issue
+- Steps to reproduce or a proof of concept
+- Affected version or commit, when known
+- Any known impact on API keys, benchmark data, or generated reports
+
+## Secret Handling
+
+Never commit `.env`, provider API keys, account identifiers, private result
+files, or unpublished benchmark data. If a credential is committed or shared,
+rotate it with the provider immediately.
+
+## Scope
+
+Security-sensitive areas include:
+
+- API key loading and error handling
+- Generated HTML reports
+- Static-site export files
+- CSV parsing and report regeneration from local session files