xuezh (Chinese Learning Engine, ZFC / Unix-style)

Name: xuezh is short for 学中文 (learn Chinese).

This repo is a local learning engine for Mandarin study. It is designed to be used as a tool/skill behind a bot runtime plus a state-of-the-art LLM (Clawdbot is the recommended integration), but it is also a plain CLI you can call however you want.

Authorship

Primary author: Codex using the gpt-5.2-codex model.

Recommended usage (Clawdbot)

Run xuezh as a CLI tool from a bot agent (Clawdbot) and parse its JSON output. Keep mode and credentials in a config file, and keep dependencies pinned in the bot's dev environment. For managed OpenClaw, the Mac CLI is client-backed and the Azure Speech key stays on the mini-server.

Clawdbot (upstream) repo:

https://github.com/steipete/clawdbot

Clawdbot is a local-first personal assistant that routes WhatsApp/Telegram/WebChat messages to an agent runtime. The Gateway is the control plane (sessions, providers, media, voice wake), while tools like xuezh are called on demand. Integration is simple: have the bot call the xuezh CLI and parse JSON responses, then surface the feedback back to the user. You can run the bot locally on your devices and keep all state under your control.

Key interaction flows (bot ↔ user):

  1. Pronunciation feedback (audio → assessment)
    User sends a voice note (e.g., “你好我叫小李。你叫什么?”).
    Bot calls:

    xuezh audio process-voice --in /tmp/voice.m4a --ref-text "你好我叫小李。你叫什么?" --json
    

    Bot reads data.assessment + data.transcript and responds with targeted feedback.

  2. Listen and repeat (text → audio)
    User asks “How do I say …?”
    Bot calls:

    xuezh audio tts --text "你好" --voice XiaoxiaoNeural --out /tmp/xuezh-tts.ogg --json
    

    Bot returns the local delivery file as a voice note. Server artifact paths in the JSON are audit metadata.

  3. Progress recap (facts → summary)
    User asks “How am I doing?”
    Bot calls:

    xuezh report hsk --level 6 --json
    

    Bot summarizes the factual progress data. Use --level 7-9 if your dataset includes the 7–9 bucket.
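
A minimal Python sketch of the bot side of these flows, assuming the CLI always prints one JSON envelope on stdout and signals failure with "ok": false. The helper names (call_xuezh, pronunciation_feedback) and the injectable runner are hypothetical and only there to make the sketch testable without the CLI installed.

```python
import json
import subprocess
from typing import Any, Callable, Sequence

Runner = Callable[[Sequence[str]], str]


def _run_cli(argv: Sequence[str]) -> str:
    """Run a command and return whatever it wrote to stdout."""
    done = subprocess.run(list(argv), capture_output=True, text=True, check=False)
    return done.stdout


def call_xuezh(args: Sequence[str], runner: Runner = _run_cli) -> dict[str, Any]:
    """Invoke `xuezh <args> --json` and return the envelope's `data` object."""
    envelope = json.loads(runner(["xuezh", *args, "--json"]))
    # Assumption: failures are reported as "ok": false in the same envelope.
    if not envelope.get("ok"):
        raise RuntimeError(f"xuezh {' '.join(args)} failed: {envelope}")
    return envelope["data"]


def pronunciation_feedback(
    voice_path: str, ref_text: str, runner: Runner = _run_cli
) -> tuple[Any, Any]:
    """Flow 1: return (assessment, transcript) for a voice note."""
    data = call_xuezh(
        ["audio", "process-voice", "--in", voice_path, "--ref-text", ref_text],
        runner,
    )
    return data["assessment"], data["transcript"]
```

The other two flows follow the same shape: pass the audio tts or report hsk arguments to call_xuezh and read the fields the bot needs from the returned data.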

Example screenshots (from a bot flow):

Mac OpenClaw config (/etc/xuezh/config.toml):

[client]
server_url = "https://chinese.jjpcodes.com"

Mini-server config (/etc/xuezh/config.toml):

[workspace]
dir = "/var/lib/xuezh"

[azure.speech]
key_file = "/run/agenix/xuezh-azure-speech-key"
region = "westeurope"

[audio]
process_voice_backend = "azure.speech"
convert_backend = "ffmpeg"
tts_backend = "edge-tts"
inline_max_bytes = 200000

Other usage (CLI)

This is a standard CLI. You can call it from any script or workflow as long as its dependencies are available:

nix run github:joshp123/xuezh
xuezh version --json
xuezh audio process-voice --in /path/to/voice.m4a --ref-text "你好" --json

Core commands + example outputs

Version:

$ xuezh version --json
{"ok":true,"schema_version":"1.0","command":"version","data":{"version":"0.1.0"},"artifacts":[],"truncated":false,"limits":{}}

Voice processing (pronunciation assessment + transcript):

$ xuezh audio process-voice --in /path/to/voice.m4a --ref-text "你好" --json
{"ok":true,"schema_version":"1.0","command":"audio.process-voice","data":{"assessment":{...},"transcript":{"text":"你好"}},"artifacts":[...],"truncated":false,"limits":{"inline_bytes_max":200000}}

Text-to-speech (audio artifact):

$ xuezh audio tts --text "你好" --voice XiaoxiaoNeural --out /tmp/xuezh-tts.ogg --json
{"ok":true,"schema_version":"1.0","command":"audio.tts","data":{"voice":"zh-CN-XiaoxiaoNeural","delivery_path":"/tmp/xuezh-tts.ogg"},"artifacts":[{"purpose":"audio_tts","path":"artifacts/.../tts.ogg"}],"truncated":false,"limits":{}}
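
All three example outputs share the same top-level keys, so a caller can validate the envelope once before touching data. The sketch below assumes those seven keys are always present on success; the Envelope class and parse_envelope function are hypothetical names, not part of the package. docs/cli-contract.md remains the authoritative definition.

```python
import json
from dataclasses import dataclass, field
from typing import Any

# Keys observed in the example outputs above (assumed to be the full set).
REQUIRED_KEYS = (
    "ok", "schema_version", "command", "data",
    "artifacts", "truncated", "limits",
)


@dataclass(frozen=True)
class Envelope:
    ok: bool
    schema_version: str
    command: str
    data: dict[str, Any]
    artifacts: list[Any] = field(default_factory=list)
    truncated: bool = False
    limits: dict[str, Any] = field(default_factory=dict)


def parse_envelope(raw: str) -> Envelope:
    """Parse one CLI response and fail loudly if a top-level key is missing."""
    obj = json.loads(raw)
    missing = [key for key in REQUIRED_KEYS if key not in obj]
    if missing:
        raise ValueError(f"envelope is missing keys: {missing}")
    return Envelope(**{key: obj[key] for key in REQUIRED_KEYS})


raw = (
    '{"ok":true,"schema_version":"1.0","command":"version",'
    '"data":{"version":"0.1.0"},"artifacts":[],"truncated":false,"limits":{}}'
)
env = parse_envelope(raw)
print(env.command, env.data["version"])  # prints: version 0.1.0
```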

SRS review (recall vs pronunciation):

$ xuezh review start --json
$ xuezh review grade --item w_xxx --recall 4 --pronunciation 2 --json

Notes:

  • review start returns separate recall_items and pronunciation_items queues.
  • The --grade flag applies to recall only.
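
A small sketch of how an agent might build the grade call and read the two queues. It uses only the flags shown above. Two things are assumptions: that --recall and --pronunciation may each be passed on their own, and that the queues sit directly under data in the review start response. The function names are hypothetical.

```python
from typing import Any, Optional


def grade_argv(
    item_id: str,
    recall: Optional[int] = None,
    pronunciation: Optional[int] = None,
) -> list[str]:
    """Build the argv for `xuezh review grade` from the documented flags."""
    if recall is None and pronunciation is None:
        raise ValueError("give at least one of recall / pronunciation")
    argv = ["xuezh", "review", "grade", "--item", item_id]
    if recall is not None:
        argv += ["--recall", str(recall)]
    if pronunciation is not None:
        # Assumption: this flag can be sent without --recall.
        argv += ["--pronunciation", str(pronunciation)]
    return argv + ["--json"]


def split_queues(data: dict[str, Any]) -> tuple[list[Any], list[Any]]:
    """Return (recall_items, pronunciation_items) from a review start payload."""
    # Assumption: both queues are top-level keys of `data`.
    return (
        list(data.get("recall_items", [])),
        list(data.get("pronunciation_items", [])),
    )
```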

Key idea

  • Model = smart endpoint (lesson planning, choosing what to teach next, pedagogy)
  • Engine = dumb pipes (SQLite persistence, mechanical transforms, bounded reports, audio file materialization)

The engine must remain ZFC-compliant: no local ranking/selection heuristics; no “what should we do next” logic. The engine only returns primary sources and performs mechanical transforms. See docs/reference/zfc-zero-framework-cognition.md.
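
To make the "dumb pipes" idea concrete, here is an illustrative sketch (not the engine's code) of a mechanical, bounded transform: rows come back in stored order, capped at a limit, with the cut-off reported rather than hidden. The function name and the items_max key are hypothetical; only truncated and limits appear in the real envelope.

```python
from typing import Any, Sequence


def bounded_rows(rows: Sequence[dict[str, Any]], max_items: int) -> dict[str, Any]:
    """Return rows in stored order, capped at max_items.

    No scoring, filtering, or re-ordering happens here: deciding which
    rows matter is left to the model that reads the output.
    """
    kept = list(rows[:max_items])
    return {
        "data": {"items": kept},
        "truncated": len(rows) > max_items,
        "limits": {"items_max": max_items},  # hypothetical limit key
    }
```

A function that sorted rows by "most urgent" or dropped "easy" items before returning them would cross the ZFC boundary, because that is a judgment call rather than a mechanical transform.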

Audio pipeline architecture (STT/TTS)

  • Input normalization: all audio is normalized to WAV via ffmpeg before any backend call.
  • STT / assessment: audio.process-voice runs STT + pronunciation assessment (default backend is Azure Speech).
  • Artifacts: full raw outputs are stored as artifacts; the CLI response inlines only the actionable subset.
  • TTS: audio.tts uses edge-tts to materialize voice audio into an artifact.
  • Local fallback: whisper provides a local STT path when Azure isn't used.
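
The normalization step can be sketched as an ffmpeg argument list. This is an illustration under stated assumptions, not the engine's actual invocation: the 16 kHz mono 16-bit PCM target is a common choice for speech backends, and the function name is hypothetical.

```python
from pathlib import Path


def ffmpeg_to_wav_argv(
    src: str, dst_dir: str, sample_rate: int = 16000, channels: int = 1
) -> list[str]:
    """Build an ffmpeg command that converts `src` to a PCM WAV in `dst_dir`."""
    dst = Path(dst_dir) / (Path(src).stem + ".wav")
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", src,                 # input file (m4a, ogg, ...)
        "-ar", str(sample_rate),   # output sample rate (assumed 16 kHz)
        "-ac", str(channels),      # output channel count (assumed mono)
        "-c:a", "pcm_s16le",       # 16-bit little-endian PCM
        str(dst),
    ]
```

Passing the list to subprocess.run(argv, check=True) would perform the conversion; building the list separately keeps the transform easy to test without ffmpeg installed.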

Runtime dependencies

  • ffmpeg (audio conversion)
  • edge-tts (TTS voice)
  • whisper (local STT fallback; Azure is default)
  • Azure Speech SDK + credentials for pronunciation assessment
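
A quick preflight check for the command-line dependencies might look like the sketch below. The executable names ffmpeg, edge-tts, and whisper are assumptions about how these tools appear on PATH in your environment, and the Azure Speech SDK is a library rather than an executable, so it is not covered here.

```python
import shutil
from typing import Callable, Iterable, Optional


def missing_tools(
    names: Iterable[str] = ("ffmpeg", "edge-tts", "whisper"),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tool names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]
```

An empty list means every listed executable was found; inside devenv shell that is the expected result if the environment provides them.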

Azure notes:

  • You need an Azure Speech resource key/region (free F0 tier is fine to start).
  • Quick setup:
    1. Create an Azure Speech resource (region westeurope is fine).
    2. Grab the key + region from the Azure portal.
    3. Put the key in the server config file ([azure.speech] key_file = ...) and set region.
    4. Run xuezh audio process-voice --in /path/to/audio.m4a --ref-text "你好" --json.
  • Free tier includes 5 audio hours/month for Speech to Text and 0.5M Neural TTS characters/month.
  • Pronunciation Assessment is billed at the baseline Speech to Text rate; prosody/grammar/vocabulary/topic are add-on charges.
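
As a rough back-of-the-envelope check on the free tier, the sketch below converts the 5 audio hours/month into a count of voice notes. It assumes usage is metered purely by audio duration and that notes average a fixed length; both are simplifications, and the 10-second figure is only an example.

```python
FREE_STT_HOURS_PER_MONTH = 5  # from the free tier figure quoted above


def notes_per_month(
    avg_note_seconds: float, free_hours: float = FREE_STT_HOURS_PER_MONTH
) -> int:
    """Estimate how many voice notes fit in the monthly free audio allowance."""
    if avg_note_seconds <= 0:
        raise ValueError("avg_note_seconds must be positive")
    return int(free_hours * 3600 // avg_note_seconds)


print(notes_per_month(10))  # prints: 1800
```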

Quick start (developer)

  1. Enter the dev environment:

    devenv shell
  2. (Optional) Install the package in editable mode:

    python -m pip install -e .[dev]
  3. Run the CLI:

    xuezh --help
    xuezh version --json
  4. Run tests:

    pytest

Default dataset seed (HSK)

The repo bundles a pinned snapshot of ivankra/hsk30 under datasets/ivankra-hsk30/. Use it to initialize real HSK coverage (vocab + grammar only; levels 1–6).

python scripts/seed_hsk30.py --source datasets/ivankra-hsk30

Verify the DB has coverage:

xuezh report hsk --level 6 --json
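
To check every seeded level in one pass, a script could loop over the levels and record whether each report call succeeds. This sketch assumes --level accepts "1" through "6" (only 6 and 7-9 are shown in this README) and checks only the ok flag, because the report's data fields are not documented here. The function name is hypothetical.

```python
import json
import subprocess
from typing import Callable, Sequence


def _run(argv: Sequence[str]) -> str:
    """Run a command and return its stdout."""
    return subprocess.run(list(argv), capture_output=True, text=True).stdout


def hsk_report_status(
    levels: Sequence[str] = ("1", "2", "3", "4", "5", "6"),
    runner: Callable[[Sequence[str]], str] = _run,
) -> dict[str, bool]:
    """Map each level to True if `xuezh report hsk` returned "ok": true."""
    status: dict[str, bool] = {}
    for level in levels:
        raw = runner(["xuezh", "report", "hsk", "--level", level, "--json"])
        try:
            status[level] = bool(json.loads(raw).get("ok"))
        except json.JSONDecodeError:
            # No parseable envelope counts as a failed check.
            status[level] = False
    return status
```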

Notes:

  • Characters are not imported by default (v1 scope). Add --include-chars if needed.
  • The seed script filters to levels 1–6. If your dataset includes a 7–9 bucket, import those rows separately; reporting supports --level 7-9.
  • Managed OpenClaw should not set a xuezh workspace. It should use [client].server_url and let the mini-server own /var/lib/xuezh.

What’s included

  • schemas/ : JSON Schemas (contract stubs; to be enforced by tickets)
  • tests/fixtures/ : minimal dataset fixtures
  • datasets/ : pinned upstream HSK snapshot for local seeding
  • src/xuezh/ : Python package + CLI skeleton (xuezh)
  • tickets/ : historical implementation ticket specs
  • specs/ : user requirements, BDD scenarios, and testing pyramid strategy
  • skills/chinese-learning-orchestrator/ : the Skill prompt glue (SKILL.md + references)
  • devenv.nix : dev environment skeleton (use this; do not install via global package managers)
  • docs/handoff/ : handoff prompt for the implementing agent
  • infra/azure/speech/ : OpenTofu scaffold for the Azure Speech resource

Project boundaries (important)

  • This repo does not implement the Clawdbot bot runtime/Gateway or any Telegram/WhatsApp/WebChat integration.
  • This repo does not implement pedagogy, recommendations, or personalization logic (that stays in the model/agent).
  • The skill (skills/.../SKILL.md) teaches the model how to use the engine, and encodes learning best practices.

Workspace / data path

Local/server mode stores data under [workspace].dir. Managed production uses /var/lib/xuezh.

Client-backed mode uses [client].server_url and does not open a local xuezh workspace. It may write caller-requested delivery files such as /tmp/xuezh-tts.ogg, but those files are not learner state.

Ticket execution method

Use the current user request, active ExecPlan, and repo docs as the work source. The tickets/ directory is historical scaffold context, not an active issue tracker. Implement substantial changes using the red-green-refactor (RGR) pattern:

  • Red: write/enable tests
  • Green: minimal implementation to pass tests
  • Refactor: clean up without behavior change

See AGENTS.md.


References

  • Authoritative CLI spec: docs/cli-contract.md
  • Documentation map: docs/README.md
  • Out of scope (v1): specs/out-of-scope.md
  • Authoritative specs: specs/id-scheme.md, specs/events.md, specs/artifacts/retention.md
  • CI-style checks: ./scripts/check.sh
  • Contract coverage enforcement: tests/contract/
