An opinionated, builder-first map of the generative AI stack
Why This List | Start Here | Top Picks | Build Paths | Resource Index | Contributing
Most AI awesome lists are too broad, too stale, or too academic.
This repo is for builders who want to ship real products with modern generative AI:
- Practical entry points across agents, MCP, speech, vision, and model APIs
- Curated links that are useful for implementation, not just hype
- Clear starting paths, so you can pick a direction without scrolling through generic lists
If you are building agentic apps, voice products, multimodal workflows, or AI-powered developer tools, start here.
Pick one path and get moving in under 10 minutes:
- Build AI agents: AI Agents
- Build speech products: STT Models, TTS Models
- Build visual products: Text-to-Image, Talking Head
- Build model apps: Transformers, GenAI APIs
- Add tool integration: MCP Servers
If you only open a few sections, start with these:
| Area | Best Starting Point | Why It Matters |
|---|---|---|
| Agentic apps | AI Agents | Frameworks, coding agents, memory systems, and orchestration tools |
| Tool-using LLMs | MCP Servers | Quick map of the MCP ecosystem and the most useful server categories |
| Voice AI | TTS Models, STT Models | Core stack for assistants, call automation, and speech UX |
| Visual generation | Text-to-Image, Talking Head | Useful for multimodal products, synthetic media, and creator tools |
| Model integration | GenAI APIs, Transformers | Compare hosted APIs and open model tooling quickly |
Agentic apps:
- Start: AI Agents
- Add memory and context: context-engineering.md
- Add tool integration: mcp.md
- Best for: coding agents, assistants, workflow automation, and multi-agent systems

Voice AI:
- Speech recognition: stt-models.md, stt-datasets.md
- Speech synthesis: tts.md, voice-cloning.md
- Emotion and affect: emotion-recognition.md
- Best for: call agents, speech interfaces, dubbing, and voice cloning pipelines

Visual generation:
- Image generation: text-to-image.md
- Talking avatars: talking-head.md
- Best for: creative tools, avatars, media generation, and synthetic presenters

Model integration:
- Model selection: transformers.md
- API integrations: genai-apis.md
- Best for: choosing between open-source models, APIs, and deployment approaches
| Area | Primary Files |
|---|---|
| Agents and orchestration | ai-agents.md, mcp.md, context-engineering.md |
| Speech | stt-models.md, stt-datasets.md, tts.md, voice-cloning.md, emotion-recognition.md |
| Vision and media | text-to-image.md, talking-head.md |
| Models and APIs | transformers.md, genai-apis.md |
| Extended references | more_detailed.md |
- Curated by hand for practical relevance to builders
- Focused on tools, repos, datasets, and frameworks with real implementation value
- Some files include star counts captured during periodic updates, so the numbers may lag behind current values
- If you spot stale entries, weak categorization, or broken links, open an issue or PR
Contributions are welcome for:
- New high-quality open-source repos
- Fixes for broken links or outdated entries
- Better categorization and structure
Please follow CONTRIBUTING.md.
This is a curated collection of external resources. Rights and credit belong to each original author and organization.