Fast and Easy Infinite Neural Networks in Python
Updated Mar 1, 2024 - Jupyter Notebook
CVPR 2024: Improved Implicit Neural Representation with Fourier Reparameterized Training
ICML 2025: Inductive Gradient Adjustment for Spectral Bias in Implicit Neural Representations
Existing literature about training-data analysis.
A unified framework for attributing model components, data, and training dynamics to model behavior.
Official repository for "FOCUS: First Order Concentrated Updating Scheme"
Code for "What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers" (NeurIPS 2025)
Code for "Towards a Theoretical Understanding of the 'Reversal Curse' via Training Dynamics"
Source code for "Probability Consistency in Large Language Models: Theoretical Foundations Meet Empirical Discrepancies"
Code for "Effect of equivariance on training dynamics"
Official repository for the EMNLP 2024 paper "How Hard is this Test Set? NLI Characterization by Exploiting Training Dynamics"
External LLM intelligence monitors & diagnoses MoE expert ecology during training — preventing routing collapse without auxiliary loss engineering. 16 Experts, 3 Tiers, Top-2 Gating, Claude-in-the-Loop.
Code and data for: Three Phases of Expert Routing — How Load Balance Evolves During MoE Training
Cross-Family Convergence of Neural Network Weight Skeletons. Companion to Zenodo paper (10.5281/zenodo.19652706).
TMLR 2026 | Mechanistic interpretability: attention-head binding (EB*) as a marker of concept emergence. 7 models, 5 architectures (Pythia 160M–2.8B, OLMo-1B, CRFM GPT-2, SmolLM3-3B, Qwen2.5-1.5B), 41 terms.
Code for "Abrupt Learning in Transformers: A Case Study on Matrix Completion" (NeurIPS 2024)
σFlow-PDE: A drop-in H-Bar training engine that escapes the σ-trap in neural PDE solvers via live σ/δ/α ODE integration, autonomous phase curriculum, and auto-falsification.
Reimplementation of the Sliced Information Plane (SIP) framework from Wongso, Ghosh, and Motani (2025) for analyzing deep neural network training dynamics. The repo uses Sliced Mutual Information (SMI) to obtain scalable, finite dependence estimates in high‑dimensional, deterministic settings, and applies them to MNIST MLP experiments.
A research project investigating how LSTM training dynamics relate to dynamical stability and order–chaos transitions through Finite-Time Lyapunov Exponent (FTLE) analysis.
Atomic benchmark suite showing drift can act as an early warning before direct symmetry detection in gradual-breaking regimes, with reversal controls, finite-budget sensitivity tests, and exact alarm-time validation.