LatentRecurrentDepthLM: An Open-Source Framework for Recurrent-Depth Language Models with Controllable Test-Time Compute
DOI:
https://doi.org/10.31224/6512
Keywords:
Recurrent-depth language models, latent reasoning, test-time compute scaling, PyTorch, Hugging Face, Transformers, Open-Source
Abstract
LatentRecurrentDepthLM is a modular, production-ready open-source framework that implements a hybrid recurrent-depth language model. The model decouples effective reasoning depth from parameter count by iterating a single weight-shared block over a continuous latent state, which allows test-time compute to be scaled without generating intermediate tokens or modifying model weights. Built in PyTorch with full Hugging Face Transformers compatibility, the framework provides end-to-end pipelines for dataset preparation, tokenization, training with randomized iteration depth and cosine scheduling, autoregressive generation with temperature and top-k sampling, and one-command Hub deployment via a custom PreTrainedModel subclass. This paper documents the software architecture, core algorithms, training and inference workflows, practical use cases, and comparisons with related tools. It also connects the framework to recent advances in recurrent-depth and latent reasoning research, with the aim of serving researchers, educators, and practitioners who explore parameter-efficient sequence modeling. The repository (codewithdark-git/LatentRecurrentDepthLM) and the Hugging Face model checkpoint are released under the MIT license.
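The central idea in the abstract, reusing one weight-shared block for a variable number of iterations over a latent state, can be illustrated with a dependency-free toy sketch. This is not the framework's PyTorch implementation: the function names, the tanh update rule, the zero-initialized state, and the way the input embedding is re-injected at every step are all assumptions made for illustration.

```python
import math
import random


def make_shared_block(dim, seed=0):
    """Create one weight matrix and bias that every iteration reuses."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(dim)
    weights = [[rng.uniform(-scale, scale) for _ in range(dim)] for _ in range(dim)]
    bias = [0.0] * dim
    return weights, bias


def apply_block(block, state, injected):
    """One recurrent step (assumed form): tanh(W @ (state + injected) + b)."""
    weights, bias = block
    mixed = [s + e for s, e in zip(state, injected)]
    return [
        math.tanh(sum(w * x for w, x in zip(row, mixed)) + b)
        for row, b in zip(weights, bias)
    ]


def recurrent_depth_forward(block, embedded, num_iterations):
    """Apply the same block num_iterations times to a continuous latent state."""
    state = [0.0] * len(embedded)  # assumed initial latent state
    for _ in range(num_iterations):
        state = apply_block(block, state, embedded)
    return state


def parameter_count(block):
    """Count parameters; the result does not depend on the iteration count."""
    weights, bias = block
    return sum(len(row) for row in weights) + len(bias)


block = make_shared_block(dim=4)
embedded = [0.5, -0.25, 0.1, 0.9]  # hypothetical stand-in for an embedded token

# Same parameters, different effective depth chosen at call time.
shallow = recurrent_depth_forward(block, embedded, num_iterations=2)
deep = recurrent_depth_forward(block, embedded, num_iterations=16)
```

In this sketch, num_iterations plays the role of the test-time compute knob: raising it deepens the computation while parameter_count(block) stays fixed, and no intermediate tokens are produced between iterations.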
License
Copyright (c) 2026 Ahsan Umar

This work is licensed under a Creative Commons Attribution 4.0 International License.