Width-Induced Functional Redundancy in Large Language Models
Why Pruning, Early-Exit, and Quantization Leave “Low-Resolution Reasoning” Intact
DOI: https://doi.org/10.31224/6261

Keywords: large language models, functional redundancy, model pruning, overparameterization, low-resolution inference, model width

Abstract
Recent observations show that large language models (LLMs) often retain non-trivial functionality even after aggressive pruning, early exit, sparse routing, or low-precision quantization. While each of these phenomena is individually well-documented, a unifying explanation remains under-articulated.
This research note proposes a conceptual framework linking model width to functional redundancy, arguing that increasing width induces overlapping approximations of similar functions distributed across many parameters. As a result, partial removal or degradation of components does not immediately eliminate functionality but instead yields a low-resolution inference regime.
We formalize this intuition through definitions, hypotheses, and testable predictions, and outline experimental designs to validate or falsify the framework.
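The core intuition can be illustrated outside of LLMs with a toy experiment. The sketch below (not from the paper; all names, widths, and parameters are illustrative assumptions) fits random-ReLU-feature ridge regressions of two widths to the same target, then randomly zeroes half of each model's coefficients with dropout-style rescaling. Because the wide model spreads its solution across many overlapping features, its predictions are expected to degrade more gracefully under pruning than the narrow model's:

```python
# Toy illustration of width-induced redundancy (illustrative sketch, not the
# paper's method): wide random-feature regressors degrade more gracefully
# under random coefficient pruning than narrow ones.
import numpy as np

rng = np.random.default_rng(0)

def random_feature_model(x, y, width, ridge=1e-2):
    """Fit ridge regression on `width` random ReLU features of x."""
    W = rng.normal(size=(width, 1))
    b = rng.normal(size=width)
    phi = np.maximum(0.0, x @ W.T + b)                      # (n, width)
    coef = np.linalg.solve(phi.T @ phi + ridge * np.eye(width), phi.T @ y)
    return W, b, coef

def mse_after_pruning(x, y, W, b, coef, keep_frac=0.5, trials=20):
    """Average MSE after zeroing a random (1 - keep_frac) fraction of
    coefficients, with dropout-style rescaling to keep predictions unbiased."""
    phi = np.maximum(0.0, x @ W.T + b)
    errs = []
    for _ in range(trials):
        mask = rng.random(coef.size) < keep_frac
        pred = phi @ (coef * mask / keep_frac)
        errs.append(np.mean((pred - y) ** 2))
    return float(np.mean(errs))

x = np.linspace(-3, 3, 200).reshape(-1, 1)
y = np.sin(2 * x).ravel()

results = {}
for width in (32, 512):
    W, b, coef = random_feature_model(x, y, width)
    phi = np.maximum(0.0, x @ W.T + b)
    full_mse = float(np.mean((phi @ coef - y) ** 2))
    pruned_mse = mse_after_pruning(x, y, W, b, coef)
    results[width] = (full_mse, pruned_mse)
    print(f"width={width:4d}  full MSE={full_mse:.4f}  pruned MSE={pruned_mse:.4f}")
```

In this toy setting the narrow model's accuracy collapses when half its coefficients are removed, while the wide model retains a coarser but still usable approximation — the "low-resolution inference regime" in miniature. The analogy is of course loose: LLM pruning acts on structured components, not i.i.d. coefficients.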
License
Copyright (c) 2026 Seungmi Lee

This work is licensed under a Creative Commons Attribution 4.0 International License.