Width-Induced Functional Redundancy in Large Language Models
Why Pruning, Early-Exit, and Quantization Leave “Low-Resolution Reasoning” Intact
DOI: https://doi.org/10.31224/6261

Keywords: large language models, functional redundancy, model pruning, overparameterization, low-resolution inference, model width

Abstract
Recent observations show that large language models (LLMs) often retain non-trivial functionality even after aggressive pruning, early exit, sparse routing, or low-precision quantization. While each of these phenomena is individually well-documented, a unifying explanation remains under-articulated.
This research note proposes a conceptual framework linking model width to functional redundancy, arguing that increasing width induces overlapping approximations of similar functions distributed across many parameters. As a result, partial removal or degradation of components does not immediately eliminate functionality but instead yields a low-resolution inference regime.
We formalize this intuition through definitions, hypotheses, and testable predictions, and outline experimental designs to validate or falsify the framework.
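The core intuition can be illustrated outside of LLMs with a toy experiment. The sketch below (not from the paper; all names, widths, and parameters are illustrative assumptions) fits random-ReLU-feature ridge regressions of two widths to the same target, then randomly zeroes half of each model's coefficients with dropout-style rescaling. Because the wide model spreads its solution across many overlapping features, its predictions are expected to degrade more gracefully under pruning than the narrow model's:

```python
# Toy illustration of width-induced redundancy (illustrative sketch, not the
# paper's method): wide random-feature regressors degrade more gracefully
# under random coefficient pruning than narrow ones.
import numpy as np

rng = np.random.default_rng(0)

def random_feature_model(x, y, width, ridge=1e-2):
    """Fit ridge regression on `width` random ReLU features of x."""
    W = rng.normal(size=(width, 1))
    b = rng.normal(size=width)
    phi = np.maximum(0.0, x @ W.T + b)                      # (n, width)
    coef = np.linalg.solve(phi.T @ phi + ridge * np.eye(width), phi.T @ y)
    return W, b, coef

def mse_after_pruning(x, y, W, b, coef, keep_frac=0.5, trials=20):
    """Average MSE after zeroing a random (1 - keep_frac) fraction of
    coefficients, with dropout-style rescaling to keep predictions unbiased."""
    phi = np.maximum(0.0, x @ W.T + b)
    errs = []
    for _ in range(trials):
        mask = rng.random(coef.size) < keep_frac
        pred = phi @ (coef * mask / keep_frac)
        errs.append(np.mean((pred - y) ** 2))
    return float(np.mean(errs))

x = np.linspace(-3, 3, 200).reshape(-1, 1)
y = np.sin(2 * x).ravel()

results = {}
for width in (32, 512):
    W, b, coef = random_feature_model(x, y, width)
    phi = np.maximum(0.0, x @ W.T + b)
    full_mse = float(np.mean((phi @ coef - y) ** 2))
    pruned_mse = mse_after_pruning(x, y, W, b, coef)
    results[width] = (full_mse, pruned_mse)
    print(f"width={width:4d}  full MSE={full_mse:.4f}  pruned MSE={pruned_mse:.4f}")
```

In this toy setting the narrow model's accuracy collapses when half its coefficients are removed, while the wide model retains a coarser but still usable approximation — the "low-resolution inference regime" in miniature. The analogy is of course loose: LLM pruning acts on structured components, not i.i.d. coefficients.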
License
Copyright (c) 2026 Seungmi Lee

This work is licensed under a Creative Commons Attribution 4.0 International License.