Enhancing MP-MLP with Patch Mixing
DOI:
https://doi.org/10.31224/6815
Keywords:
lightweight vision model, MLP, patch mixing, image classification, efficient deep learning
Abstract
Lightweight vision architectures are important for image classification in resource-constrained environments. Among CNN-free approaches, multi-layer perceptron (MLP)-based models provide a simple and computationally efficient alternative. MP-MLP is a lightweight vision model that divides an image into non-overlapping micro-patches and applies a shared MLP to each patch independently. While this design is simple and efficient, it does not explicitly model interactions across patches before classification. To address this limitation, we introduce a simple patch mixing module inspired by the token-mixing idea of MLP-Mixer. The proposed module is applied after local patch encoding and performs mixing along the patch dimension through a lightweight MLP block. Experimental results on MNIST and SVHN show that the proposed method consistently improves performance over the baseline MP-MLP, with especially large gains on the more complex SVHN dataset. These results suggest that patch mixing is an effective way to enhance lightweight MLP-based vision models.
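The pipeline described in the abstract (micro-patch splitting, a shared per-patch MLP, then mixing along the patch dimension) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the 4x4 patch size, the layer widths, the residual connection, the mean pooling, and the random untrained weights are all assumptions chosen for illustration, and normalization layers such as those used in MLP-Mixer are omitted.

```python
import numpy as np


def split_into_patches(image, patch):
    """Split an (H, W) image into non-overlapping patch x patch blocks.

    Returns an array of shape (num_patches, patch * patch), row-major over
    the patch grid.
    """
    h, w = image.shape
    if h % patch or w % patch:
        raise ValueError("image sides must be divisible by the patch size")
    blocks = image.reshape(h // patch, patch, w // patch, patch)
    return blocks.transpose(0, 2, 1, 3).reshape(-1, patch * patch)


def init_params(rng, num_patches, patch_dim,
                embed_dim=32, mix_hidden=64, num_classes=10):
    """Random weights for the sketch; all sizes are hypothetical."""
    def dense(n_in, n_out):
        weight = rng.normal(0.0, n_in ** -0.5, size=(n_in, n_out))
        return weight, np.zeros(n_out)

    return {
        "encode": dense(patch_dim, embed_dim),      # shared per-patch MLP
        "mix_in": dense(num_patches, mix_hidden),   # patch mixing, layer 1
        "mix_out": dense(mix_hidden, num_patches),  # patch mixing, layer 2
        "classify": dense(embed_dim, num_classes),
    }


def forward(image, params, patch=4):
    """Local patch encoding, then patch mixing, then classification."""
    relu = lambda x: np.maximum(x, 0.0)

    # 1. Local encoding: the same weights are applied to every patch,
    #    so no information crosses patch boundaries here.
    x = split_into_patches(image, patch)            # (P, patch * patch)
    w, b = params["encode"]
    z = relu(x @ w + b)                             # (P, D)

    # 2. Patch mixing: transpose so the MLP acts along the patch axis,
    #    letting each feature channel combine values from all patches.
    w1, b1 = params["mix_in"]
    w2, b2 = params["mix_out"]
    mixed = relu(z.T @ w1 + b1) @ w2 + b2           # (D, P)
    z = z + mixed.T                                 # residual (assumption)

    # 3. Pool over patches and classify (mean pooling is an assumption).
    w, b = params["classify"]
    return z.mean(axis=0) @ w + b                   # (num_classes,)


rng = np.random.default_rng(0)
image = rng.random((28, 28))                        # MNIST-sized input
params = init_params(rng, num_patches=49, patch_dim=16)
logits = forward(image, params)
```

With a 28x28 input and 4x4 patches there are 49 patches, so the mixing MLP in this sketch maps 49 values to 64 and back to 49 for each feature channel. Dropping step 2 leaves a patch-independent model of the kind the abstract attributes to the baseline MP-MLP.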
License
Copyright (c) 2026 Taehyeon Kim

This work is licensed under a Creative Commons Attribution 4.0 International License.