INT8 Activation Ternary or Binary Weights Networks: Unifying Between INT8 and Lower-bit Width Quantization

Ninnart Fuengfusin1 , Hakaru Tamukoh2
1Graduate School of Life Science and Systems Engineering, Kyushu Institute of Technology, 2-4 Hibikino, Wakamatsu-ku, Kitakyushu, Fukuoka, 808-0196, Japan
2Research Center for Neuromorphic AI Hardware, Graduate School of Life Science and Systems Engineering, Kyushu Institute of Technology, 2-4 Hibikino, Wakamatsu-ku, Kitakyushu, Fukuoka, 808-0196, Japan
pp. 171–176
ABSTRACT
This paper proposes convolutional neural networks with ternary or binary weights and 8-bit integer activations. Our proposed model serves as a middle ground between 8-bit integer quantized models and models quantized below 8-bit precision. Our empirical experiments establish that conventional 1-bit and 2-bit weight-only quantization methods (i.e., BinaryConnect and ternary weight networks) can be used jointly with 8-bit integer activation quantization. We evaluate our approach with a VGG16-like model on the CIFAR10 and CIFAR100 datasets. Our models show results competitive with the standard 32-bit floating-point model.
Keywords: Quantization, Image recognition, Model compression
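The quantization schemes named in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes the usual BinaryConnect sign-based binarization, the common ternary-weight-network threshold of 0.7 times the mean absolute weight, and symmetric per-tensor INT8 activation quantization.

```python
import numpy as np

def binarize(w):
    """BinaryConnect-style weight binarization: w -> {-1, +1}."""
    return np.where(w >= 0, 1.0, -1.0)

def ternarize(w):
    """Ternary-weight-network-style quantization: w -> {-a, 0, +a}.

    Assumes the commonly used threshold delta = 0.7 * mean(|w|); the
    scaling factor a is the mean magnitude of the surviving weights.
    """
    delta = 0.7 * np.mean(np.abs(w))
    mask = np.abs(w) > delta
    a = np.mean(np.abs(w[mask])) if mask.any() else 0.0
    return np.sign(w) * mask * a

def quantize_act_int8(x):
    """Symmetric INT8 activation quantization with a per-tensor scale."""
    max_abs = np.max(np.abs(x))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -128, 127).astype(np.int8)
    return q, scale
```

In such a scheme, the forward pass uses the quantized weights and activations, while full-precision shadow weights are kept for the gradient update (the straight-through estimator), which is the standard training recipe for BinaryConnect-style methods.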

© 2022 The Author. Published by Sugisaka Masanori at ALife Robotics Corporation Ltd. This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

ARTICLE INFO
Article History
Received 15 December 2021
Accepted 01 July 2022

