48fecef47b19fe501d27d338b6d52582@2024@MLSYS

Total: 1

#1 Keyformer: KV Cache Reduction through Key Tokens Selection for Efficient Generative Inference

Authors: Muhammad Adnan ; Akhil Arunkumar ; Gaurav Jain ; Prashant Nair ; Ilya Soloveychik ; Purushotham Kamath

No summary was provided.