Accurate KV Cache Eviction via Anchor Direction Projection for Efficient LLM Inference

Publication
The Thirty-Ninth Annual Conference on Neural Information Processing Systems
