Parallel Prompt Decoding: A Cost-Effective and Memory-Efficient Approach for Accelerating LLM Inference

Hao Mark Chen, Wayne Luk, Hongxiang Fan, Roberto Bondesan

Published in arXiv, 2024