SqueezeBits
Fits on Chips: Saving LLM Costs Became Easier Than Ever
This article introduces Fits on Chips, an LLMOps toolkit for performance evaluation.
The Missing Piece of TensorRT-LLM
This article introduces an open-source library that converts PyTorch models directly to TensorRT-LLM.