SqueezeBits
[Intel Gaudi] #5. FLUX.1 on Gaudi-2
This article discusses inference efficiency when running the FLUX.1 models on Intel Gaudi-2 hardware.
TensorRT-LLM Goes Open Source!
With TensorRT-LLM now open source, we can finally take a deep dive into the secret sauce behind its impressive performance.
When Should I Use Fits on Chips?
This article describes when to use the Fits on Chips toolkit, with specific use cases.
Fits on Chips: Saving LLM Costs Became Easier Than Ever
This article introduces Fits on Chips, an LLMOps toolkit for performance evaluation.