SqueezeBits
The Rise and Fall of ONNX (feat. PyTorch 2.0)
This article explores the rise and fall of ONNX, from its early success as a unifying standard for AI frameworks to its gradual shift into a niche tool in the era of PyTorch 2.0.
[vLLM vs TensorRT-LLM] #13. Vision-Language Models
This article provides a comparative analysis of serving vision-language models on vLLM and TensorRT-LLM.
[Intel Gaudi] #4. FP8 Quantization
In this blog series, we thoroughly evaluate Intel's AI accelerator, the Gaudi series, focusing on its performance, features, and usability.
[Intel Gaudi] #3. Performance Evaluation with SynapseAI v1.19
In this blog series, we thoroughly evaluate Intel's AI accelerator, the Gaudi series, focusing on its performance, features, and usability.