DeepSpeed (@DeepSpeedAI) / X

110 posts

DeepSpeed

@DeepSpeedAI

Official account for DeepSpeed, a library that enables unprecedented scale and speed for deep learning training + inference. 日本語 : @DeepSpeedAI_JP

Joined May 2020

Pinned
DeepSpeed
@DeepSpeedAI
Jun 3
We now have native support for all ZeRO stages 1/2/3 for Muon Optimizers, providing superior performance on LLM pre-training and post-training. Feel free to try it out, kudos to @PKUWZP Guokai Ma, Peng Du and Chi for the contribution!
PyTorch
@PyTorch
Jun 3
DeepSpeed now supports the Muon Optimizer. Optimized specifically for internal 2D weights within neural networks, Muon is gaining traction for its significant memory savings and strong convergence metrics during LLM training. In our latest blog post, the DeepSpeed team shares a
1.5K
DeepSpeed
@DeepSpeedAI
Nov 3, 2023
Introducing DeepSpeed-FastGen 🚀 Serve LLMs and generative AI models with:
- 2.3x higher throughput
- 2x lower average latency
- 4x lower tail latency
with Dynamic SplitFuse batching. Also: Auto TP, load balancing with perfect linear scaling, plus an easy-to-use API. github.com/microsoft/Deep…
113K
DeepSpeed
@DeepSpeedAI
Apr 11, 2023
Want to train 10B+ ChatGPT-style models on a single GPU and 100B+ models on multi-GPU systems? Introducing DeepSpeed-Chat, an easy (single-script), fast, and low-cost solution for training high-quality ChatGPT-style models with RLHF, 15x faster than SoTA. Blog: github.com/microsoft/Deep…
167K
DeepSpeed
@DeepSpeedAI
Jan 19, 2024
Introducing Mixtral, Phi2, Falcon, and Qwen support in #DeepSpeed-FastGen!
- Up to 2.5x faster LLM inference
- Optimized SplitFuse and token sampling
- Exciting new features like a RESTful API and more!
For more details: github.com/microsoft/Deep… #DeepSpeed #AI
50K
DeepSpeed
@DeepSpeedAI
Apr 22, 2023
DeepSpeed-Chat aims to provide a highly efficient pipeline to help you explore RLHF training. To that end, we are releasing training logs and our experiences in a new tutorial: github.com/microsoft/Deep… (🧵 thread 1/3)
50K
DeepSpeed
@DeepSpeedAI
Nov 6, 2023
🚀 Announcing DeepSpeed ZeRO-Offload++
- 6x higher training throughput via collaborative CPU/GPU Twin-Flow 🔥
- Systematic optimizations with no loss of data precision
- Performance gains hold in both single-node and multi-node cases
github.com/microsoft/Deep…
51K
DeepSpeed
@DeepSpeedAI
Jul 17, 2023
DeepSpeed v0.10.0 release! Includes our ZeRO++ release, H100 support, and many bug fixes/updates. Special thanks to our wonderful community of contributors! ZeRO++ paper: arxiv.org/pdf/2306.10209… ZeRO++ blog: microsoft.com/en-us/research… v0.10.0 details: github.com/microsoft/Deep…
49K
DeepSpeed
@DeepSpeedAI
Jul 20, 2023
We recently finished a long-awaited sync between microsoft/Megatron-DeepSpeed and NVIDIA/Megatron-LM 🚀🚀🚀 This resulted in a ~10% throughput gain, together with support for FlashAttention (both 1 and 2) and Rotary Positional Embedding (RoPE)! Details:
GitHub - deepspeedai/Megatron-DeepSpeed: Ongoing research training transformer language models at...
From github.com
19K
DeepSpeed
@DeepSpeedAI
Sep 12, 2023
🚀 Exciting new updates on #DeepSpeed ZeRO-Inference with 20x faster generation!
- 4x lower memory usage through 4-bit weight quantization, with no code change needed.
- 4x larger batch sizes through KV cache offloading.
Available in DeepSpeed v0.10.3: aka.ms/z3-inference
18K
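
The 4x memory figure in the post above is consistent with storing each weight in 4 bits instead of 16, assuming an FP16 baseline (the post does not state the baseline). The following is a minimal sketch of generic symmetric 4-bit quantization in plain Python, not DeepSpeed's implementation; the function names `quantize_4bit` and `dequantize_4bit` are hypothetical, and the per-tensor scale is ignored in the size comparison.

```python
def quantize_4bit(weights):
    """Symmetric 4-bit quantization; packs two 4-bit codes into each byte."""
    # One scale for the whole list; fall back to 1.0 if all weights are zero.
    scale = max(abs(w) for w in weights) / 7 or 1.0
    # Map each weight to a signed integer in [-8, 7], stored as a 4-bit nibble.
    codes = [max(-8, min(7, round(w / scale))) & 0xF for w in weights]
    if len(codes) % 2:
        codes.append(0)  # pad so the codes fill whole bytes
    packed = bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))
    return packed, scale


def dequantize_4bit(packed, scale, n):
    """Unpack nibbles back to approximate float weights (first n values)."""
    out = []
    for byte in packed:
        for nibble in (byte >> 4, byte & 0xF):
            signed = nibble - 16 if nibble >= 8 else nibble
            out.append(signed * scale)
    return out[:n]


weights = [0.5, -1.0, 0.25, 0.75]
packed, scale = quantize_4bit(weights)
fp16_bytes = 2 * len(weights)           # 16 bits per weight (assumed baseline)
print(fp16_bytes / len(packed))         # size ratio of FP16 storage to packed 4-bit storage
print(dequantize_4bit(packed, scale, len(weights)))
```

The round trip is lossy: each restored weight can differ from the original by up to half a quantization step (`scale / 2`), which is why real systems typically use finer-grained, per-group scales.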
DeepSpeed
@DeepSpeedAI
Aug 23, 2023
Want to train a GPT-like model with 1-million-token context lengths (all 7 of the Harry Potter books!📚) on 64 GPUs? Announcing DeepSpeed-Ulysses 🚀 This release enables highly efficient and scalable LLM training with extremely long sequence lengths 🤯 github.com/microsoft/Deep…
16K