Goblin
News
AI news by
promptgoblins.ai
Filtered by:
distributed-training
April
7
AutoSP: DeepSpeed Tool Automates Sequence Parallelism for Long-Context LLM Training
Infra
1
Apr 30
6
Expert Upcycling: Expanding MoE Models Mid-Training Cuts GPU Costs by 32–67%
Research
1
Apr 24
7
Google DeepMind's Decoupled DiLoCo Trains 12B-Parameter Models Across Regions at Standard Internet Speeds
Research
1
Apr 23
6
Meta Hits 90%+ Training Efficiency with ETT% Framework
Infra
1
Apr 21
6
Monarch: PyTorch Framework Brings Supercomputer Control to Python API
Infra
1
Apr 9
7
TorchTPU: Google Releases Native PyTorch Integration for TPU Infrastructure at Scale
Infra
1
Apr 8
6
TGS and AWS Cut Seismic Foundation Model Training from 6 Months to 5 Days
Enterprise
1
Apr 6
March
7
NanoGPT Slowrun Achieves 10x Data Efficiency via Ensembling
Research
1
Mar 20
7
Claude Code Runs 910 ML Experiments on 16-GPU Cluster in 8 Hours
Research
1
Mar 20
8
Covenant-72B: First Open Permissionless Distributed LLM Pre-Training at Scale
Research
1
Mar 16
7
AWS + llm-d Bring Disaggregated LLM Inference to SageMaker and EKS
Infra
1
Mar 16
Last Week
6
TRL Adds Delta Weight Sync to Cut Async RL Transfer Costs by ~98%
Open Source
1
5d ago
2 Weeks Ago
7
HRM-Text: Full Foundation Model Pretraining Framework for Under $1,500
Open Source
1
May 19
7
Aurora Optimizer Fixes Muon Neuron Death Bug, Sets New Speedrun SoTA
Research
1
May 18
7
Lighthouse Attention: 17× Faster Long-Context Training via Hierarchical Selection
Research
1
May 18
6
Pretraining Failure Modes: MoE Causality Bugs and FP16 Precision Errors
Infra
1
May 18
3 Weeks Ago
7
Datadog Releases Toto 2.0: Scalable Open-Weights Time Series Models
Open Source
1
May 15
7
PyTorch 2.12: 100x Eigendecomp Speedup, Unified Graph API
Open Source
1
May 14
Last Month
7
NVIDIA Spectrum-X Adopts MRC Protocol for Gigascale AI Networking
Infra
1
May 7
6
Unsloth + NVIDIA Collaboration Cuts LLM Training Time by ~25%
Infra
1
May 7
Signal
Title
Category
Sources
Posted