Goblin News
AI news by promptgoblins.ai
Filtered by: moe
April
Defense AI Lab Releases Laguna XS.2 and M.1 Agentic Coding Models, Including Open-Weight Apache 2.0
Signal 7 · Models · 1 src · Apr 29
Xiaomi Open-Sources MiMo-V2.5 Models Under MIT License for Agentic AI
Signal 8 · Models · 2 srcs · Apr 28
DeepSeek Slashes V4-Pro API Prices 75%, Intensifying War With US AI Labs
Signal 8 · Top · Models · 1 src · Apr 28
DeepSeek Launches V4 Flash and V4 Pro, Claims Frontier-Level Performance
Signal 9 · Updated · Top · Models · 5 srcs · May 4
Expert Upcycling: Expanding MoE Models Mid-Training Cuts GPU Costs by 32–67%
Signal 6 · Research · 1 src · Apr 24
AI2 Introduces BAR: Modular Post-Training via Branch-Adapt-Route
Signal 7 · Research · 1 src · Apr 21
Qwen3.6 Released with Agentic Coding and Thinking Preservation Features
Signal 6 · Models · 1 src · Apr 17
Warp Decode: 1.84x Faster MoE Inference by Flipping the Parallelism Axis on Blackwell GPUs
Signal 6 · Research · 1 src · Apr 8
Google Launches Gemma 4: Four Open Models with Frontier-Level Efficiency
Signal 8 · Top · Models · 6 srcs · Apr 3
March
NVIDIA Releases Nemotron-Cascade 2: Open 30B MoE Achieves Gold-Medal Reasoning with 20x Efficiency
Signal 8 · Models · 1 src · Mar 23
LLM Architecture Gallery: 11 Open Models Compared by Design
Signal 6 · Research · 1 src · Mar 23
Moonshot AI Releases Kimi K2 Open-Source Model and Kimi-Researcher Agent
Signal 8 · Top · Models · 1 src · Mar 20
NVIDIA Nemotron 3 Super Launches on Amazon Bedrock as Serverless Model
Signal 7 · Models · 1 src · Mar 19
Mistral Small 4: Unified 119B MoE Model Released Under Apache 2.0
Signal 7 · Models · 2 srcs · Mar 17
Open Weights vs. Open Training: Fine-Tuning Large MoE Models Remains Practically Inaccessible
Signal 6 · Research · 1 src · Mar 13
3 Weeks Ago
NVIDIA Nemotron 3 Ultra (550B) Available on Amazon SageMaker JumpStart
Signal 7 · Models · 1 src · Jun 4
JetBrains Releases Mellum2: Open-Source 12B MoE with Instruct and Thinking Variants for Developer AI
Signal 7 · Updated · Models · 2 srcs · Jun 2
Last Month
Liquid AI Releases LFM2.5-8B-A1B: Edge MoE Model Trained on 38T Tokens
Signal 7 · Models · 1 src · May 30
2,000-Run Study Identifies Optimal Mixture-of-Experts Config Rules
Signal 7 · Research · 1 src · May 21
Alibaba Qwen Releases Wave: Open MoE, Image Gen, and Compact Vision Models
Signal 8 · Top · Models · 1 src · May 19
Pretraining Failure Modes: MoE Causality Bugs and FP16 Precision Errors
Signal 6 · Infra · 1 src · May 18
SlimQwen: Alibaba Compresses 80B MoE Model to 23B via Pruning and Distillation
Signal 7 · Research · 1 src · May 15
EMO: New MoE Model Achieves Emergent Modularity Without Human-Defined Domains
Signal 7 · Updated · Research · 2 srcs · May 11