Goblin News
AI news by promptgoblins.ai
Filtered by: inference-compute
April
8 · Alphabet Cloud Hits $20B With 63% Growth, But $462B Backlog Shows Demand Far Exceeds Capacity · Markets · 5 srcs · Apr 30 · Updated · Top
7 · AI Agent Evaluation Costs Surge to $40K+ Per Run, Becoming a New Compute Bottleneck · Research · 1 src · Apr 29
7 · Meta Signs Deal for Space-Based Solar Power Beamed to Earth at Night · Infra · 1 src · Apr 27
7 · Test-Time Scaling Breakthrough Pushes Coding Agents Past 77% on SWE-Bench · Research · 1 src · Apr 27
8 · Meta Signs Deal for Millions of AWS Graviton CPUs for AI Workloads · Infra · 3 srcs · Apr 27 · Updated · Top
6 · Jefferies: AI Compute Demand Outstrips Supply, Hyperscalers Remain Top Beneficiaries · Infra · 1 src · Apr 24
6 · Expert Upcycling: Expanding MoE Models Mid-Training Cuts GPU Costs by 32–67% · Research · 1 src · Apr 24
8 · Google Launches 8th-Gen TPUs at Cloud Next: Two Purpose-Built Chips for the Agentic AI Era · Infra · 6 srcs · Apr 22 · Updated · Top
7 · Morgan Stanley: Agentic AI to Boost CPU and Memory Spending Beyond GPUs · Markets · 1 src · Apr 20
7 · Google in Talks With Marvell to Build Custom AI Inference Chips · Infra · 1 src · Apr 20
6 · Moonshot AI Proposes Cross-Datacenter LLM Serving via Prefill-as-a-Service · Research · 1 src · Apr 20
6 · China's Token Economy Mints New AI Stock Winners, Bypassing Tech Giants · Markets · 1 src · Apr 20
8 · OpenAI Agrees to $20B+ Cerebras Chip Deal with Equity Stake, Doubling Prior Commitment · Markets · 1 src · Apr 17 · Top
7 · AI Compute Scarcity: Blackwell GPU Prices Surge 114% in Six Weeks as Frontier Access Tightens · Infra · 2 srcs · Apr 28 · Updated
3 Weeks Ago
7 · NVIDIA DSX: End-to-End Platform for Designing and Operating AI Factories · Products · 1 src · Jun 1
Last Month
7 · XCENA Raises $135M to Put Compute Inside Memory, Cutting AI Server Overhead · Markets · 1 src · May 29
6 · General Compute Raises $15M Seed to Build SambaNova-Powered Inference Neocloud · Infra · 1 src · May 28
6 · NVIDIA CompileIQ: AI-Powered Compiler Auto-Tuning Lands in CUDA 13.3 · Infra · 1 src · May 27
7 · LiteFrame Cuts Video LLM Inference Latency 35% with Compact Encoder · Research · 1 src · May 21
7 · OpenAI Launches Guaranteed Capacity for Enterprise Compute Access · Products · 1 src · May 20
7 · LongLive 2.0: Real-Time Long Video Generation at 45.7 FPS · Open Source · 1 src · May 20
6 · Amazon SageMaker AI Adds Bidirectional Streaming for Real-Time Voice Apps · Products · 1 src · May 20
7 · NVIDIA Vera Rubin Enters Full Production: Pod-Scale AI Factories Ramping Worldwide · Infra · 2 srcs · Jun 1 · Updated
8 · NVIDIA Vera CPU: First Agentic AI Processor Delivered to Anthropic, OpenAI, SpaceXAI, and Oracle; Benchmarks Confirm Claims · Infra · 3 srcs · May 27 · Updated · Top
7 · Lighthouse Attention: 17× Faster Long-Context Training via Hierarchical Selection · Research · 1 src · May 18
6 · Async Continuous Batching Eliminates 24% GPU Idle Time in LLM Inference · Research · 1 src · May 15
7 · PyTorch 2.12: 100x Eigendecomp Speedup, Unified Graph API · Open Source · 1 src · May 14
7 · Modal Explains Four Ingredients for Serverless GPU Scaling · Infra · 1 src · May 13
8 · CME Group and Silicon Data to Launch First Futures Market for AI Computing Power · Markets · 1 src · May 12 · Top
6 · Grok App Downloads Plummet 60% as xAI Pivots Toward Orbital Compute · Markets · 1 src · May 12
6 · Meta IKBO: Kernel-Level Broadcast Elimination Cuts RecSys Latency by Two-Thirds · Infra · 1 src · May 8
7 · TokenSpeed: Compiler-Backed LLM Inference Engine Built for Agentic Coding Workloads · Infra · 1 src · May 7
8 · Anthropic Signs $1.25B/Month Compute Deal With xAI's Colossus 1 Data Center · Infra · 8 srcs · May 20 · Updated · Top
8 · Anthropic Partners with SpaceX, Doubles Claude Usage Limits · Infra · 1 src · May 6 · Top
6 · LLM Weights in BF16 Carry Only 10.6 of 16 Allocated Bits · Research · 1 src · May 6
6 · SageMaker AI Adds Automatic Instance Fallback for GPU Capacity Gaps · Products · 1 src · May 4
7 · Speculative Decoding Cuts RL Post-Training Rollout Time by Up to 2.5x · Research · 1 src · May 1
6 · KV Cache Locality: How Load Balancing Drives Up LLM Serving Costs · Infra · 1 src · May 1
6 · SMG: Rust Gateway Disaggregates CPU Work from GPU Inference to Kill GIL Bottleneck · Infra · 1 src · May 1
Columns: Signal · Title · Category · Sources · Posted