Goblin News
AI news by promptgoblins.ai
Filtered by: multimodal
April
7
Google DeepMind Launches AI Co-Clinician Research Initiative
Research
· 1 src · Apr 30
7
Eka Startup Demonstrates Generalized Robot Dexterity via VLA Models and Self-Supervised Free-Play Training
Research
· 1 src · Apr 29
7
AI Carb Estimation Study: 27,000 Queries Reveal Dangerous Variability in Diabetes Apps
Safety
· 1 src · Apr 29
7
SenseTime Releases SenseNova U1: Open-Source Image Model With Native Image Reasoning
Models
· 1 src · Apr 29
7
NVIDIA Nemotron 3 Nano Omni Launches: Unified Multimodal Model on SageMaker
Updated
Models
· 4 srcs · Apr 29
6
YouTube Tests 'Ask YouTube' AI Search Feature with Step-by-Step Text and Video Results
Products
· 1 src · Apr 28
6
Google Translate Turns 20: Pronunciation Practice and Gemini-Powered Live Translation
Products
· 1 src · Apr 28
6
Amazon Launches AI-Powered Audio Q&A on Product Pages
Products
· 1 src · Apr 28
7
Vision Banana: Image Generation Pretraining Achieves SOTA on Diverse Vision Tasks
Research
· 1 src · Apr 27
6
State of Efficient Video AI in 2026: Encoders, Edge Deployment, Scale Challenges
Research
· 1 src · Apr 27
9
OpenAI Releases GPT-5.5, GPT-5.5 Pro, and GPT Image 2: Full API Launch with NVIDIA Enterprise Rollout
Updated
Top
Models
· 5 srcs · Apr 29
6
X Launches Grok-Powered Custom Timelines Across 75+ Topic Categories
Products
· 1 src · Apr 23
7
Google Launches Generative AI Features for Enterprise Geospatial Analysis
Enterprise
· 1 src · Apr 22
8
ChatGPT Images 2.0: Native Text Rendering Drives Adoption — and Fraud Concerns
Updated
Top
Security
· 4 srcs · May 2
7
Alibaba Qwen3.5-Omni: SOTA Omnimodal Model Surpasses Gemini-3.1 Pro on Audio
Models
· 1 src · Apr 21
7
Claude Design Launch Adds AI Pressure on Figma's Non-Designer Base
Updated
Products
· 4 srcs · Apr 20
7
Amazon Nova Multimodal Embeddings Enables Native Video Semantic Search
Products
· 2 srcs · Apr 17
7
Google Upgrades AI Mode in Chrome with Side-by-Side Browsing and Cross-Tab Search
Updated
Products
· 4 srcs · Apr 17
Monday
8
NVIDIA Cosmos 3: Unified Omni-Model for Physical AI Reasoning and Action
Updated
Top
Models
· 3 srcs · 19h ago
Last Week
7
NVIDIA LocateAnything: Parallel Box Decoding Breaks VLM Grounding Speed-Accuracy Tradeoff
Research
· 1 src · 5d ago
7
Trajectory: Ex-Google DeepMind and Apple Researchers Target Visual AI with $50M Seed
Markets
· 1 src · 5d ago
6
Claude Voice Mode Adding Multilingual Support
Products
· 1 src · 5d ago
2 Weeks Ago
8
Google Adds Ads to AI Mode Search with Gemini-Powered Formats
Top
Products
· 2 srcs · May 21
7
Google Launches Gemini Omni Flash Multi-Input Video Generator
Updated
Models
· 4 srcs · 4d ago
7
LiteFrame Cuts Video LLM Inference Latency 35% with Compact Encoder
Research
· 1 src · May 21
6
WavFlow Generates Audio Directly in Raw Waveform Space
Research
· 1 src · May 21
9
Google I/O 2026: Gemini 3.5, World Models, and $190B AI Infrastructure Bet
Updated
Top
Models
· 30 srcs · May 24
7
ByteDance Releases Lance: Open-Source 3B Unified Multimodal Model
Updated
Open Source
· 2 srcs · May 21
8
Google Workspace Gets Voice AI, New Image App, and Personal Agent at I/O 2026
Top
Products
· 7 srcs · May 19
8
Alibaba Qwen Releases Wave: Open MoE, Image Gen, and Compact Vision Models
Top
Models
· 1 src · May 19
8
Anduril and Meta Building AI-Powered AR Smart Glasses for Military Combat
Top
Enterprise
· 1 src · May 18
3 Weeks Ago
6
Aronofsky Defends AI Filmmaking at Cannes, Details 1776 Series and Google DeepMind Collab
Creative
· 1 src · May 16
7
Thinking Machines Lab: Mira Murati Bets on Human-Collaborative AI Over Full Automation
Products
· 3 srcs · May 15
8
Meta Launches Muse Spark: New Foundational Model Powering Meta AI Across Apps and Glasses
Models
· 1 src · May 13
7
Perceptron Mk1: Video Analysis AI Model Priced 80-90% Below Frontier Rivals
Models
· 1 src · May 13
7
Alibaba Releases Qwen-Image-2.0: Unified Image Generation and Editing Model
Models
· 1 src · May 13
7
Tsinghua Study: Visual Generation Boosts AI Spatial Reasoning
Research
· 1 src · May 12
7
A²RD: Agentic Diffusion Architecture Achieves 30% Consistency Gains in Long-Form Video Generation
Research
· 1 src · May 12
7
Google Launches Gemini-Powered AI Mouse Pointer in Chrome and Googlebook
Products
· 1 src · May 12
6
Amazon Nova Multimodal Embeddings Enables Cross-Modal Manufacturing Document Search
Products
· 1 src · May 11
6
OpenAI Launches gpt-realtime-translate for Live Speech Interpretation
Models
· 1 src · May 11
Last Month
7
OpenAI Launches GPT-Realtime-2, Translate, and Whisper Voice Features in API
Products
· 1 src · May 8
8
AI2 Releases MolmoAct 2: Open Robotics Model Beats GPT-5 on Embodied Reasoning
Top
Models
· 1 src · May 6
6
Google Gemini API File Search Goes Multimodal with Metadata Filters and Citations
Updated
Products
· 2 srcs · May 10
6
OpenAI Launches ChatGPT for Intune, a Dedicated iOS App for Enterprise
Products
· 1 src · May 6
8
OpenAI Launches GPT-5.5 Instant as New Default ChatGPT Model
Top
Models
· 1 src · May 5
6
Meta Releases Tuna-2: Encoder-Free Multimodal Model Outperforms Predecessors
Models
· 1 src · May 5
6
Edit-R1: Verifier-Based Reinforcement Learning Framework Advances Image Editing
Research
· 1 src · May 4
8
Google Gemini 3.1 Pro Preview Takes Top Spot on Artificial Analysis Intelligence Index
Top
Models
· 1 src · May 1
6
GLM-5V-Turbo: Multimodal-Native Foundation Model for Agentic AI
Models
· 1 src · May 1