Goblin
News
AI news by
promptgoblins.ai
Filtered by:
red-teaming
April
7
OpenAI Restricts GPT-5.5 Cyber to Vetted Users After Mocking Anthropic for Same Move
Security
1
Apr 30
7
AI Jailbreakers: The Psychological Frontier of LLM Safety Testing
Safety
1
Apr 29
9
OpenAI Releases GPT-5.5, GPT-5.5 Pro, and GPT Image 2: Full API Launch with NVIDIA Enterprise Rollout
Updated
Models
5
Apr 29
6
Opinion: Critic Alleges Anthropic's Claude Mythos Security Claims Lack Verification
Security
1
Apr 23
7
AI Models Execute Convincing Social Engineering Attacks in Automated Simulations
Security
1
Apr 22
7
Deep Neural Lesion: Sign-Bit Flips Can Catastrophically Break AI Models
Security
1
Apr 22
8
Mozilla Fixed 271 Firefox Bugs via Anthropic's Mythos AI — Almost Zero False Positives
Updated
Security
10
May 9
8
LLM API Router Supply Chain Attacks: Systematic Study Finds Active Exploits in the Wild
Security
1
Apr 13
7
Researchers Expose Every Major AI Agent Benchmark as Trivially Exploitable
Research
1
Apr 11
7
Researchers Reverse-Engineer Google's SynthID Watermark, Achieve 91% Bypass Effectiveness
Security
1
Apr 10
9
Anthropic Claude Mythos Preview: UK AISI Independently Confirms Step-Change Cyber Capabilities with Hard Benchmarks
Updated
Security
19
Apr 14
8
Claude Code Uncovers 23-Year-Old Linux Kernel Vulnerability
Security
1
Apr 4
7
Califio Researchers Use Claude to Find RCE Zero-Days in Vim and Emacs
Security
1
Apr 4
7
CVE-2026-4747: FreeBSD RPCSEC_GSS Stack Buffer Overflow Remote Kernel RCE
Security
1
Apr 4
7
AWS Launches Frontier Agents for Autonomous Security Testing and Cloud Operations
Enterprise
1
Apr 3
March
6
Enclave Raises $6M to Detect Security Flaws in AI-Generated Code
Markets
1
Mar 26
7
Northeastern Study: OpenClaw AI Agents Manipulated Into Self-Sabotage via Social Engineering
Safety
1
Mar 25
6
Startup Pays $800/Day for 'AI Bully' to Expose Chatbot Memory Failures
Safety
1
Mar 20
8
Anthropic Hiring Weapons Expert to Prevent Catastrophic AI Misuse
Safety
1
Mar 17
7
Google Leads $12.5M AI-Era Open Source Security Pledge
Security
1
Mar 17
8
OpenAI Acquires Promptfoo, Embedding AI Security and Red Teaming into Its Platform
Markets
1
Mar 13
7
CodeWall Claims It Hacked McKinsey's Internal AI Platform Lilli
Security
1
Mar 13
Last Week
7
Champion Ethical Hacker Warns AI Tools May Obsolete Human Bug Hunters
Security
1
May 27
7
AI Models Are Flooding Bug Bounty Programs, Reshaping Cybersecurity Economics
Security
1
May 25
2 Weeks Ago
7
Anthropic Co-Founder Predicts Nobel-Winning AI Discovery Within 12 Months
Safety
1
May 21
9
Anthropic to Brief Global Finance Watchdog on Mythos AI Cyber Risks
Security
1
May 18
7
AudioHijack: Imperceptible Audio Attacks Hijack AI Voice Models with Up to 96% Success Rate
Security
1
May 18
3 Weeks Ago
8
Researchers Discover macOS Exploit via Techniques Derived From Testing Anthropic's Mythos
Security
1
May 16
6
Elite CTF Competitor Argues Frontier AI Has Broken Competitive Hacking Format
Security
1
May 16
9
AI System Autonomously Finds 18-Year-Old Critical RCE Bug in NGINX
Security
1
May 14
8
Microsoft MDASH Multi-Agent System Tops CyberGym Cybersecurity Benchmark
Security
1
May 14
7
AI Safety Controls Remain Easy to Bypass, Researchers Warn
Safety
1
May 14
7
Palo Alto Networks: Frontier AI Finds Vulnerabilities at Unprecedented Scale With 3-5 Month Window to Respond
Security
1
May 13
7
Anthropic's Mythos AI Bug-Hunter Finds One Low-Severity Flaw in cURL, Drawing Mockery
Security
1
May 11
Last Month
7
First Formal Study Demonstrates AI Models Self-Replicating Across Networked Computers
Safety
1
May 7
8
Common Sense Media Launches Youth AI Safety Institute
Safety
1
May 5
7
Anthropic Red-Teams 'Jupiter V1' Ahead of May 6 Dev Conference
Models
1
May 4
8
Okta Research: AI Agents Bypass Guardrails and Leak Credentials in Real-World Tests
Security
1
May 2
8
GPT-5.5 Matches Anthropic Mythos on Cybersecurity Benchmarks
Security
1
May 1