“This incident happened because of human error and lasted longer than it should have because we didn’t have proper visibility into which credentials were being used by the Gateway Worker to authenticate with our storage infrastructure,” Cloudflare explained in its incident report.

The company disclosed that 100% of write operations and approximately 35% of read operations to its R2 object storage service failed during the incident window, which lasted 1 hour and 7 minutes.

The failure traced back to a credential rotation in which the new credentials never reached the production Gateway Worker. When the old credentials were subsequently deleted from Cloudflare’s storage infrastructure, the production R2 Gateway service, which serves as the API frontend for R2, lost authentication access to backend systems.

Cloudflare’s engineering team identified the root cause at 22:36 UTC, nearly an hour after the impact began, and restored service at 22:45 UTC by deploying the credentials to the correct production Worker.

The ripple effects extended to Email Security, Billing, Key Transparency Auditor, and Log Delivery services, with the latter experiencing processing delays of up to 70 minutes. Vectorize, Cloudflare’s vector database, saw 75% of queries fail and insert operations fail entirely.

The incident follows another hour-long Cloudflare outage in February, when an employee mistakenly disabled the entire R2 Gateway service while attempting to block a phishing URL. Security researchers reporting on the February event pointed out that the outage was caused by a lack of controls and validation checks for high-impact operations. The recurring nature of configuration-related outages underscores the industry-wide challenge of managing complex cloud infrastructure while maintaining rigorous security practices like credential rotation.
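The mechanism behind the failure is easier to picture with a small sketch. The TypeScript Worker below is a hypothetical illustration, not Cloudflare’s actual R2 Gateway code: it assumes a gateway-style Worker that reads storage credentials from per-environment secret bindings. The binding names (STORAGE_ACCESS_KEY_ID, STORAGE_SECRET_ACCESS_KEY, STORAGE_BACKEND_URL) and the header-based authentication scheme are invented for illustration. If a rotation deploys new keys to a non-production environment while production is left untouched, the production instance keeps presenting the old, now-deleted credentials and every backend call fails authentication.

```typescript
// Hypothetical gateway-style Worker: the bindings and auth header are
// illustrative assumptions, not Cloudflare's actual R2 Gateway implementation.

export interface Env {
  STORAGE_ACCESS_KEY_ID: string;     // secret, set separately per environment
  STORAGE_SECRET_ACCESS_KEY: string; // secret, set separately per environment
  STORAGE_BACKEND_URL: string;       // backend storage endpoint
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // A rotation that only updates a non-production environment leaves the
    // production Worker holding the old key pair; once those keys are deleted
    // upstream, every backend call from production fails authentication.
    if (!env.STORAGE_ACCESS_KEY_ID || !env.STORAGE_SECRET_ACCESS_KEY) {
      return new Response("storage credentials not configured", { status: 503 });
    }

    // Present whatever credentials this environment was actually deployed with.
    const backend = await fetch(env.STORAGE_BACKEND_URL, {
      headers: { "X-Access-Key-Id": env.STORAGE_ACCESS_KEY_ID },
    });

    // Surface a non-secret fragment of the key in use, so operators can tell
    // which credential set this environment is actually running with.
    console.log(
      `backend status=${backend.status} keyId=${env.STORAGE_ACCESS_KEY_ID.slice(0, 4)}...`
    );

    if (backend.status === 401 || backend.status === 403) {
      return new Response("backend rejected gateway credentials", { status: 502 });
    }
    return backend;
  },
};
```

Logging a non-secret fragment of the key identifier in use, as the sketch does, is the sort of operational visibility Cloudflare says it lacked during the incident.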