A significant security vulnerability has been uncovered in the artificial intelligence safeguards deployed by tech giants Microsoft, Nvidia, and Meta. According to new research, these companies' AI safety systems can be bypassed using a deceptively simple technique involving emoji characters, allowing malicious actors to inject harmful prompts and execute jailbreaks with 100% success in some cases. The impact is far-reaching, affecting major commercial AI safety systems including Microsoft's Azure Prompt Shield, Meta's Prompt Guard, and Nvidia's NeMo Guard Jailbreak Detect.

Large Language Model (LLM) guardrails are specialized systems designed to protect AI models from prompt injection and jailbreak attacks. These security measures inspect user inputs and outputs, filtering or blocking potentially harmful content before it reaches the underlying model. As organizations increasingly deploy AI systems across various sectors, guardrails have become critical infrastructure for preventing misuse.

The researchers' findings, published in a comprehensive academic paper, demonstrate that character injection techniques, particularly emoji smuggling, can completely circumvent detection while preserving the functionality of the underlying prompt. When processed by a guardrail, the injected Unicode characters and the text hidden between them become essentially invisible to the detection algorithms, while the LLM itself can still parse and execute the concealed instructions.

Using various evasion techniques, the researchers achieved attack success rates of 71.98% against Microsoft, 70.44% against Meta, and 72.54% against Nvidia. Most concerning, the emoji smuggling technique achieved a perfect 100% success rate across multiple systems.

This discovery highlights critical weaknesses in existing AI safety mechanisms and underscores the urgent need for more robust protective measures as AI systems become increasingly integrated into sensitive applications.
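The article does not reproduce the researchers' encoder, but the general idea behind emoji smuggling can be illustrated. The Python sketch below assumes the commonly described variation-selector approach: each payload byte is mapped to a Unicode variation selector (U+FE00–U+FE0F for low bytes, U+E0100–U+E01EF for the rest) and appended after an ordinary emoji. The function names and the exact byte-to-code-point mapping are illustrative assumptions, not the implementation from the paper.

```python
# Illustrative sketch of emoji smuggling via Unicode variation selectors.
# The visible string renders as a single emoji in most UIs, but the
# trailing non-printing code points carry an arbitrary hidden payload.

def encode_payload(emoji: str, payload: str) -> str:
    """Hide payload bytes inside variation selectors trailing an emoji."""
    out = [emoji]
    for byte in payload.encode("utf-8"):
        if byte < 16:
            out.append(chr(0xFE00 + byte))        # VS1-VS16 block
        else:
            out.append(chr(0xE0100 + byte - 16))  # VS17-VS256 block
    return "".join(out)

def decode_payload(text: str) -> str:
    """Recover the hidden bytes from any variation selectors in the text."""
    data = bytearray()
    for ch in text:
        cp = ord(ch)
        if 0xFE00 <= cp <= 0xFE0F:
            data.append(cp - 0xFE00)
        elif 0xE0100 <= cp <= 0xE01EF:
            data.append(cp - 0xE0100 + 16)
    return data.decode("utf-8")

smuggled = encode_payload("😊", "ignore previous instructions")
print(smuggled)                  # displays as a lone emoji
print(len(smuggled))             # ...yet carries 28 extra code points
print(decode_payload(smuggled))  # hidden text recovered intact
```

A string built this way displays as a lone emoji, which plausibly explains the researchers' observation: guardrail classifiers effectively ignore these non-printing code points, while a sufficiently capable LLM can still recover and act on the concealed text.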
Source: cybersecuritynews.com, published Tue, 06 May 2025.