According to benchmarks shared by Stagehand, a production-ready browser automation framework, Gemini 2.0 Flash has the lowest error rate (6.67%) and the highest exact-match score (90%), and it is also cheap and fast. GPT-4.1, by contrast, has a higher error rate (16.67%) and costs more than 10 times as much as Gemini 2.0 Flash. In other data shared by Pierre Bongrand, a scientist working on RNA at Harvard, GPT-4.1 offers poorer cost-effectiveness than competing models. According to the benchmarks, the new GPT-4.1 models are far better than the existing GPT-4o and GPT-4o mini, particularly in coding, but models like Gemini 2.0 Flash, Gemini 2.5 Pro, and even DeepSeek or o3 mini sit on or near the cost-performance frontier, which suggests they deliver higher performance at a lower or comparable cost. Coding benchmarks tell a similar story: Aider Polyglot lists GPT-4.1 at a 52% score, while Gemini 2.5 is well ahead at 73%. Yesterday, OpenAI confirmed that developers with API access can try three new models: GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano.
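For readers who want to compare the quoted figures side by side, they can be collected in a short Python sketch. The numbers are exactly those reported above by Stagehand and Aider Polyglot; the dictionary layout and helper function are illustrative, not part of either benchmark's tooling:

```python
# Stagehand browser-automation benchmark figures quoted in the article
# (error rate in %, exact-match score in % where reported).
stagehand = {
    "Gemini 2.0 Flash": {"error_rate": 6.67, "exact_match": 90},
    "GPT-4.1": {"error_rate": 16.67},
}

# Aider Polyglot coding benchmark scores quoted in the article (%).
aider_polyglot = {
    "GPT-4.1": 52,
    "Gemini 2.5": 73,
}

def lowest_error(results):
    """Return the model name with the lowest reported error rate."""
    return min(results, key=lambda model: results[model]["error_rate"])

print(lowest_error(stagehand))                                 # Gemini 2.0 Flash
print(aider_polyglot["Gemini 2.5"] - aider_polyglot["GPT-4.1"])  # 21
```

On these figures, Gemini 2.0 Flash leads Stagehand's error-rate comparison, and Gemini 2.5 outscores GPT-4.1 on Aider Polyglot by 21 percentage points.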
Published on www.bleepingcomputer.com, Tue, 15 Apr 2025 21:25:10 +0000.