People are using Super Mario to benchmark AI now | TechCrunch

Thought Pokémon was a tough benchmark for AI? One group of researchers argues that Super Mario Bros. is even tougher.

Hao AI Lab, a research org at the University of California San Diego, on Friday threw AI into live Super Mario Bros. games. It wasn’t quite the same version of Super Mario Bros. as the original: the game ran in an emulator and integrated with a framework, GamingAgent, to give the AIs control over Mario. GamingAgent, which Hao developed in-house, fed the AI in-game screenshots and basic instructions, like, “If an obstacle or enemy is near, move/jump left to dodge.” Still, Hao says that the game forced each model to “learn” to plan complex maneuvers and develop gameplay strategies.

Interestingly, the lab found that reasoning models like OpenAI’s o1, which “think” through problems step by step to arrive at solutions, performed worse than “non-reasoning” models, despite being generally stronger on most benchmarks. One of the main reasons reasoning models have trouble playing real-time games like this is that they take a while (seconds, usually) to decide on actions, according to the researchers. In Super Mario Bros., timing is everything.

Unlike the real world, games tend to be abstract and relatively simple, and they provide a theoretically infinite amount of data to train AI.
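The researchers' point about decision latency can be made concrete with a small sketch. It is not part of GamingAgent or the lab's setup: the 60 frames-per-second rate, the 30-frame window before an obstacle reaches Mario, and both function names are assumptions chosen purely for illustration.

```python
import math


def frames_elapsed(decision_latency_s: float, fps: int = 60) -> int:
    """Frames the game advances while the model is still deciding.

    fps=60 is an assumed frame rate, not a figure from the article.
    """
    return math.ceil(decision_latency_s * fps)


def reacts_in_time(decision_latency_s: float,
                   frames_until_collision: int,
                   fps: int = 60) -> bool:
    """True if the model's action arrives before the obstacle does."""
    return frames_elapsed(decision_latency_s, fps) <= frames_until_collision


if __name__ == "__main__":
    # Hypothetical scenario: an enemy reaches Mario in 30 frames.
    window = 30
    for latency in (0.25, 2.0):
        ok = reacts_in_time(latency, window)
        print(f"latency={latency}s frames={frames_elapsed(latency)} in_time={ok}")
```

Under these assumed numbers, a model that answers in a quarter of a second acts well inside the window, while one that deliberates for two seconds lets 120 frames pass and misses it, regardless of how good its eventual decision is.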

This Cyber News was published on techcrunch.com. Publication date: Thu, 06 Mar 2025 00:59:02 +0000

