A new era for AI maths whizzes


Alibaba Cloud’s Qwen team has unveiled Qwen2-Math, a series of large language models specifically designed to tackle complex mathematical problems.

These new models, built upon the existing Qwen2 foundation, demonstrate remarkable proficiency in solving arithmetic and mathematical challenges, and outperform former industry leaders.

The Qwen team built Qwen2-Math using a vast and diverse Mathematics-specific Corpus. This corpus comprises high-quality resources, including web texts, books, code, exam questions, and synthetic data generated by Qwen2 itself.

Rigorous evaluation on both English and Chinese mathematical benchmarks (including GSM8K, MATH, MMLU-STEM, CMATH, and GaoKao Math) revealed the exceptional capabilities of Qwen2-Math. Notably, the flagship model, Qwen2-Math-72B-Instruct, surpassed the performance of proprietary models such as GPT-4o and Claude 3.5 in various mathematical tasks.

[Image: Qwen2-Math benchmark results]

“Qwen2-Math-Instruct achieves the best performance among models of the same size, with RM@8 outperforming Maj@8, particularly in the 1.5B and 7B models,” the Qwen team noted.

This superior performance is attributed to the effective implementation of a math-specific reward model during the development process.
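For readers unfamiliar with the two metrics in the quote, the sketch below shows how they are commonly understood: Maj@8 takes the most frequent final answer among eight sampled responses, while RM@8 takes the response that a reward model scores highest. The sampled answers and reward scores are invented for illustration, and the function names are hypothetical; this is not the Qwen team's evaluation code.

```python
from collections import Counter


def maj_at_n(answers):
    """Majority voting: return the most frequent final answer among the samples."""
    return Counter(answers).most_common(1)[0][0]


def rm_at_n(answers, scores):
    """Reward-model selection: return the answer from the highest-scoring sample."""
    best_index = max(range(len(answers)), key=lambda i: scores[i])
    return answers[best_index]


# Eight hypothetical sampled answers to one problem (illustrative data only).
sampled_answers = ["12", "15", "12", "12", "15", "14", "12", "15"]
# Made-up reward-model scores, one per sample; a real reward model would produce these.
reward_scores = [0.21, 0.93, 0.30, 0.25, 0.88, 0.10, 0.27, 0.90]

print("Maj@8 picks:", maj_at_n(sampled_answers))
print("RM@8 picks:", rm_at_n(sampled_answers, reward_scores))
```

In this toy case the two strategies disagree: the most common answer is not the one the reward scores favour. If the reward model is well calibrated, that is the situation in which RM@8 can beat Maj@8.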

Further showcasing its prowess, Qwen2-Math demonstrated impressive results in challenging mathematical competitions such as the American Invitational Mathematics Examination (AIME) 2024 and the American Mathematics Competitions (AMC) 2023.

To ensure the model’s integrity and prevent contamination, the Qwen team implemented robust decontamination methods during both the pre-training and post-training phases. This rigorous approach involved removing duplicate samples and identifying overlaps with test sets to maintain the model’s accuracy and reliability.
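The article does not describe the exact decontamination procedure, so the following is only a minimal sketch of the general idea: drop exact duplicates, then drop any training sample that shares a word n-gram with a test-set item. The window size `n=5`, the whitespace tokenisation, and all function names are assumptions made for illustration.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivially different copies match."""
    return " ".join(text.lower().split())


def ngrams(text, n):
    """Return the set of word n-grams in a normalized text."""
    tokens = normalize(text).split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def decontaminate(train_samples, test_samples, n=5):
    """Remove duplicate training samples and those overlapping the test set."""
    test_ngrams = set()
    for sample in test_samples:
        test_ngrams |= ngrams(sample, n)

    seen = set()
    kept = []
    for sample in train_samples:
        key = normalize(sample)
        if key in seen:
            continue  # duplicate of an earlier training sample
        seen.add(key)
        if ngrams(sample, n) & test_ngrams:
            continue  # shares at least one n-gram with a test item
        kept.append(sample)
    return kept


# Illustrative data only.
train = [
    "Compute the sum of the first ten positive integers.",
    "compute the  sum of the first ten positive integers.",
    "What is the area of a circle with radius 3 and center at the origin?",
    "Find the derivative of x squared plus one.",
]
test = ["Problem 4: What is the area of a circle with radius 3?"]

kept = decontaminate(train, test)
for sample in kept:
    print(sample)
```

Here the second training sample is dropped as a duplicate and the third because it overlaps the test question, leaving two clean samples. A production pipeline would likely use a proper tokenizer and a tuned overlap threshold.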

Looking ahead, the Qwen team plans to extend Qwen2-Math’s capabilities beyond English, with bilingual and multilingual models in the pipeline. This commitment to inclusivity aims to make advanced mathematical problem-solving accessible to a global audience.

“We will continue to enhance our models’ ability to solve complex and challenging mathematical problems,” affirmed the Qwen team.

The Qwen2 models are available on Hugging Face.
