Zuckerberg says Meta will need 10x more computing power to train Llama 4 than Llama 3


Meta, which develops one of the biggest foundational open-source large language models, Llama, believes it will need significantly more computing power to train models in the future.

Mark Zuckerberg said on Meta’s second-quarter earnings call on Tuesday that to train Llama 4 the company will need 10x more compute than what was needed to train Llama 3. But he still wants Meta to build capacity to train models rather than fall behind its competitors.

“The amount of computing needed to train Llama 4 will likely be almost 10 times more than what we used to train Llama 3, and future models will continue to grow beyond that,” Zuckerberg said.

“It’s hard to predict how this will trend multiple generations out into the future. But at this point, I’d rather risk building capacity before it’s needed rather than too late, given the long lead times for spinning up new inference projects.”

Meta released Llama 3 in April in 8-billion- and 70-billion-parameter versions. The company last week released an upgraded version of the model, called Llama 3.1 405B, which has 405 billion parameters, making it Meta’s biggest open-source model.

Meta’s CFO, Susan Li, also said the company is thinking about different data center projects and building capacity to train future AI models. She said Meta expects this investment to increase capital expenditures in 2025.

Training large language models can be a costly business. Meta’s capital expenditures rose nearly 33% to $8.5 billion in Q2 2024, from $6.4 billion a year earlier, driven by investments in servers, data centers and network infrastructure.

According to a report from The Information, OpenAI spends $3 billion on training models and an additional $4 billion on renting servers at a discount rate from Microsoft.

“As we scale generative AI training capacity to advance our foundation models, we’ll continue to build our infrastructure in a way that provides us with flexibility in how we use it over time. This will allow us to direct training capacity to gen AI inference or to our core ranking and recommendation work, when we expect that doing so would be more valuable,” Li said during the call.

During the call, Meta also talked about usage of its consumer-facing Meta AI and said India is the largest market for its chatbot. But Li noted that the company doesn’t expect gen AI products to contribute to revenue in a significant way.


