At Hot Chips 2024, OpenAI is giving a one-hour keynote on building scalable AI infrastructure. That makes a lot of sense, as OpenAI already consumes an enormous amount of compute and will likely use even more in the coming years.
We're live at Hot Chips 2024 this week, so please excuse any typos.
OpenAI Keynote on Building Scalable AI Infrastructure
I'm sure most of you are familiar with ChatGPT, OpenAI, and how LLMs work, so I'll move quickly through the next few slides since you're likely already up to speed.
OpenAI Hot Chips 2024_Page_03
OpenAI Hot Chips 2024_Page_04
OpenAI Hot Chips 2024_Page_05
From a scale perspective, GPT-1 was cool in 2018. GPT-2 was more consistent. GPT-3 had in-context learning. GPT-4 is actually useful. Future models are expected to be even more useful with new behaviors.
OpenAI Hot Chips 2024_Page_06
The big observation is that scaling up produces better, more useful AI.
OpenAI Hot Chips 2024_Page_07
The question was: how would OpenAI know whether training a larger model would produce a better one? OpenAI observed that every time it doubled its compute, it got better results. The graph below spans a four-order-of-magnitude increase in compute, and the scaling still held.
OpenAI Hot Chips 2024_Page_08
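As a rough illustration of that observation, scaling laws are usually written as a power law in training compute, so each doubling of compute buys a predictable improvement. A minimal sketch of that form, with constants invented for illustration rather than anything OpenAI showed:

```python
import numpy as np

# Hypothetical power-law scaling: L(C) = a * C**(-b), where C is training
# compute and L is loss. The constants a and b below are assumptions for
# illustration, not OpenAI's fitted values.
a, b = 2.0, 0.05

def loss(compute: float) -> float:
    return a * compute ** (-b)

# Four orders of magnitude of compute, matching the span of the slide's graph.
for c in [1e0, 1e1, 1e2, 1e3, 1e4]:
    print(f"compute {c:>8.0e} -> predicted loss {loss(c):.3f}")
```

The key property is that the improvement per doubling stays steady across the whole range, which is what makes the "train a bigger model" bet predictable.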
OpenAI looked at tasks like coding and found that a similar pattern applies. Because the metric is a mean log pass rate rather than a simple pass/fail average, it is not overly weighted towards solving easy coding problems.
OpenAI Hot Chips 2024_Page_09
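To make that weighting point concrete, here is a toy comparison of a plain mean pass rate against a mean log pass rate; the per-problem pass rates are invented for illustration:

```python
import numpy as np

# Invented pass rates for five problems, ordered easy -> hard.
pass_rates = np.array([0.99, 0.95, 0.50, 0.05, 0.01])

plain_mean = pass_rates.mean()          # dominated by the easy wins
mean_log = np.log(pass_rates).mean()    # hard problems weigh heavily

print(f"mean pass rate:     {plain_mean:.3f}")
print(f"mean log pass rate: {mean_log:.3f}")

# Moving the hardest problem from 1% to 2% shifts the mean log score far
# more than moving an easy one from 95% to 96% does.
```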
This is the MMLU benchmark, an attempt at a gold standard for machine learning benchmarks, yet with this kind of progress GPT-4 was already scoring around 90% on the test.
OpenAI Hot Chips 2024_Page_10
Here is a graph of the industry compute used to train various frontier models, which has grown roughly 4x annually since 2018.
OpenAI Hot Chips 2024_Page_13
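As a sanity check on that growth rate, compounding 4x per year from 2018 to 2024 works out to roughly a 4,000x increase:

```python
# Rough compounding, assuming ~4x growth per year since 2018 (per the slide).
growth_per_year = 4
years = 2024 - 2018
print(f"~{growth_per_year ** years:,}x more training compute than 2018")  # ~4,096x
```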
GPT-1 trained on a single box for a few weeks; training has since scaled up to huge GPU clusters.
OpenAI Hot Chips 2024_Page_14
Around 2018, the growth rate of training compute slowed from 6-7x annually to 4x. Much of the low-hanging fruit had likely been picked by 2018. Going forward, issues such as cost and power will become much bigger challenges.
OpenAI Hot Chips 2024_Page_15
On the inference side, demand is driven by intelligence. Most of the inference computation is used for top-end models. Smaller models tend to require significantly less computation. Demand for inference GPUs is growing significantly.
OpenAI Hot Chips 2024_Page_16
Here are three key claims about AI computing:
OpenAI Hot Chips 2024_Page_17
OpenAI's thesis is that the world needs more AI infrastructure than is currently planned.
OpenAI Hot Chips 2024_Page_18
The black line shows actual solar demand, while the other lines show experts' forecasts. Actual demand keeps climbing, yet the experts repeatedly predicted it would level off.
OpenAI Hot Chips 2024_Page_19
For nearly 50 years, Moore's Law kept delivering, far longer than many people thought possible.
OpenAI Hot Chips 2024_Page_20
As a result, OpenAI believes that AI warrants massive investment, with gains in computing power already spanning more than eight orders of magnitude.
OpenAI says AI hardware needs to be designed for massive deployment. Reliability, availability, and serviceability (RAS) is one example. Clusters get so big that they experience both hard and soft failures. Silent data corruption can occur that is not reproducible even if you can isolate the suspect GPUs. Cluster failures have a wide impact.
OpenAI Hot Chips 2024_Page_22
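OpenAI did not detail how it catches silent data corruption, but one common defense is redundant recomputation: rerun a sample of work on a second device and compare checksums. A hedged sketch of the idea:

```python
import numpy as np

# Illustrative SDC check: recompute a matmul and compare cheap checksums.
# This only shows the technique; it is not OpenAI's actual mechanism.

def checksum(x: np.ndarray) -> float:
    # A cheap reduction; real systems may use stronger hashes.
    return float(np.sum(x, dtype=np.float64))

def matmul_with_recheck(a, b, rtol=1e-6):
    primary = a @ b   # would run on the suspect GPU
    shadow = a @ b    # would run on a known-good device
    if not np.isclose(checksum(primary), checksum(shadow), rtol=rtol):
        raise RuntimeError("possible silent data corruption detected")
    return primary

a = np.random.rand(256, 256).astype(np.float32)
b = np.random.rand(256, 256).astype(np.float32)
out = matmul_with_recheck(a, b)
```

The catch, as the keynote notes, is that real SDC is often not reproducible, so sampling and statistical checks matter more than any single recomputation.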
OpenAI says repair costs need to come down, and the blast radius needs to shrink so that the failure of one component takes down fewer others.
OpenAI Hot Chips 2024_Page_23
One idea is to use graceful degradation, very similar to what we do in our hosting clusters at STH, and without the need for technician time. Validation is also important in large environments.
OpenAI Hot Chips 2024_Page_24
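As a sketch of what graceful degradation can look like in practice, the control loop below cordons unhealthy nodes and keeps the job running on the remainder; the Node class and health probe are hypothetical stand-ins for real orchestration APIs:

```python
import random
import time

class Node:
    """Hypothetical cluster node with a stand-in health probe."""
    def __init__(self, name: str):
        self.name = name

    def is_healthy(self) -> bool:
        return random.random() > 0.05  # stand-in for a real health check

nodes = [Node(f"gpu-node-{i}") for i in range(8)]

for _ in range(3):  # a few control-loop iterations; real loops run forever
    healthy = [n for n in nodes if n.is_healthy()]
    for n in set(nodes) - set(healthy):
        # Cordon instead of aborting the whole job.
        print(f"cordoning {n.name}; job continues without it")
    nodes = healthy
    time.sleep(0.1)

print(f"{len(nodes)} nodes still serving the job")
```

The point is that no technician needs to intervene for the job to make progress; repairs can be batched later.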
Power becomes a big challenge because there is only so much power available, and because training is synchronous, GPUs across a cluster ramp up and down at the same time, creating large swings in data center load.
OpenAI Hot Chips 2024_Page_25
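A back-of-the-envelope model shows why synchronized loads are hard on a facility; the per-GPU power figures here are assumptions, not OpenAI's numbers:

```python
# Toy model: when every accelerator enters the compute phase at once, the
# facility sees the full fleet's power delta as one step change.
gpus = 100_000
idle_w, busy_w = 100, 700  # assumed per-GPU draw in watts

swing_mw = gpus * (busy_w - idle_w) / 1e6
print(f"synchronized step change: ~{swing_mw:.0f} MW")  # ~60 MW swing
```

Swings of that scale are why operators look at staggering ramps or smoothing idle periods rather than letting the whole fleet step together.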
Just as we have learned important lessons running infrastructure over the years, so has OpenAI, as the next slide shows.
OpenAI Hot Chips 2024_Page_26
It's interesting that everyone is so focused on performance, yet performance is only one of the four points.
Final Words
The scaling challenges and cluster-level challenges are huge. If you look at the Top500, a large AI cluster today is roughly the size of the top three or four systems on that list combined. It was interesting to see what major customers are saying about their AI hardware needs.