The name Dojo comes from the Japanese word for a martial arts school or training hall. Elon Musk borrowed it for the computing field, presumably to suggest a place where the cars’ data is used for training. Indeed, Dojo is used to process the massive amounts of data generated by Autopilot and to train on it.
Dojo for Tesla’s Autopilot
Tesla’s FSD Beta (Full Self-Driving) has accumulated 500 million miles, and Autopilot, its driver-assistance system, has exceeded 3 billion miles. The resulting volume of data is massive: millions of terabytes in total (1TB = 1024GB). The data is mainly video, which takes up far more space than text or audio, so processing it requires enormous computing power.
It is in this context that the Dojo supercomputer was born. It came about in response to Tesla’s own need to process large amounts of video data, but also in response to the cost and limited supply of NVIDIA chips.
Why is Musk building Dojo?
Training Tesla’s AI requires large amounts of video to teach the system how to drive the vehicle automatically and safely. However, NVIDIA’s recently released H100 chip, while significantly better than its predecessor, the A100, costs up to $40,000 apiece, roughly the price of a new Model 3. Tesla’s training clusters require 10,000 H100s, so the GPUs alone cost hundreds of millions of dollars.
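As a rough check on the "hundreds of millions of dollars" figure, the arithmetic can be sketched as follows. This is only an illustration using the numbers quoted above (the upper-bound price of $40,000 and a cluster of 10,000 H100s); the variable names are made up for the example, and real pricing and cluster sizes will vary.

```python
# Figures quoted in the article; both are approximate upper-bound assumptions.
h100_unit_price_usd = 40_000   # "up to $40,000 apiece"
h100_count = 10_000            # H100s needed for Tesla's training clusters

# Cost of the GPUs alone, ignoring networking, power, cooling, and buildings.
gpu_cost_usd = h100_unit_price_usd * h100_count

print(f"GPU cost: ${gpu_cost_usd:,}")
```

At the quoted upper-bound price, the GPUs alone come to $400 million, which is consistent with the "hundreds of millions" claim.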
Because demand for this advanced chip is so high, NVIDIA has been unable to produce enough H100 units to meet the growing needs of Tesla and the rest of the industry. As a result, Tesla has decided to invest over $1 billion to develop its own supercomputer, Dojo.
Dojo will use a custom chip designed specifically to train Tesla’s neural networks. Musk has said that Dojo might not have been necessary if NVIDIA had been able to supply enough chips. As things stand, however, Tesla was forced to find its own solution, hence the Dojo supercomputer.
Speaking about the development of FSD Beta, Musk has said that achieving Level 5 (L5) autonomous driving takes a lot of computing power, and that Dojo will make neural network training faster. He said that the fundamental limiting factor in the progress of full self-driving is training, and that Tesla could move faster if it had more training compute. According to Tesla, the launch of Dojo addresses that problem, with FSD training expected to be three times faster.
Tesla is not only developing Dojo but also designing the first Dojo data center. The data center is part of a broader vision at Tesla, which needs large amounts of computing resources, especially for its Autopilot software, which processes and analyzes large amounts of video data. A job listing in Austin, Texas, for a senior engineering project manager indicates that Tesla is moving to build a data center right at Gigafactory Texas.
According to Tesla’s plans, Dojo will rank among the world’s five most powerful supercomputers by February 2024, and by October 2024 its total computing power will reach 100 exaFLOPS. In simple terms, FLOPS measures how many floating-point operations a computer performs per second, and one exaFLOPS is 10^18 (a billion billion) operations per second. In money terms, 100 exaFLOPS is equivalent to the computing power of about 300,000 NVIDIA A100 chips, the chip that was used to train the hugely popular ChatGPT. At about $10,000 per A100, it would cost roughly $3 billion to build 100 exaFLOPS of computing power from them.
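The comparison above can be checked with a few lines of arithmetic. The sketch below takes one exaFLOPS as 10^18 floating-point operations per second and otherwise uses only the article's own figures (100 exaFLOPS, about 300,000 A100s, about $10,000 per chip); the variable names are illustrative, and the per-chip throughput is merely what those figures imply, not a measured value.

```python
EXA = 10**18    # one exaFLOPS = 10^18 floating-point operations per second
TERA = 10**12   # one teraFLOPS = 10^12 floating-point operations per second

# Figures quoted in the article (all approximate).
target_flops = 100 * EXA        # Dojo's stated target for October 2024
a100_count = 300_000            # A100 chips said to be equivalent
a100_unit_price_usd = 10_000    # approximate price per A100

# Throughput each A100 would need to deliver for the equivalence to hold.
implied_tflops_per_chip = target_flops / a100_count / TERA

# Cost of buying that many A100s, chips only.
total_cost_usd = a100_count * a100_unit_price_usd

print(f"Implied throughput per A100: {implied_tflops_per_chip:.0f} teraFLOPS")
print(f"Total chip cost: ${total_cost_usd:,}")
```

The implied figure of roughly 333 teraFLOPS per chip is close to the A100's published peak for low-precision (BF16) tensor operations, about 312 teraFLOPS, so the 300,000-chip comparison appears to assume peak low-precision throughput rather than sustained real-world performance.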