On Twitter, Tesla’s AI team is sharing its plans for foundation models for autonomous robots like the Tesla Bot.
Tesla’s goal with the Tesla Bot is to create a universal, autonomous, bipedal humanoid robot capable of performing dangerous, repetitive, or boring tasks. Like other robotics projects, Tesla hopes to achieve this goal by using foundation models for autonomous robots.
Such models are trained on large amounts of data, and their general capabilities serve as the basis for specialized applications. In natural language processing, GPT-4 is one example of such a model.
Tesla relies on big (video) data
For the robotic models, Tesla plans to rely on the multimodal neural networks already used in its self-driving vehicles. These currently process multiple modalities, such as camera video, maps, navigation, IMU (Inertial Measurement Unit) data, and GPS, to predict whether there are vehicles, cyclists, people, or other objects in the way.
According to Tesla’s AI team, these networks could also be used for collision avoidance in any robot. Data from the entire fleet is also used to reconstruct sections of road, on which the AI can then be trained further. In addition, the team is developing generative models that draw on diverse real-world data to produce, for example, short new video clips in which the vehicle behaves differently.
This increases the amount of data available, a basic requirement for foundation models. A short clip also shows how a Tesla Bot or similar system collects data in offices.
Video foundation models as the “brains” of the Optimus bot
Together, these networks, reconstructions, and generative models are meant to yield video foundation models that form the “brains” of cars and robots. Google is also experimenting with such foundation models for robots, and with its multimodal Robotic Transformer it has shown that they can be used to build better robots.
Tesla has a clear data advantage, at least in the area of autonomous driving, and could also collect the data necessary for foundation models for robots with the Optimus robot planned for mass production.
To do this, it needs computing power, and Tesla wants to get its own Dojo supercomputer up to 100 exaflops by October 2024, the equivalent of about 400,000 Nvidia A100 GPUs. Interesting as this insight into Tesla's plans is, it is mainly an attempt to recruit the experts the company is urgently looking for.
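As a rough sanity check on the Dojo comparison, the conversion between exaflops and a GPU count can be sketched in a few lines. The 100-exaflop target and the 400,000-GPU figure come from the article. The per-GPU throughput is an assumption: 312 teraflops is Nvidia's published peak for dense BF16 tensor operations on the A100, and real training workloads typically sustain less. The function and constant names are purely illustrative.

```python
# Back-of-envelope check of the "100 exaflops ~ 400,000 A100 GPUs" comparison.
TARGET_FLOPS = 100e18          # 100 exaflops, as stated in the article
QUOTED_GPU_COUNT = 400_000     # A100 count quoted in the article
A100_PEAK_BF16_FLOPS = 312e12  # ASSUMPTION: datasheet peak, dense BF16 tensor ops


def implied_flops_per_gpu(total_flops: float, gpu_count: int) -> float:
    """Per-GPU throughput implied by a total FLOPS figure and a GPU count."""
    return total_flops / gpu_count


def gpus_needed(total_flops: float, flops_per_gpu: float) -> int:
    """Number of GPUs needed to reach total_flops, rounded to the nearest GPU."""
    return round(total_flops / flops_per_gpu)


implied = implied_flops_per_gpu(TARGET_FLOPS, QUOTED_GPU_COUNT)
at_peak = gpus_needed(TARGET_FLOPS, A100_PEAK_BF16_FLOPS)

print(f"Implied throughput per GPU: {implied / 1e12:.0f} TFLOPS")
print(f"GPUs needed at datasheet peak: {at_peak:,}")
```

Under these assumptions, the quoted figure corresponds to roughly 250 teraflops per GPU, somewhat below the datasheet peak, while the peak value itself would put the equivalent at around 320,000 GPUs. Which precision and utilization the comparison is based on is not stated, so the result is best read as an order-of-magnitude estimate.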