New infrastructures for autonomous driving: AI foundation models and intelligent computing centers are emerging.

In recent years, the artificial intelligence boom has propelled autonomous driving forward. Artificial intelligence rests on three pillars: data, algorithms, and computing power. This report focuses on the new infrastructures behind autonomous driving algorithms and computing power: AI foundation models and intelligent computing centers.

A large AI model, or foundation model, also known internationally as a pre-trained model, is a model trained at scale on a vast quantity of unlabeled data so that it can be adapted to a wide range of downstream tasks. The Transformer network that Google proposed in 2017 laid the foundation for the mainstream algorithm architecture of current foundation models. ViT (Vision Transformer), introduced by Google in 2020, first applied the Transformer architecture to image classification in the field of computer vision (CV). Tesla’s subsequent introduction of Transformer foundation models into Autopilot started the adoption of large AI models in autonomous driving.

Key features of AI foundation models:

1. Generalization capability is strong.

AI foundation models capture knowledge from massive amounts of labeled and unlabeled data, store that knowledge in an enormous number of parameters, and can then be fine-tuned for specific tasks.

For example, the Baidu ERNIE Foundation Model learns from large knowledge graphs and massive unstructured data, and Baidu then works with companies to build industry foundation models on top of it. To date, 11 ERNIE industry models have been released. Among them, Geely-Baidu ERNIE, a large automotive industry model co-built by Baidu and Geely in November 2022, uses Baidu ERNIE Foundation Model 3.0 for fine-tuning and verification on three tasks: expanding the intelligent customer service knowledge base, generating short answers for vehicle speech systems, and constructing a knowledge base for the automotive field.

2. Self-supervised learning capability reduces training and development costs.

The self-supervised learning approach of AI foundation models reduces the need for data annotation and partly solves the high cost, long cycle, and low accuracy of manual annotation. For example, the video self-supervised foundation model unveiled in January 2023 first builds a large model from clip data and then adjusts it using partly annotated clips, in which only 10% of the frames (the key frames) are manually annotated and the other 90% are not. The entire model is then trained to predict the content of the next frame from the current frame, which allows it to annotate the remaining 90% of frames automatically, achieving fully automatic annotation of those frames and lowering annotation cost.
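The workflow above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the vendor's method: the function names (`select_key_frames`, `auto_annotate`, `carry_forward`) and the string labels are hypothetical, and a trivial carry-forward rule stands in for the learned next-frame predictor, which in practice would be a trained video model.

```python
from typing import Callable, List, Optional


def select_key_frames(num_frames: int, labeled_fraction: float = 0.1) -> List[int]:
    """Pick evenly spaced key-frame indices covering about labeled_fraction of a clip."""
    step = max(1, round(1 / labeled_fraction))
    return list(range(0, num_frames, step))


def auto_annotate(
    manual_labels: List[Optional[str]],
    predict_next: Callable[[str], str],
) -> List[Optional[str]]:
    """Fill unlabeled frames by rolling a next-frame predictor forward from key frames."""
    labels = list(manual_labels)
    for i in range(1, len(labels)):
        # Only frames without a manual label are filled in automatically.
        if labels[i] is None and labels[i - 1] is not None:
            labels[i] = predict_next(labels[i - 1])
    return labels


def carry_forward(previous_label: str) -> str:
    """Placeholder predictor (assumption): the next frame keeps the previous label."""
    return previous_label


num_frames = 20
key_frames = select_key_frames(num_frames)  # 10% of the frames

# Hypothetical manual annotations on the key frames only.
manual: List[Optional[str]] = [None] * num_frames
manual[key_frames[0]] = "car_ahead"
manual[key_frames[1]] = "pedestrian_crossing"

annotated = auto_annotate(manual, carry_forward)
manual_share = sum(label is not None for label in manual) / num_frames
coverage = sum(label is not None for label in annotated) / num_frames
print(f"manually labeled: {manual_share:.0%}, labeled after propagation: {coverage:.0%}")
```

The point of the sketch is the division of labor: humans label only the key frames, and the predictor fills in everything between them. Replacing `carry_forward` with a real model changes the quality of the automatic labels, not the structure of the loop.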

3. AI foundation models can break the accuracy limitations of existing model structures.

Experimental research in recent years shows that larger models and larger data scale may break through existing accuracy limits. For example, the INTERN Foundation Model 2.0 that SenseTime released in September 2022 delivers leading performance on more than 40 visual tasks in 12 categories, outperforming world-renowned institutions in related fields.

AI foundation models not only greatly speed up algorithm iteration but also directly shorten the iteration cycle of autonomous driving systems. To handle the large parameter counts and massive data computation these models involve, some OEMs and autonomous driving technology developers have begun to build data computing centers that provide large computing power for training foundation models, namely intelligent computing centers.

An intelligent computing center is infrastructure that builds intelligent computing server clusters on chips such as GPUs and FPGAs to provide intelligent computing power. Because intelligent computing centers require a long construction period and a huge initial investment, only some powerful OEMs and companies are building them at present. One example is Geely, which launched the Xingrui Intelligent Computing Center in January 2023 with a total investment of RMB1 billion and 5,000 cabinets planned. The facility currently offers total cloud computing power of 810 petaflops (PFLOPS), which is expected to expand to 1,200 PFLOPS in 2025. It covers services such as intelligent connectivity, intelligent driving, new energy safety, and trial production experiments, improving Geely’s overall R&D efficiency by 20%.
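For a sense of scale, the Geely figures above can be turned into two rough ratios. The short calculation below is a back-of-the-envelope sketch; the per-cabinet figure assumes, purely for illustration, that the entire RMB1 billion is spread evenly over the 5,000 planned cabinets, which the report does not state.

```python
# Figures quoted in the paragraph above.
current_pflops = 810
planned_pflops = 1_200
total_investment_rmb = 1_000_000_000
planned_cabinets = 5_000

# Relative expansion in computing power planned by 2025.
growth_pct = (planned_pflops - current_pflops) / current_pflops * 100

# Assumption: investment divided evenly across all planned cabinets.
investment_per_cabinet_rmb = total_investment_rmb / planned_cabinets

print(f"planned computing power growth: {growth_pct:.1f}%")
print(f"investment per planned cabinet: RMB{investment_per_cabinet_rmb:,.0f}")
```

Under these assumptions, the planned expansion is roughly a 48% increase in computing power, and the investment works out to about RMB200,000 per planned cabinet.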

China is also encouraging the rapid development of intelligent computing centers. In 2022, the State Council issued the 14th Five-Year Plan for the Development of the Digital Economy, which calls for promoting the orderly development of intelligent computing centers and building new intelligent infrastructures that integrate intelligent computing power, general algorithms, and development platforms. In February 2022, the East-Data-West-Computing Project was fully launched. Construction of national computing power hub nodes started in 8 regions (Beijing-Tianjin-Hebei, Yangtze River Delta, Guangdong-Hong Kong-Macao Greater Bay Area, Chengdu-Chongqing, Inner Mongolia, Guizhou, Gansu, and Ningxia), and 10 national data center clusters were planned. So far, more than 30 cities in China are building or proposing to build intelligent computing centers, some of which are already operational.