Automotive cameras are divided into perception cameras and video cameras, according to Sunny Optical.
A perception camera, used for active safety (generally forward and inward view), captures images accurately. A video camera, suited to passive safety (typically surround view and rear view), stores the captured images or sends them to the user. The two types differ sharply in imaging quality and temperature reliability.
For lane detection, signal light detection, road sign recognition, in-vehicle monitoring, etc., any error in the images taken by perception cameras will result in software calculation errors and, inevitably, downstream consequences. For this reason, perception cameras are less sensitive to price.
Video cameras, with less demanding performance requirements, are very sensitive to price. A price war is under way in this market, where numerous Chinese companies are competing fiercely for meager profits but the vast majority do not make money. In the 2019 Chinese OEM market for passenger-car surround-view cameras, vendor market share was highly fragmented, in sharp contrast to the front-view monocular camera market, where the top 6 players held a combined share of over 90%.
Forward-view cameras require complex algorithms and chips. A forward-view monocular camera costs nearly RMB1,000, and a forward-view binocular camera over RMB1,000. Rear-view, side-view and built-in cameras have a unit price of around RMB200.
Breakthroughs in Binocular Cameras
Chinese startups have made dramatic breakthroughs in binocular cameras, especially for commercial vehicles and special vehicles.
Smarter Eye has shipped more than 10,000 binocular cameras, mainly used in Apollo autonomous vehicles, JMC small buses, sanitation vehicles, autonomous boats, tractors, patrol vehicles, etc. Smarter Eye’s binocular cameras are typically used for AEBS (advanced emergency braking systems) and for detecting height limiters. Bus groups in more than 30 cities have adopted such cameras.
Smarter Eye is a pioneer in applying binocular cameras to height limiter detection. Both conventional height limiters and non-standard ones, such as small archways and culverts in the countryside, can be accurately perceived by binocular cameras, which promptly warn drivers with sound and light alarms. Recreational vehicles, special vehicles, etc. have a firm demand for this function.
In 2019, Sphyrna Technology announced that it had successfully developed a flexible binocular camera system, which combines two separate monocular cameras into one binocular system and overcomes the inherent defects of conventional binocular cameras (large size, complicated process, difficult installation, and high cost).
The two cameras are independently fixed on the right side of the car body, with no rigid connection between them and no strict requirements on angle or spacing. The key to the "flexible binocular camera" is self-calibration technology. Even if the cameras slightly deform or shift during use, Sphyrna Technology’s algorithms automatically detect the change and recalibrate, dispensing with the regular recalibration that conventional binocular cameras need.
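To illustrate why calibration matters so much for any binocular camera, the sketch below uses the standard stereo triangulation relation Z = f * B / d (depth equals focal length times baseline divided by disparity). It is a generic textbook relation, not Sphyrna Technology's self-calibration algorithm, and the function name and all numeric values are assumptions chosen only for illustration.

```python
def stereo_depth(focal_px, baseline_m, disparity_px):
    """Depth in metres from the standard stereo relation Z = f * B / d.

    focal_px     -- focal length in pixels (assumed known from calibration)
    baseline_m   -- distance between the two cameras in metres
    disparity_px -- horizontal pixel shift of the same point between views
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Hypothetical values: 1000 px focal length, 0.5 m baseline.
calibrated_depth = stereo_depth(1000.0, 0.5, 25.0)

# If the cameras shift slightly and the measured disparity is off by
# one pixel, the estimated depth changes noticeably.
drifted_depth = stereo_depth(1000.0, 0.5, 24.0)

print(calibrated_depth, round(drifted_depth - calibrated_depth, 2))
```

With these assumed numbers, a one-pixel disparity error moves the estimate by a little under one metre at 20 metres, which suggests why a system without a rigid mount would need to recalibrate itself continuously.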
In April 2020, Foresight Autonomous Holdings Ltd. announced a collaboration with FLIR Systems Inc. on the development, marketing and distribution of Foresight’s QuadSight vision system, combined with FLIR Systems’ infrared cameras, to a wide range of prospective customers. Foresight will purchase its thermal cameras exclusively from FLIR for all systems to be commercialized worldwide. Foresight’s QuadSight system, comprising four thermal and visible-light cameras, showcases the capability of FLIR thermal cameras to greatly improve ADAS and autonomous vehicle safety through better situational awareness at night, in high-contrast scenes and in bad weather. Data fusion between the two stereo channels can effectively avoid missed or false detections under extreme conditions, such as a tunnel entrance where the light changes quickly.
Millimeter-wave radar is taking ground from other sensors, and the same is true of cameras. Many companies, such as MAXIEYE, are attempting to replace binocular cameras or LiDAR by using monocular cameras for ranging and 3D imaging.
Unlike its first-generation product, the IFVS-200 series based on machine learning solutions, MAXIEYE’s third-generation IFVS-500 family is based on deep learning and achieves monocular ranging and 3D scanning. IFVS-500 lets a monocular vision product perform LiDAR-like 3D scanning, producing a 3D point cloud within 50 meters. This enables direct ranging of objects and detection of motor vehicles within 200 meters and of pedestrians and small obstacles within 100 meters.
Visual Capabilities in Extreme Scenarios Improve
Vision inside or outside the car in poor lighting, or even in darkness, requires infrared technology. ON Semiconductor’s RGB-IR image sensor delivers near-infrared (NIR) enhanced images. Israeli start-up TriEye has developed a short-wave infrared (SWIR) sensor technology that enhances safety in vehicles fitted with assistance systems or autonomous driving functions: it improves the ability to see when visibility is poor, such as in dust, fog, murk or rain, and to spot hazards such as icy roads in advance.
In April 2020, Alibaba DAMO Academy revealed that it had developed an ISP (image signal processor) for in-vehicle cameras. According to road test results from the DAMO Academy Automated Driving Laboratory, at night, the most challenging scene, a car camera using this ISP improves image object detection and recognition by more than 10% compared with mainstream processors, and markings that were originally obscure can be clearly identified.
On May 19, 2020, OmniVision announced the expansion of its Nyxel near-infrared (NIR) technology into the automotive market with the new 2.5-megapixel OX03A2S image sensor. This ASIL-B sensor is intended for exterior imaging applications that operate in low to no ambient light conditions within 2 meters of the vehicle. It can detect and recognize objects that other image sensors would reportedly miss under extremely low lighting conditions.
Visual Perception Algorithms Become Crucial
The massive volume of data generated by the sharp increase in vision sensors in smart cars has brought both challenges and opportunities to algorithm processing.
For the visual perception system, Mobileye, a leader in visual ADAS algorithms, runs a multitude of independent perception algorithms simultaneously to create internal redundancy layers for both detection and measurement, yielding higher perception accuracy and stability. Detection determines what object is perceived; measurement infers the 3D information of the perceived object from the camera's 2D image.
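The idea of internal redundancy can be sketched in a few lines. The example below is a minimal illustration, not Mobileye's actual fusion logic: it assumes each independent engine reports a yes/no detection and an optional range estimate, then combines them with a simple k-of-n vote and a median. The function names, the `min_agree` threshold and the sample values are all hypothetical.

```python
from statistics import median


def fuse_detections(votes, min_agree=2):
    """Report a detection when at least `min_agree` engines agree."""
    return sum(1 for v in votes if v) >= min_agree


def fuse_ranges(ranges_m):
    """Combine independent range estimates (metres) with the median.

    Engines that produced no estimate report None and are ignored.
    The median limits the influence of a single outlier engine.
    """
    valid = [r for r in ranges_m if r is not None]
    return median(valid) if valid else None


# Hypothetical outputs from six independent engines for one object.
engine_votes = [True, True, False, True, True, True]
engine_ranges = [41.8, 42.3, None, 42.0, 58.0, 41.5]

detected = fuse_detections(engine_votes)
fused_range = fuse_ranges(engine_ranges)
print(detected, fused_range)
```

In this made-up sample, one engine misses the object and another reports an outlying range of 58.0 m, yet the fused result still reports a detection at the median of the remaining estimates.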
For detection, Mobileye uses 6 independent engines:
3DVD or 3D Object Detection: It is the common approach of predicting a 3D bounding box using a neural network from the 2D image.
Full Image Detection: It deals with the common scenario where an object (such as a bus, car or truck) is extremely close and can only be partially seen. Object classification using visual signatures is used to track close-up vehicles in this scenario.
Top View FS (free space): It identifies and marks the unoccupied road area in the scene.
Features Detection: This appearance-based engine detects vehicles from reliable features, such as wheels.
VIDAR: It is an approach that uses multiple views of the environment to triangulate and create a depth map. That depth map is converted into a point cloud, and LiDAR-based processing algorithms are then applied to that point cloud.
Scene Segmentation (NSS): It is a neural network-based approach that breaks the scene down into free space and semantic objects (vehicle, sign, pedestrian, etc.); the pixels containing an object are treated as a detection.
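The depth-map-to-point-cloud step described for VIDAR can be sketched with the standard pinhole back-projection. This is a minimal illustration under stated assumptions, not Mobileye's implementation: the function name, the camera intrinsics (fx, fy, cx, cy) and the tiny 2x2 depth map are hypothetical, and the camera frame is assumed to have X pointing right, Y pointing down and Z pointing forward.

```python
def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map into a list of 3D points.

    depth  -- row-major list of lists, depth in metres per pixel
    fx, fy -- focal lengths in pixels (assumed known)
    cx, cy -- principal point in pixels (assumed known)

    Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


# Hypothetical 2x2 depth map; 0.0 marks pixels with no depth estimate.
depth = [[0.0, 10.0],
         [20.0, 0.0]]
cloud = depth_to_point_cloud(depth, fx=2.0, fy=2.0, cx=0.0, cy=0.0)
print(cloud)
```

Once the scene is in this form, point-cloud algorithms originally written for LiDAR data can, in principle, be applied to camera-derived depth without modification.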
Tesla, also a leader in visual perception algorithms, calls its deep learning network HydraNet. It comprises 48 different neural networks that output 1,000 distinct tensors (predictions) at each timestep; in theory, HydraNet can detect 1,000 objects simultaneously. Tesla acquired DeepScale, a Silicon Valley startup that focuses on developing computer vision technologies, in a bid to improve its algorithmic capabilities.
To catch up with Tesla and Mobileye in visual perception algorithms, OEMs and Tier1 suppliers are expanding their teams of software engineers. Algorithmic capability will be one of the decisive factors in the performance of visual perception.