Probabilistic Collision Loss: Bounds and Soft Distance Maps for Autonomous Driving
This post explores collision loss design for end-to-end autonomous driving training, focusing on extending PLUTO’s binary occupancy map approach to handle probabilistic maps. The method enables safer autonomous driving by providing smooth, uncertainty-aware collision avoidance while maintaining computational efficiency. Using Signed Distance Map with Binary Occupancy Map How PLUTO Builds the Loss Map [2] Vehicle Model In PLUTO and other planning algorithms, the vehicle is modeled as a series of overlapping discs. ...
A State Machine for Object Tracking
This post contains a state machine diagram that illustrates the typical lifecycle of an object in a tracking system. Object Tracking Management State Machine.
Wayformer Paper Reading
This post provides a technical deep dive into the Wayformer paper [1], a key publication in the field of motion forecasting. Training Overview An overview of the deep learning training pipeline, illustrating the data flow and key components involved during model training. Model Overview of the One-Stage E2E model One-stage E2E model. Overview of the Two-Stage E2E model Two-stage E2E model. Details of the Two-Stage E2E Model Overview of the Wayformer model. Model Structure Overview (a) (b) The left figure shows the encoder and decoder of the Wayformer model. The right figure shows the details of the encoder [1]. Feature Embedding/Feature Projection $$\mathbf{f}\in \mathbb{R}^{T \times N\times D} \to \mathbf{x}_{input} \in \mathbb{R}^{(T \cdot N) \times d}$$ where $T$ is the number of history time steps, $N$ is the number of entities, $D$ is the number of features, and $d=256$. ...
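To make the shape change in the feature projection concrete, here is a minimal NumPy sketch. It assumes the projection is a single linear layer applied to every (time step, entity) pair; the function name `project_features`, the random weights, and the toy sizes for $T$, $N$, and $D$ are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def project_features(f, weight, bias):
    """Flatten (T, N, D) features into (T*N, D) tokens and project them to width d.

    Assumption: one shared linear layer; in a trained model the weights are learned.
    """
    T, N, D = f.shape
    tokens = f.reshape(T * N, D)      # row index is t * N + n
    return tokens @ weight + bias     # shape (T*N, d)

rng = np.random.default_rng(0)
T, N, D, d = 4, 3, 7, 256             # toy sizes; only d = 256 comes from the text
f = rng.normal(size=(T, N, D))
weight = rng.normal(size=(D, d)) * 0.01   # hypothetical, untrained weights
bias = np.zeros(d)

x_input = project_features(f, weight, bias)
print(x_input.shape)
```

Each row of `x_input` corresponds to one entity at one time step, so the time and entity axes are merged into a single token axis.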
LiDAR-SLAM Decoded: From Point Clouds to Precision Maps
What is SLAM? SLAM demo. SLAM stands for Simultaneous Localization and Mapping. It is a computational problem of constructing or updating a map of an unknown environment while simultaneously keeping track of an agent's location within it. Applications Object Detection Parking Lot Annotation Lane Annotation Lane Reprojection HD Map [source] SLAM has various applications, including: ...
Perspective-n-Point (PnP) Problem
In this post, we will discuss the perspective-n-point (PnP) problem. We will start with the problem definition. Then, gradient-based optimization methods will be introduced. Finally, we will discuss two global optimization methods. Problem Formulation The core task of the Perspective-n-Point (PnP) problem is to determine the pose—specifically, the rotation and translation—of a calibrated camera in 3D space. This is achieved by using a set of known 3D points in the world and their corresponding 2D projections observed on the camera’s image sensor. ...
Connecting Points with Grace: A Study on Natural Path Generation Methods
Given a starting point, starting direction, ending point, and ending direction, the goal is to generate a feasible and “natural” path that connects the two points. This path should adhere to vehicle dynamics and avoid obstacles. However, it is hard to define what is “natural”. Path Planning In this section, we review some classical path planning methods. We ignore the algorithmic details and focus on the resulting path shapes to provide an overview. The demonstration images are generated using the code repository [1]. ...
Demystifying Kalman Filters: From Classical Estimation to Bayesian Inference
In this post, we will discuss Kalman filters from two perspectives: classical parameter estimation and Bayesian estimation. Each perspective provides a unique way to derive the Kalman filter. Parameter estimation is more flexible, as it allows you to easily add constraints, revise the transition and observation models, and derive other related smoothing and filtering methods. Meanwhile, Bayesian estimation is more intuitive and provides a clearer probabilistic interpretation, helping us understand the underlying principles better. By examining both approaches, we can gain a more comprehensive understanding of how Kalman filters work. ...
Understanding Reinforcement Learning: Concepts, Algorithms, and Applications
I took some courses related to reinforcement learning during my Ph.D. studies, but I have not touched the topic for a long time. Recently, I have been working on end-to-end autonomous driving, and the success of reinforcement learning in large language models has brought me back to it. In this post, we will introduce the basic concepts and commonly used algorithms in reinforcement learning. More detailed information can be found in the reference book [1] and OpenAI RL page. ...
Fisheye Camera Extrinsic EOL Calibration
This post is an application of the EOL calibration described in the EOL calibration article. Detect Image Corners Detecting corners directly on the original fisheye image is challenging because of its severe distortion, so we detect the corners on the BEV image and then project them back onto the original image for further refinement. Set initial extrinsics (from installation parameters such as angles and positions, or from joint calibration with LiDAR), construct a 20m×20m grid with a resolution of 0.01m in the ground plane (z=0) of the ego coordinate system, and generate a BEV projected image; ...
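The ground-plane grid in the step above can be sketched as follows. This is a minimal illustration under two assumptions that the excerpt does not state: the grid is centered on the ego origin, and each sample sits at a cell center. The function name `make_ground_grid` is hypothetical.

```python
import numpy as np

def make_ground_grid(size_m=20.0, resolution_m=0.01):
    """Return an (n, n, 3) array of ego-frame points on the ground plane z = 0.

    Assumptions: grid centered on the ego origin, samples at cell centers.
    """
    n = int(round(size_m / resolution_m))          # 2000 cells per side by default
    half = size_m / 2.0
    coords = (np.arange(n) + 0.5) * resolution_m - half
    xs, ys = np.meshgrid(coords, coords, indexing="ij")
    zs = np.zeros_like(xs)                         # ground plane
    return np.stack([xs, ys, zs], axis=-1)

# Small example: a 2 m x 2 m grid at 0.5 m resolution gives 4 x 4 points.
small_grid = make_ground_grid(size_m=2.0, resolution_m=0.5)
print(small_grid.shape)
```

With the default arguments the grid has 2000 × 2000 points; projecting each point into the fisheye image with the initial extrinsics and sampling the pixel there would give the BEV image.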
Coordinate Systems in Autonomous Driving
In this post, we will discuss the coordinate systems commonly used in autonomous driving. In practice, different positioning providers may define their own coordinate systems. This post will introduce the fundamental concepts of these coordinate systems and explain how to convert between them for your specific needs. Pose on the Earth When we talk about pose, we refer to the position and orientation of an object in the world. In the context of autonomous driving, the world is the Earth. Therefore, pose describes the position and orientation of an object relative to the Earth. ...