Overview: This project introduces a transformer-based 3D object detection framework for LiDAR point clouds, aimed at improving perception and navigation in autonomous vehicles. Unlike conventional CNN and PointNet-based models, our approach leverages self-attention mechanisms to effectively capture long-range dependencies and spatial correlations within sparse and irregular LiDAR data. A pretrained PointNet++ model is used for feature extraction, ensuring high-quality embeddings that are passed to transformer blocks for robust object detection.
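As a rough illustration of this pipeline, the sketch below assumes a PyTorch implementation; the `LiDARTransformerDetector` class, its dimensions, the placeholder backbone, and the detection head are hypothetical stand-ins, not the repository's actual modules. It shows per-point embeddings (standing in for PointNet++ features) passed through a transformer encoder and a simple per-point detection head:

```python
import torch
import torch.nn as nn

class LiDARTransformerDetector(nn.Module):
    """Hypothetical sketch: PointNet++-style per-point embeddings fed
    through a transformer encoder, then a detection head. Dimensions
    and head design are assumptions, not the project's actual config."""

    def __init__(self, feat_dim=128, num_heads=4, num_layers=4, num_classes=10):
        super().__init__()
        # Placeholder for the pretrained PointNet++ backbone producing
        # per-point embeddings from raw (x, y, z) coordinates.
        self.backbone = nn.Linear(3, feat_dim)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=num_heads, batch_first=True
        )
        # Self-attention lets every point attend to every other point,
        # capturing long-range dependencies across the sparse cloud.
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        # Per-point heads: class logits and a 3D box (x, y, z, l, w, h, yaw).
        self.cls_head = nn.Linear(feat_dim, num_classes)
        self.box_head = nn.Linear(feat_dim, 7)

    def forward(self, points):            # points: (B, N, 3)
        feats = self.backbone(points)     # (B, N, feat_dim)
        feats = self.encoder(feats)       # attention over all points
        return self.cls_head(feats), self.box_head(feats)

# Usage: a batch of 2 clouds with 1024 points each.
pts = torch.randn(2, 1024, 3)
logits, boxes = LiDARTransformerDetector()(pts)
```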
GitHub Repository: View Source Code on GitHub
Key Features:
- Transformer blocks with self-attention to capture long-range dependencies and spatial correlations in sparse, irregular LiDAR point clouds
- Pretrained PointNet++ backbone for high-quality per-point feature embeddings
- Custom loss function for improved object localization
- Chunk-wise processing for computational efficiency under GPU memory constraints
Results & Impact:
Experiments demonstrated the effectiveness of transformers in capturing long-range dependencies for object detection in urban driving scenarios. The custom loss function improved object localization, and chunk-wise processing enhanced computational efficiency. Despite GPU constraints, the model showed promising results and outperformed traditional CNN-based approaches in feature alignment.
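The report doesn't spell out the chunking scheme, but a minimal sketch of chunk-wise self-attention (assuming fixed-size chunks and PyTorch's `nn.MultiheadAttention`; the `chunked_self_attention` helper and chunk size are hypothetical) illustrates how the quadratic attention cost can be bounded:

```python
import torch
import torch.nn as nn

def chunked_self_attention(feats, attn, chunk_size=256):
    """Hypothetical chunk-wise processing: apply self-attention within
    fixed-size chunks of points so memory grows with chunk_size**2
    rather than N**2. The project's real scheme may differ."""
    B, N, D = feats.shape
    outputs = []
    for start in range(0, N, chunk_size):
        chunk = feats[:, start:start + chunk_size]  # (B, <=chunk_size, D)
        out, _ = attn(chunk, chunk, chunk)          # attention within the chunk
        outputs.append(out)
    return torch.cat(outputs, dim=1)                # (B, N, D)

attn = nn.MultiheadAttention(embed_dim=128, num_heads=4, batch_first=True)
feats = torch.randn(1, 1024, 128)
out = chunked_self_attention(feats, attn)
```

Attention within a chunk costs O(chunk_size²) rather than O(N²) over the whole cloud, which is what makes large point clouds tractable on memory-constrained GPUs.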
Technologies Used: transformer (self-attention) architecture, pretrained PointNet++ feature extractor, LiDAR point cloud data
This project advances 3D object detection for autonomous driving by using transformers to strengthen spatial reasoning over LiDAR data.
The full project report is included below and can be viewed or downloaded.