A Vendor-Agnostic LiDAR Data Conversion System with Multi-Signal Detection and Multi-Format Output
2026-06-22 • Robotics
AI summary
The authors created a tool that reads raw data files from four different LiDAR sensors, used in applications such as self-driving cars, without needing separate software for each brand. Their system first identifies which sensor produced the data, then converts it into common formats using either fast C++ code or slower Python code, depending on the sensor. They tested it on real outdoor data and found that the C++ paths run much faster than the Python paths. Overall, their pipeline runs on an ordinary computer without special setup and brings data from different sensors into one process.
LiDAR, PCAP files, 3D point clouds, UDP protocol, SDK, C++, Python, data parsing, sensor identification, point cloud formats
Authors
Param Patel, Jay Dave, Pratyush Chakraborty
Abstract
LiDAR (Light Detection and Ranging) sensors capture the surrounding environment as dense 3D point clouds by measuring the time-of-flight of emitted laser pulses, making them foundational across autonomous vehicles, robotics, and large-scale mapping. PCAP (Packet Capture) files from these sensors are the starting point of most 3D perception pipelines, yet internal packet structures, UDP (User Datagram Protocol) port conventions, and encoding schemes differ enough across manufacturers that no single tool reads them all. Ouster, Velodyne, Hesai, and Livox each require their own SDK (Software Development Kit), their own environment setup, and their own conversion workflow, so supporting all four means maintaining four disconnected pipelines with no shared infrastructure. The pipeline described here takes a raw PCAP as input and identifies the vendor automatically, scoring six independent file characteristics through a weighted multi-signal approach to determine the source sensor. C++ SDKs handle Ouster and Velodyne, while Hesai and Livox rely on Python-based dpkt parsing because no open-source SDK exists for them. From there, a single command writes output to any of five industry-standard formats. We tested the pipeline on real outdoor captures. Ouster peaks at 2.08M points per second and Velodyne at 1.47M, both running through native C++ packet decoding. Hesai and Livox reach 110K and 150K points per second, respectively, where Python-layer parsing introduces overhead that compounds under sustained load. The 8-10x gap held consistently across runs. All tests ran on a consumer-grade i3 machine with 8 GB of RAM, with no vendor configuration required.
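The weighted multi-signal identification step can be illustrated with a minimal Python sketch. The abstract does not list the six characteristics, their weights, or the decision rule, so the signal names, weight values, and the 0.5 acceptance threshold below are all hypothetical placeholders chosen for illustration. The sketch assumes each signal has already been reduced to a single vendor "vote" (or `None` when it is inconclusive).

```python
# Hypothetical signal names and weights (assumptions, not from the paper).
# The weights sum to 1.0 so a vendor's score reads as a fraction of agreement.
WEIGHTS = {
    "udp_port": 0.30,
    "payload_size": 0.25,
    "magic_bytes": 0.20,
    "packet_rate": 0.10,
    "port_count": 0.10,
    "src_mac_prefix": 0.05,
}


def score_vendors(signal_votes, weights=WEIGHTS):
    """Sum the weight of every signal that votes for each vendor.

    signal_votes maps a signal name to the vendor it points to,
    or to None when that signal is inconclusive.
    """
    scores = {}
    for signal, vendor in signal_votes.items():
        if vendor is None:
            continue  # an inconclusive signal contributes nothing
        scores[vendor] = scores.get(vendor, 0.0) + weights[signal]
    return scores


def identify(signal_votes, threshold=0.5):
    """Return (vendor, scores); vendor is None if no score reaches threshold."""
    scores = score_vendors(signal_votes)
    if not scores:
        return None, scores
    best = max(scores, key=scores.get)
    return (best if scores[best] >= threshold else None), scores


# Example: three strong signals agree, one weak signal disagrees.
votes = {
    "udp_port": "velodyne",
    "payload_size": "velodyne",
    "magic_bytes": "velodyne",
    "packet_rate": None,
    "port_count": "hesai",
    "src_mac_prefix": None,
}
vendor, scores = identify(votes)
print(vendor, scores)
```

Combining several weak signals this way means that one misleading characteristic, such as a UDP port shared by two vendors, cannot decide the result on its own. The threshold lets the tool decline to guess when the evidence is thin.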