LiDAR-based semantic segmentation is a key component for autonomous mobile robots, yet large-scale annotation of LiDAR point clouds is prohibitively expensive and time-consuming. Although simulators can provide labeled synthetic data, models trained on synthetic data often underperform on real-world data due to a data-level domain gap. To address this issue, we propose DRUM, a novel Sim2Real translation framework. We leverage a diffusion model pre-trained on unlabeled real-world data as a generative prior and translate synthetic data by reproducing two key measurement characteristics: reflectance intensity and raydrop noise. To improve sample fidelity, we introduce a raydrop-aware masked guidance mechanism that selectively enforces consistency with the input synthetic data while preserving realistic raydrop noise induced by the diffusion prior. Experimental results demonstrate that DRUM consistently improves Sim2Real performance across multiple representations of LiDAR data.
LiDAR simulators provide perfect segmentation labels but lack realistic sensor characteristics such as reflectance intensity and raydrop noise, which leads to a performance drop in real-world deployment. Our method, DRUM, bridges this Sim2Real gap by synthesizing pseudo-real training samples that combine simulator-provided annotations with learned realistic sensor characteristics.
| | Simulation | Pseudo-real (ours) | Real |
|---|---|---|---|
| Range | (image) | (image) | (image) |
| Reflectance intensity | N/A | (image) | (image) |
| Raydrop noise | N/A | (image) | (image) |
| Semantic label | (image) | (image) | N/A |
We formulate Sim2Real LiDAR translation as a posterior sampling problem. First, we pre-train a LiDAR diffusion model [Nakashima+ 2024] on real-world data to capture its underlying distribution, which serves as a generative prior. We then condition the diffusion sampling process on the simulation sample via the proposed raydrop-aware masked guidance.
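As a reminder of what posterior sampling means for a diffusion prior, the standard Bayes' rule decomposition of the conditional score is sketched below. The symbols \(x_t\) (noisy sample at step \(t\)) and \(y\) (simulation sample) are notation assumed here for illustration; the first term on the right is supplied by the pre-trained diffusion model, and the second is the guidance term that has to be approximated.

\[
\nabla_{x_t} \log p(x_t \mid y) = \nabla_{x_t} \log p(x_t) + \nabla_{x_t} \log p(y \mid x_t)
\]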
Naïve conditioning suppresses realistic raydrop noise. To avoid this, we first generate the raydrop-aware mask \(m_t\) from the tentative Tweedie's estimate \(\hat{x}_t\) and then compute the sim-real discrepancy based on the pseudoinverse method [Song+ 2023]. The operator \(H\) corrupts the reflectance modality.
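The masking step can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; it rests on several assumptions made only for illustration: samples are stored as `(2, H, W)` arrays with channel 0 as range and channel 1 as reflectance, the operator \(H\) is modeled as dropping the reflectance channel, its pseudoinverse zero-fills that channel, a ray counts as dropped when its estimated range is not above a hypothetical `drop_threshold`, and Tweedie's estimate uses a DDPM-style noise-prediction parameterization. All function names are hypothetical.

```python
import numpy as np


def tweedie_estimate(x_t, eps_hat, alpha_bar_t):
    """Tentative clean estimate from a noisy sample (DDPM-style, assumed)."""
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_hat) / np.sqrt(alpha_bar_t)


def raydrop_mask(x_hat, drop_threshold=0.0):
    """1 where the estimate predicts a valid return, 0 where the ray is dropped.

    Thresholding the range channel is an assumption of this sketch.
    """
    return (x_hat[0:1] > drop_threshold).astype(x_hat.dtype)


def apply_H(x):
    """Assumed measurement operator: keep range, discard reflectance."""
    return x[0:1]


def apply_H_pinv(y):
    """Pseudoinverse of the assumed operator: zero-fill the reflectance channel."""
    return np.concatenate([y, np.zeros_like(y)], axis=0)


def masked_discrepancy(x_hat, y_sim, mask):
    """Sim-real residual, kept only at pixels the prior does not drop."""
    residual = apply_H_pinv(y_sim) - apply_H_pinv(apply_H(x_hat))
    return mask * residual  # mask (1, H, W) broadcasts over both channels


# Toy 2x2 example: the estimate predicts raydrop at two pixels.
x_hat = np.array([[[0.5, -0.2], [0.0, 1.0]],   # range channel
                  [[0.3, 0.3], [0.3, 0.3]]])   # reflectance channel
y_sim = np.ones((1, 2, 2))                      # simulated range, no raydrop
m_t = raydrop_mask(x_hat)
delta = masked_discrepancy(x_hat, y_sim, m_t)
```

In this toy example the residual is zeroed wherever the estimate predicts a dropped ray, so the simulated range does not overwrite raydrop produced by the prior, and the reflectance channel receives no correction from the simulation. A full guidance step in the style of the pseudoinverse method would additionally propagate this residual through the denoiser, which is omitted here.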
@inproceedings{miyawaki2026drum,
title = {{DRUM}: Diffusion-based Raydrop-aware Unpaired Mapping for Sim2Real {LiDAR} Segmentation},
author = {Tomoya Miyawaki and Kazuto Nakashima and Yumi Iwashita and Ryo Kurazume},
booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
pages = {},
year = 2026
}
This work was supported by JSPS KAKENHI Grant Numbers JP23K16974 and JP20H00230.