Lightweight manipulator grasping method based on manifold projection and negative space analysis in point cloud bird's-eye view

Baoju Wu; Yancheng Li; Nanmu Hui; Xiaowei Han

doi:10.3934/electreng.2026016

AIMS Electronics and Electrical Engineering

2026, Volume 10, Issue 3: 395-421. doi: 10.3934/electreng.2026016

Research article

1. Institute of Interdisciplinary Technology, Shenyang University, Shenyang 110044, Liaoning, China
2. School of Intelligent Science and Information Engineering, Shenyang University, Shenyang 110044, Liaoning, China
3. School of Mechanical Engineering, Shenyang University, Shenyang 110044, Liaoning, China

Academic Editor: Rubén Puche Panadero

Received: 11 December 2025; Revised: 30 March 2026; Accepted: 27 April 2026; Published: 03 June 2026

To address the heavy data-processing load of 3D point cloud–based manipulator grasping and the resulting difficulty of meeting real-time requirements in industrial applications, this paper proposes a visual grasping method based on negative space analysis of the point cloud bird's-eye view (BEV). First, the YOLOv8 network performs fast and accurate 2D localization of targets in RGB images, a 3D frustum is constructed to preliminarily filter the scene point cloud, and the random sample consensus (RANSAC) algorithm then robustly segments the tabletop support plane. The core innovation is a geometric manifold projection strategy that reduces sparse 3D point clouds to a 2D BEV plane. Based on the theory of image moments, the contour of the "negative space" occupied by the object is parsed analytically, yielding the target's six-degree-of-freedom (6-DoF) grasping pose with linear computational complexity $O(N)$. Experimental results show that, compared with a baseline combining the Single Shot MultiBox Detector (SSD) and PointNetGPD, the proposed method improves the total system success rate by 5 percentage points (from 65% to 70%). Moreover, the average computation time per grasp falls from 550 ms to 210 ms, a speedup of about 2.6 times. This work verifies the feasibility of replacing complex 3D deep-learning models with lightweight geometric analysis in specific structured scenes.
Keywords: manipulator visual grasping; negative space analysis; image moments; point cloud segmentation; lightweight network
Citation: Baoju Wu, Yancheng Li, Nanmu Hui, Xiaowei Han. Lightweight manipulator grasping method based on manifold projection and negative space analysis in point cloud bird's-eye view[J]. AIMS Electronics and Electrical Engineering, 2026, 10(3): 395-421. doi: 10.3934/electreng.2026016
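
The abstract's central idea, collapsing an object's 3D points onto a BEV plane and reading a grasp pose from image moments in a single linear pass, can be illustrated with a minimal sketch. This is not the authors' implementation: it computes plain point-set moments rather than the paper's negative space contour analysis, and it assumes the points have already been cropped by the detection frustum, had the RANSAC plane removed, and been expressed in a table-aligned frame whose z-axis is the plane normal. The function name `bev_grasp_pose` and its return layout are hypothetical.

```python
import math


def bev_grasp_pose(points):
    """Estimate a planar grasp from object points via second-order moments.

    points: iterable of (x, y, z) tuples in a table-aligned frame
            (assumption: z is the support-plane normal).
    Returns (cx, cy, z_top, theta, closing), where theta is the angle of the
    object's principal axis in the BEV plane and closing is the direction in
    which a parallel-jaw gripper would close (perpendicular to that axis).
    """
    n = 0
    sx = sy = sxx = syy = sxy = 0.0
    z_top = -math.inf

    # Single pass over the points: this is the O(N) part.
    for x, y, z in points:
        n += 1
        sx += x
        sy += y
        sxx += x * x
        syy += y * y
        sxy += x * y
        z_top = max(z_top, z)

    if n == 0:
        raise ValueError("no object points to analyse")

    # First-order moments give the BEV centroid (projection drops z).
    cx, cy = sx / n, sy / n

    # Normalised second-order central moments.
    mu20 = sxx / n - cx * cx
    mu02 = syy / n - cy * cy
    mu11 = sxy / n - cx * cy

    # Standard image-moment orientation of the principal axis.
    theta = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)

    # Close the jaws across the long axis; a parallel-jaw gripper is
    # symmetric under a half turn, so wrap into (-pi/2, pi/2].
    closing = theta + math.pi / 2.0
    if closing > math.pi / 2.0:
        closing -= math.pi

    return cx, cy, z_top, theta, closing


if __name__ == "__main__":
    # A thin bar lying along the diagonal of the table frame.
    bar = [(float(t), float(t), 0.05) for t in range(11)]
    print(bev_grasp_pose(bar))
```

Under these assumptions the grasp position comes from the BEV centroid and the top height, and the yaw from the moment orientation; roll and pitch would be fixed by the plane normal, which is one way a top-down 6-DoF pose could be assembled from planar quantities.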

© 2026 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)