Research article

Lightweight manipulator grasping method based on manifold projection and negative space analysis in point cloud bird's-eye view


  • Published: 03 June 2026
  • To address the heavy data-processing load of 3D point cloud–based manipulator grasping, and the resulting difficulty in meeting the real-time requirements of industrial applications, this paper proposes a visual grasping method based on negative space analysis of the point cloud bird's-eye view (BEV). First, the YOLOv8 network performs fast and accurate 2D localization of targets in RGB images, and a 3D frustum is constructed from each detection to pre-filter the scene point cloud; the random sample consensus (RANSAC) algorithm then robustly segments the desktop support plane. The core innovation is a geometric manifold projection strategy that reduces the sparse 3D point cloud to a 2D BEV plane. Using the theory of image moments, the contour of the "negative space" occupied by the object is parsed analytically, which yields the target's six-degree-of-freedom (6-DoF) grasping pose with linear computational complexity, $O(N)$. Experimental results show that, compared with a baseline combining the single-shot multibox detector (SSD) with PointNetGPD, the proposed method improves the total system success rate by 5 percentage points (from 65% to 70%). Moreover, the average computation time per grasp falls from 550 ms to 210 ms, a speedup of more than 2.6 times. This work demonstrates the feasibility of replacing complex 3D deep-learning models with lightweight geometric analysis in specific structured scenes.
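The BEV projection and image-moment step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the object points have already been isolated (detection, frustum filtering, and plane removal done), that they are expressed in a table-aligned frame with z pointing up, and that the gripper approaches top-down, so only position and yaw are estimated. The function names `bev_cells` and `grasp_from_moments` and the 5 mm cell size are hypothetical, and the paper's negative-space contour parsing is not reproduced here; the sketch only shows how second-order central moments of the occupied BEV cells give a centroid and a principal axis in a single pass over the data.

```python
import math


def bev_cells(points, resolution=0.005):
    """Drop z and hash every (x, y) into a grid cell (assumed 5 mm cells).

    Returns the set of occupied (column, row) cells; one pass over the points.
    """
    return {(math.floor(x / resolution), math.floor(y / resolution))
            for x, y, _ in points}


def grasp_from_moments(cells, resolution, z_top):
    """Estimate a top-down grasp from image moments of the occupied cells.

    Assumption: the gripper closes across the minor axis of the footprint.
    """
    n = len(cells)                                   # zeroth moment m00
    cx = sum(i for i, _ in cells) / n                # m10 / m00
    cy = sum(j for _, j in cells) / n                # m01 / m00
    mu20 = sum((i - cx) ** 2 for i, _ in cells)      # central moments
    mu02 = sum((j - cy) ** 2 for _, j in cells)
    mu11 = sum((i - cx) * (j - cy) for i, j in cells)

    # Orientation of the major axis, in (-pi/2, pi/2].
    major_axis = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    # Perpendicular direction, wrapped into [-pi/2, pi/2).
    closing_yaw = (major_axis + math.pi) % math.pi - math.pi / 2

    # Convert the cell-index centroid back to metric cell-centre coordinates.
    position = ((cx + 0.5) * resolution, (cy + 0.5) * resolution, z_top)
    return {"position": position,
            "major_axis": major_axis,
            "closing_yaw": closing_yaw}
```

Both functions touch each point or occupied cell a constant number of times, which is consistent with the linear complexity stated in the abstract, although the actual method may organize the computation differently.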

    Citation: Baoju Wu, Yancheng Li, Nanmu Hui, Xiaowei Han. Lightweight manipulator grasping method based on manifold projection and negative space analysis in point cloud bird's-eye view[J]. AIMS Electronics and Electrical Engineering, 2026, 10(3): 395-421. doi: 10.3934/electreng.2026016



  • © 2026 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
