Asymmetric RGB-D semantic segmentation based on CNN-Transformer

Affiliation: Anhui University of Science and Technology

CLC number: TP391

Fund projects: National Natural Science Foundation of China; Natural Science Foundation of Anhui Province; Anhui Postdoctoral Science Foundation

    Abstract:

    RGB-D semantic segmentation has been extensively studied and has achieved remarkable results. However, traditional methods fuse RGB and depth features with simple strategies and thus struggle to make full use of multimodal information. In addition, most current methods use dual-stream Transformers to extract features, which substantially increases the number of parameters and hinders practical application. To address these problems, this paper combines a Transformer and a CNN to design an RGB-D semantic segmentation network, using a Mix Transformer to extract depth features and ConvNeXt to extract RGB features. To utilize RGB and depth information effectively, a feature interaction and complementation module is designed to realize the interaction and correction of RGB and depth information during feature extraction. To fuse the two modalities effectively, an asymmetric feature selection fusion module is proposed to combine RGB and depth features. Extensive experiments on the NYU Depth V2 and SUN RGB-D datasets show that the proposed method achieves fast and accurate segmentation of complex indoor scenes.
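    To make the architecture described above concrete, the following is a minimal PyTorch sketch of the two modules the abstract names: a feature interaction and complementation step, and an asymmetric feature selection fusion step. The module structure, gating and channel-attention choices, channel sizes, and names are illustrative assumptions, not the paper's actual implementation; the backbones (Mix Transformer for depth, ConvNeXt for RGB) are omitted and stand-in feature maps are used instead.

```python
# A minimal sketch, assuming PyTorch. Module structure, gates, and channel
# sizes are illustrative guesses at the abstract's two modules, not the
# paper's actual design. Backbone outputs are simulated with random tensors.
import torch
import torch.nn as nn


class FeatureInteraction(nn.Module):
    """Hypothetical feature interaction and complementation module:
    each modality is corrected by a gate computed from the other one."""

    def __init__(self, channels: int):
        super().__init__()
        self.from_depth = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.from_rgb = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor):
        # Cross-modal correction with residual connections.
        rgb_out = rgb + rgb * self.from_depth(depth)
        depth_out = depth + depth * self.from_rgb(rgb)
        return rgb_out, depth_out


class AsymmetricFusion(nn.Module):
    """Hypothetical asymmetric feature selection fusion module: channel
    attention selects informative channels from each branch, while the
    RGB branch keeps an extra identity path (the asymmetry)."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.select = nn.Sequential(
            nn.Conv2d(2 * channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, 2 * channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        # Compute per-channel selection weights from the pooled concatenation,
        # then split them between the RGB and depth branches.
        weights = self.select(self.pool(torch.cat([rgb, depth], dim=1)))
        w_rgb, w_depth = weights.chunk(2, dim=1)
        return rgb + w_rgb * rgb + w_depth * depth


# Stand-ins for one encoder stage: ConvNeXt (RGB) and Mix Transformer (depth)
# features at the same resolution and channel width.
rgb_feat = torch.randn(1, 64, 120, 160)
depth_feat = torch.randn(1, 64, 120, 160)

rgb_feat, depth_feat = FeatureInteraction(64)(rgb_feat, depth_feat)
fused = AsymmetricFusion(64)(rgb_feat, depth_feat)
print(fused.shape)  # torch.Size([1, 64, 120, 160])
```

    In a full network this interaction would presumably be applied at each encoder stage and the fused features passed to a decoder, but those details depend on design choices the abstract does not specify.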

History
  • Received: 2024-08-01
  • Revised: 2024-10-14
  • Accepted: 2024-10-28
  • Published online:
  • Publication date: