Yong Zhao

Prof. Yong Zhao, Research Professor

Title: Research Professor

Research interests: visual perception for embodied intelligence; binocular stereo matching algorithms; binocular 3D perception; computer vision and machine vision; vision-based inspection, measurement, recognition, localization, and visual guidance in intelligent manufacturing; deep learning; reinforcement learning; and applications of AI and large language models to vision-and-language navigation (VLN) and vision-language-action (VLA) in embodied intelligence.

Email:zhaoyong@fyust.edu.cn

  • Biography
  • Education and Professional Experience
  • Academic Achievements
  • Honors and Achievements

Biography


Yong Zhao received his Ph.D. from the Research Institute of Automation at Southeast University, supervised by Academicians Qian Zhonghan and Feng Chunbo. He subsequently served as a postdoctoral fellow and associate professor in the Department of Electronic Engineering at Zhejiang University. During his time in Canada, he conducted research at Concordia University on wavelet transforms for image processing and compression; a voice packet-loss concealment algorithm he developed there for Nortel Networks reached an industry-leading level. Dr. Zhao then joined Honeywell as a senior software engineer responsible for audio and video processing R&D, where he made multiple breakthroughs in video compression and intelligent object detection and search, significantly improving the performance of Honeywell's video surveillance systems and strengthening the competitiveness of its products in the North American market.


After returning to China in 2004, Dr. Zhao joined the School of Information Engineering at Peking University (Peking University Shenzhen Graduate School), where he worked on artificial intelligence, machine learning, computer vision, video analysis, and streaming media. In collaboration with a leading communications company, he developed the video acceleration engine IVE2.0 for surveillance chips, which became the world's best-selling product of its kind (tens of millions of units per year, with sales in the billions of RMB). He led the development of China's first driver-fatigue detection device, which won first place in the inaugural China Innovation and Entrepreneurship Competition, and, together with Hangsheng Electronics and HKPC (Hong Kong Productivity Council), developed one of China's earliest advanced driver-assistance systems.


In recent years, he has focused on industrial applications of vision technology, collaborating with multiple manufacturing enterprises to develop and deploy innovative products, including a high-precision long-travel flash measurement instrument, a binocular 3D endoscope, a line-width measurement machine, a microscopic 3D topography measurement instrument, a terminal inspection machine, and a magnetic-tile inspection machine, all of which are widely used on production lines. Dr. Zhao's current research centers on visual perception for embodied intelligence, robot spatial perception, binocular-vision-based 3D perception, 3D semantic segmentation and 6D pose estimation, and vision-guided random bin picking and automated assembly with robotic arms.

Education and Professional Experience


Education:

1997.08–2000.05 Concordia University, Canada, Electrical and Computer Engineering, Postdoctoral Fellow

1988.03–1991.11 Southeast University, Automatic Control, Ph.D. in Engineering

1985.09–1988.04 Northwestern Polytechnical University, Applied Mathematics, M.S. in Engineering

1981.09–1985.07 Guizhou University, Mathematics, B.S.


Work Experience:

2025.09–present Fuyao University of Science and Technology, School of Intelligent Manufacturing and Future Technology, Research Professor

2004.03–2025.07 Peking University Shenzhen Graduate School, School of Information Engineering, Associate Professor

2000.05–2004.03 Honeywell Canada, Audio/Video Algorithms and Technology, Senior Engineer

1997.08–2000.05 Concordia University, Canada, Department of Electrical and Computer Engineering, Postdoctoral Fellow / Research Associate

1991.11–1997.08 Zhejiang University, School of Information Engineering, Postdoctoral Fellow / Associate Professor

Academic Achievements


Selected Publications

[1] Ma, Y., Zhou, Q., Chen, X., et al. (2019). Multi-attention network for thoracic disease classification and localization. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 1378–1382).

[2] Zhou, Q., Ma, Y., Lu, H., et al. (2019). Discriminative features reconstruction network for semantic segmentation. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 1922–1926).

[3] Lu, H., Chen, X., Zhang, G., et al. (2019). SCANet: Spatial-channel attention network for 3D object detection. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 1992–1996).

[4] Cheng, K., Sun, J., Chen, X., et al. (2019). ACNet: Aggregated channels network for automated mitosis detection. In Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) (pp. 453–464). Springer, Cham.

[5] Fu, C., Pei, W., Cao, Q., Zhang, C., Zhao, Y., Shen, X., & Tai, Y.-W. (2019). Non-local recurrent neural memory for supervised sequence modeling. In IEEE International Conference on Computer Vision (ICCV). Seoul, South Korea.

[6] Chen, X., Lu, H., Cheng, K., Ma, Y., Zhou, Q., & Zhao, Y. (2019). Sequentially refined spatial and channel-wise feature aggregation in encoder-decoder network for single image dehazing. In IEEE International Conference on Image Processing (ICIP). Taipei, China.

[7] Wang, B., Zhao, Y., & Chen, C. L. P. (2019). Moving cast shadows segmentation using illumination invariant feature. IEEE Transactions on Multimedia.

[8] Yu, P., Zhao, Y., Zhang, J., & Xie, X. (2019). Pedestrian detection using multi-channel visual feature fusion by learning deep quality model. Journal of Visual Communication and Image Representation, 63.

[9] Yang, W., Ai, X., Yang, Z., Xu, Y., & Zhao, Y.* (2020). Dedge-AGMNet: An effective stereo matching network optimized by depth edge auxiliary task. In 24th European Conference on Artificial Intelligence (ECAI).

[10] Chen, X., Fu, C., Zhao, Y., et al. (2020). Salience-guided cascaded suppression network for person re-identification. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11] Chen, X., Yan, X., Zheng, F., Jiang, Y., Xia, S.-T., Zhao, Y., & Ji, R. (2020). One-shot adversarial attacks on visual tracking with dual attention. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12] Chen, X., Fu, C., Zheng, F., & Zhao, Y. (2020). A unified multi-scenario attacking network for visual object tracking. In AAAI Conference on Artificial Intelligence (AAAI).

[13] Yang, Z., Ai, X., Yang, W., Zhao, Y., Dai, Q., & Li, F. (2021). Deeply-fused attentive network for stereo matching. In International Conference on Pattern Recognition (ICPR) (pp. 1717–1724).

[14] Ai, X., Yang, Z., Yang, W., Zhao, Y., Yu, Z., & Li, F. (2020). Suppressing features that contain disparity edge for stereo matching. In International Conference on Pattern Recognition (ICPR).

[15] Wang, B., Zhao, Y., & Chen, C. L. P. (2021). Hybrid transfer learning and broad learning system for wearing mask detection in the COVID-19 era. IEEE Transactions on Instrumentation and Measurement.

[16] Peng, J., Xie, W., Huang, Z., et al. (2021). Hierarchical context guided aggregation network for stereo matching. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[17] Sang, H., Zhou, Q., & Zhao, Y. (2020). PCANet: Pyramid convolutional attention network for semantic segmentation. Image and Vision Computing, 103, 103997.

[18] Wu, L., Xu, Y., Zhao, Y., et al. (2021). A dual distance metrics method for improving classification performance. Electronics Letters, 57(1), 13–16.

[19] Chen, W., Peng, J., Zhu, Z., et al. (2021). HPA-Net: Hierarchical and parallel aggregation network for context learning in stereo matching. In International Conference on Computer Analysis of Images and Patterns (pp. 100–109). Springer, Cham.

[20] Ye, S., Zeng, P., Li, P., Wang, W., Wang, X., & Zhao, Y. (2022). MLP-Stereo: Heterogeneous feature fusion in MLP for stereo matching. In IEEE International Conference on Image Processing (ICIP) (pp. 101–105).

[21] Wang, W., Ye, S., Wang, X., & Zhao, Y. (2022). OMNET: Real-time stereo matching with unsupervised occlusion mask. In IEEE International Conference on Image Processing (ICIP) (pp. 1241–1245).

[22] Yang, X., Feng, Z., Zhao, Y., Zhang, G., & He, L. (2022). Edge supervision and multi-scale cost volume for stereo matching. Image and Vision Computing, 117, 104336.

[23] Cheng, X., Zhao, Y., Yang, W., et al. (2022). LESC: Superpixel cut-based local expansion for accurate stereo matching. IET Image Processing, 16(2), 470–484.

[24] Cheng, X., Zhao, Y., Yang, W., et al. (2022). A novel cell structure-based disparity estimation for unsupervised stereo matching. IET Image Processing, 16(6), 1678–1693.

[25] Sang, H., Yang, Z., Xu, Y., Cheng, X., Xiong, W., & Zhao, Y. (2022). Local expansion moves for stereo matching based on random sample consensus confidence. Journal of Electronic Imaging, 31(1).

[26] Zhao, H., Zhou, H., Zhang, Y., Zhao, Y., Yang, Y., & Ouyang, T. (2023). EAI-Stereo: Error aware iterative network for stereo matching. In Computer Vision – ACCV 2022 (pp. 3–19).

[27] Zhao, H., Zhou, H., Zhang, Y., Jie, C., Yang, Y., & Zhao, Y. (2023). High-frequency stereo matching network. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (Highlight).

[28] Qian, R., Zhao, Y., et al. (2023). Depth information precise completion-GAN: A precisely guided method for completing ill regions in depth maps. Remote Sensing.

[29] Qian, R., Zhao, Y., et al. (2023). An algorithm for single view occlusion area detection in binocular stereo matching. International Journal of Pattern Recognition and Artificial Intelligence, 37(2), 2350003.

[30] Qian, R., Zhao, Y., et al. (2023). MFF: An effective method of solving the ill regions in stereo matching. IET Computer Vision.

[31] Qian, R., Zhao, Y., et al. (2023). Multi condition guided depth map completion method based on diffusion model. In Proceedings of SPIE (International Society for Optical Engineering).

[32] Yang, W., Zhao, Y., Qian, R., Li, J., et al. (2023). Depth edge and structure optimization based end-to-end self-supervised stereo matching. International Journal of Pattern Recognition and Artificial Intelligence.

[33] Yang, W., Zhang, Z., Zhao, Y., Qian, R., et al. (2023). A clustering algorithm based on the structural information of graph. Entropy.

[34] Zhai, J., et al. (2023). Learnable blur kernel for single-image defocus deblurring in the wild. In AAAI Conference on Artificial Intelligence (AAAI).

[35] Ma, C., Zeng, P., Zhai, J., Liu, Y., Zhao, Y., & Wang, X. (2023). CLIP4Stereo: Revisiting domain generalized stereo matching via CLIP. In IEEE International Conference on Image Processing (ICIP) (pp. 106–110).

[36] Liao, X., Zhao, H., Zhao, Y., Yang, F., Chen, J., Cheung, K., Wang, X., & Jiang, J. (2025). RetinaStereo: Dynamic-volume stereo matching network. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

Honors and Achievements


Honors and Awards:

1. Third Prize, Guizhou Provincial Science and Technology Progress Award

2. Third Prize, 4th China College Students "Internet Plus" Innovation and Entrepreneurship Competition


Academic Service:

Reviewer for multiple international journals and conferences, including:

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

IEEE Transactions on Neural Networks and Learning Systems (TNNLS)

IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)

IEEE Transactions on Consumer Electronics

IEEE Transactions on Geoscience and Remote Sensing

IET Image Processing
