Yulin Wang (王语霖)


Ph.D. Candidate, Tsinghua University

CV (2022.07) | Google Scholar | Semantic Scholar | GitHub



About Me


I am a fourth-year Ph.D. candidate in the Department of Automation at Tsinghua University, advised by Prof. Gao Huang and Prof. Cheng Wu. Before that, I received my B.E. degree in Automation from Beihang University.

Feel free to call me "Rainforest", which shares the same pronunciation as "Yulin" in Chinese.

My research interests lie in the efficient training and inference of deep learning models.




News


2022.07: The Journal Version of GFNet is Accepted by TPAMI (IF=24.31).

2022.03 & 2022.07: AdaFocusV2 & AdaFocusV3 are Accepted by CVPR 2022 & ECCV 2022.

2021.12: Awarded the Baidu Fellowship 2021 (10 Ph.D. candidates worldwide).

2021.10: Awarded the CCF-CV Outstanding Young Researcher Award 2021 (3 recipients in China).

2021.09: Not All Images are Worth 16x16 Words! Our Dynamic Vision Transformer (DVT) is Accepted by NeurIPS 2021.

2021.09: Our Survey on Dynamic Neural Networks is Accepted by TPAMI (IF=24.31).

2021.07: AdaFocus is Accepted by ICCV 2021 for Oral Presentation.

2021.05: Selected as an Outstanding Reviewer of CVPR 2021.

2021.03: Three Papers are Accepted by CVPR 2021 (with one Oral).

2021.01: The Journal Version of ISDA is Accepted by TPAMI (IF=24.31).

2021.01: One Paper is Accepted by ICLR 2021.




Recent Publications (Selected)


Glance and Focus Networks for Dynamic Visual Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI, IF=24.31), 2022
Gao Huang*, Yulin Wang*, Kangchen Lv, Haojun Jiang, Wenhui Huang, Pengfei Qi, and Shiji Song (co-first author with my advisor)
[PDF] [Code] [知乎 (on GFNet)]
Journal version of GFNet. Adds an exploration of a new reward function and multi-scale patches, along with new results on high-resolution image recognition and video recognition. Significantly reduces inference cost on both CPUs and GPUs.


AdaFocus V3: On Unified Spatial-temporal Dynamic Video Recognition
European Conference on Computer Vision (ECCV) 2022
Yulin Wang, Yang Yue, Xinhong Xu, Ali Hassani, Victor Kulikov, Nikita Orlov, Shiji Song, Humphrey Shi, and Gao Huang
[PDF (to be added)]
Motivated by the observation that most existing works model spatial and temporal redundancy separately, we explore a unified formulation of spatial-temporal dynamic computation on top of AdaFocusV2.


AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022
Yulin Wang, Yang Yue, Yuanze Lin, Haojun Jiang, Zihang Lai, Victor Kulikov, Nikita Orlov, Humphrey Shi, and Gao Huang
[PDF] [Code]
Compared with AdaFocusV1: end-to-end trainable, much easier to implement, less than 50% of the training cost, and significantly stronger performance.


Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition
Advances in Neural Information Processing Systems (NeurIPS) 2021
Yulin Wang, Rui Huang, Shiji Song, Zeyi Huang, and Gao Huang
[PDF] [Code] [知乎] [量子位] [AI科技评论]
We develop a Dynamic Vision Transformer (DVT) to automatically configure a proper number of tokens for each individual image, leading to a significant improvement in computational efficiency, both theoretically and empirically.
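For readers curious about the early-exit idea behind DVT, here is a minimal, hypothetical sketch (not the released code): several classifiers process the same image with an increasing number of tokens, and inference stops as soon as one of them is sufficiently confident. The model interface (a `grid_size` argument), the grid sizes, and the confidence threshold are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def confident_exit_predict(models, image, grid_sizes=(7, 10, 14), threshold=0.9):
    """Illustrative cascade: models[i] tokenizes `image` into grid_sizes[i]**2 tokens.

    `models` is assumed to be ordered from cheapest (fewest tokens) to most
    expensive; each is assumed to return class logits of shape (1, num_classes).
    """
    for model, grid in zip(models, grid_sizes):
        logits = model(image, grid_size=grid)   # hypothetical model signature
        probs = F.softmax(logits, dim=-1)
        conf, pred = probs.max(dim=-1)
        if conf.item() >= threshold:            # confident enough: exit early
            return pred, conf
    return pred, conf                           # otherwise keep the largest model's output
```

Easy images exit after the cheap, coarse-token pass, so the average cost per image drops even though the worst case still runs the full-resolution model.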


Adaptive Focus for Efficient Video Recognition
IEEE/CVF International Conference on Computer Vision (ICCV Oral) 2021
Yulin Wang, Zhaoxi Chen, Haojun Jiang, Shiji Song, Yizeng Han, and Gao Huang
[PDF] [Code] [Poster] [知乎] [Bilibili]
In this paper, we explore the spatial redundancy in video recognition with the aim of improving computational efficiency. Extensive experiments on five benchmark datasets (ActivityNet, FCVID, Mini-Kinetics, and Something-Something V1&V2) demonstrate that our method is significantly more efficient than competitive baselines.


Regularizing Deep Networks with Semantic Data Augmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI, IF=24.31), 2021
Yulin Wang, Gao Huang, Shiji Song, Xuran Pan, Yitong Xia, and Cheng Wu
[PDF] [Code] [知乎] [新智元] [AI科技评论]
Journal version of ISDA. More ImageNet results. Visualizations on ImageNet. More results on semi-supervised learning, semantic segmentation and object detection.


Revisiting Locally Supervised Learning: an Alternative to End-to-end Training
International Conference on Learning Representations (ICLR) 2021
Yulin Wang, Zanlin Ni, Shiji Song, Le Yang, and Gao Huang
[PDF] [Code] [Poster] [知乎] [PaperWeekly] [Bilibili]
We provide a deep understanding of locally supervised learning and make it perform on par with end-to-end training, with a significantly reduced GPU memory footprint.


Glance and Focus: a Dynamic Approach to Reducing Spatial Redundancy in Image Classification
Advances in Neural Information Processing Systems (NeurIPS) 2020
Yulin Wang, Kangchen Lv, Rui Huang, Shiji Song, Le Yang, and Gao Huang
[PDF] [Code] [Poster] [知乎] [AI科技评论]
We propose a general framework for efficient CNN inference, which reduces the inference latency of MobileNet-V3 by 1.3x on an iPhone XS Max without sacrificing accuracy.


Implicit Semantic Data Augmentation for Deep Networks
Advances in Neural Information Processing Systems (NeurIPS) 2019
Yulin Wang, Xuran Pan, Shiji Song, Hong Zhang, Cheng Wu, and Gao Huang
[PDF] [Code] [Poster]
We propose a novel implicit semantic data augmentation (ISDA) approach to complement traditional augmentation techniques like flipping or translation.


Collaborative Learning with Corrupted Labels
Neural Networks (Q1, IF=9.66), 2019
Yulin Wang, Rui Huang, Gao Huang, Shiji Song, and Cheng Wu
[PDF]
We propose a collaborative learning approach to improve the robustness and generalization performance of DNNs on datasets with corrupted labels.


Dynamic Neural Networks: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI, IF=24.31), 2021
Yizeng Han, Gao Huang, Shiji Song, Le Yang, Honghui Wang, and Yulin Wang
[PDF] [智源社区] [机器之心-在线讲座] [Bilibili]
Dynamic neural networks are an emerging research topic in deep learning. Compared with static models, which have fixed computational graphs and parameters at the inference stage, dynamic networks can adapt their structures or parameters to different inputs, leading to notable advantages in terms of accuracy, computational efficiency, adaptiveness, etc. In this survey, we comprehensively review this rapidly developing area.
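As a toy illustration of what "adapting the structure to the input" can mean (my own minimal sketch, not code from the survey), the snippet below skips a residual block when a lightweight gate judges the current input to be easy; the gating rule and the block architecture are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class GatedResidualBlock(nn.Module):
    """A residual block that a tiny per-input gate can skip at inference time."""

    def __init__(self, dim: int):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.gate = nn.Linear(dim, 1)  # lightweight gate: should the body run at all?

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (1, dim) -- a single input, so the gate makes one decision per input.
        keep_prob = torch.sigmoid(self.gate(x))   # shape (1, 1)
        if keep_prob.item() < 0.5:                # "easy" input: skip the block entirely
            return x
        return x + self.body(x)                   # "hard" input: pay for the full block

# Usage sketch:
# block = GatedResidualBlock(dim=64)
# y = block(torch.randn(1, 64))
```

Note that this hard, non-differentiable gating is only an inference-time illustration; training such decisions typically requires soft relaxations or reinforcement learning, which is exactly the kind of issue a survey on dynamic networks discusses.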


Transferable Semantic Augmentation for Domain Adaptation
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR Oral) 2021
Shuang Li, Mixue Xie, Kaixiong Gong, Chi Harold Liu, Yulin Wang, and Wei Li
[PDF] [Code]
This paper extends the ISDA approach to the problem of domain adaptation, resulting in a simple but effective transferable semantic augmentation (TSA) algorithm. Extensive experiments on four benchmarks are conducted.


I also have several papers currently under review and hope for positive results. If you are interested in my research, please feel free to reach out to me.



Invited Talks





Academic Service


Reviewer for TPAMI, IJCV, TCYB, TNNLS, TCSVT, Pattern Recognition, TMLR, ...

Reviewer for ICML, NeurIPS, ICLR, CVPR, ICCV, ECCV, ...

Program Committee Member for AAAI.



Education


Ph.D. in Pattern Recognition and Machine Learning, Tsinghua University, China.
2019.8 - Present

B.Eng. in Automation, Beihang University, China.
2015.8 - 2019.6 (GPA Top 1/231)



Research Experience


Intern, Berkeley DeepDrive, University of California, Berkeley, CA, USA.
2018.7 - 2018.8, advised by Dr. Ching-Yao Chan.

Intern, Lab of Intelligent Manufacturing, Beihang University, China.
2017.6 - 2018.6, advised by Prof. Fei Tao.



Selected Honors





Contact


Email: wang-yl19@mails.tsinghua.edu.cn

Address: Room 616, Central Main Building, Tsinghua University, Beijing


