Dr Shiwei Liu

Ph.D.
Pronouns: He / Him
Status: Research Fellow
Telephone: +44 1865 270744
Website: https://shiweiliuiiiiiii.github.io/
ORCID iD: https://orcid.org/0009-0001-1255-4436
Research groups
  • Machine Learning and Data Science
  • Numerical Analysis
Address
Mathematical Institute
University of Oxford
Andrew Wiles Building
Radcliffe Observatory Quarter
Woodstock Road
Oxford
OX2 6GG
Major / recent publications

Adriana Fernandez-Lopez, Honglie Chen, Pingchuan Ma, Lu Yin, Qiao Xiao, Stavros Petridis, Shiwei Liu, Maja Pantic. "MSRS: Training Multimodal Speech Recognition Models from Scratch with Sparse Mask Optimization." Interspeech, 2024.

Qiao Xiao, Pingchuan Ma, Adriana Fernandez-Lopez, Boqian Wu, Lu Yin, Stavros Petridis, Mykola Pechenizkiy, Maja Pantic, Decebal Constantin Mocanu, Shiwei Liu. "Dynamic Data Pruning for Automatic Speech Recognition." Interspeech, 2024.

Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, Beidi Chen, Xiaoxia Wu, and Zhangyang Wang. "Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding." arXiv preprint arXiv:2403.04797 (2024).

Lu Yin, You Wu, Zhenyu Zhang, Cheng-Yu Hsieh, Yaqing Wang, Yiling Jia, Mykola Pechenizkiy, Yi Liang, Zhangyang Wang, and Shiwei Liu. "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity." The Forty-first International Conference on Machine Learning (ICML), 2024.

Lu Yin, Ajay Jaiswal, Shiwei Liu, Souvik Kundu, and Zhangyang Wang. "Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs Difficult Downstream Tasks in LLMs." The Forty-first International Conference on Machine Learning (ICML), 2024.

Yuxin Zhang, Yuxuan Du, Gen Luo, Yunshan Zhong, Zhenyu Zhang, Shiwei Liu, Rongrong Ji. "CaM: Cache Merging for Memory-efficient LLMs Inference." The Forty-first International Conference on Machine Learning (ICML), 2024.

Jie Ji, Gen Li, Lu Yin, Minghai Qin, Geng Yuan, Linke Guo, Shiwei Liu, Xiaolong Ma. "Advancing Dynamic Sparse Training by Exploring Optimization Opportunities." The Forty-first International Conference on Machine Learning (ICML), 2024.

Zhangheng Li, Shiwei Liu, Tianlong Chen, Ajay Kumar Jaiswal, Zhenyu Zhang, Dilin Wang, Raghuraman Krishnamoorthi, Shiyu Chang, Zhangyang Wang. "Sparse Cocktail: Co-Training Many Sparsity Patterns and Ratios at Once." The Forty-first International Conference on Machine Learning (ICML), 2024.

Yuxin Zhang, Lirui Zhao, Mingbao Lin, Yunyun Sun, Yiwu Yao, Xingjia Han, Jared Tanner, Shiwei Liu, and Rongrong Ji. "Dynamic sparse no training: Training-free fine-tuning for sparse llms." In The Twelfth International Conference on Learning Representations, 2024.

Gen Li, Lu Yin, Jie Ji, Wei Niu, Minghai Qin, Bin Ren, Linke Guo, Shiwei Liu, and Xiaolong Ma. "NeurRev: Train Better Sparse Neural Network Practically via Neuron Revitalization." In The Twelfth International Conference on Learning Representations, 2024.

Enneng Yang, Zhenyi Wang, Li Shen, Shiwei Liu, Guibing Guo, Xingwei Wang, and Dacheng Tao. "AdaMerging: Adaptive Model Merging for Multi-Task Learning." In The Twelfth International Conference on Learning Representations, 2024.

Hoang Pham, Shiwei Liu, Lichuan Xiang, Dung Le, Hongkai Wen, and Long Tran-Thanh. "Towards Data-Agnostic Pruning At Initialization: What Makes a Good Sparse Mask?" Advances in Neural Information Processing Systems 36 (2023).

Duc Hoang, Souvik Kundu, Shiwei Liu, and Zhangyang Wang. "Don’t just prune by magnitude! Your mask topology is a secret weapon." Advances in Neural Information Processing Systems 36 (2023): 65056-65068.

Lu Yin, Gen Li, Meng Fang, Li Shen, Tianjin Huang, Zhangyang Wang, Vlado Menkovski, Xiaolong Ma, Mykola Pechenizkiy, and Shiwei Liu. "Dynamic sparsity is channel-level sparsity learner." Advances in Neural Information Processing Systems 36 (2023).

Ajay Jaiswal, Shiwei Liu, Tianlong Chen, and Zhangyang Wang. "The emergence of essential sparsity in large pre-trained models: The weights that matter." Advances in Neural Information Processing Systems 36 (2023).

Tianjin Huang, Lu Yin, Zhenyu Zhang, Li Shen, Meng Fang, Mykola Pechenizkiy, Zhangyang Wang, and Shiwei Liu. "Are large kernels better teachers than transformers for convnets?" In International Conference on Machine Learning, pp. 14023-14038. PMLR, 2023.

Ajay Kumar Jaiswal, Shiwei Liu, Tianlong Chen, Ying Ding, and Zhangyang Wang. "Instant soup: Cheap pruning ensembles in a single pass can draw lottery tickets from large models." In International Conference on Machine Learning, pp. 14691-14701. PMLR, 2023.

Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen, Tianjin Huang, Ajay Jaiswal, and Zhangyang Wang. "Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!." ICLR 2023.

Shiwei Liu, Tianlong Chen, Xiaohan Chen, Xuxi Chen, Qiao Xiao, Boqian Wu, Tommi Kärkkäinen, Mykola Pechenizkiy, Decebal Mocanu, and Zhangyang Wang. "More convnets in the 2020s: Scaling up kernels beyond 51x51 using sparsity." ICLR 2023.

Shiwei Liu, Tianlong Chen, Xiaohan Chen, Li Shen, Decebal Constantin Mocanu, Zhangyang Wang, and Mykola Pechenizkiy. "The unreasonable effectiveness of random pruning: Return of the most naive baseline for sparse training." ICLR 2022.

Shiwei Liu, Tianlong Chen, Xiaohan Chen, Zahra Atashgahi, Lu Yin, Huanyu Kou, Li Shen, Mykola Pechenizkiy, Zhangyang Wang, and Decebal Constantin Mocanu. "Sparse training via boosting pruning plasticity with neuroregeneration." Advances in Neural Information Processing Systems 34 (2021): 9908-9922.

Shiwei Liu, Lu Yin, Decebal Constantin Mocanu, and Mykola Pechenizkiy. "Do we actually need dense over-parameterization? in-time over-parameterization in sparse training." In International Conference on Machine Learning, pp. 6989-7000. PMLR, 2021.

Preferred address

Mathematical Institute
University of Oxford
Oxford
OX2 6GG

Further details

Shiwei Liu is a Royal Society Newton International Fellow at the University of Oxford. Previously, he was a postdoctoral fellow at UT Austin and IFML. He obtained his Ph.D. from Eindhoven University of Technology (TU/e), the Netherlands.

I am open to collaborating with remote students and researchers who are interested in deep learning, efficient training and inference of LLMs, learning with sparsity, deep learning architectures, and related topics. Feel free to drop me an email if you are interested.

Recent publications
Revisiting Flatness-Aware Optimization in Continual Learning With Orthogonal Gradient Projection
Yang, E., Shen, L., Wang, Z., Liu, S., Guo, G., Wang, X., Tao, D. IEEE Transactions on Pattern Analysis and Machine Intelligence, volume 47, issue 5, 3895-3907 (08 May 2025)
Data-Adaptive Weight-Ensembling for Multi-task Model Fusion
Tang, A., Shen, L., Luo, Y., Liu, S., Hu, H., Du, B., Tao, D. International Journal of Computer Vision, 1-17 (25 Apr 2025)
Full-Rank No More: Low-Rank Weight Training for Modern Speech Recognition Models
Fernandez-Lopez, A., Liu, S., Yin, L., Petridis, S., Pantic, M. volume 00, 1-5 (11 Apr 2025)
Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective
Jin, C., Huang, T., Zhang, Y., Pechenizkiy, M., Liu, S., Chen, T. Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, issue 4, 4111-4119 (11 Apr 2025)
Sparse Sounds: Exploring Low-Dimensionality in Music Generation Model
Wang, S., Liu, S. volume 00, 3224-3234 (18 Dec 2024)
Dynamic Data Pruning for Automatic Speech Recognition
Xiao, Q., Ma, P., Fernandez-Lopez, A., Wu, B., Yin, L., Petridis, S., Pechenizkiy, M., Pantic, M., Mocanu, D., Liu, S. 4488-4492 (01 Sep 2024)
MSRS: Training Multimodal Speech Recognition Models from Scratch with Sparse Mask Optimization
Fernandez-Lopez, A., Chen, H., Ma, P., Yin, L., Xiao, Q., Petridis, S., Liu, S., Pantic, M. 2820-2824 (01 Sep 2024)
Dynamic sparse no training: training-free fine-tuning for sparse llms
Tanner, J. (15 Mar 2024)
E2ENet: Dynamic Sparse Feature Fusion for Accurate and Efficient 3D Medical Image Segmentation
Wu, B., Xiao, Q., Liu, S., Yin, L., Pechenizkiy, M., Mocanu, D., van Keulen, M., Mocanu, E. Advances in Neural Information Processing Systems, volume 37 (01 Jan 2024)
HRBP: Hardware-friendly Regrouping towards Block-based Pruning for Sparse CNN Training
Ma, H., Zhang, C., Xiang, L., Ma, X., Yuan, G., Zhang, W., Liu, S., Chen, T., Tao, D., Wang, Y., Wang, Z., Xie, X. Proceedings of Machine Learning Research, volume 234, 282-301 (01 Jan 2024)
Advancing Dynamic Sparse Training by Exploring Optimization Opportunities
Ji, J., Li, G., Yin, L., Qin, M., Yuan, G., Guo, L., Liu, S., Ma, X. Proceedings of Machine Learning Research, volume 235, 21606-21619 (01 Jan 2024)
NeurRev: Train Better Sparse Neural Network Practically via Neuron Revitalization
Li, G., Yin, L., Ji, J., Niu, W., Qin, M., Ren, B., Guo, L., Liu, S., Ma, X. 12th International Conference on Learning Representations, ICLR 2024 (01 Jan 2024)
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity
Yin, L., Wu, Y., Zhang, Z., Hsieh, C., Wang, Y., Jia, Y., Li, G., Jaiswal, A., Pechenizkiy, M., Liang, Y., Bendersky, M., Wang, Z., Liu, S. Proceedings of Machine Learning Research, volume 235, 57101-57115 (01 Jan 2024)
FFN-SkipLLM: A Hidden Gem for Autoregressive Decoding with Adaptive Feed Forward Skipping
Jaiswal, A., Hu, B., Yin, L., Ro, Y., Chen, T., Liu, S., Akella, A. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 16943-16956 (2024)
Sparse Cocktail: Every Sparse Pattern Every Sparse Ratio All At Once
Li, Z., Liu, S., Chen, T., Jaiswal, A., Zhang, Z., Wang, D., Krishnamoorthi, R., Chang, S., Wang, Z. Proceedings of Machine Learning Research, volume 235, 28368-28386 (01 Jan 2024)
AdaMerging: Adaptive Model Merging for Multi-Task Learning
Yang, E., Wang, Z., Shen, L., Liu, S., Guo, G., Wang, X., Tao, D. 12th International Conference on Learning Representations, ICLR 2024 (01 Jan 2024)
Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding
Zhang, Z., Chen, R., Liu, S., Yao, Z., Ruwase, O., Chen, B., Wu, X., Wang, Z. Advances in Neural Information Processing Systems, volume 37 (01 Jan 2024)
AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models
Lu, H., Zhou, Y., Liu, S., Wang, Z., Mahoney, M., Yang, Y. Advances in Neural Information Processing Systems, volume 37 (01 Jan 2024)
Is C4 Dataset Optimal for Pruning? An Investigation of Calibration Data for LLM Pruning
Bandari, A., Yin, L., Hsieh, C., Jaiswal, A., Chen, T., Shen, L., Krishna, R., Liu, S. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 18089-18099 (2024)
Junk DNA Hypothesis: Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs
Yin, L., Jaiswal, A., Liu, S., Kundu, S., Wang, Z. Proceedings of Machine Learning Research, volume 235, 57053-57068 (01 Jan 2024)
Research interests

Deep Learning

Machine Learning

Learning with Sparsity

Large Language Models

Deep Learning Architectures

Prizes, awards, and scholarships

Rising Star in AI, KAUST, 01/2024

Rising Star Award, Conference on Parsimony and Learning (CPAL), 10/2023

Best Ph.D. Dissertation Award Runner-Up, Informatics Europe, 10/2023

Newton International Fellowship Award, Royal Society & British Academy, 9/2023

Best Paper Award, Learning on Graphs Conference (LoG 2022), 11/2022

Cum Laude (Distinguished Ph.D. thesis, top 5%), Eindhoven University of Technology, NL, 4/2022
