|
Panwang Pan | 潘攀望
Hi, I’m Panwang Pan, a Senior Researcher working at the intersection of Generative AI and multimodal learning.
Previously, I was a Senior Algorithm Engineer at Alibaba Cloud, where I bridged research and production by deploying models to complex, real-world systems ranging from embedded devices to cloud platforms. I led the algorithm deployment for the Aliyun AI-Box.
I received my M.S. in 2019 from Xiamen University (School of Informatics).
Email / Google Scholar / GitHub / Twitter / WeChat
|
|
📢 News
[2026-02] Five papers were accepted to CVPR 2026. One paper was accepted to ICLR 2026.
[2025-09] Six papers, including one oral presentation, were accepted to NeurIPS 2025.
[2025-06] One paper was accepted to ICCV 2025, and we released PartCrafter, a 3D-native diffusion transformer that generates 3D objects part by part.
[2025-02] One paper was accepted to CVPR 2025.
[2025-01] Three papers, including one Spotlight paper, were accepted to ICLR 2025.
|
|
Research Overview
My recent work is organized into two directions: Multimodal Generation and VLM Multimodal Understanding. In generation, I prioritize scene generation and world models, then extend to video and 3D content creation. In understanding, I focus on Jarvis-style systems, agentic workflows, and multimodal reasoning for perception and decision-making.
|
1. Multimodal Generation
World models first, with work presented in the order of scene generation, video generation, and 3D content generation.
Representative topics: 4D scenes, dynamic worlds, controllable video generation, meshes, Gaussian splats, and semantic layouts.
|
2. VLM Multimodal Understanding
Jarvis series and related agent systems where VLMs interpret instructions, coordinate tools, and improve downstream perception.
Representative topics: image restoration agents, photo retouching agents, multimodal planning, and perception-oriented VLM pipelines.
|
|
|
Multimodal Generation
|
Video Generation
Controllable video synthesis with compositional objectives and language-grounded reward signals.
|
|
Scene Generation and World Models
Dynamic scene generation, 4D world modeling, and motion-aware representations for controllable environments.
|
|
|
4K4DGEN: Panoramic 4D Generation at 4K Resolution
Panwang Pan*‡, Renjie Li*, Bangbang Yang, Dejia Xu, Shijie Zhou, Xuanyang Zhang, Zeming Li, Achuta Kadambi, Zhangyang Wang, Zhengzhong Tu, Zhiwen Fan
[OpenReview]
[Paper]
[Project]
[Code]
4K4DGEN is the first method to achieve high-quality panorama-to-4D generation at 4K resolution, using efficient splatting techniques that support real-time exploration.
|
|
|
DynamicVerse: Physically-Aware Multimodal Modeling for Dynamic 4D Worlds
Kairun Wen, Yuzhi Huang, Runyu Chen, Hui Zheng, Yunlong Lin, Panwang Pan, Chenxin Li, Wenyan Cong, Jian Zhang, Junbin Lu, Chenguo Lin, Dilin Wang, Zhicheng Yan, Hongyu Xu, Justin Theiss, Yue Huang, Xinghao Ding, Rakesh Ranjan, Zhiwen Fan
[Paper]
[Project]
[Code]
DynamicVerse is a physical-scale, multimodal 4D modeling framework for real-world videos.
|
3D Content Generation
Structured 3D generation spanning meshes, Gaussian splats, humans, and semantic layouts.
VLM Multimodal Understanding
|
This track covers Jarvis-style systems, agentic workflows, and multimodal understanding modules where VLMs interpret instructions, coordinate tools, and improve downstream perception and decision-making.
|
Jarvis Series and Agentic Understanding
Representative VLM systems for multimodal planning, restoration, and creative interaction.
NeurIPS 2025
|
JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent
Yunlong Lin, Zixu Lin, Kunjie Lin, Jinbin Bai, Panwang Pan, Chenxin Li, Haoyu Chen, Zhongdao Wang, Xinghao Ding‡, Wenbo Li, Shuicheng Yan‡
[Paper]
[Project]
[Code]
JarvisArt shows how an agentic VLM can plan and execute photo retouching while preserving content fidelity and faithfully following instructions.
|
🏆 Selected Awards
2024, 2023: ByteStyle Innovation Breakthrough Award (ByteDance)
2019: Outstanding Graduate of Xiamen University
2018: National Scholarship for Postgraduates, Ministry of Education (China’s highest scholarship honor)
2018: First Prize of GEDC; Second Prize of MCM & CPIPC
2017: Zhongxian Huang Scholarship, Xiamen University (≈10 awards per year)
2015: National Scholarship for Undergraduates (China’s highest scholarship honor)
|
💬 Miscellaneous
Conference Reviewer: NeurIPS, ICLR, CVPR, ICML, ICCV, ACM MM
|
|