Recent advances in generative models have achieved high fidelity in 3D human reconstruction, yet their utility for specific downstream tasks (e.g., 3D human segmentation) remains limited. We propose HumanCrafter, a unified framework that jointly models appearance and human-part semantics from a single image in a feed-forward manner. Specifically, we integrate human geometric priors in the reconstruction stage and self-supervised semantic priors in the segmentation stage. To address the scarcity of labeled 3D human datasets, we further develop an interactive annotation procedure for generating high-quality data-label pairs. Our pixel-aligned aggregation enables cross-task synergy, while the multi-task objective simultaneously optimizes texture modeling fidelity and semantic consistency. Extensive experiments demonstrate that HumanCrafter surpasses existing state-of-the-art methods in both 3D human-part segmentation and 3D human reconstruction from a single image. Ablation studies validate the efficacy of the key model designs. The constructed dataset and code will be released.
The network architecture of HumanCrafter. The proposed method uses 2D diffusion priors and human body geometry features to regress pixel-aligned point maps via a generic Transformer. A second Transformer then applies an attention mechanism to produce a set of semantic 3D Gaussians that encode geometric, appearance, and semantic information. The entire pipeline is trained end-to-end by minimizing a loss function that compares the predicted outputs against ground-truth data and rasterized label maps from novel viewpoints.
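To make the training objective described above more concrete, here is a minimal sketch of a multi-task loss that combines an appearance term with a semantic term. It is not the authors' implementation: the choice of mean squared error for appearance, per-pixel cross-entropy for part labels, the function names, and the weights `w_rgb` and `w_sem` are all assumptions made for illustration. Images are represented as flattened lists of pixels so the example runs with the standard library alone.

```python
import math


def photometric_loss(pred_rgb, gt_rgb):
    """Mean squared error over flattened pixels, each an (r, g, b) tuple in [0, 1]."""
    total = 0.0
    count = 0
    for pred_pixel, gt_pixel in zip(pred_rgb, gt_rgb):
        for pred_c, gt_c in zip(pred_pixel, gt_pixel):
            total += (pred_c - gt_c) ** 2
            count += 1
    return total / count


def semantic_loss(pred_logits, gt_labels):
    """Mean per-pixel cross-entropy between class logits and integer part labels."""
    total = 0.0
    for logits, label in zip(pred_logits, gt_labels):
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_norm - logits[label]
    return total / len(gt_labels)


def multi_task_loss(pred_rgb, gt_rgb, pred_logits, gt_labels,
                    w_rgb=1.0, w_sem=0.5):
    """Weighted sum of both terms; the weights are hypothetical, not from the paper."""
    return (w_rgb * photometric_loss(pred_rgb, gt_rgb)
            + w_sem * semantic_loss(pred_logits, gt_labels))


# Toy novel-view render: 4 pixels, 4 hypothetical part classes.
rendered_rgb = [(0.5, 0.5, 0.5)] * 4
target_rgb = [(0.5, 0.5, 0.5)] * 4
rendered_logits = [[0.0, 0.0, 0.0, 0.0] for _ in range(4)]
target_labels = [0, 1, 2, 3]

loss = multi_task_loss(rendered_rgb, target_rgb, rendered_logits, target_labels)
print(f"multi-task loss: {loss:.4f}")
```

In this toy case the appearance term is zero because the render matches the target, so the total reduces to the weighted cross-entropy of a uniform prediction over four classes. In a real pipeline both terms would be computed on images and label maps rasterized from the predicted Gaussians at novel viewpoints, with automatic differentiation supplying the gradients.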
[Result gallery: 3D results with downloadable PLY files, comparing Ours with Human3Diffusion.]