Enhancing 3D Character Generation with ControlNet and LoRA

¹University of California, Berkeley, ²ShanghaiTech University

Abstract

Demand for efficient, controllable tools for digital 3D content creation continues to grow. This paper improves 3D character generation by integrating ControlNet and Low-Rank Adaptation (LoRA) into existing text-to-image diffusion models. Existing systems often suffer from poor spatial consistency and multi-headed artifacts, both of which stem from low-quality multi-view image synthesis.
Our approach uses ControlNet for pose control and adapts 3D Gaussian Splatting for spatial optimization and pruning. We also use LoRA to fine-tune pre-trained text-to-3D models, which enables personalized, high-fidelity 3D characters that meet specific user requirements. In addition, we apply Noise-Free Score Distillation (NFSD), which significantly improves model performance at lower classifier-free guidance (CFG) scales. Together, these components produce detailed, high-resolution 3D avatars from text descriptions while keeping features consistent across views.
To validate the proposed method, we conducted ablation studies and user evaluations. These compare our approach with existing baselines and show that it generates more photo-realistic 3D models that reflect user inputs more accurately. This work advances AI-assisted 3D character generation and text-to-3D synthesis, with applications in gaming, animation, and virtual reality.
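
As a point of reference for the low-rank adaptation mentioned above, the following is a minimal sketch of the general LoRA idea: a frozen weight matrix W is augmented with a trainable low-rank product, giving an effective weight W + (alpha / r) B A. It is not the implementation used in this work. The helper names (`transpose`, `matmul`, `lora_forward`, `trainable_params`), the toy dimensions, and the scaling value `alpha` are illustrative assumptions, and pure Python lists are used only to keep the example dependency-free.

```python
import random

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_forward(x, w, lora_a, lora_b, alpha=1.0):
    """Apply a linear layer with a LoRA update, without merging the weights.

    x: (n, d_in) inputs, w: (d_out, d_in) frozen weight,
    lora_a: (r, d_in) and lora_b: (d_out, r) trainable low-rank factors.
    Computes x W^T + (alpha / r) * x A^T B^T.
    """
    rank = len(lora_a)
    scale = alpha / rank
    base = matmul(x, transpose(w))  # frozen pre-trained path
    low = matmul(matmul(x, transpose(lora_a)), transpose(lora_b))  # low-rank path
    return [[b + scale * l for b, l in zip(b_row, l_row)]
            for b_row, l_row in zip(base, low)]

def trainable_params(d_out, d_in, rank):
    """Number of trainable values in the two low-rank factors."""
    return rank * (d_in + d_out)

# Toy setup (hypothetical sizes): A starts small and random, B starts at zero,
# so the adapted layer initially reproduces the frozen layer exactly.
random.seed(0)
d_in, d_out, rank = 8, 6, 2
w = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
lora_a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
lora_b = [[0.0] * rank for _ in range(d_out)]
x = [[random.gauss(0, 1) for _ in range(d_in)]]

y = lora_forward(x, w, lora_a, lora_b, alpha=4.0)
```

In this toy setting the low-rank factors hold `trainable_params(6, 8, 2)` values, fewer than the 6 x 8 entries of the full weight matrix; the saving grows with layer size, which is what makes this style of fine-tuning inexpensive for personalization.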