Research

Juanxi Tian

Currently, I am interested in several topics, including but not limited to:

  • Large-scale (Vision / Language) Foundation Models: My research investigates architectural design and its coupling with optimization, in particular how backbone choices interact with optimizers. I identified the Backbone-Optimizer Coupling Bias (BOCB), a phenomenon in which architectural properties disproportionately influence optimization behavior. By analyzing BOCB across different architectures, I uncovered dependencies that challenge traditional optimizer assumptions and provided practical guidelines for matching optimization strategies to architectural inductive biases. I also developed Scaling with Gradient Grouping (SGG), an adaptive optimization framework for large language models: SGG dynamically groups parameters by gradient similarity and scales learning rates per group, enabling stable updates while reducing computational overhead (see the sketch after this list). The method supports scalable training under resource constraints and advances effective training paradigms for foundation models.
  • Efficient & Scalable AI: As the training of large models continues to scale up, many conventional techniques cease to be effective. Developing efficient and scalable methodologies, such as speculative decoding, advanced optimizers, and domain-specific efficiency and optimization techniques tailored to generative models, is therefore essential. Crucially, these approaches must remain effective as model size continues to grow.
  • Unified Models for Generation and Understanding: Recent efforts to unify generation and understanding models continue to explore various aspects, including architectural design, data composition strategies, and training paradigms. While the broader community has yet to reach a consensus on the significance and long-term implications of unified modeling, I believe that bridging visual representation learning and generative modeling through a shared framework is a promising direction, and one that may play a crucial role in advancing toward Artificial General Intelligence (AGI).
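
For intuition about the grouping idea behind SGG, here is a minimal, hypothetical sketch: parameters are bucketed by a simple gradient statistic, and each bucket shares a learning-rate scale. The grouping criterion (gradient-norm bucketing) and the scaling rule below are illustrative assumptions for this sketch, not the algorithm described in the SGG paper.

```python
import torch

def group_by_gradient(params, num_groups=4):
    """Bucket parameters by gradient-norm magnitude (an illustrative stand-in
    for SGG's gradient-similarity grouping, not the published criterion)."""
    stats = [(p, p.grad.norm().item()) for p in params if p.grad is not None]
    if not stats:
        return []
    stats.sort(key=lambda x: x[1])
    size = max(1, len(stats) // num_groups)
    return [stats[i:i + size] for i in range(0, len(stats), size)]

def grouped_sgd_step(params, base_lr=1e-3, num_groups=4):
    """One SGD-style update where each group shares a learning-rate scale
    derived from its mean gradient norm (a hypothetical scaling rule)."""
    for group in group_by_gradient(list(params), num_groups):
        mean_norm = sum(norm for _, norm in group) / len(group)
        scale = 1.0 / (1.0 + mean_norm)  # damp groups with large gradients
        for p, _ in group:
            with torch.no_grad():
                p.add_(p.grad, alpha=-base_lr * scale)
```

Calling `grouped_sgd_step(model.parameters())` after `loss.backward()` would apply one such grouped update; a practical implementation would additionally handle momentum, weight decay, and a schedule for re-assigning groups.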

The following presents my research experience and areas of focus, together with a timeline of the periods when I was most actively engaged in each area. The page template is from here.


Efficient & Scalable AI (08/2024 - Present)

How to achieve efficient and scalable general/domain-specific technologies?

How to achieve effective training more efficiently and ensure good generalization?

  • Data-centric efficient AI
    • Openmixup: Open mixup toolbox and benchmark for visual representation learning
    • A Survey on Mixup Augmentations and Beyond
  • Efficient Optimization
    • SGG: Taming LLMs by Scaling Learning Rates with Gradient Grouping
    • Switch EMA: A Free Lunch for Better Flatness and Sharpness
    • BOCB: Unveiling the Backbone-Optimizer Coupling Bias in Visual Representation Learning

Generative Models (10/2024 - Present)

How to achieve efficient, high-quality generation with generative models?

How to build an efficient unified multimodal generation framework?

  • Unified Framework/Model (Image)
    • MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization
  • 4D
    • WideRange4D: Enabling High-Quality 4D Reconstruction with Wide-Range Movements and Scenes
    • Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis