MatrixTeam-AI/matrix

The Matrix

Download The Matrix model weights at 🤗 Huggingface or 🤖 ModelScope

Download The Matrix Dataset at 🤗 Huggingface or 🤖 ModelScope

📚 View the Paper, Website, and Documentation

👋 Say Hi to our team and members at Matrix-Team

📍 Explore The Matrix playground online at Journee to experience a real-time, AI-generated world.

Real-Time Inference Tech Report: Coming Soon

🌍 Global Adoption & Acknowledgements

The Matrix Dataset has already been cited, tested, or deployed by the following research labs and industry teams.
Thank you for advancing open, multimodal research together!

Dynamics Lab
UC San Diego
University of Waterloo
Microsoft Research
ByteDance
Alibaba
Tencent
Duality AI
AMD AI Group
Journee
University of Hong Kong
Nanyang Technological University
Vector Institute
Hong Kong UST
Tsinghua University

…and growing!

What is The Matrix?

The Matrix is a world model designed to generate high-quality, infinite-length interactive video in real time, setting a new benchmark for neural interactive simulation. It combines several innovations:

  • A world model that generates continuous, interactive video with a high degree of realism and no fixed length limit.
  • A real-time system that supports infinite content generation, overcoming limitations of earlier models built for simpler game environments such as DOOM or Minecraft.
  • A model architecture built on the Swin-DPM model, designed to produce dynamic, ever-expanding content.
  • A training strategy that integrates both real and simulated data to improve the system's generalization.

At its core, The Matrix combines these elements to push the boundaries of interactive video generation, making real-time, high-quality, infinite-length content a reality.

Documentation

Comprehensive documentation is available in English. This includes detailed installation steps, tutorials, and training instructions. The paper and Project Page offer more details about the method.

Model Weights

Model checkpoints are available on Hugging Face and ModelScope. Please refer to the Documentation for how to load them for inference.

Important Updates

The previous version of The Matrix was built on an internal version of Video DiT and, at the request of Alibaba Tongyi Lab, cannot be openly released. We have therefore re-implemented The Matrix on top of CogVideoX, an openly released video generation model. We sincerely thank the CogVideo team for their contributions.

Implemented Features

Most planned components are now live, delivering real-time, infinite-horizon generation at 16 FPS with near-zero latency:

  • 8-GPU Parallel Inference for DiT & VAE
    Both the Diffusion Transformer (DiT) backbone and our VAE decoder run across 8 GPUs in parallel, yielding a 6–8× speedup over single-GPU baselines.

  • Stream Consistency Models
    Advanced consistency losses enable uninterrupted generation over arbitrary lengths, boosting end-to-end throughput by 7–10×.
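
As a rough illustration of what the 16 FPS target implies, the sketch below computes the per-frame time budget and the single-GPU per-frame time that a 6–8× speedup could bring within that budget. It assumes frames are produced one at a time and that the quoted speedup applies to per-frame time; neither is stated here, and the function names are hypothetical.

```python
# Illustrative arithmetic only. Assumes frames are generated one at a time and
# that the quoted 6-8x speedup applies directly to per-frame time.

TARGET_FPS = 16             # frame rate quoted in this README
SPEEDUP_RANGE = (6.0, 8.0)  # 8-GPU speedup range quoted in this README


def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at the given frame rate."""
    return 1000.0 / fps


def max_single_gpu_ms(fps: float, speedup: float) -> float:
    """Largest single-GPU per-frame time that still fits the budget after `speedup`."""
    return frame_budget_ms(fps) * speedup


if __name__ == "__main__":
    print(f"per-frame budget at {TARGET_FPS} FPS: {frame_budget_ms(TARGET_FPS):.1f} ms")
    for s in SPEEDUP_RANGE:
        limit = max_single_gpu_ms(TARGET_FPS, s)
        print(f"{s:.0f}x speedup: single-GPU time of up to {limit:.1f} ms per frame fits")
```

At 16 FPS the budget is 62.5 ms per frame, so under these assumptions a single-GPU pipeline taking roughly 375 to 500 ms per frame would just reach real time after parallelization.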

Key Capabilities

  • Real-Time Control
    Responds instantly to live inputs (e.g., steering, throttle), updating the generated scene in real time.

  • Infinite-Horizon Generation
    Seamlessly extend scenes without drift or degradation; generate for as long as you like.

  • Low-Latency Feedback Loop
    End-to-end system sustains a continuous 16 FPS render/playback cycle for smooth interactive experiences.
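
The feedback loop described above can be pictured as a fixed-rate loop that reads a control input, asks the model for the next frame, and then waits for the next 1/16 s tick. The sketch below is a minimal illustration of that pacing, not the repository's actual API: `generate_frame` is a hypothetical stand-in for the model call, and the string controls are placeholders.

```python
import time
from typing import Callable, Iterable, List


def run_interactive_loop(
    controls: Iterable[str],
    generate_frame: Callable[[str], bytes],  # hypothetical stand-in for the model call
    fps: float = 16.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[bytes]:
    """Generate one frame per control input, paced at `fps` frames per second."""
    period = 1.0 / fps
    frames: List[bytes] = []
    next_deadline = clock()
    for control in controls:
        frames.append(generate_frame(control))
        next_deadline += period
        remaining = next_deadline - clock()
        if remaining > 0:
            # Finished early: wait out the rest of this frame's time slot.
            sleep(remaining)
        else:
            # Fell behind: restart the schedule instead of bursting to catch up.
            next_deadline = clock()
    return frames


if __name__ == "__main__":
    # Dummy generator that just echoes the control as bytes.
    out = run_interactive_loop(["left", "straight", "right"], lambda c: c.encode())
    print(len(out), "frames generated")
```

Injecting `sleep` and `clock` keeps the pacing logic testable without real waiting; a real deployment would replace the dummy generator with the model's streaming inference call.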

Latency

GPU Type    Latency    VAEnum    DiTnum
A100        0.6 s      3         5
A900        0.6 s      3         5
L40         1.2 s      1         3
H100        -          -         -
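
To relate the latency figures to the 16 FPS frame rate, the sketch below converts each measured latency into the number of frames that elapse between an input and its visible effect. It also sums the VAEnum and DiTnum columns on the assumption that they are the numbers of GPUs assigned to the VAE and the DiT; the table does not spell this out. The H100 row is omitted because no values are reported for it.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LatencyRow:
    gpu: str
    latency_s: float
    vae_num: int  # assumed: number of GPUs running the VAE
    dit_num: int  # assumed: number of GPUs running the DiT


# Rows copied from the latency table; H100 has no reported values.
ROWS: List[LatencyRow] = [
    LatencyRow("A100", 0.6, 3, 5),
    LatencyRow("A900", 0.6, 3, 5),
    LatencyRow("L40", 1.2, 1, 3),
]


def frames_of_delay(latency_s: float, fps: float = 16.0) -> float:
    """Number of frames displayed during `latency_s` at the given frame rate."""
    return latency_s * fps


def total_gpus(row: LatencyRow) -> int:
    """Total GPU count under the assumption stated above."""
    return row.vae_num + row.dit_num


if __name__ == "__main__":
    for row in ROWS:
        delay = frames_of_delay(row.latency_s)
        print(f"{row.gpu}: {delay:.1f} frames of delay, {total_gpus(row)} GPUs")
```

Under this reading, the A100 configuration uses 3 + 5 = 8 GPUs, which matches the 8-GPU parallel inference setup described under Implemented Features, and a 0.6 s latency corresponds to about 9.6 frames at 16 FPS.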

Known Issues

  • Latency bottlenecks
    End-to-end inference sometimes falls below real-time requirements under heavy load.

  • Color degradation on long straight segments
    Sustained straight driving causes gradual visual drift; sharp turns temporarily correct colors.

  • Global consistency drift
    Over extended horizons, scene coherence can degrade (e.g., object placement, lighting).
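
One simple way to quantify the color degradation issue is to track how far each frame's per-channel mean color moves from the first frame. The sketch below is a hypothetical diagnostic, not part of this repository; it represents a frame as a flat sequence of (R, G, B) tuples to stay dependency-free.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]
Frame = Sequence[Pixel]


def channel_means(frame: Frame) -> Tuple[float, float, float]:
    """Mean R, G and B values over all pixels of one frame."""
    n = len(frame)
    if n == 0:
        raise ValueError("frame has no pixels")
    r = sum(p[0] for p in frame) / n
    g = sum(p[1] for p in frame) / n
    b = sum(p[2] for p in frame) / n
    return (r, g, b)


def color_drift(frames: Sequence[Frame]) -> List[float]:
    """Largest per-channel mean shift of each frame relative to the first frame."""
    if not frames:
        return []
    ref = channel_means(frames[0])
    return [
        max(abs(m - r) for m, r in zip(channel_means(frame), ref))
        for frame in frames
    ]


if __name__ == "__main__":
    # Three tiny two-pixel frames whose red channel grows over time.
    clip = [
        [(100, 100, 100), (100, 100, 100)],
        [(100, 100, 100), (110, 100, 100)],
        [(120, 100, 100), (120, 100, 100)],
    ]
    print(color_drift(clip))
```

A steadily rising drift value over a long straight segment would be consistent with the issue described above. Scene content also changes mean color, so the metric is only indicative.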

Planned

  • Training on Fused Realistic + Simulated Data
    Joint training on real-world captures and high-fidelity simulations to acquire stronger generalization ability.

  • Latency optimization
    Profiling and kernel fusion to further reduce end-to-end inference time.

  • Color stability enhancement
    Incorporate temporal color correction modules to prevent degradation on straight paths.

  • Global consistency models
    Develop long-range consistency losses and memory mechanisms to maintain scene coherence indefinitely.

Reimplementation contributions

The successful release of The Matrix Project is built upon the collective efforts of our incredibly talented team members. We extend our heartfelt gratitude for their dedication, hard work, and invaluable contributions. Those members are:

Longxiang Tang, Zhicai Wang, Ruili Feng, Ruihang Chu, Han Zhang, and Zhantao Yang

Special thanks to Longxiang and Zhicai for their excellent contributions.

Additional Notes

Because of the re-implementation, some hyperparameter settings and parts of the training strategy differ from those reported in the paper. Please keep this in mind when reviewing the code.

Despite these changes, the overall generation quality has improved over the previous version, thanks to more careful design of methods and parameters.

Citation

If you find our work useful, please consider citing:

@article{feng2024matrix,
  title={The matrix: Infinite-horizon world generation with real-time moving control},
  author={Feng, Ruili and Zhang, Han and Yang, Zhantao and Xiao, Jie and Shu, Zhilei and Liu, Zhiheng and Zheng, Andy and Huang, Yukun and Liu, Yu and Zhang, Hongyang},
  journal={arXiv preprint arXiv:2412.03568},
  year={2024}
}

License

The code in this repository is released under the Apache 2.0 License.

The Matrix model (including its corresponding Transformers module and VAE module) is released under the Apache 2.0 License.
