Tencent Releases PrimitiveAnything: A New AI Framework That Reconstructs 3D Shapes Using Auto-Regressive Primitive Generation

by Brenden Burgess


Primitive abstraction, which breaks complex 3D shapes down into simple, interpretable geometric units, is fundamental to human visual perception and has important implications for computer vision and graphics. Although recent 3D generation methods, which rely on representations such as meshes, point clouds, and neural fields, have enabled high-fidelity content creation, they often lack the semantic depth and interpretability needed for tasks such as robotic manipulation or scene understanding. Traditionally, shape abstraction has been tackled either with optimization-based methods, which fit geometric primitives to shapes but tend to over-segment them semantically, or with learning-based methods, which are trained on small, category-specific datasets and therefore generalize poorly. Early approaches used basic primitives such as cuboids and cylinders, later evolving toward more expressive forms such as superquadrics. A major challenge remains, however: designing methods that abstract shapes in a way that aligns with human cognition while generalizing across diverse object categories.

Inspired by recent breakthroughs in 3D content generation driven by large-scale datasets and auto-regressive transformers, the authors propose reframing shape abstraction as a generative task. Rather than relying on geometric fitting or direct parameter regression, their approach builds primitive assemblies sequentially, mirroring human reasoning. This design captures both semantic structure and geometric accuracy more effectively. Prior work in auto-regressive modeling, such as MeshGPT and MeshAnything, has shown strong results in mesh generation by treating 3D shapes as sequences, incorporating innovations such as compact tokenization and shape conditioning.
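To make the sequence view concrete, here is a minimal sketch of how a primitive assembly could be flattened into a discrete token sequence, in the spirit of MeshGPT-style tokenization. The class, bin counts, and value ranges are illustrative assumptions, not the paper's actual scheme.

```python
from dataclasses import dataclass
from math import pi
from typing import List, Tuple

@dataclass
class Primitive:
    kind: str                              # e.g. "cuboid", "cylinder", "sphere"
    translation: Tuple[float, float, float]
    rotation: Tuple[float, float, float]   # Euler angles in radians (assumed)
    scale: Tuple[float, float, float]

def quantize(value: float, low: float, high: float, bins: int = 256) -> int:
    """Map a continuous attribute onto a discrete bin index."""
    value = min(max(value, low), high)
    return int((value - low) / (high - low) * (bins - 1))

def to_token_sequence(assembly: List[Primitive]) -> List[int]:
    """Flatten a primitive assembly into one discrete token sequence."""
    kinds = {"cuboid": 0, "cylinder": 1, "sphere": 2}  # hypothetical type ids
    tokens: List[int] = []
    for p in assembly:
        tokens.append(kinds[p.kind])
        tokens.extend(quantize(v, -1.0, 1.0) for v in p.translation)
        tokens.extend(quantize(v, -pi, pi) for v in p.rotation)
        tokens.extend(quantize(v, 0.0, 1.0) for v in p.scale)
    tokens.append(-1)  # stand-in for an end-of-sequence marker
    return tokens
```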

PrimitiveAnything, developed by researchers from Tencent AIPD and Tsinghua University, recasts shape abstraction as primitive assembly generation. It introduces a decoder-only transformer, conditioned on shape features, that generates variable-length sequences of primitives. The framework uses a unified, ambiguity-free parameterization scheme that supports multiple primitive types while maintaining high geometric precision and learning efficiency. By learning directly from human-crafted shape abstractions, PrimitiveAnything effectively captures how complex shapes are broken down into simpler components. Its modular design supports easy integration of new primitive types, and experiments show that it produces high-quality, perceptually aligned abstractions across diverse 3D shapes.
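The description above suggests a fairly standard conditional decoder. Below is a hedged PyTorch sketch of that setup; the module sizes, the point-cloud encoder that would produce shape_feats, and the exact conditioning mechanism are assumptions made for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class PrimitiveDecoder(nn.Module):
    """Sketch of a decoder-only transformer that attends to shape features."""

    def __init__(self, vocab_size=512, d_model=256, n_layers=6, n_heads=8, max_len=1024):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens, shape_feats):
        # tokens: (B, T) previously generated primitive tokens
        # shape_feats: (B, S, d_model) features from a point-cloud encoder (assumed)
        pos = torch.arange(tokens.size(1), device=tokens.device)
        x = self.token_emb(tokens) + self.pos_emb(pos)
        causal = nn.Transformer.generate_square_subsequent_mask(
            tokens.size(1)).to(tokens.device)
        h = self.decoder(x, shape_feats, tgt_mask=causal)
        return self.head(h)  # (B, T, vocab_size): logits for the next token
```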

PrimitiveAnything models 3D shape abstraction as a sequential generation task. It uses a discrete, unambiguous parameterization to represent the type, translation, rotation, and scale of each primitive. These attributes are encoded and fed into a transformer, which predicts the next primitive conditioned on the previous ones and on shape features extracted from point clouds. A cascaded decoder models the dependencies between attributes, ensuring coherent generation. Training combines cross-entropy losses, a Chamfer distance term for reconstruction accuracy, and Gumbel-Softmax for differentiable sampling. The process continues auto-regressively until an end-of-sequence token is emitted, enabling flexible, human-like decomposition of complex 3D shapes.
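To illustrate the "generate until end-of-sequence" behavior, here is a minimal greedy decoding loop under the same assumptions as the sketch above. The special token ids and the model interface are hypothetical, and the training-time losses mentioned in the paragraph (cross-entropy, Chamfer distance, Gumbel-Softmax) are not shown here.

```python
import torch

BOS, EOS = 0, 1  # assumed special token ids

@torch.no_grad()
def generate_primitives(model, shape_feats, max_tokens=512):
    """Greedy autoregressive decoding of a primitive token sequence."""
    tokens = torch.tensor([[BOS]], dtype=torch.long)
    for _ in range(max_tokens):
        logits = model(tokens, shape_feats)        # (1, T, vocab_size)
        next_tok = logits[:, -1].argmax(dim=-1)    # most likely next token
        tokens = torch.cat([tokens, next_tok[:, None]], dim=1)
        if next_tok.item() == EOS:                 # stop at end-of-sequence
            break
    return tokens[0, 1:]  # drop BOS; a separate step decodes tokens back to primitives
```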

The researchers introduce HumanPrim, a large-scale dataset of 120,000 3D samples with manually annotated primitive assemblies. Their method is evaluated using metrics such as Chamfer distance, Earth Mover's Distance, Hausdorff distance, voxel IoU, and segmentation scores (RI, VOI, SC). Compared with existing optimization-based and learning-based methods, it shows superior performance and better alignment with human abstraction patterns. Ablation studies confirm the importance of each design component. In addition, the framework supports 3D content generation from text or image inputs. It offers user-friendly editing, high modeling quality, and more than 95% storage savings, making it well suited to efficient and interactive 3D applications.
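For reference, one of the reconstruction metrics mentioned above, the symmetric Chamfer distance between two point clouds, can be written in a few lines of plain PyTorch. This is a textbook definition for illustration, not the paper's evaluation code, which may use squared distances or a different normalization.

```python
import torch

def chamfer_distance(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Symmetric Chamfer distance between point clouds a: (N, 3) and b: (M, 3)."""
    d = torch.cdist(a, b)  # (N, M) pairwise Euclidean distances
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

# Example: compare points sampled from the primitive assembly and the input shape.
pred = torch.rand(2048, 3)
gt = torch.rand(2048, 3)
print(chamfer_distance(pred, gt).item())
```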

In conclusion, PrimitiveAnything is a new framework that treats 3D shape abstraction as a sequence generation task. By learning from human-crafted primitive assemblies, the model effectively captures intuitive decomposition patterns. It achieves high-quality results across diverse object categories, highlighting its strong generalization ability. The method also supports flexible 3D content creation using primitive representations. Thanks to its efficiency and lightweight structure, PrimitiveAnything is well suited to user-generated content in applications such as games, where performance and ease of manipulation are essential.


Check out the Paper, Demo, and GitHub Page. All credit for this research goes to the researchers of this project. Also, feel free to follow us on Twitter and don't forget to join our 90K+ ML SubReddit.

Sana Hassan, a consulting intern at Marktechpost and dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, Sana brings a fresh perspective to the intersection of AI and real-life solutions.
