MASH represents a 3D shape by fitting a set of Masked Anchored SpHerical distance functions as observed from the perspective of a fixed number of anchor points in 3D space. Left: iterative optimization of the MASH parameters, starting from an unoriented point cloud, yields progressively closer approximations of the ground-truth surface. Middle and right: MASH supports a variety of downstream applications, including shape completion, blending, and conditional 3D generation from multi-modal inputs such as text prompts, point clouds, and single-view images.
We introduce Masked Anchored SpHerical Distances (MASH), a novel multi-view and parametrized representation of 3D shapes. Inspired by multi-view geometry and motivated by the importance of perceptual shape understanding for learning 3D shapes, MASH represents a 3D shape as a collection of observable local surface patches, each defined by a spherical distance function emanating from an anchor point. We further leverage the compactness of spherical harmonics to encode the MASH functions, combined with a generalized view cone with a parameterized base that masks the spatial extent of the spherical function to attain locality. We develop a differentiable optimization algorithm capable of converting any point cloud into a MASH representation accurately approximating ground-truth surfaces with arbitrary geometry and topology. Extensive experiments demonstrate that MASH is versatile for multiple applications including surface reconstruction, shape generation, completion, and blending, achieving superior performance thanks to its unique representation encompassing both implicit and explicit features.
MASH is a compact parametric representation consisting of a collection of observations of objects from anchors. Each anchor observes objects along direction v from a spatial point p. It uses spherical harmonic coefficients C to encode distance information along different directions, while utilizing trigonometric basis function coefficients V to generate a view cone that constrains the representation range of the spherical distance functions, thereby accurately representing local surface geometry of the object.
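The anchor parameterization described above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the spherical harmonics are truncated at degree 1, the cone boundary is modeled as a polar-angle limit given by a short trigonometric series in the azimuth, and the names `real_sh_deg1`, `local_frame`, `cone_angle`, and `sample_anchor` are hypothetical.

```python
import numpy as np

def real_sh_deg1(dirs):
    """Real spherical harmonics up to degree 1 for unit vectors of shape (N, 3)."""
    x, y, z = dirs[:, 0], dirs[:, 1], dirs[:, 2]
    c0 = 0.5 * np.sqrt(1.0 / np.pi)
    c1 = np.sqrt(3.0 / (4.0 * np.pi))
    return np.stack([np.full_like(x, c0), c1 * y, c1 * z, c1 * x], axis=1)

def local_frame(v):
    """Orthonormal frame whose third column is the normalized view direction v."""
    v = v / np.linalg.norm(v)
    helper = np.array([1.0, 0.0, 0.0]) if abs(v[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    e1 = np.cross(v, helper)
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(v, e1)
    return np.stack([e1, e2, v], axis=1)

def cone_angle(phi, V):
    """Assumed mask boundary: V[0] + sum_k V[2k-1] cos(k phi) + V[2k] sin(k phi)."""
    theta = np.full_like(phi, V[0])
    for k in range(1, (len(V) - 1) // 2 + 1):
        theta += V[2 * k - 1] * np.cos(k * phi) + V[2 * k] * np.sin(k * phi)
    return np.clip(theta, 0.0, np.pi)

def sample_anchor(p, v, C, V, n_phi=16, n_theta=8):
    """Sample surface points seen from anchor p inside its view cone around v."""
    phi = np.repeat(np.linspace(0.0, 2.0 * np.pi, n_phi, endpoint=False), n_theta)
    t = np.tile(np.linspace(0.0, 1.0, n_theta), n_phi)
    theta = t * cone_angle(phi, V)  # polar angle stays inside the mask
    local = np.stack([np.sin(theta) * np.cos(phi),
                      np.sin(theta) * np.sin(phi),
                      np.cos(theta)], axis=1)
    dist = real_sh_deg1(local) @ C  # distance along each direction (local frame)
    world_dirs = local @ local_frame(v).T
    return p + dist[:, None] * world_dirs

# Usage: one anchor with a mildly non-circular cone and a tilted distance field.
points = sample_anchor(p=np.array([0.0, 0.0, 0.0]),
                       v=np.array([0.0, 0.0, 1.0]),
                       C=np.array([3.0, 0.0, 0.5, 0.0]),
                       V=np.array([0.6, 0.1, 0.0]))
```

Here the coefficients C are expressed in the anchor's local frame, so rotating v rotates the whole patch. Whether this matches the exact convention of MASH is an assumption of the sketch.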
We compute the sampled points on the surface patches from the MASH parameters in a differentiable manner, which enables us to optimize the MASH parameters by minimizing the Fitting Error, Coverage Error, and Boundary-continuity Error.
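As a rough illustration of the three error terms named above, the sketch below evaluates them with nearest-neighbour distances in NumPy. The definitions are assumptions made for illustration: the fitting and coverage errors are treated as the two directions of a Chamfer-style distance, and the boundary-continuity error as the distance from each patch's boundary samples to the nearest sample of any other patch. The function names and the weights in `total_loss` are hypothetical. An actual optimization would compute the same quantities in an automatic-differentiation framework so that gradients reach the MASH parameters.

```python
import numpy as np

def nearest_sq_dist(a, b):
    """For each point in a (N, 3), squared distance to its nearest neighbour in b (M, 3)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.min(np.sum(diff ** 2, axis=-1), axis=1)

def fitting_error(patch_pts, target_pts):
    """Assumed form: sampled patch points should lie on the target point cloud."""
    return nearest_sq_dist(patch_pts, target_pts).mean()

def coverage_error(patch_pts, target_pts):
    """Assumed form: every target point should be close to some sampled patch point."""
    return nearest_sq_dist(target_pts, patch_pts).mean()

def boundary_continuity_error(boundary_pts, boundary_ids, patch_pts, patch_ids):
    """Assumed form: boundary samples of each anchor should touch another anchor's patch.

    Requires at least two anchors so that "other patches" is never empty.
    """
    total = 0.0
    for i in np.unique(boundary_ids):
        own = boundary_pts[boundary_ids == i]
        others = patch_pts[patch_ids != i]
        total += nearest_sq_dist(own, others).sum()
    return total / len(boundary_pts)

def total_loss(patch_pts, patch_ids, boundary_pts, boundary_ids, target_pts,
               w_fit=1.0, w_cov=1.0, w_bnd=1.0):
    """Weighted sum of the three terms; the weights are illustrative placeholders."""
    return (w_fit * fitting_error(patch_pts, target_pts)
            + w_cov * coverage_error(patch_pts, target_pts)
            + w_bnd * boundary_continuity_error(boundary_pts, boundary_ids,
                                                patch_pts, patch_ids))
```

The brute-force pairwise distance is quadratic in the number of points, which is acceptable for a small example. A spatial index or a GPU nearest-neighbour routine would be the natural replacement at scale.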
Visualization of the MASH optimization process and of how representational capacity varies with the number of anchors.
MASH has no topological constraints and can effectively represent both the internal structures and thin structures of objects.
Qualitative results on surface reconstruction with different methods. Two close-up details are shown for each result to better illustrate the performance of each method.
Reconstructions on the chair category with two different noise levels.
Qualitative results on category-conditioned generation compared with different methods.
Text-conditioned shape generation results.
Point-cloud-conditioned shape generation results.
Image-conditioned shape generation results.