SieveNet: Selecting Point-Based Features for Mesh Networks

Abstract

Meshes are widely used in 3D computer vision and graphics, but their irregular topology makes it challenging to apply existing neural network architectures to them. Recent advances in mesh neural networks turn to remeshing and push beyond pioneering methods that take only raw meshes as input. Although remeshing offers a regular topology that significantly facilitates the design of mesh network architectures, features extracted from such remeshed proxies may struggle to retain the underlying geometry faithfully, limiting the subsequent neural network's capacity.

To address this issue, we propose SieveNet, a novel paradigm that takes into account both the regular topology and the exact geometry. Specifically, this method utilizes structured mesh topology from remeshing and accurate geometric information from distortion-aware point sampling on the surface of the original mesh. Furthermore, our method eliminates the need for hand-crafted feature engineering and can leverage off-the-shelf network architectures such as the vision transformer.

Comprehensive experimental results on classification and segmentation tasks demonstrate the effectiveness and superiority of our method.

Benefits of SieveNet

Geometry preservation. SieveNet preserves intricate geometric details by capturing precise point-based features directly from the original mesh. This stands in contrast to approaches that employ remeshing (mentioned in row 4), which rely heavily on proxies that can introduce geometric inaccuracies.

Topology regularity. Like remeshing strategies, our approach maintains a regular mesh topology. In contrast, methods that directly take raw meshes as input (described in rows 1-3) must cope with the inherent irregularity of such meshes, which impedes their ability to capture consistent topology information.

Elimination of hand-crafted features. SieveNet uses point-based features characterized solely by positions and normals. This eliminates the need for hand-crafted features from faces, half-edges, etc.

In conclusion, SieveNet contributes to the advancement of 3D mesh networks by preserving geometry, maintaining topology regularity, and eliminating the need for hand-crafted features.

Pipeline

Illustration of our pipeline. Our method takes into account both the regular topology and precise raw geometry.

Stage 1: Topology extraction. Given a triangle mesh, simplification and subdivision are conducted to extract a regular and fine-grained topology. In the meantime, a point-level bijection is established, as shown in the following figure.
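To make the subdivision half of Stage 1 concrete, the sketch below applies a 1-to-4 midpoint subdivision to a triangle mesh. The choice of midpoint subdivision, the function name `midpoint_subdivide`, and the tetrahedron used as input are all assumptions made for illustration; the text above does not specify the subdivision scheme, and the simplification step and the point-level bijection are omitted here.

```python
import numpy as np

def midpoint_subdivide(vertices, faces):
    """Split every triangle into four by inserting one vertex per edge midpoint.

    Hypothetical helper: the source does not state which subdivision it uses.
    """
    verts = [tuple(float(x) for x in v) for v in np.asarray(vertices)]
    midpoint_index = {}  # undirected edge (i, j) -> index of its midpoint vertex

    def midpoint(a, b):
        key = (min(a, b), max(a, b))
        if key not in midpoint_index:
            va, vb = verts[a], verts[b]
            verts.append(tuple((x + y) / 2.0 for x, y in zip(va, vb)))
            midpoint_index[key] = len(verts) - 1
        return midpoint_index[key]

    new_faces = []
    for a, b, c in np.asarray(faces).tolist():
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        # Three corner triangles plus the central one, same winding as the parent.
        new_faces += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return np.array(verts, dtype=float), np.array(new_faces, dtype=int)

# A tetrahedron stands in for the simplified base mesh (illustrative input only).
base_v = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
base_f = np.array([[0, 2, 1], [0, 1, 3], [1, 2, 3], [0, 3, 2]])

v1, f1 = midpoint_subdivide(base_v, base_f)  # level 1
v2, f2 = midpoint_subdivide(v1, f1)          # level 2
```

Because each level multiplies the face count by four, every base face owns a fixed-size, regularly arranged group of fine faces, which is the kind of regular and fine-grained topology the stage description refers to.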

Stage 2: Point-based feature construction. We start by stratified sampling of candidate points on the subdivided mesh \(\mathcal{M}\), followed by a distortion-aware selection so that the selected points appear uniformly sampled from the original mesh \(\mathcal{M}^L\) within each topology unit. The positions and normals of the selected points are packed together to represent a topology unit; these unit representations are then further packed and flattened to form a patch representation. With an off-the-shelf vision transformer, our method can achieve state-of-the-art results.
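The Stage 2 steps can be sketched for a single topology unit as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names, the jittered-grid stratification, the selection rule (drawing candidates with probability proportional to a per-candidate area ratio between the original and the subdivided surface), and the placeholder positions, normals, and distortion values are all hypothetical.

```python
import numpy as np

def stratified_barycentric(n, rng):
    """Return n*n stratified barycentric samples of a triangle."""
    i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    # One jittered sample per cell of an n x n grid on the unit square.
    r1 = (i.ravel() + rng.random(n * n)) / n
    r2 = (j.ravel() + rng.random(n * n)) / n
    # Area-preserving square-to-triangle map, so the strata stay equal-area.
    s = np.sqrt(r1)
    return np.stack([1.0 - s, s * (1.0 - r2), s * r2], axis=1)

def select_by_distortion(area_ratio, k, rng):
    """Pick k distinct candidates, favouring those with a larger area ratio.

    Assumed rule: regions that are stretched on the original surface should
    receive more points so the result looks roughly uniform there.
    """
    p = area_ratio / area_ratio.sum()
    return rng.choice(len(area_ratio), size=k, replace=False, p=p)

def unit_feature(positions, normals, idx):
    """Pack positions and normals of the selected points into one vector."""
    return np.concatenate([positions[idx], normals[idx]], axis=1).ravel()

rng = np.random.default_rng(0)
tri = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

bary = stratified_barycentric(4, rng)               # 16 candidates
positions = bary @ tri                              # placeholder for points mapped to the original mesh
normals = np.tile([0.0, 0.0, 1.0], (len(bary), 1))  # placeholder normals
area_ratio = 1.0 + positions[:, 0]                  # made-up distortion values

idx = select_by_distortion(area_ratio, 8, rng)
feat = unit_feature(positions, normals, idx)        # 8 points x 6 values
patch = np.stack([feat, feat, feat, feat]).ravel()  # four units flattened into one patch vector
```

In a full pipeline, the positions and normals would come from the original mesh through the Stage 1 bijection rather than from the flat placeholder triangle, and each flattened patch vector would be passed to the transformer as one token.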

Results

Table 1: Segmentation results on the HumanBody dataset (Maron et al. 2017). The \(\dagger\) rows are evaluations on the original meshes, and the \(\ddagger\) rows are evaluations on the processed inputs. SieveNet surpasses the previous methods on both evaluations.

Table 2: Classification results on Manifold40 (Hu et al. 2022a). The first two methods take point clouds as input; the other methods are mesh-based. SieveNet outperforms these point cloud-based and remesh-based methods.