October 25, 2007
P_HUGG and P_OPT: An Overview of Parallel Hierarchical Cartesian Mesh Generation and Optimization-based Smoothing
Presented at NASA Langley, Hampton, VA, by
Steve L. Karman, Jr.
Vincent C. Betro
Outline
• Parallel Hierarchical Unstructured Cartesian Mesh Generation
- Terminology and Strategy
- Partitioning
- Results
• Parallel Optimization-based Smoothing
- Terminology and Strategy
- Partitioning
- Results
• Conclusions
• Future Work
P_HUGG: Terminology and Strategy
• Develop an algorithm for generating a high-quality mesh
– Create hybrid or general cut polyhedral meshes with body-conforming cut elements using closed loops/shells
– Allow user to define refinement spacing, which may be larger or smaller than the existing geometry spacing
– Modify spacing based on curvature and intersection tests
– Speed process by using MPI and grid partitioning
• Implement various C++ class structures for compact communication during meshing
• Validate mesh quality by testing on several geometries and optimizing the final mesh with parallel optimization-based smoothing
P_HUGG: Terminology and Strategy
P_HUGG uses isotropic refinement to build the Octree structure.
The resulting uniformity makes data structures more consistent and communication more efficient.
P_HUGG: Terminology and Strategy
• The building block of a hierarchical Cartesian mesh is the voxel, which is short for “volumetric pixel”.
• Voxels are indexed using a processor-index pair to aid in parallel communication.
• Each voxel contains information pointing to its relational location in the mesh, but no physical coordinates
– cell-to-node hash table
– parent index
– neighbor indices
– child indices
– boundary facet list/boundary element shell list
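The voxel record described above can be sketched as a small data class. This is only an illustration: the field names, the string keys for neighbor faces, and the use of Python are assumptions, not the C++ classes used in P_HUGG.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# A processor-index pair: (owning processor rank, local index on that rank).
ProcIndex = Tuple[int, int]


@dataclass
class Voxel:
    """Relational voxel record; it stores no physical coordinates."""
    ident: ProcIndex                       # processor-index pair of this voxel
    level: int = 0                         # refinement depth (hypothetical field)
    parent: Optional[ProcIndex] = None     # parent index
    children: List[ProcIndex] = field(default_factory=list)        # child indices
    neighbors: Dict[str, ProcIndex] = field(default_factory=dict)  # e.g. "+x" -> voxel
    nodes: Dict[int, ProcIndex] = field(default_factory=dict)      # cell-to-node table
    boundary_facets: List[int] = field(default_factory=list)       # boundary facet list

    def is_leaf(self) -> bool:
        """A voxel with no children sits at the finest level of its branch."""
        return not self.children
```

Because every reference is a processor-index pair rather than a pointer, a record of this shape can be sent to another processor unchanged.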
P_HUGG: Terminology and Strategy
Physical nodes…
• are also indexed using a processor-index pair
• are assigned ownership by the lowest processor that owns a voxel which contains the node
• contain the physical coordinates of each node created as part of refining a voxel
• can be ignored until the general cutting process when tolerances dictate the snapping of nodes
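The ownership rule for nodes can be illustrated with a short sketch. The function name and the dictionary inputs are hypothetical; only the "lowest owning processor" rule comes from the slide.

```python
def assign_node_owners(voxel_owner, voxel_nodes):
    """Give each node to the lowest-ranked processor owning a voxel that contains it.

    voxel_owner: dict mapping voxel id -> owning processor rank
    voxel_nodes: dict mapping voxel id -> iterable of node ids in that voxel
    """
    owner = {}
    for voxel, nodes in voxel_nodes.items():
        rank = voxel_owner[voxel]
        for node in nodes:
            # Keep the smallest rank seen so far for this node.
            owner[node] = min(rank, owner.get(node, rank))
    return owner
```

A node shared by voxels on ranks 2 and 0 would be owned by rank 0 under this rule.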
P_HUGG: Terminology and Strategy
Super Cell Creation
To begin recursive refinement, a Cartesian super cell is created around the existing geometry. If the outer boundary is initially a cube, the super cell and the outer boundary are coincident, and there will be no “external” voxels to be turned off during cutting.
P_HUGG: Terminology and Strategy
Spawning to Multiple Processors
Once the super cell has been refined into at least as many voxels as there are processors, each processor receives one or more voxels. Ownership of each voxel is reassigned to the processor to which it is spawned, and the nodes’ processor-index pairs are then updated.
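As a rough illustration of the spawning threshold, the sketch below counts how many uniform isotropic refinement passes are needed before there are at least as many voxels as processors. It assumes every voxel is split into eight at each pass, which the slide does not state; the function name is hypothetical.

```python
def levels_before_spawn(num_procs: int) -> int:
    """Number of uniform octree refinements until 8**level >= num_procs."""
    if num_procs < 1:
        raise ValueError("need at least one processor")
    level = 0
    while 8 ** level < num_procs:
        level += 1
    return level
```

Under this assumption, 8 processors need one pass and 64 need two, while any count in between also needs two and leaves some processors with more than one voxel.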
P_HUGG: Terminology and Strategy
Refinement occurs on each processor simultaneously and…
• new nodes are created at the mid-edges, mid-faces, and centroids of existing voxels
• nodes are guaranteed to be unique, since all nodes created on a partition boundary are communicated and a common index is established
• neighbors are re-calculated based on the tree structure
• lineage of voxels is passed along in the processor-index pair
• voxels are tagged for refinement based on spacing parameters and mesh quality constraints (including cell size gradation parameter)
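The node creation step can be sketched on an integer lattice. Keying nodes by integer coordinates is an assumption made here so that a node shared by two voxels has a single identity; P_HUGG itself establishes a common index by communicating nodes on partition boundaries.

```python
from itertools import product


def refine(corner, size):
    """Split one voxel into eight children.

    corner: integer (x, y, z) of the voxel's minimum corner
    size:   integer edge length, assumed even
    Returns (nodes, children). nodes holds the 27 lattice points of the
    refined voxel: 8 corners, 12 mid-edges, 6 mid-faces, and the centroid.
    """
    half = size // 2
    x, y, z = corner
    nodes = {(x + i * half, y + j * half, z + k * half)
             for i, j, k in product(range(3), repeat=3)}
    children = [((x + i * half, y + j * half, z + k * half), half)
                for i, j, k in product(range(2), repeat=3)]
    return nodes, children
```

Refining two face-adjacent voxels and taking the union of their node sets yields 45 nodes rather than 54, since the nine nodes on the shared face are not duplicated.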
P_HUGG: Terminology and Strategy
Mesh quality is enforced by determining unacceptable voxel configurations:
• One face connecting more than three different levels of refinement
• Opposite neighbors both at a higher level of refinement than the current voxel
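The second configuration lends itself to a simple check. The sketch below assumes each voxel knows its own refinement level and the level of its six face neighbors; the face keys and function name are hypothetical, and the first configuration is not covered.

```python
OPPOSITE_PAIRS = (("-x", "+x"), ("-y", "+y"), ("-z", "+z"))


def sandwiched(level, neighbor_levels):
    """True if some pair of opposite neighbors are both finer than this voxel.

    level:           refinement level of the current voxel
    neighbor_levels: dict mapping face key -> neighbor refinement level;
                     a missing face is treated as being at the same level.
    """
    return any(
        neighbor_levels.get(a, level) > level and neighbor_levels.get(b, level) > level
        for a, b in OPPOSITE_PAIRS
    )
```

A voxel for which this returns True would be tagged for refinement so that it no longer sits between two finer neighbors.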
P_HUGG: Terminology and Strategy
Ghost voxels…
• are integral in assuring that refinement is consistent on borders between processors
• are denoted by having a different processor set as owner in the processor-index pair than the processor on which they reside
• allow new nodes created during refinement and cutting to be indexed correctly and not be duplicated
• exist in the normal neighboring positions to a voxel as well as at the corners
• contain no information about non-bordering children or the results of the cutting process
P_HUGG: Terminology and Strategy
• The voxels shaded in orange are in the upper left corner of a given processor.
• The voxels shaded in green are the finest level ghost voxels used in the neighbor tables on that processor.
• The voxels shaded in blue are the ghost parents of ghost voxels, but only show the children directly bordering the processor in question.
P_HUGG: Terminology and Strategy
Once a mesh has been generated around a geometry, the elements (voxels and nodes) that are outside the computational domain must be “turned off”. Then, body conforming shells are generated with the remainders of voxels that have been “cut” by the geometry.
P_HUGG: Terminology and Strategy
Mark in and out status of nodes during shell creation; use flood fill to mark remaining nodes.
Uncut inside voxels are stored as hexahedra or polyhedra.
Within a voxel, polygons with common boundaries are merged.
Create shells using the cut polygons and exposed voxel faces.
Eliminate collinear points to minimize the number of edges on the final polygonal elements.
Triangular facets are passed down the Octree to the finest level, clipped by the bounds of each
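The flood fill used to mark the remaining nodes can be sketched as a breadth-first traversal. The graph representation is an assumption: adjacency is taken to list only those neighbor nodes reachable without crossing the geometry, and the seeds are the nodes already marked during shell creation.

```python
from collections import deque


def flood_fill(adjacency, seeds):
    """Propagate in/out status from seeded nodes to all reachable nodes.

    adjacency: dict mapping node -> iterable of nodes connected by uncut edges
    seeds:     dict mapping node -> status ("in" or "out") set during cutting
    """
    status = dict(seeds)
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        for other in adjacency.get(node, ()):
            if other not in status:
                # An uncut edge cannot cross the surface, so status carries over.
                status[other] = status[node]
                queue.append(other)
    return status
```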
P_HUGG: Terminology and Strategy
Tolerance
The tolerance used in P_HUGG is a user-specified factor multiplied by the length of an edge of a voxel on the finest level. This tolerance is used for snapping cutting intersections to already created points without significant loss of accuracy in reconstructing the original geometry. The proper size for this factor has proven difficult to determine from case to case.
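The snapping rule can be sketched as follows. The linear search over existing points and the function signature are assumptions for illustration; only the definition of the tolerance as factor times finest edge length comes from the slide.

```python
import math


def snap(point, existing, factor, finest_edge):
    """Snap an intersection point to the nearest existing point within tolerance.

    point:       (x, y, z) of a newly computed cutting intersection
    existing:    list of already created points
    factor:      user-specified tolerance factor
    finest_edge: edge length of a voxel on the finest level
    """
    tol = factor * finest_edge
    nearest = min(existing, key=lambda q: math.dist(point, q), default=None)
    if nearest is not None and math.dist(point, nearest) <= tol:
        return nearest
    return point
```

With a factor of 0.05 and a unit finest edge, an intersection 0.01 away from an existing node is merged into it, while one 0.5 away is kept as a new point.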
P_HUGG: Partitioning
• In P_HUGG2D, round robin partitioning was used. This can cause a sizable increase in surface area and load balancing issues.
• In P_HUGG, the partitioning is based on a factor computed by finding the ratio of facet areas to user spacing parameters within each pre-spawn voxel.
• This weight factor rectifies the load-balancing issues and Metis will be implemented to assist in reducing surface area.
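To see why a weight factor helps, the sketch below compares round robin assignment with a weighted assignment. Two things here are assumptions rather than the P_HUGG method: the weight is taken as the sum of facet area divided by spacing squared (a rough count of surface cells), and the weighted assignment uses a simple greedy heuristic that hands the heaviest remaining voxel to the least loaded processor.

```python
def voxel_weight(facets):
    """Assumed weight: sum of area / spacing**2 over the facets in a voxel."""
    return sum(area / (spacing * spacing) for area, spacing in facets)


def round_robin_loads(weights, nprocs):
    """Total weight per processor when voxel i goes to processor i % nprocs."""
    loads = [0.0] * nprocs
    for i, w in enumerate(weights):
        loads[i % nprocs] += w
    return loads


def greedy_partition(weights, nprocs):
    """Assign heaviest voxels first, each to the currently lightest processor."""
    loads = [0.0] * nprocs
    owner = [None] * len(weights)
    for i in sorted(range(len(weights)), key=lambda i: -weights[i]):
        p = loads.index(min(loads))
        loads[p] += weights[i]
        owner[i] = p
    return owner, loads
```

For weights `[8, 1, 8, 1, 1, 1, 1, 1]` on two processors, round robin places both heavy voxels on the same processor, while the greedy assignment splits them.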
8 procs (P_HUGG)
The color coding corresponds to the domain owned by each of the processors. A disjoint domain is a distinct possibility when the number of processors is not a power of eight or the facet area to spacing parameter ratio is applied.
8 procs (P_HUGG2D)
P_HUGG: Partitioning
Shown without the adjacent mesh, the geometry-to-processor distribution on the surface of the cube demonstrates the even distribution obtained by using the ratio of facet area to defined spacing parameters to correct load-balancing issues.
P_HUGG: Partitioning
• The 64-processor distribution of the mesh around the sphere shows that while load balancing has been greatly improved, the surface area issues need to be corrected with Metis.
• The same can be said of the 8-processor distribution of the mesh around the hull.
P_HUGG: Partitioning
Example of a non-trivial partition of the M6 wing, y and z planes
P_HUGG: Partitioning
Example of a non-trivial partition of the M6 wing, with and without mesh
P_OPT: Terminology and Strategy
In order to remove high aspect ratio elements (sliver cells) and get improved results from the flow solver, optimization-based smoothing is performed on the mesh.
• Each node is perturbed based on a cost function calculated using Jacobians and condition numbers of the surrounding elements.
• If the perturbation improves the cost function for the node, the node is moved permanently to the new position.
• Nodes continue to be moved until no perturbation improves the cost function.
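The perturb-and-accept loop can be sketched in two dimensions for a single interior node. Everything specific here is an assumption chosen for illustration: the cost is the Frobenius-norm condition number of each triangle corner Jacobian (scaled so that 1.0 is the best value), averaged over the surrounding triangles, and the perturbations are axis-aligned steps that are halved when none helps. P_OPT's actual cost function and perturbation scheme are not given on the slide.

```python
import math


def corner_cost(a, b, c):
    """Scaled condition number of the Jacobian [b - a, c - a]; inf if inverted."""
    e1 = (b[0] - a[0], b[1] - a[1])
    e2 = (c[0] - a[0], c[1] - a[1])
    det = e1[0] * e2[1] - e1[1] * e2[0]
    if det <= 0.0:
        return math.inf
    return (e1[0] ** 2 + e1[1] ** 2 + e2[0] ** 2 + e2[1] ** 2) / (2.0 * det)


def node_cost(xy, tris):
    """Average corner cost over the counter-clockwise triangles around a node."""
    total = 0.0
    for i, j, k in tris:
        a, b, c = xy[i], xy[j], xy[k]
        total += corner_cost(a, b, c) + corner_cost(b, c, a) + corner_cost(c, a, b)
    return total / (3 * len(tris))


def smooth_node(xy, node, tris, step=0.5, min_step=1e-4):
    """Move one node while axis-aligned perturbations keep lowering the cost."""
    xy = dict(xy)
    best = node_cost(xy, tris)
    while step > min_step:
        improved = False
        for dx, dy in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            old = xy[node]
            xy[node] = (old[0] + dx, old[1] + dy)
            trial = node_cost(xy, tris)
            if trial < best:
                best, improved = trial, True   # keep the move permanently
            else:
                xy[node] = old                 # reject and restore
        if not improved:
            step *= 0.5                        # try smaller perturbations
    return xy[node], best
```

For a square with corners at (0, 0), (2, 0), (2, 2), (0, 2) and an off-center interior node connected to all four corners, this loop moves the interior node toward the center of the square, where the four surrounding triangles are congruent.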
P_OPT: Terminology and Strategy
• Optimization was used on the M6 wing mesh to spread out cells that get bunched about a voxel level change.
• The M6 wing mesh before optimization would not be conducive to running on a flow solver due to sliver cells.
*case run in serial
P_OPT: Partitioning
• Metis is used either to decompose a mesh on one machine and feed it back into the optimizer as parallel mesh files, or to decompose the mesh while the code is running and feed it to other procs through communication.
• Nodes are partitioned with no weighting using compressed row storage and eventually will be weighted by whether or not they are part of the geometry facets.
• A standard CGNS file format is used in the parallel mesh files, with the addition of a partition-to-global node map at the end of the file which includes the owner of each node on the process, the local node number of the node on the owning process, and the global node number.
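The compressed row storage mentioned above can be sketched as follows. The array names `xadj` and `adjncy` follow the naming used in the Metis documentation for its graph input; the helper function itself and its edge-list input are hypothetical, and no Metis call is made here.

```python
def build_csr(num_nodes, edges):
    """Build compressed row storage for an undirected node graph.

    num_nodes: number of mesh nodes, numbered 0 .. num_nodes - 1
    edges:     iterable of (a, b) node pairs; duplicates and self-loops are ignored
    Returns (xadj, adjncy): the neighbors of node i are adjncy[xadj[i]:xadj[i + 1]].
    """
    neighbors = [set() for _ in range(num_nodes)]
    for a, b in edges:
        if a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    xadj, adjncy = [0], []
    for nbrs in neighbors:
        adjncy.extend(sorted(nbrs))
        xadj.append(len(adjncy))
    return xadj, adjncy
```

For a three-node chain 0-1-2 this gives `xadj = [0, 1, 3, 4]` and `adjncy = [1, 0, 2, 1]`. Node weights, once introduced, would be passed to the partitioner as a separate array of length `num_nodes`.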
P_OPT: Results
Two Processor Optimized Cube
P_OPT: Results
Four Processor Optimized Cube
P_OPT: Results
Sixteen Processor Optimized Cube
Conclusions
P_HUGG
• The algorithm now exists to generate large, high-quality meshes on complex three-dimensional geometries in parallel
• The meshes generated can be either hybrid or composed of general polyhedra
• The use of general cutting allows for very precise, body-conforming meshes
• User-defined spacing allows flexibility in mesh generation without loss of mesh quality
• The use of the Cartesian, hierarchical Octree structure allows for ease of initial mesh generation and future adaptation
P_OPT
• The algorithm now exists to optimize large meshes in parallel using node perturbation and cost function analysis
• Users may either supply a serial CGNS file or multiple parallel CGNS files with the addition of a partition-to-global map
• Metis is used to partition the mesh such that both surface area and load balancing are optimal
Future Work
P_HUGG
• Implement Metis to assist in load balancing and decreasing surface area
• Evaluate parallel performance and test robustness
• Tackle multiply connected polyhedra issues
P_OPT
• Implement ParMetis to further load balance previously parallelized mesh files
• If a node is part of a geometry facet, apply an extra weighting to the node when passing it to Metis or ParMetis to improve load-balancing effectiveness and decrease surface area
• Do parallel efficiency testing on a cluster to attempt to better the algorithm itself as well as test its robustness