
ULIBTK’15, 20. Ulusal Isı Bilimi ve Tekniği Kongresi (20th National Congress on Thermal Science and Technology)

2-5 September 2015, BALIKESİR

GPU COMPUTING OF 2-D LAPLACE EQUATION USING BOUNDARY ELEMENT METHOD

İbrahim N. YILDIRAN*, S. Doğan ÖNER*, Barbaros CETIN*, Besim BARANOĞLU**

* Microfluidics & Lab-on-a-chip Research Group, Mechanical Engineering Department, I.D. Bilkent University, Ankara 06800 TURKEY

** Department of Manufacturing Engineering, Atılım University, Ankara 06836 TURKEY

Abstract: The Laplace equation is the governing equation in a wide variety of engineering applications, for example steady-state heat transfer and static electric field problems. When the material properties are constant throughout the solution domain, the Laplace equation is a linear partial differential equation for which analytical solutions are possible. However, these analytical solutions are difficult to apply to every geometry, and numerical methods are a valuable option for irregular geometries. The Boundary Element Method (BEM) is a numerical method that discretizes only the boundary of the solution domain and offers highly accurate solutions. In this study, a boundary element formulation for the 2-D Laplace equation that can be computed on both GPU and CPU is developed using MATLAB, and the computational performances are compared. Additionally, a graphical user interface (GUI) in which users can form and discretize their geometries and visualize the results will be made available on the web to anyone interested.

Keywords: Heat Transfer, Boundary Element Method, GPU Computing

GPU PARALLEL COMPUTATION OF THE 2-D LAPLACE EQUATION WITH THE BOUNDARY ELEMENT METHOD

Summary: The Laplace equation appears as the governing equation in problems from many areas of engineering. Steady-state heat transfer and static electric field problems are examples of the areas in which the Laplace equation is used. When the material properties are constant over the solution domain, the equation is a linear partial differential equation, and many different analytical methods for its solution are available in the literature. However, applying these methods to every geometry is difficult, and at this point numerical methods are a good alternative. The Boundary Element Method (BEM) is a numerical method that solves linear differential equations with high accuracy by discretizing only the boundaries of the solution domain. In this study, a 2-D Laplace solver that can run on the GPU was developed using BEM with the help of MATLAB. The solver was run on both the CPU and the GPU, and the computational performances were compared. In addition, an interface (GUI) in which users can create the problem geometry, discretize it and visualize the results was prepared with MATLAB, to be shared with interested users over the web.

Keywords: Heat transfer, Boundary Element Method, GPU computing

INTRODUCTION

The Laplace equation is a second-order, linear partial differential equation that appears in many engineering fields such as heat transfer, electromagnetism and fluid dynamics. Its mathematical representation is:

$$\nabla^2 u = 0 \tag{1}$$

where $\nabla^2$ ($= \nabla \cdot \nabla$) is the Laplace operator and $u$ is a scalar function. In this form, the Laplace equation has no transient term, so it applies only to steady-state problems. Several analytical techniques are available for the solution of linear differential equations; however, they are applicable only to regular geometries. For problems with irregular geometries, numerical techniques such as the finite difference, finite element, finite volume and boundary element methods are available. Among these, the finite element method (FEM) and the finite volume method are the most popular, and they have also proven themselves for highly non-linear problems of solid and fluid mechanics. An alternative is the Boundary Element Method (BEM), which requires meshing only the surface of the solution domain rather than its interior. This is very advantageous for problems with infinite and/or semi-infinite domains, and for problems with re-meshing, in which meshing the entire solution domain is a large computational burden. With BEM it is possible to obtain better accuracy with fewer elements than with FEM (LaForce T., 2006). On the other hand, BEM requires a fundamental solution of the governing equation; it is therefore mainly applicable to linear differential equations (Antes H., 2010).

One of the state-of-the-art technologies for parallel computing is the Graphics Processing Unit (GPU). GPUs can offer tens of times more FLOP performance and over ten times more memory bandwidth than CPUs (Tubbs K. R. and Tsai F. T.-C., 2011). MATLAB has a Parallel Computing Toolbox in which several functions are available for GPU computing. In this study, a boundary element formulation for the 2-D steady-state heat conduction problem is developed, and computations are performed on both CPU and GPU. The formulation is validated against a benchmark problem.

BOUNDARY ELEMENT FORMULATION

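In the absence of internal heat generation and with constant thermophysical properties, the steady-state heat conduction equation for the temperature $T$ in a material of thermal conductivity $k$ can be written as:

$$k\left(\frac{\partial^2 T}{\partial x_1^2} + \frac{\partial^2 T}{\partial x_2^2} + \frac{\partial^2 T}{\partial x_3^2}\right) = 0 \tag{2}$$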

Equation (2) holds both in the solution domain and in an auxiliary domain that has the same material properties, in which the temperature is denoted by $T^*$ and the heat flux by $q_i^*$. For the auxiliary domain, Fourier's law and Equation (2) become, respectively:

$$q_i^* = -k\,\frac{\partial T^*}{\partial x_i} \tag{3}$$

$$k\,\nabla^2 T^* + \Delta(A,P) = 0 \tag{4}$$

In Equation (4), $\Delta(A,P)$ is the Dirac delta function, where A is the fixed point in the solution domain and P is the varying point, either in the domain or on the boundary. The BE formulation for Equations (3) and (4) starts from Betti's reciprocal theorem, which states:

$$\frac{\partial T}{\partial x_i}\,q_i^* = q_i\,\frac{\partial T^*}{\partial x_i} \tag{5}$$

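Integrating Equation (5) over the volume $V$ and applying the product rule to both sides gives:

$$\int_V \frac{\partial}{\partial x_i}\left(T\,q_i^*\right)dV - \int_V T\,\frac{\partial q_i^*}{\partial x_i}\,dV = \int_V \frac{\partial}{\partial x_i}\left(T^*\,q_i\right)dV - \int_V T^*\,\frac{\partial q_i}{\partial x_i}\,dV \tag{6}$$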

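From Equation (2) and Equation (4), together with Fourier's law $q_i = -k\,\partial T/\partial x_i$ and Equation (3), the following relations are inferred, respectively:

$$\frac{\partial q_i}{\partial x_i} = 0 \quad \text{and} \quad \frac{\partial q_i^*}{\partial x_i} - \Delta(A,P) = 0 \tag{7}$$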

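Substituting the relations in Equation (7) into Equation (6), applying the divergence theorem, and writing $u = T$ and $u^* = T^*$ with the boundary fluxes taken as $q = k\,\partial u/\partial n$ and $q^* = k\,\partial u^*/\partial n$ (that is, $q = -q_i n_i$ and $q^* = -q_i^* n_i$, with $n_i$ the outward unit normal), the general boundary element equation for 2-D problems is obtained (Cetin B. and Baranoglu B., 2014):

$$c(A)\,u(A) + \int_S q^*(A,P)\,u(P)\,dS = \int_S u^*(A,P)\,q(P)\,dS \tag{8}$$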

In Equation (8), $u^*(A,P)$ and $q^*(A,P)$ are the fundamental solutions and $c(A)$ is the boundary coefficient. This term stems from the Dirac delta function: if the point A is inside the domain, the integral of the function over the domain is unity; if A is outside the domain, the integral is zero; and if A is on the boundary, it is 1/2. The following expression summarizes the variation of $c(A)$ with the location of point A (Salgado-Ibarra Ermes A., 2011):

$$c(A) = \begin{cases} 1 & \text{if } A \text{ is inside the domain} \\ 0 & \text{if } A \text{ is outside the domain} \\ \tfrac{1}{2} & \text{if } A \text{ is on the boundary} \end{cases}$$
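The three values of c(A) can be checked numerically. Setting u = 1 everywhere (so that q = 0) in Equation (8) gives c(A) = -∮ q*(A,P) dS. The sketch below evaluates this integral with a midpoint rule, using the 2-D expression for q* from Equation (10). The unit-square domain, the function name `c_coefficient` and the quadrature resolution are assumptions made for illustration only; this is not the authors' MATLAB code.

```python
import math

def c_coefficient(ax, ay, n_per_side=2000):
    """Approximate c(A) = -(closed boundary integral of q*) on the unit square."""
    # Corners in counter-clockwise order, so the outward normal of a side
    # with direction (dx, dy) is (dy, -dx) / length.
    corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    total = 0.0
    for c in range(4):
        (x0, y0), (x1, y1) = corners[c], corners[(c + 1) % 4]
        length = math.hypot(x1 - x0, y1 - y0)
        nx, ny = (y1 - y0) / length, -(x1 - x0) / length
        ds = length / n_per_side
        for m in range(n_per_side):
            t = (m + 0.5) / n_per_side          # midpoint of sub-interval m
            rx = x0 + t * (x1 - x0) - ax        # vector from A to P
            ry = y0 + t * (y1 - y0) - ay
            r2 = rx * rx + ry * ry
            # q* = -(1 / (2 pi r)) dr/dn with dr/dn = (r . n) / r
            total += -(rx * nx + ry * ny) / (2.0 * math.pi * r2) * ds
    return -total

inside = c_coefficient(0.3, 0.4)     # A inside the square
outside = c_coefficient(1.5, 0.5)    # A outside the square
on_edge = c_coefficient(0.5, 0.0)    # A on a smooth part of the boundary
```

Under these assumptions the three results should come out close to 1, 0 and 1/2 respectively, matching the cases listed above.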

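The analytical fundamental solutions are used to obtain the solution of the actual system. For 2-D problems they are (Fenner R., 2014):

$$u^*(A,P) = -\frac{1}{2\pi k}\,\ln(r) \tag{9}$$

and, since $q^* = k\,\partial u^*/\partial n$,

$$q^*(A,P) = -\frac{1}{2\pi r}\,\frac{\partial r}{\partial n} \tag{10}$$

where $r$ is the distance between A and P and $n$ is the outward normal of the boundary at P.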
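As a small illustration, Equations (9) and (10) can be written as two functions and the relation q* = k ∂u*/∂n checked with a finite difference. The function names, the argument order and the sample point are hypothetical choices for this sketch; ∂r/∂n is evaluated as (r · n)/r, where r is the vector from A to P.

```python
import math

def u_star(ax, ay, px, py, k=1.0):
    """Equation (9): u*(A,P) = -ln(r) / (2 pi k)."""
    r = math.hypot(px - ax, py - ay)
    return -math.log(r) / (2.0 * math.pi * k)

def q_star(ax, ay, px, py, nx, ny):
    """Equation (10): q*(A,P) = -(1 / (2 pi r)) dr/dn, with unit normal (nx, ny) at P."""
    rx, ry = px - ax, py - ay
    r = math.hypot(rx, ry)
    dr_dn = (rx * nx + ry * ny) / r
    return -dr_dn / (2.0 * math.pi * r)

# Finite-difference check of q* = k du*/dn at one sample point (assumed values).
k, h = 2.5, 1e-6
a, p, n = (0.0, 0.0), (0.6, 0.8), (0.0, 1.0)
du_dn = (u_star(a[0], a[1], p[0], p[1] + h, k)
         - u_star(a[0], a[1], p[0], p[1] - h, k)) / (2.0 * h)
difference = abs(k * du_dn - q_star(a[0], a[1], p[0], p[1], n[0], n[1]))
```

Note that the conductivity k cancels in q*, which is why `q_star` does not take it as an argument.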

In this study, constant elements are used, so u(P) and q(P) are constant over each element. When these two terms are taken out of the integrals in Equation (8), the remaining integrals over each element are denoted in the rest of the computation as:

$$\int_{S_j} q^*(A,P)\,dS = H_{ij} \tag{11}$$

$$\int_{S_j} u^*(A,P)\,dS = G_{ij} \tag{12}$$
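A minimal sketch of how these two element integrals might be evaluated for a straight constant element is given below, taking G as the integral of u* and H as the integral of q*. It assumes 4-point Gauss-Legendre quadrature for the regular case and the closed-form values for the case where the node is the midpoint of the element itself (the integral of u* is then (L/2πk)(1 - ln(L/2)) for an element of length L, and the integral of q* vanishes because r is perpendicular to n). The function name, the quadrature order and the counter-clockwise orientation are assumptions, not details taken from the paper.

```python
import math

# 4-point Gauss-Legendre nodes and weights on [-1, 1]
GAUSS = [(-0.8611363115940526, 0.3478548451374538),
         (-0.3399810435848563, 0.6521451548625461),
         (0.3399810435848563, 0.6521451548625461),
         (0.8611363115940526, 0.3478548451374538)]

def element_integrals(a, p0, p1, k=1.0, same_element=False):
    """Return (G_ij, H_ij) for node a and the straight element from p0 to p1."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    length = math.hypot(dx, dy)
    if same_element:
        # Node at the element midpoint: weakly singular integral done analytically.
        g = length / (2.0 * math.pi * k) * (1.0 - math.log(length / 2.0))
        return g, 0.0
    nx, ny = dy / length, -dx / length   # outward normal for a CCW boundary
    g = h = 0.0
    for xi, w in GAUSS:
        t = 0.5 * (xi + 1.0)             # map [-1, 1] onto the element
        rx, ry = p0[0] + t * dx - a[0], p0[1] + t * dy - a[1]
        r2 = rx * rx + ry * ry
        g += w * (-0.5 * math.log(r2) / (2.0 * math.pi * k))     # u*, Eq. (9)
        h += w * (-(rx * nx + ry * ny) / (2.0 * math.pi * r2))   # q*, Eq. (10)
    return 0.5 * length * g, 0.5 * length * h

g_self, h_self = element_integrals((0.0, 0.0), (-1.0, 0.0), (1.0, 0.0), same_element=True)
g_far, h_far = element_integrals((0.0, 1.0), (-0.5, 0.0), (0.5, 0.0))
```

For the second call the integral of q* has the closed form -atan(0.5)/π, which gives a quick check on the quadrature.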

In Equations (11) and (12), the subscripts i and j denote the index of the fixed point A and of the varying point P, respectively. Once the boundary element equation and the fundamental solutions are established, the solution of the actual system is found through the following steps:

1. Discretization of the boundary into a finite number of elements.

2. Choice of the order of interpolation over each element. It can be constant, linear or of higher order, meaning that the variation of u(P) or q(P) over the element is assumed to be constant, linear, quadratic or of higher order.

3. Evaluation of the resulting integrals for each element and formation of the system of equations.

4. Solution of the system after the boundary conditions are imposed.
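The four steps can be sketched end to end for a small case. The example below is written in plain Python rather than MATLAB and is only an illustration under stated assumptions: a unit-square plate with T = 1 on the top side and T = 0 on the other three sides, k = 1, constant elements, 4-point Gauss quadrature, and a hand-written Gaussian elimination. All function names are hypothetical. Because the temperature is prescribed on every side, the only unknowns are the fluxes and no column rearrangement is needed.

```python
import math

GAUSS = [(-0.8611363115940526, 0.3478548451374538),
         (-0.3399810435848563, 0.6521451548625461),
         (0.3399810435848563, 0.6521451548625461),
         (0.8611363115940526, 0.3478548451374538)]

def build_mesh(n_side):
    """Step 1: split the unit-square boundary (counter-clockwise) into elements."""
    corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    elems = []
    for c in range(4):
        (x0, y0), (x1, y1) = corners[c], corners[(c + 1) % 4]
        for m in range(n_side):
            a, b = m / n_side, (m + 1) / n_side
            elems.append(((x0 + a * (x1 - x0), y0 + a * (y1 - y0)),
                          (x0 + b * (x1 - x0), y0 + b * (y1 - y0))))
    return elems

def integrals(a, p0, p1, singular=False):
    """Step 3: integrals of u* (G) and q* (H) over one element seen from point a."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    length = math.hypot(dx, dy)
    if singular:  # a is the midpoint of this element: closed-form values
        return length / (2 * math.pi) * (1 - math.log(length / 2)), 0.0
    nx, ny = dy / length, -dx / length  # outward normal of a CCW boundary
    g = h = 0.0
    for xi, w in GAUSS:
        t = 0.5 * (xi + 1.0)
        rx, ry = p0[0] + t * dx - a[0], p0[1] + t * dy - a[1]
        r2 = rx * rx + ry * ry
        g += w * (-0.5 * math.log(r2) / (2 * math.pi))
        h += w * (-(rx * nx + ry * ny) / (2 * math.pi * r2))
    return 0.5 * length * g, 0.5 * length * h

def gauss_solve(a, b):
    """Step 4 helper: Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def solve_benchmark(n_side=12):
    elems = build_mesh(n_side)
    nodes = [((p0[0] + p1[0]) / 2, (p0[1] + p1[1]) / 2) for p0, p1 in elems]
    # Step 2: constant elements, one value of u and q per element.
    u = [1.0 if abs(y - 1.0) < 1e-12 else 0.0 for _, y in nodes]
    n = len(elems)
    G = [[0.0] * n for _ in range(n)]
    H = [[0.0] * n for _ in range(n)]
    for i, a in enumerate(nodes):
        for j, (p0, p1) in enumerate(elems):
            G[i][j], H[i][j] = integrals(a, p0, p1, singular=(i == j))
        H[i][i] += 0.5  # c(A) = 1/2 for a node on the boundary
    # Step 4: every u is prescribed, so solve [G]{q} = [H]{u} for the fluxes.
    rhs = [sum(H[i][j] * u[j] for j in range(n)) for i in range(n)]
    return elems, u, gauss_solve(G, rhs)

def interior(point, elems, u, q):
    """Potential at an interior point from the boundary values (c = 1 inside)."""
    total = 0.0
    for (p0, p1), uj, qj in zip(elems, u, q):
        g, h = integrals(point, p0, p1)
        total += g * qj - h * uj
    return total
```

With `elems, u, q = solve_benchmark(12)`, which gives 48 boundary elements, the value `interior((0.5, 0.5), elems, u, q)` should come out close to 0.25, the exact temperature at the center of the plate by symmetry.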

When these steps are followed and the boundary is discretized into N elements, the resulting system of equations can be written in matrix form as:

$$[H]\{u\} = [G]\{q\} \tag{13}$$

where the coefficient $c(A) = 1/2$ of each boundary node is added to the corresponding diagonal term of $[H]$. The H and G matrices are then rearranged according to the boundary conditions, so that all the unknowns are collected on one side and all the known values on the other. The resulting equation is:

$$[L]\{x\} = [K]\{b\} \tag{14}$$

In Equation (14), $[K]$ is the matrix that multiplies the known values, while $[L]$ is the matrix that multiplies the unknowns. The vector $\{b\}$ contains the boundary conditions and $\{x\}$ the unknown variables. The solution can be obtained not only on the boundary but also inside the domain, through the post-processing step of the Boundary Element Method. Post-processing places points inside the domain where the potential is desired. Since all the unknowns are stored in $\{x\}$, the values of u(P) and q(P) are known on the whole boundary once the system is solved. Using these values, the potential at points inside the domain is computed with the following equation (Beer G. et al, 2010):

$$u_{\text{internal}}(P_i) = \sum_{j=1}^{N}\left[G_{ij}\,q_j - H_{ij}\,u_j\right] \tag{15}$$

In Equation (15), $u_{\text{internal}}(P_i)$ is the value of the potential at point $P_i$ inside the domain, and $u_j$ and $q_j$ are the values of u(P) and q(P) on boundary element j. The indices i and j refer to the internal point and the element on the boundary, respectively.

VALIDATION OF THE FORMULATION

To validate the BE formulation, a benchmark problem is used: 2-D steady-state heat conduction in a square plate, for which the results are compared with the analytical solution. The benchmark problem is shown schematically in Figure 1(a), and the boundary element mesh with 48 elements in total is shown in Figure 1(b). The temperatures of the right, left and bottom boundaries are kept at zero, while the temperature of the top boundary is 1. The solution of the benchmark problem can be obtained by separation of variables as (Incropera Frank F. et al, 2011):

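$$T(x_1, x_2) = \frac{2}{\pi}\sum_{n=1}^{\infty}\frac{(-1)^{n+1}+1}{n}\,\sin\!\left(\frac{n\pi x_1}{L}\right)\frac{\sinh\!\left(n\pi x_2/L\right)}{\sinh\!\left(n\pi W/L\right)} \tag{16}$$

where $L$ and $W$ are the extents of the plate in the $x_1$ and $x_2$ directions.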
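For reference, the series in Equation (16) can be evaluated with a few lines of code. The sketch below assumes a unit square (L = W = 1, inferred from the coordinate ranges used in the paper) and truncates the sum at 200 terms; both choices, and the function name, are assumptions. The ratio of hyperbolic sines is rewritten with exponentials so that large n does not overflow.

```python
import math

def analytical_temperature(x1, x2, length=1.0, width=1.0, terms=200):
    """Truncated series of Equation (16) for T = 1 on the top side, 0 elsewhere."""
    total = 0.0
    for n in range(1, terms + 1):
        coeff = ((-1) ** (n + 1) + 1) / n   # 2/n for odd n, 0 for even n
        if coeff == 0:
            continue
        a = n * math.pi * x2 / length
        b = n * math.pi * width / length
        # sinh(a) / sinh(b) = exp(a - b) * (1 - exp(-2a)) / (1 - exp(-2b))
        ratio = math.exp(a - b) * (1.0 - math.exp(-2.0 * a)) / (1.0 - math.exp(-2.0 * b))
        total += coeff * math.sin(n * math.pi * x1 / length) * ratio
    return 2.0 / math.pi * total

center = analytical_temperature(0.5, 0.5)
```

At the center of the unit square the exact temperature is 0.25 by symmetry (four rotated copies of the problem add up to T = 1 everywhere), which gives a convenient check on the implementation.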

Figure 1(a) shows the benchmark problem. The Laplace equation written inside the domain indicates that the thermal conductivity is constant. Figure 1(b) shows the boundary element mesh; the blue dots mark the start and end points of the elements. As the figure shows, only the boundary is discretized rather than the whole domain, so BEM needs fewer elements than FEM.

Figure 1. (a) Schematic drawing of the benchmark problem, (b) Mesh figure by BEM (48 Elements in total)



For comparison, a line x2 = 0.25, parallel to the horizontal axis, is chosen first. BEM solutions for different numbers of elements are compared against the analytical solution in Table 1. The BEM result agrees with the analytical solution to five decimal places with as few as 48 elements, and the discrepancy is no more than 0.2% even with fewer than 48 elements. This is a substantial saving in computational effort. Although BEM requires more memory than the finite element method, its accuracy for a given number of elements is high.

Table 1. Analytical Solution-BEM Comparison at x2 = 0.25 with Different Number of Elements

| Number of Elements | Analytical Solution | BEM Result | Error (%) |
|---|---|---|---|
| 12 | 0.09451 | 0.09432 | 0.2 |
| 24 | 0.09451 | 0.09449 | 0.02 |
| 36 | 0.09451 | 0.09450 | 0.01 |
| 48 | 0.09451 | 0.09451 | 0.000 |
| 60 | 0.09451 | 0.09451 | 0.000 |
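The Error (%) column is consistent with a relative error of the form |BEM - analytical| / analytical × 100. The short sketch below recomputes it from the tabulated values; the error definition is an assumption, since the paper does not state it explicitly.

```python
analytical = 0.09451
bem_results = {12: 0.09432, 24: 0.09449, 36: 0.09450, 48: 0.09451, 60: 0.09451}

# Relative error in percent for each mesh (assumed definition of the last column).
errors = {n: abs(value - analytical) / analytical * 100.0
          for n, value in bem_results.items()}

for n in sorted(errors):
    print(f"{n:3d} elements: error = {errors[n]:.3f} %")
```

Rounded to two decimal places, the recomputed values agree with the last column of Table 1.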

Table 1 shows how quickly and how accurately BEM converges to the analytical solution of this problem. The listed numbers are the total numbers of elements, all of which lie on the boundary. If the potential inside the domain is needed, post-processing gives the solution at any interior point. Some post-processing results are shown in detail in Figure 2.
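The post-processing step of Equation (15) can be illustrated in isolation with a case whose boundary values are known exactly. The sketch below assumes the harmonic field u = x1 on the unit square with k = 1, so that u on the boundary is x1 and q = ∂u/∂n is the x1-component of the outward normal. It then evaluates the sum of G q minus H u on a small point cloud. The field, the cloud, the quadrature and the function names are all assumptions chosen for the example.

```python
import math

GAUSS = [(-0.8611363115940526, 0.3478548451374538),
         (-0.3399810435848563, 0.6521451548625461),
         (0.3399810435848563, 0.6521451548625461),
         (0.8611363115940526, 0.3478548451374538)]

def boundary_elements(n_side):
    """Constant elements on the unit-square boundary, counter-clockwise."""
    corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    elems = []
    for c in range(4):
        (x0, y0), (x1, y1) = corners[c], corners[(c + 1) % 4]
        for m in range(n_side):
            a, b = m / n_side, (m + 1) / n_side
            elems.append(((x0 + a * (x1 - x0), y0 + a * (y1 - y0)),
                          (x0 + b * (x1 - x0), y0 + b * (y1 - y0))))
    return elems

def element_integrals(a, p0, p1):
    """Integrals of u* (G) and q* (H) over one element for an internal point a."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    length = math.hypot(dx, dy)
    nx, ny = dy / length, -dx / length
    g = h = 0.0
    for xi, w in GAUSS:
        t = 0.5 * (xi + 1.0)
        rx, ry = p0[0] + t * dx - a[0], p0[1] + t * dy - a[1]
        r2 = rx * rx + ry * ry
        g += w * (-0.5 * math.log(r2) / (2.0 * math.pi))
        h += w * (-(rx * nx + ry * ny) / (2.0 * math.pi * r2))
    return 0.5 * length * g, 0.5 * length * h

def internal_potential(point, elems, u, q):
    """Equation (15): sum over the elements of G*q - H*u for one internal point."""
    total = 0.0
    for (p0, p1), uj, qj in zip(elems, u, q):
        g, h = element_integrals(point, p0, p1)
        total += g * qj - h * uj
    return total

elems = boundary_elements(10)
# Boundary data of the assumed field u = x1, sampled at the element midpoints.
u_b = [0.5 * (p0[0] + p1[0]) for p0, p1 in elems]
q_b = [(p1[1] - p0[1]) / math.hypot(p1[0] - p0[0], p1[1] - p0[1]) for p0, p1 in elems]

# Square point cloud that stays away from the walls.
cloud = [(0.2 + 0.15 * i, 0.2 + 0.15 * j) for i in range(5) for j in range(5)]
max_error = max(abs(internal_potential(p, elems, u_b, q_b) - p[0]) for p in cloud)
```

Keeping the cloud away from the boundary avoids the nearly singular integrals that arise when an internal point approaches an element, which this simple quadrature does not treat.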

Figure 2. (a) Comparison of BEM and analytical solution: x2 = 0.25, 0.50, 0.75, (b) BEM solution

The plots in Figure 2 are the result of post-processing inside the solution domain. Figure 2(a) shows the variation of temperature in the horizontal direction for the chosen x2 values, and Figure 2(b) shows the temperature contours over the whole domain. For 48 elements, the BEM solution agrees well with the analytical solution. Close to the walls, however, the solution deteriorates slightly; this is a result of the near-wall singularity, which can be handled with suitable numerical techniques. The contour plot in Figure 2(b) is obtained by post-processing on a square point cloud in the solution domain. The borders of this square do not extend to the actual boundaries of the domain, so the region with the highest temperature values does not reach the corners of the plot. If the point cloud were extended to the boundaries, the corners would be resolved better. The same holds for Figure 2(a): the line graphs start at x1 = 0.1 and end at x1 = 0.9. Since the post-processing points do not reach the walls, there is a gap of length 0.1 on both sides of the graph.

GPU COMPUTING

A GPU (graphics processing unit) promises higher computational power than a traditional CPU (central processing unit). It has a large number of multithreaded cores that are specialized for computation-intensive, highly parallel work (MathWorks, 2012). GPUs are also efficient at simple arithmetic such as addition, subtraction, multiplication and division, and they therefore offer considerable speedups for many applications. In this part, the MATLAB® code is adapted for the GPU by making simple changes based on the features of the Parallel Computing Toolbox™. In some places in the code, a for-loop has to be changed to a parfor-loop so that its iterations can be executed in parallel. For a parfor-loop to work properly, the following requirements have to be taken into consideration:

• Independence
• Global and Transparency
• Classification
• Uniqueness

Uniqueness is the most challenging requirement when coding for GPU computation. To satisfy it, a cell array technique is used. The computations are run on an NVIDIA® GeForce™ GTX 850M graphics card. The GPU saves a considerable amount of time. Since the Boundary Element Method consists of several sub-parts, the computation time of each part has to be considered separately. The two most time-consuming parts are the assembly part, where the G and H matrices described in the Boundary Element Formulation section are formed, and the post-processing part, where the solution inside the domain is obtained. Table 2 compares the computational performance of the CPU and the GPU.
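The independence and uniqueness requirements can be illustrated with a loop in which every iteration computes one row of a matrix and stores it in its own slot. The sketch below is a Python analogue only: a thread pool stands in for MATLAB's parfor, a list of rows stands in for a cell array, and the simplified log-kernel row is a hypothetical stand-in for the assembly loop. It shows the structure of such a loop, not the authors' code and not the speedup.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def row_of_matrix(i, nodes):
    """One independent iteration: row i of a log-kernel matrix."""
    xi, yi = nodes[i]
    row = []
    for j, (xj, yj) in enumerate(nodes):
        if i == j:
            row.append(0.0)  # singular term left out of this illustration
        else:
            row.append(-math.log(math.hypot(xj - xi, yj - yi)) / (2.0 * math.pi))
    return row

def assemble_serial(nodes):
    """Ordinary loop: rows are produced one after the other."""
    return [row_of_matrix(i, nodes) for i in range(len(nodes))]

def assemble_parallel(nodes, workers=4):
    """Each iteration is independent and returns its own row, so no two
    iterations ever write to the same location."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda i: row_of_matrix(i, nodes), range(len(nodes))))

nodes = [(math.cos(2.0 * math.pi * i / 16), math.sin(2.0 * math.pi * i / 16))
         for i in range(16)]
same_result = assemble_parallel(nodes) == assemble_serial(nodes)
```

Because no iteration reads or modifies what another iteration produces, the rows can be computed in any order and the assembled matrix is identical to the serial one.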

Table 2. Comparison of Computation Times for Both CPU and GPU (Post-Processing Part is Included)

| Number of Elements | CPU Computation Time | GPU Computation Time |
|---|---|---|
| 48 | 15.72 | 3.05 |
| 96 | 32.10 | 4.81 |
| 192 | 65.82 | 8.19 |
| 384 | 141 | 14.89 |
| 768 | 318 | 31.85 |
| 1536 | 804.85 | 63.63 |

Table 2 shows the elapsed computation times, including the post-processing part, on the CPU and the GPU for the same number of elements. The point cloud in the domain is composed of 2500 points. The CPU takes considerably more time than the GPU, and the gap widens with the number of elements: for 1536 elements the GPU is roughly 12 times faster (804.85 versus 63.63). This is a substantial improvement, especially for industrial applications.
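The speedup implied by Table 2 is simply the ratio of the CPU time to the GPU time for each mesh. The sketch below computes it from the tabulated values; the dictionary names are arbitrary.

```python
cpu_time = {48: 15.72, 96: 32.10, 192: 65.82, 384: 141.0, 768: 318.0, 1536: 804.85}
gpu_time = {48: 3.05, 96: 4.81, 192: 8.19, 384: 14.89, 768: 31.85, 1536: 63.63}

# Speedup = CPU time / GPU time for the same number of elements.
speedup = {n: cpu_time[n] / gpu_time[n] for n in cpu_time}

for n in sorted(speedup):
    print(f"{n:5d} elements: speedup = {speedup[n]:.2f}x")
```

The ratio grows with every refinement of the mesh, from a little over 5 for 48 elements to a little over 12 for 1536 elements.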

Figure 3. Comparison of CPU and GPU

Figure 3 is a graphical representation of the difference between CPU and GPU performance. The GPU clearly outperforms the CPU, speeding up the computation by almost 10 times. Even though the number of elements used in this computation is not very high, the results give a good indication of how much the GPU can expedite the solution. The trend of the graph suggests that, as the number of elements increases, the difference between the time consumption of the CPU and the GPU becomes more pronounced. Since problems that involve on the order of a million elements require a substantial amount of time, the GPU is a strong candidate for reducing the time consumption.

CONCLUSIONS

In this study, the Boundary Element Method and its formulation for conduction heat transfer are presented. The benchmark problem shows that the BE formulation offers high accuracy with a small number of elements. BEM has several advantages:

• It discretizes only the boundary of the domain, which drastically reduces the number of elements required to reach an accurate result.

• Solutions at points inside the domain are obtained either along a cross-section or on a point cloud in the domain. No additional connectivity has to be formed in the domain, so the total number of elements is reduced.

The disadvantages of the method are that it is less accurate for non-linear problems and that it is applicable only to boundary value problems. As this study shows, the computation can be performed on both the CPU and the GPU, and the GPU computation performs considerably better. As the number of elements increases, the difference between the two becomes more pronounced; in the cases studied, the GPU was up to roughly 12 times faster.

REFERENCES

Antes H., 2010, A Short Course on Boundary Element Methods, Institut für Angewandte Mechanik, Braunschweig Technical University, Germany

Beer G. et al, 2010, The Boundary Element Method with Programming: For Engineers and Scientists, SpringerWienNewYork

Cetin B. and Baranoglu B., 2014, A particle flow specific boundary element formulation for microfluidic applications, 4th Micro and Nano Flows Conference, UCL, London, UK

Fenner R., 2014, Boundary Element Methods for Engineers, Part 1: Potential Problems (First Edition), Bookboon

Incropera Frank F. et al, 2011, Fundamentals of Heat and Mass Transfer (Seventh Ed.), Wiley, New York

Internet, 2012, MathWorks, GPU Programming in MATLAB, http://www.mathworks.com/company/newsletters/articles/gpu-programming-in-matlab.html

LaForce T., 2006, PE281 Boundary Element Method Course Notes, Stanford University, CA, USA

Salgado-Ibarra Ermes A., 2011, Boundary Element Method (BEM) and Method of Fundamental Solutions (MFS) for the boundary value problems of the 2-D Laplace's equation, University Libraries, University of Nevada, Las Vegas, USA

Tubbs K. R. and Tsai F. T.-C., 2011, GPU accelerated Lattice Boltzmann model for shallow water flow and mass transport, Int. J. for Numerical Methods in Engineering, 86 (3): 316–334
