Parallel 3D ADI Scheme for Partially Dimension Reduced Heat Conduction Problem
Abstract
This model describes the heat equation in 3D domains, and this problem is reduced to a hybrid dimension problem, keeping the initial dimension only in some parts and reducing it to one-dimensional equation within the domains in some distance from the base regions. Such mathematical models are typical in industrial installations such as pipelines. Our aim is to add two additional improvements into this methodology. First, the economical ADI type finite volume scheme is constructed to solve the non-classical heat conduction problem. Special interface conditions are defined between 3D and 1D parts. It is proved that the ADI scheme is unconditionally stable. Second, the parallel factorization algorithm is proposed to solve the obtained systems of discrete equations. Due to both modifications the run-time of computations is reduced essentially. Results of computational experiments confirm the theoretical error analysis and scalability estimates of the parallel algorithm.
1Introduction
An important trend in development of mathematical models, simulating various real world applications, deals with new type nonlocal boundary and conjugation conditions. The non-locality mechanism helps to simulate important physical processes more accurately. Still the existence and uniqueness of the solution, its stability is very sensitive to approximation of such nonlocal boundary conditions and require development and analysis of special discrete approximation techniques and modifications of implementation algorithms. As one important example we mention the simulation of enzyme-catalysed glucose oxidation and redox reactions over inhomogeneous surfaces (Skakauskas et al., 2018; Čiegis et al., 2018).
It is a standard situation in applications of the mathematical modelling technique that systems of nonstationary 3D differential equations should be solved. Even if state of the art high-order discrete approximations are constructed on adaptive meshes, the total complexity of the obtained computational algorithms is very large. It restricts our possibility to include such models into general optimization routines or to use them in real time decision making applications.
Recently, a new important technique has been proposed to solve such applied problems more efficiently. A model reduction technique is applied to avoid the effects of curse of dimensionality. The given 3D heat conduction problem is reduced to a hybrid dimension problem, keeping the initial dimension only in some parts of the domain and reducing it to one-dimensional equation within the domains in some distance from the base regions. Such mathematical models are typical in industrial installations such as pipelines.
Our aim is to add two additional improvements into this methodology. First, the economical ADI type finite volume scheme is constructed to solve the non-classical heat conduction problem. It is proved that the ADI scheme is unconditionally stable. Second, the parallel factorization algorithm is proposed to solve the obtained systems of discrete equations. Due to both modifications the run-time of computations is reduced essentially.
The method of partial dimension reduction was first introduced in Panasenko (1998) and then developed in Panasenko (2005). It is based on the asymptotic analysis of a solution of a partial differential equation set in a thin domain and on a projection of the variational model formulation on a subspace of functions having a form of the asymptotic expansion in all zones of regular behaviour of the solution. Then the dimension of the problem is reduced within the big part of the domain keeping the full dimension description only in small zones of a singular behaviour of the solution.
Applying this approach a model with nonlocal junctions of one dimensional and full dimensional equations is obtained. Note that nonlocal conjugation conditions make the main challenges in development of effective parallel numerical methods for this class of problems of hybrid dimension. Let us mention some papers where different techniques were used to discretize the obtained hybrid mathematical model. A finite volume setting was studied in Panasenko and Viallon (2015). The method of asymptotic partial decomposition of the domain was compared with several existing methods of dimension reduction in Amar and Givoli (2018) and its effectiveness with respect to the precision and time of the execution is shown. In a recent paper (Čiegis et al., 2020) ADI type finite volume scheme was considered for a 2D heat conduction problem.
However, the question about the most effective parallel numerical solvers for such problems of hybrid dimension is still an important topic. Non-classical conjugation conditions require to develop special solvers to get the discrete solutions. A straightforward way is to solve the obtained systems of linear equations by using general iterative solvers, e.g. well-known multigrid solvers. For multidimensional parabolic problems an alternative and most popular way is to use splitting methods (Hundsdorfer and Verwer, 2003; Samarskii, 2001). It is well-known that in many cases ADI type methods are the best selection.
Thus in this paper we have developed and analysed special efficient ADI schemes to solve the hybrid 3D partial dimension reduction models for a nonstationary heat conduction problem. Our second goal is to propose efficient parallel implementations of the constructed ADI schemes. All main topics required in the analysis of discrete schemes, i.e. the approximation, positivity and stability of the solution are investigated. The accuracy of the reduced dimension model and the scalability of the parallel algorithms are tested in a series of computational experiments.
The approximation error is investigated quite straightforwardly, since the regularity of the solution and dependence of it on the reduction parameter δ is known from results given in Amosov and Panasenko (2018), Panasenko (2005). A more complicated step is to investigate the stability of proposed parallel ADI schemes in the case of new non-local conjugation conditions.
The rest of the paper is organized in the following way. In Section 2 the problem is formulated. The nonstationary heat conduction equation is formulated in the 3D parallelepiped and the initial and boundary conditions are specified. In this section also, the partially reduced dimension model is constructed and an approximate solution is defined by considering a simplified domain when 3D small size subregions are coupled with a 1D line. The non-local conjugation conditions are defined at the truncations of the tube. In Section 3 the full problem in the 3D domain is solved by using the ADI stability correction scheme. It is proved that this scheme is unconditionally stable. Its implementation requires solving many linear systems with tridiagonal matrix and the classical factorization algorithm is used. The Wang parallel factorization algorithm also can be used for parallel computers. In Section 4 the modified ADI stability correction scheme is constructed for the reduced dimension heat conduction problem. It is proved that the unique solution of the obtained system of linear equations exists for each time level. The efficient factorization algorithm is defined, it is based on algorithms presented in Čiegis et al. (2020). A parallel version of the modified factorization algorithm for implementation of the ADI scheme approximating the reduced dimension heat conduction problem is constructed in Section 5. A theoretical scalability analysis of this parallel algorithm is done. In Section 6 results of computational experiments are presented and analysed. Some final conclusions are done in Section 7.
2Problem Formulation
Let us consider a parallelepiped
We are interested to solve a linear heat equation in
(1)
(2)
(3)
(4)
Next, following Amosov and Panasenko (2018), we consider a modified approximate problem. First, let
Denote a reduced dimension domain
(5)
(6)
(7)
(8)
(9)
In
By analysing the weak form of the heat equation, it is shown in Amosov and Panasenko (2018) that the following conjugation conditions are valid at the truncation points of the space domain
(10)
(11)
3ADI Scheme for the 3D Differential Problem
The uniform spatial mesh
The following standard operators are defined for discrete functions:
Then the heat conduction problem (1)–(4) is approximated by the following alternating direction implicit (ADI) stability correction scheme (Hundsdorfer and Verwer, 2003)
(12)
At all three steps 3D systems are split into a set of 1D problems along
Lemma 1.
If a solution of the problem (1)–(4) is sufficiently smooth, then the approximation error of ADI scheme (12) is
Proof.
For a self-completeness of the text and in order to explain the approximation property of the ADI scheme (12) we present a short proof of this result (for a detailed analysis see Hundsdorfer and Verwer, 2003). By adding all equations (12) the following discrete equation is obtained
(13)
It is a straightforward task to show the result presented in the following lemma.
Lemma 2.
The discrete operators
All operators
(14)
Lemma 3.
ADI scheme (12) is unconditionally stable.
Proof.
For the given normal operators
It is easy to see that the discrete stability function approximates the continuous stability function of the differential equation (1) with the second order accuracy.
4The ADI Scheme for the Partially Dimension Reduced Problem
Let
The heat conduction problem (5)–(11) is approximated by the modified version of the ADI scheme
(15)
(16)
(17)
(18)
(19)
(20)
Note that equations (19), (20) approximate the nonlocal flux conjugation conditions (11).
We define two discrete meshes
Let us define three operators for
Then we can write the ADI scheme as
(21)
(22)
Next, our goal is to prove that the discrete operators
For
Lemma 4.
The discrete operators
Proof.
First, we investigate the operator
Next, we investigate the operator
It follows from the obtained estimates that
Due to nonlocal conjugation conditions (19), (20) the classical factorization algorithm should be modified in order to solve 1D subproblems (17)–(20).
Lemma 5.
The unique solution of the linear system of equations (15)–(20) exists and it can be computed by using the efficient factorization algorithm.
Proof.
The given proof also defines the constructive algorithm to solve (17)–(20). Since the first part (15) of the discrete ADI scheme defines an explicit algorithm and 1D subproblems (16) in the second part of the scheme are efficiently solved by using the classical factorization algorithm, then it is sufficient to consider in detail the subproblem (17)–(20). For each
1. Domain
(23)
(24)
2. Domain
(25)
(26)
(27)
(28)
Next, we compute coefficients
(29)
3. Domain
(30)
Substituting (23)–(30) into equations (19) and (20) we get a linear system of two equations to find
(31)
It is noted in Hundsdorfer and Verwer (2003) that the stability analysis of such ADI schemes for three dimensional cases is far from easy and even the linear commuting case needs a closer attention. Since in our case operators
Here we restrict to writing the stability matrix
(32)
5Parallel Algorithm
In this section a parallel version of the ADI algorithm (15)–(20) is presented. We note that development of efficient parallel algorithms for implementation of ADI type solvers is quite a non-trivial task. First the factorization algorithm for 3D problems severely depends on data movement in the memory of computers (from and to in different levels of the memory). Only few arithmetical operations are done with each small size portion of data, therefore cost of data movement is very important issue (see, Guo and Lu, 2016; Otero et al., 2020). Second, nonlocality of some parts of the discrete scheme lead to modifications of the standard factorization algorithm and these changes should be resolved efficiently by the parallel algorithm (see Čiegis et al., 2014, where a parallel ADI algorithm is constructed to solve multidimensional parabolic problems with nonlocal boundary conditions). Third, different approaches of parallelization can be used for different architectures of parallel computers. At least three main classes can be considered, including GPU processors (Imankulov et al., 2021; Xue and Feng, 2018; Otero et al., 2020), shared memory processors (multicore processors) and general global memory parallel machines, when implementation of parallel algorithms can be done, e.g. by using MPI library (Čiegis et al., 2014). In this paper we restrict to the construction of parallel algorithms for general distributed memory computers. It is obvious that these algorithms can be efficiently used also for multicore computers and distributed clusters of multicore nodes.
First, in defining a parallel version of the ADI algorithm for the reduced dimension model we restrict to a a simple modification of the standard sequential factorization algorithm. Only two parts are used to decompose the domain
We start by defining a parallel algorithm for the solution of a tridiagonal system of linear equations:
(33)
Now we are ready to describe the main decomposition details of the parallel version of ADI algorithm (15)–(20).
1. The case of
Before computations of the explicit step (21), the first process sends the value
The
The last step
A simple but still accurate complexity model of the parallel ADI algorithm can be constructed
(34)
We note that this model takes into account only main computational and data communication costs. In real computations the values of these parameters can vary. One more important detail, i.e. automatic memory caching is also not estimated in this formula. Thus efficiency estimates of such models are only asymptotically valuable, but exactly such information is sufficient in most cases.
Let us define the speed-up
Since
2. The case of
Before parallel computations of the explicit step (21), the first process sends the value
The second part of data communication is done among two pairs of processes: the first and third processes exchange the values of
The
The
The last step
The complexity model of the parallel ADI algorithm for
(35)
It follows from this theoretical model, that now the data communication costs depend linearly on the size of
3. The case of
The complexity model of the parallel ADI algorithm for
(36)
Remark 1.
Let us consider the case when the number of processors/cores is larger than 8. In order to use all p processors, we modify the constructed parallel algorithm. We assume that p is an even number and factorize it as
6Computational Experiments
Consider the problem (1)–(4) in the domain Ω with geometry parameters
The accuracy of time integration of ADI schemes. First, we have tested the time integration accuracy of the ADI solver for the full dimension model. A uniform space grid
Table 1
τ | ||
0.002 | 2.016 | |
0.001 | 2.012 | |
0.0005 | 2.021 | |
0.00025 | 2.072 |
It follows from the presented results, that the accuracy of ADI scheme agrees well with the theoretical prediction.
The accuracy of the reduced dimension model. Next we investigate the accuracy of the reduced dimension model (5)–(11). For the space discretization a uniform grid
The numerical tests have been performed on the computer with Intel® Xeon® processor E5-2670, 8GB RAM, and the algorithms have been implemented using C++ language.
We note that CPU time for computing the full model solution by using the ADI scheme (12) is 55.9 seconds.
Table 2 gives for a sequence of reduction parameter δ errors
Table 2
CPU | 19.4 | 29.1 | 40.8 | 60.2 |
It follows from the presented results, that starting from
The efficiency of the parallel ADI scheme (15)–(20). In this section we present results of the parallel scalability tests. All parallel numerical tests in this work were performed on the computer cluster “HPC Vanagas” at the High Performance Computing Laboratory of Vilnius Gediminas technical university. We have used up to 8 cores of Intel® Xeon® processor E5-2630 with 20 cores (2.20 GHz) and 32 GB DDR4 of RAM. The parallel algorithms have been implemented using MPI library.
Our goal is to investigate the efficiency of the proposed parallel version of the ADI scheme (15)–(20). We have solved two problems with different sizes of the discrete meshes
Table 3
77.4 | – | – | 254.1 | – | – | |
17.5 | 4.42 | 1.12 | 57.79 | 4.40 | 1.10 | |
9.85 | 7.86 | 0.97 | 29.2 | 8.70 | 1.08 |
Two conclusions follow from the presented results of computational experiments. First, the constructed parallel algorithm scales well even for problems of moderate sizes.
Second, MPI library enables efficient usage of the allocated local memory for each process, thus the efficiency of the parallel algorithm
As a future task, it is interesting to consider the scalability of the modified parallel algorithm from Remark 1, when the number of processes is larger than 8.
7Conclusions
In this paper we have applied the new efficient approach how to solve the heat equation in 3D domains. This problem is reduced to a hybrid dimension problem, keeping the initial dimension only in some parts and reducing it to one-dimensional equation within the domains in some distance from the base regions. Such mathematical models are typical in industrial installations such as pipelines. We added two additional improvements into this methodology. First, the economical ADI type finite volume scheme is constructed to solve the non-classical heat conduction problem. A finite volume method is used to approximate space differential operators with non classical conjugation conditions between the 3D and 1D parts. It is proved that the ADI scheme is unconditionally stable.
Second, the parallel factorization algorithm is proposed to solve the obtained systems of discrete equations. The ADI scheme leads to non-iterative implementation algorithm and a set of one dimensional linear systems are solved by using the parallel versions of the factorization algorithm for the parallel solving tridiagonal equations systems in each direction. An efficient modification of the basic factorization algorithm is developed to resolve non-local conjugation conditions.
The presented results of numerical experiments confirm both main theoretical conclusions. Due to both modifications the run-time of computations is reduced essentially. First, the hybrid reduced dimension models can be used to simulate heat conduction models for a broad set of domains and CPU time of the serial algorithm is reduced up to factor 3. Second, the developed parallel version of the ADI algorithm enables to achieve effective acceleration of computations and it scales well even for discrete problems of moderate sizes.
References
1 | Amar, H., Givoli, D. ((2018) ). Mixed-dimensional modeling of time-dependent wave problems using the Panasenko construction. Journal of Theoretical and Computational Acoustics, 26: (3), 1850034. https://doi.org/10.1142/S2591728518500342. |
2 | Amosov, A., Panasenko, G. ((2018) ). Partial dimension reduction for the heat equation in a domain containing thin tubes. Mathematical Methods in the Applied Sciences, 41: (18), 9529–9545. https://doi.org/10.1002/mma.5311. |
3 | Čiegis, R., Katauskis, P., Skakauskas, V. ((2018) ). Numerical methods for fractional diffusion. Nonlinear Analysis: Modelling and Control, 23: (4), 234–250. https://doi.org/10.15388/NA.2018.2.6. |
4 | Čiegis, R., Suboč, O., Bugajev, A. ((2014) ). Parallel numerical algorithms for three-dimensional parabolic and pseudoparabolic problems with different boundary conditions. Nonlinear Analysis: Modelling and Control, 19: (3), 382–395. https://doi.org/10.15388/NA.2014.3.5. |
5 | Čiegis, R., Panasenko, G., Pileckas, K., Šumskas, V. ((2020) ). ADI scheme for partially dimension reduced heat conduction models. Computers & Mathematics with Applications, 80: (5), 1275–1286. https://doi.org/10.1016/j.camwa.2020.06.012. |
6 | Eymard, R., Gallouët, T., Herbin, R. ((2000) ). Finite volume methods. Handbook of Numerical Analysis, 7: , 713–1018. https://doi.org/10.15388/NA.2018.2.6. |
7 | Faille, I. ((1992) ). A control volume method to solve an elliptic equation on a two-dimensional irregular mesh. Computer Methods in Applied Mechanics and Engineering, 100: (2), 275–290. https://doi.org/10.1016/0045-7825(92)90186-N. |
8 | Guo, G., Lu, S. ((2016) ). Unconditional stability of alternating difference schemes with intrinsic parallelism for two-dimensional fourth-order diffusion equation. Computers & Mathematics with Applications, 71: , 1944–1959. |
9 | Hundsdorfer, W., Verwer, J. ((2003) ). Numerical Solution of Time-Dependent Advection-Diffusion-Reaction Equations. Springer Series in Computational Mathematics, Vol. 33: . Springer, Berlin, Heidelberg, New York, Tokyo. https://doi.org/10.1007/978-3-662-09017-6. |
10 | Imankulov, T., Daribayev, B., Mukhambetzhanov, S. ((2021) ). Comparative analysis of parallel algorithms for solving oil recovery problem using CUDA and OpenCL. International Journal of Nonlinear Analysis and Applications, 12: (1), 351–364. |
11 | Kumar, V., Grama, A., Gupta, A., Karypis, G. ((1994) ). Introduction to Parallel Computing: Design and Analysis of Algorithms. Benjamin/Cummings, Redwood City, CA. |
12 | Otero, B., Rojas, O., Moya, F., Castillo, J.E. ((2020) ). Alternating direction implicit time integrations for finite difference acoustic wave propagation: parallelization and convergence. Computers and Fluids, 205: , 104584. |
13 | Panasenko, G. ((1998) ). Method of asymptotic partial decomposition of domain. Mathematical Models and Methods in Applied Sciences, 8: (1), 139–156. https://doi.org/10.1142/S021820259800007X. |
14 | Panasenko, G. ((2005) ). Multi-scale Modeling for Structures and Composites. Springer Netherlands, Dordrecht. https://doi.org/10.1007/1-4020-2982-9. |
15 | Panasenko, G., Viallon, M.C. ((2015) ). Finite volume implementation of the method of asymptotic partial domain decomposition for the heat equation on a thin structure. Russian Journal of Mathematical Physics, 22: (2), 237–263. https://doi.org/10.1134/S1061920815020107. |
16 | Samarskii, A.A. ((2001) ). The Theory of Difference Schemes. Marcel Dekker, New York. https://doi.org/10.1007/1-4020-2982-9. |
17 | Skakauskas, V., Katauskis, P., Čiegis, R. ((2018) ). Modelling of the NO + CO reaction over inhomogeneous surfaces. Journal of Mathematical Chemistry, 56: (9), 2626–2642. https://doi.org/10.1007/s10910-018-0908-3. |
18 | Viallon, M.C. ((2013) ). Error estimate for a 1D–2D finite volume scheme. Comparison with a standard scheme on a 2D non-admissible mesh. Comptes Rendus Mathematique, 351: (1), 47–51. https://doi.org/10.1016/j.crma.2013.01.011. |
19 | Xue, G., Feng, H. (2018). A new parallel algorithm for solving parabolic equations. Advances in Difference Equations. 174: . https://doi.org/10.1186/s13662-018-1617-8. |