oct.2015 CHINESE JoURNAL oF ENGINEERING MATHEM rICS Vo1.32 NO.5 doi:10.3969/j.issn.1005—3085.2015.05.011 Article ID:1005・-3085(2015)05--0726--17 A Fast Propagation Method for the Helmholtz Equation术 LE G Wei (State Key Laboratory of Scientiifc and Engineering Computing, Chinese Academy of Sciences,Beijing 100190) Abstract:A fast method is proposed for solving the high frequency Helmholtz equation.The building block of the new fast method is an overlapping domain decomposition method for layered medium.In the new fast method,the computation domain iS irstly decomposed hiferarchically into many subdomains on diferent levels.Then the mapping from incident waves to out—going waves on all the subdomains are set up.Finally,the wave propagates on the subdomain boundaries on different levels to reach the solution to the Helmholtz equation.The new fast method iS of 1OW complexity,and suitable for parallel computing.Numerical experiments show that with the new fast method,2D Helmholtz equations with half billion unknowns could be solved eficientlfy on massively parallel machines. Keywords:Helmholtz equation;finite difference;fast method;domain decomposition method; PMT Classiifcation:AMS(2000)65N22;65F10 Document COde:A CLC number:O241.82;O241.6 1 Introduction We consider in this paper to solve the Helmholtz equation in the full space with Sommerfeld radiation condition △ +k2u一.厂,in 。, (1) r1 ( 一 尼u)_÷。,as r= _÷。。, where k iS the wave number. Many domain decomposition method have recently been developed to solve the Helmholtz equation,most of them are non—overlapped,and the major diferences be— tween them are the interface conditions.Engquist and Ying[ , ]proposed a sweeping Received:01 June 2015. Accepted:12 Aug 2015. Biography:Leng wei(Born in 1984),Male,Doctor,Assistant Professor Research field:numerical solution of partial diferential equations. Foundation item:The National 863 Project of China(2012AAO1A309). NO.5 Leng Wei:A Fast Propagation Method for the Helmholtz Equation 727 preconditioner by approximating the inverse of Schur complements in the LDLt factor— ization,Stolk[3.proposed a domain decomposition method with a transmissi0n condi— tion based on perfect matched layers,Vion and Geuzaine[4]proposed a double sweep preconditioner that use a transmission condition that involves Dirichlet—to-Neumann (DtN)operator,Zepeda and Demanet[ ]introduced the method of polarized trace that use a transmission condition in boundary integral form,Liu and Ying[。】developed an additive sweeping preconditi0ner that use a transmission condition built with the boundary values of the intermediate wave directly.Chen and Xiang[7,s]proposed the source transfer domain decomposition method that transfer the source in subdomains, and recently Du and W_u[。J improved the method so that the transfer applies in both directions. The domain decomposition methods in the literature usually approximately solve the Helmholtz equation with varying medium,either with approximated interface con— dition or with approximated Green function,thus they are commonly used as precon- ditioners for Krylov subspace method such as GMRES. An overlapping source transfer domain decomposition method is proposed for Helmholtz equation with layered medium,the method follows the natural wave tray- eling process in layered medium,which involves the reflections and refractions at the interface of the layers.The convergence of the new domain decomposition method is ensured by the overlapping region,and the good accuracy of the new domain decom— position method makes it the building block of the new fast method. The domain decomposition method suffers from slow convergence rate when the number of subdomains is large,thus multilevel grid is needed so that the informa- tion is brought to far away subdomains without passing the subdomains on the way. The upper level grid for Poisson type problem could be coarser since the amount of information decreases fast as the distance grows.Howeverfor Helmholtz equation, ,the grid size should be maintained small to represent wave shapes on the upper level grid.Fortunately,the trace on the subdomain boundaries could be used to represent the solution on the subdomain,thus the computation cost on upper level grid is not formidable. The fsta method we proposed first setup the trace mapping on subdomains of diferent levels.Then the sources are converted to traces on the bottom leve1.and propagate on higher and higher levels till the top level,then the traces on high levels are decomposed into traces on lower and lower level,at lst the traces ian the bottom levelis converted back to solutions and summed up.In such up and down process.the wave travels to far away regions via the traces on high levels. The rest of the paper is organized as follows.In section 2,an overlapping source 728 CHINESE J0URNAL OF ENGINEERING MATHEMATICS V0L.32 transfer domain decomposition method is proposed for Helmholtz equation with layered medium.In section 3.the fast algorithm is described.The multilevel domain decom- position with quadtree structure is built,and the algorithm to build incident trace to field trace mapping on subdomains is proposed,then source—up and solution—down algorithm are proposed.The numerical experiment for Marmousi model is present in section 4. 2 The overlapping source transfer DDM The foundation of the fast method iS the overlapping source transfer domain de— composition method for the Helmholtz equation.We first propose and analyze the overlapping STDDM for Helmholtz problem with three layered medium,then revise 讧 讦 讦 the method and substitute the solving of subdomain problem into mapping,and at last propose the overlapping domain decomposition method for four subdomains,which is the building block of the fast propagation method. 2.1 STDDM in three layered medium Consider the Helmholtz equation(1)defined in 。,where the source f is given, and the wave number iS diferent in three horizontal layers Y<-d, k(y)= d Y d, Y>d, s shown ian Figure 1.The upper interface Y=d is denoted F1,and the lower interface Y=-d is denoted F2. y…w 、、 、、l … 、 y rl .一 d ; { X X XF2 一一 d 一一— Figure l Domain decomposition in Y direction for three layered problem NO.5 Leng Wei:A Fast Propagation Method for the Helmholtz Equation The frequency domain wave equations defined on unbounded domain could be solved on truncated domain with the perfect matched layer as the absorbing boundary condition[10,11].To solve Helmholtz problem(1),the unbounded domain is truncated to a rectangle[一l1,z1]×[一l2,f2]j with a PML layer of length lpml attached to the boundary,and the truncated domain Q becomes【-11一lpml,ll+lpm1】×[一12—1pml,12+lpml】. We refer the domain without PML layer as the interior of domain Q,denoted Q.For simplicity,we denote 11+/pml as 11. The uniaxial PML method[u】is used in this paper.where the complex coordinate is stretched in X and Y direction separately fxj ( )=/aj(t)dt,J=1,2, J0 and the medium property is chosen that ( )=0 for Itf lj,and ( )>0 in PML layer Itl>l,.Then the PML equation is j-1 ・(AVu)+k2u=f,in Q, where (3) 一aiagf OA(x) /1(X1), ), J(x) =O/1 1) 2 2) The computation domain is decomposed to two overlapping subdomains,the upper one Q1=[一1—1,l—1]×[一 —lp l,l2+zp 1]and the lower one Q2=[一ll,11]×[一12一lpml,l+ /pm1],with an overlapping region【_.1.1,_11×[1]_i,2),as is shown in Figure 1.Similar PML equations as(31 are built on the two subdomains,and the parameter A and J in the PML equation are denoted A and or subdomaifn Qt,i=1,2. The new domain decomposition method first solve the subdomain problem with the restricted source .(AtVut)+k2ut= ,in Qt,i:l,2, (4) where fl=.厂. <0 for 1,and,2:.厂・)( ≥0 for Q2,and the solution is denoted for i=1,2. Then,the wave field in 1 is transferred as source to 2 meanwhile the wave ifeld in Q2 is transferred as source to Q1,with the new transferred sources the PML equations on the subdomains are solved and new wave fields are generated,and SO on ・(A1 ui十 )+ks 81十 = 1( ;), in Q1, ・(AlVu ̄)一南 , in Q1, (5) 1( )=一 ・(A2V钆 十 )+ 。 ;十 = 2( ), in Q2, (6) 皿2( ;)=一 V・(A2Vu ̄)一后 , in Q2, 730 CHINESE JoURNAL oF ENGINEERING MATHEMATICS V0L.32 where 1 and皿2 are the source transfer function,s is the iteration step,8=0,1,2,…. Note that the transferred source 1( )=0 for Y<l or Y>l+/pml,thus it has a compact support in the PML layer,SO does 1( ).At last,the PML solutions on subdomains are summed up as the solution obtained by the domain decomposition method o。 r“DDM=== ∑(札 +u ). s=0 (7) Although the PML equation(4)-(6)solves the truncated Helmholtz equation in the subdomain approximately,the convergence of the series(7)to the solution of(3)could be shown by )1 ,= (N +钆 )1 + )一,+ (∑(“, (∑( +乱一Ⅳ s=O s=1 N =一 (u )一 ( )+ ( 1+札21)+ (∑(ui+ )) s=2 N =一 (u )一 ( )+ (“, + )+ f ∑(u +u )) s=3 = =…(u + ), —÷。o.which could be ensured by _÷0 as Nand the remaining term ( + ) the convergence of the PML method[ together with the analysis of wave traveling in layered medium as follows. The solution of the domain decomposition method in the form of(7)could be interpreted as the superposition of the incident waves,reflected waves and refracted waves that propagate in the layers[ 引as is illustrated in Figure 2. ,r r Figure 2 Wave traveling抽threelayeredmedium Suppose the incident wave u0 comes from the upper layer,then at interface F1,u0 causes a reflected wave u00 going upwards in the upper layer and a refracted wave Uol N0.5 Leng Wei:A Fast Propagation Method for the Helmholtz Equation 731 going downwards in the middle layer.The wave Uo+Uoo+Uol is approximately the solution 2 of the subdomain equation(4)with i=1. Then at interface F2,U01 causes a reflected wave Uo12 going downwards in the lower layer and a refracted wave Uo11 going upwards in the middle layer.The wave 1+ 12+Uo11 is approximately the solution u5 of the subdomain equation(6). Then at interface F1,U011 causes a reflected wave U0111 going downwards in the middle layer and a refracted wave U01 10 going upwards in the upper layer.The wave 11+ l10十U0111 is approximately the solution of the subdomain equation(5). The traveling process goes on,and the superposition of all the waves is the solution to (3), u=U0+U00+U01+U012+U0l1+ 110+U0111+U01112+Uo1111+…, (8) and the series(8)is approximately the series(7). The convergence of the new overlapping domain decomposition method relates closely to the medium property of the layers and the size of the overlapping region. When the overlapping region of the subdomains lies inside the middle layer of the three, e.g.,l<d,the convergence rate of the domain decomposition method is at most the convergence rate of the series(8).The worst case happens when there is a narrow wave guide,and the overlapping domain lies inside the wave guide,e.g.k2>kl=k3,l<d and d is smal1.To avoid such case,the overlapping region should have a non-zero minimum size. The overlapping region ensures the convergence of the new domain decomposition method for layered medium.The convergence of non-overlapping DDM might deterio- rate if the subdomain interface lies right in a waveguide.We have two remarks on the new domain decomposition method. Remark 1 The convergence of the solution enables direct solving the Helmholtz equation with the method,rather than use it 88 a preconditioner,which is crucial for our new fst method.a Remark 2 An extend PML layer could be defined that it includes a PML layer and an overlapping layer that doesn’t absorb at all,for example,the layer[一ll,l1]× 【0,2+2p 1]is an extend PML layer.Since it’s all about the PML layer parameters,we do not make a distinction between the two and simply call them the PML layer. 2.2 Mapping instead of solving The domain decomposition method in the above subsection could be revised that the solving of PML equations on subdomains(5)and(6)is substituted by mapping. For subdomain Q1,a mapping 1 from incidents trace U on the line[一ll,11]x 0 to the wave solution in Q】is defined s afollows:Given U on the line【一l1,l1】×0,solve 732 CHINESE JOURNAL OF ENGINEERING MATHEMATICS VoL.32 as its extension such that ・(A2 也)+ =0, in【-ll,l1]×【0,l+/pm ̄], (9) (10) =U ,on[一11,_111×0.] It’S obvious that if U is the trace of a solution to(6),then is the restriction of that solution on the region[_.1.1,l—1]X[0,j,+fpml】.The extension is then converted as source 1( )=一 V・(A1 也)一 , in Ql, with which the wave field solution to PML equation in subdomain 1 is solved f ・(A1 面)+ 面= 1( ), in Q1. (12) The mapping is then defined as = 1( ). Another mapping from incidents trace U on the line[-/1,21】×0 to the ifeld trace UF on the same line,is defined by F=71(u )全G1( )l[-h,_1】 0. Although both the incident trace and the field trace is on the line[一11,11】×0,it is referred s aincident boundary or ifeld boundary,respectfully.For subdomain 2,similar mapping 2 and could be defined. Now the domain decomposition method for Helmholtz equation with three lay- ered medium could be revised as follows:first,solve the subdomain problem with the restricted source ・(A Vu )+k2ut= ,in Qi,i=1,2, (13) where^=,・ <0 for Q1,and,2=.厂・)( o for Q2,the solution is denoted for i=1,2,and the field trace of the solutions are ’。:u ×01 for i:1,2. Then each subdomain takes its neighbor’S ifeld trace s iats own incident trace,map the incident trace to filed trace,and SO on ,= ,u,in Q1, in Q2, ’卧 = ( I ), in Ql, n Q2, ( “), i¨= “= ,or 8=0,1,2,…,and the domain decomposiftion solution is 咖M: +u。+ (∑ )+ fz s=1 No.5 Leng Wei:A Fast Propagation Method for the Helmholtz Equation 2.3 STDDM with four subdomains The above domain decomposition method with two subdomain in Y direction could be easily extended to four subdomains in both z and Y directions.The major diference iS that the incident boundaries,field boundaries and their source transfer regions are a little complicated for four subdomains. The total domain Q is decomposed into four smaller subdomains QtJ,i,J=1,2. ,The interior(region without PML layer)of the subdomain Qt,J are denoted Qt,J,they are non—overlapped and their union iS the interior of the tota1 domain,as is shown in Figure 3(a),where the hatched area is the PML layer.Each subdomain J has lts PML layer lie in its neighbors.There are three kind of incident boundaries,denoted f,and three kind of ifeld boundaries,denoted r ,f0r subdomain Q J as in Figure 3(b)-Figure a(d)and Figure 3(e)-Figure a(g)respectfully,where the shadowed 3]yea is the source transfer region.and the thick lines are the incident or field boundaries. 2,the incident boundary for wave comes from For examples,on subdomain Q2.subdomain Q12 is shown in Figure 3(b),and the field boundary for wave goes to ,subdomain Q12 is shown in Figure 3(f). ,~ ~ Q12 .Q22 .Q11 .2.1 Q22 .(a) (e) Figure 3:Domain decomposition with four subdomains (g) The incident traces on boundary ,are denoted as U I,and the field traces on FThe mapping rom the ifncident trace to the solution boundary F Fj are denoted s U,a..“f is denoted on subdomain Qt,,, ,while the mapping from the incident trace to the . ifeld trace on subdomain Qif is denoted , The domain decomposition method with four subdomains is shown in Algorithm 1. In the algorithm,the wave propagates between children subdomains via the iteration (3)一(7),we call it the iteration of incident and field traces from now on. 734 CHINESE JoURNAL OF ENGINEERING MATHEMATICS V0L.32 Algorithm 1 Domain decomposition with four subdomains ,1:Solve the mapping ,J on subdomain QtJ,i,J=1,2, with direct solver. 2:Solve the local problem on QtJ with source ,,J=, , restrict the solution u to field trace 3:while∑Il 4: to its siblings 5: 6: >£do j as incident trace U Send subdomain Q ’s field trace Record the incident traces U I’ Map the incident trace to field trace U F, = ,J( “) 7: Set 8=s+1 8:end while J with the summation of incident traces using direct 9:Solve the local problem on Qt,s。lv erj the soluti0n is denoted g ( 嗜). 10:Sum up the solutions of all subdomains to get the total solution === >∑(』_- I一 、 0+ , 十 i I ( 、∑U )●__一 一,l ),I . =t,J=1,2 8>0 3 The fast propagation method 3.1 Hierarchical domain decomposition A rectangular domain of[一L,L]is decomposed into smaller rectangular blocks for subdomain1 on diferent levels.Denote the number of levels as NL+1,the level Z=0 is referred as the bottom level and the level l=NL is referred as the top leve1. The number of blocks in X direction at level f is 2NL一,where l=0,1,…=, .Let f1,2,…,2NL 1,on the level 1,the block which is the —th block in X direction and the J—th block in Y direction,is denoted ,J;l,where i,J∈厶.Each block shares an overlapping PML layer region of length/pml with its neighbors on the same leve1. The quadtree structure of the multiple leve1 domain decomposition is built as fol— 2 on level l=2NL~,…lows.Each block Qt;, ,1 has four children 2t,2j一1;2—1, Q2t2J;l一1 ,2J;l一1, 2 1,2J一1;f一1, Q2 一1,on level l一1.For simplicity,the children of block QtJ;1 is denoted Q ,;1—1,where ,i =2i一1,2i,J =2j一1,2j.On the other hand,each block Qi,J;l on level l has a j/21:/+1 on level 1+1,where l<NL.The father—son relationship of the father ̄ril21r.N0.5 Leng Wei:A Fast Propagation Method for the Helmholtz Equation 735 blocks leads to the quadtree structure.The hierarchical domain decomposition with 4 levels is shown in Figure 4. Figure 4:Hierarchical Domain decomposition wj如4 levels The incident boundaries and field boundaries on block ,j;1 include not only the boundaries between siblings as in Figure 3,but also its ascendants’incident boundaries and ifeld boundaries,as is shown in Figure 5,where the hatched area is the PML layer, the shadowed area is the source transfer region,and the thick lines are the incident or field boundaries.For example,the incident boundaries and corresponding source transfer region for ascendants of child Q2t2j;l is shown in Figure 5(b)一Figure 5(0,and ,the total field traces and corresponding source transfer region or chifld 2t2J;z is shown ,in Figure 5(g).The incident boundary of block ,J;l is denoted rl z,and the field J;l is denoted F Fboundary of block gti;z. ,, We see ;c C , J,.1—1 and F F;z c ,r The mapping from incident trace to solution on the block is denoted 拍,and the mapping from incident trace to ifeld trace on the block is denoted ,J;1. Q2 一12j:z .Q2 2j:l .(C) 2 1.2j一1:z Q2 2j—l:z .(a) (e) (g) Figure 5:Extension of incident boundaries and field boundaries 736 CHINESE JOURNAL OF ENGINEERING MATHEMATICS V0L.32 3・2 Setup phase In the setup phase,the mapping from incident traces to field traces is constructed bottom up level by leve1.The mapping on block Q j,iJ∈/0 could be computed with ,external direct solver,and the mapping on block Q ,. of level l>0 is computed as follows. Given an incident lies in 州,it must lie in the incident boundary of one of the children,denoted as Q 。;f~1.First the local problem on children 。;z一1 with source is considered,and the field trace of the solution on r .f_l is solved by mapping ,。,如“一1.Then the field trace of Q 。;l一1 is send to its siblings as incidentsand the iteration of incident and filed trace between siblings applies.The incident trace in the iteration is denoted vUt,J,;l-l ̄where s is the iteration number.At last,field trace on ,一1 caused by sum of incidents computed with the mapping , _l11 along with ,are add up as the field trace 川on 训 the field trace caused by on Q caused by 5, U Y. " ̄i0,jo;I-1( ) +∑ , ;z ,/,●一/ ,; i ,J ) , (15) (16) and the mapping is ∑ c( )=% The algorithm of building the mapping from incident traces to field traces is as follows. Algorithm 2 Build mapping of incident tra 2:for levels l:1,…3: 4: 5: 西 —————一 . 1:On level 0,build the mapping of incident to field with direct solver, , do On block , or ifncident lies in Find the children ;f do 。;f_1 such that lies in F I. 1 6: On children Q 0;f一1, 一1, .map the incident to ifeld trace U F。, and add part of them to father field trace U F, 7: set ;f-1=U F f_1 on children Q ;f_1=0 on other children Q J,;f_1. 一, and set 8: 9: while∑l u/F, ,s1『I>£do , If_】 Send the children’s corresponding field trace .一1 to its siblings as incidents V iI,,sq ,1 10: Map the incident trace to field trace ’ 1: ,,』';/--1\ -TI, s+l,-21) No.5 Leng Wei:A Fast Propagation Method for the Helmholtz Equation 737 11: 12: Set 8=8+1 end While 13: Map the sum of incidents to field trace on children, and add them to father’S ifeld% 14: end f10r 15:elld for 3.3 Solve phase With the mapping of incident traces to filed traces that is constructed on each block of all levels,the Helmholtz equation could be solved in two phases,the source—up phase and the solution-down phase. 3.3.1 The source-up phase In the source—up phase,the wave propagates on all levels bottom up as incident traces.The following problem is considered,for the block QtJ;1,the local solution on ,its four children are known,e.g., ∥;f_l,SO does their field traces ;l-’-l ̄how to solve the solution Ui 2 on Q ;l,and its ifeld trace 矗The iteration of incident and ifled trace between siblings applies directly,denote the incident traces in the iteration s a,l2—1,and the solution on Q ;z is Ui,j;l=∑( 0 + i,jl;l-1(∑咙, ).), i }jf 8 (17) and the field trace of t1J is ,U F,J; = iI , ( u/F,0 一 + 1,j,;l-1(∑叱 ’ ,J;z 、8 …t,J;I ) (18) Review the procedure we found that to apply the procedure to next level,the incident to field mapping operation , ,一1 of children is needed,while the solving operation ,川一1(∑ 8 一1)could be post processed.Apply the procedure from bottom level to top level leads to the following source—up algorithm. Algorithm 3 Source—up Require:Right hand side f of the linear system Ensure:Solution u on Q , ,and sum of incidents on QiJ;l on level l>0 1:On level l=0. solve the local problem on Q with the source 2:for levels l=1,…,ⅣL do =, , the solution ‰and the its ifeld trace U F.,0 are recorded. CHINESE JOURNAL OF ENGINEERING MATHEMATICS VoL.32 3: On block , , 一1 of the four children’S ∥Il_1 l use the field trace add the restriction of J -lj to set u/F0 ,,J1= ,,州_11 4: while l lT ̄tF,,;f一1lI> do fs,5: 6: 7: 8: Send children’s corresponding ifeld traces 多;f_1 s+l = 一 to its siblings as incidents Map the incidents to field Set 8=8+1 end while ( 9: Sum up the incidents∑ S 一1 on children, map them to the field traces on children, then add to U F,J.;z. 10:end for The solution to the total problem could then be expressed as :=: i,jElo ) ∑ 钆‰+ £.・ £.-0 ∑( ̄i',jt;1-1(∑咙, )/>0 i,j∈jl i ,j 。 (19) 3.3.2 The solution-down phase In the 8olution-down phase the wave propagates on all levels top down as incident traces. The solution(19)resulting from Algorithm 3 still needs to solve the local Helmholtz Droblem with given incidents on blocks of diferent levels,fortunately,the local solutions could be break down to lower and lower level till level 0.We consider the following problem:on the block Qt ,given the incidents First the incident traces how to solve%;f(%;2). 刊is divided into the incident traces on children,U I;f= .∑ 5+ 0—1,then with the incident to ifeld mapping , 一l on each childr enj field ( )=∑( ̄it,j';l-1(∑ 一 )). Apply the procedure from level l=帆(20) to l=1,since there are already sum of incidents∑ 一1 on children blocks f-l,the incidents∑ ’ 一1 from Q训 should be dded on children.The algorithm is described as follows. NO.5 Leng Wei:A Fast Propagation Method for the Helmholtz Equation 739 Algorithm 4 Solution-down Require:Solution u 0,;1 on Qt,, and sum of incidents on Qton level l>0 , Ensure:Solution u of the linear system 1:for levels Z=ⅣL,…,1 do 2: On block Q{f;z,divide the sunl of incidents to its children, ,诅 ; l一厶 i一、 I ,o,J ;l-1 t ,J 3 Map the incidents to ifeld  ̄r/F∥,1.1_1= ,∥|f-1( while lI 一 II>£do 5: Send children’s corresponding field trace ;f_1 to its siblings as incidents ∥s+l;f-1 6: Map the incident trace to field trace 川,sq-11= ,—, f_1\( frI,s +l,f_1) 7: Set 8=8+1 s J 8: end while 9: Add the sum of incidents on children Q ,;l-1, 『7Ii .一 ,J ;l--1‘一 ∑ 一 +∑ f一1 10:end for 11:On level Z=0. solve the local problem on Q with the incidents , and add the solution to total solution U. Now the solution to the total problem is =∑( + 。( ∑ vi, j;。)). (21) t,jEIo 4 Numerical experiments The new fast method is tested on the 2D Marmousi model in seismology,which is 3,000 m deep and 9,200 m wide.Only P—wave is considered,thus elsatic wave equation becomes an acoustic equation.The velocity profile is shown in Figure 6,the maximum velocity is 5500 km/s and the minimum velocity is 1500 km/s. No.5 Leng Wei:A Fast Propagation Method for the Helmholtz Equation 741 complexity is O(N。/ logⅣ1.The detailed time cost in setup phase is shown in Table 2.The mapping on the bottom block is solved with direct solver,e.g.MUMPS[ ,and the time cost is almost constant,since the bottom level block is of ifxed size.However, the time cost of building mapping on level l+l is roughly twice of level l,where 1>0, which is time consuming for large Helmholtz problems. 5 Conclusions A fast method is proposed for solving Helmholtz equations,the new method has a setup phase of complexity O(N。/ log N)and a solve phase of complexity O(N log N). Our future work are to reduce the computation time of the new method by exploiting the low rank structure of the mappings and to accelerate dense matrix operations with GPU. Talbe j:Time cost On seconds)of the new method Freq No. Time Time Time NL Size wl2 ̄r procs setup solve total 1 2400×800 37 12 40 195 235 2 4800 x 1600 70 48 140 205 345 3 9600 X 3200 137 192 333 309 642 4 19200 X 6400 270 768 1212 685 1897 5 38400×12800 537 3,072 2891 883 3774 Talbe 2 Detailed setup phase time cost On seconds) Time Time Time Time Time NL Level 0 Level 1 Level 2 Level 3 Level 4 1 39.6 —— —— 2 100 40.1 —— —— 3 l19 86.2 128 —— 4 127 74.7 256 754 5 129 98.7 355 716 1592 References: 【1】Engquist B,Ying L.Sweeping preconditioner for the Helmholtz equation:hierarchical matrix representa- tion[J].Communications on Pure and Applied Mathematics,2011,64(5):697-735 742 CHINESE JOURNAL oF ENGINEERING MATHEMATICS V0L.32 Engquist B,Ying L.Sweeping preconditioner for the Helmholtz equatiectly matched lay— n on:movi£j ng perfers[J1.SIAM;Multiscale Modeling and Simulation,2011,9(2):686—710 Stolk C C.A rapidly converging domain decomposition method for the Helmholtz equation[J].Journal of Computational Physics,2013,241:240—252 Vion A,Geuzaine C.Double sweep preconditioner for optimized Schwarz methods applied to the Helmholtz problem[J].Journal of Computational Physics,2014,266:171・190 Zepeda L N,Demanet L.The method of polarized traces for the 2D Helmholtz equation[OL].Available from http://arXiv/abs/1410.5910v2 Liu F,Ying L.Additive sweeping preconditioner for the Helmholtz equation[OL].Available from http://arXiv/abs/1504.04058 Chen Z,Xiang X.A source transfer domain decomposition method for Helmholtz equations in unbounded domain[J].SIAM Journal on Numerical Analysis,2013,51(4):2331—2356 Chen Z,Xiang X.A source transfer domain decomposition method for Helmholtz equations in unbounded domain part II:extensions[J].Numerical Mathematics:Theory,Methods and Applications,2013,6(3): 538-555 Du Y,Wu H.An improved pure source transfer domain decomposition method for Helmholtz equations in unbounded domain[OL].Available from http://arXiv/abs/1505.06052 Berenger J P.A perfectly matched layer for the absorption of electromagnetic waves[J].Journal of Com— putational Physics,1994,114(2):185—200 Chew W C.Weedon W H.A 3D perfectly matched medium from modiied Maxwellf’S equations with stretched coordinates[J].Microwave and Optical Technology Letters,1994,7(13):599—604 Chen Z,Zheng W.Convergence of the uniaxial perfectly matched layer method for time-harmonic scattering problems in two-layered media[J].SIAM Journal on Numerical Analysis,2010,48(6):2158—2185 Chew W C.Waves and Fields in Inhomogeneous Media[M].New Jersey:Wiley—IEEE Press,1999 Amestoy P R,Duff I S,L’Excellent J Y,et a1.A fully asynchronous multifrontal solver using distributed dynamic scheduling[J].SIAM Journal on Matrix Analysis and Applications,2001,23(1):15—41 求解Helmholtz方程的快速算法 冷伟 f中国科学院科学与工程计算国家重点实验室,北京100190) 摘要:本文提出了求解Helmholtz方程的一个新的快速算法.该算法是建立在有重叠区域的 区域分解算法之上的.该算法首先对求解区域进行层次的区域分解,然后建立了各层次的子区 域上的入射波到出射波的映射,最后通过层次的传播波的信息,得到Helmholtz方程的解.该 方法具有计算复杂度小、适合大规模并行计算的优点,数值实验表明,该方法能够有效的并行 求解有上亿自由度的二维Helmholtz方程. 关键词:Helmholtz方程;有限差分;快速算法;区域分解;完全匹配层