.. -*- rst -*- -*- restructuredtext -*-

======================================
Homework 7
======================================


Due 23.59, Dec/13/2020 
++++++++++++++++++++++++++++++++++++++++++++++++++

Parallel computing II: solving wave equation in 2D with MPI
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Note1.** This is your final assignment bearing **200 points** with potentially **50 bonus points**.

**Note2.** If you have not already studied and practiced the material
and codes in the sections `PDEs`__ and `MPI`__, STOP here! Go back and
do your study and practice and then come back and continue with this
assignment.

**Note 3.** It is very important that you follow the rules of thumb in
`Good citizenship`__.

--------------------------

Consider the wave equation in the conservative form in two dimensions

.. math::
   :nowrap:

   \begin{equation}
   u_{tt} =(a(x,y)u_x)_x + (a(x,y)u_y)_y + f(x,y,t), \ \ (x,y) \in D =[-1,1]^2, \ \ t \in [0,2],
   \end{equation}

augmented with initial conditions

.. math::
   :nowrap:

   \begin{equation}
   u(x,y,0) = f_1(x,y), \ \ u_t(x,y,0) = f_2(x,y), \ \  \quad (x,y) \in D,
   \end{equation}

and Dirichlet boundary conditions 

.. math::
   :nowrap:

   \begin{equation}
   u(x,y,t) = g(x,y,t), \ \  \quad  (x,y) \in \partial D.
   \end{equation}


Here, :math:`a(x,y) = 1+\sin(x) \, \cos(y)` is the given wave speed, and :math:`u=u(x,y,t)` is
the wave solution to be computed.

__ http://math.unm.edu/~motamed/Teaching/Fall20/HPSC/pde.html
__ http://math.unm.edu/~motamed/Teaching/Fall20/HPSC/mpi.html
__ http://math.unm.edu/~motamed/Teaching/Fall20/HPSC/stampede2.html#good-citizenship


Part I. Serial computation (70 points)
====================================================

1. Use the method of manufactured solutions (as explained `here`__) and manufacture an
   initial-boundary value problem for which the true solution is
   known. 

2. Write a Fortran code that implements a second-order accurate finite
   difference method for solving your manufactured problem by the
   method of lines (as explained `here`__). You will need to create
   subroutines for the manufactured forcing, boundary conditions, the
   Laplacian, abd initial data. You may would like to see the Matlab
   version of the code, located in the course repository under the
   directory ``hpsc2020/hw7/Matlab``. You can also let ``nx=ny``
   throughout this project, that is consider the same number of grid
   points in both ``x`` and ``y`` directions. This would imply that ``hx=hy``.

3. Verify that your code works correctly by verifying the convergence
   rate of the maximum approximation error over the domain :math:`D` at time :math:`t=2`. You
   will need to plot (in log-scale) the error versus a decreasing sequence of grid spacing and show that the
   error decreases with the expected rate, which is 2.
   

__ http://math.unm.edu/~motamed/Teaching/Fall20/HPSC/pde.html#iii-verification-convergence-and-manufactured-solutions
__ http://math.unm.edu/~motamed/Teaching/Fall20/HPSC/pde.html#method-of-lineas-and-finite-differencing-on-uniform-grids


Part II. Parallelization (130 points)
================================================


1. Make your serial Fortran code parallel using MPI library. Employ a
   one-dimensional parallelization strategy by splitting the
   computational domain :math:`D` into ``nprocs`` vertical slabs. This
   is a one-dimensional domain decomposition with ``px_max = npocs``
   processors ranging from the processor number ``px=1`` to
   ``px=px_max``. The communication between processors will be similar
   to the 1D wave equation that we have seen in class. 


2. Apply your code to the same manufactured problem that you found in
   part I. Check that your parallel code and domain decomposition and
   communication are working correctly by checking the convergence of
   the maximum error in approximate solution at time :math:`t=2` and
   making sure that the error decays with the expected rate 2. Include
   some neat plots of the solution at final time and of the error
   convergence in your report. The easiest way to output in parallel
   is to simply have each processor write its own part of the solution
   at the final time and then put it all together in a visualization program (e.g. Matlab). 
      

3. Study the strong and week scaling on Stampede2. Include some neat
   plots in your report. As always, start small: for example let
   ``nx=xy=100`` and use 5 nodes. Then gradually increase
   the problem size and the number of nodes. Do not use more than 50
   nodes and more than ``nx=xy=5000`` for this project. 


Part III. Extra assignments (50 bonus points)
===================================================      

   
**2D domain decomposition.** The above one
   dimensional domain decomposition does not minimize the
   communication between sub-domains.

   A. You can improve this by splitting the domain into ``px_max*py_max`` processors where
   ``px_max`` and ``py_max`` are chosen to minimize
   ``abs(nx/px_max-ny/py_max)``.

   B. After decomposing the computational domain in a two
   dimensional way you will need to update the
   communication. Start by assigning the left, right, top, and bottom
   neighbors of each process. Be careful abour ghost points: each
   process will have an array ``u(0:nxl+1,0:nyl+1)`` where the
   physical domain owned by each process corresponds to the indicies ``1:nxl,1:nyl``.
     
   C. Check that your domain decomposition and communication is working
   by computing the error in approximate solution.
   

   D. Study the strong and week scalings on Stampede2, and compare with
   the 1D strategy in Part II.
   
       
* As usual, arrange your results neatly in your report and comment on
  them.