Program 6: FAQ and recent questions+answers

Spring, 2019
Jewett

Program 6: Faq, questions, answers

ATMS 502 / CSE 566
Numerical Fluid Dynamic

Updated Tue. Apr. 23

[Frequently asked & other] Questions & Answers, and Hints

[Coding: C] When compiling in C I get errors saying RAND_MAX is undefined.

I had not realized this: You need to include <stdlib.h> in addition to <math.h>. (More on stdlib.h)
[Running parallel] I get errors trying to set OMP_NUM_THREADS to run my code in parallel

The settings I described in class (and that are in some of my scripts) are for the csh/tcsh command shells.
If you are using Bash by default (here's how to check), then use the following instead:

export OMP_NUM_THREADS=8 (or whatever value you want, between 1 and 16)
run-your-program-name

FYI: more on the bash export command, on environment variables and on Linux shells.
[Coding PL] I have flux(i-1/2) ... How do we determine the flux at the Right side of the grid zone?

You don't have to. The flux at the right side of the grid zone is equal to the flux at the left side of the grid zone for the next grid point / zone to the right. Assuming you store flux values the same as U (for example), then flux(i) is on the left side of the grid zone, and
flux(i+1) is the flux on the right side of the current grid zone.
[Plotting: plot3d] Does plot3d require that all arrays to be plotted have the same size?

Right now - yes, it does.
[Physics tests] Test cases testD3u, testD3v, and testD3w are not found

This is now fixed.

=========== earlier questions ============
[Results] Are the v values at the positions of u or of w?

V values are at neither. V is a half grid distance north / south of a theta/P’ point, just like U is a half distance east/west,
and W is a half distance above/below.

Keep in mind most of my plots have U averaged in X, V averaged in Y and W averaged in Z so all the arrays are (nx,ny,nz) when I
plot them. Also, X-Z slices of U will resemble Y-Z slices of V provided that the initial warm-or-cold-bubble is truly in the middle(x,y) of
the domain – especially early in the simulation. Later, the differing boundaries between X and Y will lead to some differences, but early
on those U(x,z) and V(y,z) plots will look similar.
[Coding PL] I am ... developing the code for piecewise linear advection scheme, and I’m not sure about which speed to use
to calculate Courant number ... Do we still need to use the averaged speed (due to staggering grid) to get the currant number?

No, do not average any speeds. For piecewise linear, you need the local Courant numbers as part of the Flux values which are co-located with
each U (or V or W - in any case, staggered) velocity. So don’t average - pass in the nx+1 (or ny+1 or nz+1) velocity values and calculate
abs( u(i)*dt/dx ) when calculating.
[Coding PL] To my understanding, we need two Courant numbers, each for one of the flux terms F(i+1) and F(i) to calculate the new
value. Since it’s used in the flux term, can we just use the original (without staggering average) u(i+1) for the F(i+1) term and u(i) for F(i)?

Remember you don't need to compute "two flux values" for each grid point.
Rather, compute an array (nx+1 wide) of flux values, and when you do the advection, you'll use (flux(i+1)-flux(i)) in computing the new theta,
where flux(i) is located at the same point as U(i) (assuming you're doing an x-advection pass. If y-advection, it is at V(j), etc.)a
[Advection tests] I am now trying Test A1 (x-advection) but encounter a problem: the temperature cannot be transported out of right boundary
(x-direction). On course website, it says in the main program "set U, V, or W to a constant value every time step, just after BC is called."
Should we also overwrite the BCs of temperature in main program ? Otherwise the symmetry BC for temperature is like putting a wall at
boundary. Should we use zero-gradient temperature as BCs in main program for Test A1(x-advection) and Test A3 (z-advection)?

Yes, you would have to “defeat” the boundary conditions for test A1 to work. Personally, I would not worry about the theta boundary conditions
and only see if you match the results in the first 200 seconds or so (but, you may want to override the U boundary conditions or you may get noise
there – or, if you get it, don’t worry about it since this test is inconsistent with our X BCs).

==================== Older comments from 2014 or before follow =============================
[Results] My results differ slightly from yours...

One thing to note is that my plots online have all variables at p- (or theta-) locations.
That is, I average U in X, V in Y, and W in Z so all my arrays are (nx,ny,nz).
You don't have to do this, but it may explain some difference between my results and yours.

Also, for test F, the use of random perturbations means your results are almost certain to disagree
somewhat from mine. Finally, remember that C results usually differ from Fortran.
[Coding] What loop ranges should I be using for U, V and W?

The notation I use here is A:B, meaning from index value of A to value of B.
For P and Theta, it is simply 1:nx, 1:ny, and 1:nz (in Fortran; in C it is I1:I2, J1:J2, K1:K2).
For U, your loops should run 2:nx, 1:ny and 1:nz (in Fortran. In C: I1+2:I2, J1:J2, K1:K2)
    ... because you set I=1 and I=nx+1 from the (a)symmetry boundary conditions at I=2 and at I=nx.
For V, your loops should run 1:nx, 1:ny and 1:nz (in Fortran. In C: I1:I2, J1:J2 and K1:K2)
    ... because V at J=ny+1 is set equal to your J=1 results (Fortran notation here).
    ... Note several folks have used 2:ny for V, but you need to solve at either J=1 or J=ny+1,
        so you can set the other point based on the periodic boundary conditions.
For W, your loops should run 1:nx, 1:ny and 2:nz (in Fortran. In C: I1:I2, J1:J2 and K1+1:K2)
    ... because W is always zero at k=1 and k=nz+1.
[Plotting] Vis5D does not work on stampede.

This is true; I was unable to get it to compile correctly on stampede. Please use visit instead.
I recommend using Vis5D only if you have a native Mac or Linux system for which there are precompiled binaries.
[Plotting] plot3d does not work correctly for +/- isosurface values in some cases.

This is true; I will fix this as soon as possible. In the meantime, plot the positive or negative values
of surfaces separately to compare to my results online.
... if you try to plot "+/-10", it will work; "+/-100" will not (it reads this as +/-10 instead of 100).
[Running] Batch system: How do I get input into my program, in batch?

In the place in the batch job "deck" where you run your program, instead of, say

    myprogram
or
    /usr/bin/time -p my program

do this instead:

    myprogramname << EOF
    first line of input
    second line of input
    ...
    last line of input
    EOF

actually any word will work, doesn't have to be EOF. END_OF_INPUT is what I used in my job deck.
For your info, my batch job deck is now on stampede in ~tg457444/502/Pgm6/brian_batch.deck
[Compiling] Having problems getting code to compile parallel.

Make sure your makefile has $(OPTIONS) (or however you worded it) both in the step for compiling an individual file
as well as for the full program. I have put my makefile on stampede at ~tg457444/502/Pgm6/makefile.txt, if you want
to take a look at what I used for my code.
[Extra credit] Extra credit?

Out of time; my apologies. Too late to put up now. There will be a curve on grading -
[Program ALMOST works] I have a symmetry problem / odd results

I'd concentrate on Test D. Drop the bubble, run 10 steps, check each step. Then run 400 steps...
U plotted in XZ cross sections should be symmetric, as should V plotted in YZ cross sections.
If not, check for something really strange in the P field.
When in doubt, turn off (comment out) the nonlinear U, V, and W advection terms. If
things look better, turn on the W nonlinear advection ( udw/dx, vdw/dy, wdw/dz - all the W terms).
If it still looks OK, turn on V nonlinear advection. If that is ok, then put back/turn on U
nonlinear advection. And you can always use a zero diffusion coefficient to "turn off" diffusion
for testing.
[Out of time] What should I do if I am out of time?

Submit all your code, and maybe a short explanation of what worked and what didn't.
You can submit your results from test cases so I can see how things were going.
Include any plots that are useful.
IF your code WORKS but there just isn't time to run everything, run the low-res final problem, instead.

I'll be lurking on compass and email while grading.

[Initial condition] Not sure if my initial density-averaged-to-W-levels is correct

For the 100-meter case, here are my base-state values, including density at theta/p levels, and at W levels -

   k  z(km)   t(K) theta,K   p(mb)   density   rho-w
 ----------------------------------------------------
   1  0.050  299.51  300.00  994.32  1.15656  *******
   2  0.150  298.53  300.00  983.02  1.14716  1.15186
   3  0.250  297.56  300.00  971.81  1.13780  1.14248
   4  0.350  296.58  300.00  960.69  1.12849  1.13315
   5  0.450  295.60  300.00  949.67  1.11923  1.12386
   6  0.550  294.63  300.00  938.73  1.11001  1.11462
   7  0.650  293.65  300.00  927.89  1.10084  1.10543
   8  0.750  292.67  300.00  917.13  1.09172  1.09628
   9  0.850  291.69  300.00  906.47  1.08263  1.08717
  10  0.950  290.72  300.00  895.89  1.07360  1.07812
  11  1.050  289.74  300.00  885.40  1.06461  1.06910
  12  1.150  288.76  300.00  875.00  1.05566  1.06014
  13  1.250  287.79  300.00  864.69  1.04676  1.05121
  14  1.350  286.81  300.00  854.47  1.03791  1.04234
  15  1.450  285.83  300.00  844.33  1.02910  1.03351
  16  1.550  284.86  300.00  834.28  1.02034  1.02472
  17  1.650  283.88  300.00  824.31  1.01162  1.01598
  18  1.750  282.90  300.00  814.43  1.00294  1.00728
  19  1.850  281.92  300.00  804.63  0.99431  0.99863
  20  1.950  280.95  300.00  794.92  0.98573  0.99002
  21  2.050  279.97  300.00  785.29  0.97719  0.98146

[Plotting] My values don't quite agree with yours and the min/max location is off by 1 point ...

Student figured out this one: it is because my online plots are always averaged to (theta,p) locations,
so U has been averaged in X and V in Y and W in the Z direction.
[Visualization] Does the VisIt/Vis5d plot need to be of the final problem? That makes a large file.
Is a 3d vis picture from a test case OK?

I will provide what specific visualization I want. A visualization of a test case is not sufficient for full credit.
[Running in batch] My batch job was killed due to exceeding wallclock (or)
                                the job could not be submitted (or)
                                trouble running my program in batch

There are three key changes you may need to make e.g. to the sample batch job deck in ~tg457444/502/Pgm6/[C or Fortran]
   a) change the queue name from "development" to "normal" (development has a 2-hour wallclock limit)
   b) change the requested wallclock time from five minutes (00:05:00) to something longer, e.g. 8 hours (08:00:00), if that much is needed
   c) change the running of the program - instead of my timing test program, use yours; and use the "<< END" syntax
       to provide input to your program. Example:
                       ./my_program_name << END
                       100
                       8250
                       8250
                       4250
                       (enter any parameters/input your program needs, the same way you do when running interactively)
                       END
[Plot3d/netcdf] plot3d is not making a netcdf file for me, complaining about "Error identifying field name"

This was happening if the field names etc written out for T=0 (or, the first time saved) differed from those used later.
I believe I have a workaround for this in place now.
[Plot3d] I want to make color (rather than the default black & white) postscript from my metacode/gmeta file.

Use ~tg457444/502/Tools/metaps [usual arguments just like metagif] color

The "color" argument specifies that the postscript will contain color.
[Amplitude problem] My pressure/vertical velocity (for example) fields are a factor of 2 too large (or) too small compared to yours.

This has turned out to be a problem with: (1) using dt rather than 2*dt for later u/v/w/p updates, (2) some sort of update problem
[Solution too fast] My fields evolve more quickly than yours

This was due to using tstep (either dt or 2dt) rather than dt for theta. Remember that theta advection and diffusion terms are always
multiplied by ∆t (not ever by 2*∆t) since theta, unlike u/v/w/p, is always forward in time.
[VisIt] How do I transfer the netcdf file from Stampede to my PC?

On a mac, I recommend the free program Fugu. On windows, WinSCP.
On linux, you could try a command line scp on Stampede of the file to your PC, or scp from your PC to pull the file off Stampede.
[Vis5d] I could not display fields in Vis5d over X-windows on my PC

Try this:

~tg457444/502/Tools/vis5d -mbs 400 run.vis5d

Vis5d will let you know if it recommends more memory to be used when running the program (-mbs allows you to do so).
[Solution problems] My field is evolving wrong [e.g. changing every other time step]

First, make sure your update is correct and the change from forward-in-time to centered-time differencing is done OK.
Secondly, at the start of each time step, do the first part of the update. So I do things this way in my main time step loop:

         u3=u1
         v3=v1
         w3=w1
         t1=t2

         call bc(...)
         call advect(...)
         call diffusion(...)
         call pgf(...)

         if (n > 1) then
            p1=p2
            u1=u2
            v1=v2
            w1=w2
         endif

         p2=p3
         u2=u3
         v2=v3
         w2=w3
         t1=t2
[vis5d] I get an error message, "X11 connection rejected ... Unable to open default display"

If you are connecting from Windows, make sure Putty has X11 forwarding/tunneling enabled, and that Xming is running.
If you are on linux or a mac, make sure you connected with "ssh -X you@stampede.tacc.utexas.edu"; if that doesn't work,
try "ssh -Y" and if that doesn't work, on a mac, start a Xterm within X11, type "xhost +", and try "ssh -X" again.
[Required plots etc] Are you going to post what plots are required and what should be submitted?

Yes and yes. The required plots will be listed on the same page as the official problem settings.
[Running in parallel] Is it ok to run the program serial [i.e. 1-processor] ?

You can run any of the test cases (or, for example, the low-resolution final problem) in parallel to get credit for parallelization. Feel free to run your final full-resolution integration (or whatever final run you make) serially if that works better.

For proof that you did run something in parallel, I require output from a batch job (the "filename.o123456" type batch output file) included in the files you submit to me in e.g. a zip file containing everything.
[What to plot out] Are the times on the required problem settings/plots page in seconds or in number of time steps ?

Every reference to "T=" on the official problem settings page is to time in seconds, not number of time steps. Divide by the time step for the number of steps to make the plot.
[Plotting: final problem, low resolution] Is the plotting condition the same as that for full resolution run ?

The slices (K=, J=, or I= settings on the ) refer to the full resolution domain. While the times are comparable to the low resolution case, the axes would differ. Since the low resolution case is at 150 meters instead of 50, I would choose a slice or cross section location at 1/3rd the grid index of the high resolution one (so instead of J=91 I'd try J=30).
[Making a batch run deck] Is [a batch deck file] supposed to exist already, or do we make it from scatch ?

I recommend editing/modifying the one I have in the ~tg457444/502/Pgm6/[C or Fortran]/*.deck files on Stampede. Remember to submit to me the batch output file -- you should have a .o file, that I'd like you to submit with your code etc.
[Problem specifying initial U field] Error when setting the initial U field ("Syntax error, found REAL_CONSTANT...")

There is an error in the handout; a "-" is missing. Instead of

u1(i,j,k)=(ranval 0.5)*upertur

do this:

u1(i,j,k)=(ranval-0.5)*upertur
[Running batch] When I ran my program in batch, I got the error "forrtl: severe (24): end-of-file during read, unit -4, file stdin"

Try this:

      (name of your program6 executable) << END
      now type in the input which you
      normally enter when running the program
      interactively on Ember.
      END

This 'redirects' all the input parameters you normally type in, and feeds them to your program when it is run in batch.
[PGF testB] My vertical velocity magnitudes are off by a factor of 2..."

The error was a parenthesis; the (∆t or 2∆t) time increment (I call it tstep) used in updating w3 in the PGF routine wasn't being multiplied times all the other terms (e.g. buoyancy) in the expression w3(i,j,k) = w3(i,j,k) + ...
[Running] When using batch, do I have to embed the parameter values in the code before compiling?
I ask so because when I tried batch, I didn't see the usual dialogues to input the parameter values.

There is a way to input/read-in the parameter values (no need to embed in code) when running in batch. You do something like

time program_executable_name << SOMETHING
first input line
second input line
...
last input line
SOMETHING

where 'SOMETHING' can be any 1-word character string. I sometimes use EOF, which stands for End of File, but that is an old habit.
Anything will do there, and it does what you want: the program runs, and where you read (or C, scanf), it gets the input you have
specified here, one line at a time.
[Running] When I tried to run the program with hand-in parameter setting (i.e. large 3d domain) with
program_name .1CORE interactively, it tells me "segmentation fault".

Before running your job (in batch or interactively), try:

unlimit
limit stacksize unlimited
(then run your program)

In my case this worked and I could run (for 30 min, anyway) the full problem at 50m grid spacing
[Plot3d] We use plot3d command "set window 90" ... when I tried this with larger grid-zone size tests, I found plot3d
doesn't really allow a value of 90. Only values within the range of 8~45 are allowed. Is it fine to plot with window size of 45 then?

That's true. The window value cannot exceed nx, ny and nz. So for nz=45 runs, use set window 45
[Parallelization] I've compiled with -openmp and the "time" command reports 98% use (i.e. not > 1 cpu)

[Needed to both compile and link with -openmp; their makefile did not have -openmp in the compile, only the link statement]
[Main pgm: order of calling routines] Is there a particular order to run the pgf, diffusion and advection routines?

I'm doing the same order [UVW advection, theta advection, diffusion, and PGF].
Given the way our advection is handled, we need to do the order of calls as: (a) advection, (b) diffusion, (c) PGF
Try to keep your arrays distinct. In particular, set t2=t1 at top of the time step loop and then do all work - e.g. theta advection 1-d copies
to/from the 3d array - using t2. That is, the t1 array that goes into diffusion is really the time n value (n+1 from previous time step).
[Installing Vis5d] I want to install Vis5d on my computer ... I am not sure if my Macbook can install Vis5d or not.

I have a page on our class site:

http://www.atmos.illinois.edu/courses/atmos502-fa2013/Vis5d/

I have vis5d on a couple of macs (macbook pro and mac mini) at my office. I installed fink (after making sure I had a recent Xcode,
which has latest gcc and is ~1 GB download) and then Vis5d. Alternatively, on Macs, try MacPorts.
[T advection TestA] My x advection has a factor of 2 problem and Y and Z advection do not move the bubble at all!

If Y and Z advection don't do anything, maybe it is an array name problem. Remember to keep copying to/from the same 3d array in your theta advection, so you copy your 1-d row/column of theta and U for x-advection and then after the update copy it back to the same 3d array. Maybe you switched from s1 to s2 somewhere in your code, and your Y/Z advection are working on an array that is no longer used.

If you are using the 33x33x16 domain, and the bubble correctly starts in the middle of the domain, it should have the bubble center move to right on the edge of the domain in X-advection tests in [dx*(33-1)/2] / U = 8000 meters / 20m/s = 400 seconds.

I do this: top of time step loop in main program: before anything else -

t2 = t1
u3 = u1
v3 = v1
w3 = w1
p3 = p1 (or just do this in your pgf routine as part of the update)

then I use t2 (the new, or n+1 time level theta) everywhere in my advection routine, so all updates are to it.
[∆t, tstep, #steps] Plotting at 100 sec actually means plotting at 200^th step with dt =1, right? Because, tstep becomes 2*dt?

No, because we are taking centered steps from (n-1) to (n+1); that is what tstep=2*dt is doing.
And we are doing forward time for theta, so it really is marching along at dt, not 2*dt.
[U advection] In the nonlinear advection for u, there is v*du/dy , which is the second part of the equation ...

That is expressed in program 6 as

   y-average{ x-average{v}*du/dy }

I pull these things apart from the outside inward. So this is a U equation, and the first part is an average in Y.
So look at the staggered grid diagram containing u and v; this is an X-Y diagram.
You seek a y-average of something centered on u(i,j).
So you want avg{v}*du/dy at j+1/2 (relative to u(i,j) !!),
   + avg{v}*du/dy at j-1/2 (again relative to u(i,j)), the sum divided by 2.
The first part of this is

  u3(i,j,k)=u3(i,j,k) - tstep/(4.0*dy)* (
   ( v2(i,j+1,k)+v2(i-1,j+1,k) )*( u2(i,j+1,k)-u2(i,j,k) )
   + ( second part ) )
[T advection] When I will use strang splitting for theta, what should be the range of the do loops?

Well, the directional splitting will still be one direction at a time, so you have interior-most loops over i or j or k for x,y,z advection. But now the outer loops will be over the two (instead of one) remaining indices. So the x-advection pass in my code looks like

do k
  do j
   i loop: copy t2(i,j,k) to q1d(i)
   set bc's on q1d
   i loop: copy u2(i,j,k) to vel1d(i)
   call mpl routine...
   i loop: compute t2(i,j,k) update with t2(i,j,k),flux(i),flux(i+1), and du/dx term
  enddo
enddo

Followup question: Are those loops always from 1:nx, 1:ny and 1:nz?

Well, this is theta advection, so the loops for theta will be all 1:nx or 1:ny or 1:nz. If you are carrying around ghost points in your 3d theta arrays, you could go ahead and copy the ghost points on the innermost array ("I" loop for x advection), e.g. -1:nx+2 ... or, just do 1:nx and handle the ghost points internally in your 1-d advection somehow. The velocity field advection will require nx+1 (or ny+1 or nz+1) points, so 1:nx+1 for U2.
[Putfield] When I tried NX+1,NY,NZ for U field, I got an error "bad array size found" from putfield...

Yes, all arrays [passed to putfield] need to be the same size. Please average the U, V, and W fields to theta & pressure points, so those
arrays are the same size as theta and p. so (u(i,j,k)+u(i+1,j,k))/2, etc.
[Init] If I use the same seeding value 0 for SRAND function, do I need to get the exact same result with yours?

No; I'm looking for qualitative agreement.
[T advection] In advection Test A, had problems with slight noise and change in amplitude as theta perturbation exited the +Y boundary and moved into the -Y boundary

The problem was due to the period boundary conditions in the Y-direction.
Remember theta(i,ny+1,k) = theta(i,1,k) and also that theta(i,0,k) = theta(i,ny,k).
[T advection] Are we still doing the theta advection one row or column vector at a time? And how do I know which direction of wind to send to the subroutine. For example, when doing x-direction advection do I send the v or w wind and do I send a 1D or 2D array? Or maybe I use a loop somewhere to get the whole 3D array advected?

The Strang splitting is like the directional splitting we did before, just in a specific order instead of one that varies from one time step to the next.

So x-advection always uses U, and always copies an I list of U and theta values, with dt/2
... y-advection always uses V and copies a J list of V and theta values with dt/2
... and of course z advection uses W and a vertical column of theta values with dt
... then y with dt/2, and finish up with x and dt/2.
Each pass completes a run through all the data in just their one direction, and the results are passed to the next. So in the sense program 6 was xz/zx ... now you can just fix the order and do x-y-z-y-x on each time step.
In the call to srand(0.0), what does 0.0 correspond to, as well as the rand(0) - what value does it hold?

srand initializes the random number generator. The idea is you can initialize it with whatever you want, and while the sequence that
follows when you call rand() is ostensibly randomly distributed, you are guaranteed the same series of numbers provided you give it the
same "seed" value - in this case zero. Someone seeking a truly unknown sequence of numbers could initialize it with the current time
in seconds, say. The argument to srand should really be an integer, not a real number: call srand(0).

rand(0) - returns a random value. As I recall the 0 is just a placeholder.
Intel documentation: SRAND, RAND
time steps & Strang splitting, dt vs. tstep

We do time steps as we did before regarding dt and tstep (tstep for everything except theta; theta gets dt). So for Strang splitting
for theta you pass dt or dt/2, not tstep or tstep/2. The change from tstep=dt to tstep=2*dt is done after the first time step, as before.
problem with initial condition - looks strange when plotted w/plot3d

The problem was that an array with ghost points was passed to the putfield routine, but putfield was called as though there were not.
That is, the array t(-1:nx+2,-1:ny+2,-1:nz+2) was passed to putfield with:

call putfield('T',0.0,t,nx,ny,nz)

What was needed: call putfield('T',0.0,t(1:nx,1:ny,1:nz),nx,ny,nz)
Boundary conditions - same in x,z as before? Does U still change sign? X boundaries still at i=1,nx? W still 0 at 1 and nz+1?

Yes to all; there are no changes in the x or z boundary conditions from Program #6. The new variable for flow in the Y direction, v, will be symmetric in x (like theta - same sign i.e. symmetry on either side of X symmetric boundaries),and 0-gradient in Z.
[Problems with vis5d, or a very large netcdf file from plot3d for visit]

I suggest trying to save just one or two data times from plot3d for vis5d or netcdf (for visit), instead of many data times.
Remember, when creating a 3d data file for vis5d or visit using plot3d, you can select the time(s) you want written out. An example:

Enter fields to plot, "all" for all fields, "?" for help, or "exit" to quit: vis5d

Creating Vis5D data set; enter name of Vis5D file [return=run.v5d]: case2.v5d
33 times available; range:    0.00 800.00
Enter first time, last time, and interval: 700 700 0
   1 times found -- range: 700 700
Opening NCAR graphics
Reading data
Storing to case2.v5d: T    T= 700.0
Storing to case2.v5d: P    T= 700.0
Storing to case2.v5d: U    T= 700.0
Storing to case2.v5d: V    T= 700.0
Storing to case2.v5d: W    T= 700.0
Storing to case2.v5d: Q    T= 700.0
Storing to case2.v5d: vort T= 700.0
Found all data; data time count=           7
Run Vis5D now? [y/n, return=yes]: y
Is there a way to select which time level you are plotting? What I mean is, when it asks (T, U, V, etc..) and you pick one,
it ends up creating a gmeta (or vis5d or netcdf) for all of it... I can't seem to find example syntax for choosing just T=50, for example...

If you type "?" in plot3d, you are given a list of possible commands. Typing set times will prompt you for a start and end time to plot.
I just started to wonder why my code is so slow?

FYI, advection is at least part of the problem. You can get a timing breakdown of your code using gprof -

http://www.ncsa.illinois.edu/UserInfo/Resources/Hardware/SGIAltix/Doc/timing.html#gprofp

You compile your code with certain options, run it (don't need to run terribly long; say, 50 time steps), and then use the profiling tool described above to get timing data. In my case, the monotonic PL code took the most time by far.
I have a question regarding plot3d: How do you plot vorticity?

Provided you have named your horizontal velocity components U and V, plot3d will compute vorticity for you.
Just use "vort" as the name of the field.
Problem with segmentation violation when running program

Try (either interactively or in batch) typing "unlimit" and then "limit stacksize unlimited" before running your program. If you are using Fortran, try compiling with subscript checking ("ncargf90 [or ifort] -check all -traceback program.f90 ...") before running.
Problems running in batch - e.g. segmentation violation

Try above comments regarding unlimit and limit commands.

If your program produced a RunHistory.dat file, try using plot3d to view as much output as you have - do you see the program blowing up, and if so, in what variable and in what part of the domain? These can be important clues in tracking down the problem.
[If you want to time your program or figure out how long it has to run...]

The obvious way is to run it for 10 or 20 time steps and scale that up to the full number of time steps you need to use. If you want a measure while the job is running, try some of this code.
How do I delete a batch job?

Type "qdel jobnumber" ... you can get the job number while jobs are running by typing "qstat -u $user", and use the jobnumber (not jobnumber.co-master...) in using the qdel command to delete the batch job.
[Program is running extremely slowly]

It could be your code has NaNs and Stampede is carrying them around during execution. This can greatly extend your runtime. There
may be compile options to catch this and cause the program to fail rather than continue to act in this way.
[Segmentation faults/NaNs when plotting/other errors when running parallel code

You should probably restrict your placement of parallel directives to well-confined multiply-nested loops. In particular, I suggest not parallelizing theta (scalar) advection, as that seems to be problematic (but do parallelize the box terms for U,V,W advection), and I would not parallelize any loops with function or subroutine calls within them.
I am getting errors running my program within batch ...
-or- How do I run my program - with input normally typed in - in a batch job?

See the "Parallel execution and batch" section above, to which I have added an example of the batch deck I used for program #5.
In general: You have several ways to get input to your program in batch jobs.

1) put the input - that you normally type when running interactively - *into a file*. For example, edit the file "data" on cobalt and put the same input text you normally use in that file (no comments or anything, just what you normally would type from the keyboard). Exit the editor. Then run "pgm6 < data" interactively to make sure it works, and use that in the batch job (you could do "time pgm6 < data" if you wish)

2) combine running the program and the input all within the batch deck script. Eg., after you have all the directives for number of processors and such, actually run your (compiled) program (assumed here to be named 'pgm6') with

time pgm6 << EOF
input
input
more input ...
...
last line of input
EOF

this feeds the input into the program, within your batch script.
Program is blowing up with segmentation fault if domain size is increased from test case dimensions to full problem.

Try typing (on Stampede):

unlimit
limit stacksize unlimited

and then run your program and see if this changes the behavior.
Another possibility is to run your code (as is, unparallelized) in batch, which may make more memory available to it.
[Problem with advection - scalar field was not moving]

Cause was found to be with update routine; in C, was using i1,i2,k1,k2 instead of I1,I2,K1,K2
(i1,i2,... variables were used but not set)
[Problem with vorticity plots looking OK except magnitude off by a factor of 10]

Likely cause is not entering the correct value of ∆x when running program plot3d. Plot3d needs ∆x to compute vorticity for you.
The default value of ∆x (if you just hit return when running plot3d) is 50 meters, and test problems are all run at 500 meters.
[Problem with all fields looking OK except pressure]

Cause was found to be with the 1-D density(z) field, which influences the pressure tendencies, causing the errors.
[Problem with all fields looking OK but V field min/max location off by one point]

Cause was error in averaging of v field to theta locations for plotting.
My (U,W, and/or P') fields are remaining at zero instead of evolving.

The first place to look is probably your update step. Is it possible you are copying an array of zeroes over your other array time levels? If not that, note that if you have a statement like:

u3 = u1 - ....

in your advect routine, for example, and you comment out the call to advect to do a PGF test, you will need to add the statement "u3=u1" before calling PGF. Otherwise, arrays like U and W may not be correctly taking the leapfrog step forward. As a matter of habit, I put u3=u1 and w3=w1 in my main program before calling advect, diffusion, and pgf, and those subroutines then just add to the new (n+1) values; they have statements like u3 = u3 + ... and w3 = w3 + ... in the advect, diffusion, and PGF routines.
Do we need to take 2 theta time steps for every leapfrog U/W/P' time step?

No. We can 'mix' the forward-in-time theta integration and leapfrog-in-time U/W/P integration without taking double the number of theta steps. Remember the leapfrog time is jumping 2*dt, but is doing so from (n-1) to (n+1), whereas theta is going from (n) to (n+1).
On page 1, for the discrete form of equation for 'theta', the first one is given in matrix form, does the upper entry in the matrix (big bracket) means in X-direction, lower entry means Z-direction?

Yes, exactly. Really, other than boundary conditions, you don't change your scalar (theta) advection at all for this program. The diffusion is off in a separate subroutine (with leapfrog rather than forward time differencing).
Check your initial state: it says 'set rho_w level @ k=1 to k=9999 as a check', what exactly do you mean by that? then set @ what value?

What I meant is: we create the density array I call rho at the U,Theta,P levels from the equations in the handout. Then we average this density to the intermediate vertical velocity levels - I call this averaged density array "rho_w". The very first rho_w (at k=1) is used in conjunction with vertical velocity, which is always zero at k=1. So it makes sense to just set rho_w(k=1) = 9999 or some other large value, because it should always be multiplied by zero. Otherwise it really isn't used, since W at k=1 is not evolved in time.
Check your initial state: it says 'set rho_w level @ k=1 to k=9999 as a check', what exactly do you mean by that? then set @ what value?

What I meant is: we create the density array I call rho at the U,Theta,P levels from the equations in the handout. Then we average this density to the intermediate vertical velocity levels - I call this averaged density array "rho_w". The very first rho_w (at k=1) is used in conjunction with vertical velocity, which is always zero at k=1. So it makes sense to just set rho_w(k=1) = 9999 or some other large value, because it should always be multiplied by zero. Otherwise it really isn't used, since W at k=1 is not evolved in time.
I have compared my results with yours and for time step 100 my u,w and p have similar results to yours. But, for time step 200 the u and w are a little bit different from yours but the pressure has values of 300-400 and your pressure field is about 600. I dont know if this is a problem.

I worry about pressure fields if the magnitude is increasing dramatically and consistently. If your field is different from mine but the pattern is similar (after all, only the gradients matter), I wouldn't worry about it. Only U, W, theta really matter - and the pattern of P should be similar (otherwise the other fields wouldn't be).

So, no, I won't be grading on absolute values of P. It would be incorrect as we have no real constraint on P', and I'm sure coding and compiler differences could lead to a drift. Perhaps if everyone removed the mean P' on every time step, it might make sense to compare the values instead of just the pattern/gradients, but as it is, it doesn't make sense to do so.
[Numerous questions about early vertical velocity field magnitudes in PGF test]

I believe my results are correct. If you find your vertical velocity amplitudes are appreciably larger (say, the W field minimum at step #2) than mine, you might check to make sure your (1) array updates are being done correctly, (2) that you set u3=u1 and w3=w1 at the start of each time step (that way you can add in contributions from various processes with w3=w3+... and u3=u3+... ; and (3) that your "tstep" (my time increment for u, w, and p') doubles from dt to 2*dt after the first time step, to transition your code from forward in time to leapfrog time for u, w, and p'.
[Boundary problems in scalar field in the advection tests]

The problem turned out to be with boundary conditions in the scalar advection code; theta BCs were not being set between the directionally split advection steps (between X and Z, or Z and X). In their case the problem did not occur if the mean potential temperature (theta-bar) was 0, but was apparent if it was = 300K.

[Time steps] "you set dt to 2dt after initial; but each timestep represents dt. Why are we increasing dt to 2dt again?"

That file listing includes the following:

  step    time     umin      umax      wmin       wmax     tmin     tmax       pmin       pmax
------------------------------------------------------------------------------------------------
    0    0.000 -20.00000  20.00000   0.00000   0.00000   0.00000  25.00000    0.0000   0.0000
 Changing to 2dt =  0.5000000    
    1    0.250 -20.00000  20.00000   0.00000   0.00000  -0.00092  24.99994    0.0000   0.0000
    2    0.500 -20.00000  20.00000   0.00000   0.00000  -0.00174  24.99966    0.0000   0.0000

The change shown was not dt, but tstep. Tstep was changed from dt, for the first step, to 2*dt, which is used
for all other time steps in the u,w,p expressions which look like:

u3 = u3 + tstep*(terms)

[Questions about staggering and averaging, in the pressure gradient terms; do we need to average p' since not at u,w locations?]

True, the p locations are different from u and w, but note the terms are really:

u: dp'/dx
w: dp'/dz

both of which, due to the staggering, are actually valid at u and w locations; dp/dx is centered on U, and dp/dz is centered on W.
No additional averaging is necessary. For the U and W equations, the pressure terms we use are derivatives valid at the U and W locations.
[Questions about theta diffusion]

The governing equations at the top of page 1 do not (but should) list diffusion for theta, though they are in the discretized equations.
The discretized equations are correct.
NOTE there is theta diffusion for all advection scheme cases; your diffusion will not vary depending on the advection form you use (this is a change from the handout).
Why are the units for the density g kg^-1? Why isn't it kg m^-3?

You are right and the handout is incorrect. It is kg/m**3.
What value of Cs (sound speed) we should use?

I will specify it along with the hand-in problem settings. For now, check against the test case results posted online (where it says what sound speed to use)
What is "rho_{theta level}" ??

This is density (rho-bar) valid at the heights (above the lowest W level, where z=0) of the u, p and theta variables.
The height of W levels is dz/2 from those of u,p,theta.
I don't understand where it says "Check the initial state with this data for dx=dz=100m, at k=11,..."?
... is this the initial condition we should use, with perturbations introduced only at level k=11? (plus the temperature perturbation
introduced later on) ?

No, that information is just for you to check your density calculations. The temp perturbations are independent of this.
What is p' ? Is it P = overbar{p} + p' where overbar{p} is calculated using the eqn on page 2? and total P using the value from the first line
on top of page 3? then we get the initial p'?

We only use overbar P to calculate density. Otherwise it is not needed.
We set the initial p' field to zero, and let it evolve - we do not calculate or need the total pressure at all.
What are xradius and zradius?

xradius and zradius are the terms in the denominator of the radius calculation expression (x and z terms).
Poor choices of wording, probably, since radius implies a circular rather than elliptical perturbation for our initial scalar theta field.

In programs 2/3, the initial condition expression used tests of d vs. r; here, the ratio of distance and radius is built into the r(m) expression, such that for r>1, there is no perturbation added to the initial theta (initial value = theta-bar = 300K) field, and for r <= 1, we use the expression in the handout.
What is directionally split, besides theta advection?

Nothing. Your perturbation temperature (theta) advection is the same as before -- directionally split. Everything else -- theta diffusion; u, w diffusion; u, w advection; pressure gradient and buoyancy terms - are unsplit, evaluated directly in 2D. So, e.g., the u advection looks like:
u3 = u3 + tstep*[(x advection of u)+(z advection of u)]
What do you mean "free slip" boundary conditions for u ?

Here it means you don't have to do anything. The horizontal flow u "slips" freely above the ground surface located where w(k=1)=0.0 ... had we used "no slip" conditions for u, you might, for example, set u to zero at k=1 for all time. The only relevant boundary conditions on u are how we set the ghost zones: 0-gradient at the top and bottom of the model domain.
[Not sure how to set up initial conditions for temperature with loop m=1,2 on handout page 3]

[2D case] For each location (i,k), you set t(i,k) to tbar, that is, to 300.0. Then, include a loop from 1...2 that checks to see if your (i,k) grid point is within radius of 1.0; if so, add to your t(i,k) the appropriate contribution from temperature perturbation #1 or #2 using the cos(r) function on page 3 of the handout.
[Hint: be very careful with array dimensions now that NX [and for 3D, NY] is not equal to NZ!]

If you program in C, you may still have arrays dimensioned [NXDIM][NXDIM] or the like, and you may have for() loops ranging from i1...i2 for both "x" and "y" dimensions from your code in program #3. So you will want to check carefully your array dimensions, and your do- (or for-) loop bounds in coding up program #5.
There are places in the handout, page 1, that show diffusion of theta, or of the perturbation theta' ... does it matter? Is this an error?

The handout is written in the more general case where the theta-bar value is a function of height: tbar(z), in my code. In such a case, doing horizontal diffusion can be on theta or theta', since the mean will drop out (not affect) the calculation. When tbar is a function of height, the mean will not "drop out" and so the diffusion we use is on the deviation from the mean, T-Tbar. For that reason, I used d**(theta')/dz**2 in the handout, but since Tbar is not a function of height - but, rather, a constant - it doesn't matter whether your x or z diffusion is on the total theta (i.e. t1 array) or on the deviation (t1-tbar).
[Problems with boundary conditions for theta, as applied to advection of temperature]

You can always enforce your temperature boundary conditions in the routine that calls your 1-d advection method. If you are performing x-advection of theta, you can set the ghost zone values in the 1-d array q before you pass it to the 1-d advection routine. Note that we have changed x-boundary conditions for theta since program #3; you may still have old 0-gradient code in there. For vertical advection of theta, you can similarly set your ghost zone values of your 1-d q array to be 0-gradient before calling your 1-d advection routine.

If not the above, then call a routine to set your temperature boundary conditions prior to calling your advection routine for theta.
Next?