
Meteorology for GCHP advection #342

Open
lizziel opened this issue Sep 5, 2023 · 70 comments

lizziel commented Sep 5, 2023

Name and Institution

Name: Lizzie Lundgren
Institution: Harvard University

New GCHP feature or discussion

This issue is to discuss current work related to meteorology used in GCHP advection. There are several things that I hope to get into version 14.3.0.

  1. Validation of GCHP runs using hourly mass fluxes. All official benchmarks use 3-hourly winds instead. Hourly mass fluxes are available for GEOS-FP at C720 (limited time range) and GEOS-IT at C180 (met option to be available in 14.3.0). Mass fluxes are not available for MERRA2.
  2. Implement mass flux regridding update in MAPL. This update from @sdeastham is currently a MAPL PR pending review. The same update needs to be ported to our MAPL fork, which is an older version of MAPL than the one the PR is based on.
  3. Document resource constraints when using mass fluxes. See Issue using REGRID_METHOD_CONSERVE_HFLUX reading c180 GEOS-IT data GEOS-ESM/MAPL#2118.
  4. The algorithm for computing dry pressure level edges for advection in the GCHPctmEnv gridded component needs an overhaul. We currently (1) sum moisture-corrected total pressure delta across all levels to get surface dry pressure and then (2) construct the 3D dry pressures from the surface dry pressure using Ap and Bp. This method should be compared with a direct computation of 3D dry pressure from 3D total pressure (no reconstruction from surface pressure using Ap/Bp).
  5. Add pressure diagnostics in advection. These will appear in HISTORY.rc for gridded component DYNAMICS instead of GCHPchem.
  6. Add budget transport diagnostics and/or vertical flux diagnostics per species.
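To make item 4 concrete, here is a minimal sketch of the two candidate algorithms. All variable and function names are hypothetical (this is not the GCHPctmEnv source); it assumes QV is specific humidity, so (1 - QV) is the dry-air mass fraction.

```python
# Sketch of item 4's two algorithms (illustrative only; names are hypothetical,
# not the GCHPctmEnv source). Per model column, top-down indexing (k=0 at top):
#   dp_total[k]     -- total pressure thickness of level k (Pa)
#   p_edge_total[k] -- total pressure at level edge k (Pa)
#   qv[k]           -- specific humidity (kg/kg); (1 - qv) is the dry-air fraction
#   ap[k], bp[k]    -- hybrid coefficients defining edges as ap + bp * ps
PTOP = 1.0  # model-top pressure in Pa (placeholder value)

def dry_edges_reconstructed(dp_total, qv, ap, bp):
    """Current method: sum moisture-corrected total pressure deltas over all
    levels to get surface dry pressure, then rebuild 3D dry edges from Ap/Bp."""
    ps_dry = PTOP + sum(dp * (1.0 - q) for dp, q in zip(dp_total, qv))
    return [a + b * ps_dry for a, b in zip(ap, bp)]

def dry_edges_direct(p_edge_total, qv):
    """Comparison method: moisture-correct each layer's total thickness and
    accumulate downward from the top edge, with no Ap/Bp reconstruction."""
    edges = [p_edge_total[0]]
    for k, q in enumerate(qv):
        dp = p_edge_total[k + 1] - p_edge_total[k]
        edges.append(edges[-1] + dp * (1.0 - q))
    return edges
```

The two generally differ, since the reconstruction forces the dry edges back onto the Ap/Bp hybrid surfaces; quantifying that difference is the point of the proposed comparison.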

Pinging @sdeastham and @1Dandan who will help with this work.

@lizziel lizziel added the category: Discussion An extended discussion of a particular topic label Sep 5, 2023
@lizziel lizziel self-assigned this Sep 5, 2023
@lizziel lizziel added this to the 14.3.0 milestone Sep 5, 2023

1Dandan commented Sep 6, 2023

Thanks @lizziel for initiating the discussion. Just to confirm: I can check out version 14.2.0-rc.1 to use dry pressure instead of total pressure for mass-flux simulations, right?


lizziel commented Sep 7, 2023

Hi @1Dandan. Yes, you can check out 14.2.0-rc.1 and use the transport tracer simulation. 14.2.0 is still in development because of an issue with full chemistry, but the transport tracer simulation should work fine. We are seeing some wonky results for certain tracers in the stratosphere, e.g. SF6, but I think that is okay for the mass flux validation. @sdeastham, correct me if I am wrong on that.

@sdeastham

That's correct! At this point we're on a fact-finding mission - let's at least start with 14.2.0-rc.1 and see what happens. We know there are issues no matter what we do, so we may as well try and quantify them.


1Dandan commented Sep 7, 2023

Thanks for confirming. Then I'll go ahead and set up a simulation (intending to run the year 2022 at C24) and will let you know when it is finished.


lizziel commented Sep 8, 2023

Sounds good. All of the runs we do should use the following:

  • Transport Tracer simulation
  • 1 year duration (Jan 1 2022 - Jan 1 2023)
  • Default diagnostics
  • Same initial restart file as used for 14.2.0 transport tracer benchmark (GEOSChem.Restart.20190101_0000z.c24.nc4 located here)
  • C24 grid resolution
  • GEOS-Chem version 14.2.0-rc.1

I will do the wind runs at Harvard and @1Dandan will do the mass flux runs at WashU. Here is a summary of the first runs we will do.

  1. @lizziel: "GEOS-FP" option when creating run directory. This uses 3-hourly winds in advection.
  2. @1Dandan: "GEOS-FP native data -> Use mass fluxes? yes" option when creating run directory. This uses hourly C720 mass fluxes in advection.

Both of these runs use dry pressure in advection. The run directory differences are all in ExtData.rc and GCHP.rc, with the primary ExtData.rc differences being (1) the fields used for advection (see below), and (2) raw files from GMAO versus processed files. The raw and processed files contain the same data, but the processed files have the vertical levels flipped and the data concatenated into daily files. Raw files also handle optical depth differently. Here is a breakdown of the primary differences in ExtData.rc.

GEOS-FP processed and winds in advection

 --- Surface pressure, 3-hr instantaneous ---
PS1 hPa N Y 0        none 1.0 PS ./MetDir/%y4/%m2/GEOSFP.%y4%m2%d2.I3.025x03125.nc
PS2 hPa N Y 0;001000 none 1.0 PS ./MetDir/%y4/%m2/GEOSFP.%y4%m2%d2.I3.025x03125.nc

 --- 3D variables, 3-hr instantaneous ---
SPHU1 kg_kg-1 N Y 0        none none QV ./MetDir/%y4/%m2/GEOSFP.%y4%m2%d2.I3.025x03125.nc
SPHU2 kg_kg-1 N Y 0;001000 none none QV ./MetDir/%y4/%m2/GEOSFP.%y4%m2%d2.I3.025x03125.nc

 --- 3D variables, 3-hr averaged ---
UA;VA m_s-1 N Y F0;013000 none none U;V ./MetDir/%y4/%m2/GEOSFP.%y4%m2%d2.A3dyn.025x03125.nc

GEOS-FP native and mass fluxes in advection

MFXC;MFYC Pa_m+2_s-1    N H F0;003000 none  0.6666666 MFXC;MFYC  ./MetDir/../../GEOS_C720/GEOS_FP_Native/Y%y4/M%m2/D%d2/GEOS.fp.asm.tavg_1hr_ctm_c0720_v72.%y4%m2%d2_%h2%n2.V01.nc4 2021-03-11T00:30:00P01:00
CXC;CYC   1             N H F0;003000 none  none CX;CY           ./MetDir/../../GEOS_C720/GEOS_FP_Native/Y%y4/M%m2/D%d2/GEOS.fp.asm.tavg_1hr_ctm_c0720_v72.%y4%m2%d2_%h2%n2.V01.nc4 2021-03-11T00:30:00P01:00
PS1       Pa            N Y  0        none  0.01 PS              ./MetDir/../../GEOS_C720/GEOS_FP_Native/Y%y4/M%m2/D%d2/GEOS.fp.asm.inst_1hr_ctm_c0720_v72.%y4%m2%d2_%h2%n2.V01.nc4 2021-03-11T00:00:00P01:00
PS2       Pa            N Y  0;001000 none  0.01 PS              ./MetDir/../../GEOS_C720/GEOS_FP_Native/Y%y4/M%m2/D%d2/GEOS.fp.asm.inst_1hr_ctm_c0720_v72.%y4%m2%d2_%h2%n2.V01.nc4 2021-03-11T00:00:00P01:00
SPHU1     kg_kg-1       N Y  0        none  none QV              ./MetDir/../../GEOS_C720/GEOS_FP_Native/Y%y4/M%m2/D%d2/GEOS.fp.asm.inst_1hr_ctm_c0720_v72.%y4%m2%d2_%h2%n2.V01.nc4 2021-03-11T00:00:00P01:00
SPHU2     kg_kg-1       N Y  0;001000 none  none QV              ./MetDir/../../GEOS_C720/GEOS_FP_Native/Y%y4/M%m2/D%d2/GEOS.fp.asm.inst_1hr_ctm_c0720_v72.%y4%m2%d2_%h2%n2.V01.nc4 2021-03-11T00:00:00P01:00
UA;VA     m_s-1         N Y F0;013000 none  none U;V             ./MetDir/Y%y4/M%m2/D%d2/GEOS.fp.asm.tavg3_3d_asm_Nv.%y4%m2%d2_%h2%n2.V01.nc4 2014-02-11T01:30:00P03:00

OPTDEP      TAUCLI+TAUCLW 0

Here are the differences in GCHP.rc:

< METEOROLOGY_VERTICAL_INDEX_IS_TOP_DOWN: .false.
< IMPORT_MASS_FLUX_FROM_EXTDATA: .false.
---
> METEOROLOGY_VERTICAL_INDEX_IS_TOP_DOWN: .true.
> IMPORT_MASS_FLUX_FROM_EXTDATA: .true.

I have a few questions for @sdeastham:

  1. Are these two runs sufficient to start?
  2. Do you see any problems in the config file entries above?
  3. Should we also do a mass flux run with total pressure in advection?
  4. Any other comments?

@1Dandan, do you have any questions?


1Dandan commented Sep 8, 2023

Hi @lizziel, yes, I do have some questions.

  1. Is there an option I need to turn on to use moisture-corrected mass flux, or is it the default?
  2. I usually spin up for one month, which may not be sufficient for concentrations in the stratosphere. For the simulation period, do you want me to start at Jan-01-2022 and watch how mass conservation evolves?


lizziel commented Sep 8, 2023

Is there an option I need to turn on to use moisture-corrected mass flux, or is it the default?

Good question. The default is moisture-corrected mass flux, to go along with the default of using dry pressure. Is using moisture-corrected mass flux the best way? I'm not sure. @sdeastham, do you think we should try different permutations of moisture-corrected mass flux and dry/total pressure? If yes, which combinations?

I usually spin up for one month, which may not be sufficient for concentrations in the stratosphere. For the simulation period, do you want me to start at Jan-01-2022 and watch how mass conservation evolves?

The 14.2.0 transport tracer restart file for 2019 was created from a 10-year run using 14.2.0. The GC-Classic restart file at the end was regridded to C24. I think this is sufficient for what we are trying to do (@sdeastham, tell us if you disagree).

@sdeastham

Assuming that the moisture-corrected mass flux means the mass flux of dry air only (i.e. multiplying the mass flux by 1.0/(1-QV) where QV is taken from the upwind grid cell), then you would want to use the moisture-corrected flux for dry pressure advection and the original (total) flux for total pressure advection. I don't think that any other combination makes sense but happy to discuss!
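A minimal sketch of an upwind humidity correction along these lines (a hypothetical helper, not GCHP code; it assumes QV is specific humidity, so the dry-air mass fraction of the transported air is 1 - QV):

```python
def dry_mass_flux(mf_total, qv_left, qv_right):
    """Convert a total-air mass flux across a cell face into a dry-air flux,
    taking QV from the upwind cell (positive flux flows left -> right).
    Assumes QV is specific humidity, so (1 - QV) is the dry-air fraction."""
    qv_upwind = qv_left if mf_total >= 0.0 else qv_right
    return mf_total * (1.0 - qv_upwind)

# Positive flux: QV comes from the left (upwind) cell
print(dry_mass_flux(10.0, 0.02, 0.05))
# Negative flux: QV comes from the right (upwind) cell
print(dry_mass_flux(-10.0, 0.02, 0.05))
```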

As for the restart file - I think that's fine. Using a consistent set of initial conditions is all that really matters until we can get to the point where we can do something rigorous in comparison to GEOS. Starting from a high-res file and degrading to the target resolution would be better than starting from a GC-Classic file or low-res file, but I think the differences will be small.

@sdeastham

As for the questions you posted @lizziel:

  1. Are these two runs sufficient to start? Yes, although see 3
  2. Do you see any problems in the config file entries above ? Nothing obvious
  3. Should we also do a mass flux run with total pressure in advection? I think this is a good idea. We know it's got problems but data will help us in diagnosis
  4. Any other comments? Thanks for leading the charge on this!


lizziel commented Sep 8, 2023

Thanks @sdeastham!
There will now be a third run to add to the list. I will summarize all three here to avoid confusion.

  1. @lizziel: Dry pressure and 3-hourly winds in advection. "GEOS-FP" option when creating run directory. No manual changes needed.
  2. @1Dandan: Dry pressure and moisture-corrected hourly C720 mass fluxes in advection. "GEOS-FP native data -> Use mass fluxes? yes" option when creating run directory. No manual changes needed.
  3. @1Dandan: Total pressure and native hourly C720 mass fluxes (not moisture-corrected) in advection. "GEOS-FP native data -> Use mass fluxes? yes" option when creating run directory. Make manual changes in file GCHP.rc:
USE_TOTAL_AIR_PRESSURE_IN_ADVECTION: 1
CORRECT_MASS_FLUX_FOR_HUMIDITY: 0

I am still waiting for the restart file to show up on the Harvard ftp site. I will post here when it is available. I also want to double-check that the total air pressure and native mass flux options look okay before proceeding.


lizziel commented Sep 11, 2023

The 14.2.0 transport tracer restart file to be used for the mass flux runs is now available. Download file GEOSChem.Restart.20190101_0000z.c24.nc4 at http://ftp.as.harvard.edu/gcgrid/geos-chem/1yr_benchmarks/14.2.0-rc.1/GCHP/TransportTracers/Restarts/.


lizziel commented Sep 11, 2023

I am using the following libraries for the GEOS-FP winds run:

  1) gmp/6.2.1-fasrc01    5) openmpi/4.1.0-fasrc01   9) netcdf-c/4.8.0-fasrc01
  2) mpfr/4.1.0-fasrc01   6) zlib/1.2.11-fasrc01    10) netcdf-fortran/4.5.3-fasrc01
  3) mpc/1.2.1-fasrc01    7) szip/2.1.1-fasrc01     11) flex/2.6.4-fasrc01
  4) gcc/10.2.0-fasrc01   8) hdf5/1.10.7-fasrc01    12) cmake/3.25.2-fasrc01

I will use 96 cores across 2 nodes (48 cores per node) and enable monthly mid-run restart files.


lizziel commented Sep 14, 2023

The dry pressure + winds run is now complete. @1Dandan, do you have an idea of when you will be able to do the mass flux runs? To share it with me you can post it at http://geoschemdata.wustl.edu/ExternalShare/.


1Dandan commented Sep 15, 2023

Hi @lizziel, yes, it is running now and will probably take around 1 week to finish. It seems slower than the wind runs. I will post the results at http://geoschemdata.wustl.edu/ExternalShare/ once they are ready.


lizziel commented Sep 15, 2023

Great, thanks! It is also extra slow because you are using the native files instead of the usual processed files. The two datasets have the same grid resolution, but the processed data are concatenated into daily files with far fewer collections. Opening and closing many files per day causes a performance hit.


1Dandan commented Sep 27, 2023

@lizziel, thanks for your patience. Both simulations, with total and moisture-corrected mass fluxes, have been completed. I used the same restart file as suggested. Configuration files, restarts, and outputs are copied to:
For total mass-flux run:
http://geoschemdata.wustl.edu/ExternalShare/GCHP-v14.2.0-rc.1/rundir-TT-MF-c24-tot/
For moisture-corrected mass-flux run:
http://geoschemdata.wustl.edu/ExternalShare/GCHP-v14.2.0-rc.1/rundir-TT-MF-c24-dry/
Let me know if you need any other files.


lizziel commented Sep 28, 2023

Hi @1Dandan and @sdeastham. Unfortunately the recent 1-year fullchem benchmark for 14.2.0 revealed a problem with the GC-Classic to GCHP restart file conversion that also impacts the runs that @1Dandan and I just did. GCPy was used for the first time to generate GCHP restart files, and it went under the radar that the lev dimension retained the GC-Classic lev attribute positive "up". When GCHP reads such a file, all 3D restart variables are vertically flipped. Amazingly this does not crash the model, but it does lead to incorrect values, particularly in the stratosphere for the transport tracer simulation.

Apologies that we will need to rerun these simulations. I will post where to get a new restart file once it is available. We are going back to using csregridtool to generate GCHP restarts for these benchmarks until GCPy GC-Classic to GCHP restart conversion is more thoroughly validated.
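For reference, the orientation problem boils down to honoring the lev "positive" attribute. A minimal sketch (hypothetical helper, not actual GCPy/GCHP code, assuming GCHP expects top-down level ordering):

```python
def ensure_top_down(column, positive):
    """Return a vertical column ordered top-down, given the netCDF lev
    'positive' attribute. GC-Classic files are positive='up' (surface first);
    a restart written for GCHP with that attribute left in place gets read
    upside down, which is the bug described above."""
    if positive == "down":
        return list(column)        # already top-down
    if positive == "up":
        return list(column)[::-1]  # bottom-up data: flip the vertical axis
    raise ValueError(f"unrecognized 'positive' attribute: {positive!r}")

# A 5-level pressure column stored surface-first (positive='up'):
column = [1000.0, 800.0, 600.0, 400.0, 200.0]  # hPa
print(ensure_top_down(column, "up"))  # [200.0, 400.0, 600.0, 800.0, 1000.0]
```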


1Dandan commented Sep 28, 2023

I see. I'll wait for the new restart file to rerun simulations.


1Dandan commented Oct 30, 2023

Hi @lizziel, I am wondering if the new restart file is ready yet?


lizziel commented Oct 30, 2023

Hi @1Dandan, yes, sorry for the radio silence! For the new runs we can use the restart file used for the 14.2.0 benchmarks found here: http://geoschemdata.wustl.edu/ExtData/GEOSCHEM_RESTARTS/GC_14.2.0/. Let's use the official 14.2.0 release for the runs since it is now released. Let me know if you have any questions.


1Dandan commented Oct 30, 2023

Sure, I see that GCHP v14.2.2 is released. @lizziel, do you want me to restart the transport tracer simulations with the official version, or is v14.2.0-rc.1 also fine?


lizziel commented Oct 31, 2023

We should use v14.2.2 since it automatically links to the correct restart file, and 14.2.1 includes a fix for using native GEOS-FP fields in the transport tracer simulation. Did you run into an error with the ocean mask in your last runs?


1Dandan commented Oct 31, 2023

Sure, I'll use v14.2.2 then. I will let you know when the runs finish.

Did you run into an error with the ocean mask in your last runs?

Oh, yes. I forgot to let you know that I edited the ocean mask line in ExtData.rc for v14.2.0-rc.1 as follows:

 #==============================================================================
 # Country/region masks
 #==============================================================================
 #OCEAN_MASK   1 N Y - none none FROCEAN    ./MetDir/2011/01/GEOSFP.20110101.CN.025x03125.nc
 OCEAN_MASK   1 N Y - none none FROCEAN    ./MetDir/GEOS.fp.asm.const_2d_asm_Nx.00000000_0000.V01.nc4
 #


lizziel commented Oct 31, 2023

That's the bug. It should be fixed in 14.2.2. Let me know if you run into anything else that you need to fix. I can put any fixes into the next version.


1Dandan commented Nov 7, 2023

Hi @lizziel, the two mass-flux transport tracer runs at C24, with dry pressure and total pressure, have finished. Both ran smoothly without any errors.

The mass-flux simulation results with dry pressure and moisture-corrected mass flux are at:
http://geoschemdata.wustl.edu/ExternalShare/GCHP-v14.2.2/rundir-TT-MF-c24-dry/

The mass-flux simulation results with total pressure and non-corrected mass flux are at:
http://geoschemdata.wustl.edu/ExternalShare/GCHP-v14.2.2/rundir-TT-MF-c24-tot/

Let me know if you need any other files.


lizziel commented Nov 7, 2023

Excellent! I will download the data and make comparison plots.


lizziel commented Nov 28, 2023

I generated comparison plots for (1) dry pressure versus total pressure, both using mass fluxes, and (2) winds versus mass fluxes, both using dry pressure. See https://ftp.as.harvard.edu/gcgrid/geos-chem/validation/gchp_mass_fluxes_vs_winds/.

@yuanjianz

@lizziel, sure. I am using the 201901 restart file in the default restart folder.


yuanjianz commented Apr 1, 2024

@lizziel, I am experiencing memory leaks and unusually low performance for my transport tracer simulation. I compared my log files with Dandan's from last time. For cubed-sphere mass flux, her throughput is 3 times mine, and my memory statistics show a sharp increase. Please check my attached log files (14.3 is mine and 14.2 is Dandan's). Currently I am running another transport tracer under the main branch to see if it is something introduced recently.

This happens with raw winds as well. My raw wind run dies with a signal 9 (OOM) after 7 months of simulation. The log files show memory usage increasing from 30% to 90% during the simulation.

14.3-gchp.20220101_0000z.log.txt
14.2-gchp.log.txt

P.S. I am running on 3 nodes, and each node is given 300GB of memory.


lizziel commented Apr 1, 2024

Hi @yuanjianz, I will take a look and try a shorter test run on my end too.


lizziel commented Apr 1, 2024

I don't see this in my processed-winds 1-year run, so that rules out the new diagnostics.


lizziel commented Apr 1, 2024

I have a 1-month raw fields run in the queue. In the meantime, could you post your results for the first 6 months of the year? I can do a comparison separate from performance issues.


yuanjianz commented Apr 1, 2024

@lizziel, here are my results for the first 8 months, with log files for the first 6 months in the log1 folder: http://geoschemdata.wustl.edu/ExternalShare/tt-geosfp-raw-wind/

@yuanjianz

Hi @lizziel, the memory leak is probably related to my compiler and environment setup. I tested processed winds in the two compiler environments, and Dandan's environment does not have the memory leak issue on Compute1. Although there is an OOM issue with my previous GNU environment (which possibly originates from OFED), I guess it would not cause much difference in the results, so I am restarting with Dandan's Intel environment. Please correct me if I am wrong.


lizziel commented Apr 2, 2024

Hi @yuanjianz, I have not seen such a severe memory leak caused by environment/system before, but it makes sense that this is the culprit since I don't think the code changes would cause it. Could you give full details of the libraries/system specs that result in the problem, along with the libraries/system specs that do not? Regardless, as you say, the results should not differ. This can be tested by comparing the first month's data produced using the two different environments.


lizziel commented Apr 2, 2024

To clarify, we would expect numerical noise differences between compilers, e.g. Intel versus GNU. But there should be no systematic bias, and diffs should be very small.
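The kind of compiler-to-compiler noise meant here is ordinary floating-point roundoff: the same reduction evaluated in a different order (as different compilers and optimization levels may arrange it) gives tiny, non-systematic differences. A quick illustration:

```python
# Same sum, two accumulation orders (as different compilers/optimizers may
# produce): any difference is at roundoff scale, not a systematic bias.
values = [0.1 * (i % 7 + 1) for i in range(1000)]

forward = sum(values)             # left-to-right accumulation
backward = sum(reversed(values))  # right-to-left accumulation

rel_diff = abs(forward - backward) / forward
print(rel_diff)  # on the order of machine epsilon (~1e-16), possibly zero
```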


lizziel commented Apr 2, 2024

I should also note that there is a known small memory leak in GCHP that seems to come from MAPL. I created an issue on this a couple of years ago at GEOS-ESM/MAPL#1793. It is small enough that it has not been addressed yet.


lizziel commented Apr 2, 2024

@yuanjianz, could you put your raw wind output into subdirectories OutputDir and Restarts? I need the restart files as well as the diagnostics. Thanks!


yuanjianz commented Apr 2, 2024

@lizziel, sure. They are there now. The run is now up to November, so I assume the raw wind run will finish today and the raw mass flux run probably tomorrow or Thursday.

As for the environment, it is a little hard to list it in full because we are using Docker + spack and did not generate module files for all dependencies.

The memory-leaking GNU environment (Ubuntu 20.04):

spack find --loaded
-- linux-ubuntu20.04-skylake_avx512 / [email protected] -----------------
[email protected]

-- linux-ubuntu20.04-skylake_avx512 / [email protected] ----------------
[email protected]  [email protected]  [email protected]  [email protected]  [email protected]  [email protected]  [email protected]
==> 8 loaded packages
---
spack find
-- linux-ubuntu20.04-skylake_avx512 / [email protected] -----------------
[email protected]                [email protected]    [email protected]       [email protected]    [email protected]    [email protected]    [email protected]
[email protected]  [email protected]  [email protected]  [email protected]  [email protected]    [email protected]  [email protected]
[email protected]              [email protected]     [email protected]     [email protected]    [email protected]   [email protected]   [email protected]
[email protected]          [email protected]     [email protected]       [email protected]   [email protected]  [email protected]       [email protected]

-- linux-ubuntu20.04-skylake_avx512 / [email protected] ----------------
[email protected]                       [email protected]        [email protected]           [email protected]         [email protected]
[email protected]                     [email protected]         [email protected]     [email protected]         [email protected]
[email protected]                 [email protected]    [email protected]         [email protected]      [email protected]
[email protected]                         [email protected]       [email protected]      [email protected]           [email protected]
[email protected]                         [email protected]       [email protected]        [email protected]         [email protected]
[email protected]                      [email protected]       [email protected]             [email protected]            [email protected]
ca-certificates-mozilla@2023-05-30  [email protected]      [email protected]             [email protected]    [email protected]
[email protected]                        [email protected]     [email protected]           [email protected]         [email protected]
[email protected]                          [email protected]   [email protected]        [email protected]  [email protected]
[email protected]                       [email protected]  [email protected]  [email protected]
[email protected]                          [email protected]      [email protected]        [email protected]
[email protected]                         [email protected]     [email protected]        [email protected]
[email protected]                     [email protected]       [email protected]         [email protected]
==> 89 installed packages

The working Intel environment (CentOS 7):

Note that the GNU environment used `spack find external` to document some dependencies while the Intel environment did not. Also, the Intel environment is installed with Mellanox OFED for MPI while GNU uses libfabric. If you are interested in the detailed setup, the GNU environment is the official Docker image maintained by @yidant with slight modifications (+hl and +fortran for hdf5 and netcdf-c). @1Dandan's Intel Docker image is built from a Compute1-supported base.

If I have time in the future, I will try to install OFED in the GNU environment to see if it fixes the problem.
GNU.txt


1Dandan commented Apr 2, 2024

To add, for the Intel environment the ESMF version is v8.3.1.


lizziel commented Apr 2, 2024

Not sure if this is related, but there was a bug report from former GCST member Will Downs about a memory registration bug in GCHP when using libfabric: #47.

@yuanjianz

Hi @lizziel, the raw-wind 1-year GEOS-FP transport tracer run is ready:
http://geoschemdata.wustl.edu/ExternalShare/tt-geosfp-c24-raw-wind/

I noticed that after changing to the Intel environment, although the memory leak disappears and the run is fast enough at first, the simulation slows down to half the speed of Dandan's previous runs. From the timing diagnostics, Bracket in ExtData takes most of the time. I am not sure why this is happening.


lizziel commented Apr 3, 2024

Hi @yuanjianz, I am looking at the results and the diagnostics look off, for both of our runs. The passive tracer restarts compare well, with differences of 1e-6, but I think the diagnostic is getting corrupted. This may explain the slow-down.


lizziel commented Apr 3, 2024

Strangely, I cannot reproduce the issue. I am doing another run with the new diagnostics turned off.


lizziel commented Apr 3, 2024

I ran a 1-month simulation using 14.2.2 and 14.3.1 with GEOS-FP processed files and get identical results except for st80_25 (as expected). I do not see a slow-down. I am trying to determine whether a constant value in every grid box of the monthly mean passive tracer concentration makes sense. We do see this in version 14.2.2. I am skeptical given the values in the internal state.


lizziel commented Apr 3, 2024

Separate from this issue of constant values for the passive tracer, I do see that the raw-versus-processed bug is fixed.

@yuanjianz

Hi @lizziel, thanks for the update. You said you found corrupted diagnostics in your run as well. Do you think it was the new diagnostics that caused the performance degradation on my end? It also seems to happen only with raw files, because my GEOS-IT preprocessed-wind fullchem benchmark using 14.3.1 with the new GCHPctmLevel* diagnostics did not show performance issues.


lizziel commented Apr 4, 2024

Hi @yuanjianz, we expect the run with raw files to perform worse than the run with preprocessed files because there are so many files to read, and at high frequency. Do you see the same performance issue using 14.3.0 instead of 14.3.1?

@yuanjianz

Hi @lizziel, thanks for the explanation. I haven't done the 14.3.0 run. I am just curious about the diagnostic corruption you mentioned above. What does it mean? Do you think I should turn off the new diagnostics in 14.3.1 and then rerun a performance test between the two versions?


lizziel commented Apr 4, 2024

See #399 for discussion of the suspected Passive Tracer diagnostic issue. I am not going to worry about it much for now since it does not impact mass conservation tests (those use restart files) and is not recently introduced.


lizziel commented Apr 4, 2024

Here is the global mass table for passive tracer for @yuanjianz's 2022 GEOS-FP run with raw GMAO fields and using winds in advection:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  Global Mass of Passive Tracer in 14.3.1_GEOS-FP_raw_wind
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

 Date        Mass [Tg]
 ----------  ----------------
 2022-01-01   17.6562799006358
 2022-02-01   17.6527063054427
 2022-03-01   17.6527047860219
 2022-04-01   17.6527058698120
 2022-05-01   17.6527059098902
 2022-06-01   17.6527058070735
 2022-07-01   17.6527057721245
 2022-08-01   17.6527057131759
 2022-09-01   17.6527056042564
 2022-10-01   17.6527059860906
 2022-11-01   17.6527059325285
 2022-12-01   17.6527059325285

 Summary
 ------------------------------
 Max mass =   17.6562799006358 Tg
 Min mass =   17.6527047860219 Tg
 Abs diff =    3575114613.909 g
 Pct diff =      0.0202525033 %

NOTE: The last month was not available so I copied Nov.

For comparison, here are results for the same run using processed winds. Note that both of these runs use dry pressure in advection.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  Global Mass of Passive Tracer in 14.3.1_GEOS-FP_processed_wind
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

 Date        Mass [Tg]
 ----------  ----------------
 2022-01-01   17.6562799006358
 2022-02-01   17.6527063301804
 2022-03-01   17.6527047587778
 2022-04-01   17.6527058170920
 2022-05-01   17.6527058567409
 2022-06-01   17.6527058000323
 2022-07-01   17.6527057656179
 2022-08-01   17.6527056823411
 2022-09-01   17.6527056193937
 2022-10-01   17.6527059680841
 2022-11-01   17.6527059053356
 2022-12-01   17.6527059056348

 Summary
 ------------------------------
 Max mass =   17.6562799006358 Tg
 Min mass =   17.6527047587778 Tg
 Abs diff =    3575141858.008 g
 Pct diff =      0.0202526576 %

@yuanjianz

Hi @lizziel, the GEOS-FP raw mass flux run is ready now.
Please check the link here: http://geoschemdata.wustl.edu/ExternalShare/tt-geosfp-raw-csmf/


lizziel commented Apr 8, 2024

Thanks @yuanjianz. Here is the mass conservation table for your mass flux run:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  Global Mass of Passive Tracer in 14.3.1_GEOS-FP_raw_mass_fluxes
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

 Date        Mass [Tg]
 ----------  ----------------
 2022-01-01   17.6562799006358
 2022-02-01   17.6527118519604
 2022-03-01   17.6527119102029
 2022-04-01   17.6527118174852
 2022-05-01   17.6527117827595
 2022-06-01   17.6527118234161
 2022-07-01   17.6527118121272
 2022-08-01   17.6527117025911
 2022-09-01   17.6527117019211
 2022-10-01   17.6527120035089
 2022-11-01   17.6527118178688
 2022-12-01   17.6527118540502

 Summary
 ------------------------------
 Max mass =   17.6562799006358 Tg
 Min mass =   17.6527117019211 Tg
 Abs diff =    3568198714.657 g
 Pct diff =      0.0202133178 %

Looks like the mass conservation issue with mass fluxes is fixed by the raw GMAO fields bug fix.
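For reference, the summary lines in these tables can be reproduced from the monthly masses (a sketch; the actual benchmark script may differ). The percent difference matches when taken relative to the minimum mass:

```python
# Monthly global passive tracer masses (Tg) from the mass-flux table above
masses_tg = [
    17.6562799006358, 17.6527118519604, 17.6527119102029, 17.6527118174852,
    17.6527117827595, 17.6527118234161, 17.6527118121272, 17.6527117025911,
    17.6527117019211, 17.6527120035089, 17.6527118178688, 17.6527118540502,
]

max_mass = max(masses_tg)
min_mass = min(masses_tg)
abs_diff_g = (max_mass - min_mass) * 1.0e12          # Tg -> g
pct_diff = (max_mass - min_mass) / min_mass * 100.0  # relative to the minimum

print(f"Max mass = {max_mass:.13f} Tg")
print(f"Min mass = {min_mass:.13f} Tg")
print(f"Abs diff = {abs_diff_g:.3f} g")
print(f"Pct diff = {pct_diff:.10f} %")
```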

@yuanjianz

Hi @lizziel @sdeastham, my recent mass flux fullchem benchmark shows unreasonably high surface aerosol concentrations compared to the wind run.

Looking back at Lizzie's previous GEOS-IT C180 mass flux vs. wind transport tracer comparison, the cause seems to be much weaker advection in the mass flux runs. Taking SF6 and Rn222 as examples (plots from Lizzie's comparison above):
[figure: annual mean massflux − wind (difference) and massflux/wind (ratio) plots for SF6 and Rn222]

My instinct is that a shift from winds to mass fluxes alone should not have such a large effect. And as I recall, the Martin et al. (2022, GMD) GCHP v13 paper indicates that mass fluxes should give less damped transport than winds. I'd welcome your opinions on this, thanks!

@sdeastham

Thanks @yuanjianz! In your last post, are you saying that you think the shift from winds to mass fluxes should have a smaller effect than this? That would be my expectation too, but I want to be sure we're on the same page. It does look to me like there has been a substantial reduction in vertical mixing, but the interesting thing is that this is exactly what we would expect. I'm curious: how do the horizontal mass fluxes compare between the wind and mass-flux simulations?
