Issue in running the tracking algorithm for CM1 output #75

Open
ealucy opened this issue Dec 5, 2023 · 5 comments

ealucy commented Dec 5, 2023

My goal is to track vorticity features in tropical cyclone model output from CM1. When I run the code, the algorithm does not seem to find the files in the directory where I have stored them, and it fails with an error:

(aug23_env) el381212@turing:/nfs/tcdynasty/lucy$ python run_generic_tracking.py config.yml
2023-12-05 20:01:29,557 - pyflextrkr.idfeature_driver - INFO - Identifying features from raw data
2023-12-05 20:01:30,181 - pyflextrkr.idfeature_driver - INFO - Total number of files to process: 0
2023-12-05 20:01:30,184 - pyflextrkr.idfeature_driver - INFO - Done with features from raw data.
2023-12-05 20:01:30,184 - pyflextrkr.tracksingle_driver - INFO - Tracking sequential pairs of idfeature files
2023-12-05 20:01:30,185 - pyflextrkr.tracksingle_driver - INFO - Total number of files to process: 0
2023-12-05 20:01:30,185 - pyflextrkr.tracksingle_driver - INFO - Done with tracking sequential pairs of idfeature files
2023-12-05 20:01:30,185 - pyflextrkr.gettracks - INFO - Tracking features sequentially from single track files
2023-12-05 20:01:30,186 - pyflextrkr.gettracks - INFO - Total number of files to process: 0
Traceback (most recent call last):
  File "/nfs/tcdynasty/lucy/run_generic_tracking.py", line 61, in <module>
    tracknumbers_filename = gettracknumbers(config)
                            ^^^^^^^^^^^^^^^^^^^^^^^
  File "/nfs/knight/mamba_aug23/envs/aug23_env/lib/python3.11/site-packages/pyflextrkr/gettracks.py", line 74, in gettracknumbers
    logger.debug(f"files[0]: {files[0]}")
                              ~~~~~^^^
IndexError: list index out of range

I'll also attach my config file:


ERA5 vorticity anomaly tracking configuration file

Identify features to track

run_idfeature: True

Track single consecutive feature files

run_tracksingle: True

Run tracking for all files

run_gettracks: True

Calculate feature statistics

run_trackstats: True

Link merge/split tracks

run_mergesplit: True

Map tracking to pixel files

run_mapfeature: True

Start/end date and time

startdate: '20000101_000004'
enddate: '20000101_000008'

Parallel processing set up

# run_parallel: 1 (local cluster), 2 (Dask MPI)

run_parallel: 1
nprocesses: 32 # Number of processors to use if run_parallel=1

databasename: 'cm1out_'
#databasename: ERA5_SFvortPV_

Specify date/time string format in the file name

E.g., radar_20181101.011503.nc --> yyyymodd.hhmmss

E.g., wrfout_2018-11-01_01:15:00 --> yyyy-mo-dd_hh:mm:ss

time_format: 'yyyymodd_hhmmss'

Input files directory

clouddata_path: '/nfs/tcdynasty/lucy/cm1/'

Working directory for the tracking data

root_path: '/nfs/tcdynasty/lucy/cm1_tracking/'

# root_path: '/pscratch/sd/j/jmarquis/ERA5_waccem/Bandpassed/'

Working sub-directory names

tracking_path_name: 'vtracking'
stats_path_name: 'vortstats'
pixel_path_name: 'vortracking'

Specify types of feature being tracked

This adds additional feature-specific statistics to be computed

feature_type: 'generic'

Specify data structure

datatimeresolution: 1/3600 # hours
pixel_radius: .015625 # km
x_dimname: 'ni'
y_dimname: 'nj'
time_dimname: 'time'
time_coordname: 'time'
x_coordname: 'x'
y_coordname: 'y'
field_varname: 'rel_vort'

Feature detection parameters

label_method: 'skimage.watershed'

peak_local_max params:

plm_min_distance: 15 # min_distance - distance buffer between maxima; num grid points
plm_exclude_border: 5 # exclude_border - distance buffer between maxima and the domain sides; num grid points
plm_threshold_abs: 0 # threshold_abs - minimum magnitude of PSI' required to define a maxima

watershed params:

cont_thresh: 0.00002 # PSI' contour defining outermost of flood-filled object area
compa: 0 #"compactness factor" - (how much you'll let a flood fill spread into a neighbor's domain. Zero or < 100 seemed ok.)

field_thresh: [1.6, 1000] # variable thresholds

min_size: .1 # Min area to define a feature (km^2)
R_earth: 6378.0 # Earth radius (km)

Tracking parameters

timegap: 1/3600 # hour
othresh: 0.3 # overlap percentage threshold
maxnclouds: 100 # Maximum number of features in one snapshot
nmaxlinks: 10 # Maximum number of overlaps that any single feature can be linked to
duration_range: [6, 800] # A vector [minlength,maxlength] to specify the duration range for the tracks

Flag to remove short-lived tracks [< min(duration_range)] that are not mergers/splits with other tracks

0:keep all tracks; 1:remove short tracks

remove_shorttracks: 1

Set this flag to 1 to write a dense (2D) trackstats netCDF file

Note that for datasets with lots of tracks, the memory consumption could be very large

trackstats_dense_netcdf: 1

Minimum time difference threshold to match track stats with cloudid pixel files

match_pixel_dt_thresh: 60.0 # seconds

Link merge/split parameters to main tracks

maintrack_area_thresh: .1 # [km^2] Main track area threshold
maintrack_lifetime_thresh: 60/3600 # [hour] Main track duration threshold
split_duration: 30/3600 # [hour] Split tracks <= this length is linked to the main tracks
merge_duration: 30/3600 # [hour] Merge tracks <= this length is linked to the main tracks

Define tracked feature variable names

feature_varname: 'feature_number'
nfeature_varname: 'nfeatures'
featuresize_varname: 'npix_feature'

Track statistics output file dimension names

tracks_dimname: 'tracks'
times_dimname: 'times'
fillval: -9999

Output file base names

finalstats_filebase: 'trackstats_final_'
pixeltracking_filebase: 'vort_tracks_'

List of variable names to pass from input to tracking output data

pass_varname:
  - 'rel_vort'

All the files are housed in the /nfs/tcdynasty/lucy/cm1/ directory, but it seems to me that they're not being found by the code. Any assistance is much appreciated!

feng045 (Collaborator) commented Dec 7, 2023

Based on your config, the code would be searching for input files like this:
/nfs/tcdynasty/lucy/cm1/cm1out_yyyymodd_hhmmss.nc

And the files' date/time must fall within this range:
startdate: '20000101_000004'
enddate: '20000101_000008'

You should check to make sure that matches your input files.
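One quick way to run that check is to list what such a pattern would match in the input directory. The snippet below is only a minimal sketch and is not PyFLEXTRKR code: `list_candidate_files` is a hypothetical helper, and the `.nc` suffix is assumed from the pattern shown above.

```python
from pathlib import Path


def list_candidate_files(data_dir, basename, suffix=".nc"):
    """Return sorted file names in data_dir that start with basename and end with suffix.

    Hypothetical helper for a manual check; PyFLEXTRKR's own file search may differ.
    """
    directory = Path(data_dir)
    if not directory.is_dir():
        # A missing or mistyped directory yields an empty list rather than an error.
        return []
    return sorted(p.name for p in directory.glob(f"{basename}*{suffix}"))


if __name__ == "__main__":
    # Directory and base name copied from the config in this issue.
    names = list_candidate_files("/nfs/tcdynasty/lucy/cm1/", "cm1out_")
    print(f"{len(names)} candidate file(s)")
    for name in names:
        print(name)
```

If this prints zero files, the problem is the path or base name; if it prints the expected files, the date/time filtering is the more likely cause.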

ealucy (Author) commented Dec 8, 2023

Yes, they match. The files are titled like this: 'cm1out_20000101_000004.nc'. Curious!

ealucy closed this as not planned Dec 8, 2023
ealucy reopened this Dec 8, 2023
feng045 (Collaborator) commented Dec 8, 2023

I just realized that your startdate and enddate only differ by 4 seconds. The code calculates the date/times from your filenames (hence the specified datetime format 'yyyymodd_hhmmss' in the config), and then only keeps those that fall within your specified startdate and enddate for processing.

What do your file names look like? Can you post the full list of file names here?

ealucy (Author) commented Dec 8, 2023

Yes, that is correct. This is the file list:
cm1out_20000101_000004.nc
cm1out_20000101_000005.nc
cm1out_20000101_000006.nc
cm1out_20000101_000007.nc
cm1out_20000101_000008.nc
There are only these five files, as the entire dataset is not housed locally. I was hoping to test the tracker on these few to get an idea of how it works before attempting to do so on the entire dataset.

feng045 (Collaborator) commented Dec 8, 2023

I think I may know why. The function in PyFLEXTRKR that converts input file names to datetimes does not use the digits down to seconds precision. See the code at this line.

You can try a larger datetime window that includes all the files you have, e.g.,
startdate: '20000101_000000'
enddate: '20000101_001000'
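
To illustrate why a wider window would help, here is a rough sketch of the filtering step under the assumption described above, namely that the seconds digits are dropped when file names are converted to datetimes. It is not PyFLEXTRKR's actual code: `parse_file_time` and `select_files` are hypothetical names, and truncating to the minute with inclusive window bounds is an assumption made only for illustration.

```python
from datetime import datetime

FMT = "%Y%m%d_%H%M%S"  # strptime equivalent of time_format 'yyyymodd_hhmmss'


def parse_file_time(filename, basename="cm1out_", keep_seconds=True):
    """Parse the 15-character timestamp that follows basename in filename."""
    stamp = filename[len(basename):len(basename) + 15]
    t = datetime.strptime(stamp, FMT)
    # Assumed behaviour: zeroing the seconds mimics a minute-precision conversion.
    return t if keep_seconds else t.replace(second=0)


def select_files(filenames, startdate, enddate, keep_seconds=True):
    """Keep the files whose parsed time lies within [startdate, enddate]."""
    start = datetime.strptime(startdate, FMT)
    end = datetime.strptime(enddate, FMT)
    return [
        f for f in filenames
        if start <= parse_file_time(f, keep_seconds=keep_seconds) <= end
    ]


# The five file names listed earlier in this thread.
files = [f"cm1out_20000101_00000{s}.nc" for s in range(4, 9)]

# Original window, seconds dropped: every file maps to 00:00:00, before startdate.
print(len(select_files(files, "20000101_000004", "20000101_000008", keep_seconds=False)))  # prints 0
# Wider window, seconds dropped: all five files fall inside.
print(len(select_files(files, "20000101_000000", "20000101_001000", keep_seconds=False)))  # prints 5
```

Under that assumption, the original 4-second window selects nothing, which would match the "Total number of files to process: 0" lines in the log.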
