Issue in running the tracking algorithm for CM1 output #75
Comments
Based on your config, the code searches for input files whose names follow your databasename and time_format settings, and the file date/times must fall within your startdate and enddate. You should check that this matches your input files.
Yes, they match. The files are titled like this: 'cm1out_20000101_000004.nc'. Curious!
I just realized that your startdate and enddate differ by only 4 seconds. The code calculates the date/times from your filenames (hence the datetime format 'yyyymodd_hhmmss' specified in the config) and keeps only those that fall within your startdate and enddate for processing. What do your file names look like? Can you post the full list of file names here?
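The filename-to-datetime step described above can be pictured with a small sketch. This is not PyFLEXTRKR's own code: the helper names `to_strptime_format` and `parse_file_time` are hypothetical, and the token mapping is an assumption based on the format strings shown in the config comments.

```python
import re
from datetime import datetime

# Assumed mapping from config-style tokens to strptime directives.
TOKEN_MAP = {"yyyy": "%Y", "mo": "%m", "dd": "%d", "hh": "%H", "mm": "%M", "ss": "%S"}
# Single-pass regex so already-substituted text is never rescanned.
TOKEN_RE = re.compile("|".join(TOKEN_MAP))


def to_strptime_format(time_format):
    """Translate e.g. 'yyyymodd_hhmmss' into '%Y%m%d_%H%M%S'."""
    return TOKEN_RE.sub(lambda m: TOKEN_MAP[m.group()], time_format)


def parse_file_time(filename, databasename, time_format):
    """Extract the timestamp that follows the databasename prefix."""
    begin = len(databasename)
    stamp = filename[begin:begin + len(time_format)]
    return datetime.strptime(stamp, to_strptime_format(time_format))


file_time = parse_file_time("cm1out_20000101_000004.nc", "cm1out_", "yyyymodd_hhmmss")
start = datetime.strptime("20000101_000004", "%Y%m%d_%H%M%S")
end = datetime.strptime("20000101_000008", "%Y%m%d_%H%M%S")

# A file is kept only if its parsed time lies inside [start, end].
print(file_time, start <= file_time <= end)
```

With full seconds precision the example file falls inside the configured window, so a mismatch has to come from how the timestamp is parsed or compared.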
Yes, that is correct. This is the file list:
I think I may know why. The function in PyFLEXTRKR that converts input file datetimes does not use the digits down to seconds precision. See the code at this line. You can try a larger datetime window that includes all the files you have.
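To see why dropping the seconds matters here, the following sketch compares full-precision and minute-precision file times against the 4-second window from the config. It is an illustration only, not the PyFLEXTRKR implementation: the `keep_seconds` switch and the file names ending in `000005` to `000008` are hypothetical, the wider window is just one possible choice, and the sketch assumes the start and end dates keep their seconds while the file times are truncated to the minute.

```python
from datetime import datetime

FMT = "%Y%m%d_%H%M%S"
PREFIX = "cm1out_"


def file_time(filename, keep_seconds):
    """Parse the timestamp after the prefix; optionally drop the seconds."""
    stamp = filename[len(PREFIX):len(PREFIX) + 15]
    t = datetime.strptime(stamp, FMT)
    return t if keep_seconds else t.replace(second=0)


def files_in_window(filenames, startdate, enddate, keep_seconds):
    """Keep files whose (possibly truncated) time lies in [startdate, enddate]."""
    t0 = datetime.strptime(startdate, FMT)
    t1 = datetime.strptime(enddate, FMT)
    return [f for f in filenames if t0 <= file_time(f, keep_seconds) <= t1]


# Hypothetical file list, one output file per second.
files = [f"cm1out_20000101_00000{s}.nc" for s in range(4, 9)]

# Narrow window from the config, with and without seconds precision.
narrow_full = files_in_window(files, "20000101_000004", "20000101_000008", True)
narrow_trunc = files_in_window(files, "20000101_000004", "20000101_000008", False)
# A wider window that still contains the truncated times.
wide_trunc = files_in_window(files, "20000101_000000", "20000101_000100", False)

print(len(narrow_full), len(narrow_trunc), len(wide_trunc))
```

Under these assumptions every truncated file time becomes 00:00:00, which lies before the 00:00:04 start, so the narrow window selects nothing while the wider one selects every file.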
My goal is to track vorticity features in tropical cyclone model output from CM1. When I run the code, I get an error, and the algorithm does not seem to see the files in the directory where I have stored them:
```
(aug23_env) el381212@turing:/nfs/tcdynasty/lucy$ python run_generic_tracking.py config.yml
2023-12-05 20:01:29,557 - pyflextrkr.idfeature_driver - INFO - Identifying features from raw data
2023-12-05 20:01:30,181 - pyflextrkr.idfeature_driver - INFO - Total number of files to process: 0
2023-12-05 20:01:30,184 - pyflextrkr.idfeature_driver - INFO - Done with features from raw data.
2023-12-05 20:01:30,184 - pyflextrkr.tracksingle_driver - INFO - Tracking sequential pairs of idfeature files
2023-12-05 20:01:30,185 - pyflextrkr.tracksingle_driver - INFO - Total number of files to process: 0
2023-12-05 20:01:30,185 - pyflextrkr.tracksingle_driver - INFO - Done with tracking sequential pairs of idfeature files
2023-12-05 20:01:30,185 - pyflextrkr.gettracks - INFO - Tracking features sequentially from single track files
2023-12-05 20:01:30,186 - pyflextrkr.gettracks - INFO - Total number of files to process: 0
Traceback (most recent call last):
  File "/nfs/tcdynasty/lucy/run_generic_tracking.py", line 61, in <module>
    tracknumbers_filename = gettracknumbers(config)
                            ^^^^^^^^^^^^^^^^^^^^^^^
  File "/nfs/knight/mamba_aug23/envs/aug23_env/lib/python3.11/site-packages/pyflextrkr/gettracks.py", line 74, in gettracknumbers
    logger.debug(f"files[0]: {files[0]}")
                              ~~~~~^^^
IndexError: list index out of range
```
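The `IndexError` at the end of the log follows from the earlier `Total number of files to process: 0` messages: taking the first element of an empty list fails. A minimal sketch of a guard that would surface the underlying problem more directly is shown here. The function name `require_files` and its message are hypothetical and not part of PyFLEXTRKR.

```python
def require_files(files, stage):
    """Return files unchanged, or raise a descriptive error if the list is empty."""
    if not files:
        raise FileNotFoundError(
            f"{stage}: no files to process; check the input path, "
            "file name prefix, and start/end dates in the config"
        )
    return files


try:
    # An empty list reproduces the situation in the log above.
    require_files([], "gettracks")
except FileNotFoundError as err:
    message = str(err)
    print(message)
```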
I'll also attach my config file:
```yaml
# ERA5 vorticity anomaly tracking configuration file

# Identify features to track
run_idfeature: True
# Track single consecutive feature files
run_tracksingle: True
# Run tracking for all files
run_gettracks: True
# Calculate feature statistics
run_trackstats: True
# Link merge/split tracks
run_mergesplit: True
# Map tracking to pixel files
run_mapfeature: True

# Start/end date and time
startdate: '20000101_000004'
enddate: '20000101_000008'

# Parallel processing set up
# run_parallel: 1 (local cluster), 2 (Dask MPI)
run_parallel: 1
nprocesses: 32 # Number of processors to use if run_parallel=1

databasename: 'cm1out_'
#databasename: ERA5_SFvortPV_
# Specify date/time string format in the file name
# E.g., radar_20181101.011503.nc --> yyyymodd.hhmmss
# E.g., wrfout_2018-11-01_01:15:00 --> yyyy-mo-dd_hh:mm:ss
time_format: 'yyyymodd_hhmmss'

# Input files directory
clouddata_path: '/nfs/tcdynasty/lucy/cm1/'
# Working directory for the tracking data
root_path: '/nfs/tcdynasty/lucy/cm1_tracking/'
# root_path: '/pscratch/sd/j/jmarquis/ERA5_waccem/Bandpassed/'
# Working sub-directory names
tracking_path_name: 'vtracking'
stats_path_name: 'vortstats'
pixel_path_name: 'vortracking'

# Specify types of feature being tracked
# This adds additional feature-specific statistics to be computed
feature_type: 'generic'

# Specify data structure
datatimeresolution: 1/3600 # hours
pixel_radius: .015625 # km
x_dimname: 'ni'
y_dimname: 'nj'
time_dimname: 'time'
time_coordname: 'time'
x_coordname: 'x'
y_coordname: 'y'
field_varname: 'rel_vort'

# Feature detection parameters
label_method: 'skimage.watershed'
# peak_local_max params:
plm_min_distance: 15 # min_distance - distance buffer between maxima; num grid points
plm_exclude_border: 5 # exclude_border - distance buffer between maxima and the domain sides; num grid points
plm_threshold_abs: 0 # threshold_abs - minimum magnitude of PSI' required to define a maxima
# watershed params:
cont_thresh: 0.00002 # PSI' contour defining outermost of flood-filled object area
compa: 0 #"compactness factor" - (how much you'll let a flood fill spread into a neighbor's domain. Zero or < 100 seemed ok.)
field_thresh: [1.6, 1000] # variable thresholds
min_size: .1 # Min area to define a feature (km^2)
R_earth: 6378.0 # Earth radius (km)

# Tracking parameters
timegap: 1/3600 # hour
othresh: 0.3 # overlap percentage threshold
maxnclouds: 100 # Maximum number of features in one snapshot
nmaxlinks: 10 # Maximum number of overlaps that any single feature can be linked to
duration_range: [6, 800] # A vector [minlength,maxlength] to specify the duration range for the tracks
# Flag to remove short-lived tracks [< min(duration_range)] that are not mergers/splits with other tracks
# 0:keep all tracks; 1:remove short tracks
remove_shorttracks: 1
# Set this flag to 1 to write a dense (2D) trackstats netCDF file
# Note that for datasets with lots of tracks, the memory consumption could be very large
trackstats_dense_netcdf: 1
# Minimum time difference threshold to match track stats with cloudid pixel files
match_pixel_dt_thresh: 60.0 # seconds

# Link merge/split parameters to main tracks
maintrack_area_thresh: .1 # [km^2] Main track area threshold
maintrack_lifetime_thresh: 60/3600 # [hour] Main track duration threshold
split_duration: 30/3600 # [hour] Split tracks <= this length is linked to the main tracks
merge_duration: 30/3600 # [hour] Merge tracks <= this length is linked to the main tracks

# Define tracked feature variable names
feature_varname: 'feature_number'
nfeature_varname: 'nfeatures'
featuresize_varname: 'npix_feature'

# Track statistics output file dimension names
tracks_dimname: 'tracks'
times_dimname: 'times'
fillval: -9999

# Output file base names
finalstats_filebase: 'trackstats_final_'
pixeltracking_filebase: 'vort_tracks_'

# List of variable names to pass from input to tracking output data
pass_varname:
```
All the files are housed in the /nfs/tcdynasty/lucy/cm1/ directory, but it seems to me that they're not being found by the code. Any assistance is much appreciated!
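One quick way to rule out a path or prefix problem is to list what a simple wildcard search finds before running the tracker. The sketch below assumes the input files are matched by the `databasename` prefix inside `clouddata_path`, which may differ from how PyFLEXTRKR actually builds its search. The helper `find_input_files` is hypothetical, and a temporary directory stands in for the real data directory so the example runs anywhere.

```python
import glob
import os
import tempfile


def find_input_files(clouddata_path, databasename):
    """Return sorted paths in clouddata_path whose names start with databasename."""
    pattern = os.path.join(clouddata_path, databasename + "*")
    return sorted(glob.glob(pattern))


# Demonstration on a throwaway directory instead of the real data path.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("cm1out_20000101_000004.nc", "other_20000101_000004.nc"):
        open(os.path.join(tmp, name), "w").close()
    found = [os.path.basename(p) for p in find_input_files(tmp, "cm1out_")]

print(found)
```

If the same call on the real directory returns the expected files, the path and prefix are fine and the date/time filtering is the more likely cause.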