From 16e31edbb1756139f183cb6353cfa363199c8932 Mon Sep 17 00:00:00 2001
From: Kay Robbins <1189050+VisLab@users.noreply.github.com>
Date: Tue, 16 Jul 2024 10:38:01 -0500
Subject: [PATCH] Updated the docs for HED and NWB
---
docs/source/HedAnnotationInNWB.md | 179 +-
docs/source/HedMatlabTools.md | 4 +-
docs/source/HedOnlineTools.md | 4 +-
docs/source/HedPythonTools.md | 261 +-
...ickstart.md => HedRemodelingQuickstart.md} | 990 +--
...modelingTools.md => HedRemodelingTools.md} | 5592 ++++++++---------
docs/source/HedSearchGuide.md | 4 +-
docs/source/HedSummaryGuide.md | 6 +-
docs/source/HedValidationGuide.md | 6 +-
docs/source/HowCanYouUseHed.md | 30 +-
docs/source/UnderstandingHedVersions.md | 1 +
docs/source/index.rst | 5 +-
src/jupyter_notebooks/remodeling/README.md | 2 +-
13 files changed, 3483 insertions(+), 3601 deletions(-)
rename docs/source/{FileRemodelingQuickstart.md => HedRemodelingQuickstart.md} (94%)
rename docs/source/{FileRemodelingTools.md => HedRemodelingTools.md} (97%)
create mode 100644 docs/source/UnderstandingHedVersions.md
diff --git a/docs/source/HedAnnotationInNWB.md b/docs/source/HedAnnotationInNWB.md
index da4d50d..a075fb5 100644
--- a/docs/source/HedAnnotationInNWB.md
+++ b/docs/source/HedAnnotationInNWB.md
@@ -4,39 +4,37 @@
NWB is used extensively as the data representation for single cell and animal recordings as well as
human neuroimaging modalities such as IEEG. HED (Hierarchical Event Descriptors) is a system of
standardized vocabularies and supporting tools that allows fine-grained annotation of data.
+HED annotations can now be used in NWB.
-Each NWB file (extension `.nwb`) is a self-contained (and hopefully complete) representation of
-experimental data for a single experiment.
-The file should contain all experimental stimuli, acquired data, and metadata synchronized to a global
-timeline for the experiment.
-See [**Intro to NWB**](https://nwb-overview.readthedocs.io/en/latest/intro_to_nwb/1_intro_to_nwb.html)
-for a basic introduction to NWB.
-
-## The ndx-hed NWB extension
-The [**ndx-hed**](https://github.com/hed-standard/ndx-hed) extension allows HED annotations to be added as a column to any NWB
-[**DynamicTable**](https://hdmf-common-schema.readthedocs.io/en/stable/format.html#sec-dynamictable).
-extends the NWB [**VectorData**](https://hdmf-common-schema.readthedocs.io/en/stable/format.html#sec-dynamictable)
-class, allowing HED data to be added as a column to any NWB
-[**DynamicTable**](https://hdmf-common-schema.readthedocs.io/en/stable/format.html#sec-dynamictable).
-The `DynamicTable` class is the underlying base class for many data structures within NWB files,
-and this extension allows HED annotations to be easily added to NWB.
-See [**DynamicTable Tutorial**](https://hdmf.readthedocs.io/en/stable/tutorials/plot_dynamictable_tutorial.html#sphx-glr-tutorials-plot-dynamictable-tutorial-py)
+A standardized HED vocabulary is referred to as a HED schema.
+A single term in a HED vocabulary is called a HED tag.
+A HED string consists of one or more HED tags separated by commas and possibly grouped using parentheses.
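+For example, `Sensory-event, (Visual-presentation, (Red, Square))` is a single HED string
+in which the parentheses group the tags describing the presented stimulus.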
+
+The [**ndx-hed**](https://github.com/hed-standard/ndx-hed) extension consists of a `HedTags` class that extends
+the NWB [**VectorData**](https://hdmf-common-schema.readthedocs.io/en/stable/format.html#sec-dynamictable) class,
+allowing HED data to be added as a column to any NWB [**DynamicTable**](https://hdmf-common-schema.readthedocs.io/en/stable/format.html#sec-dynamictable).
+`VectorData` and `DynamicTable` are base classes for many NWB data structures.
+See the [**DynamicTable Tutorial**](https://hdmf.readthedocs.io/en/stable/tutorials/plot_dynamictable_tutorial.html#sphx-glr-tutorials-plot-dynamictable-tutorial-py)
for a basic guide for usage in Python and
[**DynamicTable Tutorial (MATLAB)**](https://neurodatawithoutborders.github.io/matnwb/tutorials/html/dynamic_tables.html)
for introductory usage in MATLAB.
+The `ndx-hed` extension is not currently supported in MATLAB, although support is planned.
+
+## NWB ndx-hed installation
-The class that implements the ndx-hed extension is called `HedTags`.
-This class represents a column vector (`VectorData`) of HED tags and the version
-of the HedSchema needed to validate the tags.
-Valid
-which
+The `ndx-hed` extension has not yet been released on PyPI.
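+Assuming the GitHub repository provides a standard pip-installable package layout
+(an assumption based on typical NWB extensions, not verified here),
+the development version can be installed directly from GitHub using:
+```shell
+ pip install git+https://github.com/hed-standard/ndx-hed.git
+```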
-## Example HED usage in NWB
+## NWB ndx-hed examples
-### HED as a standalone vector
+### HedTags as a standalone vector
-The `HedTags` class has two required argument (the `hed_version` and the `data`) and two optional arguments
-(`name` and `description`).
+The `HedTags` class has two required arguments (`hed_version` and `data`) and two optional arguments
+(`name` and `description`).
+The result of the following example is a `HedTags` object whose data vector includes 2 elements.
+Notice that the `data` argument value is a list with 2 values representing two distinct HED strings.
+These values are validated using HED schema version 8.3.0 when `tags` is created.
+If any of the tags had been invalid, the constructor would have raised a `ValueError`.
+The example uses the default column name (`HED`) and the default column description.
````{admonition} Create a HedTags object.
:class: tip
@@ -45,21 +43,31 @@ The `HedTags` class has two required argument (the `hed_version` and the `data`)
tags = HedTags(hed_version='8.3.0', data=["Correct-action", "Incorrect-action"])
```
````
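+Like any `VectorData` column, individual elements can be retrieved by position;
+for example, `tags[0]` returns `"Correct-action"`.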
-The result is a `VectorData` object whose data vector includes 2 elements.
-Notice that data is a list with 2 values representing two distinct HED strings.
-The values of these elements are validated using HED schema version 8.3.0 when `tags` is created.
-If any of the tags had been invalid, the constructor would raise a `ValueError`.
-````{admonition} Add a row to an existing HED VectorData
+You must specify the version of the HED vocabulary to be used.
+We recommend that you use the latest version of HED (currently 8.3.0).
+A separate HED version is used for each instance of the `HedTags` column,
+so in theory you could use a different version for each column.
+This is not recommended, as annotations across columns and tables may be combined for analysis.
+See [**Understanding HED versions**](./UnderstandingHedVersions.md) for a more detailed explanation
+of HED versioning.
+
+### Adding a row to HedTags
+
+The following example assumes that a `HedTags` object `tags` has already been
+created as illustrated in the previous example.
+
+````{admonition} Add a row to an existing HedTags object
:class: tip
```python
tags.add_row("Sensory-event, Visual-presentation")
```
+````
+
After this `add_row` operation, `tags` has 3 elements. Notice that "Sensory-event, Visual-presentation"
is a single HED string, not two HED strings.
-
-````
+In contrast, ["Correct-action", "Incorrect-action"] is a list with two HED strings.
### HED in a table
@@ -82,9 +90,12 @@ color_table = DynamicTable(
```
````
The example sets up a table with columns named `color_code` and `HED`.
-Here are some common operations that you might want to perform on such a table:
-````{admonition}
-Get row 0 of `color_table` as a Pandas DataFrame:
+The resulting `color_table` has 3 rows.
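+A minimal sketch of such a construction is shown below (the `VectorData` import and the
+`color_code` data values are illustrative assumptions based on the description above):
+
+````{admonition} Sketch of creating a DynamicTable with a HedTags column.
+:class: tip
+```python
+from hdmf.common import DynamicTable, VectorData
+from ndx_hed import HedTags
+
+color_table = DynamicTable(
+    name="colors",
+    description="Color table for the experiment",
+    columns=[
+        VectorData(name="color_code", description="Internal color codes", data=[1, 2, 3]),
+        HedTags(hed_version="8.3.0", data=["Red", "Green", "Blue"]),
+    ],
+)
+```
+````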
+
+### Working with a `DynamicTable`
+Once a table has been created, you can retrieve rows and add new rows using the table's methods.
+
+````{admonition} Get row 0 of color_table as a Pandas DataFrame:
```python
df = color_table[0]
```
@@ -93,29 +104,95 @@ Append a row to `color_table`:
color_table.add_row(color_code=4, HED="Black")
```
````
-The `HED`
-color_table = DynamicTable(name="colors", description="Color table for the experiment",
- columns=[{"name"="code", "description"="Internal color codes", data=[1, 2, 3]},
- HedTags(name="HED", hed_version="8.3.0", data=["Red", "Green", "Blue"])]
-
+As mentioned above, the `DynamicTable` class is the base class for many table classes, including
+`TimeIntervals`, `Units`, and `PlaneSegmentation`.
+For example, the `icephys` classes that extend `DynamicTable` include `ExperimentalConditionsTable`, `IntracellularElectrodesTable`,
+`IntracellularResponsesTable`, `IntracellularStimuliTable`, `RepetitionsTable`, `SequentialRecordingsTable`,
+`SimultaneousRecordingsTable`, and `SweepTable`.
+This means that HED can be used to annotate a variety of NWB data.
+
+HED tools recognize a column as containing HED annotations if it is an instance of `HedTags`.
+This is in contrast to BIDS ([**Brain Imaging Data Structure**](https://bids.neuroimaging.io/)),
+which identifies HED in tabular files by the presence of a `HED` column,
+or by an accompanying JSON sidecar, which associates HED annotations with tabular column names.
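+For reference, a BIDS sidecar entry associating HED annotations with the values of a
+categorical column might look like the following (the column and value names are illustrative):
+```json
+{
+    "trial_type": {
+        "HED": {
+            "go": "Sensory-event, Visual-presentation",
+            "stop": "Sensory-event, Auditory-presentation"
+        }
+    }
+}
+```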
+
+## HED and ndx-events
+
+The NWB [**ndx-events**](https://github.com/rly/ndx-events) extension provides data structures for
+representing event information about data recordings.
+The following table lists elements of the *ndx-events* extension that inherit from
+`DynamicTable` and can accommodate HED annotations.
+
+```{list-table} ndx-events tables that can use HED.
+:header-rows: 1
+:name: ndx-events-data-structures
+
+* - Table
+ - Purpose
+ - Comments
+* - `EventsTypesTable`
+ - Information about each event type. One row per event type.
+ - Analogous to BIDS events.json.
+* - `EventsTable`
+ - Stores event instances. One row per event instance.
+ - Analogous to BIDS events.tsv.
+* - `TtlTypesTable`
+ - Information about each TTL type.
+ -
+* - `TtlTable`
+ - Information about each TTL instance.
+ -
```
+HED annotations that are common to a particular type of event can be added to the NWB `EventsTypesTable`,
+which is analogous to the `events.json` in BIDS.
+A `HED` column can be added to a BIDS `events.tsv` file to provide HED annotations specific
+to each event instance.
+Any number of `HedTags` columns can be added to the NWB `EventsTable` to provide different types
+of HED annotations for each event instance.
+The HEDTools ecosystem currently supports assembling the annotations from all sources to construct
+complete annotations for event instances in BIDS. Similar support is planned for NWB files.
-```
-## Installation
+## HED in NWB files
-## Implementation
-The implementation is based around
-### The HedTags class
+A single NWB recording and its supporting data is stored in an `NWBFile` object.
+The NWB infrastructure efficiently handles reading, writing, and accessing large `NWBFile` objects and their components.
+The following example shows the creation of a simple `NWBFile` using only the required constructor arguments.
+````{admonition} Create an NWBFile object called my_nwb.
+```python
+from datetime import datetime
+from dateutil.tz import tzutc
+from pynwb import NWBFile
+
+my_nwb = NWBFile(session_description='a test NWB File',
+ identifier='TEST123',
+ session_start_time=datetime(1970, 1, 1, 12, tzinfo=tzutc()))
-### In TrialTable
+```
+````
-### in EventTable
+An `NWBFile` has many fields, which can be set using optional parameters to the constructor
+or set later using method calls.
-file including the `TimeIntervals`, `Units`, `PlaneSegmentation`, and many of the
-the `icephys` tables (`ExperimentalConditionsTable`, `IntracellularElectrodesTable`,
-`IntracellularResponsesTable, `IntracellularStimuliTable`, `RepetitionsTable`, SequentialRecordingsTable`,
-`SimultaneousRecordingsTable` and the `SweepTable`)(`TrialTable`, Epch
\ No newline at end of file
+````{admonition} Add a HED trial column to an NWB trial table and add trial information.
+```python
+from ndx_hed import HedTags
+
+my_nwb.add_trial_column(name="HED", hed_version="8.3.0", col_cls=HedTags, data=[], description="HED annotations for each trial")
+my_nwb.add_trial(start_time=0.0, stop_time=1.0, HED="Correct-action")
+my_nwb.add_trial(start_time=2.0, stop_time=3.0, HED="Incorrect-action")
+```
+````
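+The resulting trials table can be inspected as a Pandas DataFrame:
+```python
+print(my_nwb.trials.to_dataframe())
+```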
+The optional parameters of the `NWBFile` constructor whose values can be instances of `DynamicTable` subclasses
+include `epochs`, `trials`, `invalid_times`, `units`, `electrodes`, `sweep_table`,
+`intracellular_recordings`, `icephys_simultaneous_recordings`, `icephys_repetitions`, and
+`icephys_experimental_conditions`.
+The `NWBFile` class has methods of the form `add_xxx_column` for the
+`epochs`, `electrodes`, `trials`, `units`, and `invalid_times` tables.
+The other tables also allow a HED column to be added by constructing the appropriate table
+prior to passing it to the `NWBFile` constructor.
+
+In addition, the `stimulus` argument accepts a list or tuple of objects, which may include `DynamicTable` objects.
+
+The NWB infrastructure provides IO functions to serialize these HED-augmented tables.
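+
+The following sketch (with a hypothetical file name) writes the file created above and reads it back;
+importing `ndx_hed` first ensures that the `HedTags` type is registered with pynwb.
+
+````{admonition} Write and read an NWB file containing HED annotations.
+:class: tip
+```python
+import ndx_hed  # noqa: F401 -- registers the HedTags type with pynwb
+from pynwb import NWBHDF5IO
+
+# Write the NWBFile, including the HED-annotated trials table.
+with NWBHDF5IO("trials_with_hed.nwb", mode="w") as io:
+    io.write(my_nwb)
+
+# Read the file back and display the trials with their HED annotations.
+with NWBHDF5IO("trials_with_hed.nwb", mode="r", load_namespaces=True) as io:
+    nwb_in = io.read()
+    print(nwb_in.trials.to_dataframe())
+```
+````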
\ No newline at end of file
diff --git a/docs/source/HedMatlabTools.md b/docs/source/HedMatlabTools.md
index da8f934..4720b2b 100644
--- a/docs/source/HedMatlabTools.md
+++ b/docs/source/HedMatlabTools.md
@@ -722,8 +722,8 @@ For the remodeling operations, first and second operation must be the dataset ro
directory and the remodeling file name, respectively.
In this example, dataset `ds003645` has been downloaded from [**openNeuro**](https://openneuro.org) to the `G:\` drive.
The remodeling file used in this example can be found at
-See [**File remodeling quickstart**](FileRemodelingQuickstart.md)
-and [**File remodeling tools**](FileRemodelingTools.md) for
+See [**HED remodeling quickstart**](HedRemodelingQuickstart.md)
+and [**HED remodeling tools**](HedRemodelingTools.md) for
additional information.
(web-service-matlab-demos-anchor)=
diff --git a/docs/source/HedOnlineTools.md b/docs/source/HedOnlineTools.md
index 4c70e5f..00e3c0f 100644
--- a/docs/source/HedOnlineTools.md
+++ b/docs/source/HedOnlineTools.md
@@ -151,8 +151,8 @@ to generate a JSON sidecar based on all the events files in a BIDS dataset.
The HED remodeling tools provide an interface to nearly all the HED tools functionality without programming. To use the tools, create a JSON file containing the commands that you
wish to execute on the events file. Commands are available to do various transformations
and summaries of events files as explained in
-the [**File remodeling quickstart**](https://www.hed-resources.org/en/latest/FileRemodelingQuickstart.html) and the
-[**File remodeling tools**](https://www.hed-resources.org/en/latest/FileRemodelingTools.html).
+the [**HED remodeling quickstart**](https://www.hed-resources.org/en/latest/HedRemodelingQuickstart.html) and the
+[**HED remodeling tools**](https://www.hed-resources.org/en/latest/HedRemodelingTools.html).
``````{admonition} Execute a remodel script.
diff --git a/docs/source/HedPythonTools.md b/docs/source/HedPythonTools.md
index 02b3f57..5a60524 100644
--- a/docs/source/HedPythonTools.md
+++ b/docs/source/HedPythonTools.md
@@ -1,22 +1,44 @@
# HED Python tools
+The primary codebase for HED support is in Python.
+Source code for the HED Python tools is available in the
+[**hed-python**](https://github.com/hed-standard/hed-python) GitHub repository.
+See the [**HED tools API documentation**](https://hed-python.readthedocs.io/en/latest/) for
+detailed information about the HED Tools API.
+
+Many of the most-frequently used tools are available using the
+[**HED remodeling tools**](https://www.hed-resources.org/en/latest/HedRemodelingTools.html).
+Using the remodeling interface, users specify operations and parameters in a JSON
+file rather than writing code.
+
+The [**HED online tools**](https://hedtools.org/hed) provide an easy-to-use GUI and web service for accessing the tools.
+See the [**HED online tools documentation**](https://www.hed-resources.org/en/latest/HedOnlineTools.html)
+for more information.
+
+The [**HED MATLAB tools**](https://www.hed-resources.org/en/latest/HedMatlabTools.html)
+provide a MATLAB wrapper for the HED Python tools.
+For users who do not have an appropriate version of Python installed for their MATLAB,
+the tools access the online tool web service to perform the operation.
+
+## HED Python tool installation
The HED (Hierarchical Event Descriptor) scripts and notebooks assume
that the Python HedTools have been installed.
-The HedTools package is not yet available on PyPI, so you will need to install it
-directly from GitHub using:
+The HedTools package is available on PyPI and can be installed using:
```shell
- pip install git+https://github.com/hed-standard/hed-python/@master
+ pip install hedtools
```
-There are several types of Jupyter notebooks and other HED support tools:
-* [**Jupyter notebooks for HED in BIDS**](jupyter-notebooks-for-hed-in-bids-anchor) - aids for HED annotation in BIDS.
-* [**Jupyter notebooks for data curation**](jupyter-curation-notebooks-anchor) - aids for
-summarizing and reorganizing event data.
-* [**Calling HED tools**](calling-hed-tools-anchor) - specific useful functions/classes.
+Prerelease versions of HedTools are available on the `develop` branch of the
+[**hed-python**](https://github.com/hed-standard/hed-python) GitHub repository
+and can be installed using:
+```shell
+ pip install git+https://github.com/hed-standard/hed-python/@develop
+```
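+You can check which version of HedTools is installed using `pip show hedtools`.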
-(jupyter-notebooks-for-hed-in-bids-anchor)=
-## Jupyter notebooks for HED in BIDS
+
+(jupyter-notebooks-for-hed-anchor)=
+## Jupyter notebooks for HED
The following notebooks are specifically designed to support HED annotation
for BIDS datasets.
@@ -225,222 +247,3 @@ This is very useful for testing new schemas that are underdevelopment.
### Validate BIDS datasets
The [**validate_bids_datasets.ipynb**](https://github.com/hed-standard/hed-examples/blob/main/src/jupyter_notebooks/bids/validate_bids_datasets.ipynb) is similar to the other validation notebooks, but it takes a list of datasets to validate as a convenience.
-
-
-(jupyter-curation-notebooks-anchor)=
-## Jupyter notebooks for data curation
-
-All data curation notebooks and other examples can now be found
-in the [**hed-examples**](https://github.com/hed-standard/hed-examples) repository.
-
-
-(consistency-of-BIDS-event-files-anchor)=
-### Consistency of BIDS event files
-
-Some neuroimaging modalities such as EEG, typically contain event information
-encoded in the data recording files, and the BIDS `events.tsv` files are
-generated post hoc.
-
-In general, the following things should be checked before data is released:
-1. The BIDS `events.tsv` files have the same number of events as the data
-recording and that onset times of corresponding events agree.
-2. The associated information contained in the data recording and event files is consistent.
-3. The relevant metadata is present in both versions of the data.
-
-The example data curation scripts discussed in this section assume that two versions
-of each BIDS event file are present: `events.tsv` and a corresponding `events_temp.tsv` file.
-The example datasets that are using for these tutorials assume that the recordings
-are in EEG.set format.
-
-(calling-hed-tools-anchor)=
-## Calling HED tools
-
-This section shows examples of useful processing functions provided in HedTools:
-
-* [**Getting a list of filenames**](getting-a-list-of-files-anchor)
-* [**Dictionaries of filenames**](dictionaries-of-filenames-anchor)
-* [**Logging processing steps**](logging-processing-steps-anchor)
-
-
-(getting-a-list-of-files-anchor)=
-### Getting a list of files
-
-Many situations require the selection of files in a directory tree based on specified criteria.
-The `get_file_list` function allows you to pick out files with a specified filename
-prefix and filename suffix and specified extensions
-
-The following example returns a list of full paths of the files whose names end in `_events.tsv`
-or `_events.json` that are not in any `code` or `derivatives` directories in the `bids_root_path`
-directory tree.
-The search starts in the directory root `bids_root_path`:
-
-````{admonition} Get a list of specified files in a specified directory tree.
-:class: tip
-```python
-file_list = get_file_list(bids_root_path, extensions=[ ".json", ".tsv"], name_suffix="_events",
- name_prefix="", exclude_dirs=[ "code", "derivatives"])
-```
-````
-
-(dictionaries-of-filenames-anchor)=
-### Dictionaries of filenames
-
-The HED tools provide both generic and BIDS-specific classes for dictionaries of filenames.
-
-Many of the HED data processing tools make extensive use of dictionaries specifying both data and format.
-
-#### BIDS-specific dictionaries of files
-
-Files in BIDS have unique names that indicate not only what the file represents,
-but also where that file is located within the BIDS dataset directory tree.
-
-##### BIDS file names and keys
-A BIDS file name consists of an underbar-separated list of entities,
-each specified as a name-value pair,
-followed by suffix indicating the data modality.
-
-For example, the file name `sub-001_ses-3_task-target_run-01_events.tsv`
-has entities subject (`sub`), task (`task`), and run (`run`).
-The suffix is `events` indicating that the file contains events.
-The extension `.tsv` gives the data format.
-
-Modality is not the same as data format, since some modalities allow
-multiple formats. For example, `sub-001_ses-3_task-target_run-01_eeg.set`
-and `sub-001_ses-3_task-target_run-01_eeg.edf` are both acceptable
-representations of EEG files, but the data is in different formats.
-
-The BIDS file dictionaries represented by the class `BidsFileDictionary`
-and its extension `BidsTabularDictionary` use a set combination of entities
-as the file key.
-
-For a file name `sub-001_ses-3_task-target_run-01_events.tsv`,
-the tuple ('sub', 'task') gives a key of `sub-001_task-target`,
-while the tuple ('sub', 'ses', 'run') gives a key of `sub-001_ses-3_run-01`.
-The use of dictionaries of file names with such keys makes it
-easier to associate related files in the BIDS naming structure.
-
-Notice that specifying entities ('sub', 'ses', 'run') gives the
-key `sub-001_ses-3_run-01` for all three files:
-`sub-001_ses-3_task-target_run-01_events.tsv`, `sub-001_ses-3_task-target_run-01_eeg.set`
-and `sub-001_ses-3_task-target_run-01_eeg.edf`.
-Thus, the expected usage is to create a dictionary of files of one modality.
-
-````{admonition} Create a key-file dictionary for files ending in events.tsv in bids_root_path directory tree.
-:class: tip
-```python
-from hed.tools import FileDictionary
-from hed.util import get_file_list
-
-file_list = get_file_list(bids_root_path, extensions=[ ".set"], name_suffix="_eeg",
- exclude_dirs=[ "code", "derivatives"])
-file_dict = BidsFileDictionary(file_list, entities=('sub', 'ses', 'run) )
-```
-````
-
-In this example, the `get_file_list` filters the files of the appropriate type,
-while the `BidsFileDictionary` creates a dictionary with keys such as
-`sub-001_ses-3_run-01` and values that are `BidsFile` objects.
-`BidsFile` can hold the file name of any BIDS file and keeps a parsed
-version of the file name.
-
-
-
-#### A generic dictionary of filenames
-
-
-````{admonition} Create a key-file dictionary for files ending in events.json in bids_root_path directory tree.
-:class: tip
-```python
-from hed.tools import FileDictionary
-from hed.util import get_file_list
-
-file_list = get_file_list(bids_root_path, extensions=[ ".json"], name_suffix="_events",
- exclude_dirs=[ "code", "derivatives"])
-file_dict = FileDictionary(file_list, name_indices=name_indices)
-```
-````
-
-Keys are calculated from the filename using a `name_indices` tuple,
-which indicates the positions of the name-value entity pairs in the
-BIDS file name to use.
-
-The BIDS filename `sub-001_ses-3_task-target_run-01_events.tsv` has
-three name-value entity pairs (`sub-001`, `ses-3`, `task-target`,
-and `run-01`) separated by underbars.
-
-The tuple (0, 2) gives a key of `sub-001_task-target`,
-while the tuple (0, 3) gives a key of `sub-001_run-01`.
-Neither of these choices uniquely identifies the file.
-The tuple (0, 1, 3) gives a unique key of `sub-001_ses-3_run-01`.
-The tuple (0, 1, 2, 3) also works giving `sub-001_ses-3_task-target_run-01`.
-
-If you choose the `name_indices` incorrectly, the keys for the event files
-will not be unique, and the notebook will throw a `HedFileError`.
-If this happens, modify your `name_indices` key choice to include more entity pairs.
-
-For example, to compare the events stored in a recording file and the events
-in the `events.tsv` file associated with that recording,
-we might dump the recording events in files with the same name, but ending in `events_temp.tsv`.
-The `FileDictionary` class allows us to create a keyed dictionary for each of these event files.
-
-
-(logging-processing-steps-anchor)=
-### Logging processing steps
-
-Often event data files require considerable processing to assure
-internal consistency and compliance with the BIDS specification.
-Once this processing is done and the files have been transformed,
-it can be difficult to understand the relationship between the
-transformed files and the original data.
-
-The `HedLogger` allows you to document processing steps associated
-with the dataset by identifying key as illustrated in the following
-log file excerpt:
-
-(example-output-hed-logger-anchor)=
-`````{admonition} Example output from HED logger.
-:class: tip
-```text
-sub-001_run-01
- Reordered BIDS columns as ['onset', 'duration', 'sample', 'trial_type', 'response_time', 'stim_file', 'value', 'HED']
- Dropped BIDS skip columns ['trial_type', 'value', 'response_time', 'stim_file', 'HED']
- Reordered EEG columns as ['sample_offset', 'event_code', 'cond_code', 'type', 'latency', 'urevent', 'usertags']
- Dropped EEG skip columns ['urevent', 'usertags', 'type']
- Concatenated the BIDS and EEG event files for processing
- Dropped the sample_offset and latency columns
- Saved as _events_temp1.tsv
-sub-002_run-01
- Reordered BIDS columns as ['onset', 'duration', 'sample', 'trial_type', 'response_time', 'stim_file', 'value', 'HED']
- Dropped BIDS skip columns ['trial_type', 'value', 'response_time', 'stim_file', 'HED']
- Reordered EEG columns as ['sample_offset', 'event_code', 'cond_code', 'type', 'latency', 'urevent', 'usertags']
- Dropped EEG skip columns ['urevent', 'usertags', 'type']
- Concatenated the BIDS and EEG event files for processing
- . . .
-```
-`````
-
-Each of the lines following a key represents a print message to the logger.
-
-The most common use for a logger is to create a file dictionary
-using [**make_file_dict**](dictionaries-of-filenames-anchor)
-and then to log each processing step using the file's key.
-This allows a processing step to be applied to all the relevant files in the dataset.
-After all the processing is complete, the `print_log` method
-outputs the logged messages by key, thus showing all the
-processing steps that have been applied to each file
-as shown in the [**previous example**](example-output-hed-logger-anchor).
-
-(using-hed-logger-example-anchor)=
-`````{admonition} Using the HED logger.
-:class: tip
-```python
-from hed.tools import HedLogger
-status = HedLogger()
-status.add(key, f"Concatenated the BIDS and EEG event files")
-
-# ... after processing is complete output or save the log
-status.print_log()
-```
-`````
-
-The `HedLogger` is used throughout the processing notebooks in this repository.
diff --git a/docs/source/FileRemodelingQuickstart.md b/docs/source/HedRemodelingQuickstart.md
similarity index 94%
rename from docs/source/FileRemodelingQuickstart.md
rename to docs/source/HedRemodelingQuickstart.md
index 52be3e2..357212e 100644
--- a/docs/source/FileRemodelingQuickstart.md
+++ b/docs/source/HedRemodelingQuickstart.md
@@ -1,495 +1,495 @@
-(file-remodeling-quickstart-anchor)=
-# File remodeling quickstart
-
-This tutorial works through the process of restructuring tabular (`.tsv`) files using the HED file remodeling tools.
-These tools particularly useful for creating event files from
-information in experimental logs and for restructuring event files to enable a particular analysis.
-
-The tools, which are written in Python, are designed to be run on an entire dataset.
-This dataset can be in BIDS
-([**Brain Imaging Data Structure**](https://bids.neuroimaging.io/)),
-Alternative users can specify files with a particular suffix and extension appearing
-in a specified directory tree.
-The later format is useful for restructuring that occurs early in the experimental process,
-for example, during the conversion from the experimental control software formats.
-
-The tools can be run using a command-line script, called from a Jupyter notebook,
-or run using online tools. This quickstart covers the basic concepts of remodeling and
-develops some basic examples of how remodeling is used. See the
-[**File remodeling tools**](./FileRemodelingTools.md)
-guide for detailed descriptions of the available operations.
-
-* [**What is remodeling?**](what-is-remodeling-anchor)
-* [**The remodeling process**](the-remodeling-process-anchor)
-* [**JSON remodeling files**](json-remodeling-files-anchor)
- * [**Basic remodel operation syntax**](basic-remodel-operation-syntax-anchor)
- * [**Applying multiple remodel operations**](applying-multiple-remodel-operations-anchor)
- * [**More complex remodeling**](more-complex-remodeling-anchor)
- * [**Remodeling file locations**](remodeling-file-locations-anchor)
-* [**Using the remodeling tools**](using-the-remodeling-tools-anchor)
- * [**Online tools for debugging**](online-tools-for-debugging-anchor)
- * [**The command-line interface**](the-command-line-interface-anchor)
- * [**Jupyter notebooks for remodeling**](jupyter-notebooks-for-remodeling-anchor)
-
-(what-is-remodeling-anchor)=
-## What is remodeling?
-
-Although the remodeling process can be applied to any tabular file,
-they are most often used for restructuring event files.
-Event files, which consist of identified time markers linked to the timeline of the experiment,
-provide a crucial bridge between what happens in
-the experiment and the experimental data.
-
-Event files are often initially created using information in the log files
-generated by the experiment control software.
-The entries in the log files mark time points within the experimental record at which something
-changes or happens (such as the onset or offset of a stimulus or a participant response).
-These event files are then used to identify portions of the data
-corresponding to particular points or blocks of data to be analyzed or compared.
-
-**Remodeling** refers to the process of file restructuring including creating, modifying, and
-reorganizing tabular files in order to
-disambiguate or clarify their information to enable or streamline
-their analysis and/or further distribution.
-HED-based remodeling can occur at several stages during the acquisition and processing
-of experimental data as shown in this schematic diagram:
-![schematic diagram](./_static/images/WebWorkflow.png).
-
-In addition to restructuring during initial structuring of the tabular files,
-further event file restructuring may be useful when the event files are not suited to the requirements of a particular analysis. Thus, restructuring can be an iterative process, which is supported by the HED Remodeling Tools for datasets with tabular event files.
-
-The following table gives a summary of the tools available in the HED remodeling toolbox.
-
-(summary-of-hed-remodeling-operations-anchor)=
-````{table} Summary of the HED remodeling operations for tabular files.
-| Category | Operation | Example use case |
-| -------- | ------- | -----|
-| **clean-up** | | |
-| | [*remove_columns*](remove-columns-anchor) | Remove temporary columns created during restructuring. |
-| | [*remove_rows*](remove-rows-anchor) | Remove rows with a particular value in a specified column. |
-| | [*rename_columns*](rename-columns-anchor) | Make columns names consistent across a dataset. |
-| | [*reorder_columns*](reorder-columns-anchor) | Make column order consistent across a dataset. |
-| **factor** | | |
-| | [*factor_column*](factor-column-anchor) | Extract factor vectors from a column of condition variables. |
-| | [*factor_hed_tags*](factor-hed-tags-anchor) | Extract factor vectors from search queries of HED annotations. |
-| | [*factor_hed_type*](factor-hed-type-anchor) | Extract design matrices and/or condition variables. |
-| **restructure** | | |
-| | [*merge_consecutive*](merge-consecutive-anchor) | Replace multiple consecutive events of the same type with one event of longer duration. |
-| | [*remap_columns*](remap-columns-anchor) | Create *m* columns from values in *n* columns (for recoding). |
-| | [*split_rows*](split-rows-anchor) | Split trial-encoded rows into multiple events. |
-| **summarization** | | |
-| | [*summarize_column_names*](summarize-column-names-anchor) | Summarize column names and order in the files. |
-| | [*summarize_column_values*](summarize-column-values-anchor) |Count the occurrences of the unique column values. |
-| | [*summarize_definitions*](summarize-definitions-anchor) |Summarize definitions used and report inconsistencies. |
-| | [*summarize_hed_tags*](summarize-hed-tags-anchor) | Summarize the HED tags present in the HED annotations for the dataset. |
-| | [*summarize_hed_type*](summarize-hed-type-anchor) | Summarize the detailed usage of a particular type tag such as *Condition-variable* or *Task* (used to automatically extract experimental designs). |
-| | [*summarize_hed_validation*](summarize-hed-validation-anchor) | Validate the data files and report any errors. |
-| | [*summarize_sidecar_from_events*](summarize-sidecar-from-events-anchor) | Generate a sidecar template from an event file. |
-````
-
-The **clean-up** operations are used at various phases of restructuring to assure consistency
-across files in the dataset.
-
-The **factor** operations produce column vectors of the same length as the number of rows in a file
-in order to encode condition variables, design matrices, or the results of other search criteria.
-See the
-[**HED conditions and design matrices**](./HedConditionsAndDesignMatrices.md)
-for more information on factoring and analysis.
-
-The **restructure** operations modify the way that files represent information.
-
-The **summarization** operations produce dataset-wide and individual file
-summaries of various aspects of the data.
-
-More detailed information about the remodeling operations can be found in
-the [**File remodeling tools**](file-remodeling-tools-anchor) guide.
-
-(the-remodeling-process-anchor)=
-## The remodeling process
-
-Remodeling consists of applying a list of operations to a tabular file
-to restructure or modify the file in some way.
-The following diagram shows a schematic of the remodeling process.
-
-![Event remodeling process](./_static/images/EventRemappingProcess.png)
-
-Initially, the user creates a backup of the selected files.
-This backup process is performed only once, and the results are
-stored in the `derivatives/remodel/backups` subdirectory of the dataset.
-
-Restructuring applies a sequence of remodeling operations given in a JSON remodeling
-file to produce a final result.
-By convention, we name these remodeling instruction files `_rmdl.json`
-and store them in the `derivatives/remodel/remodeling_files` directory relative
-to the dataset root directory.
-
-The restructuring always proceeds by looking up each data file in the backup
-and applying the transformation to the backup before overwriting the non-backed up version.
-
-The remodeling file provides a record of the operations performed on the file
-starting with the original file.
-If the user detects a mistake in the transformation instructions,
-he/she can correct the remodeling JSON file and rerun.
-
-Usually, users will use the default backup, run the backup request once, and
-work from the original backup.
-However, user may also elect to create a named backup, use the backup
-as a checkpoint mechanism, and develop scripts that use the check-pointed versions as the starting point.
-This is useful if different versions of the events files are needed for different purposes.
-
-
-(json-remodeling-files-anchor)=
-## JSON remodeling files
-
-The operations to restructure a tabular file are stored in a remodel file in JSON format.
-The file consists of a list of JSON dictionaries.
-
-(basic-remodel-operation-syntax-anchor)=
-### Basic remodel operation syntax
-
-Each dictionary specifies an operation, a description of the purpose, and the operation parameters.
-The basic syntax of a remodeler operation is illustrated in the following example which renames
-the *trial_type* column to *event_type*.
-
-
-````{admonition} Example of a remodeler operation.
-:class: tip
-
-```json
-{
- "operation": "rename_columns",
- "description": "Rename a trial type column to more specific event_type",
- "parameters": {
- "column_mapping": {
- "trial_type": "event_type"
- },
- "ignore_missing": true
- }
-}
-```
-````
-
-Each remodeler operation has its own specific set of required parameters
-that can be found under [**File remodeling tools**](./FileRemodelingTools.md).
-For *rename_columns*, the required operations are *column_mapping* and *ignore_missing*.
-Some operations also have optional parameters.
-
-(applying-multiple-remodel-operations-anchor)=
-### Applying multiple remodel operations
-
-A remodel JSON file consists of a list of one or remodel operations,
-each specified in a dictionary.
-These operations are performed by the remodeler in the order they appear in the file.
-In the example below, a summary is performed after renaming,
-so the result reflects the new column names.
-
-````{admonition} An example JSON remodeler file with multiple operations.
-:class: tip
-
-```json
-[
- {
- "operation": "rename_columns",
- "description": "Rename a trial type column to more specific event_type.",
- "parameters": {
- "column_mapping": {
- "trial_type": "event_type"
- },
- "ignore_missing": true
- }
- },
- {
- "operation": "summarize_column_names",
- "description": "Get column names across files to find any missing columns.",
- "parameters": {
- "summary_name": "Columns after remodeling",
- "summary_filename": "columns_after_remodel"
- }
- }
-]
-```
-````
-
-By stacking operations you can make several changes to a data file,
-which is important because the changes are always applied to a copy of the original backup.
-If you are planning new changes to the file, note that you are always changing
-a copy of the original backed up file, not a previously remodeled `.tsv`.
-
-(more-complex-remodeling-anchor)=
-### More complex remodeling
-
-This section discusses a complex example using the
-[**sub-0013_task-stopsignal_acq-seq_events.tsv**](./_static/data/sub-0013_task-stopsignal_acq-seq_events.tsv)
-events file of AOMIC-PIOP2 dataset available on [OpenNeuro](https://openneuro.org) as ds002790.
-Here is an excerpt of the event file.
-
-
-(sample-remodeling-events-file-anchor)=
-````{admonition} Excerpt from an event file from the stop-go task of AOMIC-PIOP2 (ds002790).
-| onset | duration | trial_type | stop_signal_delay | response_time | response_accuracy | response_hand | sex |
-| ----- | -------- | ---------- | ----------------- | ------------- | ----------------- | ------------- | --- |
-| 0.0776 | 0.5083 | go | n/a | 0.565 | |correct | right | female
-| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | 0.49 | correct | right | female |
-| 9.5856 | 0.5084 | go | n/a | 0.45 | correct | right | female |
-| 13.5939 | 0.5083 | succesful_stop | 0.2 | n/a | n/a | n/a | female |
-| 17.1021 | 0.5083 | unsuccesful_stop | 0.25 | 0.633 | correct | left | male |
-| 21.6103 | 0.5083 | go | n/a | 0.443 | correct | left | male |
-````
-
-This event file corresponds to a stop-signal experiment.
-Participants were presented with faces and had to decide the sex of the face
-by pressing a button with left or right hand.
-However, if a stop signal occurred before this selection, the participant was to refrain from responding.
-
-The structure of this file corresponds to the [**BIDS**](https://bids.neuroimaging.io/) format
-for event files.
-The first column, which must be called `onset` represents the time from the start of recording
-in seconds of the temporal marker represented by that row in the file.
-In this case that temporal marker represents the presentation of a face image.
-
-Notice that the *stop_signal_delay* and *response_time* columns contain information
-about additional events (when a trial stop signal was presented and when the participant
-pushed a button).
-These events are encoded implicitly as offsets from the presentation of the go signal.
-Each row is the file encodes information for an entire trial rather than what occurred
-at a single temporal marker.
-This strategy is known as *trial-level* encoding.
-
-Our goal is to represent all the trial events (e.g., go signal, stop signal, and response)
-in separate rows of the event file using the *split_rows* restructuring operation.
-The following example shows the remodeling operations to perform the splitting.
-
-````{admonition} Example of split_rows operation for the AOMIC stop signal task.
-:class: tip
-
-```json
-[
- {
- "operation": "split_rows",
- "description": "Split response event from trial event based on response_time column.",
- "parameters": {
- "anchor_column": "trial_type",
- "new_events": {
- "response": {
- "onset_source": ["response_time"],
- "duration": [0],
- "copy_columns": ["response_accuracy", "response_hand"]
- },
- "stop_signal": {
- "onset_source": ["stop_signal_delay"],
- "duration": [0.5]
- }
- },
- "remove_parent_row": false
- }
- }
-]
-```
-````
-
-The example uses the *split_rows* operation to convert this
-file from trial encoding to event encoding.
-In trial encoding each event marker (row in the event file) represents
-all the information in a single trial.
-Event markers such as the participant's response key-press are encoded implicitly
-as an offset from the stimulus presentation.
-while event encoding includes event markers for each individual event within the trial.
-
-The [**Split rows**](./FileRemodelingTools.md#split-rows)
-explanation under [**File remodeling tools**](./FileRemodelingTools.md)
-shows the required parameters for the *split_rows* operation.
-The required parameters are *anchor_column*, *new_events*, and *remove_parent_row*.
-
-The *anchor_column* is the column we want to add new events corresponding to the stop signal and the response.
-In this case we are going to add events to an existing column: *trial_type*.
-The new events will be in new rows and the existing rows will not be overwritten
-because *remove_parent_event* is false.
-(After splitting we may want to rename *trial_type* to *event_type* since the individual
-rows in the data file no longer represent trials, but individual events within the trial.)
-
-Next we specify how the new events are generated in the *new_events* dictionary.
-Each type of new event has a name, which is a key in the *new_events* dictionary.
-Each key is associated with a dictionary
-specifying the values of the following parameters.
-
-* *onset_source*
-* *duration*
-* *copy_columns*
-
-The *onset_source* is a list indicating how to calculate the onset for the new event
-relative to the onset of the anchor event.
-The list contains any combination of column names and numerical values,
-which are evaluated and added to the onset value of the row being split.
-Column names are evaluated to the row values in the corresponding columns.
-
-In our example, the response time and stop signal delay are calculated relative to the trial's onset,
-so we only need to add the value from the respective column.
-Note that these new events do not exist for every trial.
-Rows where there was no stop signal have an `n/a` in the *stop_signal_delay* column.
-This is processed automatically, and remodeler does not create new events
-when any items in the *onset_source* list is missing or `n/a`.
-
-The *duration* specifies the duration for the new events.
-The AOMIC data did not measure the durations of the button presses,
-so we set the duration of the response event to 0.
-The AOMIC data report indicates that the stop signal lasted 500 ms.
-
-The *copy_columns* is an optional parameter indicating which columns from the parent event should be copied to the
-newly-created event.
-We would like to transfer the *response_accuracy* and the *response_hand* information to the *response* event.
-Since no extra column values are to be transferred for *stop_signal*, columns other than *onset*, *duration*,
-and *trial_type* are filled with `n/a`.
-
-
-The final remodeling file can be found at:
-[**finished json remodeler**](./_static/data/AOMIC_splitevents_rmdl.json)
-
-(remodeling-file-locations-anchor)=
-### Remodeling file locations
-
-The remodeling tools expect the full path for the JSON remodeling operation file to be given
-when the remodeling is executed.
-However, it is a good practice to include all remodeling files used with the dataset.
-The JSON remodeling operation files are usually located in the
-`derivatives/remodel/remodeling_files` subdirectory below the dataset root,
-and have file names ending in `_rmdl.json`.
-
-The backups are always in the `derivatives/remodel/backups` subdirectory under the dataset root.
-Summaries produced by the restructuring tools are located in `derivatives/remodel/summaries`.
-
-In the next section we will go over several ways to call the remodeler.
-
-(using-the-remodeling-tools-anchor)=
-## Using the remodeling tools
-
-The remodeler can be called in a number of ways including using online tools and from the command line.
-The following sections explain various ways to use the available tools.
-
-(online-tools-for-debugging-anchor)=
-### Online tools for debugging
-
-Although the event restructuring tools are designed to be run on an entire dataset,
-you should consider working with a single data file during debugging.
-The HED online tools provide support for debugging your remodeling script and for
-seeing the effect of remodeling on a single data file before running on the entire dataset.
-You can access these tools on the [**HED tools online tools server**](https://hedtools.ucsd.edu/hed).
-
-To use the online remodeling tools, navigate to the events page and select the *Execute remodel script* action.
-Browse to select the data file to be remodeled and the JSON remodel file
-containing the remodeling operations.
-The following screenshot shows these selections for the split rows example of the previous section.
-
-![Remodeling tools online](./_static/images/RemodelingOnline.png)
-
-Press the *Process* button to complete the action.
-If the remodeling script has errors,
-the result will be a downloaded text file with the errors identified.
-If the remodeling script is correct,
-the result will be a data file with the remodeling transformations applied.
-If the remodeling script contains summarization operations,
-the result will be a zip file with the modified data file and the summaries included.
-
-If you are using one of the remodeling operations that relies on HED tags, you will
-also need to upload a suitable JSON sidecar file containing the HED annotations for the data file
-if you turn the *Include summaries* option on.
-
-(the-command-line-interface-anchor)=
-### The command-line interface
-
-After [**installing the remodeler**](./FileRemodelingTools.md#installing-the-remodel-tools),
-you can run the tools on a full BIDS dataset,
-or on any directory using the command-line interface using
-`run_remodel_backup`, `run_remodel`, and `run_remodel_restore`.
-A full overview of all arguments is available at
-[**File remodeling tools**](./FileRemodelingTools.md#remodel-command-line-arguments).
-
-The `run_remodel_backup` is usually run only once for a dataset.
-It makes the baseline backup of the event files to assure that nothing will be lost.
-The remodeling always starts from the backup files.
-
-The `run_remodel` restores the data files from the corresponding backup files and then
-executes remodeling operations from a JSON file.
-A sample command line call for `run_remodel` is shown in the following example.
-
-(remodel-run-anchor)=
-````{admonition} Command to run a summary for the AOMIC dataset.
-:class: tip
-
-```bash
-python run_remodel /data/ds002790 /data/ds002790/derivatives/remodel/remodeling_files/AOMIC_summarize_rmdl.json \
- -b -s .txt -x derivatives
-
-```
-````
-
-The parameters are as follows:
-
-* `data_dir` - (Required first argument) Root directory of the dataset.
-* `model_path` - (Required second argument) Path of JSON file with remodeling operations.
-* `-b` - (Optional) If present, assume BIDS formatted data.
-* `-s` - (Optional) list of formats to save summaries in.
-* `-x` - (Optional) List of directories to exclude from event processing.
-
-There are three types of command line arguments:
-
-[**Positional arguments**](./FileRemodelingTools.md#positional-arguments),
-[**Named arguments**](./FileRemodelingTools.md#named-arguments),
-and [**Named arguments with values**](./FileRemodelingTools.md#named-arguments).
-
-The positional arguments, `data_dir` and `model_path` are not optional and must
-be the first and second arguments to `run_remodel`, respectively.
-The named arguments (with and without values) are optional.
-They all have default values if omitted.
-
-The `-b` option is a named argument indicating whether the dataset is in BIDS format.
-If in BIDS format, the remodeling tools can extract information such as the HED schema
-and the HED annotations from the dataset.
-BIDS data file names are unique, which is convenient for reporting summary information.
-Name arguments are flags-- their presence indicates true and absence indicates false.
-
-The `-s` and `-x` options are examples of named arguments with values.
-The `-s .txt` specifies that summaries should be saved in text format.
-The `-x derivatives` indicates that the `derivatives` subdirectory should not
-be processed during remodeling.
-
-This script can be run multiple times without doing backups and restores,
-since it always starts with the backed up files.
-
-The first argument of the command line scripts is the full path to the root directory of the dataset.
-The `run_remodel` requires the full path of the json remodeler file as the second argument.
-A number of optional key-value arguments are also available.
-
-After the `run_remodel` finishes, it overwrites the data files (not the backups)
-and writes any requested summaries in `derivatives/remodel/summaries`.
-
-The summaries will be written to `/data/ds002790/derivatives/remodel/summaries` folder in text format.
-By default, the summary operations will return both.
-
-The [**summary file**](./_static/data/AOMIC_column_names_2022_12_21_T_13_12_35_641062.txt) lists all different column combinations and for each combination, the files with those columns.
-Looking at the different column combinations you can see there are three, one for each task that was performed for this dataset.
-
-Going back to the [**split rows example**](#more-complex-remodeling) of remodeling,
-we see that splitting the rows into multiple rows only makes sense if the event files have the same columns.
-Only the event files for the stop signal task contain the `stop_signal_delay` column and the `response_time` column.
-The summarizing the column names across the dataset allows users to check whether the column
-names are consistent across the dataset.
-A common use case for BIDS datasets is that the event files have a different structure
-for different tasks.
-The `-t` command-line option allows users to specify which tasks to perform remodeling on.
-Using this option allows users to select only the files that have the specified task names
-in their filenames.
-
-
-Now you can try out the *split_rows* on the full dataset!
-
-
-(jupyter-notebooks-for-remodeling-anchor)=
-### Jupyter notebooks for remodeling
-
-Three Jupyter remodeling notebooks are available at
-[**Jupyter notebooks for remodeling**](https://github.com/hed-standard/hed-examples/tree/main/src/jupyter_notebooks/remodeling).
-
-These notebooks are wrappers that create the backup as well as run restructuring operations on data files.
-If you do not have access to a Jupyter notebook facility, the article
-[Six easy ways to run your Jupyter Notebook in the cloud](https://www.dataschool.io/cloud-services-for-jupyter-notebook/) discusses various no-cost options
-for running Jupyter notebooks online.
+(hed-remodeling-quickstart-anchor)=
+# HED remodeling quickstart
+
+This tutorial works through the process of restructuring tabular (`.tsv`) files using the HED file remodeling tools.
+These tools are particularly useful for creating event files from
+information in experimental logs and for restructuring event files to enable a particular analysis.
+
+The tools, which are written in Python, are designed to be run on an entire dataset.
+This dataset can be in BIDS
+([**Brain Imaging Data Structure**](https://bids.neuroimaging.io/)) format.
+Alternatively, users can specify files with a particular suffix and extension appearing
+in a specified directory tree.
+The latter format is useful for restructuring that occurs early in the experimental process,
+for example, during conversion from the formats produced by the experimental control software.
+
+The tools can be run using a command-line script, called from a Jupyter notebook,
+or run using online tools. This quickstart covers the basic concepts of remodeling and
+develops some basic examples of how remodeling is used. See the
+[**HED remodeling tools**](./HedRemodelingTools.md)
+guide for detailed descriptions of the available operations.
+
+* [**What is remodeling?**](what-is-remodeling-anchor)
+* [**The remodeling process**](the-remodeling-process-anchor)
+* [**JSON remodeling files**](json-remodeling-files-anchor)
+ * [**Basic remodel operation syntax**](basic-remodel-operation-syntax-anchor)
+ * [**Applying multiple remodel operations**](applying-multiple-remodel-operations-anchor)
+ * [**More complex remodeling**](more-complex-remodeling-anchor)
+ * [**Remodeling file locations**](remodeling-file-locations-anchor)
+* [**Using the remodeling tools**](using-the-remodeling-tools-anchor)
+ * [**Online tools for debugging**](online-tools-for-debugging-anchor)
+ * [**The command-line interface**](the-command-line-interface-anchor)
+ * [**Jupyter notebooks for remodeling**](jupyter-notebooks-for-remodeling-anchor)
+
+(what-is-remodeling-anchor)=
+## What is remodeling?
+
+Although the remodeling process can be applied to any tabular file,
+it is most often used for restructuring event files.
+Event files, which consist of identified time markers linked to the timeline of the experiment,
+provide a crucial bridge between what happens in
+the experiment and the experimental data.
+
+Event files are often initially created using information in the log files
+generated by the experiment control software.
+The entries in the log files mark time points within the experimental record at which something
+changes or happens (such as the onset or offset of a stimulus or a participant response).
+These event files are then used to identify portions of the data
+corresponding to particular points or blocks of data to be analyzed or compared.
+
+**Remodeling** refers to the process of file restructuring including creating, modifying, and
+reorganizing tabular files in order to
+disambiguate or clarify their information to enable or streamline
+their analysis and/or further distribution.
+HED-based remodeling can occur at several stages during the acquisition and processing
+of experimental data as shown in this schematic diagram:
+![schematic diagram](./_static/images/WebWorkflow.png).
+
+In addition to restructuring during initial structuring of the tabular files,
+further event file restructuring may be useful when the event files are not suited to the requirements of a particular analysis. Thus, restructuring can be an iterative process, which is supported by the HED Remodeling Tools for datasets with tabular event files.
+
+The following table gives a summary of the tools available in the HED remodeling toolbox.
+
+(summary-of-hed-remodeling-operations-anchor)=
+````{table} Summary of the HED remodeling operations for tabular files.
+| Category | Operation | Example use case |
+| -------- | ------- | -----|
+| **clean-up** | | |
+| | [*remove_columns*](remove-columns-anchor) | Remove temporary columns created during restructuring. |
+| | [*remove_rows*](remove-rows-anchor) | Remove rows with a particular value in a specified column. |
+| | [*rename_columns*](rename-columns-anchor) | Make columns names consistent across a dataset. |
+| | [*reorder_columns*](reorder-columns-anchor) | Make column order consistent across a dataset. |
+| **factor** | | |
+| | [*factor_column*](factor-column-anchor) | Extract factor vectors from a column of condition variables. |
+| | [*factor_hed_tags*](factor-hed-tags-anchor) | Extract factor vectors from search queries of HED annotations. |
+| | [*factor_hed_type*](factor-hed-type-anchor) | Extract design matrices and/or condition variables. |
+| **restructure** | | |
+| | [*merge_consecutive*](merge-consecutive-anchor) | Replace multiple consecutive events of the same type with one event of longer duration. |
+| | [*remap_columns*](remap-columns-anchor) | Create *m* columns from values in *n* columns (for recoding). |
+| | [*split_rows*](split-rows-anchor) | Split trial-encoded rows into multiple events. |
+| **summarization** | | |
+| | [*summarize_column_names*](summarize-column-names-anchor) | Summarize column names and order in the files. |
+| | [*summarize_column_values*](summarize-column-values-anchor) | Count the occurrences of the unique column values. |
+| | [*summarize_definitions*](summarize-definitions-anchor) | Summarize definitions used and report inconsistencies. |
+| | [*summarize_hed_tags*](summarize-hed-tags-anchor) | Summarize the HED tags present in the HED annotations for the dataset. |
+| | [*summarize_hed_type*](summarize-hed-type-anchor) | Summarize the detailed usage of a particular type tag such as *Condition-variable* or *Task* (used to automatically extract experimental designs). |
+| | [*summarize_hed_validation*](summarize-hed-validation-anchor) | Validate the data files and report any errors. |
+| | [*summarize_sidecar_from_events*](summarize-sidecar-from-events-anchor) | Generate a sidecar template from an event file. |
+````
+
+The **clean-up** operations are used at various phases of restructuring to assure consistency
+across files in the dataset.
+
+The **factor** operations produce column vectors of the same length as the number of rows in a file
+in order to encode condition variables, design matrices, or the results of other search criteria.
+See the
+[**HED conditions and design matrices**](./HedConditionsAndDesignMatrices.md)
+for more information on factoring and analysis.
+
+The **restructure** operations modify the way that files represent information.
+
+The **summarization** operations produce dataset-wide and individual file
+summaries of various aspects of the data.
+
+More detailed information about the remodeling operations can be found in
+the [**HED remodeling tools**](hed-remodeling-tools-anchor) guide.
+
+(the-remodeling-process-anchor)=
+## The remodeling process
+
+Remodeling consists of applying a list of operations to a tabular file
+to restructure or modify the file in some way.
+The following diagram shows a schematic of the remodeling process.
+
+![Event remodeling process](./_static/images/EventRemappingProcess.png)
+
+Initially, the user creates a backup of the selected files.
+This backup process is performed only once, and the results are
+stored in the `derivatives/remodel/backups` subdirectory of the dataset.
+
+Restructuring applies a sequence of remodeling operations given in a JSON remodeling
+file to produce a final result.
+By convention, we give these remodeling instruction files names ending in `_rmdl.json`
+and store them in the `derivatives/remodel/remodeling_files` directory relative
+to the dataset root directory.
+
+The restructuring always proceeds by looking up each data file in the backup
+and applying the transformations to the backup copy before overwriting the working (non-backup) version.
+
+The remodeling file provides a record of the operations performed on the file,
+starting from the original file.
+If the user detects a mistake in the transformation instructions,
+the JSON remodeling file can be corrected and the remodeling rerun.
+
+Usually, users will use the default backup, run the backup request once, and
+work from the original backup.
+However, users may also elect to create a named backup, use the backup
+as a checkpoint mechanism, and develop scripts that use the check-pointed versions as the starting point.
+This is useful if different versions of the events files are needed for different purposes.
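+For example, a named backup to serve as a checkpoint might be created as follows.
+This is an illustrative sketch using the command-line interface's function-call form;
+the backup name `checkpoint1` is hypothetical.
+
+````{admonition} Example of creating a named backup to use as a checkpoint.
+:class: tip
+
+```python
+import hed.tools.remodeling.cli.run_remodel_backup as cli_backup
+
+data_root = '/data/ds002790'
+# The -bn option gives the backup a name other than the default.
+cli_backup.main([data_root, '-bn', 'checkpoint1', '-x', 'derivatives'])
+```
+````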
+
+
+(json-remodeling-files-anchor)=
+## JSON remodeling files
+
+The operations to restructure a tabular file are stored in a remodel file in JSON format.
+The file consists of a list of JSON dictionaries.
+
+(basic-remodel-operation-syntax-anchor)=
+### Basic remodel operation syntax
+
+Each dictionary specifies an operation, a description of the purpose, and the operation parameters.
+The basic syntax of a remodeler operation is illustrated in the following example which renames
+the *trial_type* column to *event_type*.
+
+
+````{admonition} Example of a remodeler operation.
+:class: tip
+
+```json
+{
+ "operation": "rename_columns",
+ "description": "Rename a trial type column to more specific event_type",
+ "parameters": {
+ "column_mapping": {
+ "trial_type": "event_type"
+ },
+ "ignore_missing": true
+ }
+}
+```
+````
+
+Each remodel operation has its own specific set of required parameters
+that can be found under [**HED remodeling tools**](./HedRemodelingTools).
+For *rename_columns*, the required parameters are *column_mapping* and *ignore_missing*.
+Some operations also have optional parameters.
+
+(applying-multiple-remodel-operations-anchor)=
+### Applying multiple remodel operations
+
+A remodel JSON file consists of a list of one or more remodel operations,
+each specified in a dictionary.
+These operations are performed by the remodeler in the order they appear in the file.
+In the example below, a summary is performed after renaming,
+so the result reflects the new column names.
+
+````{admonition} An example JSON remodeler file with multiple operations.
+:class: tip
+
+```json
+[
+ {
+ "operation": "rename_columns",
+ "description": "Rename a trial type column to more specific event_type.",
+ "parameters": {
+ "column_mapping": {
+ "trial_type": "event_type"
+ },
+ "ignore_missing": true
+ }
+ },
+ {
+ "operation": "summarize_column_names",
+ "description": "Get column names across files to find any missing columns.",
+ "parameters": {
+ "summary_name": "Columns after remodeling",
+ "summary_filename": "columns_after_remodel"
+ }
+ }
+]
+```
+````
+
+By stacking operations you can make several changes to a data file in a single run.
+This is important because the changes are always applied to a copy of the original backup:
+when planning new changes, keep in mind that you are always transforming
+a copy of the original backed up file, not a previously remodeled `.tsv`.
+
+(more-complex-remodeling-anchor)=
+### More complex remodeling
+
+This section discusses a complex example using the
+[**sub-0013_task-stopsignal_acq-seq_events.tsv**](./_static/data/sub-0013_task-stopsignal_acq-seq_events.tsv)
+events file of the AOMIC-PIOP2 dataset, available on [OpenNeuro](https://openneuro.org) as ds002790.
+Here is an excerpt of the event file.
+
+
+(sample-remodeling-events-file-anchor)=
+````{admonition} Excerpt from an event file from the stop-go task of AOMIC-PIOP2 (ds002790).
+| onset | duration | trial_type | stop_signal_delay | response_time | response_accuracy | response_hand | sex |
+| ----- | -------- | ---------- | ----------------- | ------------- | ----------------- | ------------- | --- |
+| 0.0776 | 0.5083 | go | n/a | 0.565 | correct | right | female |
+| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | 0.49 | correct | right | female |
+| 9.5856 | 0.5084 | go | n/a | 0.45 | correct | right | female |
+| 13.5939 | 0.5083 | succesful_stop | 0.2 | n/a | n/a | n/a | female |
+| 17.1021 | 0.5083 | unsuccesful_stop | 0.25 | 0.633 | correct | left | male |
+| 21.6103 | 0.5083 | go | n/a | 0.443 | correct | left | male |
+````
+
+This event file corresponds to a stop-signal experiment.
+Participants were presented with faces and had to decide the sex of the face
+by pressing a button with the left or right hand.
+However, if a stop signal occurred before this selection, the participant was to refrain from responding.
+
+The structure of this file corresponds to the [**BIDS**](https://bids.neuroimaging.io/) format
+for event files.
+The first column, which must be called `onset`, represents the time (in seconds)
+from the start of the recording to the temporal marker represented by that row in the file.
+In this case that temporal marker represents the presentation of a face image.
+
+Notice that the *stop_signal_delay* and *response_time* columns contain information
+about additional events (when a trial stop signal was presented and when the participant
+pushed a button).
+These events are encoded implicitly as offsets from the presentation of the go signal.
+Each row in the file encodes information for an entire trial rather than what occurred
+at a single temporal marker.
+This strategy is known as *trial-level* encoding.
+
+Our goal is to represent all the trial events (e.g., go signal, stop signal, and response)
+in separate rows of the event file using the *split_rows* restructuring operation.
+The following example shows the remodeling operations to perform the splitting.
+
+````{admonition} Example of split_rows operation for the AOMIC stop signal task.
+:class: tip
+
+```json
+[
+ {
+ "operation": "split_rows",
+ "description": "Split response event from trial event based on response_time column.",
+ "parameters": {
+ "anchor_column": "trial_type",
+ "new_events": {
+ "response": {
+ "onset_source": ["response_time"],
+ "duration": [0],
+ "copy_columns": ["response_accuracy", "response_hand"]
+ },
+ "stop_signal": {
+ "onset_source": ["stop_signal_delay"],
+ "duration": [0.5]
+ }
+ },
+ "remove_parent_row": false
+ }
+ }
+]
+```
+````
+
+The example uses the *split_rows* operation to convert this
+file from trial encoding to event encoding.
+In trial encoding each event marker (row in the event file) represents
+all the information in a single trial.
+Event markers such as the participant's response key-press are encoded implicitly
+as an offset from the stimulus presentation,
+while event encoding includes event markers for each individual event within the trial.
+
+The [**Split rows**](./HedRemodelingTools#split-rows)
+explanation under [**HED remodeling tools**](./HedRemodelingTools)
+shows the required parameters for the *split_rows* operation.
+The required parameters are *anchor_column*, *new_events*, and *remove_parent_row*.
+
+The *anchor_column* is the column in which we want to add new events corresponding to the stop signal and the response.
+In this case we are going to add events to an existing column: *trial_type*.
+The new events will be in new rows, and the existing rows will not be overwritten
+because *remove_parent_row* is false.
+(After splitting we may want to rename *trial_type* to *event_type* since the individual
+rows in the data file no longer represent trials, but individual events within the trial.)
+
+Next we specify how the new events are generated in the *new_events* dictionary.
+Each type of new event has a name, which is a key in the *new_events* dictionary.
+Each key is associated with a dictionary
+specifying the values of the following parameters.
+
+* *onset_source*
+* *duration*
+* *copy_columns*
+
+The *onset_source* is a list indicating how to calculate the onset for the new event
+relative to the onset of the anchor event.
+The list contains any combination of column names and numerical values,
+which are evaluated and added to the onset value of the row being split.
+Column names evaluate to that row's value in the corresponding column.
+
+In our example, the response time and stop signal delay are calculated relative to the trial's onset,
+so we only need to add the value from the respective column.
+Note that these new events do not exist for every trial.
+Rows where there was no stop signal have an `n/a` in the *stop_signal_delay* column.
+This is handled automatically: the remodeler does not create a new event
+when any item in the *onset_source* list is missing or `n/a`.
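+
+The following minimal sketch (hypothetical code, not the remodeler's actual implementation)
+illustrates how an *onset_source* list is evaluated for a single row:
+
+````{admonition} Illustrative sketch of onset_source evaluation for a single row.
+:class: tip
+
+```python
+def compute_onset(row, onset_source):
+    """Add the column values and numbers in onset_source to the row's onset."""
+    onset = float(row["onset"])
+    for item in onset_source:
+        # A column name evaluates to that row's value; numbers are used directly.
+        value = row[item] if isinstance(item, str) else item
+        if value == "n/a":
+            return None  # Missing value: no new event is created for this row.
+        onset += float(value)
+    return onset
+
+row = {"onset": 5.5774, "stop_signal_delay": 0.2, "response_time": 0.49}
+print(round(compute_onset(row, ["stop_signal_delay"]), 4))  # 5.7774
+print(compute_onset({"onset": 0.0776, "stop_signal_delay": "n/a"}, ["stop_signal_delay"]))  # None
+```
+````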
+
+The *duration* specifies the duration for the new events.
+The AOMIC data did not measure the durations of the button presses,
+so we set the duration of the response event to 0.
+The AOMIC data report indicates that the stop signal lasted 500 ms.
+
+The *copy_columns* is an optional parameter indicating which columns from the parent event should be copied to the
+newly-created event.
+We would like to transfer the *response_accuracy* and the *response_hand* information to the *response* event.
+Since no extra column values are to be transferred for *stop_signal*, columns other than *onset*, *duration*,
+and *trial_type* are filled with `n/a`.
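+
+As an illustration, splitting the second row of the event file excerpt above
+(the *unsuccesful_stop* trial with onset 5.5774) would add the following two rows,
+with onsets computed from that row and uncopied columns filled with `n/a`:
+
+````{admonition} Rows added by split_rows for the trial with onset 5.5774 (illustrative).
+| onset | duration | trial_type | stop_signal_delay | response_time | response_accuracy | response_hand | sex |
+| ----- | -------- | ---------- | ----------------- | ------------- | ----------------- | ------------- | --- |
+| 5.7774 | 0.5 | stop_signal | n/a | n/a | n/a | n/a | n/a |
+| 6.0674 | 0 | response | n/a | n/a | correct | right | n/a |
+````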
+
+
+The final remodeling file can be found at:
+[**finished json remodeler**](./_static/data/AOMIC_splitevents_rmdl.json)
+
+(remodeling-file-locations-anchor)=
+### Remodeling file locations
+
+The remodeling tools expect the full path for the JSON remodeling operation file to be given
+when the remodeling is executed.
+However, it is a good practice to include all remodeling files used with the dataset.
+The JSON remodeling operation files are usually located in the
+`derivatives/remodel/remodeling_files` subdirectory below the dataset root,
+and have file names ending in `_rmdl.json`.
+
+The backups are always in the `derivatives/remodel/backups` subdirectory under the dataset root.
+Summaries produced by the restructuring tools are located in `derivatives/remodel/summaries`.
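+
+A typical layout of these locations under the dataset root might look like the following
+(an illustrative sketch; the subject and file paths follow BIDS conventions):
+
+````{admonition} Illustrative layout of remodeling files under a dataset root.
+:class: tip
+
+```text
+ds002790/
+├── derivatives/
+│   └── remodel/
+│       ├── backups/
+│       ├── remodeling_files/
+│       │   └── AOMIC_splitevents_rmdl.json
+│       └── summaries/
+└── sub-0013/
+    └── func/
+        └── sub-0013_task-stopsignal_acq-seq_events.tsv
+```
+````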
+
+In the next section we will go over several ways to call the remodeler.
+
+(using-the-remodeling-tools-anchor)=
+## Using the remodeling tools
+
+The remodeler can be called in a number of ways including using online tools and from the command line.
+The following sections explain various ways to use the available tools.
+
+(online-tools-for-debugging-anchor)=
+### Online tools for debugging
+
+Although the event restructuring tools are designed to be run on an entire dataset,
+you should consider working with a single data file during debugging.
+The HED online tools provide support for debugging your remodeling script and for
+seeing the effect of remodeling on a single data file before running on the entire dataset.
+You can access these tools on the [**HED online tools server**](https://hedtools.ucsd.edu/hed).
+
+To use the online remodeling tools, navigate to the events page and select the *Execute remodel script* action.
+Browse to select the data file to be remodeled and the JSON remodel file
+containing the remodeling operations.
+The following screenshot shows these selections for the split rows example of the previous section.
+
+![Remodeling tools online](./_static/images/RemodelingOnline.png)
+
+Press the *Process* button to complete the action.
+If the remodeling script has errors,
+the result will be a downloaded text file with the errors identified.
+If the remodeling script is correct,
+the result will be a data file with the remodeling transformations applied.
+If the remodeling script contains summarization operations,
+the result will be a zip file with the modified data file and the summaries included.
+
+If you are using one of the remodeling operations that relies on HED tags,
+you will also need to upload a suitable JSON sidecar file containing the HED annotations for the data file.
+The same applies if you turn the *Include summaries* option on.
+
+(the-command-line-interface-anchor)=
+### The command-line interface
+
+After [**installing the remodeler**](./HedRemodelingTools#installing-the-remodel-tools),
+you can run the tools on a full BIDS dataset,
+or on any directory using the command-line interface using
+`run_remodel_backup`, `run_remodel`, and `run_remodel_restore`.
+A full overview of all arguments is available at
+[**HED remodeling tools**](./HedRemodelingTools#remodel-command-line-arguments).
+
+The `run_remodel_backup` program is usually run only once for a dataset.
+It makes the baseline backup of the event files to ensure that nothing is lost.
+The remodeling always starts from the backup files.
+
+The `run_remodel` program restores the data files from the corresponding backup files and then
+executes remodeling operations from a JSON file.
+A sample command line call for `run_remodel` is shown in the following example.
+
+(remodel-run-anchor)=
+````{admonition} Command to run a summary for the AOMIC dataset.
+:class: tip
+
+```bash
+python run_remodel /data/ds002790 /data/ds002790/derivatives/remodel/remodeling_files/AOMIC_summarize_rmdl.json \
+ -b -s .txt -x derivatives
+
+```
+````
+
+The parameters are as follows:
+
+* `data_dir` - (Required first argument) Root directory of the dataset.
+* `model_path` - (Required second argument) Path of JSON file with remodeling operations.
+* `-b` - (Optional) If present, assume BIDS formatted data.
+* `-s` - (Optional) List of formats in which to save summaries.
+* `-x` - (Optional) List of directories to exclude from event processing.
+
+There are three types of command line arguments:
+
+[**Positional arguments**](./HedRemodelingTools#positional-arguments),
+[**Named arguments**](./HedRemodelingTools#named-arguments),
+and [**Named arguments with values**](./HedRemodelingTools#named-arguments).
+
+The positional arguments, `data_dir` and `model_path`, are not optional and must
+be the first and second arguments to `run_remodel`, respectively.
+The named arguments (with and without values) are optional.
+They all have default values if omitted.
+
+The `-b` option is a named argument indicating whether the dataset is in BIDS format.
+If in BIDS format, the remodeling tools can extract information such as the HED schema
+and the HED annotations from the dataset.
+BIDS data file names are unique, which is convenient for reporting summary information.
+Named arguments without values are flags: their presence indicates true, and their absence indicates false.
+
+The `-s` and `-x` options are examples of named arguments with values.
+The `-s .txt` specifies that summaries should be saved in text format.
+The `-x derivatives` indicates that the `derivatives` subdirectory should not
+be processed during remodeling.
+
+This script can be run multiple times without redoing backups or restores,
+since remodeling always starts from the backed up files.
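+
+The same run can also be launched from a Jupyter notebook or another Python program
+by passing an equivalent argument list to the `main` function of the command-line interface.
+The following sketch mirrors the command line call above:
+
+````{admonition} Python code equivalent to the AOMIC summary command line call.
+:class: tip
+
+```python
+import hed.tools.remodeling.cli.run_remodel as cli_remodel
+
+data_root = '/data/ds002790'
+model_path = '/data/ds002790/derivatives/remodel/remodeling_files/AOMIC_summarize_rmdl.json'
+# -b: BIDS-formatted data; -s .txt: save summaries as text; -x: skip derivatives.
+cli_remodel.main([data_root, model_path, '-b', '-s', '.txt', '-x', 'derivatives'])
+```
+````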
+
+After `run_remodel` finishes, it overwrites the data files (not the backups)
+and writes any requested summaries in `derivatives/remodel/summaries`.
+
+In this example, the summaries are written to the `/data/ds002790/derivatives/remodel/summaries` folder in text format.
+By default, the summary operations save summaries in both text and JSON formats,
+but the `-s .txt` option restricts the output to text.
+
+The [**summary file**](./_static/data/AOMIC_column_names_2022_12_21_T_13_12_35_641062.txt)
+lists all the different column combinations and, for each combination, the files containing those columns.
+Looking at the different column combinations, you can see there are three,
+one for each task that was performed for this dataset.
+
+Going back to the [**split rows example**](#more-complex-remodeling) of remodeling,
+we see that splitting the rows into multiple rows only makes sense if the event files have the same columns.
+Only the event files for the stop signal task contain the `stop_signal_delay` column and the `response_time` column.
+Summarizing the column names across the dataset allows users to check whether the column
+names are consistent across the files.
+A common use case for BIDS datasets is that the event files have a different structure
+for different tasks.
+The `-t` command-line option allows users to specify which tasks to perform remodeling on,
+selecting only the files that have the specified task names
+in their filenames.
+
+
+Now you can try out the *split_rows* operation on the full dataset, as sketched below!
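+
+The following sketch runs the remodel file developed in the split rows example,
+restricted to the stop signal task:
+
+````{admonition} Sketch of running the split_rows remodel file on only the stop signal task.
+:class: tip
+
+```python
+import hed.tools.remodeling.cli.run_remodel as cli_remodel
+
+data_root = '/data/ds002790'
+model_path = '/data/ds002790/derivatives/remodel/remodeling_files/AOMIC_splitevents_rmdl.json'
+# -t stopsignal restricts processing to files with task-stopsignal in their names.
+cli_remodel.main([data_root, model_path, '-b', '-x', 'derivatives', '-t', 'stopsignal'])
+```
+````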
+
+
+(jupyter-notebooks-for-remodeling-anchor)=
+### Jupyter notebooks for remodeling
+
+Three Jupyter remodeling notebooks are available at
+[**Jupyter notebooks for remodeling**](https://github.com/hed-standard/hed-examples/tree/main/src/jupyter_notebooks/remodeling).
+
+These notebooks are wrappers that create the backup as well as run restructuring operations on data files.
+If you do not have access to a Jupyter notebook facility, the article
+[Six easy ways to run your Jupyter Notebook in the cloud](https://www.dataschool.io/cloud-services-for-jupyter-notebook/) discusses various no-cost options
+for running Jupyter notebooks online.
diff --git a/docs/source/FileRemodelingTools.md b/docs/source/HedRemodelingTools.md
similarity index 97%
rename from docs/source/FileRemodelingTools.md
rename to docs/source/HedRemodelingTools.md
index 65216bf..ce3decb 100644
--- a/docs/source/FileRemodelingTools.md
+++ b/docs/source/HedRemodelingTools.md
@@ -1,2796 +1,2796 @@
-(file-remodeling-tools-anchor)=
-# File remodeling tools
-
-**Remodeling** refers to the process of transforming a tabular file
-into a different form in order to disambiguate the
-information or to facilitate a particular analysis.
-The remodeling operations are specified in a JSON (`.json`) file,
-giving a record of the transformations performed.
-
-There are two types of remodeling operations: **transformation** and **summarization**.
-The **transformation** operations modify the tabular files,
-while **summarization** produces an auxiliary information file but leaves
-the tabular files unchanged.
-
-The file remodeling tools can be applied to any tab-separated value (`.tsv`) file
-but are particularly useful for restructuring files representing experimental events.
-Please read the [**File remodeling quickstart**](./FileRemodelingQuickstart.md)
-tutorials for an introduction and basic use of the tools.
-
-The file remodeling tools can be applied to individual files using the
-[**HED online tools**](https://hedtools.ucsd.edu/hed) or to entire datasets
-using the [**remodel command-line interface**](remodel-command-line-interface-anchor)
-either by calling Python scripts directly from the command line
-or by embedding calls in a Jupyter notebook.
-The tools are also available as
-[**HED RESTful services**](./HedOnlineTools.md#hed-restful-services).
-The online tools are particularly useful for debugging.
-
-This user's guide contains the following topics:
-
-* [**Overview of remodeling**](overview-of-remodeling-anchor)
-* [**Installing the remodel tools**](installing-the-remodel-tools-anchor)
-* [**Remodel command-line interface**](remodel-command-line-interface-anchor)
-* [**Remodel scripts**](remodel-scripts-anchor)
- * [**Backing up files**](backing-up-files-anchor)
- * [**Remodeling files**](remodeling-files-anchor)
- * [**Restoring files**](restoring-files-anchor)
-* [**Remodel with HED**](remodel-with-hed-anchor)
-* [**Remodel sample files**](remodel-sample-files-anchor)
- * [**Sample remodel file**](sample-remodel-file-anchor)
- * [**Sample remodel event file**](sample-remodel-event-file-anchor)
- * [**Sample remodel sidecar file**](sample-remodel-sidecar-file-anchor)
-* [**Remodel transformations**](remodel-transformations-anchor)
- * [**Factor column**](factor-column-anchor)
- * [**Factor HED tags**](factor-hed-tags-anchor)
- * [**Factor HED type**](factor-hed-type-anchor)
- * [**Merge consecutive**](merge-consecutive-anchor)
- * [**Remap columns**](remap-columns-anchor)
- * [**Remove columns**](remove-columns-anchor)
- * [**Remove rows**](remove-rows-anchor)
- * [**Rename columns**](rename-columns-anchor)
- * [**Reorder columns**](reorder-columns-anchor)
- * [**Split rows**](split-rows-anchor)
-* [**Remodel summarizations**](remodel-summarizations-anchor)
- * [**Summarize column names**](summarize-column-names-anchor)
- * [**Summarize column values**](summarize-column-values-anchor)
- * [**Summarize definitions**](summarize-definitions-anchor)
- * [**Summarize hed tags**](summarize-hed-tags-anchor)
- * [**Summarize hed type**](summarize-hed-type-anchor)
- * [**Summarize hed validation**](summarize-hed-validation-anchor)
- * [**Summarize sidecar from events**](summarize-sidecar-from-events-anchor)
-* [**Remodel implementation**](remodel-implementation-anchor)
-
-
-(overview-of-remodeling-anchor)=
-## Overview of remodeling
-
-Remodeling consists of restructuring and/or extracting information from tab-separated
-value files based on a specified list of operations contained in a JSON file.
-
-Internally, the remodeling operations represent the tabular file using a
-[**Pandas DataFrame**](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html).
-
-(transformation-operations-anchor)=
-### Transformation operations
-
-**Transformation** operations, shown schematically in the following
-figure, are designed to transform an incoming tabular file
-into a new DataFrame without modifying the incoming data.
-
-![Transformation operations](./_static/images/TransformationOperations.png)
-
-Transformation operations are stateless and do not save any context information or
-affect future applications of the transformation.
-
-Transformations, themselves, do not have any output and just return a new,
-transformed DataFrame.
-In other words, transformations do not operate in place on the incoming DataFrame,
-but rather, they create a new DataFrame containing the result.
-
-Typically, the calling program is responsible for reading and saving the tabular file,
-so the user can choose whether to overwrite or create a new file.
-
-See the [**remodeling tool program interface**](remodel-command-line-interface-anchor)
-section for information on how to call the operations.
-
-(summarization-operations-anchor)=
-### Summarization operations
-
-**Summarization** operations do not modify the input DataFrame but rather extract and save information in an internally stored summary dictionary as shown schematically in the following figure.
-
-![Summary operations](./_static/images/SummaryOperation.png)
-
-The dispatcher that executes remodeling operations can be interrogated at any time
-for the state information contained in the global summary dictionary and can save additional summary information at any time during execution.
-Usually summaries are dumped at the end of processing to the `derivatives/remodel/summaries`
-subdirectory under the dataset root.
-
-Summarization operations may appear anywhere in the operation list,
-and the same type of summary may appear multiple times under different names in order to track progress.
-
-The dispatcher stores information from each uniquely named summarization operation
-as a separate summary dictionary entry.
-Within its summary information, most summarization operations keep a separate
-summary for each individual file and have methods to create an overall summary
-of the information for all the files that have been processed by the summarization.
-
-Summarization results are available in JSON (`.json`) and text (`.txt`) formats.
-
-(available-operations-anchor)=
-### Available operations
-
-The following table lists the available remodeling operations with brief example use cases
-and links to further documentation. Operations not listed in the summarize section are transformations.
-
-(remodel-operation-summary-anchor)=
-````{table} Summary of the HED remodeling operations for tabular files.
-| Category | Operation | Example use case |
-| -------- | ------- | -----|
-| **clean-up** | | |
-| | [*remove_columns*](remove-columns-anchor) | Remove temporary columns created during restructuring. |
-| | [*remove_rows*](remove-rows-anchor) | Remove rows with n/a values in a specified column. |
-| | [*rename_columns*](rename-columns-anchor) | Make column names consistent across a dataset. |
-| | [*reorder_columns*](reorder-columns-anchor) | Make column order consistent across a dataset. |
-| **factor** | | |
-| | [*factor_column*](factor-column-anchor) | Extract factor vectors from a column of condition variables. |
-| | [*factor_hed_tags*](factor-hed-tags-anchor) | Extract factor vectors from search queries of HED annotations. |
-| | [*factor_hed_type*](factor-hed-type-anchor) | Extract design matrices and/or condition variables. |
-| **restructure** | | |
-| | [*merge_consecutive*](merge-consecutive-anchor) | Replace multiple consecutive events of the same type with one event of longer duration. |
-| | [*remap_columns*](remap-columns-anchor) | Create m columns from values in n columns (for recoding). |
-| | [*split_rows*](split-rows-anchor) | Split trial-encoded rows into multiple events. |
-| **summarize** | | |
-| | [*summarize_column_names*](summarize-column-names-anchor) | Summarize column names and order in the files. |
-| | [*summarize_column_values*](summarize-column-values-anchor) | Count the occurrences of the unique column values. |
-| | [*summarize_hed_tags*](summarize-hed-tags-anchor) | Summarize the HED tags present in the HED annotations for the dataset. |
-| | [*summarize_hed_type*](summarize-hed-type-anchor) | Summarize the detailed usage of a particular type tag such as *Condition-variable* or *Task* (used to automatically extract experimental designs). |
-| | [*summarize_hed_validation*](summarize-hed-validation-anchor) | Validate the data files and report any errors. |
-| | [*summarize_sidecar_from_events*](summarize-sidecar-from-events-anchor) | Generate a sidecar template from an event file. |
-````
-
-The **clean-up** operations are used at various phases of restructuring to assure consistency
-across dataset files.
-
-The **factor** operations produce column vectors with the same number of rows as the data file
-from which they were calculated.
-They encode condition variables, design matrices, or other search criteria.
-See the [**HED conditions and design matrices**](./HedConditionsAndDesignMatrices.md)
-for more information on factoring and analysis.
-
-The **restructure** operations modify the way in which a data file represents its information.
-
-The **summarize** operations produce dataset-wide summaries of various aspects of the data files
-as well as summaries of the individual files.
-
-(installing-the-remodel-tools-anchor)=
-## Installing the remodel tools
-
-The remodeling tools are available in the GitHub
-[**hed-python**](https://github.com/hed-standard/hed-python) repository
-along with other tools for data cleaning and curation.
-Although version 0.1.0 of this repository is available on [**PyPI**](https://pypi.org/)
-as `hedtools`, the version containing the restructuring tools (Version 0.2.0)
-is still under development and has not been officially released.
-However, the code is publicly available on the `develop` branch of the
-hed-python repository and
-can be directly installed from GitHub using `pip`:
-
-```text
-pip install git+https://github.com/hed-standard/hed-python/@develop
-```
-
-The web services and online tools supporting remodeling are available
-on the [**HED online tools dev server**](https://hedtools.ucsd.edu/hed_dev).
-When version 0.2.0 of `hedtools` is officially released on PyPI, restructuring
-will become available on the released [**HED online tools**](https://hedtools.ucsd.edu/hed).
-A docker version is also under development.
-
-The following diagram shows a schematic of the remodeling process.
-
-![Event remodeling process](./_static/images/EventRemappingProcess.png)
-
-Initially, the user creates a backup of the specified tabular files (usually `events.tsv` files).
-This backup is a mirror of the data files in the dataset,
-but is located in the `derivatives/remodel/backups` directory and never modified once the backup is created.
-
-Remodeling applies a sequence of operations specified in a JSON remodel file
-to the backup versions of the data files.
-The JSON remodel file provides a record of the operations performed on the file.
-If the user detects a mistake in the transformations,
-he/she can correct the transformation file and rerun the transformations.
-
-Remodeling always runs on the original backup version of the file rather than
-the transformed version, so the transformations can always be corrected and rerun.
-It is possible to bypass the backup, particularly if only using summarization operations,
-but this is not recommended and should be done with care.
-
-(remodel-command-line-interface-anchor)=
-## Remodel command-line interface
-
-The remodeling toolbox provides Python scripts with command-line interfaces
-to create or restore backups and to apply operations to the files in a dataset.
-The file remodeling tools may be applied to datasets that are in free form under a directory root
-or that are in [**BIDS-format**](https://bids.neuroimaging.io/).
-Some operations use [**HED (Hierarchical Event Descriptors)**](./IntroductionToHed.md) annotations.
-See the [**Remodel with HED**](remodel-with-hed-anchor) section for a discussion
-of these operations and how to use them.
-
-The remodeling command-line interface can be used from the command line,
-called from another Python program, or used in a Jupyter notebook.
-Example Jupyter notebooks using the remodeling commands can be found
-[**here**](https://github.com/hed-standard/hed-examples/tree/main/src/jupyter_notebooks/remodeling).
-
-
-(calling-remodel-tools-anchor)=
-### Calling remodel tools
-
-The remodeling tools provide three Python programs for backup (`run_remodel_backup`),
-remodeling (`run_remodel`) and restoring (`run_remodel_restore`) event files.
-These programs can be called from the command line or from another Python program.
-
-The programs use a standard command-line argument list for specifying input as summarized in the following table.
-
-(remodeling-operation-summary-anchor)=
-````{table} Summary of command-line arguments for the remodeling programs.
-| Script name | Arguments | Purpose |
-| ----------- | -------- | ------- |
-|*run_remodel_backup* | *data_dir*<br/>*-bd -\\-backup-dir*<br/>*-bn -\\-backup-name*<br/>*-e -\\-extensions*<br/>*-f -\\-file-suffix*<br/>*-t -\\-task-names*<br/>*-v -\\-verbose*<br/>*-x -\\-exclude-dirs* | Create a backup of event files. |
-|*run_remodel* | *data_dir*<br/>*model_path*<br/>*-b -\\-bids-format*<br/>*-bd -\\-backup-dir*<br/>*-bn -\\-backup-name*<br/>*-e -\\-extensions*<br/>*-f -\\-file-suffix*<br/>*-i -\\-individual-summaries*<br/>*-j -\\-json-sidecar*<br/>*-ld -\\-log-dir*<br/>*-nb -\\-no-backup*<br/>*-ns -\\-no-summaries*<br/>*-nu -\\-no-update*<br/>*-r -\\-hed-version*<br/>*-s -\\-save-formats*<br/>*-t -\\-task-names*<br/>*-v -\\-verbose*<br/>*-w -\\-work-dir*<br/>*-x -\\-exclude-dirs* | Restructure or summarize the event files. |
-|*run_remodel_restore* | *data_dir*<br/>*-bd -\\-backup-dir*<br/>*-bn -\\-backup-name*<br/>*-t -\\-task-names*<br/>*-v -\\-verbose* | Restore a backup of event files. |
-
-````
-All the scripts have a required argument, which is the full path of the dataset root (*data_dir*).
-The `run_remodel` program has a required parameter which is the full path of a JSON file
-containing a specification of the remodeling commands to be run.
-
-(remodel-command-line-arguments-anchor)=
-### Remodel command-line arguments
-
-This section describes the arguments that are used for the remodeling command-line interface
-with examples and more details.
-
-#### Positional arguments
-
-Positional arguments are required and must be given in the order specified.
-
-`data_dir`
-> The full path of dataset root directory.
-
-`model_path`
-> The full path of the JSON remodel file (for *run_remodel* only).
->
-#### Named arguments
-
-Named arguments consist of a key starting with a hyphen and are possibly followed by a value.
-Named arguments can be given in any order or omitted.
-If omitted, a specified default is used.
-Argument keys and values are separated by spaces.
-
-For argument values that are lists, the key is given followed by the items in the list,
-all separated by spaces.
-
-Each command has two different forms of the key name: a short form (a single hyphen followed by a single character)
-and a longer form (two hyphens followed by a more self-explanatory name).
-Users are free to use either form.
-
-`-b`, `--bids-format`
-> If this flag is present, the dataset is in BIDS format with sidecars. Tabular files and their associated sidecars are located using BIDS naming.
-
-`-bd`, `--backup-dir`
-> The path to the directory holding the backups (default: `[data_root]/derivatives/remodel/backups`).
-> Use the `-nb` option if you wish to omit the backup (in `run_remodel`).
-
-`-bn`, `--backup-name`
-> The name of the backup used for the remodeling (default: `default_back`).
-
-`-e`, `--extensions`
-> This option is followed by a list of file extension(s) of the data files to process.
-> The default is `.tsv`. Comma separated tabular files are not permitted.
-
-`-f`, `--file-suffix`
-> This option is followed by the suffix names of the files to be processed.
-> For example `events` (the default) captures files named `events.tsv` if the default extension is used.
-> The filename without the extension must end in one of the specified suffixes in order to be
-> backed up or transformed.
-
-`-i`, `--individual-summaries`
-> This option offers a choice among three options:
-> - `separate`: Individual summaries for each file in separate files in addition to the overall summary.
-> - `consolidated`: Individual summaries written in the same file as the overall summary.
-> - `none`: Only an overall summary.
-
-`-j`, `--json-sidecar`
-> This option is followed by the full path of the JSON sidecar with HED annotations to be
-> applied during the processing of HED-related remodeling operations.
-
-`-ld`, `--log-dir`
-> This option is followed by the full path of a directory for writing log files.
-> A log file is written if the remodeling tools raise an exception and the program terminates.
-> Note that a log file is not written for issues gathered during operations such as `summarize_hed_validation`
-> because reporting HED validation errors is a normal part of this operation.
-> On the other hand, errors in the JSON remodeling file do raise an exception and are reported in the log.
-
-`-nb`, `--no-backup`
-> If present, no backup is used. Rather operations are performed directly on the files.
-
-`-ns`, `--no-summaries`
-> If present, no summary files are output.
-
-`-nu`, `--no-update`
-> If present, the modified files are not output.
-
-`-r`, `--hed-versions`
-> This option is followed by one or more HED versions. Versions of the standard schema are specified
-> by their semantic versions (e.g., `8.1.0`), while library schema versions are prefixed by their
-> library name (e.g., `score_1.0.0`).
-
-> If more than one HED schema version is given, all but one of the versions must start with an
-> additional namespace designator (e.g., `sc:`). At most one version can omit the namespace designator
-> when multiple schema are being used. In annotations, tags must start with the namespace
-> designator of the corresponding schema from which they were selected (e.g. `sc:Sleep-modulator`
-> if the SCORE library was designated by `sc:score_1.0.0`).
-
-`-s`, `--save-formats`
-> This option is followed by the extensions (including .) of the formats in which
-> to save summaries (default: `.txt` `.json`).
-
-`-t`, `--task-names`
-> The name(s) of the tasks to be included (for BIDS-formatted files only).
-> When a dataset includes multiple tasks, the event files are often structured
-> differently for each task and thus require different transformation files.
-> This option allows the backups and operations to be restricted to an individual task.
-
-> If this option is omitted, all tasks are used. This means that all `events.tsv` files are
-> restored from a backup if the backup is used, the operations are performed on all `events.tsv` files, and summaries are combined over all tasks.
-
-> If a list of specific task names follows this option, only datafiles corresponding to
-> the listed tasks are processed giving separate summaries for each listed task.
-
-> If a "*" follows this option, all event files are processed and separate summaries are created for each task.
-
-> Task detection follows the BIDS convention. Tasks are detected by finding "task-x" in the file names of `events.tsv` files, where x is the name of the task. The task name is followed by an underscore or a period, or appears at the end of the filename.
-
-`-v`, `--verbose`
-> If present, more comprehensive messages documenting transformation progress
-> are printed to standard output.
-
-`-w`, `--work-dir`
-> The path to the remodel work root directory for both backups and summaries (default: `[data_root]/derivatives/remodel`).
-> Use the `-nb` option if you wish to omit the backup (in `run_remodel`).
-
-`-x`, `--exclude-dirs`
-> The directories to exclude when gathering the data files to process.
-> For BIDS datasets, these are typically `derivatives`, `stimuli`, and `sourcecode`.
-> Any subdirectory with a path component named `remodel` is automatically excluded from remodeling, as
-> these directories are reserved for storing backup, state, and result information for the remodeling process itself.
-
-(remodel-scripts-anchor)=
-## Remodel scripts
-
-This section discusses the three main remodeling scripts with command-line interfaces
-to support backup, remodeling, and restoring the tabular files used in the remodeling process.
-These scripts can be run from the command line or from another Python program using a function call.
-
-(backing-up-files-anchor)=
-### Backing up files
-
-The `run_remodel_backup` Python program creates a backup of the specified files.
-The backup is always created in the `derivatives/remodel/backups` subdirectory
-under the dataset root as shown in the following example for the
-sample dataset `eeg_ds003645s_hed_remodel`,
-which can be found in the `datasets` subdirectory of the
-[**hed-examples**](https://github.com/hed-standard/hed-examples) GitHub repository.
-
-![Remodeling backup structure](./_static/images/RemodelingBackupStructure.png)
-
-
-The backup process creates a mirror of the directory structure of the source files to be backed up
-in the directory `derivatives/remodel/backups/backup_name/backup_root` as shown in the figure above.
-The default backup name is `default_back`.
-
-In the above example, the backup has subdirectories `sub-002` and `sub-003` just
-like the main directory of the dataset.
-These subdirectories only contain backups of the files to be transformed
-(by default files with names ending in `events.tsv`).
-
-In addition to the `backup_root`, the backup directory also contains a dictionary of backup files
-in the `backup_lock.json` file. This dictionary is used internally by the remodeling tools.
-The backup should be created once and not modified by the user.
-
-The following example shows how to run the `run_remodel_backup` program from the command line
-to back up the dataset located at `/datasets/eeg_ds003645s_hed_remodel`.
-
-(remodel-backup-anchor)=
-````{admonition} Example of calling run_remodel_backup from the command line.
-:class: tip
-
-```bash
-python run_remodel_backup /datasets/eeg_ds003645s_hed_remodel -x derivatives stimuli
-
-```
-````
-
-Since the `-f` and `-e` arguments are not given, the default file suffix and extension values
-apply, so only files of the form `events.tsv` are backed up.
-The `-x` option excludes any source files from the `derivatives` and `stimuli` subdirectories.
-These choices can be overridden using additional command-line arguments.
-
-The following shows how the `run_remodel_backup` program can be called from a
-Python program or a Jupyter notebook.
-The command-line arguments are given in a list instead of on the command line.
-
-(remodel-backup-jupyter-anchor)=
-````{admonition} Example of Python code to call run_remodel_backup using a function call.
-:class: tip
-
-```python
-
-import hed.tools.remodeling.cli.run_remodel_backup as cli_backup
-
-data_root = '/datasets/eeg_ds003645s_hed_remodel'
-arg_list = [data_root, '-x', 'derivatives', 'stimuli']
-cli_backup.main(arg_list)
-
-```
-````
-
-During remodeling, each file in the source is associated with a backup file using
-its relative path from the dataset root.
-Remodeling is performed by reading the backup file, performing the operations specified in the
-JSON remodel file, and overwriting the source file as needed.
-
-Users can also create alternatively named backups by providing the `-bn` argument with a backup name to
-the `run_remodel_backup` program.
-To use backup files from another named backup, call the remodeling program with
-the `-bn` argument and the correct backup name.
-Named backups can provide checkpoints to allow the execution of
-transformations to start from intermediate points.
-
-**NOTE**: You should not delete backups, even if you have created multiple named backups.
-The backups provide useful state and provenance information about the data.
-
-(remodeling-files-anchor)=
-### Remodeling files
-
-Remodeling consists of applying a sequence of operations from the
-[**remodel operation summary**](remodel-operation-summary-anchor)
-to successively transform each backup file according to the instructions
-and to overwrite the actual files with the final result.
-
-If the dataset has no backups, the actual data files rather than the backups are transformed.
-You are expected to [**create the backup**](backing-up-files-anchor) (just once)
-before running the remodeling operations.
-Proceeding without a backup is not recommended unless you are only doing summarization operations.
-
-The operations are specified as a list of dictionaries in a JSON file in the
-[**remodel sample files**](remodel-sample-files-anchor) as discussed below.
-
-Before running remodeling transformations on an entire dataset,
-consider using the [**HED online tools**](https://hedtools.ucsd.edu/hed)
-to debug your remodeling operation file on a single file.
-The remodeling process always starts with the original backup files,
-so the usual development path is to incrementally add operations to the end
-of your transformation JSON file as you develop and test on a single file
-until you have the desired end result.
-
-The following example shows how to run a remodeling script from the command line.
-The example assumes that the backup has already been created for the dataset.
-
-(run-remodel-anchor)=
-````{admonition} Example of calling run_remodel from the command line.
-:class: tip
-
-```bash
-python run_remodel /datasets/eeg_ds003645s_hed_remodel /datasets/remove_extra_rmdl.json -x derivatives stimuli
-
-```
-````
-
-The script has two required arguments: the dataset root and the path to the JSON remodel file.
-Usually, the JSON remodel files are stored with the dataset itself in the
-`derivatives/remodel/remodeling_files` subdirectory, but common scripts can be stored in a central place elsewhere.
-
-The additional keyword option `-x` in the example indicates that directory paths containing the component `derivatives` or the component `stimuli` should be excluded.
-Excluded directories need not have their excluded path component at the top level of the dataset.
-Subdirectory paths containing the `remodel` path component are automatically excluded.
-
-The command-line interface can also be used in a Jupyter notebook or as part of a larger Python
-program by calling the `main` function with the equivalent command-line arguments provided
-in a list with the positional arguments appearing first.
-
-The following example shows Python code to remodel a dataset using the command-line interface.
-This code can be used in a Jupyter notebook or in another Python program.
-
-````{admonition} Example Python code to call run_remodel using a function call.
-:class: tip
-
-```python
-import hed.tools.remodeling.cli.run_remodel as cli_remodel
-
-data_root = '/datasets/eeg_ds003645s_hed_remodel'
-model_path = '/datasets/remove_extra_rmdl.json'
-arg_list = [data_root, model_path, '-x', 'derivatives', 'stimuli']
-cli_remodel.main(arg_list)
-
-```
-````
-
-(restoring-files-anchor)=
-### Restoring files
-
-Since remodeling always uses the backed up version of each data file,
-there is no need to restore these files to their original state
-between remodeling runs.
-However, when finished with an analysis,
-you may want to restore the data files to their original state.
-
-The following example shows how to call `run_remodel_restore` to
-restore the data files from the default backup.
-The restore operation restores all the files in the specified backup.
-
-(run-remodel-restore-anchor)=
-````{admonition} Example of calling run_remodel_restore from the command line.
-:class: tip
-
-```bash
-python run_remodel_restore /datasets/eeg_ds003645s_hed_remodel
-
-```
-````
-
-As with the other command-line programs, `run_remodel_restore` can also be called using a function call.
-
-````{admonition} Example Python code to call *run_remodel_restore* using a function call.
-:class: tip
-
-```python
-import hed.tools.remodeling.cli.run_remodel_restore as cli_restore
-
-data_root = '/datasets/eeg_ds003645s_hed_remodel'
-cli_restore.main([data_root])
-
-```
-````
-(remodel-with-hed-anchor)=
-## Remodel with HED
-
-[**HED**](introduction-to-hed-anchor) (Hierarchical Event Descriptors) is a
-system for annotating data in a manner that is both human-understandable and machine-actionable.
-HED provides much more detail about the events and their meanings.
-If you are new to HED, see the
-[**HED annotation quickstart**](./HedAnnotationQuickstart.md).
-For information about HED's integration into BIDS (Brain Imaging Data Structure) see
-[**BIDS annotation quickstart**](./BidsAnnotationQuickstart.md).
-
-Currently, five remodeling operations rely on HED annotations:
-- [**factor_hed_tags**](factor-hed-tags-anchor)
-- [**factor_hed_type**](factor-hed-type-anchor)
-- [**summarize_hed_tags**](summarize-hed-tags-anchor)
-- [**summarize_hed_type**](summarize-hed-type-anchor)
-- [**summarize_hed_validation**](summarize-hed-validation-anchor).
-
-HED tags provide a mechanism for advanced data analysis and for
-extracting experiment-specific information from the data files.
-However, since HED information is not always stored in the data files themselves,
-you may need to provide a HED schema and a JSON sidecar.
-
-The HED schema defines the allowed HED tag vocabulary, and the JSON sidecar
-associates HED annotations with the information in the columns of the event files.
-If you are not using any of the HED operations in your remodeling,
-you do not have to provide this information.
-
-
-(extracting-hed-information-from-bids-anchor)=
-### Extracting HED information from BIDS
-
-The simplest way to use HED with `run_remodel` is to use the `-b` option,
-which indicates that the dataset is in [**BIDS**](https://bids.neuroimaging.io/) (Brain Imaging Data Structure) format.
-
-BIDS is a standardized way of organizing neuroimaging data.
-HED and BIDS are well integrated.
-If you are new to BIDS, see the
-[**BIDS annotation quickstart**](./BidsAnnotationQuickstart.md).
-
-A HED-annotated BIDS dataset provides the HED schema version in the `dataset_description.json`
-file located directly under the BIDS dataset root.
-
-BIDS datasets must have filenames in a specific format,
-and the HED tools can locate the relevant JSON sidecars for each data file based on this information.
-
-
-(directly-specifying-hed-information-anchor)=
-### Directly specifying HED information
-
-If your data is already in BIDS format, using the `-b` option is ideal since
-the needed information can be located automatically.
-However, early in the experimental process,
-your datafiles are not likely to be organized in BIDS format,
-and this option will not be available if you want to use HED.
-
-Without the `-b` option, the remodeling tools locate the appropriate files based
-on specified filename suffixes and extensions.
-In order to use HED operations, you must explicitly specify the HED versions
-using the `-r` option.
-The `-r` option supports a list of HED versions if multiple HED schemas are used.
-For example: `-r 8.1.0 sc:score_1.0.0` specifies that vocabulary will be drawn
-from standard HED Version 8.1.0 and from
-HED SCORE library version 1.0.0.
-Annotations containing tags from SCORE should be prefixed with `sc:`.
-Note: both of the schemas can be viewed by the [**HED Schema Viewer**](https://www.hedtags.org/display_hed.html).
-
-Usually, annotators will consolidate HED annotations in a single JSON sidecar file
-located at the top-level of the dataset.
-The path of this sidecar can be passed as a command-line argument using the `-j` option.
-If more than one JSON sidecar file contains HED annotations, users will need to call the lower-level
-remodeling functions to perform these operations.
-
-The following example illustrates a command-line call that passes both a HED schema version and
-the path to the JSON file with the HED annotations.
-
-(run-remodel-with-hed-direct-anchor)=
-````{admonition} Remodeling a non-BIDS dataset using HED.
-:class: tip
-
-```bash
-python run_remodel /datasets/eeg_ds003645s_hed_remodel /datasets/summarize_conditions_rmdl.json \
--x derivatives stimuli -r 8.1.0 -j /datasets/eeg_ds003645s_hed_remodel/task-FacePerception_events.json
-
-```
-````
-
-(remodel-with-hed-direct-python-anchor)=
-````{admonition} Example Python code to use run_remodel on a non-BIDS dataset.
-:class: tip
-
-```python
-import hed.tools.remodeling.cli.run_remodel as cli_remodel
-
-data_root = '/datasets/eeg_ds003645s_hed_remodel'
-model_path = '/datasets/summarize_conditions_rmdl.json'
-json_path = '/datasets/eeg_ds003645s_hed_remodel/task-FacePerception_events.json'
-arg_list = [data_root, model_path, '-x', 'derivatives', 'stimuli', '-r', '8.1.0', '-j', json_path]
-cli_remodel.main(arg_list)
-
-```
-````
-
-(remodel-error-handling-anchor)=
-## Remodel error handling
-
-Errors can occur during several stages of remodeling, and how they are
-handled depends on the type of error and where the error occurs.
-Except for the validation summary, the underlying remodeling code raises exceptions for most errors.
-
-
-(errors-in-the-remodel-file-anchor)=
-### Errors in the remodel file
-
-Each operation requires specific parameters to execute properly.
-The underlying implementation for each operation defines these parameters using a [**json schema**](https://json-schema.org/)
-as the `PARAMS` property of the operation's class definition.
-The use of the JSON schema allows the remodeler to specify and validate requirements on most of an
-operation's parameters using standardized methods.
-
-The [**remodeler_validator**](https://github.com/hed-standard/hed-python/blob/master/hed/tools/remodeling/remodeler_validator.py)
-compiles a JSON schema for the remodeler from individual operations and validates
-the remodel file against the compiled JSON schema. The validator should always be run before executing any remodel operations.
-
-For example, the command line [**run_remodel**](https://raw.githubusercontent.com/hed-standard/hed-python/develop/hed/tools/remodeling/cli/run_remodel.py)
-program calls the validator before executing any operations.
-If there are errors, `run_remodel` reports the errors for all operations and exits.
-This allows users to correct errors in all operations in one pass without any data modification.
-The [**HED online tools**](https://hedtools.org/hed) are particularly useful for debugging
-the syntax and other issues in the remodeling process.
-
-(execution-time-remodel-errors-anchor)=
-### Execution-time remodel errors
-
-When an error occurs during execution, an exception is raised.
-Exceptions are raised for invalid or missing files or if a transformed file
-cannot be rewritten due to improper file permissions.
-Each individual operation may also raise an exception if the
-data file being processed does not have the expected information,
-such as a column with a particular name.
-
-Exceptions raised during execution cause the process to be terminated and no
-further files are processed.
-
-
-(remodel-sample-files-anchor)=
-## Remodel sample files
-
-All remodeling operations are specified in a standardized JSON remodel input file.
-The following shows the contents of the JSON remodeling file `remove_extra_rmdl.json`,
-which contains a single operation with instructions to remove the `value` and `sample` columns
-from the data file if the columns exist.
-
-(sample-remodel-file-anchor)=
-### Sample remodel file
-
-````{admonition} A sample JSON remodeling file with a single remove_columns transformation operation.
-:class: tip
-
-```json
-[
- {
- "operation": "remove_columns",
- "description": "Remove unwanted columns prior to analysis",
- "parameters": {
- "remove_names": ["value", "sample"]
- }
- }
-]
-
-```
-````
-
-Each operation is specified in a dictionary with three top-level keys: "operation", "description",
-and "parameters". The value of the "operation" is the name of the operation.
-The "description" value should include the reason this operation was needed,
-not just a description of the operation itself.
-Finally, the "parameters" value is a dictionary mapping parameter name to
-parameter value.
-
-The parameters for each operation are listed in the
-[**Remodel transformations**](remodel-transformations-anchor) and
-[**Remodel summarizations**](remodel-summarizations-anchor) sections.
-An operation may have both required and optional parameters.
-Optional parameters may be omitted if unneeded, but all parameters are specified in
-the "parameters" section of the dictionary.
-The full specification of the remodel file is also provided as a [**JSON schema**](https://json-schema.org/).
-
-The remodeling JSON files should have names ending in `_rmdl.json` to more easily
-distinguish them from other JSON files.
-Although these files can be stored anywhere, their preferred location is
-in the `derivatives/remodel/models` subdirectory under the dataset root so
-that they can provide provenance for the dataset.
-
-(sample-remodel-event-file-anchor)=
-### Sample remodel event file
-
-Several examples illustrating the remodeling operations use the following excerpt of the stop-go task from sub-0013
-of the AOMIC-PIOP2 dataset available on [**OpenNeuro**](https://openneuro.org) as ds002790.
-The full event file is
-[**sub-0013_task-stopsignal_acq-seq_events.tsv**](./_static/data/sub-0013_task-stopsignal_acq-seq_events.tsv).
-
-
-````{admonition} Excerpt from an event file from the stop-go task of AOMIC-PIOP2 (ds002790).
-| onset | duration | trial_type | stop_signal_delay | response_time | response_accuracy | response_hand | sex |
-| ----- | -------- | ---------- | ----------------- | ------------- | ----------------- | ------------- | --- |
-| 0.0776 | 0.5083 | go | n/a | 0.565 | correct | right | female |
-| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | 0.49 | correct | right | female |
-| 9.5856 | 0.5084 | go | n/a | 0.45 | correct | right | female |
-| 13.5939 | 0.5083 | succesful_stop | 0.2 | n/a | n/a | n/a | female |
-| 17.1021 | 0.5083 | unsuccesful_stop | 0.25 | 0.633 | correct | left | male |
-| 21.6103 | 0.5083 | go | n/a | 0.443 | correct | left | male |
-````
-
-(Sample-remodel-sidecar-file-anchor)=
-### Sample remodel sidecar file
-
-For remodeling operations that use HED, a JSON sidecar is usually required to provide the
-necessary HED annotations. The following JSON sidecar excerpt is used in several examples to
-illustrate some of these operations.
-The full JSON file can be found at
-[**task-stopsignal_acq-seq_events.json**](./_static/data/task-stopsignal_acq-seq_events.json).
-
-
-````{admonition} Excerpt of JSON sidecar with HED annotations for the stop-go task of AOMIC-PIOP2.
-:class: tip
-
-```json
-{
- "trial_type": {
- "HED": {
- "succesful_stop": "Sensory-presentation, Visual-presentation, Correct-action, Image, Label/succesful_stop",
- "unsuccesful_stop": "Sensory-presentation, Visual-presentation, Incorrect-action, Image, Label/unsuccesful_stop",
- "go": "Sensory-presentation, Visual-presentation, Image, Label/go"
- }
- },
- "stop_signal_delay": {
- "HED": "(Auditory-presentation, Delay/# s)"
- },
- "sex": {
- "HED": {
- "male": "Def/Male-image-cond",
- "female": "Def/Female-image-cond"
- }
- },
- "hed_defs": {
- "HED": {
- "def_male": "(Definition/Male-image-cond, (Condition-variable/Image-sex, (Male, (Image, Face))))",
- "def_female": "(Definition/Female-image-cond, (Condition-variable/Image-sex, (Female, (Image, Face))))"
- }
- }
-}
-```
-````
-Notice that the JSON file has some keys (e.g., "trial_type", "stop_signal_delay", and "sex")
-which also correspond to columns in the events file.
-The "hed_defs" key corresponds to an extra entry in the JSON file that, in this case, provides the definitions needed in the HED annotation.
-
-HED operations also require the HED schema. Most of the examples use HED standard schema version 8.1.0.
-
-(remodel-transformations-anchor)=
-## Remodel transformations
-
-(factor-column-anchor)=
-### Factor column
-
-The *factor_column* operation appends factor vectors to tabular files
-based on the values in a specified file column.
-Each factor vector contains a 1 if the corresponding row had that column value and a 0 otherwise.
-The *factor_column* operation is used to reformat event files for analyses,
-such as linear regression, that are based on column values.
-
-(factor-column-parameters-anchor)=
-#### Factor column parameters
-
-```{admonition} Parameters for the *factor_column* operation.
-:class: tip
-
-| Parameter | Type | Description |
-| ------------ | ---- | ----------- |
-| *column_name* | str | The name of the column to be factored.|
-| *factor_values* | list | Column values to be included as factors. |
-| *factor_names* | list| (**Optional**) Column names for created factors. |
-```
-
-If *column_name* is not a column in the data file, a `ValueError` is raised.
-
-If *factor_values* is empty, factors are created for each unique value in *column_name*.
-Otherwise, only factors for the specified column values are generated.
-If a specified value is missing in a particular file, the corresponding factor column contains all zeros.
-
-If *factor_names* is empty, the newly created columns are of the
-form *column_name.factor_value*.
-Otherwise, the newly created columns have names *factor_names*.
-If *factor_names* is not empty, then *factor_values* must also be specified
-and both lists must be of the same length.
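-
-For intuition, the following pandas sketch (illustrative only, not the remodeler itself)
-reproduces what this operation computes for the example below:
-one 0/1 indicator column per requested value, using the same column and factor names.
-
-````{admonition} A sketch of the one-hot factoring performed by *factor_column*.
-:class: tip
-
-```python
-import pandas as pd
-
-df = pd.DataFrame({"trial_type": ["go", "unsuccesful_stop", "go",
-                                  "succesful_stop", "unsuccesful_stop", "go"]})
-
-factor_values = ["succesful_stop", "unsuccesful_stop"]
-factor_names = ["stopped", "stop_failed"]
-
-# One indicator column per factor value: 1 where the row matches, 0 otherwise.
-for value, name in zip(factor_values, factor_names):
-    df[name] = (df["trial_type"] == value).astype(int)
-print(df)
-```
-````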
-
-(factor-column-example-anchor)=
-#### Factor column example
-
-The *factor_column* operation in the following example specifies that factor columns
-should be created for *succesful_stop* and *unsuccesful_stop* of the *trial_type* column.
-The resulting columns are called *stopped* and *stop_failed*, respectively.
-
-
-````{admonition} A sample JSON file with a single *factor_column* transformation operation.
-:class: tip
-
-```json
-[{
- "operation": "factor_column",
- "description": "Create factors for the succesful_stop and unsuccesful_stop values.",
- "parameters": {
- "column_name": "trial_type",
- "factor_values": ["succesful_stop", "unsuccesful_stop"],
- "factor_names": ["stopped", "stop_failed"]
- }
-}]
-```
-````
-
-The results of executing this *factor_column* operation on the
-[**sample remodel event file**](sample-remodel-event-file-anchor) are:
-
-````{admonition} Results of the factor_column operation on the sample data.
-
-| onset | duration | trial_type | stop_signal_delay | response_time | response_accuracy | response_hand | sex | stopped | stop_failed |
-| ----- | -------- | ---------- | ----------------- | ------------- | ----------------- | ------------- | --- | ---------- | ---------- |
-| 0.0776 | 0.5083 | go | n/a | 0.565 | correct | right | female | 0 | 0 |
-| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | 0.49 | correct | right | female | 0 | 1 |
-| 9.5856 | 0.5084 | go | n/a | 0.45 | correct | right | female | 0 | 0 |
-| 13.5939 | 0.5083 | succesful_stop | 0.2 | n/a | n/a | n/a | female | 1 | 0 |
-| 17.1021 | 0.5083 | unsuccesful_stop | 0.25 | 0.633 | correct | left | male | 0 | 1 |
-| 21.6103 | 0.5083 | go | n/a | 0.443 | correct | left | male | 0 | 0 |
-````
-
-(factor-hed-tags-anchor)=
-### Factor HED tags
-
-The *factor_hed_tags* operation is similar to the *factor_column* operation
-in that it produces factor vectors containing 0's and 1's,
-which are appended to the returned DataFrame.
-However, rather than basing these vectors on values in a specified column,
-the factors are computed by determining whether the assembled HED annotation for each row
-satisfies a specified search query.
-
-An example search query is whether the assembled HED annotation contains a particular HED tag.
-The [**HED search guide**](./HedSearchGuide.md) tutorial discusses the HED search facility in more detail.
-
-
-(factor-hed-tags-parameters-anchor)=
-#### Factor HED tags parameters
-
-```{admonition} Parameters for the *factor_hed_tags* operation.
-:class: tip
-
-| Parameter | Type | Description |
-| ------------ | ---- | ----------- |
-| *queries* | list | A list of HED query strings. |
-| *query_names* | list | (**Optional**) A list of names for the factor columns generated by the queries. |
-| *remove_types* | list | (**Optional**) Structural HED tags to be removed (usually `Condition-variable` and `Task`). |
-| *expand_context* | bool | (**Optional**: Default true) Expand the context and remove `Onset` and `Offset` tags before the query. |
-
-```
-The *query_names* list, which must be empty or the same length as *queries*,
-contains the names of the factor columns produced by the search.
-If the *query_names* list is empty, the result columns are titled "query_1",
-"query_2", etc.
-
-Most of the time, *remove_types* should be set to `["Condition-variable", "Task"]`,
-with the effects of the experimental design captured using the *factor_hed_type* operation.
-If *expand_context* is set to *false*, the additional context provided by `Onset`, `Offset`, and `Duration`
-is ignored.
-
-(factor-hed-tags-example-anchor)=
-#### Factor HED tags example
-
-The *factor_hed_tags* operation in the following example produces two factor
-columns with 1's where the HED string for a row contains the `Correct-action`
-or `Incorrect-action` tag, respectively.
-The resulting factor columns are named *correct* and *incorrect*, respectively.
-
-````{admonition} A sample JSON file with a single *factor_hed_tags* transformation operation.
-:class: tip
-
-```json
-[{
- "operation": "factor_hed_tags",
- "description": "Create factors based on whether the event represented a correct or incorrect action.",,
- "parameters": {
- "queries": ["correct-action", "incorrect-action"],
- "query_names": ["correct", "incorrect"],
- "remove_types": ["Condition-variable", "Task"],
- "expand_context": false
- }
-}]
-```
-````
-
-The results of executing this *factor_hed_tags* operation on the
-[**sample remodel event file**](sample-remodel-event-file-anchor) using the
-[**sample remodel sidecar file**](sample-remodel-sidecar-file-anchor) for HED annotations are:
-
-
-````{admonition} Results of *factor_hed_tags*.
-
-| onset | duration | trial_type | stop_signal_delay | response_time | response_accuracy | response_hand | sex | correct | incorrect |
-| ----- | -------- | ---------- | ----------------- | ------------- | ----------------- | ------------- | --- | ---------- | ---------- |
-| 0.0776 | 0.5083 | go | n/a | 0.565 | correct | right | female | 0 | 0 |
-| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | 0.49 | correct | right | female | 0 | 1 |
-| 9.5856 | 0.5084 | go | n/a | 0.45 | correct | right | female | 0 | 0 |
-| 13.5939 | 0.5083 | succesful_stop | 0.2 | n/a | n/a | n/a | female | 1 | 0 |
-| 17.1021 | 0.5083 | unsuccesful_stop | 0.25 | 0.633 | correct | left | male | 0 | 1 |
-| 21.6103 | 0.5083 | go | n/a | 0.443 | correct | left | male | 0 | 0 |
-````
-
-(factor-hed-type-anchor)=
-### Factor HED type
-
-The *factor_hed_type* operation produces factor columns
-based on values of the specified HED type tag.
-The most common type is the HED *Condition-variable* tag, which corresponds to
-factor vectors based on the experimental design.
-Other commonly used type tags include *Task*, *Control-variable*, and *Time-block*.
-
-We assume that the dataset has been annotated using HED tags to properly document
-information such as experimental conditions, and focus on how such an annotated dataset can be
-used with remodeling to produce factor columns corresponding to these
-type variables.
-
-For additional information on how to encode experimental designs using HED, see
-[**HED conditions and design matrices**](./HedConditionsAndDesignMatrices.md).
-
-(factor-hed-type-parameters-anchor)=
-#### Factor HED type parameters
-
-```{admonition} Parameters for *factor_hed_type* operation.
-:class: tip
-
-| Parameter | Type | Description |
-| ------------ | ---- | ----------- |
-| *type_tag* | str | HED tag used to find the factors (most commonly *Condition-variable*).|
-| *type_values* | list | (**Optional**) Values to factor for the *type_tag*. If omitted, all values of that *type_tag* are used. |
-```
-The event context (as defined by onsets, offsets and durations) is always expanded and one-hot (0's and 1's)
-encoding is used for the factors.
-
-(factor-hed-type-example-anchor)=
-#### Factor HED type example
-
-The *factor_hed_type* operation in the following example appends
-additional columns to each data file corresponding to
-each possible value of each *Condition-variable* tag.
-The columns contain 1's in rows (e.g., events) for which that condition
-applies and 0's otherwise.
-
-````{admonition} A JSON file with a single *factor_hed_type* transformation operation.
-:class: tip
-
-```json
-[{
- "operation": "factor_hed_type",
- "description": "Factor based on the sex of the images being presented.",
- "parameters": {
- "type_tag": "Condition-variable"
- }
-}]
-```
-````
-
-The results of executing this *factor_hed_type* operation on the
-[**sample remodel event file**](sample-remodel-event-file-anchor) using the
-[**sample remodel sidecar file**](sample-remodel-sidecar-file-anchor) for HED annotations are:
-
-
-````{admonition} Results of *factor_hed_type*.
-
-| onset | duration | trial_type | stop_signal_delay | response_time | response_accuracy | response_hand | sex | Image-sex.Female-image-cond | Image-sex.Male-image-cond |
-| ----- | -------- | ---------- | ----------------- | ------------- | ----------------- | ------------- | --- | ------- | ---------- |
-| 0.0776 | 0.5083 | go | n/a | 0.565 | correct | right | female | 1 | 0 |
-| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | 0.49 | correct | right | female | 1 | 0 |
-| 9.5856 | 0.5084 | go | n/a | 0.45 | correct | right | female | 1 | 0 |
-| 13.5939 | 0.5083 | succesful_stop | 0.2 | n/a | n/a | n/a | female | 1 | 0 |
-| 17.1021 | 0.5083 | unsuccesful_stop | 0.25 | 0.633 | correct | left | male | 0 | 1 |
-| 21.6103 | 0.5083 | go | n/a | 0.443 | correct | left | male | 0 | 1 |
-````
-
-(merge-consecutive-anchor)=
-### Merge consecutive
-
-Sometimes a single long event in experimental logs is represented by multiple repeat events.
-The *merge_consecutive* operation collapses these consecutive repeat events into one event with
-duration updated to encompass the temporal extent of the merged events.
-
-(merge-consecutive-parameters-anchor)=
-#### Merge consecutive parameters
-
-```{admonition} Parameters for the *merge_consecutive* operation.
-:class: tip
-
-| Parameter | Type | Description |
-| ------------ | ---- | ----------- |
-| *column_name* | str | The name of the column which is the basis of the merge.|
-| *event_code* | str, int, float | The value in *column_name* that triggers the merge. |
-| *set_durations* | bool | If true, set durations based on merged events. |
-| *ignore_missing* | bool | If true, missing *column_name* or *match_columns* do not raise an error. |
-| *match_columns* | list | (**Optional**) Columns whose values must match to collapse events. |
-```
-
-The first of the group of rows (each representing an event) to be merged is called the anchor
-for the merge. After the merge, it is the only row in the group
-that remains in the data file. The result is identical
-to its original version, except for the value in the `duration` column.
-
-If the *set_durations* parameter is true, the new duration is calculated as though
-the event began with the onset of the first event (the anchor row) in the group and
-ended at the point where all the events in the group have ended.
-This method allows for small gaps between events and for events in which an
-intermediate event in the group ends after later events.
-If the *set_durations* parameter is false, the duration of the merged row is set to `n/a`.
-
-If the data file has other columns besides `onset`, `duration` and *column_name*,
-the values in the other columns must be considered during the merging process.
-The *match_columns* parameter lists the other columns whose values must agree with those
-of the anchor row in order for a merge to occur. If *match_columns* is empty, the
-other columns in each row are not taken into account during the merge.
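-
-As a quick check of the duration rule, the following sketch (plain Python, not the remodeler itself)
-computes the merged duration for the three `succesful_stop` rows used in the example below;
-the onsets and durations are taken from that input file.
-
-````{admonition} A sketch of the merged duration computation.
-:class: tip
-
-```python
-# Rows being merged (from the example input below).
-onsets = [13.5939, 14.2, 15.3]
-durations = [0.5083, 0.5083, 0.7083]
-
-# The merged event starts at the anchor onset and ends when the
-# latest-ending row in the group ends (small gaps are allowed).
-anchor_onset = onsets[0]
-group_end = max(onset + dur for onset, dur in zip(onsets, durations))
-print(round(group_end - anchor_onset, 4))  # 2.4144
-```
-````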
-
-(merge-consecutive-example-anchor)=
-#### Merge consecutive example
-
-The *merge_consecutive* operation in the following example causes consecutive
-`succesful_stop` events whose `stop_signal_delay`, `response_hand`, and `sex` columns
-have the same values to be merged into a single event.
-
-
-````{admonition} A JSON file with a single *merge_consecutive* transformation operation.
-:class: tip
-
-```json
-[{
- "operation": "merge_consecutive",
- "description": "Merge consecutive *succesful_stop* events that match the *match_columns.",
- "parameters": {
- "column_name": "trial_type",
- "event_code": "succesful_stop",
- "set_durations": true,
- "ignore_missing": true,
- "match_columns": ["stop_signal_delay", "response_hand", "sex"]
- }
-}]
-```
-````
-
-When this operation is applied to the following input file,
-the three events with a value of `succesful_stop` in the `trial_type` column starting
-at `onset` value 13.5939 are merged into a single event.
-
-````{admonition} Input file for a *merge_consecutive* operation.
-
-| onset | duration | trial_type | stop_signal_delay | response_hand | sex |
-| ----- | -------- | ---------- | ----------------- | ------------- | --- |
-| 0.0776 | 0.5083 | go | n/a | right | female|
-| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | right | female|
-| 9.5856 | 0.5084 | go | n/a | right | female|
-| 13.5939 | 0.5083 | succesful_stop | 0.2 | n/a | female|
-| 14.2 | 0.5083 | succesful_stop | 0.2 | n/a | female|
-| 15.3 | 0.7083 | succesful_stop | 0.2 | n/a | female|
-| 17.3 | 0.5083 | unsuccesful_stop | 0.25 | n/a | female|
-| 19.0 | 0.5083 | unsuccesful_stop | 0.25 | n/a | female|
-| 21.1021 | 0.5083 | unsuccesful_stop | 0.25 | left | male|
-| 22.6103 | 0.5083 | go | n/a | left | male |
-````
-
-Notice that the `succesful_stop` event at `onset` value `17.3` is not
-merged because the `stop_signal_delay` column value does not match the value in the previous event.
-The final result has `duration` computed as `2.4144` = `15.3` + `0.7083` - `13.5939`.
-
-````{admonition} The results of the *merge_consecutive* operation.
-
-| onset | duration | trial_type | stop_signal_delay | response_hand | sex |
-| ----- | -------- | ---------- | ------------------ | ------------- | --- |
-| 0.0776 | 0.5083 | go | n/a | right | female |
-| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | right | female |
-| 9.5856 | 0.5084 | go | n/a | right | female |
-| 13.5939 | 2.4144 | succesful_stop | 0.2 | n/a | female |
-| 17.3 | 2.2083 | unsuccesful_stop | 0.25 | n/a | female |
-| 21.1021 | 0.5083 | unsuccesful_stop | 0.25 | left | male |
-| 22.6103 | 0.5083 | go | n/a | left | male |
-````
-
-The events that had onsets at `17.3` and `19.0` are also merged in this example.
-
-(remap-columns-anchor)=
-### Remap columns
-
-The *remap_columns* operation maps combinations of values in *m* specified columns of a data file
-into values in *n* columns using a defined mapping.
-Remapping is useful during analysis to create columns in event files that are more directly useful
-or informative for a particular analysis.
-
-Remapping is also important during the initial generation of event files from experimental logs.
-Experimental control software often encodes each type of log entry with a code.
-Remapping can be used to convert the column containing these codes into one or more columns that are more informative.
-
-
-(remap-columns-parameters-anchor)=
-#### Remap columns parameters
-
-
-```{admonition} Parameters for the *remap_columns* operation.
-:class: tip
-
-| Parameter | Type | Description |
-| ------------ | ---- | ----------- |
-| *source_columns* | list | A list of *m* names of the source columns for the map.|
-| *destination_columns* | list | A list of *n* names of the destination columns for the map. |
-| *map_list* | list | A list of mappings. Each element is a list of *m* source column values followed by *n* destination values. Mapping source values are treated as strings. |
-| *ignore_missing* | bool | If true, source column values not in the map generate "n/a" destination values instead of errors. |
-| *integer_sources* | list | (**Optional**) A list of source columns that are integers. The *integer_sources* must be a subset of *source_columns*. |
-```
-A column cannot be both a source and a destination,
-and all source columns must be present in the data files.
-New columns are created for destination columns that are missing from a data file.
-
-The *remap_columns* operation only works for columns containing strings or integers,
-as it is meant for remapping categorical codes.
-You must specify which source columns contain integers so that `n/a` values
-can be handled appropriately.
-
-The *map_list* parameter specifies how each unique combination of values from the source
-columns will be mapped into the destination columns.
-If there are *m* source columns and *n* destination columns,
-then each entry in *map_list* must be a list with *m* + *n* elements.
-The first *m* elements are the key values from the source columns.
-The *map_list* should have targets for all combinations of values that appear in the *m* source columns
-unless *ignore_missing* is true.
-
-After remapping, the tabular file will contain both source and destination columns.
-If you wish to replace the source columns with the destination columns,
-use a *remove_columns* transformation after the *remap_columns*.
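-
-Conceptually, the mapping is a lookup from *m*-tuples of source values to *n*-tuples
-of destination values. The sketch below (illustrative only, using plain Python dictionaries
-rather than the remodeler) shows this lookup for the two-source, one-destination example that follows.
-
-````{admonition} A sketch of the *map_list* lookup.
-:class: tip
-
-```python
-# Build the lookup from the map_list entries: the first m values form
-# the key and the remaining n values are the destination values.
-map_list = [["correct", "left", "correct_left"],
-            ["correct", "right", "correct_right"],
-            ["incorrect", "left", "incorrect_left"],
-            ["incorrect", "right", "incorrect_right"]]
-m = 2  # number of source columns
-mapping = {tuple(entry[:m]): entry[m:] for entry in map_list}
-
-row = {"response_accuracy": "correct", "response_hand": "right"}
-key = (row["response_accuracy"], row["response_hand"])
-# With ignore_missing true, unmapped keys yield "n/a" instead of an error.
-print(mapping.get(key, ["n/a"]))  # ['correct_right']
-```
-````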
-
-
-(remap-columns-example-anchor)=
-#### Remap columns example
-
-The *remap_columns* operation in the following example creates a new column called *response_type*
-based on the unique values in the combination of columns *response_accuracy* and *response_hand*.
-
-````{admonition} A JSON file with a single *remap_columns* transformation operation.
-:class: tip
-
-```json
-[{
- "operation": "remap_columns",
- "description": "Map response_accuracy and response hand into a single column.",
- "parameters": {
- "source_columns": ["response_accuracy", "response_hand"],
- "destination_columns": ["response_type"],
- "map_list": [["correct", "left", "correct_left"],
- ["correct", "right", "correct_right"],
- ["incorrect", "left", "incorrect_left"],
- ["incorrect", "right", "incorrect_left"],
- ["n/a", "n/a", "n/a"]],
- "ignore_missing": true
- }
-}]
-```
-````
-In this example there are two source columns and one destination column,
-so each entry in *map_list* must be a list with three elements:
-two source values followed by one destination value.
-Since all the values in *map_list* are strings,
-the optional *integer_sources* list is not needed.
-
-The results of executing the previous *remap_columns* operation on the
-[**sample remodel event file**](sample-remodel-event-file-anchor) are:
-
-````{admonition} Mapping columns *response_accuracy* and *response_hand* into a *response_type* column.
-
-| onset | duration | trial_type | stop_signal_delay | response_time | response_accuracy | response_hand | sex | response_type |
-| ----- | -------- | ---------- | ---------- | ----------------- | ------------- | ----------------- | --- | ------------------- |
-| 0.0776 | 0.5083 | go | n/a | 0.565 | correct | right | female | correct_right |
-| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | 0.49 | correct | right | female | correct_right |
-| 9.5856 | 0.5084 | go | n/a | 0.45 | correct | right | female | correct_right |
-| 13.5939 | 0.5083 | succesful_stop | 0.2 | n/a | n/a | n/a | female | n/a |
-| 17.1021 | 0.5083 | unsuccesful_stop | 0.25 | 0.633 | correct | left | male | correct_left |
-| 21.6103 | 0.5083 | go | n/a | 0.443 | correct | left | male | correct_left |
-````
-
-In this example, *remap_columns* combines the values from columns `response_accuracy` and
-`response_hand` to produce a new column called `response_type` that specifies both response hand and correctness information using a single code.
-
-(remove-columns-anchor)=
-### Remove columns
-
-Sometimes columns are added during intermediate processing steps. The *remove_columns*
-operation is useful for cleaning up unnecessary columns after these processing steps complete.
-
-(remove-columns-parameters-anchor)=
-#### Remove columns parameters
-
-```{admonition} Parameters for the *remove_columns* operation.
-:class: tip
-
-| Parameter | Type | Description |
-| ------------ | ---- | ----------- |
-| *column_names* | list of str | A list of columns to remove.|
-| *ignore_missing* | bool | If true, missing columns are ignored; otherwise a `KeyError` is raised. |
-```
-
-If one of the specified columns is not in the file and the *ignore_missing*
-parameter is *false*, a `KeyError` is raised for the missing column.
-
-(remove-columns-example-anchor)=
-#### Remove columns example
-
-The following example specifies that the *remove_columns* operation should remove the `stop_signal_delay`,
-`response_accuracy`, and `face` columns from the tabular data.
-
-````{admonition} A JSON file with a single *remove_columns* transformation operation.
-:class: tip
-
-```json
-[{
- "operation": "remove_columns",
- "description": "Remove extra columns before the next step.",
- "parameters": {
- "column_names": ["stop_signal_delay", "response_accuracy", "face"],
- "ignore_missing": true
- }
-}]
-```
-````
-
-The results of executing this operation on the
-[**sample remodel event file**](sample-remodel-event-file-anchor)
-are shown below.
-The *face* column is not in the data, but it is ignored, since *ignore_missing* is true.
-If *ignore_missing* had been false, a `KeyError` would have been raised.
-
-````{admonition} Results of executing the *remove_columns*.
-| onset | duration | trial_type | response_time | response_hand | sex |
-| ----- | -------- | ---------- | ------------- | ------------- | --- |
-| 0.0776 | 0.5083 | go | 0.565 | right | female |
-| 5.5774 | 0.5083 | unsuccesful_stop | 0.49 | right | female |
-| 9.5856 | 0.5084 | go | 0.45 | right | female |
-| 13.5939 | 0.5083 | succesful_stop | n/a | n/a | female |
-| 17.1021 | 0.5083 | unsuccesful_stop | 0.633 | left | male |
-| 21.6103 | 0.5083 | go | 0.443 | left | male |
-````
-
-(remove-rows-anchor)=
-### Remove rows
-
-The *remove_rows* operation eliminates rows in which the named column has one of the specified values.
-This operation is useful for removing event markers corresponding to particular types of events
-or, for example, rows having `n/a` in a particular column.
-
-
-(remove-rows-parameters-anchor)=
-#### Remove rows parameters
-
-```{admonition} Parameters for *remove_rows*.
-:class: tip
-
-| Parameter | Type | Description |
-| ------------ | ---- | ----------- |
-| *column_name* | str | The name of the column to be tested.|
-| *remove_values* | list | A list of values to be tested for removal. |
-```
-The operation does not raise an error if a data file does not have a column named
-*column_name* or does not contain any of the values in *remove_values*.
-
-(remove-rows-example-anchor)=
-#### Remove rows example
-
-The following *remove_rows* operation removes the rows whose *trial_type* column
-contains either `succesful_stop` or `unsuccesful_stop`.
-
-````{admonition} A JSON file with a single *remove_rows* transformation operation.
-:class: tip
-
-```json
-[{
- "operation": "remove_rows",
- "description": "Remove rows where trial_type is either succesful_stop or unsuccesful_stop.",
- "parameters": {
- "column_name": "trial_type",
- "remove_values": ["succesful_stop", "unsuccesful_stop"]
- }
-}]
-```
-````
-
-The results of executing the previous *remove_rows* operation on the
-[**sample remodel event file**](sample-remodel-event-file-anchor) are:
-
-````{admonition} The results of executing the previous *remove_rows* operation.
-
-| onset | duration | trial_type | stop_signal_delay | response_time | response_accuracy | response_hand | sex |
-| ----- | -------- | ---------- | ----------------- | ------------- | ----------------- | ------------- | --- |
-| 0.0776 | 0.5083 | go | n/a | 0.565 | correct | right | female |
-| 9.5856 | 0.5084 | go | n/a | 0.45 | correct | right | female |
-| 21.6103 | 0.5083 | go | n/a | 0.443 | correct | left | male |
-````
-
-After removing rows with `trial_type` equal to `succesful_stop` or `unsuccesful_stop` only the
-three `go` trials remain.
-
-
-(rename-columns-anchor)=
-### Rename columns
-
-The *rename_columns* operation uses a dictionary to map old column names into new ones.
-
-(rename-columns-parameters-anchor)=
-#### Rename columns parameters
-
-```{admonition} Parameters for *rename_columns*.
-:class: tip
-
-| Parameter | Type | Description |
-| ------------ | ---- | ----------- |
-| *column_mapping* | dict | The keys are the old column names and the values are the new names.|
-| *ignore_missing* | bool | If false, a `KeyError` is raised if a dictionary key is not a column name. |
-
-```
-
-If *ignore_missing* is false, a `KeyError` is raised if a column specified in
-the mapping does not correspond to a column name in the data file.
-
-(rename-columns-example-anchor)=
-#### Rename columns example
-
-The following example renames the `stop_signal_delay` column to be `stop_delay` and
-the `response_hand` to be `hand_used`.
-
-````{admonition} A JSON file with a single *rename_columns* transformation operation.
-:class: tip
-
-```json
-[{
- "operation": "rename_columns",
- "description": "Rename columns to be more descriptive.",
- "parameters": {
- "column_mapping": {
- "stop_signal_delay": "stop_delay",
- "response_hand": "hand_used"
- },
- "ignore_missing": true
- }
-}]
-
-```
-````
-
-The results of executing the previous *rename_columns* operation on the
-[**sample remodel event file**](sample-remodel-event-file-anchor) are:
-
-````{admonition} After the *rename_columns* operation is executed, the sample events file is:
-| onset | duration | trial_type | stop_delay | response_time | response_accuracy | hand_used | sex |
-| ----- | -------- | ---------- | ----------------- | ------------- | ----------------- | ------------- | --- |
-| 0.0776 | 0.5083 | go | n/a | 0.565 | correct | right | female |
-| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | 0.49 | correct | right | female |
-| 9.5856 | 0.5084 | go | n/a | 0.45 | correct | right | female |
-| 13.5939 | 0.5083 | succesful_stop | 0.2 | n/a | n/a | n/a | female |
-| 17.1021 | 0.5083 | unsuccesful_stop | 0.25 | 0.633 | correct | left | male |
-| 21.6103 | 0.5083 | go | n/a | 0.443 | correct | left | male |
-````
-
-(reorder-columns-anchor)=
-### Reorder columns
-
-The *reorder_columns* operation reorders the indicated columns in the specified order.
-This operation is often used to place the most important columns near the beginning of the file for readability
-or to assure that all the data files in the dataset have the same column order.
-Additional parameters control how non-specified columns are treated.
-
-(reorder-columns-parameters-anchor)=
-#### Reorder columns parameters
-
-```{admonition} Parameters for the *reorder_columns* operation.
-:class: tip
-
-| Parameter | Type | Description |
-| ------------ | ---- | ----------- |
-| *column_order* | list | A list of columns in the order they should appear in the data.|
-| *ignore_missing* | bool | Controls handling of column names in the reorder list that aren't in the data. |
-| *keep_others* | bool | Controls handling of columns not in the reorder list. |
-
-```
-
-If *ignore_missing* is true
-and items in the reorder list do not exist in the file, the missing columns are ignored.
-On the other hand, if *ignore_missing* is false,
-a column name in the reorder list that is missing from the data raises a `ValueError`.
-
-The *keep_others* parameter controls whether columns in the data that
-do not appear in the *column_order* list are dropped (*keep_others* is false) or
-put at the end in the relative order that they appear in the file (*keep_others* is true).
-
-BIDS event files are required to have `onset` and `duration` as the first and second columns, respectively.
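-
-The following pandas sketch (illustrative only, not the remodeler itself) mirrors the
-interaction of these two parameters for a small frame, assuming *ignore_missing* is true.
-
-````{admonition} A sketch of the reordering logic.
-:class: tip
-
-```python
-import pandas as pd
-
-df = pd.DataFrame({"onset": [0.0776], "duration": [0.5083],
-                   "trial_type": ["go"], "response_time": [0.565]})
-column_order = ["onset", "duration", "response_time", "trial_type"]
-keep_others = False
-
-# Ignore names in the reorder list that are missing from the data.
-ordered = [c for c in column_order if c in df.columns]
-if keep_others:
-    # Append remaining columns in their original relative order.
-    ordered += [c for c in df.columns if c not in ordered]
-df = df[ordered]
-print(df.columns.tolist())  # ['onset', 'duration', 'response_time', 'trial_type']
-```
-````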
-
-(reorder-columns-example-anchor)=
-#### Reorder columns example
-
-The *reorder_columns* operation in the following example specifies that the first four
-columns of the dataset should be: `onset`, `duration`, `response_time`, and `trial_type`.
-Since *keep_others* is false, these will be the only columns retained.
-
-````{admonition} A JSON file with a single *reorder_columns* transformation operation.
-:class: tip
-
-```json
-[{
- "operation": "reorder_columns",
- "description": "Reorder columns.",
- "parameters": {
- "column_order": ["onset", "duration", "response_time", "trial_type"],
- "ignore_missing": true,
- "keep_others": false
- }
-}]
-```
-````
-
-
-The results of executing the previous *reorder_columns* transformation on the
-[**sample remodel event file**](sample-remodel-event-file-anchor) are:
-
-````{admonition} Results of *reorder_columns*.
-
-| onset | duration | response_time | trial_type |
-| ----- | -------- | ---------- | ------------- |
-| 0.0776 | 0.5083 | 0.565 | go |
-| 5.5774 | 0.5083 | 0.49 | unsuccesful_stop |
-| 9.5856 | 0.5084 | 0.45 | go |
-| 13.5939 | 0.5083 | n/a | succesful_stop |
-| 17.1021 | 0.5083 | 0.633 | unsuccesful_stop |
-| 21.6103 | 0.5083 | 0.443 | go |
-````
-
-(split-rows-anchor)=
-### Split rows
-
-The *split_rows* operation
-is often used to convert event files from trial-level encoding to event-level encoding.
-This operation is meant only for tabular files that have `onset` and `duration` columns.
-
-In **trial-level** encoding, all the events in a single trial
-(usually some variation of the cue-stimulus-response-feedback-ready sequence)
-are represented by a single row in the data file.
-Often, the onset corresponds to the presentation of the stimulus,
-and the other events are not reported or are implicitly reported.
-
-In **event-level** encoding, each row represents the temporal marker for a single event.
-In this case a trial consists of a sequence of multiple events.
-
-
-(split-rows-parameters-anchor)=
-#### Split rows parameters
-
-```{admonition} Parameters for the *split_rows* operation.
-:class: tip
-
-| Parameter | Type | Description |
-| ------------ | ---- | ----------- |
-| *anchor_column* | str | The name of the column that will hold the codes of the newly created rows.|
-| *new_events* | dict | Dictionary whose keys are the codes to be inserted as new events in the *anchor_column* and whose values are dictionaries with keys *onset_source*, *duration*, and (**Optional**) *copy_columns*. |
-| *remove_parent_event* | bool | If true, remove parent event. |
-
-```
-
-The *split_rows* operation requires an *anchor_column*, which could be an existing
-column or a new column to be appended to the data.
-The purpose of the *anchor_column* is to hold the codes for the new events.
-
-The *new_events* dictionary has the new events to be created.
-The keys are the new event codes to be inserted into the *anchor_column*.
-The values in *new_events* are themselves dictionaries.
-Each of these dictionaries has three keys:
-
-- *onset_source* is a list of items to be added to the *onset*
-of the event row being split to produce the `onset` column value for the new event.
-These items can be any combination of numerical values and column names.
-- *duration* is a list of numerical values and/or column names whose values are added
-to compute the `duration` column value for the new event.
-- *copy_columns* is a list of column names whose values should be copied into each new event.
-Unlisted columns are filled with `n/a`.
-
-
-The *split_rows* operation sorts the split rows by the `onset` column and raises a `TypeError`
-if the `onset` and `duration` columns are improperly defined.
-The `onset` column is converted to numeric values as part of the splitting process.
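-
-To make the onset and duration arithmetic concrete, the following sketch
-(plain Python, with values drawn from the example that follows) derives a new
-`response` event from one parent row under the rules above.
-
-````{admonition} A sketch of deriving a new event row in *split_rows*.
-:class: tip
-
-```python
-# Parent row from the sample event file (the first "go" trial).
-row = {"onset": 0.0776, "duration": 0.5083, "response_time": 0.565}
-
-# onset_source ["response_time"] means: new onset = parent onset + response_time.
-new_onset = row["onset"] + row["response_time"]
-
-# duration [0] means: the new event has zero duration.
-new_duration = sum([0])
-
-print(round(new_onset, 4), new_duration)  # 0.6426 0
-```
-````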
-
-(split-rows-example-anchor)=
-#### Split rows example
-
-The *split_rows* operation in the following example specifies that new rows should be added
-to encode the response and stop signal. The anchor column is `trial_type`.
-
-
-````{admonition} A JSON file with a single *split_rows* transformation operation.
-:class: tip
-
-```json
-[{
- "operation": "split_rows",
- "description": "add response events to the trials.",
- "parameters": {
- "anchor_column": "trial_type",
- "new_events": {
- "response": {
- "onset_source": ["response_time"],
- "duration": [0],
- "copy_columns": ["response_accuracy", "response_hand", "sex", "trial_number"]
- },
- "stop_signal": {
- "onset_source": ["stop_signal_delay"],
- "duration": [0.5],
- "copy_columns": ["trial_number"]
- }
- },
- "remove_parent_event": false
- }
- }]
-```
-````
-
-The results of executing this *split_rows* operation on the
-[**sample remodel event file**](sample-remodel-event-file-anchor) are:
-
-````{admonition} Results of the previous *split_rows* operation.
-
-| onset | duration | trial_type | stop_signal_delay | response_time | response_accuracy | response_hand | sex |
-| ----- | -------- | ---------- | ----------------- | ------------- | ----------------- | ------------- | --- |
-| 0.0776 | 0.5083 | go | n/a | 0.565 | correct | right | female |
-| 0.6426 | 0 | response | n/a | n/a | correct | right | female |
-| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | 0.49 | correct | right | female |
-| 5.7774 | 0.5 | stop_signal | n/a | n/a | n/a | n/a | n/a |
-| 6.0674 | 0 | response | n/a | n/a | correct | right | female |
-| 9.5856 | 0.5084 | go | n/a | 0.45 | correct | right | female |
-| 10.0356 | 0 | response | n/a | n/a | correct | right | female |
-| 13.5939 | 0.5083 | succesful_stop | 0.2 | n/a | n/a | n/a | female |
-| 13.7939 | 0.5 | stop_signal | n/a | n/a | n/a | n/a | n/a |
-| 17.1021 | 0.5083 | unsuccesful_stop | 0.25 | 0.633 | correct | left | male |
-| 17.3521 | 0.5 | stop_signal | n/a | n/a | n/a | n/a | n/a |
-| 17.7351 | 0 | response | n/a | n/a | correct | left | male |
-| 21.6103 | 0.5083 | go | n/a | 0.443 | correct | left | male |
-| 22.0533 | 0 | response | n/a | n/a | correct | left | male |
-````
-
-In a full processing example, it might make sense to rename `trial_type` to be
-`event_type` and to delete the `response_time` and the `stop_signal_delay` columns,
-since these items have been unfolded into separate events.
-This could be accomplished in subsequent clean-up operations.
-
-(remodel-summarizations-anchor)=
-## Remodel summarizations
-
-Summarizations differ from transformations in two respects: they do not modify the input data files,
-and they keep information about the results from each file that has been processed.
-Summarization operations may be used at several points in the operation list as checkpoints
-during debugging as well as for their more typical informational uses.
-
-All summary operations have two required parameters: *summary_name* and *summary_filename*.
-
-The *summary_name* is the unique key used to identify the
-particular incarnation of this summary in the dispatcher.
-Care should be taken to make sure that the *summary_name* is unique within
-a given JSON remodeling file if the same summary operation is used more than
-once within the file (e.g., for before-and-after summary information).
-
-The *summary_filename* should also be unique and is used for saving the summary upon request.
-When the remodeler is applied to full datasets rather than single files,
-the summaries are saved in the `derivatives/remodel/summaries` directory under the dataset root.
-A time stamp and file extension are appended to the *summary_filename* when the
-summary is saved.
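-
-The sketch below shows one plausible way to run a remodel list programmatically and save
-the resulting summaries using the `Dispatcher` class from `hed.tools.remodeling.dispatcher`.
-The constructor arguments and the `run_operations`/`save_summaries` method names are
-assumptions based on the hed-python remodeling API; verify them against the current documentation.
-
-````{admonition} A sketch of running operations and saving summaries (assumed API).
-:class: tip
-
-```python
-import json
-from hed.tools.remodeling.dispatcher import Dispatcher
-
-with open("summarize_columns_rmdl.json", "r") as fp:
-    remodel_list = json.load(fp)
-
-# Assumed API: create a dispatcher for the operations, run them on a file,
-# then save any accumulated summaries (by default under derivatives/remodel/summaries).
-dispatch = Dispatcher(remodel_list, data_root="/path/to/dataset")
-df = dispatch.run_operations("/path/to/dataset/sub-0013_task-stopsignal_acq-seq_events.tsv")
-dispatch.save_summaries(save_formats=[".txt", ".json"])
-```
-````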
-
-(summarize-column-names-anchor)=
-### Summarize column names
-
-The *summarize_column_names* operation tracks the unique column name patterns found in data files across
-the dataset and which files have these column names.
-This summary is useful for determining whether there are any non-conforming data files.
-
-Often event files associated with different tasks have different column names,
-and this summary can be used to verify that the files corresponding to the same task
-have the same column names.
-
-A more problematic issue is when some event files for the same task
-have reordered column names or use different column names.
-
-(summarize-columns-names-parameters-anchor)=
-#### Summarize column names parameters
-
-The *summarize_column_names* operation requires only the *summary_name* and
-*summary_filename* parameters common to all summaries.
-
-```{admonition} Parameters for the *summarize_column_names* operation.
-:class: tip
-
-| Parameter | Type | Description |
-| ------------ | ---- | ----------- |
-| *summary_name* | str | A unique name used to identify this summary.|
-| *summary_filename* | str | A unique file basename to use for saving this summary. |
-| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |
-```
-
-(summarize-column-names-example-anchor)=
-#### Summarize column names example
-
-The following example remodeling file produces a summary, which when saved
-will appear with file name `AOMIC_column_names_xxx.txt` or
-`AOMIC_column_names_xxx.json` where `xxx` is a timestamp.
-
-````{admonition} A JSON file with a single *summarize_column_names* summarization operation.
-:class: tip
-```json
-[{
- "operation": "summarize_column_names",
- "description": "Summarize column names.",
- "parameters": {
- "summary_name": "AOMIC_column_names",
- "summary_filename": "AOMIC_column_names"
- }
-}]
-```
-````
-
-When this operation is applied to the [**sample remodel event file**](sample-remodel-event-file-anchor),
-the following text summary is produced.
-
-````{admonition} Result of applying *summarize_column_names* to the sample remodel file.
-:class: tip
-
-```text
-
-Summary name: AOMIC_column_names
-Summary type: column_names
-Summary filename: AOMIC_column_names
-
-Summary details:
-
-Dataset: Number of files=1
- Columns: ['onset', 'duration', 'trial_type', 'stop_signal_delay', 'response_time', 'response_accuracy', 'response_hand', 'sex']
- sub-0013_task-stopsignal_acq-seq_events.tsv
-
-Individual files:
-
-sub-0013_task-stopsignal_acq-seq_events.tsv:
- ['onset', 'duration', 'trial_type', 'stop_signal_delay', 'response_time', 'response_accuracy', 'response_hand', 'sex']
-
-```
-````
-
-Since we are only summarizing one event file, there is only one unique pattern -- corresponding
-to the columns: *onset*, *duration*, *trial_type*, *stop_signal_delay*, *response_time*, *response_accuracy*, *response_hand*, and *sex*.
-
-When the dataset has multiple column name patterns, the summary lists each unique pattern separately along
-with the names of the data files that have that pattern.
-
-The JSON version of the summary is useful for programmatic manipulation,
-while the text version shown above is more readable.
-
-
-(summarize-column-values-anchor)=
-### Summarize column values
-
-The *summarize_column_values* operation provides a summary of the number of times various
-column values appear in event files across the dataset.
-
-
-(summarize-columns-values-parameters-anchor)=
-#### Summarize column values parameters
-
-The following table lists the parameters required for using the summary.
-
-```{admonition} Parameters for the *summarize_column_values* operation.
-:class: tip
-
-| Parameter | Type | Description |
-| ------------ | ---- | ----------- |
-| *summary_name* | str | A unique name used to identify this summary.|
-| *summary_filename* | str | A unique file basename to use for saving this summary. |
-| *append_timecode* | bool | (**Optional**: Default false) If True, append a time code to filename. |
-| *max_categorical* | int | (**Optional**: Default 50) If given, the text summary shows the top *max_categorical* values; otherwise the text summary displays all categorical values. |
-| *skip_columns* | list | (**Optional**) A list of column names to omit from the summary. |
-| *value_columns* | list | (**Optional**) A list of columns for which individual unique values are not listed. |
-| *values_per_line* | int | (**Optional**: Default 5) The number of values the text summary displays per line. |
-
-```
-
-In addition to the standard *summary_name* and *summary_filename* parameters required of all summaries,
-the *summarize_column_values* operation accepts two optional lists.
-The *skip_columns* list specifies the names of columns to skip entirely in the summary.
-Typically, the `onset`, `duration`, and `sample` columns are skipped, since they have unique values for
-each row and their values have limited information.
-
-The *summarize_column_values* operation is mainly meant for creating summary information about columns
-containing a finite number of distinct values.
-Columns that contain numeric information will usually have distinct entries for
-each row in a tabular file and are not amenable to such summarization.
-These columns could be specified as *skip_columns*, but another option is to
-designate them as *value_columns*. The *value_columns* are reported in the summary,
-but their distinct values are not reported individually.
-
-For datasets that include multiple tasks, the event values for each task may be distinct.
-The *summarize_column_values* operation does not separate values by task, but expects the
-calling program to filter the files by task as desired.
-The `run_remodel` program supports selecting only the files corresponding to a particular task.
-
-Two additional optional parameters are available for specifying aspects of the text summary output.
-The *max_categorical* optional parameter specifies how many unique values should be displayed
-for each column. The *values_per_line* controls how many categorical column values (with counts)
-are displayed on each line of the output. By default, 5 values are displayed.
-
-(summarize-column-values-example-anchor)=
-#### Summarize column values example
-
-The following example shows the JSON for including this operation in a remodeling file.
-
-````{admonition} A JSON file with a single *summarize_column_values* summarization operation.
-:class: tip
-```json
-[{
- "operation": "summarize_column_values",
- "description": "Summarize the column values in an excerpt.",
- "parameters": {
- "summary_name": "AOMIC_column_values",
- "summary_filename": "AOMIC_column_values",
- "skip_columns": ["onset", "duration"],
- "value_columns": ["response_time", "stop_signal_delay"]
- }
-}]
-```
-````
-
-A text format summary of the results of executing this operation on the
-[**sample remodel event file**](sample-remodel-event-file-anchor)
-is shown in the following example.
-
-````{admonition} Sample *summarize_column_values* operation results in text format.
-:class: tip
-```text
-Summary name: AOMIC_column_values
-Summary type: column_values
-Summary filename: AOMIC_column_values
-
-Overall summary:
-Dataset: Total events=6 Total files=1
- Categorical column values[Events, Files]:
- response_accuracy:
- correct[5, 1] n/a[1, 1]
- response_hand:
- left[2, 1] n/a[1, 1] right[3, 1]
- sex:
- female[4, 1] male[2, 1]
- trial_type:
- go[3, 1] succesful_stop[1, 1] unsuccesful_stop[2, 1]
- Value columns[Events, Files]:
- response_time[6, 1]
- stop_signal_delay[6, 1]
-
-Individual files:
-
-sub-0013_task-stopsignal_acq-seq_events.tsv:
-Total events=6
- Categorical column values[Events, Files]:
- response_accuracy:
- correct[5, 1] n/a[1, 1]
- response_hand:
- left[2, 1] n/a[1, 1] right[3, 1]
- sex:
- female[4, 1] male[2, 1]
- trial_type:
- go[3, 1] succesful_stop[1, 1] unsuccesful_stop[2, 1]
- Value columns[Events, Files]:
- response_time[6, 1]
- stop_signal_delay[6, 1]
-```
-````
-
-Because the [**sample remodel event file**](sample-remodel-event-file-anchor)
-only has 6 events, we expect that no value will be represented in more than 6 events.
-The column names corresponding to value columns just have the event counts in them.
-
-Because this command was executed with the `-i` option of `run_remodel`,
-results from the individual data files are shown after the overall summary.
-The individual results are similar to the overall summary because only one data file
-was processed.
-
-For a more extensive example see the
-[**text**](./_static/data/summaries/FacePerception_column_values_summary.txt)
-and [**JSON**](./_static/data/summaries/FacePerception_column_values_summary.json)
-format summaries of the sample dataset
-[**ds003645s_hed**](https://github.com/hed-standard/hed-examples/tree/main/datasets/eeg_ds003645s_hed)
-using the [**summarize_columns_rmdl.json**](./_static/data/summaries/summarize_columns_rmdl.json)
-remodeling file.
-
-
-(summarize-definitions-anchor)=
-### Summarize definitions
-
-The *summarize_definitions* operation provides a summary of the `Def-expand` tags found across the dataset,
-noting any ambiguous or erroneous ones. If working on a BIDS dataset, the summary is initialized with the
-definitions known from the sidecar, and any deviations from these known definitions are reported as errors.
-
-(summarize-definitions-parameters-anchor)=
-#### Summarize definitions parameters
-
-**NOTE: This summary is still under development**
-The following table lists the parameters required for using the summary.
-
-```{admonition} Parameters for the *summarize_definitions* operation.
-:class: tip
-
-| Parameter | Type | Description |
-| ------------ | ---- | ----------- |
-| *summary_name* | str | A unique name used to identify this summary.|
-| *summary_filename* | str | A unique file basename to use for saving this summary. |
-| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |
-```
-
-The *summarize_definitions* operation is mainly meant for verifying consistency in unknown `Def-expand` tags.
-This situation comes up when you have an assembled dataset but no longer have the definitions stored (or never created them in the first place).
-
-
-(summarize-definitions-example-anchor)=
-#### Summarize definitions example
-
-The following example shows the JSON for including this operation in a remodeling file.
-
-````{admonition} A JSON file with a single *summarize_definitions* summarization operation.
-:class: tip
-```json
-[{
- "operation": "summarize_definitions",
- "description": "Summarize the definitions used in this dataset.",
- "parameters": {
- "summary_name": "HED_column_definition_summary",
- "summary_filename": "HED_column_definition_summary"
- }
-}]
-```
-````
-
-A text format summary of the results of executing this operation on the
-[**sub-003_task-FacePerception_run-3_events.tsv**](_static/data/sub-003_task-FacePerception_run-3_events.tsv) file
-of the [**eeg_ds_003645s_hed_column**](https://github.com/hed-standard/hed-examples/tree/main/datasets/eeg_ds003645s_hed_column) dataset is shown in the following example.
-
-````{admonition} Sample *summarize_definitions* operation results in text format.
-:class: tip
-```text
-Summary name: HED_column_definition_summary
-Summary type: definitions
-Summary filename: HED_column_definition_summary
-
-Overall summary:
- Known Definitions: 17 items
- cross-only: 2 items
- description: A white fixation cross on a black background in the center of the screen.
- contents: (Visual-presentation,(Background-view,Black),(Foreground-view,(Center-of,Computer-screen),(Cross,White)))
- face-image: 2 items
- description: A happy or neutral face in frontal or three-quarters frontal pose with long hair cropped presented as an achromatic foreground image on a black background with a white fixation cross superposed.
- contents: (Visual-presentation,(Background-view,Black),(Foreground-view,((Center-of,Computer-screen),(Cross,White)),(Grayscale,(Face,Hair,Image))))
- circle-only: 2 items
- description: A white circle on a black background in the center of the screen.
- contents: (Visual-presentation,(Background-view,Black),(Foreground-view,((Center-of,Computer-screen),(Circle,White))))
- press-left-finger: 2 items
- description: The participant presses a key with the left index finger to indicate a face symmetry judgment.
- contents: ((Index-finger,(Experiment-participant,Left-side-of)),(Keyboard-key,Press))
- press-right-finger: 2 items
- description: The participant presses a key with the right index finger to indicate a face symmetry evaluation.
- contents: ((Index-finger,(Experiment-participant,Right-side-of)),(Keyboard-key,Press))
- famous-face-cond: 2 items
- description: A face that should be recognized by the participants
- contents: (Condition-variable/Face-type,(Image,(Face,Famous)))
- unfamiliar-face-cond: 2 items
- description: A face that should not be recognized by the participants.
- contents: (Condition-variable/Face-type,(Image,(Face,Unfamiliar)))
- scrambled-face-cond: 2 items
- description: A scrambled face image generated by taking face 2D FFT.
- contents: (Condition-variable/Face-type,(Image,(Disordered,Face)))
- first-show-cond: 2 items
- description: Factor level indicating the first display of this face.
- contents: ((Condition-variable/Repetition-type,Item-interval/0,(Face,Item-count/1)))
- immediate-repeat-cond: 2 items
- description: Factor level indicating this face was the same as previous one.
- contents: ((Condition-variable/Repetition-type,Item-interval/1,(Face,Item-count/2)))
- delayed-repeat-cond: 2 items
- description: Factor level indicating face was seen 5 to 15 trials ago.
- contents: (Condition-variable/Repetition-type,(Face,Item-count/2),(Item-interval,(Greater-than-or-equal-to,Item-interval/5)))
- left-sym-cond: 2 items
- description: Left index finger key press indicates a face with above average symmetry.
- contents: (Condition-variable/Key-assignment,((Asymmetrical,Behavioral-evidence),(Index-finger,(Experiment-participant,Right-side-of))),((Behavioral-evidence,Symmetrical),(Index-finger,(Experiment-participant,Left-side-of))))
- right-sym-cond: 2 items
- description: Right index finger key press indicates a face with above average symmetry.
- contents: (Condition-variable/Key-assignment,((Asymmetrical,Behavioral-evidence),(Index-finger,(Experiment-participant,Left-side-of))),((Behavioral-evidence,Symmetrical),(Index-finger,(Experiment-participant,Right-side-of))))
- face-symmetry-evaluation-task: 2 items
- description: Evaluate degree of image symmetry and respond with key press evaluation.
- contents: (Experiment-participant,Task,(Discriminate,(Face,Symmetrical)),(Face,See),(Keyboard-key,Press))
- blink-inhibition-task: 2 items
- description: Do not blink while the face image is displayed.
- contents: (Experiment-participant,Inhibit-blinks,Task)
- fixation-task: 2 items
- description: Fixate on the cross at the screen center.
- contents: (Experiment-participant,Task,(Cross,Fixate))
- initialize-recording: 2 items
- description:
- contents: (Recording)
- Ambiguous Definitions: 0 items
-
- Errors: 0 items
-```
-````
-
-Since this file didn't have any ambiguous or incorrect `Def-expand` groups, those sections are empty.
-Ambiguous definitions are those that take a placeholder but do not provide enough information
-to determine which tag the placeholder applies to.
-Erroneous definitions are those with conflicting expanded forms.
-
-Currently, summaries are not generated for individual files,
-but this is likely to change in the future.
-
-Below is a simple example showing the format when erroneous or ambiguous definitions are found.
-
-````{admonition} Sample input for *summarize_definitions* operation documenting ambiguous/erroneous definitions.
-:class: tip
-```text
-((Def-expand/Initialize-recording,(Recording)),Onset)
-((Def-expand/Initialize-recording,(Recording, Event)),Onset)
-(Def-expand/Specify-age/1,(Age/1, Item-count/1))
-```
-````
-
-````{admonition} Sample *summarize_definitions* operation error results in text format.
-:class: tip
-```text
-Summary name: HED_column_definition_summary
-Summary type: definitions
-Summary filename: HED_column_definition_summary
-
-Overall summary:
- Known Definitions: 1 items
- initialize-recording: 2 items
- description:
- contents: (Recording)
- Ambiguous Definitions: 1 items
- specify-age/#: (Age/#,Item-count/#)
- Errors: 1 items
- initialize-recording:
- (Event,Recording)
-```
-````
-
-It is assumed that the first definition encountered is the correct one, unless that first definition is ambiguous.
-Thus, the summary finds `(Def-expand/Initialize-recording,(Recording))` and considers it valid before encountering
-`(Def-expand/Initialize-recording,(Recording,Event))`, which is then flagged as an error.
-
-
-(summarize-hed-tags-anchor)=
-### Summarize HED tags
-
-The *summarize_hed_tags* operation extracts a summary of the HED tags present
-in the annotations of a dataset.
-This summary operation assumes that the structure in question is suitably
-annotated with HED (Hierarchical Event Descriptors).
-You must provide a HED schema version.
-If the data has annotations in a JSON sidecar, you must also provide its path.
-
-(summarize-hed-tags-parameters-anchor)=
-#### Summarize HED tags parameters
-
-The *summarize_hed_tags* operation has one additional required parameter
-(*tags*) beyond the standard *summary_name* and *summary_filename* parameters.
-
-```{admonition} Parameters for the *summarize_hed_tags* operation.
-:class: tip
-
-| Parameter | Type | Description |
-| ------------ | ---- | ----------- |
-| *summary_name* | str | A unique name used to identify this summary.|
-| *summary_filename* | str | A unique file basename to use for saving this summary. |
-| *tags* | dict | Dictionary with category title keys and tags in that category as values. |
-| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |
-| *include_context* | bool | (**Optional**: Default true) If true, expand the event context to account for onsets and offsets. |
-| *remove_types* | list | (**Optional**) A list of types such as `Condition-variable` and `Task` to remove. |
-| *replace_defs* | bool | (**Optional**: Default true) If true, the `Def` tags are replaced with the contents of the definition (no `Def` or `Def-expand`). |
-| *word_cloud* | dict | (**Optional**) If present, the operation produces a word cloud image in addition to the summaries. |
-```
-
-The *tags* dictionary has keys that specify how the user wishes the tags
-to be categorized for display.
-Note that these keys are titles designating display categories, not HED tags.
-
-The *tags* dictionary values are lists of actual HED tags (or their children)
-that should be listed under the respective display categories.
-
-If the optional parameter *include_context* is true, the counts include tags contributing
-to the event context in events intermediate between onsets and offsets.
-
-If the optional parameter *replace_defs* is true, the tag counts include
-tags contributed by contents of the definitions.
-
-If the *word_cloud* parameter is provided but its value is empty, the default word cloud settings are used.
-The following table lists the optional parameters used to control the appearance of the word cloud image.
-
-```{admonition} Optional keys in the word cloud dictionary value.
-:class: tip
-
-| Parameter | Type | Description |
-| ------------ | ---- | ----------- |
-| *background_color* | str | The matplotlib name of the background color (default "black").|
-| *contour_color* | str | The matplotlib name of the contour color if mask provided. |
-| *contour_width* | float | Width of contour if mask provided (default 3). |
-| *font_path* | str | The path of the system font to use in place of the default font. |
-| *height* | int | Height in pixels of the image (default 300).|
-| *mask_path* | str | The path of the mask image to use if *use_mask* is true and an image other than the brain is needed. |
-| *max_font_size* | float | The maximum font size to use in the image (default 15). |
-| *min_font_size* | float | The minimum font size to use in the image (default 8).|
-| *prefer_horizontal* | float | Fraction of horizontal words in image (default 0.75). |
-| *scale_adjustment* | float | Constant to add to log10 count transformation (default 7). |
-| *use_mask* | bool | If true, a mask image is used to provide a contour around the words. |
-| *width* | int | Width in pixels of image (default 400). |
-```
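-
-The following sketch shows one way to generate a remodel file that requests a word cloud.
-The category titles, tag list, and word cloud settings are illustrative choices, not required values.
-
-````{admonition} Python sketch that writes a remodel file requesting a word cloud.
-:class: tip
-```python
-import json
-
-# An illustrative summarize_hed_tags operation with word cloud settings.
-op_list = [{
-    "operation": "summarize_hed_tags",
-    "description": "Summarize HED tags and produce a word cloud.",
-    "parameters": {
-        "summary_name": "tag_summary_with_cloud",
-        "summary_filename": "tag_summary_with_cloud",
-        "tags": {"Sensory events": ["Sensory-event"]},
-        "word_cloud": {"height": 300, "width": 400, "prefer_horizontal": 0.75}
-    }
-}]
-
-with open("summarize_tags_cloud_rmdl.json", "w") as fp:
-    json.dump(op_list, fp, indent=4)
-```
-````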
-
-(summarize-hed-tags-example-anchor)=
-#### Summarize HED tags example
-
-The following remodeling command specifies that the tag counts should be grouped
-under the titles: *Sensory events*, *Agent actions*, and *Objects*.
-Any leftover tags will appear under the title "Other tags".
-
-````{admonition} A JSON file with a single *summarize_hed_tags* summarization operation.
-:class: tip
-```json
-[{
- "operation": "summarize_hed_tags",
- "description": "Summarize the HED tags in the dataset.",
- "parameters": {
- "summary_name": "summarize_hed_tags",
- "summary_filename": "summarize_hed_tags",
- "tags": {
- "Sensory events": ["Sensory-event", "Sensory-presentation",
- "Task-stimulus-role", "Experimental-stimulus"],
- "Agent actions": ["Agent-action", "Agent", "Action", "Agent-task-role",
- "Task-action-type", "Participant-response"],
- "Objects": ["Item"]
- }
- }
-}]
-```
-````
-
-The results of executing this operation on the
-[**sample remodel event file**](sample-remodel-event-file-anchor) are shown below.
-
-````{admonition} Text summary of *summarize_hed_tags* operation on the sample remodel file.
-:class: tip
-
-```text
-Summary name: summarize_hed_tags
-Summary type: hed_tag_summary
-Summary filename: summarize_hed_tags
-
-Overall summary:
-Dataset: Total events=6 Total files=1
- Main tags[events,files]:
- Sensory events:
- Sensory-presentation[6,1] Visual-presentation[6,1] Auditory-presentation[3,1]
- Agent actions:
- Incorrect-action[2,1] Correct-action[1,1]
- Objects:
- Image[6,1]
- Other tags[events,files]:
- Label[6,1] Def[6,1] Delay[3,1]
-
-Individual files:
-
-aomic_sub-0013_excerpt_events.tsv:
-Total events=6
- Main tags[events,files]:
- Sensory events:
- Sensory-presentation[6,1] Visual-presentation[6,1] Auditory-presentation[3,1]
- Agent actions:
- Incorrect-action[2,1] Correct-action[1,1]
- Objects:
- Image[6,1]
- Other tags[events,files]:
- Label[6,1] Def[6,1] Delay[3,1]
-
-```
-````
-
-Because the HED tag *Task-action-type* was specified in the "Agent actions" category,
-*Incorrect-action* and *Correct-action*, which are children of *Task-action-type*
-in the [**HED schema**](https://www.hedtags.org/display_hed.html),
-appear with counts in the list under this category.
-
-The sample events file had 6 events, including 1 correct action and 2 incorrect actions.
-Since only one file was processed, the information for *Dataset* was
-similar to that presented under *Individual files*.
-
-For a more extensive example, see the
-[**text**](./_static/data/summaries/FacePerception_hed_tag_summary.txt)
-and [**JSON**](./_static/data/summaries/FacePerception_hed_tag_summary.json)
-format summaries of the sample dataset
-[**ds003645s_hed**](https://github.com/hed-standard/hed-examples/tree/main/datasets/eeg_ds003645s_hed)
-using the [**summarize_hed_tags_rmdl.json**](./_static/data/summaries/summarize_hed_tags_rmdl.json)
-remodeling file.
-
-(summarize-hed-type-anchor)=
-### Summarize HED type
-
-The *summarize_hed_type* operation is designed to extract experimental design matrices or other
-experimental structure.
-This summary operation assumes that the structure in question is suitably
-annotated with HED (Hierarchical Event Descriptors).
-The [**HED conditions and design matrices**](https://hed-examples.readthedocs.io/en/latest/HedConditionsAndDesignMatrices.html)
-guide explains how this works.
-
-(summarize-hed-type-parameters-anchor)=
-#### Summarize HED type parameters
-
-The *summarize_hed_type* operation provides detailed information about a specified tag,
-usually `Condition-variable` or `Task`.
-This summary provides useful information about experimental design.
-
-```{admonition} Parameters for the *summarize_hed_type* operation.
-:class: tip
-
-| Parameter | Type | Description |
-| ------------ | ---- | ----------- |
-| *summary_name* | str | A unique name used to identify this summary.|
-| *summary_filename* | str | A unique file basename to use for saving this summary. |
-| *type_tag* | str | Tag to produce a summary for (most often *condition-variable*).|
-| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename.|
-```
-In addition to the two standard parameters (*summary_name* and *summary_filename*),
-the *type_tag* parameter is required.
-Only one tag can be given, so you must provide a separate operation in the remodel file
-for each type tag you want summarized.
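-
-For example, the following sketch writes a remodel file with two *summarize_hed_type*
-operations, one for condition variables and one for tasks (summary names illustrative).
-
-````{admonition} Python sketch of a remodel file with two summarize_hed_type operations.
-:class: tip
-```python
-import json
-
-# One summarize_hed_type operation per type tag.
-op_list = [
-    {"operation": "summarize_hed_type",
-     "description": "Summarize the condition variables.",
-     "parameters": {"summary_name": "condition_summary",
-                    "summary_filename": "condition_summary",
-                    "type_tag": "condition-variable"}},
-    {"operation": "summarize_hed_type",
-     "description": "Summarize the tasks.",
-     "parameters": {"summary_name": "task_summary",
-                    "summary_filename": "task_summary",
-                    "type_tag": "task"}}
-]
-
-with open("hed_type_summaries_rmdl.json", "w") as fp:
-    json.dump(op_list, fp, indent=4)
-```
-````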
-
-(summarize-hed-type-example-anchor)=
-#### Summarize HED type example
-
-````{admonition} A JSON file with a single *summarize_hed_type* summarization operation.
-:class: tip
-```json
-[{
- "operation": "summarize_hed_type",
- "description": "Summarize column names.",
- "parameters": {
- "summary_name": "AOMIC_condition_variables",
- "summary_filename": "AOMIC_condition_variables",
- "type_tag": "condition-variable"
- }
-}]
-```
-````
-
-The results of executing this operation on the
-[**sample remodel event file**](sample-remodel-event-file-anchor) are shown below.
-
-````{admonition} Text summary of *summarize_hed_type* operation on the sample remodel file.
-:class: tip
-
-```text
-Summary name: AOMIC_condition_variables
-Summary type: hed_type_summary
-Summary filename: AOMIC_condition_variables
-
-Overall summary:
-
-Dataset: Type=condition-variable Type values=1 Total events=6 Total files=1
-   image-sex: 2 levels in 6 event(s) out of 6 total events in 1 file(s)
- female-image-cond [4,1]: ['Female', 'Image', 'Face']
- male-image-cond [2,1]: ['Male', 'Image', 'Face']
-
-Individual files:
-
-aomic_sub-0013_excerpt_events.tsv:
-Type=condition-variable Total events=6
- image-sex: 2 levels in 6 events
- female-image-cond [4 events, 1 files]:
- Tags: ['Female', 'Image', 'Face']
- male-image-cond [2 events, 1 files]:
- Tags: ['Male', 'Image', 'Face']
-```
-````
-
-Because *summarize_hed_type* is a HED operation,
-a HED schema version is required and a JSON sidecar is also usually needed.
-This summary was produced by using `hed_version="8.1.0"` when creating the `dispatcher`
-and using the [**sample remodel sidecar file**](sample-remodel-sidecar-file-anchor) in the `do_op`.
-The sidecar provides the annotations that use the `condition-variable` tag in the summary.
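-
-The following sketch shows how such a summary might be produced programmatically.
-It assumes that the `Dispatcher` class can be imported from `hed.tools.remodeling.dispatcher`,
-and the file names are illustrative.
-
-````{admonition} Python sketch of running a remodel file with a HED schema version and sidecar.
-:class: tip
-```python
-import json
-from hed.tools.remodeling.dispatcher import Dispatcher
-
-# Load the remodel operations and create a dispatcher with a HED schema version.
-with open("summarize_hed_type_rmdl.json", "r") as fp:
-    op_list = json.load(fp)
-dispatcher = Dispatcher(op_list, hed_versions="8.1.0")
-
-# Run the operations on one events file, supplying the sidecar with HED annotations.
-df = dispatcher.run_operations("aomic_sub-0013_excerpt_events.tsv",
-                               sidecar="aomic_sub-0013_excerpt_events.json")
-```
-````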
-
-For a more extensive example, see the
-[**text**](./_static/data/summaries/FacePerception_hed_type_summary.txt)
-and [**JSON**](./_static/data/summaries/FacePerception_hed_type_summary.json)
-format summaries of the sample dataset
-[**ds003645s_hed**](https://github.com/hed-standard/hed-examples/tree/main/datasets/eeg_ds003645s_hed)
-using the [**summarize_hed_types_rmdl.json**](./_static/data/summaries/summarize_hed_types_rmdl.json)
-remodeling file.
-
-(summarize-hed-validation-anchor)=
-### Summarize HED validation
-
-The *summarize_hed_validation* operation runs the HED validator on the requested data
-and produces a summary of the errors.
-See the [**HED validation guide**](./HedValidationGuide.md) for available methods of
-running the HED validator.
-
-
-(summarize-hed-validation-parameters-anchor)=
-#### Summarize HED validation parameters
-
-In addition to the required *summary_name* and *summary_filename* parameters,
-the *summarize_hed_validation* operation has an optional boolean parameter *check_for_warnings*.
-If *check_for_warnings* is false, the summary will not report warnings.
-
-```{admonition} Parameters for the *summarize_hed_validation* operation.
-:class: tip
-
-| Parameter | Type | Description |
-| ------------ | ---- | ----------- |
-| *summary_name* | str | A unique name used to identify this summary.|
-| *summary_filename* | str | A unique file basename to use for saving this summary. |
-| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |
-| *check_for_warnings* | bool | (**Optional**: Default false) If true, warnings are reported in addition to errors. |
-```
-The *summarize_hed_validation* operation is a HED operation, so the calling program must provide a HED schema version
-and usually a JSON sidecar containing the HED annotations.
-
-The validation process takes place in two stages: first the JSON sidecar is validated.
-This strategy is used because a single error in the JSON sidecar can generate an error message
-for every line in the corresponding data file.
-
-If the JSON sidecar has errors (warnings don't count), the validation process is terminated
-without validation of the data file and assembled HED annotations.
-
-If the JSON sidecar does not have errors,
-the validator assembles the annotations for each line in the data files and validates
-the assembled HED annotation.
-Data file-wide consistency, such as matched onsets and offsets, is also checked.
-
-
-(summarize-hed-validation-example-anchor)=
-#### Summarize HED validation example
-
-````{admonition} A JSON file with a single *summarize_hed_validation* summarization operation.
-:class: tip
-```json
-[{
- "operation": "summarize_hed_validation",
- "description": "Summarize validation errors in the sample dataset.",
- "parameters": {
- "summary_name": "AOMIC_sample_validation",
- "summary_filename": "AOMIC_sample_validation",
- "check_for_warnings": true
- }
-}]
-```
-````
-
-To demonstrate the output of the validation operation, we modified the first row of the
-[**sample remodel event file**](sample-remodel-event-file-anchor)
-so that the `trial_type` column contained the value `baloney` rather than `go`.
-This modification generates a warning because the meaning of `baloney` is not defined
-in the [**sample remodel sidecar file**](sample-remodel-sidecar-file-anchor).
-The results of executing the example operation with the modified file are shown
-in the following example.
-
-
-````{admonition} Text summary of *summarize_hed_validation* operation on a modified sample data file.
-:class: tip
-
-```text
-Summary name: AOMIC_sample_validation
-Summary type: hed_validation
-Summary filename: AOMIC_sample_validation
-
-Summary details:
-
-Dataset: [1 sidecar files, 1 event files]
- task-stopsignal_acq-seq_events.json: 0 issues
- sub-0013_task-stopsignal_acq-seq_events.tsv: 6 issues
-
-Individual files:
-
- sub-0013_task-stopsignal_acq-seq_events.tsv: 1 sidecar files
- task-stopsignal_acq-seq_events.json has no issues
- sub-0013_task-stopsignal_acq-seq_events.tsv issues:
- HED_UNKNOWN_COLUMN: WARNING: Column named 'onset' found in file, but not specified as a tag column or identified in sidecars.
- HED_UNKNOWN_COLUMN: WARNING: Column named 'duration' found in file, but not specified as a tag column or identified in sidecars.
- HED_UNKNOWN_COLUMN: WARNING: Column named 'response_time' found in file, but not specified as a tag column or identified in sidecars.
- HED_UNKNOWN_COLUMN: WARNING: Column named 'response_accuracy' found in file, but not specified as a tag column or identified in sidecars.
- HED_UNKNOWN_COLUMN: WARNING: Column named 'response_hand' found in file, but not specified as a tag column or identified in sidecars.
- HED_SIDECAR_KEY_MISSING[row=0,column=2]: WARNING: Category key 'baloney' does not exist in column. Valid keys are: ['succesful_stop', 'unsuccesful_stop', 'go']
-
-```
-````
-
-This summary was produced using HED schema version `hed_version="8.1.0"` when creating the `dispatcher`
-and using the [**sample remodel sidecar file**](sample-remodel-sidecar-file-anchor) in the `do_op`.
-
-
-(summarize-sidecar-from-events-anchor)=
-### Summarize sidecar from events
-
-The summarize sidecar from events operation generates a sidecar template from the event
-files in the dataset.
-
-
-(summarize-sidecar-from-events-parameters-anchor)=
-#### Summarize sidecar from events parameters
-
-The following table lists the parameters required for using the summary.
-
-```{admonition} Parameters for the *summarize_sidecar_from_events* operation.
-:class: tip
-
-| Parameter | Type | Description |
-| ------------ | ---- | ----------- |
-| *summary_name* | str | A unique name used to identify this summary.|
-| *summary_filename* | str | A unique file basename to use for saving this summary. |
-| *skip_columns* | list | A list of column names to omit from the sidecar.|
-| *value_columns* | list | A list of columns to treat as value columns in the sidecar. |
-| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |
-```
-The standard summary parameters, *summary_name* and *summary_filename*, are required.
-The *summary_name* is the unique key used to identify the
-particular incarnation of this summary in the dispatcher.
-Since a remodel file may use a given operation multiple times,
-care should be taken to make sure that each *summary_name* is unique.
-
-The *summary_filename* should also be unique and is used for saving the summary upon request.
-When the remodeler is applied to full datasets rather than single files,
-the summaries are saved in the `derivatives/remodel/summaries` directory under the dataset root.
-A time stamp and file extension are appended to the *summary_filename* when the
-summary is saved.
-
-In addition to the standard parameters *summary_name* and *summary_filename* required of all summaries,
-the *summarize_sidecar_from_events* operation requires two additional list parameters.
-The *skip_columns* list specifies the names of columns to skip entirely in
-generating the sidecar template.
-The *value_columns* list specifies the names of columns to treat as value columns
-when generating the sidecar template.
-
-(summarize-sidecar-from-events-example-anchor)=
-#### Summarize sidecar from events example
-
-The following example shows the JSON for including this operation in a remodeling file.
-
-````{admonition} A JSON file with a single *summarize_sidecar_from_events* summarization operation.
-:class: tip
-```json
-[{
- "operation": "summarize_sidecar_from_events",
- "description": "Generate a sidecar from the excerpted events file.",
- "parameters": {
- "summary_name": "AOMIC_generate_sidecar",
- "summary_filename": "AOMIC_generate_sidecar",
- "skip_columns": ["onset", "duration"],
- "value_columns": ["response_time", "stop_signal_delay"]
- }
-}]
-
-```
-````
-
-The results of executing this operation on the
-[**sample remodel event file**](sample-remodel-event-file-anchor)
-are shown in the following example using the text format.
-
-````{admonition} Sample *summarize_sidecar_from_events* operation results in text format.
-:class: tip
-```text
-Summary name: AOMIC_generate_sidecar
-Summary type: events_to_sidecar
-Summary filename: AOMIC_generate_sidecar
-
-Dataset: Currently no overall sidecar extraction is available
-
-Individual files:
-
-aomic_sub-0013_excerpt_events.tsv: Total events=6 Skip columns: ['onset', 'duration']
-Sidecar:
-{
- "trial_type": {
- "Description": "Description for trial_type",
- "HED": {
- "go": "(Label/trial_type, Label/go)",
- "succesful_stop": "(Label/trial_type, Label/succesful_stop)",
- "unsuccesful_stop": "(Label/trial_type, Label/unsuccesful_stop)"
- },
- "Levels": {
- "go": "Here describe column value go of column trial_type",
- "succesful_stop": "Here describe column value succesful_stop of column trial_type",
- "unsuccesful_stop": "Here describe column value unsuccesful_stop of column trial_type"
- }
- },
- "response_accuracy": {
- "Description": "Description for response_accuracy",
- "HED": {
- "correct": "(Label/response_accuracy, Label/correct)"
- },
- "Levels": {
- "correct": "Here describe column value correct of column response_accuracy"
- }
- },
- "response_hand": {
- "Description": "Description for response_hand",
- "HED": {
- "left": "(Label/response_hand, Label/left)",
- "right": "(Label/response_hand, Label/right)"
- },
- "Levels": {
- "left": "Here describe column value left of column response_hand",
- "right": "Here describe column value right of column response_hand"
- }
- },
- "sex": {
- "Description": "Description for sex",
- "HED": {
- "female": "(Label/sex, Label/female)",
- "male": "(Label/sex, Label/male)"
- },
- "Levels": {
- "female": "Here describe column value female of column sex",
- "male": "Here describe column value male of column sex"
- }
- },
- "response_time": {
- "Description": "Description for response_time",
- "HED": "(Label/response_time, Label/#)"
- },
- "stop_signal_delay": {
- "Description": "Description for stop_signal_delay",
- "HED": "(Label/stop_signal_delay, Label/#)"
- }
-}
-```
-````
-
-(remodel-implementation-anchor)=
-## Remodel implementation
-
-Operations are defined as classes that extend `BaseOp` regardless of whether
-they are transformations or summaries. However, summaries must also implement
-an additional supporting class that extends `BaseSummary` to hold the summary information.
-
-In order to be executed by the remodeling functions,
-an operation must appear in the `valid_operations` dictionary.
-
-Each operation class must have a `NAME` class variable specifying the operation name (a string) and a
-`PARAMS` class variable containing a dictionary of the operation's parameters represented as a JSON schema.
-The operation's constructor extends the `BaseOp` class constructor by calling:
-
-````{admonition} A remodel operation class must call the BaseOp constructor first.
-:class: tip
-```python
- super().__init__(parameters)
-```
-````
-
-A remodel operation class must implement the `BaseOp` abstract methods `do_op` and `validate_input_data`.
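-
-As a concrete sketch, a minimal operation might look like the following.
-The operation name is made up, and the import path is an assumption based on the repository layout.
-
-````{admonition} Sketch of a minimal remodel operation class.
-:class: tip
-```python
-from hed.tools.remodeling.operations.base_op import BaseOp
-
-
-class ExampleOp(BaseOp):
-    """ A do-nothing operation sketch. """
-    NAME = "example_op"
-    PARAMS = {
-        "type": "object",
-        "properties": {},
-        "required": [],
-        "additionalProperties": False
-    }
-
-    def __init__(self, parameters):
-        super().__init__(parameters)
-
-    def do_op(self, dispatcher, df, name, sidecar=None):
-        # Return a new DataFrame rather than modifying df in place.
-        return df.copy()
-
-    @staticmethod
-    def validate_input_data(parameters):
-        # No validation beyond the JSON schema is needed.
-        return []
-```
-````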
-
-### The PARAMS dictionary
-
-The class-wide `PARAMS` dictionary specifies the required and optional parameters of the operation as a [**JSON schema**](https://json-schema.org/).
-We currently use draft-2020-12.
-The basic vocabulary allows specifying the type of parameters that are expected and
-whether a parameter is required or optional.
-
-It is also possible to add dependencies between parameters. More information can be found in the JSON schema
-[**documentation**](https://json-schema.org/learn/getting-started-step-by-step).
-
-At the highest level, the type should always be specified as an object, since the parameters are always provided as a dictionary or JSON object.
-Under the `properties` key, the expected parameters should be listed, along with the datatype expected for each parameter.
-The specifications can be nested. For example, the `rename_columns` operation requires a parameter `column_mapping`,
-which should be a JSON object whose keys are any valid string and whose values are also strings.
-This is represented in the following way:
-
-```json
-{
- "type": "object",
- "properties": {
- "column_mapping": {
- "type": "object",
- "patternProperties": {
- ".*": {
- "type": "string"
- }
- },
- "minProperties": 1
- },
- "ignore_missing": {
- "type": "boolean"
- }
- },
- "required": [
- "column_mapping",
- "ignore_missing"
- ],
- "additionalProperties": false
- }
-```
-
-The `PARAMS` dictionaries for all available operations are read by the `validator` and compiled into a
-single JSON schema which represents the specification for remodeler files.
-The `properties` dictionary explicitly specifies the parameters that are allowed for this operation.
-The `required` list specifies which parameters must be included when calling the operation.
-Parameters that are not required may be omitted in the operation call.
-The `"additionalProperties": false` setting is the JSON schema way of
-indicating that no other parameters are allowed in the call to the operation.
-
-A limitation of JSON schema representations is that although JSON schema can handle specific dependencies between keys in the data,
-it cannot validate data provided in the JSON file against other data in the same file.
-For example, if the requirement is a list of elements whose length should be specified by another parameter,
-JSON schema does not provide a vocabulary for expressing this dependency.
-Instead, we handle these types of dependencies in the `validate_input_data` method.
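-
-To illustrate how such a schema is applied, the following sketch uses the third-party
-[**jsonschema**](https://pypi.org/project/jsonschema/) package to check a parameter dictionary
-against the `rename_columns` schema shown above (the parameter values are made up).
-
-````{admonition} Sketch of validating operation parameters against a PARAMS schema.
-:class: tip
-```python
-import jsonschema
-
-# The rename_columns PARAMS schema from the example above.
-params_schema = {
-    "type": "object",
-    "properties": {
-        "column_mapping": {
-            "type": "object",
-            "patternProperties": {".*": {"type": "string"}},
-            "minProperties": 1
-        },
-        "ignore_missing": {"type": "boolean"}
-    },
-    "required": ["column_mapping", "ignore_missing"],
-    "additionalProperties": False
-}
-
-# Raises jsonschema.ValidationError if the parameters do not conform.
-parameters = {"column_mapping": {"trial_type": "event_type"}, "ignore_missing": True}
-jsonschema.validate(instance=parameters, schema=params_schema)
-```
-````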
-
-(operation-class-constructor-anchor)=
-### Operation class constructor
-
-All the operation classes have constructors that start with a call to the superclass constructor `BaseOp`.
-The following example shows the constructor for the `RemoveColumnsOp` class.
-
-````{admonition} The constructor for the RemoveColumnsOp class.
-:class: tip
-```python
- def __init__(self, parameters):
- super().__init__(parameters)
- self.column_names = parameters['column_names']
- ignore_missing = parameters['ignore_missing']
- if ignore_missing:
- self.error_handling = 'ignore'
- else:
- self.error_handling = 'raise'
-```
-````
-
-After the call to the base class constructor, the operation constructor assigns the operation-specific
-values to class properties. Validation takes place before the operation classes are initialized.
-
-
-(the-do_op-implementation-anchor)=
-### The do_op implementation
-The remodeling script is meant to be executed by the `Dispatcher`,
-which keeps a compiled version of the remodeling script to execute on each tabular file to be remodeled.
-
-The main method that must be implemented by each operation is `do_op`, which takes
-an instance of the `Dispatcher` class as the first parameter and a Pandas [`DataFrame`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html)
-representing the tabular file as the second parameter.
-A third required parameter is a name used to identify the tabular file in error messages and summaries.
-This name is usually the filename or the filepath from the dataset root.
-An additional optional argument, a sidecar containing HED annotations,
-needs to be included only for HED operations.
-Note that the `Dispatcher` is responsible for holding the appropriate version of the HED schema if
-HED remodeling operations are included.
-
-The following example shows a sample implementation for `do_op`.
-
-````{admonition} The implementation of do_op for the RemoveColumnsOp class.
-:class: tip
-```python
-
- def do_op(self, dispatcher, df, name, sidecar=None):
-        return df.drop(self.column_names, axis=1, errors=self.error_handling)
-```
-````
-
-The `do_op` in this case is a wrapper for the underlying Pandas `DataFrame`
-operation for removing columns.
-
-**IMPORTANT NOTE**: The `do_op` operation always assumes that `n/a` values have been
-replaced by `numpy.NaN` values in the incoming dataframe `df`.
-The `Dispatcher` class has a static method `prep_data` that does this replacement.
-At the end of running all the remodeling operations on a data file, the `Dispatcher` method
-`run_operations` replaces all of the `numpy.NaN` values with `n/a`, the value expected by BIDS.
-This replacement is performed by the `Dispatcher` static method `post_proc_data`.
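-
-The following sketch shows how these static methods bracket the remodeling operations
-when processing a single file manually (the file paths are illustrative, and the import
-path of `Dispatcher` is an assumption).
-
-````{admonition} Sketch of using prep_data and post_proc_data around remodeling operations.
-:class: tip
-```python
-import pandas as pd
-from hed.tools.remodeling.dispatcher import Dispatcher
-
-# Read a tabular file and replace n/a values with numpy.NaN before any do_op calls.
-df = pd.read_csv("events.tsv", sep="\t")
-df = Dispatcher.prep_data(df)
-
-# ... apply the do_op calls of the remodeling operations here ...
-
-# Replace numpy.NaN values with the n/a representation expected by BIDS.
-df = Dispatcher.post_proc_data(df)
-df.to_csv("events_remodeled.tsv", sep="\t", index=False)
-```
-````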
-
-
-### The validate_input_data implementation
-
-This method exists to handle additional input data validation that cannot be specified in JSON schema.
-It is a static method called by the `validator`.
-If there is no additional validation to be done,
-a minimal implementation of this method should take in a dictionary with the operation parameters and return an empty list.
-In case additional validation is required, the method should directly implement validation and return a list of user-friendly
-error messages (strings) if validation fails, or an empty list if there are no errors.
-
-The following implementation of `validate_input_data` method, for the `factor_hed_tags` operation,
-checks whether the parameter `query_names` is the same length as the input for parameter `queries`,
-since the names specified in the first parameter are meant to represent the queries provided in the latter.
-The check only takes place if `query_names` exists, since naming is handled automatically otherwise.
-
-```python
-@staticmethod
-def validate_input_data(parameters):
- errors = []
- if parameters.get("query_names", False):
- if len(parameters.get("query_names")) != len(parameters.get("queries")):
- errors.append("The list in query_names, in the factor_hed_tags operation, should have the same number of items as queries.")
- return errors
-```
-
-
-(the-do_op-for-summarization-anchor)=
-### The do_op for summarization
-
-The `do_op` operation for summarization operations has a slightly different form,
-as it serves primarily as a wrapper for the actual summary information as illustrated
-by the following example.
-
-(implementation-of-do-op_summarize-column-names-anchor)=
-````{admonition} The implementation of do_op for SummarizeColumnNamesOp.
-:class: tip
-```python
- def do_op(self, dispatcher, df, name, sidecar=None):
- summary = dispatcher.summary_dict.get(self.summary_name, None)
- if not summary:
- summary = ColumnNameSummary(self)
- dispatcher.summary_dict[self.summary_name] = summary
- summary.update_summary({"name": name, "column_names": list(df.columns)})
- return df
-
-```
-````
-
-A `do_op` operation for a summarization checks the `dispatcher` to see if the
-summary name is already in the dispatcher's `summary_dict`.
-If that summary is not yet in the `summary_dict`,
-the operation creates a `BaseSummary` object for its summary (e.g., `ColumnNameSummary`)
-and adds this object to the dispatcher's `summary_dict`,
-otherwise the operation fetches the `BaseSummary` object from the dispatcher's `summary_dict`.
-It then asks this `BaseSummary` object to update the summary based on the `DataFrame`
-as explained in the next section.
-
-(additional-requirements-for-summarization-anchor)=
-### Additional requirements for summarization
-
-Any summary operation must implement a supporting class that extends `BaseSummary`.
-This class is used to hold and accumulate the information specific to the summary.
-This support class must implement two methods: `update_summary` and `get_summary_details`.
-
-The `update_summary` method is called by its associated `BaseOp` operation during the `do_op`
-to update the summary information based on the current `DataFrame`.
-The `update_summary` method takes a single parameter, which is a dictionary of information
-specific to this operation.
-
-````{admonition} The update_summary method required to be implemented by all BaseSummary objects.
-:class: tip
-```python
- def update_summary(self, summary_dict)
-```
-````
-
-In the example [do_op for SummarizeColumnNamesOp](implementation-of-do-op_summarize-column-names-anchor),
-the dictionary contains keys for `name` and `column_names`.
-
-The `get_summary_details` method returns a dictionary with the summary-specific information
-currently in the summary.
-The `BaseSummary` class provides universal methods for converting this summary to JSON or text format.
-
-
-````{admonition} The get_summary_details method required to be implemented by all BaseSummary objects.
-:class: tip
-```python
- get_summary_details(self, verbose=True)
-```
-````
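-
-Putting these pieces together, a minimal supporting class might look like the following sketch.
-The class name and import path are assumptions, the stored details are illustrative,
-and the real base class may require additional supporting methods (for example, for merging
-per-file information into an overall summary).
-
-````{admonition} Sketch of a minimal BaseSummary subclass.
-:class: tip
-```python
-from hed.tools.remodeling.operations.base_summary import BaseSummary
-
-
-class ExampleSummary(BaseSummary):
-    """ A summary sketch that tracks the names of the files it has seen. """
-
-    def __init__(self, sum_op):
-        super().__init__(sum_op)
-        self.file_names = []
-
-    def update_summary(self, summary_dict):
-        # Called by the associated operation's do_op with operation-specific info.
-        self.file_names.append(summary_dict["name"])
-
-    def get_summary_details(self, verbose=True):
-        # Return the summary-specific information accumulated so far.
-        return {"files_processed": len(self.file_names)}
-```
-````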
-
-### Validator implementation
-
-The required input for the remodeler is specified in JSON format and must follow
-the rules laid out by the JSON schema.
-The parameters in the remodeler file must conform to the properties specified
-in the corresponding JSON schema associated with each operation.
-Errors are retrieved from the JSON schema validator but are not passed on directly; instead they are
-modified for display as user-friendly error messages.
-
-Validation errors are organized by stages as follows.
-
-#### Stage 0: top-level structure
-
-Stage 0 refers to the top-level structure of the remodel JSON file.
-As specified by the validator's `BASE_ARRAY`,
-a JSON remodel file must be an array of operation dictionaries containing at least 1 item.
-
-#### Stage 1: operation dictionary format
-
-Stage 1 validation refers to the structure of the individual operations as specified by the validator's `OPERATION_DICT`.
-Every operation dictionary must have exactly the keys: `operation`, `description`, and `parameters`.
-
-#### Stage 2: operation dictionary values
-
-Stage 2 validates the values associated with the keys in each operation dictionary.
-The `operation` and `description` keys should have string values,
-while the `parameters` key should have a dictionary value.
-
-Stage 2 validation also verifies that the operation value is one of the valid operations as
-enumerated in the `valid_operations` dictionary.
-
-Several checks are also applied to the `parameters` dictionary.
-The properties listed as `required` in the schema must appear as keys in the `parameters` dictionary.
-
-If additional properties are not allowed, as designated by `"additionalProperties": False` in the JSON schema,
-the validator verifies that parameters not mentioned in the schema do not appear.
-Note this is currently true for all operations and recommended for new operations.
-
-If the schema for the operation has a `dependentRequired` dictionary, the validator
-verifies that the indicated keys are present if the listed parameter values are also present.
-For example, the `factor_column_op` only allows the `factor_names` parameter if the `factor_values`
-parameter is also present. In this case the dependency works only one way, such that `factor_values`
-can be provided without `factor_names`. If `factor_values` is provided alone, the operation automatically
-generates the factor names based on the column values. However, without `factor_values`, the names provided
-in `factor_names` do not correspond to anything, so `factor_names` cannot appear on its own.
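-
-The following fragment sketches how such a one-way dependency can be expressed with the
-JSON schema `dependentRequired` keyword (the property types are illustrative).
-
-````{admonition} Sketch of a dependentRequired specification in a PARAMS schema.
-:class: tip
-```python
-# Fragment of a PARAMS schema: factor_names may appear only when factor_values is present.
-PARAMS_FRAGMENT = {
-    "type": "object",
-    "properties": {
-        "factor_values": {"type": "array", "items": {"type": "string"}},
-        "factor_names": {"type": "array", "items": {"type": "string"}}
-    },
-    "dependentRequired": {"factor_names": ["factor_values"]}
-}
-```
-````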
-
-#### Later validation stages
-
-Later stages in validation concern the values given within the parameter object, which can be nested to an arbitrary level
-and are handled in a general way.
-The user is provided with the operation index, name, and the 'path' of the value that is invalid.
-Note that while `parameters` is always an object, the values within it can be of any type.
-Thus, parameter values can be objects whose values might also be expected to be objects, arrays, or arrays of objects.
-The validator has appropriate messages for many of the conditions that can be set with JSON schema,
-but if a new operation parameter has a condition that has not been used yet, a new error message will need to be added to the validator.
-
-
-When validation against JSON schema passes,
-the validator performs additional data-specific validation by calling `validate_input_data`
-for each operation to verify that input data satisfies the
-constraints that fall outside the scope of JSON schema.
-Also see [**The validate_input_data implementation**](#the-validate_input_data-implementation) and
-[**The PARAMS dictionary**](#the-params-dictionary) sections for additional information.
+(hed-remodeling-tools-anchor)=
+# HED remodeling tools
+
+**Remodeling** refers to the process of transforming a tabular file
+into a different form in order to disambiguate the
+information or to facilitate a particular analysis.
+The remodeling operations are specified in a JSON (`.json`) file,
+giving a record of the transformations performed.
+
+There are two types of remodeling operations: **transformation** and **summarization**.
+The **transformation** operations modify the tabular files,
+while **summarization** produces an auxiliary information file but leaves
+the tabular files unchanged.
+
+The file remodeling tools can be applied to any tab-separated value (`.tsv`) file
+but are particularly useful for restructuring files representing experimental events.
+Please read the [**HED remodeling quickstart**](./HedRemodelingQuickstart.md)
+tutorial for an introduction and basic use of the tools.
+
+The file remodeling tools can be applied to individual files using the
+[**HED online tools**](https://hedtools.ucsd.edu/hed) or to entire datasets
+using the [**remodel command-line interface**](remodel-command-line-interface-anchor)
+either by calling Python scripts directly from the command line
+or by embedding calls in a Jupyter notebook.
+The tools are also available as
+[**HED RESTful services**](./HedOnlineTools.md#hed-restful-services).
+The online tools are particularly useful for debugging.
+
+This user's guide contains the following topics:
+
+* [**Overview of remodeling**](overview-of-remodeling-anchor)
+* [**Installing the remodel tools**](installing-the-remodel-tools-anchor)
+* [**Remodel command-line interface**](remodel-command-line-interface-anchor)
+* [**Remodel scripts**](remodel-scripts-anchor)
+ * [**Backing up files**](backing-up-files-anchor)
+ * [**Remodeling files**](remodeling-files-anchor)
+ * [**Restoring files**](restoring-files-anchor)
+* [**Remodel with HED**](remodel-with-hed-anchor)
+* [**Remodel sample files**](remodel-sample-files-anchor)
+ * [**Sample remodel file**](sample-remodel-file-anchor)
+ * [**Sample remodel event file**](sample-remodel-event-file-anchor)
+ * [**Sample remodel sidecar file**](sample-remodel-sidecar-file-anchor)
+* [**Remodel transformations**](remodel-transformations-anchor)
+ * [**Factor column**](factor-column-anchor)
+ * [**Factor HED tags**](factor-hed-tags-anchor)
+ * [**Factor HED type**](factor-hed-type-anchor)
+ * [**Merge consecutive**](merge-consecutive-anchor)
+ * [**Remap columns**](remap-columns-anchor)
+ * [**Remove columns**](remove-columns-anchor)
+ * [**Remove rows**](remove-rows-anchor)
+ * [**Rename columns**](rename-columns-anchor)
+ * [**Reorder columns**](reorder-columns-anchor)
+ * [**Split rows**](split-rows-anchor)
+* [**Remodel summarizations**](remodel-summarizations-anchor)
+ * [**Summarize column names**](summarize-column-names-anchor)
+ * [**Summarize column values**](summarize-column-values-anchor)
+ * [**Summarize definitions**](summarize-definitions-anchor)
+ * [**Summarize hed tags**](summarize-hed-tags-anchor)
+ * [**Summarize hed type**](summarize-hed-type-anchor)
+ * [**Summarize hed validation**](summarize-hed-validation-anchor)
+ * [**Summarize sidecar from events**](summarize-sidecar-from-events-anchor)
+* [**Remodel implementation**](remodel-implementation-anchor)
+
+
+(overview-of-remodeling-anchor)=
+## Overview of remodeling
+
+Remodeling consists of restructuring and/or extracting information from tab-separated
+value files based on a specified list of operations contained in a JSON file.
+
+Internally, the remodeling operations represent the tabular file using a
+[**Pandas DataFrame**](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html).
+
+(transformation-operations-anchor)=
+### Transformation operations
+
+**Transformation** operations, shown schematically in the following
+figure, are designed to transform an incoming tabular file
+into a new DataFrame without modifying the incoming data.
+
+![Transformation operations](./_static/images/TransformationOperations.png)
+
+Transformation operations are stateless and do not save any context information or
+affect future applications of the transformation.
+
+Transformations, themselves, do not have any output and just return a new,
+transformed DataFrame.
+In other words, transformations do not operate in place on the incoming DataFrame,
+but rather, they create a new DataFrame containing the result.
+
+Typically, the calling program is responsible for reading and saving the tabular file,
+so the user can choose whether to overwrite or create a new file.
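+
+A minimal sketch of this pattern is shown below; the operation list and file name are
+illustrative, and the import path of `Dispatcher` is an assumption.
+
+````{admonition} Sketch of the typical read-transform-save pattern.
+:class: tip
+```python
+from hed.tools.remodeling.dispatcher import Dispatcher
+
+# A single transformation: remove a temporary column (illustrative).
+op_list = [{"operation": "remove_columns",
+            "description": "Remove a temporary column.",
+            "parameters": {"column_names": ["temp"], "ignore_missing": True}}]
+dispatcher = Dispatcher(op_list)
+
+# run_operations applies the operations and returns a new DataFrame;
+# the caller decides whether to overwrite the original file or save a new one.
+df_new = dispatcher.run_operations("sub-001_task-test_events.tsv")
+df_new.to_csv("sub-001_task-test_events.tsv", sep="\t", index=False)
+```
+````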
+
+See the [**remodeling tool program interface**](remodel-command-line-interface-anchor)
+section for information on how to call the operations.
+
+(summarization-operations-anchor)=
+### Summarization operations
+
+**Summarization** operations do not modify the input DataFrame but rather extract and save information in an internally stored summary dictionary as shown schematically in the following figure.
+
+![Summary operations](./_static/images/SummaryOperation.png)
+
+The dispatcher that executes remodeling operations can be interrogated at any time
+for the state information contained in the global summary dictionary and can save additional summary information at any time during execution.
+Usually summaries are dumped at the end of processing to the `derivatives/remodel/summaries`
+subdirectory under the dataset root.
+
+Summarization operations may appear anywhere in the operation list,
+and the same type of summary may appear multiple times under different names in order to track progress.
+
+The dispatcher stores information from each uniquely named summarization operation
+as a separate summary dictionary entry.
+Within its summary information, most summarization operations keep a separate
+summary for each individual file and have methods to create an overall summary
+of the information for all the files that have been processed by the summarization.
+
+Summarization results are available in JSON (`.json`) and text (`.txt`) formats.
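+
+For instance, after the operations have run, the accumulated summaries might be inspected
+as in the following sketch (operation list and file name illustrative; the `summary_dict`
+attribute and `get_summary_details` method are described in the implementation section below).
+
+````{admonition} Sketch of interrogating the dispatcher for summary information.
+:class: tip
+```python
+from hed.tools.remodeling.dispatcher import Dispatcher
+
+op_list = [{"operation": "summarize_column_names",
+            "description": "Track the column names.",
+            "parameters": {"summary_name": "column_names",
+                           "summary_filename": "column_names"}}]
+dispatcher = Dispatcher(op_list)
+dispatcher.run_operations("sub-001_task-test_events.tsv")
+
+# Each uniquely named summarization operation has its own entry in summary_dict.
+for summary_name, summary in dispatcher.summary_dict.items():
+    print(summary_name, summary.get_summary_details())
+```
+````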
+
+(available-operations-anchor)=
+### Available operations
+
+The following table lists the available remodeling operations with brief example use cases
+and links to further documentation. Operations not listed in the summarize section are transformations.
+
+(remodel-operation-summary-anchor)=
+````{table} Summary of the HED remodeling operations for tabular files.
+| Category | Operation | Example use case |
+| -------- | ------- | -----|
+| **clean-up** | | |
+| | [*remove_columns*](remove-columns-anchor) | Remove temporary columns created during restructuring. |
+| | [*remove_rows*](remove-rows-anchor) | Remove rows with n/a values in a specified column. |
+| | [*rename_columns*](rename-columns-anchor) | Make column names consistent across a dataset. |
+| | [*reorder_columns*](reorder-columns-anchor) | Make column order consistent across a dataset. |
+| **factor** | | |
+| | [*factor_column*](factor-column-anchor) | Extract factor vectors from a column of condition variables. |
+| | [*factor_hed_tags*](factor-hed-tags-anchor) | Extract factor vectors from search queries of HED annotations. |
+| | [*factor_hed_type*](factor-hed-type-anchor) | Extract design matrices and/or condition variables. |
+| **restructure** | | |
+| | [*merge_consecutive*](merge-consecutive-anchor) | Replace multiple consecutive events of the same type with one event of longer duration. |
+| | [*remap_columns*](remap-columns-anchor) | Create m columns from values in n columns (for recoding). |
+| | [*split_rows*](split-rows-anchor) | Split trial-encoded rows into multiple events. |
+| **summarize** | | |
+| | [*summarize_column_names*](summarize-column-names-anchor) | Summarize column names and order in the files. |
+| | [*summarize_column_values*](summarize-column-values-anchor) | Count the occurrences of the unique column values. |
+| | [*summarize_hed_tags*](summarize-hed-tags-anchor) | Summarize the HED tags present in the HED annotations for the dataset. |
+| | [*summarize_hed_type*](summarize-hed-type-anchor) | Summarize the detailed usage of a particular type tag such as *Condition-variable* or *Task* (used to automatically extract experimental designs). |
+| | [*summarize_hed_validation*](summarize-hed-validation-anchor) | Validate the data files and report any errors. |
+| | [*summarize_sidecar_from_events*](summarize-sidecar-from-events-anchor) | Generate a sidecar template from an event file. |
+````
+
+The **clean-up** operations are used at various phases of restructuring to ensure consistency
+across dataset files.
+
+The **factor** operations produce column vectors with the same number of rows as the data file
+from which they were calculated.
+They encode condition variables, design matrices, or other search criteria.
+See the [**HED conditions and design matrices**](./HedConditionsAndDesignMatrices.md)
+guide for more information on factoring and analysis.
+
+The **restructure** operations modify the way in which a data file represents its information.
+
+The **summarize** operations produce dataset-wide summaries of various aspects of the data files
+as well as summaries of the individual files.
+
+(installing-the-remodel-tools-anchor)=
+## Installing the remodel tools
+
+The remodeling tools are available in the GitHub
+[**hed-python**](https://github.com/hed-standard/hed-python) repository
+along with other tools for data cleaning and curation.
+Although version 0.1.0 of this repository is available on [**PyPI**](https://pypi.org/)
+as `hedtools`, the version containing the restructuring tools (Version 0.2.0)
+is still under development and has not been officially released.
+However, the code is publicly available on the `develop` branch of the
+hed-python repository and
+can be directly installed from GitHub using `pip`:
+
+```text
+pip install git+https://github.com/hed-standard/hed-python/@develop
+```
+
+The web services and online tools supporting remodeling are available
+on the [**HED online tools dev server**](https://hedtools.ucsd.edu/hed_dev).
+When version 0.2.0 of `hedtools` is officially released on PyPI, restructuring
+will become available on the released [**HED online tools**](https://hedtools.ucsd.edu/hed).
+A docker version is also under development.
+
+The following diagram shows a schematic of the remodeling process.
+
+![Event remodeling process](./_static/images/EventRemappingProcess.png)
+
+Initially, the user creates a backup of the specified tabular files (usually `events.tsv` files).
+This backup is a mirror of the data files in the dataset,
+but is located in the `derivatives/remodel/backups` directory and never modified once the backup is created.
+
+Remodeling applies a sequence of operations specified in a JSON remodel file
+to the backup versions of the data files.
+The JSON remodel file provides a record of the operations performed on the file.
+If the user detects a mistake in the transformations,
+they can correct the transformation file and rerun the transformations.
+
+Remodeling always runs on the original backup version of the file rather than
+the transformed version, so the transformations can always be corrected and rerun.
+It is possible to bypass the backup, particularly if you are only using summarization operations,
+but this is not recommended and should be done with care.
+
+(remodel-command-line-interface-anchor)=
+## Remodel command-line interface
+
+The remodeling toolbox provides Python scripts with command-line interfaces
+to create or restore backups and to apply operations to the files in a dataset.
+The file remodeling tools may be applied to datasets that are in free form under a directory root
+or that are in [**BIDS-format**](https://bids.neuroimaging.io/).
+Some operations use [**HED (Hierarchical Event Descriptors)**](./IntroductionToHed.md) annotations.
+See the [**Remodel with HED**](remodel-with-hed-anchor) section for a discussion
+of these operations and how to use them.
+
+The remodeling command-line interface can be used from the command line,
+called from another Python program, or used in a Jupyter notebook.
+Example Jupyter notebooks using the remodeling commands can be found
+[**here**](https://github.com/hed-standard/hed-examples/tree/main/src/jupyter_notebooks/remodeling).
+
+
+(calling-remodel-tools-anchor)=
+### Calling remodel tools
+
+The remodeling tools provide three Python programs for backup (`run_remodel_backup`),
+remodeling (`run_remodel`) and restoring (`run_remodel_restore`) event files.
+These programs can be called from the command line or from another Python program.
+
+The programs use a standard command-line argument list for specifying input as summarized in the following table.
+
+(remodeling-operation-summary-anchor)=
+````{table} Summary of command-line arguments for the remodeling programs.
+| Script name | Arguments | Purpose |
+| ----------- | -------- | ------- |
+|*run_remodel_backup* | *data_dir*<br/>*-bd -\\-backup-dir*<br/>*-bn -\\-backup-name*<br/>*-e -\\-extensions*<br/>*-f -\\-file-suffix*<br/>*-t -\\-task-names*<br/>*-v -\\-verbose*<br/>*-x -\\-exclude-dirs* | Create a backup of event files. |
+|*run_remodel* | *data_dir*<br/>*model_path*<br/>*-b -\\-bids-format*<br/>*-bd -\\-backup-dir*<br/>*-bn -\\-backup-name*<br/>*-e -\\-extensions*<br/>*-f -\\-file-suffix*<br/>*-i -\\-individual-summaries*<br/>*-j -\\-json-sidecar*<br/>*-ld -\\-log-dir*<br/>*-nb -\\-no-backup*<br/>*-ns -\\-no-summaries*<br/>*-nu -\\-no-update*<br/>*-r -\\-hed-versions*<br/>*-s -\\-save-formats*<br/>*-t -\\-task-names*<br/>*-v -\\-verbose*<br/>*-w -\\-work-dir*<br/>*-x -\\-exclude-dirs* | Restructure or summarize the event files. |
+|*run_remodel_restore* | *data_dir*<br/>*-bd -\\-backup-dir*<br/>*-bn -\\-backup-name*<br/>*-t -\\-task-names*<br/>*-v -\\-verbose* | Restore a backup of event files. |
+
+````
+All the scripts have a required argument, which is the full path of the dataset root (*data_dir*).
+The `run_remodel` program has a second required argument, the full path of a JSON remodel file
+containing a specification of the remodeling operations to be run.
+
+(remodel-command-line-arguments-anchor)=
+### Remodel command-line arguments
+
+This section describes the arguments that are used for the remodeling command-line interface
+with examples and more details.
+
+#### Positional arguments
+
+Positional arguments are required and must be given in the order specified.
+
+`data_dir`
+> The full path of the dataset root directory.
+
+`model_path`
+> The full path of the JSON remodel file (for *run_remodel* only).
+>
+#### Named arguments
+
+Named arguments consist of a key starting with a hyphen, possibly followed by a value.
+Named arguments can be given in any order or omitted.
+If omitted, a specified default is used.
+Argument keys and values are separated by spaces.
+
+For argument values that are lists, the key is given followed by the items in the list,
+all separated by spaces.
+
+Each command has two different forms of the key name: a short form (a single hyphen followed by a single character)
+and a longer form (two hyphens followed by a more self-explanatory name).
+Users are free to use either form.
+
+`-b`, `--bids-format`
+> If this flag is present, the dataset is in BIDS format with sidecars. Tabular files and their associated sidecars are located using BIDS naming.
+
+`-bd`, `--backup-dir`
+> The path to the directory holding the backups (default: `[data_root]/derivatives/remodel/backups`).
+> Use the `-nb` option if you wish to omit the backup (in `run_remodel`).
+
+`-bn`, `--backup-name`
+> The name of the backup used for the remodeling (default: `default_back`).
+
+`-e`, `--extensions`
+> This option is followed by a list of file extension(s) of the data files to process.
+> The default is `.tsv`. Comma-separated tabular files are not permitted.
+
+`-f`, `--file-suffix`
+> This option is followed by the suffix names of the files to be processed.
+> For example `events` (the default) captures files named `events.tsv` if the default extension is used.
+> The filename without the extension must end in one of the specified suffixes in order to be
+> backed up or transformed.
+
+`-i`, `--individual-summaries`
+> This option offers a choice among three options:
+> - `separate`: Individual summaries for each file in separate files, in addition to the overall summary.
+> - `consolidated`: Individual summaries written in the same file as the overall summary.
+> - `none`: Only an overall summary.
+
+`-j`, `--json-sidecar`
+> This option is followed by the full path of the JSON sidecar with HED annotations to be
+> applied during the processing of HED-related remodeling operations.
+
+`-ld`, `--log-dir`
+> This option is followed by the full path of a directory for writing log files.
+> A log file is written if the remodeling tools raise an exception and the program terminates.
+> Note that a log file is not written for issues gathered during operations such as `summarize_hed_validation`,
+> because reporting HED validation errors is a normal part of this operation.
+> On the other hand, errors in the JSON remodeling file do raise an exception and are reported in the log.
+
+`-nb`, `--no-backup`
+> If present, no backup is used. Rather, the operations are performed directly on the files.
+
+`-ns`, `--no-summaries`
+> If present, no summary files are output.
+
+`-nu`, `--no-update`
+> If present, the modified files are not output.
+
+`-r`, `--hed-versions`
+> This option is followed by one or more HED versions. Versions of the standard schema are specified
+> by their semantic versions (e.g., `8.1.0`), while library schema versions are prefixed by their
+> library name (e.g., `score_1.0.0`).
+
+> If more than one HED schema version is given, all but one of the versions must start with an
+> additional namespace designator (e.g., `sc:`). At most one version can omit the namespace designator
+> when multiple schema are being used. In annotations, tags must start with the namespace
+> designator of the corresponding schema from which they were selected (e.g. `sc:Sleep-modulator`
+> if the SCORE library was designated by `sc:score_1.0.0`).
+
+`-s`, `--save-formats`
+> This option is followed by the extensions (including .) of the formats in which
+> to save summaries (default: `.txt` `.json`).
+
+`-t`, `--task-names`
+> The name(s) of the tasks to be included (for BIDS-formatted files only).
+> When a dataset includes multiple tasks, the event files are often structured
+> differently for each task and thus require different transformation files.
+> This option allows the backups and operations to be restricted to an individual task.
+
+> If this option is omitted, all tasks are used. This means that all `events.tsv` files are
+> restored from a backup if the backup is used, the operations are performed on all `events.tsv` files, and summaries are combined over all tasks.
+
+> If a list of specific task names follows this option, only datafiles corresponding to
+> the listed tasks are processed giving separate summaries for each listed task.
+
+> If a "*" follows this option, all event files are processed and separate summaries are created for each task.
+
+> Task detection follows the BIDS convention. Tasks are detected by finding "task-x" in the file names of `events.tsv` files, where x is the name of the task. The task name is followed by an underscore or a period, or appears at the end of the filename.
+
+`-v`, `--verbose`
+> If present, more comprehensive messages documenting transformation progress
+> are printed to standard output.
+
+`-w`, `--work-dir`
+> The path to the remodel work root directory where summaries and other remodeling state are written (default: `[data_root]/derivatives/remodel`).
+
+`-x`, `--exclude-dirs`
+> The directories to exclude when gathering the data files to process.
+> For BIDS datasets, these are typically `derivatives`, `stimuli`, and `sourcecode`.
+> Any subdirectory with a path component named `remodel` is automatically excluded from remodeling, as
+> these directories are reserved for storing backup, state, and result information for the remodeling process itself.
+
+(remodel-scripts-anchor)=
+## Remodel scripts
+
+This section discusses the three main remodeling scripts with command-line interfaces
+to support backup, remodeling, and restoring the tabular files used in the remodeling process.
+These scripts can be run from the command line or from another Python program using a function call.
+
+(backing-up-files-anchor)=
+### Backing up files
+
+The `run_remodel_backup` Python program creates a backup of the specified files.
+The backup is always created in the `derivatives/remodel/backups` subdirectory
+under the dataset root as shown in the following example for the
+sample dataset `eeg_ds003645s_hed_remodel`,
+which can be found in the `datasets` subdirectory of the
+[**hed-examples**](https://github.com/hed-standard/hed-examples) GitHub repository.
+
+![Remodeling backup structure](./_static/images/RemodelingBackupStructure.png)
+
+
+The backup process creates a mirror of the directory structure of the source files to be backed up
+in the directory `derivatives/remodel/backups/backup_name/backup_root` as shown in the figure above.
+The default backup name is `default_back`.
+
+In the above example, the backup has subdirectories `sub-002` and `sub-003` just
+like the main directory of the dataset.
+These subdirectories only contain backups of the files to be transformed
+(by default files with names ending in `events.tsv`).
+
+In addition to the `backup_root`, the backup directory also contains a dictionary of backup files
+in the `backup_lock.json` file. This dictionary is used internally by the remodeling tools.
+The backup should be created once and not modified by the user.
+
+The following example shows how to run the `run_remodel_backup` program from the command line
+to back up the dataset located at `/datasets/eeg_ds003645s_hed_remodel`.
+
+(remodel-backup-anchor)=
+````{admonition} Example of calling run_remodel_backup from the command line.
+:class: tip
+
+```bash
+python run_remodel_backup /datasets/eeg_ds003645s_hed_remodel -x derivatives stimuli
+
+```
+````
+
+Since the `-f` and `-e` arguments are not given, the default file suffix and extension values
+apply, so only files of the form `events.tsv` are backed up.
+The `-x` option excludes any source files from the `derivatives` and `stimuli` subdirectories.
+These choices can be overridden using additional command-line arguments.
+
+The following shows how the `run_remodel_backup` program can be called from a
+Python program or a Jupyter notebook.
+The command-line arguments are given in a list instead of on the command line.
+
+(remodel-backup-jupyter-anchor)=
+````{admonition} Example of Python code to call run_remodel_backup using a function call.
+:class: tip
+
+```python
+
+import hed.tools.remodeling.cli.run_remodel_backup as cli_backup
+
+data_root = '/datasets/eeg_ds003645s_hed_remodel'
+arg_list = [data_root, '-x', 'derivatives', 'stimuli']
+cli_backup.main(arg_list)
+
+```
+````
+
+During remodeling, each file in the source is associated with a backup file using
+its relative path from the dataset root.
+Remodeling is performed by reading the backup file, performing the operations specified in the
+JSON remodel file, and overwriting the source file as needed.
+
+Users can also create alternatively named backups by providing the `-n` argument with a backup name to
+the `run_remodel_backup` program.
+To use backup files from another named backup, call the remodeling program with
+the `-n` argument and the correct backup name.
+Named backups can provide checkpoints to allow the execution of
+transformations to start from intermediate points.
+
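+The following minimal sketch (using the same sample dataset) shows how a named backup
+might be created through a function call by passing the `-n` argument:
+
+````{admonition} Example Python code to create a named backup (a sketch).
+:class: tip
+
+```python
+import hed.tools.remodeling.cli.run_remodel_backup as cli_backup
+
+data_root = '/datasets/eeg_ds003645s_hed_remodel'
+# The backup name "before_split" is a hypothetical checkpoint label.
+arg_list = [data_root, '-n', 'before_split', '-x', 'derivatives', 'stimuli']
+cli_backup.main(arg_list)
+
+```
+````
+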
+**NOTE**: You should not delete backups, even if you have created multiple named backups.
+The backups provide useful state and provenance information about the data.
+
+(remodeling-files-anchor)=
+### Remodeling files
+
+Remodeling consists of applying a sequence of operations from the
+[**remodel operation summary**](remodel-operation-summary-anchor)
+to successively transform each backup file according to the instructions
+and to overwrite the actual files with the final result.
+
+If the dataset has no backups, the actual data files rather than the backups are transformed.
+You are expected to [**create the backup**](backing-up-files-anchor) (just once)
+before running any remodeling operations.
+Proceeding without a backup is not recommended unless you are only performing summarization operations.
+
+The operations are specified as a list of dictionaries in a JSON file in the
+[**remodel sample files**](remodel-sample-files-anchor) as discussed below.
+
+Before running remodeling transformations on an entire dataset,
+consider using the [**HED online tools**](https://hedtools.ucsd.edu/hed)
+to debug your remodeling operation file on a single file.
+The remodeling process always starts with the original backup files,
+so the usual development path is to incrementally add operations to the end
+of your transformation JSON file as you develop and test on a single file
+until you have the desired end result.
+
+The following example shows how to run a remodeling script from the command line.
+The example assumes that the backup has already been created for the dataset.
+
+(run-remodel-anchor)=
+````{admonition} Example of calling run_remodel from the command line.
+:class: tip
+
+```bash
+python run_remodel /datasets/eeg_ds003645s_hed_remodel /datasets/remove_extra_rmdl.json -x derivatives stimuli
+
+```
+````
+
+The script has two required arguments: the dataset root and the path to the JSON remodel file.
+Usually, the JSON remodel files are stored with the dataset itself in the
+`derivatives/remodel/remodeling_files` subdirectory, but common scripts can be stored in a central place elsewhere.
+
+The additional keyword option `-x` in the example indicates that directory paths containing the component `derivatives` or the component `stimuli` should be excluded.
+Excluded directories need not have their excluded path component at the top level of the dataset.
+Subdirectory paths containing the `remodel` path component are automatically excluded.
+
+The command-line interface can also be used in a Jupyter notebook or as part of a larger Python
+program by calling the `main` function with the equivalent command-line arguments provided
+in a list with the positional arguments appearing first.
+
+The following example shows Python code to remodel a dataset using the command-line interface.
+This code can be used in a Jupyter notebook or in another Python program.
+
+````{admonition} Example Python code to call run_remodel using a function call.
+:class: tip
+
+```python
+import hed.tools.remodeling.cli.run_remodel as cli_remodel
+
+data_root = '/datasets/eeg_ds003645s_hed_remodel'
+model_path = '/datasets/remove_extra_rmdl.json'
+arg_list = [data_root, model_path, '-x', 'derivatives', 'stimuli']
+cli_remodel.main(arg_list)
+
+```
+````
+
+(restoring-files-anchor)=
+### Restoring files
+
+Since remodeling always uses the backed-up version of each data file,
+there is no need to restore these files to their original state
+between remodeling runs.
+However, when finished with an analysis,
+you may want to restore the data files to their original state.
+
+The following example shows how to call `run_remodel_restore` to
+restore the data files from the default backup.
+The restore operation restores all the files in the specified backup.
+
+(run-remodel-restore-anchor)=
+````{admonition} Example of calling run_remodel_restore from the command line.
+:class: tip
+
+```bash
+python run_remodel_restore /datasets/eeg_ds003645s_hed_remodel
+
+```
+````
+
+As with the other command-line programs, `run_remodel_restore` can also be called using a function call.
+
+````{admonition} Example Python code to call *run_remodel_restore* using a function call.
+:class: tip
+
+```python
+import hed.tools.remodeling.cli.run_remodel_restore as cli_restore
+
+data_root = '/datasets/eeg_ds003645s_hed_remodel'
+cli_restore.main([data_root])
+
+```
+````
+
+(remodel-with-hed-anchor)=
+## Remodel with HED
+
+[**HED**](introduction-to-hed-anchor) (Hierarchical Event Descriptors) is a
+system for annotating data in a manner that is both human-understandable and machine-actionable.
+HED provides much more detail about the events and their meanings.
+If you are new to HED, see the
+[**HED annotation quickstart**](./HedAnnotationQuickstart.md).
+For information about HED's integration into BIDS (Brain Imaging Data Structure) see
+[**BIDS annotation quickstart**](./BidsAnnotationQuickstart.md).
+
+Currently, five remodeling operations rely on HED annotations:
+- [**factor_hed_tags**](factor-hed-tags-anchor)
+- [**factor_hed_type**](factor-hed-type-anchor)
+- [**summarize_hed_tags**](summarize-hed-tags-anchor)
+- [**summarize_hed_type**](summarize-hed-type-anchor)
+- [**summarize_hed_validation**](summarize-hed-validation-anchor)
+
+HED tags provide a mechanism for advanced data analysis and for
+extracting experiment-specific information from the data files.
+However, since HED information is not always stored in the data files themselves,
+you may need to provide a HED schema and a JSON sidecar.
+
+The HED schema defines the allowed HED tag vocabulary, and the JSON sidecar
+associates HED annotations with the information in the columns of the event files.
+If you are not using any of the HED operations in your remodeling,
+you do not have to provide this information.
+
+
+(extracting-hed-information-from-bids-anchor)=
+### Extracting HED information from BIDS
+
+The simplest way to use HED with `run_remodel` is to use the `-b` option,
+which indicates that the dataset is in [**BIDS**](https://bids.neuroimaging.io/) (Brain Imaging Data Structure) format.
+
+BIDS is a standardized way of organizing neuroimaging data.
+HED and BIDS are well integrated.
+If you are new to BIDS, see the
+[**BIDS annotation quickstart**](./BidsAnnotationQuickstart.md).
+
+A HED-annotated BIDS dataset provides the HED schema version in the `dataset_description.json`
+file located directly under the BIDS dataset root.
+
+BIDS datasets must have filenames in a specific format,
+and the HED tools can locate the relevant JSON sidecars for each data file based on this information.
+
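+The following is a minimal sketch (reusing the sample dataset and a remodel file path from
+earlier examples) of calling `run_remodel` with the `-b` option so that the HED schema version
+and the relevant JSON sidecars are located automatically:
+
+````{admonition} Example Python code to remodel a BIDS dataset using the -b option (a sketch).
+:class: tip
+
+```python
+import hed.tools.remodeling.cli.run_remodel as cli_remodel
+
+data_root = '/datasets/eeg_ds003645s_hed_remodel'
+model_path = '/datasets/summarize_conditions_rmdl.json'
+# The -b flag tells run_remodel to treat data_root as a BIDS dataset.
+arg_list = [data_root, model_path, '-b', '-x', 'derivatives', 'stimuli']
+cli_remodel.main(arg_list)
+
+```
+````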
+
+(directly-specifying-hed-information-anchor)=
+### Directly specifying HED information
+
+If your data is already in BIDS format, using the `-b` option is ideal since
+the needed information can be located automatically.
+However, early in the experimental process,
+your datafiles are not likely to be organized in BIDS format,
+and this option will not be available if you want to use HED.
+
+Without the `-b` option, the remodeling tools locate the appropriate files based
+on specified filename suffixes and extensions.
+In order to use HED operations, you must explicitly specify the HED versions
+using the `-r` option.
+The `-r` option supports a list of HED versions if multiple HED schemas are used.
+For example: `-r 8.1.0 sc:score_1.0.0` specifies that vocabulary will be drawn
+from standard HED Version 8.1.0 and from
+HED SCORE library version 1.0.0.
+Annotations containing tags from SCORE should be prefixed with `sc:`.
+Note: both of the schemas can be viewed by the [**HED Schema Viewer**](https://www.hedtags.org/display_hed.html).
+
+Usually, annotators will consolidate HED annotations in a single JSON sidecar file
+located at the top-level of the dataset.
+The path of this sidecar can be passed as a command-line argument using the `-j` option.
+If more than one JSON sidecar file contains HED annotations, users will need to call the lower-level
+remodeling functions to perform these operations.
+
+The following example illustrates a command-line call that passes both a HED schema version and
+the path to the JSON file with the HED annotations.
+
+(run-remodel-with-hed-direct-anchor)=
+````{admonition} Remodeling a non-BIDS dataset using HED.
+:class: tip
+
+```bash
+python run_remodel /datasets/eeg_ds003645s_hed_remodel /datasets/summarize_conditions_rmdl.json \
+-x derivatives stimuli -r 8.1.0 -j /datasets/eeg_ds003645s_hed_remodel/task-FacePerception_events.json
+
+```
+````
+
+(remodel-with-hed-direct-python-anchor)=
+````{admonition} Example Python code to use run_remodel on a non-BIDS dataset.
+:class: tip
+
+```python
+import hed.tools.remodeling.cli.run_remodel as cli_remodel
+
+data_root = '/datasets/eeg_ds003645s_hed_remodel'
+model_path = '/datasets/summarize_conditions_rmdl.json'
+json_path = '/datasets/eeg_ds003645s_hed_remodel/task-FacePerception_events.json'
+arg_list = [data_root, model_path, '-x', 'derivatives', 'stimuli', '-r', '8.1.0', '-j', json_path]
+cli_remodel.main(arg_list)
+
+```
+````
+
+(remodel-error-handling-anchor)=
+## Remodel error handling
+
+Errors can occur at several stages during remodeling,
+and how they are handled depends on the type of error and where the error occurs.
+Except for the validation summary, the underlying remodeling code raises exceptions for most errors.
+
+
+(errors-in-the-remodel-file-anchor)=
+### Errors in the remodel file
+
+Each operation requires specific parameters to execute properly.
+The underlying implementation for each operation defines these parameters using a [**json schema**](https://json-schema.org/)
+as the `PARAMS` property of the operation's class definition.
+The use of the JSON schema allows the remodeler to specify and validate requirements on most of an
+operation's parameters using standardized methods.
+
+The [**remodeler_validator**](https://github.com/hed-standard/hed-python/blob/master/hed/tools/remodeling/remodeler_validator.py)
+compiles a JSON schema for the remodeler from the individual operations and validates
+the remodel file against the compiled JSON schema. The validator should always be run before executing any remodel operations.
+
+For example, the command line [**run_remodel**](https://raw.githubusercontent.com/hed-standard/hed-python/develop/hed/tools/remodeling/cli/run_remodel.py)
+program calls the validator before executing any operations.
+If there are errors, `run_remodel` reports the errors for all operations and exits.
+This allows users to correct errors in all operations in one pass without any data modification.
+The [**HED online tools**](https://hedtools.org/hed) are particularly useful for debugging
+the syntax and other issues in the remodeling process.
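+
+The following minimal sketch shows how a remodel file might be validated programmatically.
+It assumes that the `RemodelerValidator` class in the module linked above exposes a
+`validate` method that returns a list of error messages (empty when the file is valid):
+
+````{admonition} Example Python code to validate a remodel file (a sketch).
+:class: tip
+
+```python
+import json
+from hed.tools.remodeling.remodeler_validator import RemodelerValidator
+
+# Load the list of remodel operations from a JSON remodel file.
+with open('/datasets/remove_extra_rmdl.json', 'r') as fp:
+    operations = json.load(fp)
+
+validator = RemodelerValidator()
+errors = validator.validate(operations)  # assumed to return a list of messages
+if errors:
+    print('\n'.join(errors))
+
+```
+````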
+
+(execution-time-remodel-errors-anchor)=
+### Execution-time remodel errors
+
+When an error occurs during execution, an exception is raised.
+Exceptions are raised for invalid or missing files or if a transformed file
+cannot be rewritten due to improper file permissions.
+Each individual operation may also raise an exception if the
+data file being processed does not have the expected information,
+such as a column with a particular name.
+
+Exceptions raised during execution cause the process to be terminated and no
+further files are processed.
+
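+Assuming the command-line `main` functions propagate these exceptions when called
+programmatically, a controlling script can catch them, as in this minimal sketch
+(reusing the earlier sample paths):
+
+````{admonition} Example Python code to catch execution-time remodel errors (a sketch).
+:class: tip
+
+```python
+import hed.tools.remodeling.cli.run_remodel as cli_remodel
+
+arg_list = ['/datasets/eeg_ds003645s_hed_remodel', '/datasets/remove_extra_rmdl.json']
+try:
+    cli_remodel.main(arg_list)
+except Exception as ex:
+    # Remodeling stops at the first failure; remaining files are not processed.
+    print(f"Remodeling terminated: {ex}")
+
+```
+````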
+
+(remodel-sample-files-anchor)=
+## Remodel sample files
+
+All remodeling operations are specified in a standardized JSON remodel input file.
+The following shows the contents of the JSON remodeling file `remove_extra_rmdl.json`,
+which contains a single operation with instructions to remove the `value` and `sample` columns
+from the data file if the columns exist.
+
+(sample-remodel-file-anchor)=
+### Sample remodel file
+
+````{admonition} A sample JSON remodeling file with a single remove_columns transformation operation.
+:class: tip
+
+```json
+[
+ {
+ "operation": "remove_columns",
+ "description": "Remove unwanted columns prior to analysis",
+ "parameters": {
+ "remove_names": ["value", "sample"]
+ }
+ }
+]
+
+```
+````
+
+Each operation is specified in a dictionary with three top-level keys: "operation", "description",
+and "parameters". The value of the "operation" is the name of the operation.
+The "description" value should include the reason this operation was needed,
+not just a description of the operation itself.
+Finally, the "parameters" value is a dictionary mapping parameter name to
+parameter value.
+
+The parameters for each operation are listed in
+[**Remodel transformations**](remodel-transformations-anchor) and
+[**Remodel summarizations**](remodel-summarizations-anchor) sections.
+An operation may have both required and optional parameters.
+Optional parameters may be omitted if unneeded, but all parameters are specified in
+the "parameters" section of the dictionary.
+The full specification of the remodel file is also provided as a [**JSON schema**](https://json-schema.org/).
+
+The remodeling JSON files should have names ending in `_rmdl.json` to more easily
+distinguish them from other JSON files.
+Although these files can be stored anywhere, their preferred location is
+in the `derivatives/remodel/models` subdirectory under the dataset root so
+that they can provide provenance for the dataset.
+
+(sample-remodel-event-file-anchor)=
+### Sample remodel event file
+
+Several examples illustrating the remodeling operations use the following excerpt of the stop-go task from sub-0013
+of the AOMIC-PIOP2 dataset available on [**OpenNeuro**](https://openneuro.org) as ds002790.
+The full event file is
+[**sub-0013_task-stopsignal_acq-seq_events.tsv**](./_static/data/sub-0013_task-stopsignal_acq-seq_events.tsv).
+
+
+````{admonition} Excerpt from an event file from the stop-go task of AOMIC-PIOP2 (ds002790).
+| onset | duration | trial_type | stop_signal_delay | response_time | response_accuracy | response_hand | sex |
+| ----- | -------- | ---------- | ----------------- | ------------- | ----------------- | ------------- | --- |
+| 0.0776 | 0.5083 | go | n/a | 0.565 | correct | right | female |
+| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | 0.49 | correct | right | female |
+| 9.5856 | 0.5084 | go | n/a | 0.45 | correct | right | female |
+| 13.5939 | 0.5083 | succesful_stop | 0.2 | n/a | n/a | n/a | female |
+| 17.1021 | 0.5083 | unsuccesful_stop | 0.25 | 0.633 | correct | left | male |
+| 21.6103 | 0.5083 | go | n/a | 0.443 | correct | left | male |
+````
+
+(Sample-remodel-sidecar-file-anchor)=
+### Sample remodel sidecar file
+
+For remodeling operations that use HED, a JSON sidecar is usually required to provide the
+necessary HED annotations. The following JSON sidecar excerpt is used in several examples to
+illustrate some of these operations.
+The full JSON file can be found at
+[**task-stopsignal_acq-seq_events.json**](./_static/data/task-stopsignal_acq-seq_events.json).
+
+
+````{admonition} Excerpt of JSON sidecar with HED annotations for the stop-go task of AOMIC-PIOP2.
+:class: tip
+
+```json
+{
+ "trial_type": {
+ "HED": {
+ "succesful_stop": "Sensory-presentation, Visual-presentation, Correct-action, Image, Label/succesful_stop",
+ "unsuccesful_stop": "Sensory-presentation, Visual-presentation, Incorrect-action, Image, Label/unsuccesful_stop",
+ "go": "Sensory-presentation, Visual-presentation, Image, Label/go"
+ }
+ },
+ "stop_signal_delay": {
+ "HED": "(Auditory-presentation, Delay/# s)"
+ },
+ "sex": {
+ "HED": {
+ "male": "Def/Male-image-cond",
+ "female": "Def/Female-image-cond"
+ }
+ },
+ "hed_defs": {
+ "HED": {
+ "def_male": "(Definition/Male-image-cond, (Condition-variable/Image-sex, (Male, (Image, Face))))",
+ "def_female": "(Definition/Female-image-cond, (Condition-variable/Image-sex, (Female, (Image, Face))))"
+ }
+ }
+}
+```
+````
+Notice that the JSON file has some keys (e.g., "trial_type", "stop_signal_delay", and "sex")
+which also correspond to columns in the events file.
+The "hed_defs" key corresponds to an extra entry in the JSON file that, in this case, provides the definitions needed in the HED annotation.
+
+HED operations also require the HED schema. Most of the examples use HED standard schema version 8.1.0.
+
+(remodel-transformations-anchor)=
+## Remodel transformations
+
+(factor-column-anchor)=
+### Factor column
+
+The *factor_column* operation appends factor vectors to tabular files
+based on the values in a specified file column.
+Each factor vector contains a 1 if the corresponding row had that column value and a 0 otherwise.
+The *factor_column* operation is used to reformat event files for analyses such as linear regression
+based on column values.
+
+(factor-column-parameters-anchor)=
+#### Factor column parameters
+
+```{admonition} Parameters for the *factor_column* operation.
+:class: tip
+
+| Parameter | Type | Description |
+| ------------ | ---- | ----------- |
+| *column_name* | str | The name of the column to be factored.|
+| *factor_values* | list | Column values to be included as factors. |
+| *factor_names* | list| (**Optional**) Column names for created factors. |
+```
+
+If *column_name* is not a column in the data file, a `ValueError` is raised.
+
+If *factor_values* is empty, factors are created for each unique value in *column_name*.
+Otherwise, only factors for the specified column values are generated.
+If a specified value is missing in a particular file, the corresponding factor column contains all zeros.
+
+If *factor_names* is empty, the newly created columns are of the
+form *column_name.factor_value*.
+Otherwise, the newly created columns have names *factor_names*.
+If *factor_names* is not empty, then *factor_values* must also be specified
+and both lists must be of the same length.
+
+(factor-column-example-anchor)=
+#### Factor column example
+
+The *factor_column* operation in the following example specifies that factor columns
+should be created for *succesful_stop* and *unsuccesful_stop* of the *trial_type* column.
+The resulting columns are called *stopped* and *stop_failed*, respectively.
+
+
+````{admonition} A sample JSON file with a single *factor_column* transformation operation.
+:class: tip
+
+```json
+[{
+ "operation": "factor_column",
+ "description": "Create factors for the succesful_stop and unsuccesful_stop values.",
+ "parameters": {
+ "column_name": "trial_type",
+ "factor_values": ["succesful_stop", "unsuccesful_stop"],
+ "factor_names": ["stopped", "stop_failed"]
+ }
+}]
+```
+````
+
+The results of executing this *factor_column* operation on the
+[**sample remodel event file**](sample-remodel-event-file-anchor) are:
+
+````{admonition} Results of the *factor_column* operation on the sample data.
+
+| onset | duration | trial_type | stop_signal_delay | response_time | response_accuracy | response_hand | sex | stopped | stop_failed |
+| ----- | -------- | ---------- | ----------------- | ------------- | ----------------- | ------------- | --- | ---------- | ---------- |
+| 0.0776 | 0.5083 | go | n/a | 0.565 | correct | right | female | 0 | 0 |
+| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | 0.49 | correct | right | female | 0 | 1 |
+| 9.5856 | 0.5084 | go | n/a | 0.45 | correct | right | female | 0 | 0 |
+| 13.5939 | 0.5083 | succesful_stop | 0.2 | n/a | n/a | n/a | female | 1 | 0 |
+| 17.1021 | 0.5083 | unsuccesful_stop | 0.25 | 0.633 | correct | left | male | 0 | 1 |
+| 21.6103 | 0.5083 | go | n/a | 0.443 | correct | left | male | 0 | 0 |
+````
+
+(factor-hed-tags-anchor)=
+### Factor HED tags
+
+The *factor_hed_tags* operation is similar to the *factor_column* operation
+in that it produces factor vectors containing 0's and 1's,
+which are appended to the returned DataFrame.
+However, rather than basing these vectors on values in a specified column,
+the factors are computed by determining whether the assembled HED annotation for each row
+satisfies a specified search query.
+
+An example search query is whether the assembled HED annotation contains a particular HED tag.
+The [**HED search guide**](./HedSearchGuide.md) tutorial discusses the HED search facility in more detail.
+
+
+(factor-hed-tags-parameters-anchor)=
+#### Factor HED tags parameters
+
+```{admonition} Parameters for the *factor_hed_tags* operation.
+:class: tip
+
+| Parameter | Type | Description |
+| ------------ | ---- | ----------- |
+| *queries* | list | A list of HED query strings. |
+| *query_names* | list | (**Optional**) A list of names for the factor columns generated by the queries. |
+| *remove_types* | list | (**Optional**) Structural HED tags to be removed (usually `Condition-variable` and `Task`). |
+| *expand_context* | bool | (**Optional**: default true) Expand the context and remove `Onset` and `Offset` tags before the query. |
+
+```
+The *query_names* list, which must be empty or the same length as *queries*,
+contains the names of the factor columns produced by the search.
+If the *query_names* list is empty, the result columns are titled "query_1",
+"query_2", etc.
+
+Most of the time *remove_types* should be set to `["Condition-variable", "Task"]`, with the effects of
+the experimental design captured using the *factor_hed_type* operation.
+If *expand_context* is set to *false*, the additional context provided by `Onset`, `Offset`, and `Duration`
+tags is ignored.
+
+(factor-hed-tags-example-anchor)=
+#### Factor HED tags example
+
+The *factor_hed_tags* operation in the following example produces two factor
+columns indicating the presence of the `Correct-action` and `Incorrect-action` tags,
+respectively, in the assembled HED string for each row.
+The resulting factor columns are named *correct* and *incorrect*.
+
+````{admonition} A sample JSON file with a single *factor_hed_tags* transformation operation.
+:class: tip
+
+```json
+[{
+ "operation": "factor_hed_tags",
+ "description": "Create factors based on whether the event represented a correct or incorrect action.",,
+ "parameters": {
+ "queries": ["correct-action", "incorrect-action"],
+ "query_names": ["correct", "incorrect"],
+ "remove_types": ["Condition-variable", "Task"],
+ "expand_context": false
+ }
+}]
+```
+````
+
+The results of executing this *factor_hed_tags* operation on the
+[**sample remodel event file**](sample-remodel-event-file-anchor) using the
+[**sample remodel sidecar file**](sample-remodel-sidecar-file-anchor) for HED annotations are:
+
+
+````{admonition} Results of *factor_hed_tags*.
+
+| onset | duration | trial_type | stop_signal_delay | response_time | response_accuracy | response_hand | sex | correct | incorrect |
+| ----- | -------- | ---------- | ----------------- | ------------- | ----------------- | ------------- | --- | ---------- | ---------- |
+| 0.0776 | 0.5083 | go | n/a | 0.565 | correct | right | female | 0 | 0 |
+| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | 0.49 | correct | right | female | 0 | 1 |
+| 9.5856 | 0.5084 | go | n/a | 0.45 | correct | right | female | 0 | 0 |
+| 13.5939 | 0.5083 | succesful_stop | 0.2 | n/a | n/a | n/a | female | 1 | 0 |
+| 17.1021 | 0.5083 | unsuccesful_stop | 0.25 | 0.633 | correct | left | male | 0 | 1 |
+| 21.6103 | 0.5083 | go | n/a | 0.443 | correct | left | male | 0 | 0 |
+````
+
+(factor-hed-type-anchor)=
+### Factor HED type
+
+The *factor_hed_type* operation produces factor columns
+based on values of the specified HED type tag.
+The most common type is the HED *Condition-variable* tag, which corresponds to
+factor vectors based on the experimental design.
+Other commonly used type tags include *Task*, *Control-variable*, and *Time-block*.
+
+We assume that the dataset has been annotated using HED tags to properly document
+information such as experimental conditions, and focus on how such an annotated dataset can be
+used with remodeling to produce factor columns corresponding to these
+type variables.
+
+For additional information on how to encode experimental designs using HED, see
+[**HED conditions and design matrices**](./HedConditionsAndDesignMatrices.md).
+
+(factor-hed-type-parameters-anchor)=
+#### Factor HED type parameters
+
+```{admonition} Parameters for *factor_hed_type* operation.
+:class: tip
+
+| Parameter | Type | Description |
+| ------------ | ---- | ----------- |
+| *type_tag* | str | HED tag used to find the factors (most commonly *Condition-variable*).|
+| *type_values* | list | (**Optional**) Values to factor for the *type_tag*. If omitted, all values of that *type_tag* are used. |
+```
+The event context (as defined by onsets, offsets, and durations) is always expanded and one-hot (0's and 1's)
+encoding is used for the factors.
+
+(factor-hed-type-example-anchor)=
+#### Factor HED type example
+
+The *factor_hed_type* operation in the following example appends
+additional columns to each data file corresponding to
+each possible value of each *Condition-variable* tag.
+The columns contain 1's for rows (e.g., events) for which that condition
+applies and 0's otherwise.
+
+````{admonition} A JSON file with a single *factor_hed_type* transformation operation.
+:class: tip
+
+```json
+[{
+ "operation": "factor_hed_type",
+ "description": "Factor based on the sex of the images being presented.",
+ "parameters": {
+ "type_tag": "Condition-variable"
+ }
+}]
+```
+````
+
+The results of executing this *factor_hed_type* operation on the
+[**sample remodel event file**](sample-remodel-event-file-anchor) using the
+[**sample remodel sidecar file**](sample-remodel-sidecar-file-anchor) for HED annotations are:
+
+
+````{admonition} Results of *factor_hed_type*.
+
+| onset | duration | trial_type | stop_signal_delay | response_time | response_accuracy | response_hand | sex | Image-sex.Female-image-cond | Image-sex.Male-image-cond |
+| ----- | -------- | ---------- | ----------------- | ------------- | ----------------- | ------------- | --- | ------- | ---------- |
+| 0.0776 | 0.5083 | go | n/a | 0.565 | correct | right | female | 1 | 0 |
+| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | 0.49 | correct | right | female | 1 | 0 |
+| 9.5856 | 0.5084 | go | n/a | 0.45 | correct | right | female | 1 | 0 |
+| 13.5939 | 0.5083 | succesful_stop | 0.2 | n/a | n/a | n/a | female | 1 | 0 |
+| 17.1021 | 0.5083 | unsuccesful_stop | 0.25 | 0.633 | correct | left | male | 0 | 1 |
+| 21.6103 | 0.5083 | go | n/a | 0.443 | correct | left | male | 0 | 1 |
+````
+
+(merge-consecutive-anchor)=
+### Merge consecutive
+
+Sometimes a single long event in experimental logs is represented by multiple repeat events.
+The *merge_consecutive* operation collapses these consecutive repeat events into one event with
+duration updated to encompass the temporal extent of the merged events.
+
+(merge-consecutive-parameters-anchor)=
+#### Merge consecutive parameters
+
+```{admonition} Parameters for the *merge_consecutive* operation.
+:class: tip
+
+| Parameter | Type | Description |
+| ------------ | ---- | ----------- |
+| *column_name* | str | The name of the column which is the basis of the merge.|
+| *event_code* | str, int, float | The value in *column_name* that triggers the merge. |
+| *set_durations* | bool | If true, set durations based on merged events. |
+| *ignore_missing* | bool | If true, missing *column_name* or *match_columns* do not raise an error. |
+| *match_columns* | list | (**Optional**) Columns whose values must match to collapse events. |
+```
+
+The first of the group of rows (each representing an event) to be merged is called the anchor
+for the merge. After the merge, it is the only row in the group
+that remains in the data file. The result is identical
+to its original version, except for the value in the `duration` column.
+
+If the *set_durations* parameter is true, the new duration is calculated as though
+the event began with the onset of the first event (the anchor row) in the group and
+ended at the point where all the events in the group have ended.
+This method allows for small gaps between events and for events in which an
+intermediate event in the group ends after later events.
+If the *set_durations* parameter is false, the duration of the merged row is set to `n/a`.
+
+If the data file has other columns besides `onset`, `duration` and *column_name*,
+the values in the other columns must be considered during the merging process.
+The *match_columns* parameter is a list of the other columns whose values must agree with those
+of the anchor row in order for a merge to occur. If *match_columns* is empty, the
+other columns in each row are not taken into account during the merge.
+
+(merge-consecutive-example-anchor)=
+#### Merge consecutive example
+
+The *merge_consecutive* operation in the following example causes consecutive
+`succesful_stop` events whose `stop_signal_delay`, `response_hand`, and `sex` columns
+have the same values to be merged into a single event.
+
+
+````{admonition} A JSON file with a single *merge_consecutive* transformation operation.
+:class: tip
+
+```json
+[{
+ "operation": "merge_consecutive",
+ "description": "Merge consecutive *succesful_stop* events that match the *match_columns.",
+ "parameters": {
+ "column_name": "trial_type",
+ "event_code": "succesful_stop",
+ "set_durations": true,
+ "ignore_missing": true,
+ "match_columns": ["stop_signal_delay", "response_hand", "sex"]
+ }
+}]
+```
+````
+
+When this operation is applied to the following input file,
+the three events with a value of `succesful_stop` in the `trial_type` column starting
+at `onset` value 13.5939 are merged into a single event.
+
+````{admonition} Input file for a *merge_consecutive* operation.
+
+| onset | duration | trial_type | stop_signal_delay | response_hand | sex |
+| ----- | -------- | ---------- | ----------------- | ------------- | --- |
+| 0.0776 | 0.5083 | go | n/a | right | female|
+| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | right | female|
+| 9.5856 | 0.5084 | go | n/a | right | female|
+| 13.5939 | 0.5083 | succesful_stop | 0.2 | n/a | female|
+| 14.2 | 0.5083 | succesful_stop | 0.2 | n/a | female|
+| 15.3 | 0.7083 | succesful_stop | 0.2 | n/a | female|
+| 17.3 | 0.5083 | unsuccesful_stop | 0.25 | n/a | female|
+| 19.0 | 0.5083 | unsuccesful_stop | 0.25 | n/a | female|
+| 21.1021 | 0.5083 | unsuccesful_stop | 0.25 | left | male|
+| 22.6103 | 0.5083 | go | n/a | left | male |
+````
+
+Notice that the `succesful_stop` event at `onset` value `17.3` is not
+merged because the `stop_signal_delay` column value does not match the value in the previous event.
+The final result has `duration` computed as `2.4144` = `15.3` + `0.7083` - `13.5939`.
+
+````{admonition} The results of the *merge_consecutive* operation.
+
+| onset | duration | trial_type | stop_signal_delay | response_hand | sex |
+| ----- | -------- | ---------- | ------------------ | ------------- | --- |
+| 0.0776 | 0.5083 | go | n/a | right | female |
+| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | right | female |
+| 9.5856 | 0.5084 | go | n/a | right | female |
+| 13.5939 | 2.4144 | succesful_stop | 0.2 | n/a | female |
+| 17.3 | 2.2083 | unsuccesful_stop | 0.25 | n/a | female |
+| 21.1021 | 0.5083 | unsuccesful_stop | 0.25 | left | male |
+| 22.6103 | 0.5083 | go | n/a | left | male |
+````
+
+The events that had onsets at `17.3` and `19.0` are also merged in this example.
+
+(remap-columns-anchor)=
+### Remap columns
+
+The *remap_columns* operation maps combinations of values in *m* specified columns of a data file
+into values in *n* columns using a defined mapping.
+Remapping is useful during analysis to create columns in event files that are more directly useful
+or informative for a particular analysis.
+
+Remapping is also important during the initial generation of event files from experimental logs.
+The log files generated by experimental control software often contain a code for each type of log entry.
+Remapping can be used to convert the column containing these codes into one or more columns with more informative values.
+
+
+(remap-columns-parameters-anchor)=
+#### Remap columns parameters
+
+
+```{admonition} Parameters for the *remap_columns* operation.
+:class: tip
+
+| Parameter | Type | Description |
+| ------------ | ---- | ----------- |
+| *source_columns* | list | A list of *m* names of the source columns for the map.|
+| *destination_columns* | list | A list of *n* names of the destination columns for the map. |
+| *map_list* | list | A list of mappings. Each element is a list of *m* source column values followed by *n* destination values. Mapping source values are treated as strings. |
+| *ignore_missing* | bool | If true, source column values not in the map generate `n/a` destination values instead of errors. |
+| *integer_sources* | list | (**Optional**) A list of source columns that are integers. The *integer_sources* must be a subset of *source_columns*. |
+```
+A column cannot be both a source and a destination,
+and all source columns must be present in the data files.
+New columns are created for destination columns that are missing from a data file.
+
+The *remap_columns* operation only works for columns containing strings or integers,
+as it is meant for remapping categorical codes.
+You must specify which source columns contain integers so that `n/a` values
+can be handled appropriately.
+
+The *map_list* parameter specifies how each unique combination of values from the source
+columns will be mapped into the destination columns.
+If there are *m* source columns and *n* destination columns,
+then each entry in *map_list* must be a list with *m* + *n* elements.
+The first *m* elements are the key values from the source columns.
+The *map_list* should have targets for all combinations of values that appear in the *m* source columns
+unless *ignore_missing* is true.
+
+After remapping, the tabular file will contain both source and destination columns.
+If you wish to replace the source columns with the destination columns,
+use a *remove_columns* transformation after the *remap_columns*.
+
+
+(remap-columns-example-anchor)=
+#### Remap columns example
+
+The *remap_columns* operation in the following example creates a new column called *response_type*
+based on the unique values in the combination of columns *response_accuracy* and *response_hand*.
+
+````{admonition} A JSON file with a single *remap_columns* transformation operation.
+:class: tip
+
+```json
+[{
+ "operation": "remap_columns",
+ "description": "Map response_accuracy and response hand into a single column.",
+ "parameters": {
+ "source_columns": ["response_accuracy", "response_hand"],
+ "destination_columns": ["response_type"],
+ "map_list": [["correct", "left", "correct_left"],
+ ["correct", "right", "correct_right"],
+ ["incorrect", "left", "incorrect_left"],
+                 ["incorrect", "right", "incorrect_right"],
+ ["n/a", "n/a", "n/a"]],
+ "ignore_missing": true
+ }
+}]
+```
+````
+In this example there are two source columns and one destination column,
+so each entry in *map_list* must be a list with three elements:
+two source values followed by one destination value.
+Since all the values in *map_list* are strings,
+the optional *integer_sources* list is not needed.
+
+The results of executing the previous *remap_column* command on the
+[**sample remodel event file**](sample-remodel-event-file-anchor) are:
+
+````{admonition} Mapping columns *response_accuracy* and *response_hand* into a *response_type* column.
+
+| onset | duration | trial_type | stop_signal_delay | response_time | response_accuracy | response_hand | sex | response_type |
+| ----- | -------- | ---------- | ---------- | ----------------- | ------------- | ----------------- | --- | ------------------- |
+| 0.0776 | 0.5083 | go | n/a | 0.565 | correct | right | female | correct_right |
+| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | 0.49 | correct | right | female | correct_right |
+| 9.5856 | 0.5084 | go | n/a | 0.45 | correct | right | female | correct_right |
+| 13.5939 | 0.5083 | succesful_stop | 0.2 | n/a | n/a | n/a | female | n/a |
+| 17.1021 | 0.5083 | unsuccesful_stop | 0.25 | 0.633 | correct | left | male | correct_left |
+| 21.6103 | 0.5083 | go | n/a | 0.443 | correct | left | male | correct_left |
+````
+
+In this example, *remap_columns* combines the values from columns `response_accuracy` and
+`response_hand` to produce a new column called `response_type` that specifies both response hand and correctness information using a single code.
+
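+Since the sample file has no integer-valued categorical columns, the following hypothetical
+sketch illustrates how *integer_sources* might be used when a source column (here an invented
+`cond_code` column) contains integer codes:
+
+````{admonition} Example Python code to write a remap operation with integer sources (a sketch).
+:class: tip
+
+```python
+import json
+
+# Hypothetical operation: the cond_code column and its values are invented for illustration.
+op_list = [{
+    "operation": "remap_columns",
+    "description": "Map integer condition codes into readable labels.",
+    "parameters": {
+        "source_columns": ["cond_code"],
+        "destination_columns": ["cond_label"],
+        "map_list": [[1, "go"], [2, "stop"]],
+        "ignore_missing": True,
+        "integer_sources": ["cond_code"]
+    }
+}]
+
+# Serialize to a JSON remodel file with the recommended _rmdl.json suffix.
+with open('remap_codes_rmdl.json', 'w') as fp:
+    json.dump(op_list, fp, indent=4)
+
+```
+````
+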
+(remove-columns-anchor)=
+### Remove columns
+
+Sometimes columns are added during intermediate processing steps. The *remove_columns*
+operation is useful for cleaning up unnecessary columns after these processing steps complete.
+
+(remove-columns-parameters-anchor)=
+#### Remove columns parameters
+
+```{admonition} Parameters for the *remove_columns* operation.
+:class: tip
+
+| Parameter | Type | Description |
+| ------------ | ---- | ----------- |
+| *column_names* | list of str | A list of columns to remove.|
+| *ignore_missing* | boolean | If true, missing columns are ignored, otherwise raise `KeyError`. |
+```
+
+If one of the specified columns is not in the file and the *ignore_missing*
+parameter is *false*, a `KeyError` is raised for the missing column.
+
+(remove-columns-example-anchor)=
+#### Remove columns example
+
+The following example specifies that the *remove_columns* operation should remove the `stop_signal_delay`,
+`response_accuracy`, and `face` columns from the tabular data.
+
+````{admonition} A JSON file with a single *remove_columns* transformation operation.
+:class: tip
+
+```json
+[{
+ "operation": "remove_columns",
+ "description": "Remove extra columns before the next step.",
+ "parameters": {
+ "column_names": ["stop_signal_delay", "response_accuracy", "face"],
+ "ignore_missing": true
+ }
+}]
+```
+````
+
+The results of executing this operation on the
+[**sample remodel event file**](sample-remodel-event-file-anchor)
+are shown below.
+The *face* column is not in the data, but it is ignored, since *ignore_missing* is true.
+If *ignore_missing* had been false, a `KeyError` would have been raised.
+
+````{admonition} Results of executing the *remove_columns* operation.
+| onset | duration | trial_type | response_time | response_hand | sex |
+| ----- | -------- | ---------- | ------------- | ------------- | --- |
+| 0.0776 | 0.5083 | go | 0.565 | right | female |
+| 5.5774 | 0.5083 | unsuccesful_stop | 0.49 | right | female |
+| 9.5856 | 0.5084 | go | 0.45 | right | female |
+| 13.5939 | 0.5083 | succesful_stop | n/a | n/a | female |
+| 17.1021 | 0.5083 | unsuccesful_stop | 0.633 | left | male |
+| 21.6103 | 0.5083 | go | 0.443 | left | male |
+````
+
+(remove-rows-anchor)=
+### Remove rows
+
+The *remove_rows* operation eliminates rows in which the named column has one of the specified values.
+This operation is useful for removing event markers corresponding to particular types of events
+or, for example, rows that have `n/a` in a particular column.
+
+
+(remove-rows-parameters-anchor)=
+#### Remove rows parameters
+
+```{admonition} Parameters for *remove_rows*.
+:class: tip
+
+| Parameter | Type | Description |
+| ------------ | ---- | ----------- |
+| *column_name* | str | The name of the column to be tested.|
+| *remove_values* | list | A list of values to be tested for removal. |
+```
+The operation does not raise an error if a data file does not have a column named
+*column_name* or if a value in *remove_values* does not appear in that column.
+
+(remove-rows-example-anchor)=
+#### Remove rows example
+
+The following *remove_rows* operation removes the rows whose *trial_type* column
+contains either `succesful_stop` or `unsuccesful_stop`.
+
+````{admonition} A JSON file with a single *remove_rows* transformation operation.
+:class: tip
+
+```json
+[{
+ "operation": "remove_rows",
+ "description": "Remove rows where trial_type is either succesful_stop or unsuccesful_stop.",
+ "parameters": {
+ "column_name": "trial_type",
+ "remove_values": ["succesful_stop", "unsuccesful_stop"]
+ }
+}]
+```
+````
+
+The results of executing the previous *remove_rows* operation on the
+[**sample remodel event file**](sample-remodel-event-file-anchor) are:
+
+````{admonition} The results of executing the previous *remove_rows* operation.
+
+| onset | duration | trial_type | stop_signal_delay | response_time | response_accuracy | response_hand | sex |
+| ----- | -------- | ---------- | ----------------- | ------------- | ----------------- | ------------- | --- |
+| 0.0776 | 0.5083 | go | n/a | 0.565 | correct | right | female |
+| 9.5856 | 0.5084 | go | n/a | 0.45 | correct | right | female |
+| 21.6103 | 0.5083 | go | n/a | 0.443 | correct | left | male |
+````
+
+After removing rows with `trial_type` equal to `succesful_stop` or `unsuccesful_stop` only the
+three `go` trials remain.
+
+
+(rename-columns-anchor)=
+### Rename columns
+
+The *rename_columns* operation uses a dictionary to map old column names into new ones.
+
+(rename-columns-parameters-anchor)=
+#### Rename columns parameters
+
+```{admonition} Parameters for *rename_columns*.
+:class: tip
+
+| Parameter | Type | Description |
+| ------------ | ---- | ----------- |
+| *column_mapping* | dict | The keys are the old column names and the values are the new names.|
+| *ignore_missing* | bool | If false, a `KeyError` is raised if a dictionary key is not a column name. |
+
+```
+
+If *ignore_missing* is false, a `KeyError` is raised if a column specified in
+the mapping does not correspond to a column name in the data file.
+
+(rename-columns-example-anchor)=
+#### Rename columns example
+
+The following example renames the `stop_signal_delay` column to be `stop_delay` and
+the `response_hand` to be `hand_used`.
+
+````{admonition} A JSON file with a single *rename_columns* transformation operation.
+:class: tip
+
+```json
+[{
+ "operation": "rename_columns",
+ "description": "Rename columns to be more descriptive.",
+ "parameters": {
+ "column_mapping": {
+ "stop_signal_delay": "stop_delay",
+ "response_hand": "hand_used"
+ },
+ "ignore_missing": true
+ }
+}]
+
+```
+````
+
+The results of executing the previous *rename_columns* operation on the
+[**sample remodel event file**](sample-remodel-event-file-anchor) are:
+
+````{admonition} After the *rename_columns* operation is executed, the sample events file is:
+| onset | duration | trial_type | stop_delay | response_time | response_accuracy | hand_used | sex |
+| ----- | -------- | ---------- | ----------------- | ------------- | ----------------- | ------------- | --- |
+| 0.0776 | 0.5083 | go | n/a | 0.565 | correct | right | female |
+| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | 0.49 | correct | right | female |
+| 9.5856 | 0.5084 | go | n/a | 0.45 | correct | right | female |
+| 13.5939 | 0.5083 | succesful_stop | 0.2 | n/a | n/a | n/a | female |
+| 17.1021 | 0.5083 | unsuccesful_stop | 0.25 | 0.633 | correct | left | male |
+| 21.6103 | 0.5083 | go | n/a | 0.443 | correct | left | male |
+````
+
+(reorder-columns-anchor)=
+### Reorder columns
+
+The *reorder_columns* operation reorders the indicated columns in the specified order.
+This operation is often used to place the most important columns near the beginning of the file for readability
+or to assure that all the data files in a dataset have the same column order.
+Additional parameters control how non-specified columns are treated.
+
+(reorder-columns-parameters-anchor)=
+#### Reorder columns parameters
+
+```{admonition} Parameters for the *reorder_columns* operation.
+:class: tip
+
+| Parameter | Type | Description |
+| ------------ | ---- | ----------- |
+| *column_order* | list | A list of columns in the order they should appear in the data.|
+| *ignore_missing* | bool | Controls the handling of column names in the reorder list that are not in the data. |
+| *keep_others* | bool | Controls handling of columns not in the reorder list. |
+
+```
+
+If *ignore_missing* is true
+and items in the reorder list do not exist in the file, the missing columns are ignored.
+On the other hand, if *ignore_missing* is false,
+a column name in the reorder list that is missing from the data raises a *ValueError*.
+
+The *keep_others* parameter controls whether columns in the data that
+do not appear in the *column_order* list are dropped (*keep_others* is false) or
+put at the end in the relative order that they appear in the file (*keep_others* is true).
+
+BIDS event files are required to have `onset` and `duration` as the first and second columns, respectively.
+
+(reorder-columns-example-anchor)=
+#### Reorder columns example
+
+The *reorder_columns* operation in the following example specifies that the first four
+columns of the dataset should be: `onset`, `duration`, `response_time`, and `trial_type`.
+Since *keep_others* is false, these will be the only columns retained.
+
+````{admonition} A JSON file with a single *reorder_columns* transformation operation.
+:class: tip
+
+```json
+[{
+ "operation": "reorder_columns",
+ "description": "Reorder columns.",
+ "parameters": {
+ "column_order": ["onset", "duration", "response_time", "trial_type"],
+ "ignore_missing": true,
+ "keep_others": false
+ }
+}]
+```
+````
+
+
+The results of executing the previous *reorder_columns* transformation on the
+[**sample remodel event file**](sample-remodel-event-file-anchor) are:
+
+````{admonition} Results of *reorder_columns*.
+
+| onset | duration | response_time | trial_type |
+| ----- | -------- | ---------- | ------------- |
+| 0.0776 | 0.5083 | 0.565 | go |
+| 5.5774 | 0.5083 | 0.49 | unsuccesful_stop |
+| 9.5856 | 0.5084 | 0.45 | go |
+| 13.5939 | 0.5083 | n/a | succesful_stop |
+| 17.1021 | 0.5083 | 0.633 | unsuccesful_stop |
+| 21.6103 | 0.5083 | 0.443 | go |
+````
+
+(split-rows-anchor)=
+### Split rows
+
+The *split_rows* operation
+is often used to convert event files from trial-level encoding to event-level encoding.
+This operation is meant only for tabular files that have `onset` and `duration` columns.
+
+In **trial-level** encoding, all the events in a single trial
+(usually some variation of the cue-stimulus-response-feedback-ready sequence)
+are represented by a single row in the data file.
+Often, the onset corresponds to the presentation of the stimulus,
+and the other events are not reported or are implicitly reported.
+
+In **event-level** encoding, each row represents the temporal marker for a single event.
+In this case a trial consists of a sequence of multiple events.
+
+
+(split-rows-parameters-anchor)=
+#### Split rows parameters
+
+```{admonition} Parameters for the *split_rows* operation.
+:class: tip
+
+| Parameter | Type | Description |
+| ------------ | ---- | ----------- |
+| *anchor_column* | str | The name of the column that will hold the codes for the new events.|
+| *new_events* | dict | Dictionary whose keys are the codes to be inserted as new events in the *anchor_column* and whose values are dictionaries with keys *onset_source*, *duration*, and (**Optional**) *copy_columns*. |
+| *remove_parent_event* | bool | If true, remove the parent event after splitting. |
+
+```
+
+The *split_rows* operation requires an *anchor_column*, which could be an existing
+column or a new column to be appended to the data.
+The purpose of the *anchor_column* is to hold the codes for the new events.
+
+The *new_events* dictionary has the new events to be created.
+The keys are the new event codes to be inserted into the *anchor_column*.
+The values in *new_events* are themselves dictionaries.
+Each of these dictionaries has three keys:
+
+- *onset_source* is a list of items to be added to the `onset`
+of the event row being split to produce the `onset` column value for the new event.
+These items can be any combination of numerical values and column names.
+- *duration* is a list of numerical values and/or column names whose values are added
+to compute the `duration` column value for the new event.
+- *copy_columns* is a list of column names whose values should be copied into each new event.
+Unlisted columns are filled with `n/a`.
+
+
+The *split_rows* operation sorts the split rows by the `onset` column and raises a `TypeError`
+if the `onset` and `duration` columns are improperly defined.
+The `onset` column is converted to numeric values as part of the splitting process.
+
+(split-rows-example-anchor)=
+#### Split rows example
+
+The *split_rows* operation in the following example specifies that new rows should be added
+to encode the response and stop signal. The anchor column is `trial_type`.
+
+
+````{admonition} A JSON file with a single *split_rows* transformation operation.
+:class: tip
+
+```json
+[{
+ "operation": "split_rows",
+ "description": "add response events to the trials.",
+ "parameters": {
+ "anchor_column": "trial_type",
+ "new_events": {
+ "response": {
+ "onset_source": ["response_time"],
+ "duration": [0],
+ "copy_columns": ["response_accuracy", "response_hand", "sex", "trial_number"]
+ },
+ "stop_signal": {
+ "onset_source": ["stop_signal_delay"],
+ "duration": [0.5],
+ "copy_columns": ["trial_number"]
+ }
+ },
+ "remove_parent_event": false
+ }
+ }]
+```
+````
+
+The results of executing this *split_rows* operation on the
+[**sample remodel event file**](sample-remodel-event-file-anchor) are:
+
+````{admonition} Results of the previous *split_rows* operation.
+
+| onset | duration | trial_type | stop_signal_delay | response_time | response_accuracy | response_hand | sex |
+| ----- | -------- | ---------- | ----------------- | ------------- | ----------------- | ------------- | --- |
+| 0.0776 | 0.5083 | go | n/a | 0.565 | correct | right | female |
+| 0.6426 | 0 | response | n/a | n/a | correct | right | female |
+| 5.5774 | 0.5083 | unsuccesful_stop | 0.2 | 0.49 | correct | right | female |
+| 5.7774 | 0.5 | stop_signal | n/a | n/a | n/a | n/a | n/a |
+| 6.0674 | 0 | response | n/a | n/a | correct | right | female |
+| 9.5856 | 0.5084 | go | n/a | 0.45 | correct | right | female |
+| 10.0356 | 0 | response | n/a | n/a | correct | right | female |
+| 13.5939 | 0.5083 | succesful_stop | 0.2 | n/a | n/a | n/a | female |
+| 13.7939 | 0.5 | stop_signal | n/a | n/a | n/a | n/a | n/a |
+| 17.1021 | 0.5083 | unsuccesful_stop | 0.25 | 0.633 | correct | left | male |
+| 17.3521 | 0.5 | stop_signal | n/a | n/a | n/a | n/a | n/a |
+| 17.7351 | 0 | response | n/a | n/a | correct | left | male |
+| 21.6103 | 0.5083 | go | n/a | 0.443 | correct | left | male |
+| 22.0533 | 0 | response | n/a | n/a | correct | left | male |
+````
+
+In a full processing example, it might make sense to rename `trial_type` to be
+`event_type` and to delete the `response_time` and the `stop_signal_delay` columns,
+since these items have been unfolded into separate events.
+This could be accomplished in subsequent clean-up operations.
+
+(remodel-summarizations-anchor)=
+## Remodel summarizations
+
+Summarizations differ from transformations in two respects: they do not modify the input data file,
+and they keep information about the results from each file that has been processed.
+Summarization operations may be used at several points in the operation list as checkpoints
+during debugging as well as for their more typical informational uses.
+
+All summary operations have two required parameters: *summary_name* and *summary_filename*.
+
+The *summary_name* is the unique key used to identify the
+particular incarnation of this summary in the dispatcher.
+Care should be taken to make sure that the *summary_name* is unique within
+a given JSON remodeling file if the same summary operation is used more than
+once within the file (e.g. for before and after summary information).
+
+The *summary_filename* should also be unique and is used for saving the summary upon request.
+When the remodeler is applied to full datasets rather than single files,
+the summaries are saved in the `derivatives/remodel/summaries` directory under the dataset root.
+A time stamp and file extension are appended to the *summary_filename* when the
+summary is saved.
+
+(summarize-column-names-anchor)=
+### Summarize column names
+
+The *summarize_column_names* operation tracks the unique column name patterns found in data files across
+the dataset and which files have these column names.
+This summary is useful for determining whether there are any non-conforming data files.
+
+Often event files associated with different tasks have different column names,
+and this summary can be used to verify that the files corresponding to the same task
+have the same column names.
+
+A more problematic issue is when some event files for the same task
+have reordered column names or use different column names.
+
+(summarize-columns-names-parameters-anchor)=
+#### Summarize column names parameters
+
+The *summarize_column_names* operation requires only the *summary_name* and *summary_filename*
+parameters common to all summaries, along with an optional *append_timecode* flag.
+
+```{admonition} Parameters for the *summarize_column_names* operation.
+:class: tip
+
+| Parameter | Type | Description |
+| ------------ | ---- | ----------- |
+| *summary_name* | str | A unique name used to identify this summary.|
+| *summary_filename* | str | A unique file basename to use for saving this summary. |
+| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |
+```
+
+(summarize-column-names-example-anchor)=
+#### Summarize column names example
+
+The following example remodeling file produces a summary, which when saved
+will appear with file name `AOMIC_column_names_xxx.txt` or
+`AOMIC_column_names_xxx.json` where `xxx` is a timestamp.
+
+````{admonition} A JSON file with a single *summarize_column_names* summarization operation.
+:class: tip
+```json
+[{
+ "operation": "summarize_column_names",
+ "description": "Summarize column names.",
+ "parameters": {
+ "summary_name": "AOMIC_column_names",
+ "summary_filename": "AOMIC_column_names"
+ }
+}]
+```
+````
+
+When this operation is applied to the [**sample remodel event file**](sample-remodel-event-file-anchor),
+the following text summary is produced.
+
+````{admonition} Result of applying *summarize_column_names* to the sample remodel file.
+:class: tip
+
+```text
+
+Summary name: AOMIC_column_names
+Summary type: column_names
+Summary filename: AOMIC_column_names
+
+Summary details:
+
+Dataset: Number of files=1
+ Columns: ['onset', 'duration', 'trial_type', 'stop_signal_delay', 'response_time', 'response_accuracy', 'response_hand', 'sex']
+ sub-0013_task-stopsignal_acq-seq_events.tsv
+
+Individual files:
+
+sub-0013_task-stopsignal_acq-seq_events.tsv:
+ ['onset', 'duration', 'trial_type', 'stop_signal_delay', 'response_time', 'response_accuracy', 'response_hand', 'sex']
+
+```
+````
+
+Since we are only summarizing one event file, there is only one unique pattern -- corresponding
+to the columns: *onset*, *duration*, *trial_type*, *stop_signal_delay*, *response_time*, *response_accuracy*, *response_hand*, and *sex*.
+
+When the dataset has multiple column name patterns, the summary lists each unique pattern separately along
+with the names of the data files that have that pattern.
+
+The JSON version of the summary is useful for programmatic manipulation,
+while the text version shown above is more readable.
+
+
+(summarize-column-values-anchor)=
+### Summarize column values
+
+The *summarize_column_values* operation provides a summary of the number of times various
+column values appear in event files across the dataset.
+
+
+(summarize-columns-values-parameters-anchor)=
+#### Summarize column values parameters
+
+The following table lists the parameters required for using the summary.
+
+```{admonition} Parameters for the *summarize_column_values* operation.
+:class: tip
+
+| Parameter | Type | Description |
+| ------------ | ---- | ----------- |
+| *summary_name* | str | A unique name used to identify this summary.|
+| *summary_filename* | str | A unique file basename to use for saving this summary. |
+| *append_timecode* | bool | (**Optional**: Default false) If True, append a time code to filename. |
+| *max_categorical* | int | (**Optional**: Default 50) The text summary shows at most *max_categorical* unique values per column. |
+| *skip_columns* | list | (**Optional**) A list of column names to omit from the summary.|
+| *value_columns* | list | (**Optional**) A list of columns for which individual unique values are not listed. |
+| *values_per_line* | int | (**Optional**: Default 5) The number of values displayed per line in the text summary. |
+
+```
+
+In addition to the standard *summary_name* and *summary_filename* parameters required of all summaries,
+the *summarize_column_values* operation accepts two optional lists of column names.
+The *skip_columns* list specifies the names of columns to skip entirely in the summary.
+Typically, the `onset`, `duration`, and `sample` columns are skipped, since they usually have unique values in
+each row and so provide little summary information.
+
+The *summarize_column_values* operation is mainly meant for creating summary information about columns
+containing a small number of distinct values.
+Columns that contain numeric information usually have distinct entries for
+each row in a tabular file and are not amenable to such summarization.
+These columns could be specified as *skip_columns*, but another option is to
+designate them as *value_columns*. The *value_columns* are reported in the summary,
+but their distinct values are not reported individually.
+
+For datasets that include multiple tasks, the event values for each task may be distinct.
+The *summarize_column_values* operation does not separate results by task, but expects the
+calling program to filter the files by task as desired.
+The `run_remodel` program supports selecting only the files corresponding to a particular task, as sketched below.
+
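+A minimal sketch of such a task-specific run (assuming the `run_remodel` command-line interface of the HED Python tools; the module path, `main` entry point, and `-t` flag are assumptions that may vary by version):
+
+```python
+from hed.tools.remodeling.cli import run_remodel  # assumed module path
+
+# Hypothetical paths -- substitute your own dataset root and remodel file.
+data_root = "/data/ds002790"
+model_path = "/data/models/summarize_columns_rmdl.json"
+
+# The -t option restricts remodeling to event files of the named task(s).
+run_remodel.main([data_root, model_path, "-t", "stopsignal"])
+```
+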
+Two additional optional parameters control aspects of the text summary output.
+The *max_categorical* parameter specifies how many unique values should be displayed
+for each column. The *values_per_line* parameter controls how many categorical column values (with counts)
+are displayed on each line of the output (5 by default).
+
+(summarize-column-values-example-anchor)=
+#### Summarize column values example
+
+The following example shows the JSON for including this operation in a remodeling file.
+
+````{admonition} A JSON file with a single *summarize_column_values* summarization operation.
+:class: tip
+```json
+[{
+ "operation": "summarize_column_values",
+ "description": "Summarize the column values in an excerpt.",
+ "parameters": {
+ "summary_name": "AOMIC_column_values",
+ "summary_filename": "AOMIC_column_values",
+ "skip_columns": ["onset", "duration"],
+ "value_columns": ["response_time", "stop_signal_delay"]
+ }
+}]
+```
+````
+
+A text format summary of the results of executing this operation on the
+[**sample remodel event file**](sample-remodel-event-file-anchor)
+is shown in the following example.
+
+````{admonition} Sample *summarize_column_values* operation results in text format.
+:class: tip
+```text
+Summary name: AOMIC_column_values
+Summary type: column_values
+Summary filename: AOMIC_column_values
+
+Overall summary:
+Dataset: Total events=6 Total files=1
+ Categorical column values[Events, Files]:
+ response_accuracy:
+ correct[5, 1] n/a[1, 1]
+ response_hand:
+ left[2, 1] n/a[1, 1] right[3, 1]
+ sex:
+ female[4, 1] male[2, 1]
+ trial_type:
+ go[3, 1] succesful_stop[1, 1] unsuccesful_stop[2, 1]
+ Value columns[Events, Files]:
+ response_time[6, 1]
+ stop_signal_delay[6, 1]
+
+Individual files:
+
+sub-0013_task-stopsignal_acq-seq_events.tsv:
+Total events=6
+ Categorical column values[Events, Files]:
+ response_accuracy:
+ correct[5, 1] n/a[1, 1]
+ response_hand:
+ left[2, 1] n/a[1, 1] right[3, 1]
+ sex:
+ female[4, 1] male[2, 1]
+ trial_type:
+ go[3, 1] succesful_stop[1, 1] unsuccesful_stop[2, 1]
+ Value columns[Events, Files]:
+ response_time[6, 1]
+ stop_signal_delay[6, 1]
+```
+````
+
+Because the [**sample remodel event file**](sample-remodel-event-file-anchor)
+only has 6 events, we expect that no value will be represented in more than 6 events.
+The value columns are reported with event counts only, not with their individual values.
+
+Because this command was executed with the `-i` option of `run_remodel`,
+results from the individual data files are shown after the overall summary.
+The individual results are similar to the overall summary because only one data file
+was processed.
+
+For a more extensive example see the
+[**text**](./_static/data/summaries/FacePerception_column_values_summary.txt)
+and [**JSON**](./_static/data/summaries/FacePerception_column_values_summary.json)
+format summaries of the sample dataset
+[**ds003645s_hed**](https://github.com/hed-standard/hed-examples/tree/main/datasets/eeg_ds003645s_hed)
+using the [**summarize_columns_rmdl.json**](./_static/data/summaries/summarize_columns_rmdl.json)
+remodeling file.
+
+
+(summarize-definitions-anchor)=
+### Summarize definitions
+
+The summarize definitions operation provides a summary of the `Def-expand` tags found across the dataset,
+noting any ambiguous or erroneous ones. If working on a BIDS dataset, it initializes with the known definitions
+from the sidecar, reporting any deviations from those definitions as errors.
+
+(summarize-definitions-parameters-anchor)=
+#### Summarize definitions parameters
+
+**NOTE: This summary is still under development**
+The following table lists the parameters required for using the summary.
+
+```{admonition} Parameters for the *summarize_definitions* operation.
+:class: tip
+
+| Parameter | Type | Description |
+| ------------ | ---- | ----------- |
+| *summary_name* | str | A unique name used to identify this summary.|
+| *summary_filename* | str | A unique file basename to use for saving this summary. |
+| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |
+```
+
+The *summarize_definitions* operation is mainly meant for verifying consistency in unknown `Def-expand` tags.
+This situation arises when you have an assembled dataset but no longer have the definitions stored (or never created them to begin with).
+
+
+(summarize-definitions-example-anchor)=
+#### Summarize definitions example
+
+The following example shows the JSON for including this operation in a remodeling file.
+
+````{admonition} A JSON file with a single *summarize_definitions* summarization operation.
+:class: tip
+```json
+[{
+ "operation": "summarize_definitions",
+ "description": "Summarize the definitions used in this dataset.",
+ "parameters": {
+ "summary_name": "HED_column_definition_summary",
+ "summary_filename": "HED_column_definition_summary"
+ }
+}]
+```
+````
+
+A text format summary of the results of executing this operation on the
+[**sub-003_task-FacePerception_run-3_events.tsv**](_static/data/sub-003_task-FacePerception_run-3_events.tsv) file
+of the [**eeg_ds_003645s_hed_column**](https://github.com/hed-standard/hed-examples/tree/main/datasets/eeg_ds003645s_hed_column) dataset is shown in the following example.
+
+````{admonition} Sample *summarize_definitions* operation results in text format.
+:class: tip
+```text
+Summary name: HED_column_definition_summary
+Summary type: definitions
+Summary filename: HED_column_definition_summary
+
+Overall summary:
+ Known Definitions: 17 items
+ cross-only: 2 items
+ description: A white fixation cross on a black background in the center of the screen.
+ contents: (Visual-presentation,(Background-view,Black),(Foreground-view,(Center-of,Computer-screen),(Cross,White)))
+ face-image: 2 items
+ description: A happy or neutral face in frontal or three-quarters frontal pose with long hair cropped presented as an achromatic foreground image on a black background with a white fixation cross superposed.
+ contents: (Visual-presentation,(Background-view,Black),(Foreground-view,((Center-of,Computer-screen),(Cross,White)),(Grayscale,(Face,Hair,Image))))
+ circle-only: 2 items
+ description: A white circle on a black background in the center of the screen.
+ contents: (Visual-presentation,(Background-view,Black),(Foreground-view,((Center-of,Computer-screen),(Circle,White))))
+ press-left-finger: 2 items
+ description: The participant presses a key with the left index finger to indicate a face symmetry judgment.
+ contents: ((Index-finger,(Experiment-participant,Left-side-of)),(Keyboard-key,Press))
+ press-right-finger: 2 items
+ description: The participant presses a key with the right index finger to indicate a face symmetry evaluation.
+ contents: ((Index-finger,(Experiment-participant,Right-side-of)),(Keyboard-key,Press))
+ famous-face-cond: 2 items
+ description: A face that should be recognized by the participants
+ contents: (Condition-variable/Face-type,(Image,(Face,Famous)))
+ unfamiliar-face-cond: 2 items
+ description: A face that should not be recognized by the participants.
+ contents: (Condition-variable/Face-type,(Image,(Face,Unfamiliar)))
+ scrambled-face-cond: 2 items
+ description: A scrambled face image generated by taking face 2D FFT.
+ contents: (Condition-variable/Face-type,(Image,(Disordered,Face)))
+ first-show-cond: 2 items
+ description: Factor level indicating the first display of this face.
+ contents: ((Condition-variable/Repetition-type,Item-interval/0,(Face,Item-count/1)))
+ immediate-repeat-cond: 2 items
+ description: Factor level indicating this face was the same as previous one.
+ contents: ((Condition-variable/Repetition-type,Item-interval/1,(Face,Item-count/2)))
+ delayed-repeat-cond: 2 items
+ description: Factor level indicating face was seen 5 to 15 trials ago.
+ contents: (Condition-variable/Repetition-type,(Face,Item-count/2),(Item-interval,(Greater-than-or-equal-to,Item-interval/5)))
+ left-sym-cond: 2 items
+ description: Left index finger key press indicates a face with above average symmetry.
+ contents: (Condition-variable/Key-assignment,((Asymmetrical,Behavioral-evidence),(Index-finger,(Experiment-participant,Right-side-of))),((Behavioral-evidence,Symmetrical),(Index-finger,(Experiment-participant,Left-side-of))))
+ right-sym-cond: 2 items
+ description: Right index finger key press indicates a face with above average symmetry.
+ contents: (Condition-variable/Key-assignment,((Asymmetrical,Behavioral-evidence),(Index-finger,(Experiment-participant,Left-side-of))),((Behavioral-evidence,Symmetrical),(Index-finger,(Experiment-participant,Right-side-of))))
+ face-symmetry-evaluation-task: 2 items
+ description: Evaluate degree of image symmetry and respond with key press evaluation.
+ contents: (Experiment-participant,Task,(Discriminate,(Face,Symmetrical)),(Face,See),(Keyboard-key,Press))
+ blink-inhibition-task: 2 items
+ description: Do not blink while the face image is displayed.
+ contents: (Experiment-participant,Inhibit-blinks,Task)
+ fixation-task: 2 items
+ description: Fixate on the cross at the screen center.
+ contents: (Experiment-participant,Task,(Cross,Fixate))
+ initialize-recording: 2 items
+ description:
+ contents: (Recording)
+ Ambiguous Definitions: 0 items
+
+ Errors: 0 items
+```
+````
+
+Since this file didn't have any ambiguous or incorrect `Def-expand` groups, those sections are empty.
+Ambiguous definitions are those that take a placeholder but do not provide enough information
+to determine to which tag the placeholder applies.
+Erroneous definitions are those with conflicting expanded forms.
+
+Currently, summaries are not generated for individual files,
+but this is likely to change in the future.
+
+Below is a simple example showing the format when erroneous or ambiguous definitions are found.
+
+````{admonition} Sample input for *summarize_definitions* operation documenting ambiguous/erroneous definitions.
+:class: tip
+```text
+((Def-expand/Initialize-recording,(Recording)),Onset)
+((Def-expand/Initialize-recording,(Recording, Event)),Onset)
+(Def-expand/Specify-age/1,(Age/1, Item-count/1))
+```
+````
+
+````{admonition} Sample *summarize_definitions* operation error results in text format.
+:class: tip
+```text
+Summary name: HED_column_definition_summary
+Summary type: definitions
+Summary filename: HED_column_definition_summary
+
+Overall summary:
+ Known Definitions: 1 items
+ initialize-recording: 2 items
+ description:
+ contents: (Recording)
+ Ambiguous Definitions: 1 items
+ specify-age/#: (Age/#,Item-count/#)
+ Errors: 1 items
+ initialize-recording:
+ (Event,Recording)
+```
+````
+
+It is assumed that the first definition encountered is the correct one, unless the first one is ambiguous.
+Thus, the summary finds (`Def-expand/Initialize-recording`,(`Recording`)) and considers it valid, before encountering
+(`Def-expand/Initialize-recording`,(`Recording`, `Event`)), which is then deemed an error.
+
+
+(summarize-hed-tags-anchor)=
+### Summarize HED tags
+
+The *summarize_hed_tags* operation extracts a summary of the HED tags present
+in the annotations of a dataset.
+This summary operation assumes that the structure in question is suitably
+annotated with HED (Hierarchical Event Descriptors).
+You must provide a HED schema version.
+If the data has annotations in a JSON sidecar, you must also provide its path.
+
+(summarize-hed-tags-parameters-anchor)=
+#### Summarize HED tags parameters
+
+The *summarize_hed_tags* operation has one required parameter
+(*tags*) in addition to the standard *summary_name* and *summary_filename* parameters.
+
+```{admonition} Parameters for the *summarize_hed_tags* operation.
+:class: tip
+
+| Parameter | Type | Description |
+| ------------ | ---- | ----------- |
+| *summary_name* | str | A unique name used to identify this summary.|
+| *summary_filename* | str | A unique file basename to use for saving this summary. |
+| *tags* | dict | Dictionary with category title keys and tags in that category as values. |
+| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |
+| *include_context* | bool | (**Optional**: Default true) If true, expand the event context to account for onsets and offsets. |
+| *remove_types* | list | (**Optional**) A list of types such as `Condition-variable` and `Task` to remove. |
+| *replace_defs* | bool | (**Optional**: Default true) If true, the `Def` tags are replaced with the contents of the definition (no `Def` or `Def-expand`). |
+| *word_cloud* | dict | (**Optional**) If present, the operation produces a word cloud image in addition to the summaries. |
+```
+
+The *tags* dictionary has keys that specify how the user wishes the tags
+to be categorized for display.
+Note that these keys are titles designating display categories, not HED tags.
+
+The *tags* dictionary values are lists of actual HED tags (or their children)
+that should be listed under the respective display categories.
+
+If the optional parameter *include_context* is true, the counts include tags contributing
+to the event context in events intermediate between onsets and offsets.
+
+If the optional parameter *replace_defs* is true, the tag counts include
+tags contributed by contents of the definitions.
+
+If the *word_cloud* parameter is provided but its value is empty, the default word cloud settings are used.
+The following table lists the optional parameters used to control the appearance of the word cloud image.
+
+```{admonition} Optional keys in the word cloud dictionary value.
+:class: tip
+
+| Parameter | Type | Description |
+| ------------ | ---- | ----------- |
+| *background_color* | str | The matplotlib name of the background color (default "black").|
+| *contour_color* | str | The matplotlib name of the contour color if a mask is provided. |
+| *contour_width* | float | Width of the contour if a mask is provided (default 3). |
+| *font_path* | str | The path of the system font to use in place of the default font. |
+| *height* | int | Height in pixels of the image (default 300).|
+| *mask_path* | str | The path of the mask image to use if *use_mask* is true and an image other than the brain is needed. |
+| *max_font_size* | float | The maximum font size to use in the image (default 15). |
+| *min_font_size* | float | The minimum font size to use in the image (default 8).|
+| *prefer_horizontal* | float | Fraction of horizontal words in the image (default 0.75). |
+| *scale_adjustment* | float | Constant to add to the log10 count transformation (default 7). |
+| *use_mask* | bool | If true, a mask image is used to provide a contour around the words. |
+| *width* | int | Width in pixels of the image (default 400). |
+```
+
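+As an illustration, the following sketch shows a *word_cloud* setting inside a *summarize_hed_tags* parameter dictionary, written in the parsed Python form that the `Dispatcher` receives after the remodel JSON file is loaded (the particular settings are hypothetical):
+
+```python
+# Parsed "parameters" value for a summarize_hed_tags operation.
+# An empty dictionary ("word_cloud": {}) would request the default appearance.
+parameters = {
+    "summary_name": "AOMIC_hed_tag_counts",
+    "summary_filename": "AOMIC_hed_tag_counts",
+    "tags": {"Sensory events": ["Sensory-event"], "Objects": ["Item"]},
+    "word_cloud": {
+        "background_color": "white",   # matplotlib color name
+        "width": 600,                  # pixels
+        "height": 300,                 # pixels
+        "prefer_horizontal": 0.9       # fraction of horizontal words
+    }
+}
+```
+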
+(summarize-hed-tags-example-anchor)=
+#### Summarize HED tags example
+
+The following remodeling command specifies that the tag counts should be grouped
+under the titles: *Sensory events*, *Agent actions*, and *Objects*.
+Any leftover tags will appear under the title "Other tags".
+
+````{admonition} A JSON file with a single *summarize_hed_tags* summarization operation.
+:class: tip
+```json
+[{
+ "operation": "summarize_hed_tags",
+ "description": "Summarize the HED tags in the dataset.",
+ "parameters": {
+ "summary_name": "summarize_hed_tags",
+ "summary_filename": "summarize_hed_tags",
+ "tags": {
+ "Sensory events": ["Sensory-event", "Sensory-presentation",
+ "Task-stimulus-role", "Experimental-stimulus"],
+ "Agent actions": ["Agent-action", "Agent", "Action", "Agent-task-role",
+ "Task-action-type", "Participant-response"],
+ "Objects": ["Item"]
+ }
+ }
+}]
+```
+````
+
+The results of executing this operation on the
+[**sample remodel event file**](sample-remodel-event-file-anchor) are shown below.
+
+````{admonition} Text summary of *summarize_hed_tags* operation on the sample remodel file.
+:class: tip
+
+```text
+Summary name: summarize_hed_tags
+Summary type: hed_tag_summary
+Summary filename: summarize_hed_tags
+
+Overall summary:
+Dataset: Total events=6 Total files=1
+ Main tags[events,files]:
+ Sensory events:
+ Sensory-presentation[6,1] Visual-presentation[6,1] Auditory-presentation[3,1]
+ Agent actions:
+ Incorrect-action[2,1] Correct-action[1,1]
+ Objects:
+ Image[6,1]
+ Other tags[events,files]:
+ Label[6,1] Def[6,1] Delay[3,1]
+
+Individual files:
+
+aomic_sub-0013_excerpt_events.tsv:
+Total events=6
+ Main tags[events,files]:
+ Sensory events:
+ Sensory-presentation[6,1] Visual-presentation[6,1] Auditory-presentation[3,1]
+ Agent actions:
+ Incorrect-action[2,1] Correct-action[1,1]
+ Objects:
+ Image[6,1]
+ Other tags[events,files]:
+ Label[6,1] Def[6,1] Delay[3,1]
+
+```
+````
+
+Because the HED tag *Task-action-type* was specified in the "Agent actions" category,
+*Incorrect-action* and *Correct-action*, which are children of *Task-action-type*
+in the [**HED schema**](https://www.hedtags.org/display_hed.html),
+appear with counts under this category.
+
+The sample events file had 6 events, including 1 correct action and 2 incorrect actions.
+Since only one file was processed, the information for *Dataset* was
+similar to that presented under *Individual files*.
+
+For a more extensive example, see the
+[**text**](./_static/data/summaries/FacePerception_hed_tag_summary.txt)
+and [**JSON**](./_static/data/summaries/FacePerception_hed_tag_summary.json)
+format summaries of the sample dataset
+[**ds003645s_hed**](https://github.com/hed-standard/hed-examples/tree/main/datasets/eeg_ds003645s_hed)
+using the [**summarize_hed_tags_rmdl.json**](./_static/data/summaries/summarize_hed_tags_rmdl.json)
+remodeling file.
+
+(summarize-hed-type-anchor)=
+### Summarize HED type
+
+The *summarize_hed_type* operation is designed to extract experimental design matrices or other
+experimental structure.
+This summary operation assumes that the structure in question is suitably
+annotated with HED (Hierarchical Event Descriptors).
+The [**HED conditions and design matrices**](https://hed-examples.readthedocs.io/en/latest/HedConditionsAndDesignMatrices.html)
+guide explains how this works.
+
+(summarize-hed-type-parameters-anchor)=
+#### Summarize HED type parameters
+
+The *summarize_hed_type* operation provides detailed information about a specified tag,
+usually `Condition-variable` or `Task`.
+This summary provides useful information about experimental design.
+
+```{admonition} Parameters for the *summarize_hed_type* operation.
+:class: tip
+
+| Parameter | Type | Description |
+| ------------ | ---- | ----------- |
+| *summary_name* | str | A unique name used to identify this summary.|
+| *summary_filename* | str | A unique file basename to use for saving this summary. |
+| *type_tag* | str | Tag to produce a summary for (most often *condition-variable*).|
+| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename.|
+```
+In addition to the two standard parameters (*summary_name* and *summary_filename*),
+the *type_tag* parameter is required.
+Only one tag can be given per operation, so you must provide a separate operation in the remodel file
+for each additional type tag, as in the sketch below.
+
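+The following sketch shows such an operation list in the parsed Python form that the `Dispatcher` accepts (the summary names are illustrative):
+
+```python
+# Two summarize_hed_type operations, one per type tag.
+operations = [
+    {"operation": "summarize_hed_type",
+     "description": "Summarize the condition variables.",
+     "parameters": {"summary_name": "condition_variables",
+                    "summary_filename": "condition_variables",
+                    "type_tag": "condition-variable"}},
+    {"operation": "summarize_hed_type",
+     "description": "Summarize the task structure.",
+     "parameters": {"summary_name": "tasks",
+                    "summary_filename": "tasks",
+                    "type_tag": "task"}}
+]
+```
+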
+(summarize-hed-type-example-anchor)=
+#### Summarize HED type example
+
+````{admonition} A JSON file with a single *summarize_hed_type* summarization operation.
+:class: tip
+```json
+[{
+ "operation": "summarize_hed_type",
+ "description": "Summarize column names.",
+ "parameters": {
+ "summary_name": "AOMIC_condition_variables",
+ "summary_filename": "AOMIC_condition_variables",
+ "type_tag": "condition-variable"
+ }
+}]
+```
+````
+
+The results of executing this operation on the
+[**sample remodel event file**](sample-remodel-event-file-anchor) are shown below.
+
+````{admonition} Text summary of *summarize_hed_types* operation on the sample remodel file.
+:class: tip
+
+```text
+Summary name: AOMIC_condition_variables
+Summary type: hed_type_summary
+Summary filename: AOMIC_condition_variables
+
+Overall summary:
+
+Dataset: Type=condition-variable Type values=1 Total events=6 Total files=1
+  image-sex: 2 levels in 6 event(s) out of 6 total events in 1 file(s)
+ female-image-cond [4,1]: ['Female', 'Image', 'Face']
+ male-image-cond [2,1]: ['Male', 'Image', 'Face']
+
+Individual files:
+
+aomic_sub-0013_excerpt_events.tsv:
+Type=condition-variable Total events=6
+ image-sex: 2 levels in 6 events
+ female-image-cond [4 events, 1 files]:
+ Tags: ['Female', 'Image', 'Face']
+ male-image-cond [2 events, 1 files]:
+ Tags: ['Male', 'Image', 'Face']
+```
+````
+
+Because *summarize_hed_type* is a HED operation,
+a HED schema version is required and a JSON sidecar is also usually needed.
+This summary was produced by using `hed_version="8.1.0"` when creating the `dispatcher`
+and using the [**sample remodel sidecar file**](sample-remodel-sidecar-file-anchor) in the `do_op`.
+The sidecar provides the annotations that use the `condition-variable` tag in the summary.
+
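+A minimal sketch of that workflow (assuming the `Dispatcher` interface of the HED Python remodeling tools; the module path and parameter names, such as `hed_versions`, are assumptions that may vary by version):
+
+```python
+import json
+from hed.tools.remodeling.dispatcher import Dispatcher  # assumed module path
+
+# Load a remodel file containing a summarize_hed_type operation.
+with open("summarize_hed_types_rmdl.json", "r") as fp:
+    operations = json.load(fp)
+
+# HED operations require a schema version; the sidecar supplies annotations.
+dispatcher = Dispatcher(operations, hed_versions="8.1.0")
+dispatcher.run_operations("aomic_sub-0013_excerpt_events.tsv",
+                          sidecar="task-stopsignal_acq-seq_events.json")
+
+# The accumulated summary is stored in the dispatcher's summary_dict.
+summary = dispatcher.summary_dict["AOMIC_condition_variables"]
+```
+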
+For a more extensive example, see the
+[**text**](./_static/data/summaries/FacePerception_hed_type_summary.txt)
+and [**JSON**](./_static/data/summaries/FacePerception_hed_type_summary.json)
+format summaries of the sample dataset
+[**ds003645s_hed**](https://github.com/hed-standard/hed-examples/tree/main/datasets/eeg_ds003645s_hed)
+using the [**summarize_hed_types_rmdl.json**](./_static/data/summaries/summarize_hed_types_rmdl.json)
+remodeling file.
+
+(summarize-hed-validation-anchor)=
+### Summarize HED validation
+
+The *summarize_hed_validation* operation runs the HED validator on the requested data
+and produces a summary of the errors.
+See the [**HED validation guide**](./HedValidationGuide.md) for available methods of
+running the HED validator.
+
+
+(summarize-hed-validation-parameters-anchor)=
+#### Summarize HED validation parameters
+
+In addition to the required *summary_name* and *summary_filename* parameters,
+the *summarize_hed_validation* operation has an optional boolean parameter *check_for_warnings*.
+If *check_for_warnings* is false (the default), the summary does not report warnings.
+
+```{admonition} Parameters for the *summarize_hed_validation* operation.
+:class: tip
+
+| Parameter | Type | Description |
+| ------------ | ---- | ----------- |
+| *summary_name* | str | A unique name used to identify this summary.|
+| *summary_filename* | str | A unique file basename to use for saving this summary. |
+| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |
+| *check_for_warnings* | bool | (**Optional**: Default false) If true, warnings are reported in addition to errors. |
+```
+The *summarize_hed_validation* operation is a HED operation: the calling program must provide a HED schema version
+and usually a JSON sidecar containing the HED annotations.
+
+The validation process takes place in two stages: first the JSON sidecar is validated,
+and the data files are validated only if the sidecar passes.
+This strategy is used because a single error in the JSON sidecar can generate an error message
+for every line in the corresponding data file.
+
+If the JSON sidecar has errors (warnings don't count), the validation process terminates
+without validating the data file and its assembled HED annotations.
+
+If the JSON sidecar does not have errors,
+the validator assembles the annotations for each line in the data files and validates
+the assembled HED annotation.
+Data file-wide consistency, such as matched onsets and offsets, is also checked.
+
+
+(summarize-hed-validation-example-anchor)=
+#### Summarize HED validation example
+
+````{admonition} A JSON file with a single *summarize_hed_validation* summarization operation.
+:class: tip
+```json
+[{
+ "operation": "summarize_hed_validation",
+ "description": "Summarize validation errors in the sample dataset.",
+ "parameters": {
+ "summary_name": "AOMIC_sample_validation",
+ "summary_filename": "AOMIC_sample_validation",
+ "check_for_warnings": true
+ }
+}]
+```
+````
+
+To demonstrate the output of the validation operation, we modified the first row of the
+[**sample remodel event file**](sample-remodel-event-file-anchor)
+so that the `trial_type` column contained the value `baloney` rather than `go`.
+This modification generates a warning because the meaning of `baloney` is not defined
+in the [**sample remodel sidecar file**](sample-remodel-sidecar-file-anchor).
+The results of executing the example operation with the modified file are shown
+in the following example.
+
+
+````{admonition} Text summary of *summarize_hed_validation* operation on a modified sample data file.
+:class: tip
+
+```text
+Summary name: AOMIC_sample_validation
+Summary type: hed_validation
+Summary filename: AOMIC_sample_validation
+
+Summary details:
+
+Dataset: [1 sidecar files, 1 event files]
+ task-stopsignal_acq-seq_events.json: 0 issues
+ sub-0013_task-stopsignal_acq-seq_events.tsv: 6 issues
+
+Individual files:
+
+ sub-0013_task-stopsignal_acq-seq_events.tsv: 1 sidecar files
+ task-stopsignal_acq-seq_events.json has no issues
+ sub-0013_task-stopsignal_acq-seq_events.tsv issues:
+ HED_UNKNOWN_COLUMN: WARNING: Column named 'onset' found in file, but not specified as a tag column or identified in sidecars.
+ HED_UNKNOWN_COLUMN: WARNING: Column named 'duration' found in file, but not specified as a tag column or identified in sidecars.
+ HED_UNKNOWN_COLUMN: WARNING: Column named 'response_time' found in file, but not specified as a tag column or identified in sidecars.
+ HED_UNKNOWN_COLUMN: WARNING: Column named 'response_accuracy' found in file, but not specified as a tag column or identified in sidecars.
+ HED_UNKNOWN_COLUMN: WARNING: Column named 'response_hand' found in file, but not specified as a tag column or identified in sidecars.
+ HED_SIDECAR_KEY_MISSING[row=0,column=2]: WARNING: Category key 'baloney' does not exist in column. Valid keys are: ['succesful_stop', 'unsuccesful_stop', 'go']
+
+```
+````
+
+This summary was produced using HED schema version `hed_version="8.1.0"` when creating the `dispatcher`
+and using the [**sample remodel sidecar file**](sample-remodel-sidecar-file-anchor) in the `do_op`.
+
+
+(summarize-sidecar-from-events-anchor)=
+### Summarize sidecar from events
+
+The summarize sidecar from events operation generates a sidecar template from the event
+files in the dataset.
+
+
+(summarize-sidecar-from-events-parameters-anchor)=
+#### Summarize sidecar from events parameters
+
+The following table lists the parameters required for using the summary.
+
+```{admonition} Parameters for the *summarize_sidecar_from_events* operation.
+:class: tip
+
+| Parameter | Type | Description |
+| ------------ | ---- | ----------- |
+| *summary_name* | str | A unique name used to identify this summary.|
+| *summary_filename* | str | A unique file basename to use for saving this summary. |
+| *skip_columns* | list | A list of column names to omit from the sidecar.|
+| *value_columns* | list | A list of columns to treat as value columns in the sidecar. |
+| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |
+```
+The standard summary parameters, *summary_name* and *summary_filename*, are required.
+The *summary_name* is the unique key used to identify the
+particular incarnation of this summary in the dispatcher.
+Since a particular remodel file may use a given operation multiple times,
+care should be taken to make sure that each summary name is unique.
+
+The *summary_filename* should also be unique and is used for saving the summary upon request.
+When the remodeler is applied to full datasets rather than single files,
+the summaries are saved in the `derivatives/remodel/summaries` directory under the dataset root.
+A time stamp and file extension are appended to the *summary_filename* when the
+summary is saved.
+
+In addition to the standard *summary_name* and *summary_filename* parameters required of all summaries,
+the *summarize_sidecar_from_events* operation requires two additional lists to be supplied.
+The *skip_columns* list specifies the names of columns to skip entirely in
+generating the sidecar template.
+The *value_columns* list specifies the names of columns to treat as value columns
+when generating the sidecar template.
+
+(summarize-sidecar-from-events-example-anchor)=
+#### Summarize sidecar from events example
+
+The following example shows the JSON for including this operation in a remodeling file.
+
+````{admonition} A JSON file with a single *summarize_sidecar_from_events* summarization operation.
+:class: tip
+```json
+[{
+ "operation": "summarize_sidecar_from_events",
+ "description": "Generate a sidecar from the excerpted events file.",
+ "parameters": {
+ "summary_name": "AOMIC_generate_sidecar",
+ "summary_filename": "AOMIC_generate_sidecar",
+ "skip_columns": ["onset", "duration"],
+ "value_columns": ["response_time", "stop_signal_delay"]
+ }
+}]
+```
+````
+
+The results of executing this operation on the
+[**sample remodel event file**](sample-remodel-event-file-anchor)
+are shown in the following example using the text format.
+
+````{admonition} Sample *summarize_sidecar_from_events* operation results in text format.
+:class: tip
+```text
+Summary name: AOMIC_generate_sidecar
+Summary type: events_to_sidecar
+Summary filename: AOMIC_generate_sidecar
+
+Dataset: Currently no overall sidecar extraction is available
+
+Individual files:
+
+aomic_sub-0013_excerpt_events.tsv: Total events=6 Skip columns: ['onset', 'duration']
+Sidecar:
+{
+ "trial_type": {
+ "Description": "Description for trial_type",
+ "HED": {
+ "go": "(Label/trial_type, Label/go)",
+ "succesful_stop": "(Label/trial_type, Label/succesful_stop)",
+ "unsuccesful_stop": "(Label/trial_type, Label/unsuccesful_stop)"
+ },
+ "Levels": {
+ "go": "Here describe column value go of column trial_type",
+ "succesful_stop": "Here describe column value succesful_stop of column trial_type",
+ "unsuccesful_stop": "Here describe column value unsuccesful_stop of column trial_type"
+ }
+ },
+ "response_accuracy": {
+ "Description": "Description for response_accuracy",
+ "HED": {
+ "correct": "(Label/response_accuracy, Label/correct)"
+ },
+ "Levels": {
+ "correct": "Here describe column value correct of column response_accuracy"
+ }
+ },
+ "response_hand": {
+ "Description": "Description for response_hand",
+ "HED": {
+ "left": "(Label/response_hand, Label/left)",
+ "right": "(Label/response_hand, Label/right)"
+ },
+ "Levels": {
+ "left": "Here describe column value left of column response_hand",
+ "right": "Here describe column value right of column response_hand"
+ }
+ },
+ "sex": {
+ "Description": "Description for sex",
+ "HED": {
+ "female": "(Label/sex, Label/female)",
+ "male": "(Label/sex, Label/male)"
+ },
+ "Levels": {
+ "female": "Here describe column value female of column sex",
+ "male": "Here describe column value male of column sex"
+ }
+ },
+ "response_time": {
+ "Description": "Description for response_time",
+ "HED": "(Label/response_time, Label/#)"
+ },
+ "stop_signal_delay": {
+ "Description": "Description for stop_signal_delay",
+ "HED": "(Label/stop_signal_delay, Label/#)"
+ }
+}
+```
+````
+
+(remodel-implementation-anchor)=
+## Remodel implementation
+
+Operations are defined as classes that extend `BaseOp` regardless of whether
+they are transformations or summaries. However, summaries must also implement
+an additional supporting class that extends `BaseSummary` to hold the summary information.
+
+In order to be executed by the remodeling functions,
+an operation must appear in the `valid_operations` dictionary.
+
+Each operation class must have a `NAME` class variable specifying the operation name (a string) and a
+`PARAMS` class variable containing a dictionary of the operation's parameters represented as a JSON schema.
+The operation's constructor extends the `BaseOp` class constructor by calling:
+
+````{admonition} A remodel operation class must call the BaseOp constructor first.
+:class: tip
+```python
+ super().__init__(parameters)
+```
+````
+
+A remodel operation class must implement the `BaseOp` abstract methods `do_op` and `validate_input_data`.
+
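+Putting these requirements together, the following sketch outlines a hypothetical new operation
+(the class, operation name, and base-class import path are illustrative assumptions):
+
+```python
+from hed.tools.remodeling.operations.base_op import BaseOp  # assumed module path
+
+
+class DropRowsOp(BaseOp):
+    """ Hypothetical operation that drops rows where a column equals a value. """
+
+    NAME = "drop_rows"
+    PARAMS = {
+        "type": "object",
+        "properties": {
+            "column_name": {"type": "string"},
+            "value": {"type": "string"}
+        },
+        "required": ["column_name", "value"],
+        "additionalProperties": False
+    }
+
+    def __init__(self, parameters):
+        super().__init__(parameters)
+        self.column_name = parameters["column_name"]
+        self.value = parameters["value"]
+
+    def do_op(self, dispatcher, df, name, sidecar=None):
+        # Return a new DataFrame with the matching rows removed.
+        return df[df[self.column_name] != self.value].reset_index(drop=True)
+
+    @staticmethod
+    def validate_input_data(parameters):
+        # No cross-parameter constraints beyond the JSON schema.
+        return []
+```
+
+To be executable by the remodeling functions, such a class would also have to be entered in the
+`valid_operations` dictionary, as noted above.
+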
+### The PARAMS dictionary
+
+The class-wide `PARAMS` dictionary specifies the required and optional parameters of the operation as a [**JSON schema**](https://json-schema.org/).
+We currently use draft 2020-12.
+The basic vocabulary allows specifying the types of the expected parameters and
+whether a parameter is required or optional.
+
+It is also possible to add dependencies between parameters. More information can be found in the JSON schema
+[**documentation**](https://json-schema.org/learn/getting-started-step-by-step).
+
+At the highest level, the type should always be specified as an object, since the parameters are always provided as a dictionary or JSON object.
+Under the `properties` key, the expected parameters should be listed, along with the expected datatype of each parameter.
+The specifications can be nested: for example, the `rename_columns` operation requires a parameter `column_mapping`,
+which should be a JSON object whose keys are any valid string and whose values are also strings.
+This is represented in the following way:
+
+```json
+{
+ "type": "object",
+ "properties": {
+ "column_mapping": {
+ "type": "object",
+ "patternProperties": {
+ ".*": {
+ "type": "string"
+ }
+ },
+ "minProperties": 1
+ },
+ "ignore_missing": {
+ "type": "boolean"
+ }
+ },
+ "required": [
+ "column_mapping",
+ "ignore_missing"
+ ],
+ "additionalProperties": false
+}
+```
+
+The `PARAMS` dictionaries for all available operations are read by the `validator` and compiled into a
+single JSON schema that represents the specification for remodeler files.
+The `properties` dictionary explicitly specifies the parameters that are allowed for this operation.
+The `required` list specifies which parameters must be included when calling the operation.
+Parameters that are not required may be omitted in the operation call.
+The `"additionalProperties": false` entry is the JSON schema way of
+indicating that no other parameters are allowed in the call to the operation.
+
+A limitation of JSON schema is that although it can handle specific dependencies between keys,
+it cannot validate data provided in the JSON file against other data in the same file.
+For example, if the requirement is a list of elements whose length should be specified by another parameter,
+JSON schema does not provide a vocabulary for expressing this dependency.
+Instead, we handle these types of dependencies in the `validate_input_data` method.
+
+(operation-class-constructor-anchor)=
+### Operation class constructor
+
+All the operation classes have constructors that start with a call to the `BaseOp` superclass constructor.
+The following example shows the constructor for the `RemoveColumnsOp` class.
+
+````{admonition} The constructor for the RemoveColumnsOp class.
+:class: tip
+```python
+ def __init__(self, parameters):
+ super().__init__(parameters)
+ self.column_names = parameters['column_names']
+ ignore_missing = parameters['ignore_missing']
+ if ignore_missing:
+ self.error_handling = 'ignore'
+ else:
+ self.error_handling = 'raise'
+```
+````
+
+After the call to the base class constructor, the operation constructor assigns the operation-specific
+values to class properties. Validation takes place before the operation classes are initialized.
+
+
+(the-do_op-implementation-anchor)=
+### The do_op implementation
+The remodeling script is meant to be executed by the `Dispatcher`,
+which keeps a compiled version of the remodeling script to execute on each tabular file to be remodeled.
+
+The main method that must be implemented by each operation is `do_op`, which takes
+an instance of the `Dispatcher` class as the first parameter and a Pandas [`DataFrame`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html)
+representing the tabular file as the second parameter.
+A third required parameter is a name used to identify the tabular file in error messages and summaries.
+This name is usually the filename or the filepath from the dataset root.
+An additional optional argument, a sidecar containing HED annotations,
+needs to be included only for HED operations.
+Note that the `Dispatcher` is responsible for holding the appropriate version of the HED schema if
+HED remodeling operations are included.
+
+The following example shows a sample implementation for `do_op`.
+
+````{admonition} The implementation of do_op for the RemoveColumnsOp class.
+:class: tip
+```python
+
+    def do_op(self, dispatcher, df, name, sidecar=None):
+        return df.drop(self.column_names, axis=1, errors=self.error_handling)
+```
+````
+
+The `do_op` in this case is a wrapper for the underlying Pandas `DataFrame`
+operation for removing columns.
+
+**IMPORTANT NOTE**: The `do_op` operation always assumes that `n/a` values have been
+replaced by `numpy.NaN` values in the incoming dataframe `df`.
+The `Dispatcher` class has a static method `prep_data` that does this replacement.
+At the end of running all the remodeling operations on a data file, the `Dispatcher` method `run_operations`
+replaces all of the `numpy.NaN` values with `n/a`, the value expected by BIDS.
+This replacement is performed by the `Dispatcher` static method `post_proc_data`.
+
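+The following sketch illustrates this contract, using the method names described above (treat the exact signatures as assumptions):
+
+```python
+import pandas as pd
+from hed.tools.remodeling.dispatcher import Dispatcher  # assumed module path
+
+df = pd.DataFrame({"response_time": ["0.565", "n/a"]})
+
+# Before any operation runs: 'n/a' entries are replaced by numpy.NaN.
+df = Dispatcher.prep_data(df)
+
+# ... remodeling operations execute on the NaN form ...
+
+# After all operations: NaN entries are restored to the BIDS 'n/a'.
+df = Dispatcher.post_proc_data(df)
+```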
+
+### The validate_input_data implementation
+
+This method exists to handle additional input data validation that cannot be specified in JSON schema.
+It is a class method that is called by the `validator`.
+If there is no additional validation to be done,
+a minimal implementation should take in a dictionary with the operation parameters and return an empty list.
+If additional validation is required, the method should implement the validation directly and return a list of user-friendly
+error messages (strings) if validation fails, or an empty list if there are no errors.
+
+The following implementation of `validate_input_data` method, for the `factor_hed_tags` operation,
+checks whether the parameter `query_names` is the same length as the input for parameter `queries`,
+since the names specified in the first parameter are meant to represent the queries provided in the latter.
+The check only takes place if `query_names` exists, since naming is handled automatically otherwise.
+
+```python
+@staticmethod
+def validate_input_data(parameters):
+ errors = []
+ if parameters.get("query_names", False):
+ if len(parameters.get("query_names")) != len(parameters.get("queries")):
+ errors.append("The list in query_names, in the factor_hed_tags operation, should have the same number of items as queries.")
+ return errors
+```
+
+
+(the-do_op-for-summarization-anchor)=
+### The do_op for summarization
+
+The `do_op` operation for summarization operations has a slightly different form,
+as it serves primarily as a wrapper for the actual summary information as illustrated
+by the following example.
+
+(implementation-of-do-op_summarize-column-names-anchor)=
+````{admonition} The implementation of do_op for SummarizeColumnNamesOp.
+:class: tip
+```python
+ def do_op(self, dispatcher, df, name, sidecar=None):
+ summary = dispatcher.summary_dict.get(self.summary_name, None)
+ if not summary:
+ summary = ColumnNameSummary(self)
+ dispatcher.summary_dict[self.summary_name] = summary
+ summary.update_summary({"name": name, "column_names": list(df.columns)})
+ return df
+
+```
+````
+
+A `do_op` operation for a summarization checks the dispatcher's `summary_dict` to see whether
+a summary with its name already exists.
+If not, the operation creates a `BaseSummary` object for its summary (e.g., `ColumnNameSummary`)
+and adds this object to the dispatcher's `summary_dict`;
+otherwise the operation fetches the existing `BaseSummary` object.
+It then asks this `BaseSummary` object to update the summary based on the `DataFrame`,
+as explained in the next section.
+
+(additional-requirements-for-summarization-anchor)=
+### Additional requirements for summarization
+
+Any summary operation must implement a supporting class that extends `BaseSummary`.
+This class is used to hold and accumulate the information specific to the summary.
+This support class must implement two methods: `update_summary` and `get_summary_details`.
+
+The `update_summary` method is called by its associated `BaseOp` operation during the `do_op`
+to update the summary information based on the current `DataFrame`.
+The `update_summary` method takes a single parameter, which is a dictionary of information
+specific to this operation.
+
+````{admonition} The update_summary method required to be implemented by all BaseSummary objects.
+:class: tip
+```python
+ def update_summary(self, summary_dict)
+```
+````
+
+In the example [do_op for ColumnNamesOp](implementation-of-do-op_summarize-column-names-anchor),
+the dictionary contains keys for `name` and `column_names`.
+
+The `get_summary_details` method returns a dictionary with the summary-specific information
+currently in the summary.
+The `BaseSummary` class provides universal methods for converting this summary to JSON or text format.
+
+
+````{admonition} The get_summary_details method required to be implemented by all BaseSummary objects.
+:class: tip
+```python
+ get_summary_details(self, verbose=True)
+```
+````
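+
+A minimal sketch of a supporting summary class (hypothetical names; the base-class import path is an assumption):
+
+```python
+from hed.tools.remodeling.operations.base_summary import BaseSummary  # assumed path
+
+
+class DroppedRowsSummary(BaseSummary):
+    """ Hypothetical summary that counts rows dropped per file. """
+
+    def __init__(self, sum_op):
+        super().__init__(sum_op)
+        self.dropped = {}
+
+    def update_summary(self, summary_dict):
+        # Called during do_op with operation-specific information.
+        name = summary_dict["name"]
+        self.dropped[name] = self.dropped.get(name, 0) + summary_dict["rows_dropped"]
+
+    def get_summary_details(self, verbose=True):
+        # Return the accumulated information; BaseSummary converts
+        # this dictionary to text or JSON format.
+        return {"files": self.dropped, "total": sum(self.dropped.values())}
+```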
+
+### Validator implementation
+
+The required input for the remodeler is specified in JSON format and must follow
+the rules laid out by the JSON schema.
+The parameters in the remodeler file must conform to the properties specified
+in the corresponding JSON schema associated with each operation.
+Errors retrieved from the validator are not passed on directly but are instead
+modified into user-friendly error messages for display.
+
+Validation errors are organized by stages as follows.
+
+#### Stage 0: top-level structure
+
+Stage 0 refers to the top-level structure of the remodel JSON file.
+As specified by the validator's `BASE_ARRAY`,
+a JSON remodel file must be an array of operation dictionaries containing at least 1 item.
+
+#### Stage 1: operation dictionary format
+
+Stage 1 validation refers to the structure of the individual operations as specified by the validator's `OPERATION_DICT`.
+Every operation dictionary should have exactly the keys: `operation`, `description`, and `parameters`.
+
+#### Stage 2: operation dictionary values
+
+Stage 2 validates the values associated with the keys in each operation dictionary.
+The `operation` and `description` keys should have string values,
+while the `parameters` key should have a dictionary value.
+
+Stage 2 validation also verifies that the operation value is one of the valid operations as
+enumerated in the `valid_operations` dictionary.
+
+Several checks are also applied to the `parameters` dictionary.
+The properties listed as `required` in the schema must appear as keys in the `parameters` dictionary.
+
+If additional properties are not allowed, as designated by `"additionalProperties": false` in the JSON schema,
+the validator verifies that parameters not mentioned in the schema do not appear.
+Note that this is currently true for all operations and recommended for new operations.
+
+If the schema for the operation has a `dependentRequired` dictionary, the validator
+verifies that the indicated keys are present if the listed parameter values are also present.
+For example, the `factor_column` operation only allows the `factor_names` parameter if the `factor_values`
+parameter is also present. The dependency works only one way: `factor_values`
+can be provided without `factor_names`. If `factor_values` is provided alone, the operation automatically generates the
+factor names based on the column values. However, without `factor_values`, the names provided
+in `factor_names` do not correspond to anything, so `factor_names` cannot appear on its own.
+
+#### Later validation stages
+
+Later stages in validation concern the values given within the parameters object, which can be nested to an arbitrary level
+and are handled in a general way.
+The user is provided with the operation index, name, and the 'path' of the invalid value.
+Note that while `parameters` always contains an object, the values in `parameters` can be of any type.
+Thus, parameter values can be objects whose values might also be expected to be objects, arrays, or arrays of objects.
+The validator has appropriate messages for many of the conditions that can be set with JSON schema,
+but if a new operation parameter has a condition that has not been used yet, a new error message will need to be added to the validator.
+
+
+When validation against JSON schema passes,
+the validator performs additional data-specific validation by calling `validate_input_data`
+for each operation to verify that input data satisfies the
+constraints that fall outside the scope of JSON schema.
+Also see [**The validate_input_data implementation**](#the-validate_input_data-implementation) and
+[**The PARAMS dictionary**](#the-params-dictionary) sections for additional information.
diff --git a/docs/source/HedSearchGuide.md b/docs/source/HedSearchGuide.md
index c4dde7d..e4df36e 100644
--- a/docs/source/HedSearchGuide.md
+++ b/docs/source/HedSearchGuide.md
@@ -241,8 +241,8 @@ based on HED annotations in a dataset-independent manner.
These queries can be used to locate data sets satisfying the specified criteria
and to find the relevant event markers in that data.
-For example, the [**factor_hed_tags**](https://www.hed-resources.org/en/latest/FileRemodelingTools.html#factor-hed-tags)
-operation of the HED [**file remodeling tools**](https://www.hed-resources.org/en/latest/FileRemodelingTools.html)
+For example, the [**factor_hed_tags**](https://www.hed-resources.org/en/latest/HedRemodelingTools.html#factor-hed-tags)
+operation of the [**HED remodeling tools**](https://www.hed-resources.org/en/latest/HedRemodelingTools.html)
creates factor vectors for selecting events satisfying general HED queries.
The [**HED-based epoching**](https://www.hed-resources.org/en/latest/HedMatlabTools.html#hed-based-epoching) tools in [**EEGLAB**](https://sccn.ucsd.edu/eeglab/index.php)
diff --git a/docs/source/HedSummaryGuide.md b/docs/source/HedSummaryGuide.md
index 51af930..27c362b 100644
--- a/docs/source/HedSummaryGuide.md
+++ b/docs/source/HedSummaryGuide.md
@@ -1,7 +1,7 @@
(hed-summary-guide-anchor)=
# HED summary guide
-The HED [**File remodeling tools**](https://www.hed-resources.org/en/latest/FileRemodelingTools.html) provide a number of event summaries
+The [**HED remodeling tools**](https://www.hed-resources.org/en/latest/HedRemodelingTools.html) provide a number of event summaries
and event file transformations that are very useful during curation and analysis.
The summaries described in this guide are:
@@ -10,8 +10,8 @@ The summaries described in this guide are:
* [**HED tag summary**](hed-tag-summary-anchor)
* [**Experimental design summary**](experimental-design-summary-anchor)
-As described in more detail in the [**File remodeling quickstart**](https://www.hed-resources.org/en/latest/FileRemodelingQuickstart.html) tutorial and the
-[**File remodeling tools**](https://www.hed-resources.org/en/latest/FileRemodelingTools.html)
+As described in more detail in the [**HED remodeling quickstart**](https://www.hed-resources.org/en/latest/HedRemodelingQuickstart.html) tutorial and the
+[**HED remodeling tools**](https://www.hed-resources.org/en/latest/HedRemodelingTools.html)
user manual, these tools have as input, a JSON file with a list of remodeling commands and
an event file. Summaries involving HED also require a HED schema version and possibly a
JSON sidecar containing HED annotations.
diff --git a/docs/source/HedValidationGuide.md b/docs/source/HedValidationGuide.md
index 80d524f..757d3f8 100644
--- a/docs/source/HedValidationGuide.md
+++ b/docs/source/HedValidationGuide.md
@@ -250,7 +250,7 @@ Errors, if any are printed to the command line.
#### Remodeling validation summaries
Validation is also available through HED remodeling tool interface.
-As explained in [**File remodeling quickstart**](./FileRemodelingQuickstart.md),
+As explained in [**HED remodeling quickstart**](./HedRemodelingQuickstart),
the HED remodeling tools allow users to restructure their event files and/or
summarize their contents in various ways.
Users specify a list of operations in a JSON remodeling file,
@@ -318,7 +318,7 @@ For example, the text file is:
`/root_path/derivatives/remodel/summaries/validate_initial_xxx.txt`
where xxx is the time of generation.
-For more information see [**File remodeling quickstart**](./FileRemodelingQuickstart.md)
+For more information see [**HED remodeling quickstart**](./HedRemodelingQuickstart)
for an overview of the remodeling process and
-[**File remodeling tools**](./FileRemodelingTools.md) for detailed descriptions of
+[**HED remodeling tools**](./HedRemodelingTools) for detailed descriptions of
the operations that are currently supported.
\ No newline at end of file
diff --git a/docs/source/HowCanYouUseHed.md b/docs/source/HowCanYouUseHed.md
index 81aff08..46d9595 100644
--- a/docs/source/HowCanYouUseHed.md
+++ b/docs/source/HowCanYouUseHed.md
@@ -92,7 +92,7 @@ in post-processing and assure that the conditions are correctly marked.
#### Logs to event files
Although the HED tools do not yet directly support any particular experimental presentation/control
-software packages, the HED [**File remodeling tools**](./FileRemodelingTools.md) can
+software packages, the [**HED remodeling tools**](./HedRemodelingTools) can
be useful in working with logged data.
Assuming that you can put the information from your experimental log into a tabular form such as:
@@ -108,8 +108,8 @@ Assuming that you can put the information from your experimental log into a tabu
````
-The [**summarize column values**](./FileRemodelingTools.md#summarize-column-values)
-operation in the HED [**file remodeling tools**](./FileRemodelingTools.md)
+The [**summarize column values**](./HedRemodelingTools#summarize-column-values)
+operation in the [**HED remodeling tools**](./HedRemodelingTools)
compiles detailed summaries of the contents of tabular files.
Use the following remodeling file and your tabular log file as input
to the HED online [**event remodeling**](https://hedtools.ucsd.edu/hed_dev/events) tools
@@ -135,10 +135,10 @@ to quickly get an overview of its contents.
### Post-processing the event data
The information that first comes off the experimental logs is usually not directly usable for
-sharing and analysis. A number of HED [**File remodeling tools**](./FileRemodelingTools.md)
+sharing and analysis. A number of the [**HED remodeling tools**](./HedRemodelingTools)
might be helpful for restructuring your first pass at the event files.
-The [**remap columns**](./FileRemodelingTools.md#remap-columns) transformation is
+The [**remap columns**](./HedRemodelingTools#remap-columns) transformation is
particularly useful during the initial processing of tabular log information
as exemplified by the following example
@@ -395,7 +395,7 @@ to improperly handle these situations, reducing the accuracy of analysis.
At this time, your only option is to do manual checks or write custom code to
detect these types of experiment-specific inconsistencies.
However, work is underway to include some standard types of checks in the
-HED [**File remodeling tools**](./FileRemodelingTools.md) in future releases.
+[**HED remodeling tools**](./HedRemodelingTools) in future releases.
You may also want to reorganize the event files using the remodeling tools.
See the [**Remap columns**](remap-columns-anchor)
@@ -431,7 +431,7 @@ work and possibly contact with the data authors for correct use and interpretati
You can get a preliminary sense about what is actually in the data by downloading a
single event file (e.g., a BIDS `_events.tsv`) and its associated JSON sidecar
(e.g., a BIDS `_events.json`) and creating HED remodeling tool summaries using the
-[**HED online tools for debugging**](./FileRemodelingQuickstart.md#online-tools-for-debugging).
+[**HED online tools for debugging**](./HedRemodelingQuickstart#online-tools-for-debugging).
Summaries of particular use for analysts include:
- The [**column value summary**](./HedSummaryGuide.md#column-value-summary) compiles a summary of
@@ -449,11 +449,11 @@ or temporal layout of the experiment.
While HED tag summary and the experimental design summaries require that the dataset have HED annotations, these summaries do not rely on the experiment-specific
event-coding used in each experiment and can be used to compare information for different datasets.
-The [**File remodeling quickstart**](./FileRemodelingQuickstart.md) tutorial
+The [**HED remodeling quickstart**](./HedRemodelingQuickstart) tutorial
gives an overview of the remodeling tools and how to use them.
-More detailed information can be found in [**File remodeling tools**](./FileRemodelingTools.md).
+More detailed information can be found in [**HED remodeling tools**](./HedRemodelingTools).
-The [**Online tools for debugging**](./FileRemodelingQuickstart.md#online-tools-for-debugging)
+The [**Online tools for debugging**](./HedRemodelingQuickstart#online-tools-for-debugging)
shows how to use remodeling tools to obtain these summaries without writing any code.
The [**HED conditions and design matrices**](HedConditionsAndDesignMatrices.md) guide explains how
@@ -474,8 +474,8 @@ additional code, while generality allows comparison of criteria across different
experiments.
The factor generation as described in the next section relies on the HED
-[**File remodeling tools**](FileRemodelingTools.md).
-See [**File remodeling tools**](FileRemodelingTools.md).
+[**remodeling tools**](HedRemodelingTools).
+See [**HED remodeling tools**](HedRemodelingTools).
(factor-vectors-and-selection-anchor)=
#### Factor vectors and selection
@@ -490,18 +490,18 @@ rows as the event file (each row corresponding to an event marker).
Factor vectors contain 1's for rows in which a specified criterion is satisfied
and 0's otherwise.
-- The [**factor column operation**](./FileRemodelingTools.md#factor-column)
+- The [**factor column operation**](./HedRemodelingTools#factor-column)
creates factor vectors based on the unique values in specified columns.
This factor operation does not require any HED information.