Commit

Merge pull request hed-standard#356 from VisLab/develop
Many corrections to the remodeling tools documentation
VisLab authored Feb 2, 2024
2 parents 7b047e3 + 35b4048 commit 94dc628
Showing 2 changed files with 51 additions and 42 deletions.
88 changes: 46 additions & 42 deletions docs/source/FileRemodelingTools.md
@@ -711,8 +711,7 @@ from the data file if the columns exist.
"operation": "remove_columns",
"description": "Remove unwanted columns prior to analysis",
"parameters": {
"remove_names": ["value", "sample"],
"ignore_missing": true
"remove_names": ["value", "sample"]
}
}
]
@@ -829,7 +828,7 @@ based on column values.
| ------------ | ---- | ----------- |
| *column_name* | str | The name of the column to be factored.|
| *factor_values* | list | Column values to be included as factors. |
| *factor_names* | list| Column names for created factors. |
| *factor_names* | list| (**Optional**) Column names for created factors. |
```

If *column_name* is not a column in the data file, a `ValueError` is raised.
@@ -841,8 +840,8 @@ If a specified value is missing in a particular file, the corresponding factor c
If *factor_names* is empty, the newly created columns are of the
form *column_name.factor_value*.
Otherwise, the newly created columns have names *factor_names*.
If *factor_names* is not empty, then *factor_values* must also be specified and
both lists must be of the same length.
If *factor_names* is not empty, then *factor_values* must also be specified
and both lists must be of the same length.
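
The naming and length rules described in this hunk can be sketched in plain Python. This is only an illustration under stated assumptions, not the remodeling tools' implementation: the function name `factor_column` and the list-of-dicts row representation are hypothetical, and the sketch assumes *factor_values* is always given explicitly.

```python
def factor_column(rows, column_name, factor_values, factor_names=None):
    """Return copies of rows with one-hot factor columns appended.

    Hypothetical helper for illustration only; rows is a list of dicts.
    """
    factor_names = factor_names or []
    # A missing column is an error, mirroring the ValueError described above.
    if any(column_name not in row for row in rows):
        raise ValueError(f"{column_name} is not a column in the data")
    # When names are given, they must pair up one-to-one with the values.
    if factor_names and len(factor_names) != len(factor_values):
        raise ValueError("factor_names and factor_values must have the same length")
    # Default column names have the form column_name.factor_value.
    names = factor_names or [f"{column_name}.{value}" for value in factor_values]
    result = []
    for row in rows:
        new_row = dict(row)
        for name, value in zip(names, factor_values):
            new_row[name] = 1 if row[column_name] == value else 0
        result.append(new_row)
    return result
```

Calling `factor_column(rows, "trial_type", ["go", "stop"])` on rows that have a `trial_type` column would add columns named `trial_type.go` and `trial_type.stop`; passing `["is_go", "is_stop"]` as the fourth argument would use those names instead.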

(factor-column-example-anchor)=
#### Factor column example
@@ -906,17 +905,20 @@ The [**HED search guide**](./HedSearchGuide.md) tutorial discusses the HED searc
| Parameter | Type | Description |
| ------------ | ---- | ----------- |
| *queries* | list | A list of HED query strings. |
| *query_names* | list | A list of names for the resulting factor columns generated by the queries. |
| *remove_types* | list | Structural HED tags to be removed (usually `Condition-variable` and `Task`). |
| *expand_context* | bool | (Optional) Expand the context and remove `Onse` and`Offset` tags before the query. |
| *query_names* | list | (**Optional**) A list of names for the factor columns generated by the queries. |
| *remove_types* | list | (**Optional**) Structural HED tags to be removed (usually `Condition-variable` and `Task`). |
| *expand_context* | bool | (**Optional**: default True) Expand the context and remove <br/>`Onset` and`Offset` tags before the query. |
```
The *query_names* list, which must be empty or the same length as *queries*,
contains the names of the factor columns produced by the search.
If the *query_names* list is empty, the result columns are titled "query_1",
"query_2", etc.

The *remove_types* and *expand_context* are not yet implemented, and hence ignored in the current release.
Most of the time the *remove_types* should be set to `["Condition-variable", "Task"]` and the effects of
the experimental design captured using the `factor_hed_types_op`.
If *expand_context* is set to *false*, the additional context provided by `Onset`, `Offset`, and `Duration`
is ignored.
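
The *query_names* rule stated above (empty, or the same length as *queries*) can be sketched as follows. The helper name `resolve_query_names` is hypothetical, and numbering the default names from 1 is an assumption read from the "query_1", "query_2" wording above; the released code may number them differently.

```python
def resolve_query_names(queries, query_names=None):
    """Return the factor column names for a list of HED query strings.

    Hypothetical helper for illustration; not part of the HED tools API.
    """
    query_names = query_names or []
    # A non-empty name list must match the queries one-to-one.
    if query_names and len(query_names) != len(queries):
        raise ValueError("query_names must be empty or the same length as queries")
    # Assumption: default names are numbered from 1, as the text suggests.
    return list(query_names) or [f"query_{i}" for i in range(1, len(queries) + 1)]
```

With the example queries from this section, `resolve_query_names(["correct-action", "incorrect-action"], ["correct", "incorrect"])` returns the supplied names unchanged.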

(factor-hed-tags-example-anchor)=
#### Factor HED tags example
@@ -936,7 +938,7 @@ The resulting factor columns are named *correct* and *incorrect*, respectively.
"parameters": {
"queries": ["correct-action", "incorrect-action"],
"query_names": ["correct", "incorrect"],
"remove_types": [],
"remove_types": ["Condition-variable", "Task"],
"expand_context": false
}
}]
@@ -986,8 +988,10 @@ For additional information on how to encode experimental designs using HED, see
| Parameter | Type | Description |
| ------------ | ---- | ----------- |
| *type_tag* | str | HED tag used to find the factors (most commonly *Condition-variable*).|
| *type_values* | list | Values to factor for the *type_tag*.<br>If empty, all values of that *type_tag* are used. |
| *type_values* | list | (**Optional**) Values to factor for the *type_tag*.<br>If omitted, all values of that *type_tag* are used. |
```
The event context (as defined by onsets, offsets and durations) is always expanded and one-hot (0's and 1's)
encoding is used for the factors.

(factor-hed-type-example-anchor)=
#### Factor HED type example
@@ -1006,8 +1010,7 @@ applies and 0's otherwise.
"operation": "factor_hed_type",
"description": "Factor based on the sex of the images being presented.",
"parameters": {
"type_tag": "Condition-variable",
"type_values": []
"type_tag": "Condition-variable"
}
}]
```
@@ -1047,9 +1050,9 @@ duration updated to encompass the temporal extent of the merged events.
| ------------ | ---- | ----------- |
| *column_name* | str | The name of the column which is the basis of the merge.|
| *event_code* | str, int, float | The value in *column_name* that triggers the merge. |
| *match_columns* | list | Columns whose values must match to collapse events. |
| *set_durations* | bool | If true, set durations based on merged events. |
| *ignore_missing* | bool | If true, missing *column_name* or *match_columns* do not raise an error. |
| *ignore_missing* | bool | If true, missing *column_name* or *match_columns* do not raise an error. |
| *match_columns* | list | (**Optional**) Columns whose values must match to collapse events. |
```

The first of the group of rows (each representing an event) to be merged is called the anchor
@@ -1088,9 +1091,9 @@ have the same values to be merged into a single event.
"parameters": {
"column_name": "trial_type",
"event_code": "succesful_stop",
"match_columns": ["stop_signal_delay", "response_hand", "sex"],
"set_durations": true,
"ignore_missing": true
"ignore_missing": true,
"match_columns": ["stop_signal_delay", "response_hand", "sex"]
}
}]
```
@@ -1161,15 +1164,15 @@ Remapping can be used to convert the column containing these codes into one or m
| *destination_columns* | list | A list of *n* names of the destination columns for the map. |
| *map_list* | list | A list of mappings. Each element is a list of *m* source <br/>column values followed by *n* destination values.<br/> Mapping source values are treated as strings. |
| *ignore_missing* | bool | If false, source column values not in the map generate "n/a"<br/> destination values instead of errors. |
| *integer_sources* | list | [**Optional**] A list of source columns that are integers.<br/> The *integer_sources* must be a subset of *source_columns*. |
| *integer_sources* | list | (**Optional**) A list of source columns that are integers.<br/> The *integer_sources* must be a subset of *source_columns*. |
```
A column cannot be both a source and a destination,
and all source columns must be present in the data files.
New columns are created for destination columns that are missing from a data file.

The *remap_columns* operation only works for columns containing strings or integers,
as it is meant for remapping categorical codes.
You must specify the which source columns contain integers so that `n/a` values
You must specify which source columns contain integers so that `n/a` values
can be handled appropriately.

The *map_list* parameter specifies how each unique combination of values from the source
@@ -1490,6 +1493,7 @@ The results of executing the previous *reorder_columns* transformation on the

The *split_rows* operation
is often used to convert event files from trial-level encoding to event-level encoding.
This operation is meant only for tabular files that have `onset` and `duration` columns.

In **trial-level** encoding, all the events in a single trial
(usually some variation of the cue-stimulus-response-feedback-ready sequence)
@@ -1515,7 +1519,6 @@ In this case a trial consists of a sequence of multiple events.
```


The *split_rows* operation requires an *anchor_column*, which could be an existing
column or a new column to be appended to the data.
The purpose of the *anchor_column* is to hold the codes for the new events.
@@ -1651,7 +1654,7 @@ all summaries.
| ------------ | ---- | ----------- |
| *summary_name* | str | A unique name used to identify this summary.|
| *summary_filename* | str | A unique file basename to use for saving this summary. |
| *append_timecode* | bool | (Optional) If True, append a time code to filename.<br/>False is the default. |
| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |
```

(summarize-column-names-example-anchor)=
@@ -1730,11 +1733,11 @@ The following table lists the parameters required for using the summary.
| ------------ | ---- | ----------- |
| *summary_name* | str | A unique name used to identify this summary.|
| *summary_filename* | str | A unique file basename to use for saving this summary. |
| *skip_columns* | list | A list of column names to omit from the summary.|
| *value_columns* | list | A list of columns to omit the listing unique values. |
| *append_timecode* | bool | (Optional) If True, append a time code to filename.<br/>False is the default.|
| *max_categorical* | int | (Optional) If given, the text summary shows top *max_categorical* values.<br/>Otherwise the text summary displays all categorical values.|
| *values_per_line* | bool | (Optional) If given, the text summary displays this <br/>number of values per line (default is 5).|
| *append_timecode* | bool | (**Optional**: Default false) If True, append a time code to filename. |
| *max_categorical* | int | (**Optional**: Default 50) If given, the text summary shows top *max_categorical* values.<br/>Otherwise the text summary displays all categorical values.|
| *skip_columns* | list | (**Optional**) A list of column names to omit from the summary.|
| *value_columns* | list | (**Optional**) A list of columns to omit the listing unique values. |
| *values_per_line* | int | (**Optional**: Default 5) If given, the text summary displays this <br/>number of values per line (default is 5).|
```

@@ -1866,10 +1869,11 @@ The following table lists the parameters required for using the summary.
| ------------ | ---- | ----------- |
| *summary_name* | str | A unique name used to identify this summary.|
| *summary_filename* | str | A unique file basename to use for saving this summary. |
| *append_timecode* | bool | (Optional) If True, append a time code to filename.<br/>False is the default.|
| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |
```

The *summarize_definitions* is mainly meant for verifying consistency in unknown `Def-expand` tags. This comes up where you have an assembled dataset, but no longer have the definitions stored (or never created them to begin with).
The *summarize_definitions* is mainly meant for verifying consistency in unknown `Def-expand` tags.
This comes up where you have an assembled dataset, but no longer have the definitions stored (or never created them to begin with).


(summarize-definitions-example-anchor)=
@@ -2029,10 +2033,10 @@ The *summarize_hed_tags* operation has the two required parameters
| *summary_name* | str | A unique name used to identify this summary.|
| *summary_filename* | str | A unique file basename to use for saving this summary. |
| *tags* | dict | Dictionary with category title keys and tags in that category as values. |
| *append_timecode* | bool | (Optional) If True, append a time code to filename.<br/>False is the default.|
| *include_context* | bool | (Optional) If true, expand the event context to <br/>account for onsets and offsets. |
| *replace_defs* | bool | (Optional) If true, the `Def` tags are replaced with the<br/>contents of the definition (no `Def` or `Def-expand`). |
| *remove_types* | list | (Optional) A list of types (such as `Condition-variable` and `Task` to remove. |
| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |
| *include_context* | bool | (**Optional**: Default true) If true, expand the event context to <br/>account for onsets and offsets. |
| *replace_defs* | bool | (**Optional**: Default true) If true, the `Def` tags are replaced with the<br/>contents of the definition (no `Def` or `Def-expand`). |
| *remove_types* | list | (**Optional**) A list of types such as `Condition-variable` and `Task` to remove. |
```

The *tags* dictionary has keys that specify how the user wishes the tags
@@ -2159,7 +2163,7 @@ This summary provides useful information about experimental design.
| *summary_name* | str | A unique name used to identify this summary.|
| *summary_filename* | str | A unique file basename to use for saving this summary. |
| *type_tag* | str | Tag to produce a summary for (most often *condition-variable*).|
| *append_timecode* | bool | (Optional) If True, append a time code to filename.<br/>False is the default.|
| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename.|
```
In addition to the two standard parameters (*summary_name* and *summary_filename*),
the *type_tag* parameter is required.
@@ -2251,8 +2255,8 @@ If *check_for_warnings* is false, the summary will not report warnings.
| ------------ | ---- | ----------- |
| *summary_name* | str | A unique name used to identify this summary.|
| *summary_filename* | str | A unique file basename to use for saving this summary. |
| *append_timecode* | bool | (Optional) If True, append a time code to filename.<br/>False is the default.|
| *check_for_warnings* | bool | (Optional) If true, warnings are reported in addition to errors.<br/>False is the default.|
| *append_timecode* | bool | (**Optional**: Default false) If true, append a time code to filename. |
| *check_for_warnings* | bool | (**Optional**: Default false) If true, warnings are reported in addition to errors. |
```
The *summarize_hed_validation* is a HED operation and the calling program must provide a HED schema version
and usually a JSON sidecar containing the HED annotations.
@@ -2622,13 +2626,13 @@ since the names specified in the first parameter are meant to represent the quer
The check only takes place if `query_names` exists, since naming is handled automatically otherwise.

```python
@staticmethod
def validate_input_data(parameters):
errors = []
if parameters.get("query_names", False):
if len(parameters.get("query_names")) != len(parameters.get("queries")):
errors.append("The list in query_names, in the factor_hed_tags operation, should have the same number of items as queries.")
return errors
@staticmethod
def validate_input_data(parameters):
errors = []
if parameters.get("query_names", False):
if len(parameters.get("query_names")) != len(parameters.get("queries")):
errors.append("The list in query_names, in the factor_hed_tags operation, should have the same number of items as queries.")
return errors
```


5 changes: 5 additions & 0 deletions docs/source/HedMatlabTools.md
@@ -595,10 +595,15 @@ Python may be installed in your user space or in system space for all users.
- You may want to add the location of the Python executable to your PATH.
(Most installers give you that option as part of the installation.)

#### Installing in a virtual environment

https://www.mathworks.com/support/search.html/answers/1750425-python-virtual-environments-with-python-interface.html?fq%5B%5D=asset_type_name:answer&page=1
(step-3-connect-python-to-matlab-anchor)=
#### Step 3: Connect Python to Matlab


C:\Users\username\AppData\Local\Programs\Python\python -m venv C:\Users\username\py38

Setting the Python version uses the MATLAB `pyenv` function with the `'Version'` argument
as illustrated by the following example.
