Commit c166b93

Improve static types migration docs (#6544)
Signed-off-by: Ben Sherman <[email protected]>
1 parent 53c28b8 commit c166b93

9 files changed

+307
-51
lines changed

docs/conf.py

Lines changed: 4 additions & 6 deletions
Original file line number | Diff line number | Diff line change
@@ -384,9 +384,8 @@ class NextflowLexer(RegexLexer):
384384
(r'/\*.*?\*/', Comment.Multiline),
385385
# keywords: go before method names to avoid lexing "throw new XYZ"
386386
# as a method signature
387-
(r'(assert|catch|else|'
388-
r'if|instanceof|new|return|throw|try|in|as)\b',
389-
Keyword),
387+
(r'(assert|catch|else|if|instanceof|new|return|throw|try|in|as)\b', Keyword),
388+
(r'(channel|log)', Name.Namespace),
390389
# method names
391390
(r'^(\s*(?:[a-zA-Z_][\w.\[\]]*\s+)+?)' # return arguments
392391
r'('
@@ -397,9 +396,8 @@ class NextflowLexer(RegexLexer):
397396
r'(\s*)(\()', # signature start
398397
bygroups(using(this), Name.Function, Whitespace, Operator)),
399398
(r'@[a-zA-Z_][\w.]*', Name.Decorator),
400-
(r'(def|enum|include|from|output|process|workflow)\b', Keyword.Declaration),
401-
(r'(boolean|byte|char|double|float|int|long|short|void)\b',
402-
Keyword.Type),
399+
(r'(def|enum|include|from|output|params|process|workflow)\b', Keyword.Declaration),
400+
(r'(boolean|byte|char|double|float|int|long|short|void)\b', Keyword.Type),
403401
(r'(true|false|null)\b', Keyword.Constant),
404402
(r'""".*?"""', String.Double),
405403
(r"'''.*?'''", String.Single),
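The rule order in this hunk matters because a regex lexer of this kind tries its rules top to bottom at each position and takes the first match. The following is a minimal sketch of that first-match-wins behaviour using only Python's `re` module. It is not the Pygments implementation: the `RULES` table, the token labels written as plain strings, the catch-all `Name`, `Whitespace`, and `Other` rules, and the `tokenize` helper are all assumptions made for illustration.

```python
import re

# Hypothetical, simplified rule table: (compiled pattern, token label).
# The first two patterns are copied from the diff; the rest are
# illustrative catch-alls that are NOT part of the real lexer.
RULES = [
    (re.compile(r'(assert|catch|else|if|instanceof|new|return|throw|try|in|as)\b'), 'Keyword'),
    (re.compile(r'(channel|log)'), 'Name.Namespace'),
    (re.compile(r'[a-zA-Z_]\w*'), 'Name'),
    (re.compile(r'\s+'), 'Whitespace'),
    (re.compile(r'.'), 'Other'),
]


def tokenize(text):
    """Return (label, lexeme) pairs; the first rule matching at each position wins."""
    tokens = []
    pos = 0
    while pos < len(text):
        for pattern, label in RULES:
            match = pattern.match(text, pos)
            if match:
                tokens.append((label, match.group(0)))
                pos = match.end()
                break
    return tokens


if __name__ == '__main__':
    for sample in ('throw new XYZ', 'channel.of(1)', 'channels'):
        print(sample, '->', tokenize(sample))
```

In this sketch, `tokenize('throw new XYZ')` labels `throw` and `new` as keywords before any later rule sees them, and `channel.of(1)` starts with a `Name.Namespace` token. Because the namespace pattern is copied without a trailing `\b`, the sketch also splits an identifier such as `channels` into `channel` plus `s`; whether that matters in the real lexer depends on its other rules.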

docs/migrations/25-10.md

Lines changed: 8 additions & 2 deletions
@@ -138,8 +138,14 @@ def x_opt: String? = null
138138

139139
In the type system, queue channels are represented as `Channel`, while value channels are represented as `Value`. To make the terminology clearer and more concise, queue channels are now called *dataflow channels* (or simply *channels*), and value channels are now called *dataflow values*. See {ref}`dataflow-page` for more information.
140140

141-
:::{note}
142-
Nextflow supports Groovy-style type annotations using the `<type> <name>` syntax, but this approach is deprecated in {ref}`strict syntax <strict-syntax-page>`. While Groovy-style annotations remain valid for functions and local variables, the language server and `nextflow lint` automatically convert them to Nextflow-style annotations during code formatting.
141+
Groovy-style type annotations (e.g., `String x`) are still supported for functions and local variables. However, the language server and `nextflow lint` will automatically convert them to Nextflow-style annotations during code formatting.
142+
143+
:::{warning}
144+
Since Nextflow-style type annotations are new in Nextflow 25.10, formatting code with Groovy-style type annotations will make it incompatible with previous versions of Nextflow. If you need your code to remain compatible with versions prior to 25.10, run the formatter with Nextflow 25.04:
145+
146+
```bash
147+
NXF_VER=25.04.8 nextflow lint -format .
148+
```
143149
:::
144150

145151
See {ref}`migrating-static-types` for details.

docs/process.md

Lines changed: 3 additions & 1 deletion
@@ -678,6 +678,8 @@ In the above example, the `tuple` input consists of the value `x` and the file `
678678

679679
A `tuple` definition may contain any of the following qualifiers, as previously described: `val`, `env`, `path` and `stdin`. Files specified with the `path` qualifier are treated exactly the same as standalone `path` inputs.
680680

681+
(process-input-each)=
682+
681683
### Input repeaters (`each`)
682684

683685
The `each` qualifier allows you to repeat the execution of a process for each item in a collection, each time a new value is received. For example:
@@ -735,7 +737,7 @@ When multiple repeaters are defined, the process is executed for each *combinati
735737
:::
736738

737739
:::{note}
738-
Input repeaters do not support tuples. Use the {ref}`operator-combine` or {ref}`operator-cross` operator to combine the repeated input with the other inputs to produce all of the desired input combinations.
740+
Input repeaters do not support tuples. Use the {ref}`operator-combine` operator to combine the repeated input with the other inputs to produce all of the desired input combinations.
739741
:::
740742

741743
(process-multiple-inputs)=

docs/reference/cli.md

Lines changed: 8 additions & 0 deletions
@@ -942,6 +942,14 @@ Lint and format all files in the current directory (and subdirectories) and use
942942
$ nextflow lint -format -spaces 2 .
943943
```
944944

945+
:::{note}
946+
Formatting code with the `lint` command in Nextflow 25.10 or later may make your code incompatible with previous versions of Nextflow. If you need your code to remain compatible with versions prior to 25.10, run the formatter with Nextflow 25.04:
947+
948+
```bash
949+
NXF_VER=25.04.8 nextflow lint -format .
950+
```
951+
:::
952+
945953
(cli-list)=
946954

947955
### `list`

docs/reference/operator.md

Lines changed: 57 additions & 29 deletions
@@ -6,9 +6,6 @@
66

77
## branch
88

9-
:::{versionadded} 19.08.0-edge
10-
:::
11-
129
*Returns: multiple channels*
1310

1411
The `branch` operator forwards each item from a source channel to one of multiple output channels, based on a selection criteria.
@@ -63,6 +60,10 @@ The `branchCriteria()` method can be used to create a branch criteria as a varia
6360

6461
## buffer
6562

63+
:::{warning}
64+
This operator depends on the ordering of values in the source channel. It can lead to {ref}`non-deterministic behavior <cache-nondeterministic-inputs>` if used improperly.
65+
:::
66+
6667
*Returns: channel*
6768

6869
The `buffer` operator collects items from a source channel into subsets and emits each subset separately.
@@ -133,6 +134,10 @@ See also: [collate](#collate)
133134

134135
## collate
135136

137+
:::{warning}
138+
This operator depends on the ordering of values in the source channel. It can lead to {ref}`non-deterministic behavior <cache-nondeterministic-inputs>` if used improperly.
139+
:::
140+
136141
*Returns: channel*
137142

138143
The `collate` operator collects items from a source channel into groups of *N* items.
@@ -360,7 +365,9 @@ For example:
360365
:language: console
361366
```
362367

363-
See also: [mix](#mix)
368+
:::{tip}
369+
As a best practice, use [`mix`](#mix) instead of `concat`. The `mix` operator does not wait for each source channel to emit all values before processing the next one.
370+
:::
364371

365372
(operator-count)=
366373

@@ -473,6 +480,10 @@ See also: [combine](#combine)
473480

474481
## distinct
475482

483+
:::{warning}
484+
This operator depends on the ordering of values in the source channel. It can lead to {ref}`non-deterministic behavior <cache-nondeterministic-inputs>` if used improperly.
485+
:::
486+
476487
*Returns: channel*
477488

478489
The `distinct` operator forwards a source channel with *consecutively* repeated items removed, such that each emitted item is different from the preceding one:
@@ -565,6 +576,10 @@ The following example filters a channel using a boolean predicate, which is a {r
565576

566577
## first
567578

579+
:::{warning}
580+
This operator depends on the ordering of values in the source channel. It can lead to {ref}`non-deterministic behavior <cache-nondeterministic-inputs>` if used improperly.
581+
:::
582+
568583
*Returns: dataflow value*
569584

570585
The `first` operator emits the first item in a source channel, or the first item that matches a condition. The condition can be a {ref}`regular expression<script-regexp>`, a type qualifier (i.e. Java class), or a boolean predicate. For example:
@@ -619,7 +634,9 @@ The `flatten` operator flattens each item from a source channel that is a list o
619634

620635
As shown in the above example, deeply nested collections are also flattened.
621636

622-
See also: [flatMap](#flatmap)
637+
:::{tip}
638+
As a best practice, use [`flatMap`](#flatmap) instead of `flatten`. The `flatMap` operator only flattens one level and has a well-defined return type.
639+
:::
623640

624641
(operator-grouptuple)=
625642

@@ -629,7 +646,7 @@ See also: [flatMap](#flatmap)
629646

630647
The `groupTuple` operator collects lists (i.e. *tuples*) from a source channel into groups based on a grouping key. A new tuple is emitted for each distinct key.
631648

632-
To be more precise, the operator transforms a sequence of tuples like *(K, V, W, ..)* into a sequence of tuples like *(K, list(V), list(W), ..)*.
649+
To be more precise, the operator transforms a sequence of tuples like *(K, V1, V2, ..)* into a sequence of tuples like *(K, list(V1), list(V2), ..)*.
633650

634651
For example:
635652

@@ -673,13 +690,10 @@ Available options:
673690
: The required number of items for each group. When a group reaches the required size, it is emitted.
674691

675692
`sort`
676-
: Defines the sorting criteria for the grouped items. Can be one of the following values:
677-
678-
- `false`: No sorting is applied (default).
679-
- `true`: Order the grouped items by the item's natural ordering i.e. numerical for number, lexicographic for string, etc. See the [Java documentation](http://docs.oracle.com/javase/tutorial/collections/interfaces/order.html) for more information.
680-
- `'hash'`: Order the grouped items by the hash number associated to each entry.
681-
- `'deep'`: Similar to the previous, but the hash number is created on actual entries content e.g. when the item is a file, the hash is created on the actual file content.
682-
- A custom sorting criteria used to order the nested list elements of each tuple. It can be a {ref}`Closure <script-closure>` or a [Comparator](http://docs.oracle.com/javase/7/docs/api/java/util/Comparator.html) object.
693+
: Defines the sorting criteria for the grouped items.
694+
: :::{warning}
695+
The `sort` option is discouraged because it can lead to inconsistent sorting when there are multiple groups. Perform sorting separately (e.g., in a subsequent `map` operation) to ensure correct results.
696+
:::
683697

684698
(operator-ifempty)=
685699

@@ -717,7 +731,7 @@ See also: {ref}`channel-empty` channel factory
717731

718732
The `join` operator emits the inner product of two source channels using a matching key.
719733

720-
To be more precise, the operator transforms a sequence of tuples like *(K, V1, V2, ..)* and *(K, W1, W1, ..)* into a sequence of tuples like *(K, V1, V2, .., W1, W2, ..)*.
734+
To be more precise, the operator transforms a sequence of tuples like *(K, V1, V2, ..)* and *(K, W1, W2, ..)* into a sequence of tuples like *(K, V1, V2, .., W1, W2, ..)*.
721735

722736
For example:
723737

@@ -765,6 +779,10 @@ See also: [combine](#combine), [cross](#cross)
765779

766780
## last
767781

782+
:::{warning}
783+
This operator depends on the ordering of values in the source channel. It can lead to {ref}`non-deterministic behavior <cache-nondeterministic-inputs>` if used improperly.
784+
:::
785+
768786
*Returns: dataflow value*
769787

770788
The `last` operator emits the last item from a source channel:
@@ -837,6 +855,12 @@ The following examples show how to find the longest string in a channel:
837855

838856
## merge
839857

858+
:::{warning}
859+
This operator depends on the ordering of values in the source channel. It can lead to {ref}`non-deterministic behavior <cache-nondeterministic-inputs>` if used improperly.
860+
861+
Use [combine](#combine) or [join](#join) instead to combine multiple channels in a deterministic way, such as a matching key.
862+
:::
863+
840864
*Returns: channel*
841865

842866
The `merge` operator joins the items from two or more channels into a new channel:
@@ -865,12 +889,6 @@ The `merge` operator may return a channel or value depending on the inputs:
865889

866890
- If the first argument is a dataflow value, the `merge` operator returns a dataflow value, merging the first value from each input, regardless of whether there are channel inputs with additional values.
867891

868-
:::{danger}
869-
In general, the use of the `merge` operator is discouraged. Processes and channel operators are not guaranteed to emit items in the order that they were received, as they are executed concurrently. Therefore, if you try to merge output channels from different processes, the resulting channel may be different on each run, which will cause resumed runs to {ref}`not work properly <cache-nondeterministic-inputs>`.
870-
871-
You should always use a matching key (e.g. sample ID) to merge multiple channels, so that they are combined in a deterministic way. For this purpose, you can use the [join](#join) operator.
872-
:::
873-
874892
(operator-min)=
875893

876894
## min
@@ -940,9 +958,6 @@ See also: [concat](#concat)
940958

941959
## multiMap
942960

943-
:::{versionadded} 19.11.0-edge
944-
:::
945-
946961
*Returns: multiple channels*
947962

948963
The `multiMap` operator applies a set of mapping functions to a source channel, producing a separate output channel for each mapping function.
@@ -985,6 +1000,10 @@ If you use `multiMap` to split a tuple or map into multiple channels, it is reco
9851000

9861001
## randomSample
9871002

1003+
:::{warning}
1004+
This operator depends on the ordering of values in the source channel. It can lead to {ref}`non-deterministic behavior <cache-nondeterministic-inputs>` if used improperly.
1005+
:::
1006+
9881007
*Returns: channel*
9891008

9901009
The `randomSample` operator emits a randomly-selected subset of items from a source channel:
@@ -1049,6 +1068,10 @@ Using `set` is semantically equivalent to assigning a variable:
10491068
my_channel = channel.of(10, 20, 30)
10501069
```
10511070

1071+
:::{tip}
1072+
As a best practice, use a standard assignment (`=`) instead of `set`. Standard assignments enable more effective {ref}`type checking <preparing-static-types>`.
1073+
:::
1074+
10521075
See also: [tap](#tap)
10531076

10541077
(operator-splitcsv)=
@@ -1102,9 +1125,6 @@ Available options:
11021125
`decompress`
11031126
: When `true`, decompress the content using the GZIP format before processing it (default: `false`). Files with the `.gz` extension are decompressed automatically.
11041127

1105-
`elem`
1106-
: The index of the element to split when the source items are lists or tuples (default: first file object or first element).
1107-
11081128
`header`
11091129
: When `true`, the first line is used as the columns names (default: `false`). Can also be a list of columns names.
11101130

@@ -1459,6 +1479,10 @@ An optional {ref}`closure <script-closure>` can be used to transform each item b
14591479

14601480
## take
14611481

1482+
:::{warning}
1483+
This operator depends on the ordering of values in the source channel. It can lead to {ref}`non-deterministic behavior <cache-nondeterministic-inputs>` if used improperly.
1484+
:::
1485+
14621486
*Returns: channel*
14631487

14641488
The `take` operator takes the first *N* items from a source channel:
@@ -1491,7 +1515,9 @@ The `tap` operator assigns a source channel to a variable, and emits the source
14911515
:language: console
14921516
```
14931517

1494-
See also: [set](#set)
1518+
:::{tip}
1519+
As a best practice, use a standard assignment (`=`) instead of `tap`. Standard assignments enable more effective {ref}`type checking <preparing-static-types>`.
1520+
:::
14951521

14961522
## toInteger
14971523

@@ -1588,7 +1614,7 @@ See also: [collect](#collect)
15881614

15891615
The `transpose` operator "transposes" each tuple from a source channel by flattening any nested list in each tuple, emitting each nested item separately.
15901616

1591-
To be more precise, the operator transforms a sequence of tuples like *(K, list(V), list(W), ..)* into a sequence of tuples like *(K, V, W, ..)*.
1617+
To be more precise, the operator transforms a sequence of tuples like *(K, list(V1), list(V2), ..)* into a sequence of tuples like *(K, V1, V2, ..)*.
15921618

15931619
For example:
15941620

@@ -1628,7 +1654,9 @@ Available options:
16281654
`remainder`
16291655
: When `true`, incomplete tuples are emitted with `null` values for missing elements, otherwise they are discarded (default: `false`).
16301656

1631-
See also: [groupTuple](#grouptuple)
1657+
:::{tip}
1658+
As a best practice, use [`flatMap`](#flatmap) instead of `transpose`, since `flatMap` has a well-defined return type.
1659+
:::
16321660

16331661
(operator-unique)=
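
The tuple shapes described for `groupTuple` and `transpose` in the hunks above can be illustrated outside Nextflow. The sketch below is plain Python over lists of tuples, not the Nextflow implementation: `group_tuple` and `transpose` are hypothetical helper names, the grouping key is assumed to be the first element, and every grouped element is assumed to be a list of equal length.

```python
def group_tuple(tuples):
    """Turn (K, V1, V2, ..) tuples into (K, list(V1), list(V2), ..) tuples.

    Groups are emitted in the order in which each key is first seen.
    """
    groups = {}
    for key, *values in tuples:
        # One accumulator list per value position, created on first sight of the key.
        columns = groups.setdefault(key, [[] for _ in values])
        for column, value in zip(columns, values):
            column.append(value)
    return [(key, *columns) for key, columns in groups.items()]


def transpose(tuples):
    """Turn (K, list(V1), list(V2), ..) tuples back into (K, V1, V2, ..) tuples."""
    result = []
    for key, *columns in tuples:
        # Walk the nested lists in lockstep, emitting one flat tuple per position.
        for values in zip(*columns):
            result.append((key, *values))
    return result


if __name__ == '__main__':
    source = [(1, 'a', 'x'), (2, 'b', 'y'), (1, 'c', 'z')]
    grouped = group_tuple(source)
    print(grouped)
    print(transpose(grouped))
```

Under these assumptions, the two helpers are inverses up to ordering: transposing the grouped result yields the original tuples, arranged by key rather than by arrival order.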
16341662
