doctest allow int/line-number prefix #9430

S0AndS0 · 2025-02-12T23:18:44Z

⚠️ nested case does not spark joy and RegExp might be a better choice for this feature request!

This patch aims to formally begin the discussion here about feature parity between Erlang OTP doctest, and the library of the same name maintained by williamthome

Specifically this patch targets > vs 1> syntax for declaring the start of a testable code line or set of lines, ex;

Erlang OTP doctest

-doc """
This test is how we do with `erlang/otp` doctests

```
> true.
true
> totallyOkay.
totallyOkay
```
"""
woot() -> ok.

williamthome doctest

-doc """
This test is how we do with `williamthome/doctest` (prior to version `0.10.0`)

```erlang
1> false.
false
2> thisIsFine.
thisIsFine
```
"""
muchJoy() -> ok.

CLAassistant · 2025-02-12T23:18:50Z

All committers have signed the CLA.

github-actions · 2025-02-12T23:19:35Z

CT Test Results

2 files 97 suites 1h 8m 53s ⏱️
2 195 tests 2 146 ✅ 47 💤 2 ❌
2 562 runs 2 511 ✅ 49 💤 2 ❌

For more details on these failures, see this check.

Results for commit 9ed5b46.

♻️ This comment has been updated with latest results.

To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass.

See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally.

Artifacts

// Erlang/OTP Github Action Bot

S0AndS0 · 2025-02-12T23:49:32Z

Here are some results of manual testing, because I'm not yet smort enough to run only doctests from the CLI;

Mixing of > and INT> seem fine, so I've not broken anything... yet

> parse_tests([<<"> SomeCode.">>, <<"419> OtherCode.">>], []).
[{test,[<<"SomeCode.">>,10,<<"OtherCode.">>],"_"}]

Multi-line code seems to work fine too

> parse_tests([<<"68> ReallyLongLine,">>, <<" ConinuedOnAnotherLine.">>], []).
[{test,[<<"ReallyLongLine,">>,10,
        <<"ConinuedOnAnotherLine.">>],
       "_"}]
> parse_tests([<<"> ReallyLongLine,">>, <<" ConinuedOnAnotherLine.">>], []).
[{test,[<<"ReallyLongLine,">>,10,
        <<"ConinuedOnAnotherLine.">>],
       "_"}]

Lines that contain > but not prefixed by an integer are ignored as usual

> parse_tests([<<"Something > OtherThang.">>], []).
[{test,[],[<<"Something > OtherThang.">>]}]

🐛 Lines prefixed by any number of blank space are considered code
```
> parse_tests([<<" wat.">>], []).
[{test,[<<"wat.">>],"_"}]
```
Note; double-checking against pre-PR code this is true there too 🤷

I still be learning the ways of Erlang, and least-bad practices, so any tips towards reducing future costs to maintenance are very much appreciated!

CC: @garazdawi I figured it'd be wise to nudge things specific to OTP parity from williamthome/doctest -- Issue 47 to here ;-)

garazdawi · 2025-02-13T07:34:15Z

One of the things I like a lot about doctests is that it adds uniformity to the examples. So I think that I would like it to be very oppionionated about how you write examples. So I would like for the examples to either use > or 1>, not a mix. The 1> adds one extra unnecessary number in front, but it makes the examples closer to what you normally see in an Erlang shell.

Which one do you prefer and why?

garazdawi · 2025-02-13T07:36:00Z

Here are some results of manual testing, because I'm not yet smort enough to run only doctests from the CLI;

1> shell_docs:test(ModuleToTest, []).

S0AndS0 · 2025-02-13T16:38:00Z

Which one do you prefer and why?

For compatibility with preexisting doctest tooling and tests I prefer number prefixed code lines, also it should hopefully mean fewer examples need edited. Plus having an incrementing number makes it obvious to readers, and copy/pasters, of examples that there likely is an intended order. Ex;

1> State = "World".
2> Greeter = fun(State) -> "Hello " ++ State end.
3> Result = Greeter(State).
"Hello World"

At the same time I prefer >, without a line number prefix, because it could be used in doc-comments/tests where asserts are independent of one another, ex;

> true.
true
> false.
false

True having both is a more overhead to maintain and parse, and I'm but one opinionated noob ;-)

williamthome · 2025-02-13T20:09:40Z

I think just the > is fine and less tedious. Elixir uses the iex> prefix and optionally allows the line numbers. See examples.

Maybe it would be nice to use erl> as the prefix to distinguish between Erlang and Elixir examples when they are on the same documentation, but I'm for the simplicity of the > prefix.

@garazdawi Sorry for the off-topic question below:

Is it possible for the shell to capture pasted content (Ctrl + v)? If so, it would be great if the shell could recognize the first character, and if it is a >, paste all the lines starting with the > in sequence. For instance:

```erlang
> A = 1.
1
> B = 1.
> A + B.
2
```

The shell would then print:

A = 1.
Enter
B = 1.
Enter
A + B.
Enter

It would be nice to have this feature to experiment with copied examples from the documentation.

michalmuskala · 2025-02-14T00:28:28Z

For the second point, the way it works with ex_doc, is that when you copy the example, the iex> parts are not actually copied. I'm sure something similar could work for Erlang examples

S0AndS0 · 2025-02-14T00:33:49Z

Ooof, so seems to be lots of variants that could be supported, but could don't mean should... 🤔 still I'm game to be nerd-sniped for a bit ;-)

TLDR RE = "^(\\w*(?:\$\\d+\$)?)(>)" and re:split seem to cover all conditions! We either get a list of length 4 with last element being the code, or list of length 1 which should be a different branch's responsibility.

Here's a synopsis of what I could find of variants;

Elixer ExUnit v1.12.3
- iex> starter with ...> code continuation
- iex(1)> starter with ...(1)> code continuation
Erlang williamthome/doctest:v0.9.3
- 1> starter with .. code continuation
Erlang williamthome/doctest:v0.10.2
- 1> starter with .. code continuation
- > starter with .. code continuation
Erlang OTP 29138f5dd1:lib/stdlib/src/shell_docs_test.erl
- > starter with code continuation

... Double-checking dates it seems williamthome/doctest is about nine-months old, so perhaps my thinking about backwards compatibility and smooth transition should be adjusted, because the code continuation differences would need tackled too.

Simplicity is nice, and usually I'm all for it, but I like the idea of broader support for documentation tests too 🤷

williamthome · 2025-02-14T11:30:32Z

Just checked and the iex> prefix in Elixir is because of the Elixir shell (v1.17):

iex(1)> a = ~s"
...(1)> foo
...(1)> "
"\nfoo\n"
iex(2)> b =
...(2)> 1 +
...(2)> 1
2
iex(3)> c = :foo
:foo

The line numbers can be omitted in Elixir doctest:

iex> a = ~s"
...> foo
...> "
"\nfoo\n"
iex> b =
...> 1 +
...> 1
2
iex> c = :foo
:foo

In Erlang shell (OTP-28) it is:

1> A = ~"
   foo
   ".
<<"\nfoo\n">>
2> B =
   1 +
   1.
2
3> C = foo.
foo

Omitting line numbers in Erlang doctest:

> A = ~"
  foo
  ".
<<"\nfoo\n">>
> B =
  1 +
  1.
2
> C = foo.
foo

Erlang should not always adhere to what's implemented in Elixir, but maybe the optional line count support is a good idea. I do not have a strong opinion of the "why", only because of the shell ;)

S0AndS0 · 2025-02-15T23:07:05Z

I do not have a strong opinion of the "why", only because of the shell ;)

Having noodled it about in my brain, and skimmed bits of doc-comments within various Erlang OTP sources, I believe the strongest argument I can think of for supporting line-numbers in doctests is reducing Git Diff churn.

Here be but two examples I could find that should just work were we to support both ways;

erlang/otp/lib/stdlib/src/re.erl

1> re:run("aaaaaaaaaaaaaz","(a+)*z").
{match,[{0,14},{0,13}]}
2> re:run("aaaaaaaaaaaaaz","(a+)*z",[{match_limit_recursion,5}]).
nomatch
3> re:run("aaaaaaaaaaaaaz","(a+)*z",[{match_limit_recursion,5},report_errors]).
{error,match_limit_recursion}

erlang/otp/lib/stdlib/src/binary.erl

> binary:bin_to_list(<<"erlang">>, {1,3}).
"rla"
%% or [114,108,97] in list notation.

And though glancing elsewhere I found a few instances where either would require some touch-work, ex;

erlang/otp/lib/stdlib/src/argparse.erl

1> argparse:parse([], #{arguments => [#{name => myarg, short => $m}]}).

{ok,#{}, ...
2> argparse:parse([], #{arguments => [#{name => myarg, short => $m, default => "def"}]}).

{ok,#{myarg => "def"}, ...

... I've got a hunch it'll be less touch-work than picking but one style.

garazdawi · 2025-02-21T13:54:37Z

I do not have a strong opinion of the "why", only because of the shell ;)

Lets do what the shell does, so that we get a similar look and feel as the shell. That is the regexp to match a test start is [0-9]+> . It will cause a bit of churn to change all the modules already using the new doctests, but better now than later. Will you do that change @S0AndS0 ?

For the second point, the way it works with ex_doc, is that when you copy the example, the iex> parts are not actually copied. I'm sure something similar could work for Erlang examples.

@michalmuskala it already works like this :)

S0AndS0 · 2025-02-21T15:51:12Z

@garazdawi
Lets do what the shell does... That is the regexp to match a test start is [0-9]+> ... Will you do that change @S0AndS0?

Sure thing! I'll try to get something prettier (more maintainable than my current questionable code suggestions) sorted and submitted for public comment soon™

S0AndS0 · 2025-02-22T17:23:44Z

TLDR update I'm blocked by something upstream of shell_docs_test:run_test/2 filtering-out doctest blocks that include line numbers, and though RegExp mostly works as intended something within run_tests(Test, Bindings) or downstream of it does not like the format of {test, ...} tuples edits have returned.

A hint, or two, would be much appreciated!

Unabridged update

I think I won the battle with RegExp, and by doing so also flattened that pesky case statement out of existence!

Like the most recent commit message says, local/isolated testing of shell_docs_test:parse_tests/2 shows changes to be behaving themselves. However, I'm still a bit touched by "skill issues" for getting inputs injected into the changes correctly. Near as I can tell, after adding some io:format/2 sanity checks, the shell_docs_test:test/2 and/or shell_docs_test:run_test/2 functions are now needing attention 🤔

If we get doctest input similar to;

-module(wat).

-export([woot/0]).

-doc """
```erlang
> wat:woot().
false
```
""".
woot() -> woot.

... Then I can observe the run_test(Code, InitialBindings) function gets the doctest lines, but requested RegExp changes means > wat:woot() is filtered out. And adding a line number shows shell_docs_test:run_test/2 never gets any doctest lines, ex;

 -doc """
 ```erlang
-> wat:woot().
+1> wat:woot().
 false
 ```
 """.
 woot() -> woot.

Altering the RegExp to allow for optional line-numbers, (?'line_number'[0-9]+) → (?'line_number'(?:[0-9]+)?), and not using line-numbers in doctests, ex...

--define(RE_CAPTURE, "(?'indent'^\s*)(((?'line_number'[0-9]+)(?'prefix'>\s|\s))|(?'prefix'%))(?'content'[^%]*)(?'comment'(?:.*%.*)?)").
+-define(RE_CAPTURE, "(?'indent'^\s*)(((?'line_number'(?:[0-9]+)?)(?'prefix'>\s|\s))|(?'prefix'%))(?'content'[^%]*)(?'comment'(?:.*%.*)?)").
 -define(RE_OPTIONS, [ {capture, [indent, line_number, prefix, content, comment] ,binary}, dupnames ]).
 parse_tests([], []) ->

... does show things are mostly parsed as they use-to. I.E. [{test, ...}] filters back up to shell_docs_test:run_test/2 but then I'm not observing expected failure of woot =:= false comparison.

Edit 🤦 test({pre,[],[{code,Attrs,[<<">",_/binary>> = Code]}]}, Bindings) is what be pre-filtering

garazdawi · 2025-02-25T09:19:10Z

Let me know when you are ready for a review!

The examples in lists, binary, maps, zstd and shell_docs_test also need to be updated to follow the new style.

S0AndS0 · 2025-02-26T04:25:33Z

Thanks for being patient with me!

I think I've got most of the {match, ...} things sorted, and will be happy to update doctests that were previously functional too.

For single line test/match blocks it seems to be mostly doing what it should... mostly... but for reasons I've yet to figure-out it ain't at all happy with mutli-line stuff and I either may need a bit of help or more time to resolve my skill issues x-)

S0AndS0 · 2025-03-01T18:44:49Z

I think (hope) changes are ready for review!

Turns out I was over complicating things with the RegExp and stuff... no big surprise there (-x

Updated doctest comments to include line numbers and adjusted indentation too for

lib/stdlib/src/binary.erl
lib/stdlib/src/lists.erl
lib/stdlib/src/maps.erl
[¿?] lib/stdlib/src/shell_docs_test.erl please check 49765b5d1e commit message
lib/stdlib/src/zstd.erl

Please do let me know if I've broken anything, that wasn't already misbehaving, and I'll do my best to make corrections!

garazdawi · 2025-03-01T18:55:06Z

We did some changes in the lists module, would you mind rebasing on latest master?

I’ll have a closer look on Monday.

⚠️ nested `case` does not spark joy and RegExp might be a better choice for this feature request! This patch aims to formally begin the discussion here about feature parity between Erlang OTP doctest, and the library of the same name maintained by [williamthome](https://github.com/williamthome/doctest/) Specifically this patch targets `>` vs `1>` syntax for declaring the start of a testable code line or set of lines, ex; **Erlang OTP doctest** ````erlang -doc """ This test is how we do with `erlang/otp` doctests ``` > true. true > totallyOkay. totallyOkay ``` """ woot() -> ok. ```` **williamthome doctest** ````erlang -doc """ This test is how we do with `williamthome/doctest` ```erlang 1> false. false 2> thisIsFine. thisIsFine ``` """ muchJoy() -> ok. ````

⚠️ Local testing of changes, in isolation, shows good behavior! But additional edits, and testing, will be necessary for me to have confidence in changes. Change summary between `c23347df2c` and here: - define RegExp capture and options - remove `parse_match/2` completely - add `parse_tests/2` RegExp pattern matching within function prams - remove `parse_tests/2` nested `case` in favor of recursion

🤦 Now I get what `Match` means in `parse_tests/2` and why for the calls to `parse_match/2` too! This _should_ now collect code lines, like it once did, and expected match line(s)... Though it seems the last line and/or empty lines now need ignored or skipped 🤔

Attempting to match in-line comments was a bad idea from my past self, as was trying to dedent 🤦 ⚠️ Still need to sort-out multi-line `Cmd` and `Match`

Still gotta sort-out that issue of multi-line `Code`/`Match` 🤔

⚠️ If we need to support `.. ` command continuation in the future that'll require touching the `RE_CAPTURE` pattern and adding a few more variants for `parse_tests/2` too

Spotting the following error/stack-trace popping for; ````erlang -doc """ ``` 1> Prebound. hello ``` """. ```` ... Here's some `io:format` output and resulting error; ``` run_test(Code, InitialBindings) | Code -> <<"1> Prebound.\nhello\n">> | InitialBindings -> [] | ReLines -> [ {match, [<<"1> ">>, <<"> ">>, <<"Prebound.">>]}, {match, [<<>>, <<>>, <<"hello">>]}, {match, [<<>>, <<>>, <<>>]} ] | Tests -> [{test, [<<"Prebound.">>], [<<"hello">>]}] [{match,1,{atom,1,hello},{block,1,[{var,1,'Prebound'}]}}] ** exception error: {unbound_var,'Prebound'} in function erl_eval:exprs/2 (erl_eval.erl, line 217) in call from shell_docs_test:run_tests/2 (shell_docs_test.erl, line 202) in call from lists:foldl/3 (lists.erl, line 2146) in call from shell_docs_test:test/2 (shell_docs_test.erl, line 153) in call from shell_docs_test:test/2 (shell_docs_test.erl, line 153) in call from shell_docs_test:parse_and_run/3 (shell_docs_test.erl, line 137) in call from shell_docs_test:'-parse_and_run/3-lc$^0/1-0-'/3 (shell_docs_test.erl, line 133) in call from shell_docs_test:'-module/2-lc$^0/1-0-'/3 (shell_docs_test.erl, line 122) ```

Tip, use the following command to filter out this commit's churn; ```bash git diff --ignore-space-change ```

Usually doing something like; ```bash git pull --rebase -X theirs origin master ``` ... is fine, but this time Git lost the plot in a few places. Fortunately it was only one file, so doing a selective show via; ```bash git show origin/master:lib/stdlib/src/lists.erl > lib/stdlib/src/lists.erl ``` ... And then manually going through to re-add doctest line numbers, and adjust indentation, wasn't too much of a chore.

For _reasons_ Git felt it important to (re-)introduce trailing spaces

S0AndS0 · 2025-03-01T22:23:48Z

I don't mind at all, and ya can consider the rebasing done :-)

Looking forward to Monday's review I will, and 'ntil then I'll be wishing y'all a wonderful weekend!

IngelaAndin added the team:VM Assigned to OTP team VM label Feb 17, 2025

jhogberg assigned garazdawi Feb 17, 2025

S0AndS0 added 10 commits March 1, 2025 13:49

Let parse_tests/2 do the parsing

f76bcc1

Ignore empty match at end of test block

09b33ab

Simplify RegExp!

e3da3fb

Attempting to match in-line comments was a bad idea from my past self, as was trying to dedent 🤦 ⚠️ Still need to sort-out multi-line `Cmd` and `Match`

Mimic established behavior and naming more closely

1c8da6f

Still gotta sort-out that issue of multi-line `Code`/`Match` 🤔

🎉 WooHoo multi-line Code/Match supported!

e9758b2

⚠️ If we need to support `.. ` command continuation in the future that'll require touching the `RE_CAPTURE` pattern and adding a few more variants for `parse_tests/2` too

⚠️ add line numbers to doctest comments

9898d20

S0AndS0 added 3 commits March 1, 2025 13:49

💫 Adjust doctest comment indentation

5958c8d

Tip, use the following command to filter out this commit's churn; ```bash git diff --ignore-space-change ```

⚠️ Fix bad rebase 2 of 2

9ed5b46

For _reasons_ Git felt it important to (re-)introduce trailing spaces

S0AndS0 force-pushed the add-doctest-int-prefix branch from 2d96680 to 9ed5b46 Compare March 1, 2025 22:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

doctest allow int/line-number prefix #9430

doctest allow int/line-number prefix #9430

S0AndS0 commented Feb 12, 2025 •

edited

Loading

CLAassistant commented Feb 12, 2025 •

edited

Loading

github-actions bot commented Feb 12, 2025 •

edited

Loading

S0AndS0 commented Feb 12, 2025 •

edited

Loading

garazdawi commented Feb 13, 2025

garazdawi commented Feb 13, 2025

S0AndS0 commented Feb 13, 2025

williamthome commented Feb 13, 2025 •

edited

Loading

michalmuskala commented Feb 14, 2025

S0AndS0 commented Feb 14, 2025

williamthome commented Feb 14, 2025

S0AndS0 commented Feb 15, 2025

garazdawi commented Feb 21, 2025

S0AndS0 commented Feb 21, 2025 •

edited

Loading

S0AndS0 commented Feb 22, 2025 •

edited

Loading

garazdawi commented Feb 25, 2025

S0AndS0 commented Feb 26, 2025

S0AndS0 commented Mar 1, 2025

garazdawi commented Mar 1, 2025

S0AndS0 commented Mar 1, 2025

doctest allow int/line-number prefix #9430

Are you sure you want to change the base?

doctest allow int/line-number prefix #9430

Conversation

S0AndS0 commented Feb 12, 2025 • edited Loading

CLAassistant commented Feb 12, 2025 • edited Loading

github-actions bot commented Feb 12, 2025 • edited Loading

CT Test Results

Artifacts

S0AndS0 commented Feb 12, 2025 • edited Loading

garazdawi commented Feb 13, 2025

garazdawi commented Feb 13, 2025

S0AndS0 commented Feb 13, 2025

williamthome commented Feb 13, 2025 • edited Loading

michalmuskala commented Feb 14, 2025

S0AndS0 commented Feb 14, 2025

williamthome commented Feb 14, 2025

S0AndS0 commented Feb 15, 2025

garazdawi commented Feb 21, 2025

S0AndS0 commented Feb 21, 2025 • edited Loading

S0AndS0 commented Feb 22, 2025 • edited Loading

garazdawi commented Feb 25, 2025

S0AndS0 commented Feb 26, 2025

S0AndS0 commented Mar 1, 2025

garazdawi commented Mar 1, 2025

S0AndS0 commented Mar 1, 2025

S0AndS0 commented Feb 12, 2025 •

edited

Loading

CLAassistant commented Feb 12, 2025 •

edited

Loading

github-actions bot commented Feb 12, 2025 •

edited

Loading

S0AndS0 commented Feb 12, 2025 •

edited

Loading

williamthome commented Feb 13, 2025 •

edited

Loading

S0AndS0 commented Feb 21, 2025 •

edited

Loading

S0AndS0 commented Feb 22, 2025 •

edited

Loading