Add Line numbers #348

omri123 · 2023-11-14T17:53:24Z

In this pair of patches, I added line numbers both to files and to code in the prompt.
When creating SEARCH/REPLACE blocks, aider writes code with line numbers as well, and I changed the parser to handle it. When parsing, I ignore the numbers and use the existing SEARCH/REPLACE logic.
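
A minimal sketch of the "ignore the numbers when parsing" step, not the code from the patch. It assumes each numbered line looks like `<number>|<code>` (the pipe separator is mentioned later in this thread); the function name `strip_line_numbers` is hypothetical.

```python
import re

# Assumed prefix format: optional padding, digits, then a pipe.
# Everything after the pipe is the original code, indentation included.
_NUMBERED = re.compile(r"^\s*\d+\|(.*)$")


def strip_line_numbers(block: str) -> str:
    """Drop a leading '<number>|' prefix from each line, if present.

    Lines without a prefix are returned unchanged, so the result can be
    handed to an unmodified SEARCH/REPLACE matcher.
    """
    cleaned = []
    for line in block.splitlines():
        match = _NUMBERED.match(line)
        cleaned.append(match.group(1) if match else line)
    return "\n".join(cleaned)
```

One caveat of this sketch: a genuine code line that happens to start with digits and a pipe would also be stripped, so a real parser would need a less ambiguous prefix or extra checks.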

In benchmarks it gives a small positive performance difference, plus a small additional reduction in "Malformed response" cases.

This change allows me to tell aider things like this:

You can use the following commands to get information from files outside the chat:
\\GetDefinition of <symbol> used in <file>,<line>

I am writing a VS Code extension that will be able to perform this kind of command. I will make it public and open source soon.

paul-gauthier · 2023-11-14T18:20:45Z

Thanks for suggesting these changes. Have you run the full benchmark suite against this prompting? Can you share the detailed results?

I have extensively explored this concept in the past, and found line numbers were not helpful to overall code editing success. I seemed to find they actually worsened performance.

But perhaps you're seeing something different in the benchmarks?

omri123 · 2023-11-14T21:13:50Z

I ran the full suite only with gpt-4-turbo.
Results are detailed below.

With line numbers:

Malformed chats: 6
Pass 1: 52.59%
Pass 2: 62.96%

Community:

Malformed chats: 12
Pass 1: 50.04%
Pass 2: 61.5%

Results are not very reproducible: when I ran the community version a second time I got 48% and 58%, but I chose to use the ones consistent with your measurements.

I performed an ablation study. Some of my performance boost comes from this trick:
#346

I changed the example SEARCH/REPLACE block.

Without this change we can see a ~1% performance drop when introducing line numbers.
With this change, without line numbers - Pass 1: 50.04%, Pass 2: 64.4%
So the difference doesn't come from the line numbers alone.

Note about user experience:
When using line numbers, the user can't copy code from the chat and paste it without removing the numbers first. This is relevant for cases of malformed responses.

omri123 · 2023-11-14T21:19:26Z

I didn't keep the files from the ablation study benchmarking. When I used the benchmark for the first time I wasn't organized enough, and I tested uncommitted changes. I will re-run it to be sure.

I will also take a look at the previous implementation to get better context for what I am doing. Do you have old benchmarks of line numbers documented somewhere?

omri123 · 2023-11-14T21:51:29Z

The implementation is different:

The new implementation uses a pipe | rather than a space to separate numbers from code.
In the new implementation, code that appears in the system prompt is also numbered. The assistant is expected to produce numbered code as well.
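
For illustration, a sketch of how file content might be numbered with a pipe separator before it goes into the prompt. The helper name `add_line_numbers`, the right-aligned padding, and the absence of a space after the pipe are assumptions, not details taken from the patch.

```python
def add_line_numbers(text: str, start: int = 1) -> str:
    """Prefix every line with '<number>|', right-aligning the numbers.

    The exact layout (padding, no space after the pipe) is an assumed
    format chosen for this example.
    """
    lines = text.splitlines()
    if not lines:
        return ""
    # Width of the largest line number, so the pipes line up in a column.
    width = len(str(start + len(lines) - 1))
    return "\n".join(
        f"{number:>{width}}|{line}"
        for number, line in enumerate(lines, start)
    )
```

Keeping the code's own indentation untouched after the pipe means the prefix can be removed later without guessing how many spaces belonged to the code.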

gwpl · 2023-11-28T10:32:55Z

(It would be great if this could be turned on/off not only via an ENVIRONMENT variable (and eventually a flag), but also during the REPL, e.g. /config nonu or /config nu. Yes, you can see my bias toward Vim short codes...)

omri123 · 2023-11-28T21:29:41Z

An update about this feature: I want to write a new benchmark relevant to some problems I saw with this feature, rewrite some logic (searching for lines to replace), and separate it out as a third coder. I hope to find the time for it in the next few days.

gwpl · 2023-11-30T11:20:56Z

From my understanding of how LLMs work (and my experience from [inference on graph algorithms]), I can see how providing all line numbers could become unnecessary noise that lowers the signal-to-noise ratio for the stacked transformer layers, specifically by distracting certain attention heads...

Maybe encoding line numbers in another way for specific languages could be more efficient, e.g. adding them as comments at the ends of some important lines, such as the end of the first line of a new function ( // line 1243 ). Of course, this would require extra logic to track which of those comments were added and should be removed afterwards, but they would be easy to pattern match and keep in a list or table.
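
A rough sketch of the sparse-marker idea, assuming a C-style language and the `// line N` format from the example above. The names `annotate` and `strip_markers`, and the caller-supplied `is_anchor` predicate, are hypothetical.

```python
import re

# Marker format assumed from the "// line 1243" example.
_MARKER = re.compile(r"\s*// line \d+$")


def annotate(source: str, is_anchor) -> str:
    """Append '// line N' to each line for which is_anchor(line) is true."""
    out = []
    for number, line in enumerate(source.splitlines(), start=1):
        out.append(f"{line}  // line {number}" if is_anchor(line) else line)
    return "\n".join(out)


def strip_markers(source: str) -> str:
    """Remove trailing '// line N' markers by pattern matching."""
    return "\n".join(_MARKER.sub("", line) for line in source.splitlines())
```

For example, `annotate(src, lambda line: line.endswith("{"))` would tag only lines that open a block. Stripping purely by pattern, as done here, would also remove a matching comment the author wrote by hand, along with any whitespace before it, which is why tracking the inserted markers in a list, as suggested above, would be the safer design.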

paul-gauthier · 2024-01-02T17:35:48Z

Thanks for taking the time to experiment with new edit formats and share your results. I've also experimented with line numbers for code editing quite a bit. All the benchmarking I've done convinces me that it's worse to ask GPT to use line numbers.

I'm going to close this PR as I'm not likely to add line number support at this time.

numbering, minimal prompt change

8115c0a

omri123 force-pushed the line-numbers branch from 706451d to 296b615 · November 14, 2023 17:57

Change editblock SEARCH/REPLACE prompt

33d092c

omri123 force-pushed the line-numbers branch from 296b615 to 33d092c · November 14, 2023 17:58

omri123 mentioned this pull request Nov 19, 2023

Feature request: vscode extension #68

Open

paul-gauthier closed this Jan 2, 2024
