Add Line numbers #348

Closed
wants to merge 2 commits into from

omri123
Contributor

@omri123 omri123 commented Nov 14, 2023

In this pair of patches, I added line numbers both to files and to code in the prompt.
When creating SEARCH/REPLACE blocks, aider writes code with line numbers as well, and I changed the parser to handle it. When parsing, I ignore the numbers and use the existing SEARCH/REPLACE logic.
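
To make the "ignore the numbers when parsing" idea concrete, here is a minimal sketch. It is not the code from this PR: the helper names are hypothetical, and the `N|code` prefix shape is an assumption based on the pipe separator mentioned later in this thread.

```python
import re

# Assumed prefix shape: optional spaces, digits, then a pipe, e.g. "12|code".
# This regex is a guess at the format, not taken from the patch itself.
_NUM_PREFIX = re.compile(r"^\s*\d+\|")


def strip_line_numbers(block: str) -> str:
    """Remove a leading 'N|' prefix from every line that carries one."""
    return "\n".join(_NUM_PREFIX.sub("", line, count=1) for line in block.split("\n"))


def apply_numbered_edit(content: str, search: str, replace: str) -> str:
    """Strip the numbers from both halves, then do a plain exact-match replace."""
    search_clean = strip_line_numbers(search)
    replace_clean = strip_line_numbers(replace)
    if search_clean not in content:
        raise ValueError("SEARCH text not found in file content")
    # Replace only the first occurrence, as a single edit block would.
    return content.replace(search_clean, replace_clean, 1)
```

One caveat with this sketch: a genuine code line that happens to start with digits followed by a pipe would be stripped too, so a real parser would need a stricter rule.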

In benchmarks, it gives a small positive performance difference, plus a small additional improvement in the number of "Malformed response" cases.

This change allows me to tell aider things like this:

You can use the following commands to get information from files outside the chat:
\\GetDefinition of <symbol> used in <file>,<line>
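
As an illustration of how a tool on the receiving end might read such a request, here is a small sketch. The grammar is inferred only from the one example line above, and `DefinitionRequest` and `parse_request` are hypothetical names, not part of aider or of the extension mentioned in this comment.

```python
import re
from typing import NamedTuple, Optional


class DefinitionRequest(NamedTuple):
    symbol: str
    file: str
    line: int


# Assumed grammar: one or more backslashes, then
# "GetDefinition of <symbol> used in <file>,<line>".
_REQUEST = re.compile(r"\\+GetDefinition of (\S+) used in (.+),(\d+)\s*$")


def parse_request(text: str) -> Optional[DefinitionRequest]:
    """Return the parsed request, or None if the text is not a request."""
    match = _REQUEST.search(text.strip())
    if match is None:
        return None
    symbol, file, line = match.groups()
    return DefinitionRequest(symbol, file.strip(), int(line))
```

The line number in the request is what ties this command to the numbered prompt: without numbers in the chat, the model would have no stable way to point at a usage site.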

I am writing a VS Code extension that will be able to perform this kind of command. I will make it public and open source soon.

@paul-gauthier
Collaborator

Thanks for suggesting these changes. Have you run the full benchmark suite against this prompting? Can you share the detailed results?

I have extensively explored this concept in the past, and found line numbers were not helpful to overall code editing success. I seemed to find they actually worsened performance.

But perhaps you're seeing something different in the benchmarks?

@omri123
Contributor Author

omri123 commented Nov 14, 2023

I ran the full suite only with gpt-4-turbo.
Results are detailed below.

With line numbers:

  • Malformed chats: 6
  • Pass 1: 52.59%
  • Pass 2: 62.96%

Community:

  • Malformed chats: 12
  • Pass 1: 50.04%
  • Pass 2: 61.5%

The results are not very reproducible: when I ran the community version a second time I got 48% and 58%, but I chose to use the numbers consistent with your measurements.
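
For scale, the gaps between these runs can be worked out directly from the figures quoted above. This is only arithmetic on the reported numbers (the second community run is given as rounded values), not a new measurement.

```python
# Figures quoted in this comment (percent of benchmark exercises passed).
with_numbers = {"pass_1": 52.59, "pass_2": 62.96}
community_run_1 = {"pass_1": 50.04, "pass_2": 61.5}
community_run_2 = {"pass_1": 48.0, "pass_2": 58.0}  # reported as rounded values


def gap(a: dict, b: dict) -> dict:
    """Difference a - b in percentage points, per metric."""
    return {key: round(a[key] - b[key], 2) for key in a}


# Line numbers vs. the first community run.
improvement = gap(with_numbers, community_run_1)
# First community run vs. the second community run (same code, two runs).
run_to_run = gap(community_run_1, community_run_2)

print("line numbers vs community:", improvement)
print("community run 1 vs run 2: ", run_to_run)
```

The line-number gain is about 2.55 points on Pass 1 and 1.46 points on Pass 2, while the two community runs differ from each other by about 2.04 and 3.5 points, so the reported gain is of the same order as the run-to-run variation.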

I ran an ablation study. Part of my performance boost comes from this trick:
#346

I changed the example SEARCH/REPLACE block:

  • Without this change, we see a ~1% performance drop when introducing line numbers.
  • With this change and without line numbers: Pass 1: 50.04%, Pass 2: 64.4%.
    So the difference doesn't come from the line numbers alone.

A note about user experience:
When using line numbers, the user can't copy code from the chat and paste it without removing the numbers first. This matters in cases of malformed responses.

@omri123
Contributor Author

omri123 commented Nov 14, 2023

I didn't keep the files from the ablation study benchmarks. When I ran the benchmark for the first time I wasn't organized enough and tested uncommitted changes. I will re-run it to be sure.

I will also take a look at the previous implementation to get better context for what I am doing. Do you have your old line-number benchmarks documented somewhere?

@omri123
Contributor Author

omri123 commented Nov 14, 2023

The implementation is different from the previous one:

  1. The new implementation uses a pipe | rather than a space to separate numbers from code.
  2. In the new implementation, code that appears in the system prompts is also numbered. The assistant is expected to produce numbered code as well.
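
A minimal sketch of the numbering side described in the two points above, assuming an unpadded `N|code` layout. The exact layout used by the patch (padding, starting index) is not stated in this thread, and `add_line_numbers` is a hypothetical name.

```python
def add_line_numbers(source: str, start: int = 1) -> str:
    """Prefix every line with its number and a pipe, e.g. '3|    return x'."""
    numbered = [f"{n}|{line}" for n, line in enumerate(source.splitlines(), start)]
    # Preserve a trailing newline so the output ends the same way as the input.
    trailing = "\n" if source.endswith("\n") else ""
    return "\n".join(numbered) + trailing


if __name__ == "__main__":
    print(add_line_numbers("def f():\n    return 1\n"), end="")
```

Unlike a space, a pipe gives an unambiguous boundary between the number and any leading indentation, which matters for whitespace-sensitive languages.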

@gwpl

gwpl commented Nov 28, 2023

It would be great if this could be turned on and off not only via an environment variable (and eventually a flag), but also from within the REPL, e.g. /config nonu or /config nu (yes, you can see my bias toward Vim short codes...).

@omri123
Contributor Author

omri123 commented Nov 28, 2023

An update on this feature: I want to write a new benchmark targeting some problems I saw with it, rewrite some of the logic (searching for the lines to replace), and split it out as a third coder. I hope to find time for this in the next few days.

@gwpl

gwpl commented Nov 30, 2023

From my understanding of how LLMs work (and my experience with [inference on graph algorithms]), I can see how providing every line number could become unnecessary noise that hurts the signal-to-noise ratio for a stack of transformer layers, specifically by distracting certain attention heads...

Maybe encoding line numbers in another way for specific languages could be more efficient, e.g. adding them as a comment at the end of a few important lines, such as the first line of a new function ( // line 1243 ). Of course, this would require extra logic to track which of those comments were added and should be removed afterwards, but they would be easy to pattern match and keep in a list or table.
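
A rough sketch of this sparse-annotation idea, under stated assumptions: the `// line N` marker shape comes from the example above, while `annotate`, `strip_annotations`, and the `is_anchor` predicate are hypothetical names chosen for illustration. Markers are tracked by their number rather than by position, so they can still be found after an edit shifts lines around.

```python
import re
from typing import Callable, List, Tuple


def annotate(source: str, is_anchor: Callable[[str], bool],
             comment: str = "//") -> Tuple[str, List[int]]:
    """Append '<comment> line N' to anchor lines; return text and the numbers added."""
    out, added = [], []
    for number, line in enumerate(source.splitlines(), start=1):
        if is_anchor(line):
            line = f"{line} {comment} line {number}"
            added.append(number)
        out.append(line)
    return "\n".join(out), added


def strip_annotations(annotated: str, added: List[int], comment: str = "//") -> str:
    """Remove only the markers we added ourselves, matched by their number."""
    marker = re.compile(rf"^(.*?)\s*{re.escape(comment)} line (\d+)$")
    tracked = set(added)
    out = []
    for line in annotated.splitlines():
        match = marker.match(line)
        if match and int(match.group(2)) in tracked:
            line = match.group(1)
        out.append(line)
    # Note: a trailing newline in the input is not preserved by this sketch.
    return "\n".join(out)
```

For example, calling `annotate(src, lambda line: line.startswith("function "))` on a JavaScript file would tag only the lines that open a function, and `strip_annotations` would restore the original text afterwards.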

@paul-gauthier
Collaborator

Thanks for taking the time to experiment with new edit formats and share your results. I've also experimented with line numbers for code editing quite a bit. All the benchmarking I've done convinces me that it's worse to ask GPT to use line numbers.

I'm going to close this PR as I'm not likely to add line number support at this time.


3 participants