Add a code editor with syntax highlighting #291

Open
johannphilippe opened this issue Jan 13, 2022 · 10 comments
@johannphilippe
Contributor

johannphilippe commented Jan 13, 2022

I would like to contribute to the development of a code editor element with syntax highlighting.
It could probably be derived from basic_text_box, though it would require a dynamic syntax analyzer that users can configure to support almost any language (perhaps brainf*ck is not a major target :D ), or at least common syntax.
I quickly modified lexertk here, so that it doesn't discard spaces, tabs and comments. I could also "highlight" an ugly custom static text box (no edit). This tiny library, lexertk, can still give us a few ideas on how we could do that!

Discord discussion below about text editor and related.

johannphilippe — 10/01/2022
I don't know if I should create an issue for that (perhaps as a reminder), but I want to add a few keyboard shortcuts to basic_text_box (the one used in the text_edit example):
- Tabs (several spaces)
- Ctrl+Backspace (delete a whole word if possible)
- Edit: also Ctrl+Left or Ctrl+Right to move the caret by one word if possible
Tabs, though, would conflict with how the key is currently used to pass the focus to the next element (probably due to composite_base, I guess)
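
As an illustration of the Ctrl+Backspace item above, here is a minimal sketch of word deletion on a std::u32string buffer. It is not taken from elements: `is_space`, `previous_word_start` and `delete_previous_word` are hypothetical names, and the sketch assumes that ASCII whitespace is the only word separator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of elements): ASCII whitespace only.
inline bool is_space(char32_t c)
{
   return c == U' ' || c == U'\t' || c == U'\n' || c == U'\r';
}

// Returns the index at which the word before `caret` starts.
std::size_t previous_word_start(std::u32string const& text, std::size_t caret)
{
   assert(caret <= text.size());
   std::size_t pos = caret;
   while (pos > 0 && is_space(text[pos - 1]))    // skip spaces just before the caret
      --pos;
   while (pos > 0 && !is_space(text[pos - 1]))   // then skip the word itself
      --pos;
   return pos;
}

// Ctrl+Backspace: erases from the previous word start up to the caret
// and returns the new caret position.
std::size_t delete_previous_word(std::u32string& text, std::size_t caret)
{
   std::size_t const start = previous_word_start(text, caret);
   text.erase(start, caret - start);
   return start;
}
```

The same `previous_word_start` helper could also back a Ctrl+Left caret move, since both actions need the same boundary.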

Also, that leads to another question: when we work on syntax highlighting, we will surely make a pure syntax highlighter based on a lexer/parser and so on. But are we going to make another text element that builds on it, or are we going to add it to the current text classes?

Because, of course, tab as "space * 3" is mostly used in code editors. So there are a few questions here. (edited)
[20:05]
For example, in Flutter, the basic TextField uses tab the way the elements basic_input_box does: to pass focus. But in a multiline text editor like the one in the elements example, you probably want it to insert a few spaces. (edited)
[20:06]
Not related to that question, but for basic_input_box, making the placeholder protected rather than private could also be interesting; I think it's important that a child class can modify it at runtime. (edited)

redtide — 10/01/2022
the only thing that comes to my mind is that tabs should be configurable. I guess the highlighting should be some external mechanism added to some advanced textedit class?

johannphilippe — 10/01/2022
I agree for highlight.
[20:22]
For tabs, I think basic_text_box and basic_input_box are quite different widgets, where the second is mostly used for single line entries (so tab changing focus is relevant) and the first is mostly a plain text editor (so, adding spacing tabs could be relevant in this case)

redtide — 10/01/2022
right, so I think the former should be configurable
[20:25]
I guess syntax highlighting could be used on some styled_text_box

johannphilippe — 10/01/2022
Or that could simply be done like this:
* basic_text_box is a text editor, with spacing tabs. It also offers a choice between plain text editor and highlighted text editor (so a code editor can inherit from it).
* basic_input_box and the input_box maker keep their "change focus" tab behavior (overriding the basic_text_box behavior)
* code_editor is based on basic_text_box with the highlight parameter set to true
[20:28]
I think the benefit of my idea is that it keeps the current implementation almost intact.
Though, I understand your idea too; the text_editor_controller could be great as well. I just haven't seen this pattern much in elements.
[20:29]
(I use controllers in my projects, as a way to use custom parameters on some custom elements)

redtide — 10/01/2022
I would leave basic_text_box for generic editing, then inherit from it to make a styled_text_box with advanced features like syntax highlighting; eventually, more advanced editing can inherit from the latter
[20:31]
(basically no SH on the basic at all, only tab/spaces configuration)

johannphilippe — 10/01/2022
SH? (edited)

redtide — 10/01/2022
syntax highlighting

johannphilippe — 10/01/2022
Well the tab spacing is not hard to implement. And I'm pretty sure basic_input_box overrides it to let parent handle it.

redtide — 10/01/2022
maybe you can get some reference from EditorConfig, there are not many settings
[20:47]
(I mean for indent size and indent type: tab or space) (edited)
[20:48]
I have no experience on that side, it might be best to ask @djowel first

johannphilippe — 10/01/2022
One last thing: if text elements are mainly based on std::u32string and std::u32string_view, as seems to be the case, it is hard to work with regex (I don't think there is a u32 regex in std). Could we template it to let the user decide? Of course, I can always write a conversion function, but with a lot of text elements (I will use a lot of basic inputs in my software), I think it will cost some CPU.

johannphilippe — 10/01/2022
you are also right because indent could be 2, 3, 4 (...) spaces
January 11, 2022
@djowel
Member

djowel commented Jan 14, 2022

> I quickly modified lexertk here, so that it doesn't discard spaces, tabs and comments. I could also "highlight" an ugly custom static text box (no edit). This tiny library, lexertk, can still give us a few ideas on how we could do that!

I dissuade you from using a lexer. 1. It's not the right tool. 2. It is an unnecessary dependency, and 3. It is easy to implement without it.

The right tool is actually a recursive descent parser because parsing syntax is recursive WRT elements such as braces, parentheses and even comments in some languages (e.g. pascal). It is easy to hand-code an RDP for this purpose. I can offer guidance or write code if needed.
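
To illustrate the point about recursion, here is a minimal hand-coded recursive descent sketch with no lexer and no dependencies. It is an assumption-laden toy, not the approach used by any existing library: the grammar only knows parentheses and nestable `(* ... *)` comments, and the names `style`, `highlighter` and `highlight_nesting` are hypothetical. It needs C++17 for std::u32string_view.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

enum class style { plain, bracket, comment };

struct highlighter
{
   std::u32string_view text;
   std::vector<style>  styles;   // one entry per code point
   std::size_t         pos = 0;

   explicit highlighter(std::u32string_view t)
    : text(t), styles(t.size(), style::plain) {}

   // True if the input at the current position starts with `s`.
   bool at(std::u32string_view s) const
   {
      return text.substr(pos, s.size()) == s;
   }

   // Assigns style `s` to the next `n` code points and consumes them.
   void mark(std::size_t n, style s)
   {
      assert(pos + n <= text.size());
      for (std::size_t i = 0; i != n; ++i)
         styles[pos++] = s;
   }

   // comment := "(*" { comment | any-char } "*)"
   void comment()
   {
      mark(2, style::comment);                 // opening "(*"
      while (pos < text.size())
      {
         if (at(U"*)")) { mark(2, style::comment); return; }
         if (at(U"(*")) comment();             // nested comment: recurse
         else mark(1, style::comment);
      }
   }

   // group := "(" sequence ")"
   void group()
   {
      mark(1, style::bracket);                 // opening "("
      sequence();
      if (at(U")")) mark(1, style::bracket);   // closing ")" if present
   }

   // sequence := { comment | group | any-char-except-")" }
   void sequence()
   {
      while (pos < text.size() && !at(U")"))
      {
         if (at(U"(*")) comment();
         else if (at(U"(")) group();
         else mark(1, style::plain);
      }
   }
};

std::vector<style> highlight_nesting(std::u32string_view text)
{
   highlighter h{text};
   while (h.pos < text.size())
   {
      h.sequence();
      if (h.pos < text.size())                 // stray ")" at top level
         h.mark(1, style::plain);
   }
   return h.styles;
}
```

Each grammar rule maps to one member function, and the parser reads characters directly, so no separate tokenizing pass is involved. A flat token stream alone would not tell where the outer comment in `(* a (* b *) c *)` ends.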

@djowel
Member

djowel commented Jan 14, 2022

> - Edit: also Ctrl+Left or Ctrl+Right to move the caret by one word if possible

Option+left/right already moves caret by words. Shift+Option+left/right extends the selection by words.

@djowel
Member

djowel commented Jan 14, 2022

> Because, of course, tab as "space * 3" is mostly used in code editors.

The tab number of spaces needs to be configurable.
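
A possible shape for such a setting, loosely modeled on the EditorConfig notions of indent style and indent size mentioned in the chat, is sketched below. `indent_style`, `indent_config` and `tab_insertion` are hypothetical names rather than elements API, and aligning to the next tab stop is only one possible policy.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class indent_style { tab, space };

// Hypothetical configuration object for a text or code editor.
struct indent_config
{
   indent_style style = indent_style::space;
   std::size_t  size  = 4;    // width of one indent level, in columns
};

// Returns the text to insert when Tab is pressed at `column` (0-based).
// With spaces, the caret is advanced to the next multiple of cfg.size.
std::u32string tab_insertion(indent_config const& cfg, std::size_t column)
{
   assert(cfg.size > 0);
   if (cfg.style == indent_style::tab)
      return U"\t";
   std::size_t const n = cfg.size - (column % cfg.size);
   return std::u32string(n, U' ');
}
```

An input box that uses Tab for focus traversal would simply never call this, while a multiline editor would insert the returned string at the caret.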

@djowel
Member

djowel commented Jan 14, 2022

> For example, in Flutter, the basic TextField uses tab the way the elements basic_input_box does: to pass focus. But in a multiline text editor like the one in the elements example, you probably want it to insert a few spaces. (edited)

Same with element input text boxes. Code editors should behave differently.

@djowel
Member

djowel commented Jan 14, 2022

> I would leave basic_text_box for generic editing, then inherit from it to make a styled_text_box with advanced features like syntax highlighting; eventually, more advanced editing can inherit from the latter

A (styled) rich-text editor and a code-editor are two separate things that have very specific requirements. While it may seem intuitive to have a code-editor inherit from a rich-text editor, it is best to simply have these as totally separate classes. There's not much code-reuse advantage.

@djowel
Member

djowel commented Jan 14, 2022

> One last thing: if text elements are mainly based on std::u32string and std::u32string_view, as seems to be the case, it is hard to work with regex (I don't think there is a u32 regex in std). Could we template it to let the user decide? Of course, I can always write a conversion function, but with a lot of text elements (I will use a lot of basic inputs in my software), I think it will cost some CPU.

There is: std::wregex and std::wsmatch

How to use Unicode range in C++ regex
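
A rough sketch of what using std::wregex with UTF-32 text could look like follows. Note the assumption, enforced by a static_assert: wchar_t must be at least 32 bits wide, which is typical on Linux and macOS but not on Windows, where wchar_t is 16 bits and code points above U+FFFF would not fit. `to_wide` and `find_all` are hypothetical helpers, and the conversion copy is exactly the cost raised in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Copies UTF-32 code points into a wide string, one wchar_t per code point.
std::wstring to_wide(std::u32string const& s)
{
   static_assert(sizeof(wchar_t) >= sizeof(char32_t),
      "this sketch assumes a 32-bit wchar_t");
   return std::wstring(s.begin(), s.end());
}

// Returns (offset, length) pairs for every match of `pattern` in `text`.
std::vector<std::pair<std::size_t, std::size_t>>
find_all(std::u32string const& text, std::wregex const& pattern)
{
   std::wstring const wide = to_wide(text);
   assert(wide.size() == text.size());   // offsets map one to one
   std::vector<std::pair<std::size_t, std::size_t>> result;
   for (std::wsregex_iterator it(wide.begin(), wide.end(), pattern), end;
        it != end; ++it)
   {
      result.emplace_back(
         static_cast<std::size_t>(it->position()),
         static_cast<std::size_t>(it->length()));
   }
   return result;
}
```

Because the conversion is one to one under that assumption, the offsets returned by `find_all` can be applied directly to the original std::u32string.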

djowel self-assigned this Jan 14, 2022
@johannphilippe
Contributor Author

> The right tool is actually a recursive descent parser

Doesn't it require tokenizing first? I will take a look at RDP!

@djowel
Member

djowel commented Jan 14, 2022

> > The right tool is actually a recursive descent parser
>
> Doesn't it require tokenizing first? I will take a look at RDP!

No. Not really. Boost.Spirit for example does not require a lexer.

@djowel
Member

djowel commented Jan 14, 2022

I wrote a syntax highlighter a long time ago, using spirit, for syntax highlighting c++. It might be possible to 1. Extract the parser proper and rewrite it using pure c++ (no dependencies) 2. make it generic to allow other languages.

Here's the github page: BoostBook

It's still being used by Boost documentation. Many Boost docs use it.

Here's the actual syntax highlighting code:
https://github.com/boostorg/quickbook/blob/develop/src/syntax_highlight.cpp

The advantage there is that the grammar is formalized and can be lifted.

(edit: reading the code again, I realize there's also a python parser)
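
One way to read "make it generic" is to move the language-specific parts into data. The sketch below is only a guess at that direction and is not derived from the quickbook code: `language_def`, `token_style` and `highlight_keywords` are hypothetical names, and it is a simple single-pass scanner handling only keywords and line comments, not a full recursive descent grammar.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <string_view>
#include <vector>

enum class token_style { plain, keyword, comment };

// Hypothetical per-language description supplied by the user.
struct language_def
{
   std::set<std::u32string> keywords;
   std::u32string           line_comment;   // e.g. U"//" or U"#"
};

inline bool is_word_char(char32_t c)
{
   return (c >= U'a' && c <= U'z') || (c >= U'A' && c <= U'Z')
       || (c >= U'0' && c <= U'9') || c == U'_';
}

// Returns one style per code point of `text`.
std::vector<token_style>
highlight_keywords(std::u32string_view text, language_def const& lang)
{
   assert(!lang.line_comment.empty());
   std::vector<token_style> styles(text.size(), token_style::plain);
   std::size_t pos = 0;
   while (pos < text.size())
   {
      if (text.substr(pos, lang.line_comment.size()) == lang.line_comment)
      {
         // A line comment runs up to, but not including, the newline.
         std::size_t end = text.find(U'\n', pos);
         if (end == std::u32string_view::npos)
            end = text.size();
         for (; pos < end; ++pos)
            styles[pos] = token_style::comment;
      }
      else if (is_word_char(text[pos]))
      {
         // Scan a whole word, then look it up in the keyword set.
         std::size_t const start = pos;
         while (pos < text.size() && is_word_char(text[pos]))
            ++pos;
         if (lang.keywords.count(std::u32string(text.substr(start, pos - start))))
            for (std::size_t i = start; i < pos; ++i)
               styles[i] = token_style::keyword;
      }
      else
      {
         ++pos;
      }
   }
   return styles;
}
```

Supporting another language would then mean supplying a different `language_def`, while nested constructs would still need grammar rules of their own.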

@johannphilippe
Contributor Author

> I wrote a syntax highlighter a long time ago, using spirit, for syntax highlighting c++. It might be possible to 1. Extract the parser proper and rewrite it using pure c++ (no dependencies) 2. make it generic to allow other languages.

After reading the code, I think I need a bit of time to understand it 😄
Today I found this, which seems to explain the syntax used to describe grammars: https://en.wikipedia.org/wiki/Spirit_Parser_Framework
