Skip to content
This repository was archived by the owner on Nov 5, 2022. It is now read-only.

textcursor type to replace textselection #123

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

dskinner
Copy link
Contributor

@dskinner dskinner commented Jun 8, 2015

Cursor is fundamentally different as it's defined as only containing an index and mark. Caret location is always at index. A text selection is the cursor indices sorted least to greatest.

@googlebot
Copy link

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project, in which case you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If you signed the CLA as a corporation, please let us know the company's name.

@dskinner
Copy link
Contributor Author

dskinner commented Jun 8, 2015

changes to files beyond textcursor.go were kept to a minimum to assist in identify any issues with the transition.

@dskinner
Copy link
Contributor Author

dskinner commented Jun 8, 2015

I signed it!

@googlebot
Copy link

CLAs look good, thanks!

// TextCursor represents a text position and selection. A selection is defined as
// the pair of cursor indices sorted least to greatest. When transforming a
// cursor with no selection, mark should equal index.
type TextCursor struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand things correctly, index and mark are equivalent to the old start and end fields, except the carat is always displayed at mark, and mark can be before index. If this is true, index and mark are still essentially a half-closed interval. What's not clear is which endpoint is open and which is closed (I guess the smaller of the two is always the closed endpoint). It would be good to have this documented.

I'm not sold on the field names index and mark. Given that they're interval endpoints, how about From and To? I'm not completely opposed to mark, but index just doesn't seem to work nicely as a counterpart for me. I'm open to other suggestions here. Maybe Carat and Base? I'll stick with From and To for the rest of my comments.

I'm not sure it's worth bringing in the concept of selection into the TextCursor documentation - it's rather confusing as it guarantees order (unlike TextCursor), and is already well explained below.

My stab at documentation (based on my understanding of the code) is as follows (I'm a visual guy):

// TextCursor represents an interval of runes, where From and To can be 
// considered as the indices to the gaps between runes. 
// To can be less than, equal or greater to From.
// When the TextCursor is used as a TextBox selection, the carat should always 
// be displayed at To.
// The TextCursor represents a single point in a string When the To and From fields are equal.
//
// For example, given the string "Hello world":
//
//      ┌   ┬   ┬   ┬   ┬   ┬   ┬   ┬   ┬   ┬   ┬   ┐
//        H   e   l   l   o       w   o   r   l   d  
//      └   ┴   ┴   ┴   ┴   ┴   ┴   ┴   ┴   ┴   ┴   ┘
//      ₀   ₁   ₂   ₃   ₄   ₅   ₆   ₇   ₈   ₉   ₁₀  ₁₁
//
//
// TextCursor{ From: 0, To: 5 } represents:
//
//      ┌   ┬   ┬   ┬   ┬   ╥   ┬   ┬   ┬   ┬   ┬   ┐
//      │ H   e   l   l   o ║     w   o   r   l   d  
//      └   ┴   ┴   ┴   ┴   ╨   ┴   ┴   ┴   ┴   ┴   ┘
//      ├───────────────────╢ 
//      ₀                   ₅
//
//
// TextCursor{ From: 11, To: 6 } represents:
//
//      ┌   ┬   ┬   ┬   ┬   ┬   ╥   ┬   ┬   ┬   ┬   ┐
//        H   e   l   l   o     ║ w   o   r   l   d  
//      └   ┴   ┴   ┴   ┴   ┴   ╨   ┴   ┴   ┴   ┴   ┘
//                              ╟───────────────────┤
//                              ₆                   ₁₁
//
//
// TextCursor{ From: 5, To: 5 } represents:
//
//      ┌   ┬   ┬   ┬   ┬   ╥   ┬   ┬   ┬   ┬   ┬   ┐
//        H   e   l   l   o ║     w   o   r   l   d  
//      └   ┴   ┴   ┴   ┴   ╨   ┴   ┴   ┴   ┴   ┴   ┘
//                          ║
//                          ₅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor.Index is the location of the caret. Given that the words cursor and caret refer to the same idea, the rationale for the field name Index is that a single cursor/caret can only exist at a single position.

I am somewhat opposed to the names From and To as it conflates the idea of a cursor/caret and a selection. This is also why I omitted the Start and End methods. A selection is retrieved via TextCursor.Selection which is currently simple enough to be a pair of ordered indices, but still distinct from the cursor itself. Would TextCaret.Index be more obvious?

Regarding Cursor.Mark, I'm not totally convinced on that name either. Mark is meant to represent storing a cursor position but is still volatile. During transforms, mark is never meant to be operated on independently. All transforms should either adjust only the cursor index, or apply the same change to cursor index and cursor mark. I've been considering how to enforce this contract through what methods are available on TextCursor as well.

If you look at the TextCursorSlice.Transform, this uniformly transforms index and mark, no special cases. This is essentially what TextCursor.Offset does as well, allowing the cursor, and any selection implied, to move around.

If you also look at TransformCarets (which I forgot to rename to TransformSelection, will follow up with that change), it only operates on the cursor index.

So those transform methods operate within the bounds of the contract I describe, but the TextCursor itself does not yet.

What I've considered next to assure TextCursor makes this clear is to provide the following constructors instead in a follow up commit (since it will touch more code elsewhere). NewTextCursor(index int) Cursor and NewTextCursorSelection(start, end int).

So the distinction I'm trying to make here is that a TextCursor is not a half-closed interval up until the point you call TextCursor.Selection(). It's just happen-stance they are stored side by side in the same struct due to a selection being trivially represented by an additional index.

I think this is an important distinction, particularly when you look at all the method on textboxcontroller working with selections and carets and all the ints going around everywhere. A new caret creates a hidden selection that could be retrieved with a selection method and other weirdness.

I'm also against the redundancy of Cursor.Caret, as I hope I clarified what a TextCursor is above but perhaps renaming to TextCaret would make things totally obvious.

I'm not sure it's worth bringing in the concept of selection into the TextCursor documentation

I agree. That's more or less there due to maintaining a similar constructor to the old code.

I love the ascii art :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about, instead of trying to treat this thing as two indices, we have:

type TextCursor struct {
  Index uint // Carat index.
  Length int // If non-zero cursor is a selection.
}

Where Length can be zero, positive or negative.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes the construction of thr non-selection case simple:

gxui.TextCursor{ Index: i }

t = t.Offset(i) becomes t.Index += i

Things that don't want to know about selections don't look at the Length field.

My concerns with half-closed intervals go away as it's much simpler to understand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just mulling over removing mark (a second index) completely and this is a great idea

@ben-clayton
Copy link
Contributor

Hi @dskinner. I have a bunch of comments mostly regarding semantics.Otherwise, it looks good.
Thank you for taking the time to do this!

Cheers,
Ben

@dskinner
Copy link
Contributor Author

@ben-clayton I've pushed the updates previously discussed a few days ago but got sidelined before I got to detail some thoughts on this pull request.

Currently, I've left Index as an int instead of the suggested uint in your original comment. This was due to the Range function which would allow overflowing the uint and make an error less obvious. For example, TextCursor{Index: 0, Length: -5} is not valid, and being able to instantiate the struct directly skips any checks that might occur during construction.

I've also added a constructor to help with working with methods such as WordAt and other funcs that return a range, though the name NewTextCursor doesn't feel quite right. The motivation for this is simply that typing TextCursor{Index: end, Length: start-end} is unnecessarily error prone. Somewhere along the way, I had myself accidentally typed TextCursor{e, e-s} and it took a good minute to find. This bothers me.

In retrospect, I'm actually more in favor of how you originally viewed the code as TextCursor{From, To}. This simplifies working with intervals and at worse, makes a single caret index verbose.

At the same time, this brings into question the interval/ package. This package would play an important role in multiple cursors/selections, syntax highlighting, indexing, etc. I looked through gxui and saw this package was only used for managing cursors/selections, and the code_syntax_layer and subsequent paints.

The package contains some oddities though such as the unused U64. It's also curious why the original TextSelection didn't simply make use of IntData, storing caret location at IntData.data and so on (via struct embedding or w/e).

There is another interval package from biogo: https://github.com/biogo/store/tree/master/interval

I briefly looked at it, though the binary-tree/pointers approach in my experience is typically slower than a slice of struct literals. This may be important for syntax highlighting large files but it may be too soon to say. In general, inserts were five times slower than gxui/interval. I didn't measure GC pause for having all those pointers, not sure how relevant that will be in 1.5 even.

Regardless, I'm not totally dismissive of a 3rd party package here considering the experimental vendor support coming in 1.5, but at the least I think it's a good model for cleaning up gxui/interval, removing a lot of the redundancy found in the text cursor code, and consider other cases the interval package will play a role in (i.e. indexing, bookmarking).

So right now I feel this pull request is premature.

@ben-clayton
Copy link
Contributor

Hi @dskinner - sorry for the insanely long delay in getting back to you on this.

Currently, I've left Index as an int instead of the suggested uint in your original comment. This was due to the Range function which would allow overflowing the uint and make an error less obvious. For example, TextCursor{Index: 0, Length: -5} is not valid, and being able to instantiate the struct directly skips any checks that might occur during construction.

Makes sense.

I've also added a constructor to help with working with methods such as WordAt and other funcs that return a range, though the name NewTextCursor doesn't feel quite right. The motivation for this is simply that typing TextCursor{Index: end, Length: start-end} is unnecessarily error prone. Somewhere along the way, I had myself accidentally typed TextCursor{e, e-s} and it took a good minute to find. This bothers me.

The new constructor seems fine to me.

In retrospect, I'm actually more in favor of how you originally viewed the code as TextCursor{From, To}. This simplifies working with intervals and at worse, makes a single caret index verbose.

shrug What you have in your PR seems like an improvement with ToT, if just for the additional documentation.

At the same time, this brings into question the interval/ package. This package would play an important role in multiple cursors/selections, syntax highlighting, indexing, etc. I looked through gxui and saw this package was only used for managing cursors/selections, and the code_syntax_layer and subsequent paints.

The package contains some oddities though such as the unused U64. It's also curious why the original TextSelection didn't simply make use of IntData, storing caret location at IntData.data and so on (via struct embedding or w/e).

The interval package has gone through several iterations. A colleague and I spent a week optimizing it as it was used in some particularly hot code in my full-time project, Android GPU profiling tools.
If you think the version of the interval package is better, I can port this back again.

There is another interval package from biogo: https://github.com/biogo/store/tree/master/interval

I briefly looked at it, though the binary-tree/pointers approach in my experience is typically slower than a slice of struct literals. This may be important for syntax highlighting large files but it may be too soon to say. In general, inserts were five times slower than gxui/interval. I didn't measure GC pause for having all those pointers, not sure how relevant that will be in 1.5 even.

Like you guessed, the interval code can be stressed with lots of syntax layers. I'd prefer to keep the code fast, and in GXUI. If there's ways we can make the interval package easier to use, I'm happy for changes.

So right now I feel this pull request is premature.

I'm happy to accept it, but it's your call.

Thanks for all your time on this so far Daniel.

@dskinner
Copy link
Contributor Author

Hey,

No worries about the long delay. My concerns still stand about this being premature. While agree the current pull request is an improvement over what's currently in master, it still doesn't address all concerns and that's just the facts. If we merge this, it's only due to be replaced by something more complete later on.

My main concern as expressed on gitter is still with a selection that displays carets on each end. With this current pull-request as is, that requires some hackery, but I don't think this should be an edge case at all. On mobile in fact, this is quite normal. On desktop, it should definitely be an option.

I want to follow up with a pull request that actually addresses this issue but I'll also want time to benchmark differences between implementations so as to find the most suitable way to do this while still managing to do things like syntax highlighting in real time without much delay.

In regards to the interval package, I gave a cursory glance over the package from android-sdk that you linked and I think it'd be worthwhile to backport this to gxui, particularly since you mentioned spending time speeding it up and the fact that it's much more well documented.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants