Optionals issue with non-English (Arabic) #333

Open
EyadKhaled opened this issue Dec 19, 2019 · 3 comments

EyadKhaled commented Dec 19, 2019

An English trigger with optionals on both sides works fine:

+ [*] welcome [*]

But when I use Arabic letters, the trigger doesn't match:

+ [*] مرحبا [*]

I did set {utf8: true}, by the way.

I use optionals in most of my triggers, so this is a serious problem for me. I have tried many times to fix it myself without success, and I think it is a Unicode problem.

kirsle (Member) commented Jan 4, 2020

Hello,

This has been a problem affecting the JavaScript version for a while (putting [*] on either side of Unicode text), so a while back I added a work-around command for this specific use case:

? مرحبا
- Reply

Put a "?" symbol in place of the "+" and remove the [*] optional wildcards from either end. This turns it into a "keyword trigger", which tries several wildcard arrangements to build a +Trigger that simply checks whether the keyword appears somewhere in the message.
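As a rough illustration of what "the keyword is somewhere in the message" means, here is a minimal JavaScript sketch that avoids word-boundary regexps altogether. It is not how RiveScript implements the "?" command; `containsKeyword` is a hypothetical helper, and it assumes the message has already been lowercased and stripped of punctuation.

```javascript
// Hypothetical helper, not part of the RiveScript API.
// Assumes `message` is already normalized (lowercased, punctuation removed).
function containsKeyword(message, keyword) {
  // Split on runs of whitespace so that neither \b nor \w is involved.
  const words = message.trim().split(/\s+/);
  // The keyword must appear as a whole whitespace-delimited word.
  return words.includes(keyword);
}

console.log(containsKeyword("hello مرحبا world", "مرحبا")); // true
console.log(containsKeyword("مرحبا", "مرحبا"));             // true
console.log(containsKeyword("hello world", "مرحبا"));       // false
```

Because the check relies only on whitespace, it behaves the same for Arabic and English keywords.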

More information about the bug can be found in #147 and in other issues carrying the unicode tag. The root cause is that the [optional] syntax in RiveScript is turned into a regexp that uses the word-boundary metacharacter "\b". In JavaScript, "\b" is defined in terms of "\w", which only covers ASCII letters, digits and the underscore, so Arabic (and other non-ASCII) letters are not treated as word characters.
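The "\b" behaviour can be checked in isolation with plain JavaScript. The snippet below is only a sketch of the underlying regexp quirk; it does not reproduce the exact regexp that RiveScript generates for [*], and the variable names are illustrative.

```javascript
// In JavaScript, \b is defined via \w, and \w covers only [A-Za-z0-9_].
const word = "مرحبا";
const boundaryPattern = new RegExp("\\b" + word + "\\b");

// Arabic letters are not \w characters.
const isWordChar = /\w/.test(word); // false

// Space next to an Arabic letter is non-word next to non-word:
// there is no boundary there, so the pattern cannot match.
const matchesInSentence = boundaryPattern.test("hello " + word + " world"); // false

// An ASCII letter next to an Arabic letter is word next to non-word:
// \b fires exactly where a human reader would not expect it.
const matchesWhenGlued = boundaryPattern.test("x" + word + "x"); // true

console.log({ isWordChar, matchesInSentence, matchesWhenGlued });
```

The last case shows that the boundary logic is effectively inverted for non-ASCII text, which is why the same trigger works with "welcome" but not with "مرحبا".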

kirsle added the unicode label Jan 4, 2020
EyadKhaled (Author) commented

Sorry for the late reply.
I had been working around this problem with a trick based on alternatives:

+ (اهلا|*اهلا) test (وسهلا|وسهلا*)

But that certainly wasn't the best choice, so "?" will be a much cleaner and more useful solution than this trick.
Thanks.

Oh, I was wondering: does this problem happen in the other language implementations?

kirsle (Member) commented Jan 28, 2020

I tested some of the other versions (Go, Perl and Python). It seems the Go version has a similar bug, but it works in the Perl and Python versions.

I tested this on the Python version and it worked fine:

+ [*] مرحبا [*]
- It matched!

      .   .       
     .:...::      RiveScript Interpreter (Python)
    .::   ::.     Library Version: v1.14.9
 ..:;;. ' .;;:..  
    .  '''  .     Type '/quit' to quit.
     :;,:,;:      Type '/help' for more options.
     :     :      

Using the RiveScript bot found in: tmp
Type a message to the bot and press Return to send it.

You> مرحبا
Bot> It matched!
You> hello مرحبا world
Bot> It matched!

And Perl:

RiveScript Interpreter - Interactive Mode
-----------------------------------------
RiveScript Version: v2.0.3
        Reply Root: ../rivescript-python/tmp

You are now chatting with the RiveScript bot. Type a message and press Return to send it.
When finished, type '/quit' to exit the program. Type '/help' for other options.

You> مرحبا
Bot> It matched!
You> helo مرحبا world
Bot> It matched!
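For comparison, a Unicode-aware word boundary can be emulated in JavaScript with lookarounds and Unicode property escapes. This is a sketch under stated assumptions, not a patch for the RiveScript library: `unicodeWordRegExp` is a hypothetical helper, and it requires an engine that supports lookbehind and the `u` flag (any recent Node.js release does).

```javascript
// Hypothetical helper: builds a regexp that matches `word` only when it is not
// directly adjacent to a letter, combining mark, digit, or underscore in any script.
function unicodeWordRegExp(word) {
  // Escape regexp metacharacters so the word is matched literally.
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Stand-in for \w that is not limited to ASCII.
  const wordChar = "[\\p{L}\\p{M}\\p{N}_]";
  // Negative lookbehind and lookahead play the role of \b on each side.
  return new RegExp(`(?<!${wordChar})${escaped}(?!${wordChar})`, "u");
}

const greeting = unicodeWordRegExp("مرحبا");
console.log(greeting.test("مرحبا"));             // true
console.log(greeting.test("hello مرحبا world")); // true
console.log(greeting.test("مرحبابك"));           // false (followed by another letter)
```

The two matching inputs are the same messages used in the Python and Perl sessions above.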
