Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] add option to convert bbcode to markdown #23

Closed
wants to merge 5 commits into from

Conversation

Dahie
Copy link

@Dahie Dahie commented May 15, 2020

This is still work in progress, but shows a possible direction discussed in #22

The tests are failing at the moment. I have a bit of a hard time with Treetop, it's documentation is … sparse with a steep learning curve.


This is an image: ![Image](http://example.com/foo.png)

> This *is* a quotation.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current treetop node structure does not allow to have these quotes per line. I need some help here.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. Do you still need help with this? If so, could you elaborate on what you're trying to do and what problems you ran into?

Copy link
Owner

@jarrett jarrett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this is still marked as WIP. I take it it's not ready for me to review and merge into master yet. Please let me know if I'm mistaken.

@Dahie
Copy link
Author

Dahie commented May 30, 2020

Yes, still open. So the goal is to translate

[quote]this is

a 
multiline quote
[/quote]

into

> this is
>
> a
> multine quote
>

My thinking went in to directions to achieve this. Either by replacing the newlines within the block node to \n> but that messes with included paragraphs and leaves empty new lines so far.
Thinking two was having this dealt properly in Parslet and introducing QuoteLineItemsor something that get prepended with the >

@jarrett
Copy link
Owner

jarrett commented May 30, 2020

Ah, thanks for pointing that out. So tell me if I'm understanding this correctly:

In BBCode, the string apples\noranges is supposed to parse to the same HTML as apples oranges. In other words, the single newline character is treated as a space. But when converting to Markdown, you actually do want to preserve the single newline character in the output.

Did I get that right? If so, then I think your second option is the way to go. I.e. we'll have to extend the .treetop file so it emits different nodes for apples\noranges and apples oranges. Then, we'll add logic to consume these types of nodes in the HTML and Markdown rendering methods.

@jarrett
Copy link
Owner

jarrett commented May 30, 2020

BTW, I added comments to node_extensions.rb in the master branch just now. Hopefully this makes the interplay between the .treetop file and node_extensions.rb easier to understand.

@Dahie
Copy link
Author

Dahie commented May 31, 2020

Ah, thanks for pointing that out. So tell me if I'm understanding this correctly:

In BBCode, the string apples\noranges is supposed to parse to the same HTML as apples oranges. In other words, the single newline character is treated as a space. But when converting to Markdown, you actually do want to preserve the single newline character in the output.

Did I get that right? If so, then I think your second option is the way to go. I.e. we'll have to extend the .treetop file so it emits different nodes for apples\noranges and apples oranges. Then, we'll add logic to consume these types of nodes in the HTML and Markdown rendering methods.

Good point, markdown's handling of when to paragraph and when to line breaks got me a few times already. It's documented here: https://www.markdownguide.org/basic-syntax/#paragraphs-1

[quote]this becomes    
a 
single line quote
[/quote]

would actually become

> this becomes a single line quote

While

[quote] this becomes
  
a  
multiline quote
[/quote]

should become

> this becomes
>
> a
> multiline quote

(Note the double-spaces behind the "a")

@Dahie
Copy link
Author

Dahie commented May 31, 2020

Btw, what is the minimum supported ruby version of the gem? I'd tend to go for 2.5 given everything older is EOL

@Dahie Dahie force-pushed the add-support-for-markdown branch from 4020410 to 2deb40a Compare May 31, 2020 08:36

def expected_markdown
<<-EOS
**This *is* awesome**
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fails at the moment and creates the line **This [i]is[/i] awesome**
Something in the recursive detection seems to be broken.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think this should be doable if we extend the .treetop file.

I agree with supporting nothing earlier than Ruby 2.5. I wrote this gem a long time ago and didn't get around to updating its dependencies. I'm going to try to pull from your repo, see if there's anything else needing updates, and then push to master with all dependencies updated.

@jarrett
Copy link
Owner

jarrett commented May 31, 2020

What's the rationale for the commit removing sanitization (Dahie@d19b450)?

@jarrett
Copy link
Owner

jarrett commented May 31, 2020

By the way, when I updated Treetop to the latest version (1.6.10), I started getting the following error upon compiling the .treetop file:

NoMethodError: undefined method `expected' for #<Treetop::Runtime::SyntaxNode:0x0000557d8de861c8>
    /usr/local/bundle/gems/treetop-1.6.9/lib/treetop/compiler/node_classes/parenthesized_expression.rb:9:in `expected'

After a quick look at parenthesized_expression.rb in the Treetop source, I tend to think this is a Treetop bug. Reverting Treetop to 1.5.3 fixed the problem. (I also tried all later versions, and all raise the same error.) So I'll keep Treetop pinned to 1.5.3 for now. Let me know if you have any further insight into this.

Beyond that, I updated all dependencies and pushed to master.

@Dahie
Copy link
Author

Dahie commented Jun 1, 2020

I ran into the same issue trying to update treetop and decided to stick with the old version. Good enough.

@Dahie
Copy link
Author

Dahie commented Jun 1, 2020

What's the rationale for the commit removing sanitization (Dahie@d19b450)?

You can ignore the other branches in my repo. They were some experiments, but not for merging.

@jarrett
Copy link
Owner

jarrett commented Jun 1, 2020 via email

@jarrett
Copy link
Owner

jarrett commented Jun 19, 2020

This is taking me longer than I expected, but I have a partly working implementation on the master branch. I pulled from your repo, did some refactoring, and made an attempt at solving the blockquote problem. But there are still some bugs with paragraphs and line breaks. I'll keep working on those.

@jarrett
Copy link
Owner

jarrett commented Jun 20, 2020

I think I've worked out the paragraph and line-break bugs. Let me know what you think. We may be close to releasing a new version with Markdown support.

@Dahie
Copy link
Author

Dahie commented Jun 20, 2020

Thank you looks very good! Only, the readme does not mention the markdown output feature and how to use it.

Feel free to close my PR when you release the new version :)

@jarrett
Copy link
Owner

jarrett commented Jun 20, 2020

The new version (1.1.0) is released. Thanks again for your help!

@jarrett jarrett closed this Jun 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants