Extending to manhwa #2

levavft · 2023-04-24T10:22:32Z

Hi~ I was about to embark on creating something similar for manhwa, and then I found this very nice project.
Backend-wise it should be really easy to extend this, for example using pytesseract.
I would love to create anything you need for the backend. I've created a local version of manga-ocr that uses pytesseract and it's simple enough; I just don't really know how to embed it into your project, as it is a bit more involved.

Using pytesseract, it should also be easy to extend to other languages. Of course, it's not as good as manga-ocr's OCR (I couldn't find good databases that I could use to copy their approach), but setting pytesseract as the default when there isn't anything better should be great :)
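A minimal sketch of what "pytesseract as a default when there isn't anything better" could look like. The names `pick_backend`, `tesseract_ocr`, and the `specialised` registry are hypothetical and not part of this project; the only real library call is `pytesseract.image_to_string`, which needs the Tesseract binary and the matching language data (for example `kor`) installed.

```python
from typing import Callable, Dict

# An OCR backend takes an image (e.g. a PIL image) and returns text.
OcrFn = Callable[[object], str]


def tesseract_ocr(image, lang: str = "kor") -> str:
    """Generic fallback backend based on Tesseract."""
    # Imported lazily so the rest of the module works without pytesseract.
    import pytesseract

    return pytesseract.image_to_string(image, lang=lang)


def pick_backend(lang: str, specialised: Dict[str, OcrFn]) -> OcrFn:
    """Return the specialised backend for `lang` if one is registered
    (hypothetical registry), otherwise fall back to Tesseract."""
    if lang in specialised:
        return specialised[lang]
    return lambda image: tesseract_ocr(image, lang=lang)
```

With a registry such as `{"jpn": manga_ocr_backend}`, Japanese pages would keep using the better model while every other Tesseract language code falls through to the generic backend.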

so, basically, tell me how to contribute and I'll have a pull request ready in no time ;p

rDarge · 2023-05-25T16:20:50Z

I've been thinking about trying to add additional backend options too - like adding an option to translate with DeepL instead of ChatGPT. If you're not sure how to contribute, but you have a stable fork of manga-ocr that uses pytesseract, can you upload it to a public repo? I'd be happy to take a look and provide some suggestions/support.

levavft · 2023-05-26T07:23:33Z

Alright, I'll make my version a bit more stable and clean, and upload it ^^ should take a few days at most.
I think it might be a good idea to also add a google cloud ocr option for those who have a google cloud key, I'll see what I can do ;p

rDarge · 2023-06-01T20:44:31Z

@levavft Just wanted to check in on this - Have you made some headway in your pytesseract fork?

K-RT-Dev · 2023-06-02T14:15:58Z

Thank you very much for your enthusiasm in contributing to the project :)

Unfortunately, I have had very little time to work on this side project. But I can tell you that in the next version, I will add:

Support for DeepL (using an API Key) and Google (without an API Key) as translators
Options to perform the same translation in multiple translation engines simultaneously
Option for GPT to take translations generated by different engines and combine them into an improved one
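As a rough illustration of the last two items, here is a sketch that runs several engines on the same text in parallel and then builds a prompt asking GPT to merge the results. Everything in it is an assumption for illustration: `translate_with_all`, `build_merge_prompt`, and the engine callables are hypothetical, and no real DeepL, Google, or OpenAI API is called.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# A translator takes source text and returns the translated text.
Translator = Callable[[str], str]


def translate_with_all(text: str, engines: Dict[str, Translator]) -> Dict[str, str]:
    """Run every engine on the same text in parallel; keep the ones that succeed."""
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(engines))) as pool:
        futures = {name: pool.submit(fn, text) for name, fn in engines.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception:
                # A failing engine (bad key, network error) should not block the rest.
                continue
    return results


def build_merge_prompt(source: str, candidates: Dict[str, str]) -> str:
    """Build the text that would be sent to GPT to combine the candidates."""
    lines = ["Original text:", source, "", "Candidate translations:"]
    lines += [f"- {name}: {translation}" for name, translation in candidates.items()]
    lines += ["", "Combine these into a single improved English translation."]
    return "\n".join(lines)
```

Threads are used here because translation calls are network-bound; the wording of the merge prompt is only a placeholder.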

@rDarge The improvement to incorporate DeepL is almost ready. If you haven't started development yet, don't waste time on that.

@levavft A while ago, I had another project similar to this one (which I closed) that used pytesseract. I ended up abandoning it because creating an installable version with a moderately small size proved very difficult, close to impossible. It would be very interesting, and a great contribution, if you manage to produce a Python installable that has pytesseract as a dependency.

rDarge · 2023-06-02T20:02:10Z

@K-RT-Dev Great! I'll create some additional issues for the other changes I've been working on so I can make sure we've got alignment before I put up a PR

levavft · 2023-06-03T09:12:12Z

Hey @K-RT-Dev and @rDarge, sorry for the delayed response. I've been testing what I have against less clean text, and it's awful. (My personal use case is pretty clean text.) Specifically, Korean manhwa text often has bubbly letters, which pytesseract simply can't read. So, to be honest, I feel like spending more time on tesseract might be a waste of time. Instead, it might be good to use Google OCR, especially since you essentially get to use it for free if you're not planning on making money from it. I have no idea if it does better on such text (I haven't tested it at all), but at the very least it should be much easier to use and install.

The things you're currently working on sound great! I can't wait to see them in action.
I'll list some ideas that I had while playing with my tesseract version, and if something catches your eye I might spend some time on it (though, like you I seem to have somewhat reduced capacity for side projects ><)

Adding an option to view a page and create a list of bounding boxes to it, similar to this:
https://github.com/manisandro/gImageReader

Assuming you like the previous idea, you can use ChatGPT on the text as a whole, which should allow the translation to be much more natural (especially if you specifically ask ChatGPT to make it sound natural).

Using a spell checker to rate the quality of different OCR results (from different engines / with different pre-processing steps) and choosing the best one.
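A minimal sketch of the spell-checker idea, assuming a plain set of known words stands in for a real spell checker. `dictionary_score` and `best_ocr_result` are hypothetical names; the score is simply the fraction of OCR'd words found in the vocabulary.

```python
import re
from typing import Dict, Set, Tuple


def dictionary_score(text: str, vocabulary: Set[str]) -> float:
    """Fraction of words in `text` that appear in `vocabulary` (0.0 if no words)."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return 0.0
    return sum(word in vocabulary for word in words) / len(words)


def best_ocr_result(candidates: Dict[str, str], vocabulary: Set[str]) -> Tuple[str, str]:
    """Pick the candidate (engine or pre-processing variant) with the highest score."""
    name = max(candidates, key=lambda n: dictionary_score(candidates[n], vocabulary))
    return name, candidates[name]
```

Whole-word lookup is crude for Korean, where words carry attached particles and endings, so a real spell checker or morphological analyser would probably be needed in place of the vocabulary set.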

Anyhow, keep us updated, and I'll post an update if I have something worth sharing ^^

Kromtar · 2023-06-03T14:49:54Z

@levavft Could you share the set of images you're using to test text extraction? I have some models I could try to see their performance.

> Adding an option to view a page and create a list of bounding boxes to it, similar to this: https://github.com/manisandro/gImageReader

Our ideas are aligned: this is precisely the second mode of operation I am planning to integrate into the system.
My idea is as follows:

From a page, be able to manually select the order of the texts.
Have an option to write free text to describe what is happening in the accompanying images.
Send the corresponding text sequence to GPT and add the personalized context if it is present.

I have conducted manual tests using this method, and the results are incredible. When GPT has the complete text from one or more narrators, it can infer dialogue exchanges much better. Additionally, if it has context describing what is happening (for example, "People are talking while they see a landscape"), it helps in identifying pronouns and verb tenses more accurately.
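The three steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: `TextBox`, `build_page_prompt`, and the prompt wording are hypothetical, and the actual call to GPT is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TextBox:
    order: int  # reading order chosen manually by the user
    text: str   # OCR output for this box


def build_page_prompt(boxes: List[TextBox], context: Optional[str] = None) -> str:
    """Join the boxes in the user's order and prepend the optional scene context."""
    ordered = sorted(boxes, key=lambda box: box.order)
    lines = [
        "Translate the following dialogue into natural English.",
        "Keep one translated line per numbered source line.",
    ]
    if context:
        lines.append(f"Scene context: {context}")
    lines.append("")
    lines += [f"{i}. {box.text}" for i, box in enumerate(ordered, start=1)]
    return "\n".join(lines)
```

Numbering the source lines makes it possible to map each translated line back to its box once the response comes back.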

levavft · 2023-06-04T04:58:48Z

@Kromtar Sure! Here are the ones I had the most trouble with:
https://github.com/levavft/manhwa-ocr-test-files

Good to see people are on the same page, this could become a very nice tool for translators or just those who want to read manhwa ^^
