中文 English 日本語

Umi-OCR

Usage Download Releases Changelog Issue a problem


Free, Open-source, Batch Offline OCR Software
Compatible with Windows 7 x64 and above

  • Free: All the code of this project is open-source and completely free.
  • Convenient: Unzip and run. Works offline, with no network connection required.
  • Efficient: Comes with a highly efficient offline OCR engine. As long as the computer performance is sufficient, it can be faster than online OCR services.
  • Flexible: Supports customizable interface, and supports multiple calling methods such as command-line and HTTP API.

1-标题-1.png

1-标题-2.png

Usage:

Using the Source Code:

Developers should read Build the Project before proceeding.

Download Releases:

Scoop Installer

Scoop is a command-line installer for Windows that facilitates the management of multiple applications. You can install Scoop first and then use the following commands to install Umi-OCR:

  • Add the extras bucket:
        scoop bucket add extras
  • Option 1: Install Umi-OCR with the Rapid-OCR engine (better compatibility):
        scoop install extras/umi-ocr
  • Option 2: Install Umi-OCR with the Paddle-OCR engine (slightly faster):
        scoop install extras/umi-ocr-paddle
  • Do not install both packages, because their shortcuts may overwrite each other. You can still import additional plugins to switch between OCR engines at any time.

Getting Started

The software release package is available in .7z compressed format or as a self-extracting .7z.exe package. The self-extracting package can be used to extract files on a computer without compression software installed.

This software does not require installation. After extraction, double-click Umi-OCR.exe to start the program.

If you encounter any problems, please submit an Issue and I will do my best to assist you.

Interface Language

Umi-OCR supports multiple interface languages. When you open the software for the first time, it automatically selects the language that matches your computer's system settings.

To switch languages manually, open Global Settings (全局设置) → 语言/Language, as shown in the following figure.

1-标题-1.png

Help us translate!

You can participate in translation work online on Weblate:

https://hosted.weblate.org/engage/umi-ocr/

Tabbed Interface

Umi-OCR v2 is organized as a set of tabs. Open whichever tabs you need.

The button at the top-left corner of the tab bar toggles always-on-top for the window. The button at the top-right corner locks the tabs so that they cannot be closed by accident during daily use.

Screenshot OCR

2-截图-1.png

Screenshot OCR: After opening this page, you can use a keyboard shortcut to capture a screenshot and recognize the text in the image.

  • The left-side image preview panel allows you to select and copy text with your mouse.
  • The right-side recognition record panel allows you to edit text and select and copy multiple records.
  • It also supports copying images from elsewhere and pasting them into Umi-OCR for recognition.

Paragraph Merge

2-截图-2.png

About OCR Text Post-Processing - Paragraph Merge: This feature can organize the layout and order of OCR results to make the text more suitable for reading and use. The preset schemes are:

  • Single line: Merge text on the same line, suitable for most scenarios.
  • Multiple lines - natural paragraphs: Intelligently recognize and merge text belonging to the same paragraph, suitable for most scenarios, as shown in the figure above.
  • Multiple lines - code block: Tries to restore the original indentation and spacing of the text. Suitable for recognizing code snippets or other cases where whitespace must be preserved.
  • Vertical layout: Suitable for vertically typeset text. Must be used together with a model library that supports vertical text recognition.
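To make the "Single line" scheme more concrete, here is a minimal Python sketch of merging text boxes that sit on the same visual line. It is an illustration only, not Umi-OCR's actual implementation: the `TextBox` type, its fields, and the "vertical centers within half a box height" rule are all assumptions chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class TextBox:
    """Hypothetical OCR result: a text fragment and its position in pixels."""
    text: str
    x: float       # left edge
    y: float       # top edge
    height: float  # box height


def center_y(box: TextBox) -> float:
    return box.y + box.height / 2


def merge_same_line(boxes: list[TextBox]) -> list[str]:
    """Group boxes into visual lines, then join each line left to right."""
    lines: list[list[TextBox]] = []
    for box in sorted(boxes, key=center_y):
        # Assumed rule: a box joins the current line when its vertical center
        # is within half the height of that line's first box.
        if lines and abs(center_y(box) - center_y(lines[-1][0])) <= lines[-1][0].height / 2:
            lines[-1].append(box)
        else:
            lines.append([box])
    return [" ".join(b.text for b in sorted(line, key=lambda b: b.x)) for line in lines]


boxes = [
    TextBox("world", x=60, y=11, height=10),
    TextBox("Hello", x=0, y=10, height=10),
    TextBox("Next", x=0, y=30, height=10),
]
print(merge_same_line(boxes))  # ['Hello world', 'Next']
```

The paragraph, code block, and vertical schemes would need further rules (line spacing, indentation, reading direction) that this sketch does not attempt.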

Batch OCR

3-批量-1.png

Batch OCR: This page supports batch importing local images for recognition.

  • The recognized content can be saved in various formats such as txt/jsonl/md/csv(Excel).
  • Supports text post-processing technology, which can recognize text belonging to the same natural paragraph and merge it. It also supports multiple processing schemes such as code blocks and vertical text.
  • There is no limit on the number of images that can be imported for processing at one time, and the software can automatically shut down or sleep after completing the task.

Ignore Regions

3-批量-2.png

About OCR Text Post-Processing - Ignore Regions: This is a special function in batch OCR that is used to exclude unwanted text in images.

  • The ignore region editor can be accessed in the right column of the batch recognition page settings.
  • As shown in the example above, there are multiple watermarks/logos at the top and in the bottom-right corner of the image. If such images are recognized in batches, the watermarks will interfere with the recognition results.
  • Hold down the right mouse button to draw multiple rectangular boxes. The text inside these areas will be ignored during the task.
  • Please try to draw the rectangular boxes larger, completely wrapping all possible positions of the watermark.
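The idea behind ignore regions can be sketched in a few lines of Python. This is a simplified illustration, not the project's actual code: the `Rect` type and the rule "drop a text box only when it lies entirely inside an ignore rectangle" are assumptions made for the example.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle in image pixels (hypothetical type)."""
    left: float
    top: float
    right: float
    bottom: float


def is_inside(inner: Rect, outer: Rect) -> bool:
    """True when `inner` lies completely within `outer`."""
    return (
        outer.left <= inner.left
        and outer.top <= inner.top
        and inner.right <= outer.right
        and inner.bottom <= outer.bottom
    )


def drop_ignored(results: list[tuple[str, Rect]], ignore_regions: list[Rect]) -> list[tuple[str, Rect]]:
    """Keep only the text boxes that are not fully covered by an ignore region."""
    return [
        (text, box)
        for text, box in results
        if not any(is_inside(box, region) for region in ignore_regions)
    ]


regions = [Rect(800, 0, 1000, 80)]  # drawn around a top-right watermark
results = [
    ("Chapter 1", Rect(100, 200, 400, 240)),
    ("WATERMARK", Rect(820, 20, 980, 60)),
]
print([text for text, _ in drop_ignored(results, regions)])  # ['Chapter 1']
```

Under this assumed containment rule, a watermark box that sticks out of the rectangle even slightly would be kept, which is consistent with the advice to draw the boxes generously.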

Batch Documents OCR


QR Code

4-二维码-1.png

Scan Code:

  • You can capture screenshots, paste, or drag local images to read QR codes and barcodes.
  • Supports multiple codes in one image.
  • Supports 19 protocols, as follows:

Aztec, Codabar, Code128, Code39, Code93, DataBar, DataBarExpanded, DataMatrix, EAN13, EAN8, ITF, LinearCodes, MatrixCodes, MaxiCode, MicroQRCode, PDF417, QRCode, UPCA, UPCE

4-二维码-2.png

Generate Code:

  • Enter text to generate a QR code image.
  • Supports 19 protocols and parameters such as error correction level.

Global Settings

5-全局设置-1.png

Global Settings: Here you can adjust the global parameters of the software. Common features include:

  • One-click to add shortcuts or set auto-startup.
  • Change the interface language. Umi supports Traditional Chinese, English, Japanese, and other languages.
  • Switch interface themes. Umi has multiple light/dark themes.
  • Adjust the font size and font of the interface text.
  • Switch OCR plugins.
  • Renderer: The software interface uses GPU-accelerated rendering by default. If you encounter screen flickering or UI misalignment on your machine, go to Interface and Appearance → Renderer and try a different rendering scheme, or turn off hardware acceleration.

API Usage:
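
As a rough illustration of calling the HTTP API from a script, the following Python sketch posts a base64-encoded image to a locally running Umi-OCR instance. The address, port, route, and JSON field name used here are assumptions, not details taken from this document. Check them against the project's API documentation before relying on them.

```python
import base64
import json
import urllib.request

# Assumption: local address and route of the HTTP service.
# Verify both in the project's API documentation and in your own settings.
OCR_URL = "http://127.0.0.1:1224/api/ocr"


def build_ocr_request(image_bytes: bytes, url: str = OCR_URL) -> urllib.request.Request:
    """Build a JSON POST request that carries the image as base64 text."""
    # Assumption: the service expects the image under a "base64" key.
    payload = {"base64": base64.b64encode(image_bytes).decode("ascii")}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def recognize(image_bytes: bytes) -> dict:
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(build_ocr_request(image_bytes), timeout=30) as resp:
        return json.load(resp)
```

With Umi-OCR running and its HTTP service enabled, a call might look like `recognize(open("page.png", "rb").read())`. The structure of the returned dict depends on the service, so inspect it before parsing specific fields.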

About Project Structure

Repositories:

Build the Project

Step 0: (Optional) Fork this project

Step 1: Download the code

Choose one of the following:

  • Clone your forked repository to your local machine
  • Download the zip source code package of this repository
  • Clone this repository

Next Steps:

Please go to the following repositories to complete the development/runtime environment deployment for the corresponding platform.

This project also has a very simple one-click packaging script, which can be found in the following repositories.

  • Windows
  • Cross-platform support is under development.