From 8cb932ea606cfb1e468a951edf3befc93f9dbc01 Mon Sep 17 00:00:00 2001
From: Kearney <59185302+BackMountainDevil@users.noreply.github.com>
Date: Thu, 16 Apr 2020 15:38:06 +0800
Subject: [PATCH 1/6] Add command-line installation of Numpy, whl, and OpenCV
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 doc/settingup.md | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/doc/settingup.md b/doc/settingup.md
index d9e4e1c5..5437249c 100644
--- a/doc/settingup.md
+++ b/doc/settingup.md
@@ -7,7 +7,7 @@
 
 > System environment: Windows 10 + Python 3.6 + OpenCV 3.4.1
 
-### 1. Install Python ###
+### 1. Install Python and pip ###
 
 Installing Python is already covered in the [Python self-study notes](https://github.com/vipstone/python) project, so it is not repeated here; if you need it, see [Python environment setup](https://github.com/vipstone/python/blob/master/%E5%BC%80%E5%8F%91%E7%8E%AF%E5%A2%83%E6%90%AD%E5%BB%BA.md)
 
@@ -15,12 +15,14 @@ Installing Python is already covered in the [Python self-study notes](https://github.com/vipstone/python)
 Following the steps above, we now have Python and pip (the tool for installing and managing Python packages) installed correctly. Before installing OpenCV itself, we first need to install the numpy module.
 
 numpy: a language extension that defines numeric array and matrix types and their basic operations. OpenCV depends on the numpy module, so numpy must be installed before OpenCV.
+`pip install numpy`
 
 This article installs Python modules from .whl files.
 
 **What is a whl file?**
 
 A whl is a Python package archive that contains py files and compiled pyd files.
+`pip install wheel`
 
 **whl install command**
 > pip3 install storage-path\xxx.whl
@@ -50,7 +52,9 @@ Successfully installed numpy-1.14.2+mkl
 ### 3. Install OpenCV ###
 
 Installing the OpenCV module works much like installing numpy.
-
+Command-line install:
+`pip install opencv-python`
+Manual install:
 Step 1: First download the .whl package for OpenCV from https://www.lfd.uci.edu/~gohlke/pythonlibs/#opencv
 The version I downloaded is opencv_python‑3.4.1‑cp36‑cp36m‑win_amd64.whl, the build for Python 3.6 on 64-bit systems, saved to the root of drive D.
 Baidu Cloud link: https://pan.baidu.com/s/10RefansrC4_0zsNehjyKTg
@@ -79,7 +83,7 @@ import cv2
 
 print(cv2.__version__)
 
-# Output: 3.4.1
+# Output: 4.2.0
 ```
 
 Above we simply printed the OpenCV version number; if it prints without errors, the OpenCV Python environment is set up correctly.
@@ -87,7 +91,7 @@ print(cv2.__version__)
 ``` python
 import cv2
 
-filepath = "img/meinv.png"
+filepath = "img/meinv.png" # Replace with the path to an image on your computer; the path must not contain Chinese characters, and be sure to convert backslashes to forward slashes
 img = cv2.imread(filepath)
 cv2.namedWindow('Image')
 cv2.imshow('Image', img)

From acb0fb5fb6054608602707a12e89e54ca59c1d7e Mon Sep 17 00:00:00 2001
From: Kearney <59185302+BackMountainDevil@users.noreply.github.com>
Date: Thu, 16 Apr 2020 16:04:06 +0800
Subject: [PATCH 2/6] Update tesseractOCR.md

---
 doc/tesseractOCR.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/tesseractOCR.md b/doc/tesseractOCR.md
index 4ff3a283..0179a657 100644
--- a/doc/tesseractOCR.md
+++ b/doc/tesseractOCR.md
@@ -22,8 +22,8 @@ The Tesseract OCR engine was first developed by HP Labs starting in 1985, and by 1995
 Use these commands to check the version number and supported languages:
 
 >cd C:\Users\Administrator\AppData\Local\Tesseract-OCR
->tesseract -v
->tesseract --list-langs  #list the languages Tesseract-OCR supports
+>tesseract -v #show the Tesseract-OCR version
+>tesseract --list-langs  #list the languages Tesseract-OCR supports; the language used later must be one of these
 
 
 3. Configure the tesseract executable
@@ -31,7 +31,7 @@ C:\Python36\Lib\site-packages\pytesseract\pytesseract.py
 In the file, find:
 >tesseract_cmd = 'tesseract'
 
-Change it to:
+Change it to (adjust to match your installation directory):
 >tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract.exe'
 
 4. Recognition in code

From 01cd35c9aa4465b1345644421d24f53f0bbba7b1 Mon Sep 17 00:00:00 2001
From: Kearney <59185302+BackMountainDevil@users.noreply.github.com>
Date: Thu, 16 Apr 2020 16:20:41 +0800
Subject: [PATCH 3/6] Add a fix for the installer failing to install the Chinese language data
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 doc/tesseractOCR.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/doc/tesseractOCR.md b/doc/tesseractOCR.md
index 0179a657..7b27cc28 100644
--- a/doc/tesseractOCR.md
+++ b/doc/tesseractOCR.md
@@ -48,3 +48,11 @@ print(text)
 ```
 
 As an excellent OCR library, tesseract can of course be trained on your own data models to suit your needs; a later article will cover how to train your own text-recognition library.
+
+2020.4.16
+Selecting Simplified Chinese + Traditional Chinese in the Windows installer failed at the end of installation; only the eng and osd languages were installed.
+Fix:
+https://github.com/tesseract-ocr/tessdoc/blob/master/Data-Files.md
+1. Download the language archive from the URL above
+2. Extract the archive into the matching directory, e.g. C:\Program Files\Tesseract-OCR\tessdata
+Reference: https://github.com/UB-Mannheim/tesseract/wiki/Install-additional-language-and-script-models

From 9fbff8666d0bbe9455392ea3c4ad31fac76afdf0 Mon Sep 17 00:00:00 2001
From: Kearney <59185302+BackMountainDevil@users.noreply.github.com>
Date: Thu, 16 Apr 2020 16:36:53 +0800
Subject: [PATCH 4/6] Update tesseractOCR.md

---
 doc/tesseractOCR.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/tesseractOCR.md b/doc/tesseractOCR.md
index 7b27cc28..02f27090 100644
--- a/doc/tesseractOCR.md
+++ b/doc/tesseractOCR.md
@@ -22,8 +22,8 @@ The Tesseract OCR engine was first developed by HP Labs starting in 1985, and by 1995
 Use these commands to check the version number and supported languages:
 
 >cd C:\Users\Administrator\AppData\Local\Tesseract-OCR
->tesseract -v #show the Tesseract-OCR version
->tesseract --list-langs  #list the languages Tesseract-OCR supports; the language used later must be one of these
+>tesseract -v
+>tesseract --list-langs  #show the Tesseract-OCR version and supported languages; the language used later must be one of these
 
 
 3. Configure the tesseract executable

From 7c5a11cb963eba2b5295d9b8ddad61aeaced99fe Mon Sep 17 00:00:00 2001
From: Kearney <59185302+BackMountainDevil@users.noreply.github.com>
Date: Thu, 16 Apr 2020 17:06:16 +0800
Subject: [PATCH 5/6] Add a fix for Chinese language pack installation failure
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 doc/tesseractOCR.md | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/doc/tesseractOCR.md b/doc/tesseractOCR.md
index 02f27090..a35cf287 100644
--- a/doc/tesseractOCR.md
+++ b/doc/tesseractOCR.md
@@ -51,8 +51,4 @@ print(text)
 
 2020.4.16
 Selecting Simplified Chinese + Traditional Chinese in the Windows installer failed at the end of installation; only the eng and osd languages were installed.
-Fix:
-https://github.com/tesseract-ocr/tessdoc/blob/master/Data-Files.md
-1. Download the language archive from the URL above
-2. Extract the archive into the matching directory, e.g. C:\Program Files\Tesseract-OCR\tessdata
-Reference: https://github.com/UB-Mannheim/tesseract/wiki/Install-additional-language-and-script-models
+For the fix, see: https://blog.csdn.net/weixin_43031092/article/details/105561486

From 3681e9ae3207f732a1b197a0b053b1e7ac679cce Mon Sep 17 00:00:00 2001
From: Kearney <59185302+BackMountainDevil@users.noreply.github.com>
Date: Thu, 16 Apr 2020 20:16:14 +0800
Subject: [PATCH 6/6] Package the latest Chinese language pack and add it to the docs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 doc/tesseractOCR.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/tesseractOCR.md b/doc/tesseractOCR.md
index a35cf287..194587c1 100644
--- a/doc/tesseractOCR.md
+++ b/doc/tesseractOCR.md
@@ -52,3 +52,4 @@ print(text)
 2020.4.16
 Selecting Simplified Chinese + Traditional Chinese in the Windows installer failed at the end of installation; only the eng and osd languages were installed.
 For the fix, see: https://blog.csdn.net/weixin_43031092/article/details/105561486
+Latest packaged Chinese language pack: https://pan.baidu.com/s/11vlNct2oxO_ATfsBhyGv8Q extraction code: fi33
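
The language-pack fix in this series comes down to having the right data files in the `tessdata` directory. A minimal sketch of a check for that, assuming each language is stored as a `<lang>.traineddata` file there: the helper name `missing_languages` is hypothetical, the directory in the usage comment is only the example path from the series, and `chi_sim` / `chi_tra` are the Tesseract codes for Simplified and Traditional Chinese.

```python
from pathlib import Path


def missing_languages(tessdata_dir, wanted):
    """Return the codes in `wanted` that have no <code>.traineddata file.

    Hypothetical helper for illustration; it only inspects file names and
    does not call tesseract itself.
    """
    tessdata = Path(tessdata_dir)
    if not tessdata.is_dir():
        # Nothing is installed if the directory itself is absent.
        return list(wanted)
    installed = {p.stem for p in tessdata.glob("*.traineddata")}
    return [lang for lang in wanted if lang not in installed]


# Usage (path is an assumption; adjust to your installation directory):
# missing_languages(r"C:\Program Files\Tesseract-OCR\tessdata",
#                   ["eng", "osd", "chi_sim", "chi_tra"])
```

An empty list means every requested language file is present; `tesseract --list-langs` remains the authoritative check of what the engine can actually load.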