Optimize the 'unclip' logic in DBPostProcess #181
Open: HiDolen wants to merge 6 commits into RapidAI:main from HiDolen:main
If the box passed to unclip is always a rectangle, your change should be fine.
I'll take a closer look at this after work. You are really thorough!
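The current processing flow of self.boxes_from_bitmap is as follows:
1. Use cv2.findContours() to extract contours from the binary map.
2. In self.get_mini_boxes(), use cv2.minAreaRect() to get the box center, width/height and angle from each contour, pass the result to cv2.boxPoints() to convert it into 4 corner points, then sort the points clockwise starting from the top-left corner.
3. Use self.box_score_fast() to score how well the box covers pred, and discard boxes whose score is below self.box_thresh.
4. Use self.unclip() to expand the box boundary.
5. Call self.get_mini_boxes() again to get the expanded box.
6. Map the box coordinates to the original image size, clamping the values to the image bounds along the way.
So it is certain that the box passed to unclip is already a rectangle with exactly 4 corner points. It cannot be a general polygon.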
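The last step in the list above (coordinate mapping and clamping) can be sketched with NumPy alone. This is only an illustration: the function name `rescale_and_clip` and its parameters are hypothetical, and the rounding and the inclusive upper bound are assumptions rather than a copy of RapidOCR's code.

```python
import numpy as np


def rescale_and_clip(box, bitmap_size, image_size):
    """Map a 4x2 box from bitmap coordinates to image coordinates and clamp it.

    Hypothetical helper: sizes are (width, height) tuples.
    """
    bitmap_w, bitmap_h = bitmap_size
    image_w, image_h = image_size
    out = np.asarray(box, dtype=np.float32).copy()
    # Scale x by the width ratio and y by the height ratio, then clamp to the image.
    out[:, 0] = np.clip(np.round(out[:, 0] / bitmap_w * image_w), 0, image_w)
    out[:, 1] = np.clip(np.round(out[:, 1] / bitmap_h * image_h), 0, image_h)
    return out.astype(np.int32)


# An expanded box that sticks out of a 320x160 bitmap on both sides.
box = np.array([[-4.0, 10.0], [330.0, 10.0], [330.0, 50.0], [-4.0, 50.0]])
mapped = rescale_and_clip(box, bitmap_size=(320, 160), image_size=(640, 320))
print(mapped.tolist())  # [[0, 20], [640, 20], [640, 100], [0, 100]]
```

The clamping matters because unclip can push corners outside the bitmap, as the example box does on the left and right.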
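I just found a fatal flaw in the modified code: I never tested tilted text. I have revised the code to handle that case:

    def unclip(self, box):
        area = cv2.contourArea(box)
        perimeter = cv2.arcLength(box, True)
        distance = area * self.unclip_ratio / perimeter
        # First version, only valid for axis-aligned boxes:
        # signs = np.array([[-1, -1], [1, -1], [1, 1], [-1, 1]])
        # expanded = box + distance * signs
        # Unit vector along each edge, from vertex i to vertex i + 1.
        unit_vectors = []
        for i in range(4):
            vector = box[(i + 1) % 4] - box[i]
            unit_vector = vector / np.linalg.norm(vector)
            unit_vectors.append(unit_vector)
        # Push each vertex outward along both of its adjacent edges.
        new_box = np.zeros_like(box)
        for i in range(4):
            new_box[i] = box[i] + unit_vectors[i - 1] * distance
            new_box[i] = new_box[i] - unit_vectors[i] * distance
        expanded = new_box
        return expanded.astype(np.float32)

With this version tilted text is detected correctly, but accuracy is worse than before the PR. Compared with the result before the change, after this second revision the two characters “番剧” are no longer recognized. I am closing this PR for now.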
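Why the first version breaks on tilted text can be seen in a small NumPy-only comparison. This is a sketch under assumptions: `unclip_signs` is reconstructed from the commented-out `signs` lines in the snippet above, `unclip_edges` mirrors the edge-vector loop in vectorized form, the area comes from the shoelace formula instead of `cv2.contourArea`, and all helper names are hypothetical.

```python
import numpy as np


def offset_distance(box, unclip_ratio=1.6):
    """area * unclip_ratio / perimeter, with the area from the shoelace formula."""
    x, y = box[:, 0], box[:, 1]
    area = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
    perimeter = np.linalg.norm(np.roll(box, -1, axis=0) - box, axis=1).sum()
    return area * unclip_ratio / perimeter


def unclip_signs(box, unclip_ratio=1.6):
    """Shift each corner diagonally along the image axes (assumes an upright box)."""
    signs = np.array([[-1, -1], [1, -1], [1, 1], [-1, 1]])
    return box + offset_distance(box, unclip_ratio) * signs


def unclip_edges(box, unclip_ratio=1.6):
    """Shift each corner outward along the directions of its two adjacent edges."""
    edges = np.roll(box, -1, axis=0) - box  # edges[i] = box[i + 1] - box[i]
    units = edges / np.linalg.norm(edges, axis=1, keepdims=True)
    distance = offset_distance(box, unclip_ratio)
    # np.roll(units, 1)[i] is units[i - 1], the edge arriving at vertex i.
    return box + distance * (np.roll(units, 1, axis=0) - units)


def max_corner_cosine(box):
    """Largest |cos| between adjacent edges; 0 means every corner is a right angle."""
    edges = np.roll(box, -1, axis=0) - box
    units = edges / np.linalg.norm(edges, axis=1, keepdims=True)
    return np.abs(np.sum(units * np.roll(units, -1, axis=0), axis=1)).max()


# A 100x20 rectangle rotated by 30 degrees.
theta = np.deg2rad(30.0)
rotation = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
upright = np.array([[0.0, 0.0], [100.0, 0.0], [100.0, 20.0], [0.0, 20.0]])
tilted = upright @ rotation.T

print(max_corner_cosine(unclip_edges(tilted)))  # close to 0: still a rectangle
print(max_corner_cosine(unclip_signs(tilted)))  # clearly non-zero: corners are skewed
```

On the upright box the two functions agree, which is consistent with the first version passing tests that contained no tilted text.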
As rigorous as ever.
I wrote a visualization that compares the two approaches.

    import cv2
    import numpy as np
    import pyclipper
    from shapely.geometry import Polygon
    import plotly.graph_objects as go

    unclip_ratio = 1.6


    def unclip_origin(box):
        """Current code: offset the polygon, then refit a minimum-area rectangle."""
        poly = Polygon(box)
        distance = poly.area * unclip_ratio / poly.length
        offset = pyclipper.PyclipperOffset()
        offset.AddPath(box, pyclipper.JT_ROUND, pyclipper.ET_CLOSEDPOLYGON)
        expanded = np.array(offset.Execute(distance))
        ##########################
        # Same logic as get_mini_boxes(): fit a rectangle, then order the
        # corners clockwise starting from the top-left one.
        bounding_box = cv2.minAreaRect(expanded)
        points = sorted(list(cv2.boxPoints(bounding_box)), key=lambda x: x[0])
        index_1, index_2, index_3, index_4 = 0, 1, 2, 3
        if points[1][1] > points[0][1]:
            index_1 = 0
            index_4 = 1
        else:
            index_1 = 1
            index_4 = 0
        if points[3][1] > points[2][1]:
            index_2 = 2
            index_3 = 3
        else:
            index_2 = 3
            index_3 = 2
        box = [points[index_1], points[index_2], points[index_3], points[index_4]]
        expanded = np.array(box)
        ##########################
        return expanded


    def unclip_2(box):
        """This PR: push each vertex outward along its two adjacent edges."""
        area = cv2.contourArea(box)
        perimeter = cv2.arcLength(box, True)
        distance = area * unclip_ratio / perimeter
        unit_vectors = []
        for i in range(4):
            vector = box[(i + 1) % 4] - box[i]
            unit_vector = vector / np.linalg.norm(vector)
            unit_vectors.append(unit_vector)
        new_box = np.zeros_like(box)
        for i in range(4):
            new_box[i] = box[i] + unit_vectors[i - 1] * distance
            new_box[i] = new_box[i] - unit_vectors[i] * distance
        expanded = new_box
        return expanded.astype(np.float32)


    def create_2d_trace(box, name, color):
        # Repeat the first vertex so the outline is drawn closed.
        box_closed = np.concatenate([box, box[0:1]], axis=0)
        trace = go.Scatter(x=box_closed[:, 0], y=box_closed[:, 1], mode='lines', name=name, line=dict(color=color))
        return trace


    def test_unclip_functions(box):
        # Compute the unclipped boxes
        unclipped_box_1 = unclip_origin(box)
        unclipped_box_2 = unclip_2(box)
        # Create a new figure
        fig = go.Figure()
        # Draw the original box and both expanded boxes
        fig.add_trace(create_2d_trace(box, 'Original', 'blue'))
        fig.add_trace(create_2d_trace(unclipped_box_1, 'before', 'red'))
        fig.add_trace(create_2d_trace(unclipped_box_2, 'after', 'green'))
        # Lock the aspect ratio so the boxes are not distorted
        fig.update_layout(
            autosize=False,
            xaxis=dict(
                scaleanchor='y',
                scaleratio=1,
            )
        )
        # Show the figure
        fig.show()


    box = np.array(
        [[834.6764, 613.2059], [871.58813, 646.35297], [864.7058, 646.82355], [827.79407, 624.67645]]
    ).astype(np.float32)
    test_unclip_functions(box)

In the resulting plot, "before" is the current code and "after" is the code from this PR. I think the modified unclip result is the one that matches intuition. That said, the change does alter recognition results in some cases. I would like to hear what the maintainers think.
Excellent. I will study this carefully when I have time. Thanks!
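Current problem
In detect_process.py, the self.boxes_from_bitmap() method of the DBPostProcess class calls self.unclip() to expand the boundary of each detected box. The problem is that self.unclip() does a lot of unnecessary work: all it needs to do is move the four vertices outward, yet the current code expands the box via a polygon offset and then calls self.get_mini_boxes() again to turn the polygon back into four vertices. That is a roundabout path.
Solution
self.get_mini_boxes() already sorts the box vertices so that the first one is the top-left vertex and the rest follow clockwise, so the vertex order of the box passed to self.unclip() is known. self.unclip() can therefore be rewritten as follows:
The call site then no longer needs self.get_mini_boxes():
I tested a few images myself and the change caused no problems. The results of the two versions can be considered equivalent.
This PR updates all four places where unclip is used.
Testing
Before the change: [image]
After the change: [image]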
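As a worked illustration of "moving the four vertices outward", assume an upright rectangle of width w and height h, with the offset distance computed as in the unclip snippets in this thread (area times unclip_ratio divided by perimeter). The symbols D, A, L, r and the sign pairs are notation introduced here, not taken from the PR.

```latex
D = \frac{A\,r}{L} = \frac{w h\, r}{2\,(w + h)}, \qquad
(x_i', y_i') = (x_i + s_{x,i} D,\; y_i + s_{y,i} D), \qquad
(s_{x,i}, s_{y,i}) \in \{(-1,-1),\, (1,-1),\, (1,1),\, (-1,1)\}

% Example: w = 100, h = 20, r = 1.6
D = \frac{100 \cdot 20 \cdot 1.6}{2\,(100 + 20)} = \frac{3200}{240} \approx 13.33
```

With these numbers the box grows from 100 x 20 to roughly 126.67 x 46.67, since each side moves outward by D. The sign pairs follow the top-left-first, clockwise vertex order in image coordinates (y pointing down).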