Add FAST #35476

Open
wants to merge 93 commits into
base: main

Conversation

jadechoghari
Contributor

@jadechoghari jadechoghari commented Jan 1, 2025

What does this PR do?

This PR adds FAST: Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation.

It should be merged after the PR for its backbone, TextNet, is merged: #34979

Colab to replicate the author's logits: https://colab.research.google.com/drive/1bdkNiRI2bl7rBcgGYXe2UeobX78TUGYY?usp=sharing

What's left:

  • Fix make quality failing due to a doc issue
  • Complete full model documentation

@jadechoghari
Contributor Author

everything matching!

@jadechoghari jadechoghari marked this pull request as ready for review February 14, 2025 12:48
@qubvel qubvel self-requested a review February 17, 2025 12:01
Member

@qubvel qubvel left a comment

Great work! Thanks for adding the model 🚀.

I did an initial review; please see the comments below. I didn't highlight everything, so please apply the recommendations throughout the code. Looking forward to getting this merged!

Comment on lines +19 to +30
## Overview

Fast model proposes an accurate and efficient scene text detection framework, termed FAST (i.e., faster
arbitrarily-shaped text detector).

FAST has two new designs. (1) We design a minimalist kernel representation (only has 1-channel output) to model text
with arbitrary shape, as well as a GPU-parallel post-processing to efficiently assemble text lines with a negligible
time overhead. (2) We search the network architecture tailored for text detection, leading to more powerful features
than most networks that are searched for image classification.

## FastConfig

Member

We would need a code snippet and a picture here

Contributor Author

will leave this till the end yeah!
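
Illustrative aside on the overview quoted above (not code from this PR, and not the Transformers API): the "1-channel kernel output plus post-processing" idea can be sketched in plain Python. The sketch assumes the post-processing amounts to thresholding the single-channel kernel map, labeling connected components, and then dilating each labeled kernel back toward the full text region. That assumption, the function names `label_kernels` and `dilate_labels`, the 0.5 threshold, and the 3x3 window are all hypothetical choices made for illustration; the real pipeline is described as GPU-parallel, whereas this runs on the CPU.

```python
from collections import deque


def label_kernels(kernel_map, threshold=0.5):
    """Label 4-connected components of a thresholded single-channel map.

    Hypothetical helper: returns (labels, count), where labels[y][x] is 0 for
    background and 1..count for each detected text kernel.
    """
    h, w = len(kernel_map), len(kernel_map[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if kernel_map[y][x] <= threshold or labels[y][x]:
                continue
            # Start a new component and flood-fill it with breadth-first search.
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (
                        0 <= ny < h
                        and 0 <= nx < w
                        and kernel_map[ny][nx] > threshold
                        and not labels[ny][nx]
                    ):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


def dilate_labels(labels, size=3):
    """Grow each labeled kernel with a size x size max filter.

    Where two grown kernels overlap, the larger label wins; this is a
    simplification chosen for the sketch.
    """
    h, w = len(labels), len(labels[0])
    r = size // 2
    return [
        [
            max(
                labels[ny][nx]
                for ny in range(max(0, y - r), min(h, y + r + 1))
                for nx in range(max(0, x - r), min(w, x + r + 1))
            )
            for x in range(w)
        ]
        for y in range(h)
    ]


# Toy 4x7 "kernel map" holding two thin vertical kernels (made-up values).
kernel_map = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.0, 0.0, 0.0, 0.8, 0.0],
    [0.0, 0.9, 0.0, 0.0, 0.0, 0.8, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
labels, count = label_kernels(kernel_map)
dilated = dilate_labels(labels, size=3)
print(count)       # 2
print(dilated[0])  # [1, 1, 1, 0, 2, 2, 2]
```

Each text instance keeps its own integer label after dilation, so the two kernels stay separate even though the model output has only one channel.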
