🎯
Never, ever give up until the very end
NormXU/README.md

Hi there 👋🏻,

🔭 I am currently focused on developing multimodal models that natively generate and understand both text and images.

My specific areas of interest include:

  • Denoising

  • Auto-regressive Generation
  • Diffusion

PS: I have come to see that all diffusion and next-token generation tasks are inherently denoising tasks. (2024/05)


  • Document Understanding & Layout Analysis
  • Optical Character Recognition
  • Object Detection

PS: It looks like all of these tasks can be regarded as next-token generation tasks. (2023/12)


In addition to my current work, I have prior experience in robotics perception from my Master's studies. I hope my work is helpful to you.

Feel free to reach out if you have any questions or if there's anything I can assist with!

Pinned

  1. ERNIE-Layout-Pytorch (Public)

    An unofficial PyTorch implementation of ERNIE-Layout, which was originally released through PaddleNLP.

    Python · 93 stars · 10 forks

  2. Layout2Graph (Public)

    The official implementation of the paper "Paragraph2Graph: A Language-independent GNN-based framework for layout analysis"

    Python · 70 stars · 10 forks

  3. nougat-latex-ocr (Public)

    Codebase for fine-tuning and evaluating nougat-based image2latex generation models

    Python · 94 stars · 11 forks