zj5559
  • Dalian University of Technology


Starred repositories


Deployment of the tracking model PromptVT.

Python · 8 stars · 1 fork · Updated Mar 25, 2024

DaD's a pretty good keypoint detector, probably the best.

Python · 63 stars · 2 forks · Updated Mar 21, 2025
Jupyter Notebook · 377 stars · 47 forks · Updated Dec 5, 2023

[ICLR 2025, Oral] EmbodiedSAM: Online Segment Any 3D Thing in Real Time

Python · 469 stars · 21 forks · Updated Apr 6, 2025
Python · 207 stars · 24 forks · Updated Mar 17, 2025

[AAAI-25 Oral] Official Implementation of "FLAME: Learning to Navigate with Multimodal LLM in Urban Environments"

Python · 44 stars · 3 forks · Updated Feb 21, 2025

Official code and checkpoint release for mobile robot foundation models: GNM, ViNT, and NoMaD.

Python · 797 stars · 106 forks · Updated Sep 15, 2024

✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python · 2,225 stars · 165 forks · Updated Mar 28, 2025
Python · 18 stars · Updated Jun 24, 2024

LoRAT_pytracking: reproduction of [ECCV2024] LoRAT

Python · 40 stars · 2 forks · Updated Dec 9, 2024

EVE Series: Encoder-Free Vision-Language Models from BAAI

Python · 320 stars · 8 forks · Updated Mar 1, 2025

Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"

Python · 6,696 stars · 435 forks · Updated Mar 18, 2025

The official implementation for the CVPR 2023 paper Joint Visual Grounding and Tracking with Natural Language Specification.

Python · 66 stars · 3 forks · Updated Jun 3, 2023

Florence-2

Jupyter Notebook · 63 stars · 12 forks · Updated Feb 13, 2025

[TPAMI 2024] Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding

Python · 24 stars · Updated Sep 11, 2024

Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)

Python · 1,935 stars · 218 forks · Updated May 20, 2024

A comprehensive list of papers using large language/multimodal models for robotics/RL, with code and related websites

3,594 stars · 285 forks · Updated Mar 25, 2025

[Embodied-AI-Survey-2024] Paper list and projects for Embodied AI

1,367 stars · 91 forks · Updated Mar 17, 2025

[Lumina Embodied AI Community] Embodied AI Technical Guide (Embodied-AI-Guide)

4,392 stars · 272 forks · Updated Apr 11, 2025

A curated list of visual reinforcement learning resources

243 stars · 11 forks · Updated Feb 12, 2025

An open source implementation of CLIP.

Python · 11,554 stars · 1,089 forks · Updated Apr 5, 2025
Jupyter Notebook · 976 stars · 158 forks · Updated Mar 3, 2025

Artificial Intelligence Research for Science (AIRS)

Python · 609 stars · 69 forks · Updated Mar 20, 2025

List the AI for Science papers accepted by top conferences

Jupyter Notebook · 109 stars · 13 forks · Updated Sep 14, 2024

SeqTrackv2: Unified Sequence-to-Sequence Learning for Single- and Multi-Modal Visual Object Tracking

Python · 68 stars · 6 forks · Updated Mar 26, 2024

The official Python toolkit for running experiments and evaluating performance on the VideoCube benchmark (TPAMI 2023)

Python · 30 stars · 6 forks · Updated Apr 1, 2024

[CVPR 24] The repository provides code for running inference and training for "Segment and Caption Anything" (SCA), links for downloading the trained model checkpoints, and example notebooks / gra…

Python · 219 stars · 10 forks · Updated Sep 30, 2024

[CVPRW’24 Best Paper Honorable Mention Award] DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM

7 stars · Updated Oct 7, 2024

[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"

Python · 816 stars · 43 forks · Updated Feb 27, 2025

[ICCV'23] CiteTracker: Correlating Image and Text for Visual Tracking

Python · 40 stars · 2 forks · Updated Jun 20, 2024