diff --git a/README.md b/README.md
index fb48d4d..8cb958b 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,178 @@
 # Daily Retrieval of the Latest YOLO-Related Papers from arXiv
 
 
+## ASMA: An Adaptive Safety Margin Algorithm for Vision\-Language Drone Navigation via Scene\-Aware Control Barrier Functions
+
+**Published**: 2024-09-16
+
+**Author**: Sourav Sanyal
+
+**Abstract**: In the rapidly evolving field of vision\-language navigation \(VLN\), ensuring
+robust safety mechanisms remains an open challenge. Control barrier functions
+\(CBFs\) are efficient tools which guarantee safety by solving an optimal control
+problem. In this work, we consider the case of a teleoperated drone in a VLN
+setting, and add safety features by formulating a novel scene\-aware CBF using
+ego\-centric observations obtained through an RGB\-D sensor. As a baseline, we
+implement a vision\-language understanding module which uses the contrastive
+language image pretraining \(CLIP\) model to query about a user\-specified \(in
+natural language\) landmark. Using the YOLO \(You Only Look Once\) object
+detector, the CLIP model is queried for verifying the cropped landmark,
+triggering downstream navigation. To improve navigation safety of the baseline,
+we propose ASMA \-\- an Adaptive Safety Margin Algorithm \-\- that crops the
+drone's depth map for tracking moving object\(s\) to perform scene\-aware CBF
+evaluation on\-the\-fly. By identifying potential risky observations from the
+scene, ASMA enables real\-time adaptation to unpredictable environmental
+conditions, ensuring optimal safety bounds on a VLN\-powered drone's actions.
+Using the Robot Operating System \(ROS\) middleware on a Parrot Bebop2 quadrotor
+in the Gazebo environment, ASMA offers a 59.4% \- 61.8% increase in success rates
+with insignificant 5.4% \- 8.2% increases in trajectory lengths compared to the
+baseline CBF\-less VLN while recovering from unsafe situations.
+
+
+**Code link**: No code link found in the abstract.
+
+**Paper link**: [Read more](http://arxiv.org/abs/2409.10283v1)
+
+---
+
+
+## Self\-Updating Vehicle Monitoring Framework Employing Distributed Acoustic Sensing towards Real\-World Settings
+
+**Published**: 2024-09-16
+
+**Author**: Xi Wang
+
+**Abstract**: The recent emergence of Distributed Acoustic Sensing \(DAS\) technology has
+facilitated the effective capture of traffic\-induced seismic data. The
+traffic\-induced seismic wave is a prominent contributor to urban vibrations and
+contains crucial information to advance urban exploration and governance.
+However, identifying vehicular movements within massive noisy data poses a
+significant challenge. In this study, we introduce a real\-time semi\-supervised
+vehicle monitoring framework tailored to urban settings. It requires only a
+small fraction of manual labels for initial training and exploits unlabeled
+data for model improvement. Additionally, the framework can autonomously adapt
+to newly collected unlabeled data. Before DAS data undergo object detection as
+two\-dimensional images to preserve spatial information, we leveraged
+comprehensive one\-dimensional signal preprocessing to mitigate noise.
+Furthermore, we propose a novel prior loss that incorporates the shapes of
+vehicular traces to track a single vehicle with varying speeds. To evaluate our
+model, we conducted experiments with seismic data from the Stanford 2 DAS
+Array. The results showed that our model outperformed the baseline model
+Efficient Teacher and its supervised counterpart, YOLO \(You Only Look Once\), in
+both accuracy and robustness. With only 35 labeled images, our model surpassed
+YOLO's mAP 0.5:0.95 criterion by 18% and showed a 7% increase over Efficient
+Teacher. We conducted comparative experiments with multiple update strategies
+for self\-updating and identified an optimal approach. This approach surpasses
+the performance of non\-overfitting training conducted with all data in a single
+pass.
+
+
+**Code link**: No code link found in the abstract.
+
+**Paper link**: [Read more](http://arxiv.org/abs/2409.10259v1)
+
+---
+
+
+## Tracking Virtual Meetings in the Wild: Re\-identification in Multi\-Participant Virtual Meetings
+
+**Published**: 2024-09-15
+
+**Author**: Oriel Perl
+
+**Abstract**: In recent years, workplaces and educational institutes have widely adopted
+virtual meeting platforms. This has led to a growing interest in analyzing and
+extracting insights from these meetings, which requires effective detection and
+tracking of unique individuals. In practice, there is no standardization in
+video meeting recording layouts or in how they are captured across different
+platforms and services. This, in turn, creates a challenge in acquiring this
+data stream and analyzing it in a uniform fashion. Our approach provides a
+solution to the most general form of video recording, usually consisting of a
+grid of participants from a single video source with
+no metadata on participant locations, while using the least amount of
+constraints and assumptions as to how the data was acquired. Conventional
+approaches often use YOLO models coupled with tracking algorithms, assuming
+linear motion trajectories akin to that observed in CCTV footage. However, such
+assumptions fall short in virtual meetings, where a participant's video feed window
+can abruptly change location across the grid. In an organic video meeting
+setting, participants frequently join and leave, leading to sudden, non\-linear
+movements on the video grid. This disrupts optical flow\-based tracking methods
+that depend on linear motion. Consequently, standard object detection and
+tracking methods might mistakenly assign multiple participants to the same
+tracker. In this paper, we introduce a novel approach to track and re\-identify
+participants in remote video meetings, by utilizing the spatio\-temporal priors
+arising from the data in our domain. This, in turn, increases tracking
+capabilities compared to the use of general object tracking. Our approach
+reduces the error rate by 95% on average compared to YOLO\-based tracking
+methods as a baseline.
+
+
+**Code link**: No code link found in the abstract.
+
+**Paper link**: [Read more](http://arxiv.org/abs/2409.09841v1)
+
+---
+
+
+## Stutter\-Solver: End\-to\-end Multi\-lingual Dysfluency Detection
+
+**Published**: 2024-09-15
+
+**Author**: Xuanru Zhou
+
+**Abstract**: Current de\-facto dysfluency modeling methods utilize template matching
+algorithms which are not generalizable to out\-of\-domain real\-world dysfluencies
+across languages, and are not scalable with increasing amounts of training
+data. To handle these problems, we propose Stutter\-Solver: an end\-to\-end
+framework that detects dysfluency with accurate type and time transcription,
+inspired by the YOLO object detection algorithm. Stutter\-Solver can handle
+co\-dysfluencies and is a natural multi\-lingual dysfluency detector. To leverage
+scalability and boost performance, we also introduce three novel dysfluency
+corpora: VCTK\-Pro, VCTK\-Art, and AISHELL3\-Pro, simulating natural spoken
+dysfluencies including repetition, block, missing, replacement, and
+prolongation through articulatory\-encodec and TTS\-based methods. Our approach
+achieves state\-of\-the\-art performance on all available dysfluency corpora. Code
+and datasets are open\-sourced at https://github.com/eureka235/Stutter\-Solver
+
+
+**Code link**: https://github.com/eureka235/Stutter-Solver
+
+**Paper link**: [Read more](http://arxiv.org/abs/2409.09621v1)
+
+---
+
+
+## Self\-Prompting Polyp Segmentation in Colonoscopy using Hybrid Yolo\-SAM 2 Model
+
+**Published**: 2024-09-14
+
+**Author**: Mobina Mansoori
+
+**Abstract**: Early diagnosis and treatment of polyps during colonoscopy are essential for
+reducing the incidence and mortality of Colorectal Cancer \(CRC\). However, the
+variability in polyp characteristics and the presence of artifacts in
+colonoscopy images and videos pose significant challenges for accurate and
+efficient polyp detection and segmentation. This paper presents a novel
+approach to polyp segmentation by integrating the Segment Anything Model \(SAM
+2\) with the YOLOv8 model. Our method leverages YOLOv8's bounding box
+predictions to autonomously generate input prompts for SAM 2, thereby reducing
+the need for manual annotations. We conducted exhaustive tests on five
+benchmark colonoscopy image datasets and two colonoscopy video datasets,
+demonstrating that our method exceeds state\-of\-the\-art models in both image and
+video segmentation tasks. Notably, our approach achieves high segmentation
+accuracy using only bounding box annotations, significantly reducing annotation
+time and effort. This advancement holds promise for enhancing the efficiency
+and scalability of polyp detection in clinical settings.
+Code is available at https://github.com/sajjad\-sh33/YOLO\_SAM2.
+
+
+**Code link**: https://github.com/sajjad-sh33/YOLO_SAM2
+
+**Paper link**: [Read more](http://arxiv.org/abs/2409.09484v1)
+
+---
+
+
 ## Breaking reCAPTCHAv2
 
 **Published**: 2024-09-13
@@ -63,17 +235,17 @@ creation of guitar tabs from video recordings.
 
 **Abstract**: Open\-vocabulary detection \(OVD\) aims to detect objects beyond a predefined
 set of categories. As a pioneering model incorporating the YOLO series into
-OVD, YOLO\-World is well\-suited for scenarios prioritizing speed and
-efficiency.However, its performance is hindered by its neck feature fusion
-mechanism, which causes the quadratic complexity and the limited guided
-receptive fields.To address these limitations, we present Mamba\-YOLO\-World, a
-novel YOLO\-based OVD model employing the proposed MambaFusion Path Aggregation
-Network \(MambaFusion\-PAN\) as its neck architecture. Specifically, we introduce
-an innovative State Space Model\-based feature fusion mechanism consisting of a
+OVD, YOLO\-World is well\-suited for scenarios prioritizing speed and efficiency.
+However, its performance is hindered by its neck feature fusion mechanism,
+which causes the quadratic complexity and the limited guided receptive fields.
+To address these limitations, we present Mamba\-YOLO\-World, a novel YOLO\-based
+OVD model employing the proposed MambaFusion Path Aggregation Network
+\(MambaFusion\-PAN\) as its neck architecture. Specifically, we introduce an
+innovative State Space Model\-based feature fusion mechanism consisting of a
 Parallel\-Guided Selective Scan algorithm and a Serial\-Guided Selective Scan
 algorithm with linear complexity and globally guided receptive fields. It
 leverages multi\-modal input sequences and mamba hidden states to guide the
-selective scanning process.Experiments demonstrate that our model outperforms
+selective scanning process. Experiments demonstrate that our model outperforms
 the original YOLO\-World on the COCO and LVIS benchmarks in both zero\-shot and
 fine\-tuning settings while maintaining comparable parameters and FLOPs.
 Additionally, it surpasses existing state\-of\-the\-art OVD methods with fewer
@@ -82,7 +254,7 @@ parameters and FLOPs.
 
 **Code link**: No code link found in the abstract.
 
-**Paper link**: [Read more](http://arxiv.org/abs/2409.08513v1)
+**Paper link**: [Read more](http://arxiv.org/abs/2409.08513v2)
 
 ---
 
@@ -150,189 +322,3 @@ advanced sensor fusion for improved navigation and collision avoidance.
 
 ---
 
-
-## A Semantic Segmentation Approach on Sweet Orange Leaf Diseases Detection Utilizing YOLO
-
-**Published**: 2024-09-10
-
-**Author**: Sabit Ahamed Preanto
-
-**Abstract**: This research introduces an advanced method for diagnosing diseases in sweet
-orange leaves by utilising advanced artificial intelligence models like YOLOv8.
-Due to their significance as a vital agricultural product, sweet oranges
-encounter significant threats from a variety of diseases that harmfully affect
-both their yield and quality. Conventional methods for disease detection
-primarily depend on manual inspection which is ineffective and frequently leads
-to errors, resulting in delayed treatment and increased financial losses. In
-response to this challenge, the research utilized YOLOv8, harnessing their
-proficiencies in detecting objects and analyzing images. YOLOv8 is recognized
-for its rapid and precise performance, while VIT is acknowledged for its
-detailed feature extraction abilities. Impressively, during both the training
-and validation stages, YOLOv8 exhibited a perfect accuracy of 80.4%, while VIT
-achieved an accuracy of 99.12%, showcasing their potential to transform disease
-detection in agriculture. The study comprehensively examined the practical
-challenges related to the implementation of AI technologies in agriculture,
-encompassing the computational demands and user accessibility, and offering
-viable solutions for broader usage. Moreover, it underscores the environmental
-considerations, particularly the potential for reduced pesticide usage, thereby
-promoting sustainable farming and environmental conservation. These findings
-provide encouraging insights into the application of AI in agriculture,
-suggesting a transition towards more effective, sustainable, and
-technologically advanced farming methods. This research not only highlights the
-efficacy of YOLOv8 within a specific agricultural domain but also lays the
-foundation for further studies that encompass a broader application in crop
-management and sustainable agricultural practices.
-
-
-**Code link**: No code link found in the abstract.
-
-**Paper link**: [Read more](http://arxiv.org/abs/2409.06671v1)
-
----
-
-
-## An Attribute\-Enriched Dataset and Auto\-Annotated Pipeline for Open Detection
-
-**Published**: 2024-09-10
-
-**Author**: Pengfei Qi
-
-**Abstract**: Detecting objects of interest through language often presents challenges,
-particularly with objects that are uncommon or complex to describe, due to
-perceptual discrepancies between automated models and human annotators. These
-challenges highlight the need for comprehensive datasets that go beyond
-standard object labels by incorporating detailed attribute descriptions. To
-address this need, we introduce the Objects365\-Attr dataset, an extension of
-the existing Objects365 dataset, distinguished by its attribute annotations.
-This dataset reduces inconsistencies in object detection by integrating a broad
-spectrum of attributes, including color, material, state, texture and tone. It
-contains an extensive collection of 5.6M object\-level attribute descriptions,
-meticulously annotated across 1.4M bounding boxes. Additionally, to validate
-the dataset's effectiveness, we conduct a rigorous evaluation of YOLO\-World at
-different scales, measuring their detection performance and demonstrating the
-dataset's contribution to advancing object detection.
-
-
-**Code link**: No code link found in the abstract.
-
-**Paper link**: [Read more](http://arxiv.org/abs/2409.06300v1)
-
----
-
-
-## ALSS\-YOLO: An Adaptive Lightweight Channel Split and Shuffling Network for TIR Wildlife Detection in UAV Imagery
-
-**Published**: 2024-09-10
-
-**Author**: Ang He
-
-**Abstract**: Unmanned aerial vehicles \(UAVs\) equipped with thermal infrared \(TIR\) cameras
-play a crucial role in combating nocturnal wildlife poaching. However, TIR
-images often face challenges such as jitter, and wildlife overlap,
-necessitating UAVs to possess the capability to identify blurred and
-overlapping small targets. Current traditional lightweight networks deployed on
-UAVs struggle to extract features from blurry small targets. To address this
-issue, we developed ALSS\-YOLO, an efficient and lightweight detector optimized
-for TIR aerial images. Firstly, we propose a novel Adaptive Lightweight Channel
-Split and Shuffling \(ALSS\) module. This module employs an adaptive channel
-split strategy to optimize feature extraction and integrates a channel
-shuffling mechanism to enhance information exchange between channels. This
-improves the extraction of blurry features, crucial for handling jitter\-induced
-blur and overlapping targets. Secondly, we developed a Lightweight Coordinate
-Attention \(LCA\) module that employs adaptive pooling and grouped convolution to
-integrate feature information across dimensions. This module ensures
-lightweight operation while maintaining high detection precision and robustness
-against jitter and target overlap. Additionally, we developed a single\-channel
-focus module to aggregate the width and height information of each channel into
-four\-dimensional channel fusion, which improves the feature representation
-efficiency of infrared images. Finally, we modify the localization loss
-function to emphasize the loss value associated with small objects to improve
-localization accuracy. Extensive experiments on the BIRDSAI and ISOD TIR UAV
-wildlife datasets show that ALSS\-YOLO achieves state\-of\-the\-art performance.
-Our code is openly available at
-https://github.com/helloworlder8/computer\_vision.
-
-
-**Code link**: https://github.com/helloworlder8/computer_vision
-
-**Paper link**: [Read more](http://arxiv.org/abs/2409.06259v2)
-
----
-
-
-## BFA\-YOLO: Balanced multiscale object detection network for multi\-view building facade attachments detection
-
-**Published**: 2024-09-06
-
-**Author**: Yangguang Chen
-
-**Abstract**: Detection of building facade attachments such as doors, windows, balconies,
-air conditioner units, billboards, and glass curtain walls plays a pivotal role
-in numerous applications. Building facade attachments detection aids in
-building information modeling \(BIM\) construction and meeting Level of Detail 3
-\(LOD3\) standards. Yet, it faces challenges like uneven object distribution,
-small object detection difficulty, and background interference. To counter
-these, we propose BFA\-YOLO, a model for detecting facade attachments in
-multi\-view images. BFA\-YOLO incorporates three novel innovations: the Feature
-Balanced Spindle Module \(FBSM\) for addressing uneven distribution, the Target
-Dynamic Alignment Task Detection Head \(TDATH\) aimed at improving small object
-detection, and the Position Memory Enhanced Self\-Attention Mechanism \(PMESA\) to
-combat background interference, with each component specifically designed to
-solve its corresponding challenge. Detection efficacy of deep network models
-deeply depends on the dataset's characteristics. Existing open source datasets
-related to building facades are limited by their single perspective, small
-image pool, and incomplete category coverage. We propose a novel method for
-building facade attachments detection dataset construction and construct the
-BFA\-3D dataset for facade attachments detection. The BFA\-3D dataset features
-multi\-view, accurate labels, diverse categories, and detailed classification.
-BFA\-YOLO surpasses YOLOv8 by 1.8% and 2.9% in mAP@0.5 on the multi\-view BFA\-3D
-and street\-view Facade\-WHU datasets, respectively. These results underscore
-BFA\-YOLO's superior performance in detecting facade attachments.
-
-
-**Code link**: No code link found in the abstract.
-
-**Paper link**: [Read more](http://arxiv.org/abs/2409.04025v1)
-
----
-
-
-## YOLO\-CL cluster detection in the Rubin/LSST DC2 simulation
-
-**Published**: 2024-09-05
-
-**Author**: Kirill Grishin
-
-**Abstract**: LSST will provide galaxy cluster catalogs up to z$\\sim$1 that can be used to
-constrain cosmological models once their selection function is well\-understood.
-We have applied the deep convolutional network YOLO for CLuster detection
-\(YOLO\-CL\) to LSST simulations from the Dark Energy Science Collaboration Data
-Challenge 2 \(DC2\), and characterized the LSST YOLO\-CL cluster selection
-function. We have trained and validated the network on images from a hybrid
-sample of \(1\) clusters observed in the Sloan Digital Sky Survey and detected
-with the red\-sequence Matched\-filter Probabilistic Percolation, and \(2\)
-simulated DC2 dark matter haloes with masses $M\_\{200c\} > 10^\{14\} M\_\{\\odot\}$. We
-quantify the completeness and purity of the YOLO\-CL cluster catalog with
-respect to DC2 haloes with $M\_\{200c\} > 10^\{14\} M\_\{\\odot\}$. The YOLO\-CL cluster
-catalog is 100% and 94% complete for halo mass $M\_\{200c\} > 10^\{14.6\} M\_\{\\odot\}$
-at $0.2<z<0.8$, and $M\_\{200c\} > 10^\{14\} M\_\{\\odot\}$ and redshift $z \\lesssim 1$,
-respectively, with only 6% false positive detections. All the false positive
-detections are dark matter haloes with $ 10^\{13.4\} M\_\{\\odot\} \\lesssim M\_\{200c\}
-\\lesssim 10^\{14\} M\_\{\\odot\}$. The YOLO\-CL selection function is almost flat with
-respect to the halo mass at $0.2 \\lesssim z \\lesssim 0.9$. The overall
-performance of YOLO\-CL is comparable to or better than other cluster detection
-methods used for current and future optical and infrared surveys. YOLO\-CL shows
-better completeness for low mass clusters when compared to current detections
-in surveys using the Sunyaev Zel'dovich effect, and detects clusters at higher
-redshifts than X\-ray\-based catalogs. The strong advantage of YOLO\-CL over
-traditional galaxy cluster detection techniques is that it works directly on
-images and does not require photometric and photometric redshift catalogs, nor
-does it need to mask stellar sources and artifacts.
-
-
-**Code link**: No code link found in the abstract.
-
-**Paper link**: [Read more](http://arxiv.org/abs/2409.03333v1)
-
----
-