Update LiSSA-RATLR in documentation

dfuchss · web-flow · commit c920b567542a · 2025-02-17T15:55:36.000+01:00
diff --git a/docs/Home.md b/docs/Home.md
@@ -31,7 +31,7 @@ To get to know the project, please read the following pages:
   * [TLR](https://github.com/ArDoCo/TLR): Traceability Link Recovery (TLR) Modules
   * [StanfordCoreNLP-Provider-Service](https://github.com/ArDoCo/StanfordCoreNLP-Provider-Service): RESTful web service for text preprocessing
   * [InconsistencyDetection](https://github.com/ArDoCo/InconsistencyDetection): Inconsistency Detection (ID) Modules
-  * [LiSSA](https://github.com/ArDoCo/LiSSA): Linking Sketches and Software Architecture Modules
+  * [LiSSA-RATLR](https://github.com/ArDoCo/LiSSA-RATLR): LiSSA - A Framework for Generic Traceability Link Recovery
 * Testing and Evaluation
   * [IntegrationTests](https://github.com/ArDoCo/IntegrationTests): Integration Tests
   * [Benchmark](https://github.com/ArDoCo/Benchmark): Benchmarks
diff --git a/docs/LiSSA.md b/docs/LiSSA.md
@@ -1,27 +1,112 @@
-# Linking Sketches and Software Architecture (LiSSA)
-
-The LiSSA approach aims to connect sketches and informal diagrams (such as class diagrams, component diagrams, ...) with
-formal models like component models.
-
-The following diagram shows the pipeline that is planned for the LiSSA approach.
-
-```mermaid
-stateDiagram-v2
-    DiagramDetection
-    TextPreprocessing
-    ArchitectureModel
-    TextExtraction
-    EntityRecognition
-    RecommendationGeneration
-    ConnectionGeneration
-    InconsistencyDetection
-
-    DiagramDetection --> RecommendationGeneration
-    TextPreprocessing --> TextExtraction
-    ArchitectureModel --> RecommendationGeneration
-    TextExtraction --> EntityRecognition
-    DiagramDetection --> EntityRecognition
-    EntityRecognition --> RecommendationGeneration
-    RecommendationGeneration --> ConnectionGeneration
-    ConnectionGeneration --> InconsistencyDetection
-```
+# LiSSA: A Framework for Generic Traceability Link Recovery
+
+Welcome to the LiSSA project!
+This framework leverages Large Language Models (LLMs) enhanced through Retrieval-Augmented Generation (RAG) to establish traceability links across various software artifacts.
+
+## Overview
+
+In software development and maintenance, numerous artifacts such as requirements, code, and architecture documentation are produced.
+Understanding the relationships between these artifacts is crucial for tasks like impact analysis, consistency checking, and maintenance.
+LiSSA aims to provide a generic solution for Traceability Link Recovery (TLR) by utilizing LLMs in combination with RAG techniques.
+
+The concept and evaluation of LiSSA are detailed in our paper:
+
+> Fuchß, D., Hey, T., Keim, J., Liu, H., Ewald, N., Thirolf, T., & Koziolek, A. (2025). LiSSA: Toward Generic Traceability Link Recovery through Retrieval-Augmented Generation. In Proceedings of the IEEE/ACM 47th International Conference on Software Engineering, Ottawa, Canada.
+
+You can access the paper [here](https://ardoco.de/c/icse25).
+
+## Features
+
+- **Generic Applicability**: LiSSA is designed to recover traceability links across various types of software artifacts, including:
+  - [Requirements to code](https://ardoco.de/c/icse25)
+  - [Documentation to code](https://ardoco.de/c/icse25)
+  - [Architecture documentation to architecture models](https://ardoco.de/c/icse25)
+
+- **Retrieval-Augmented Generation**: By combining LLMs with RAG, LiSSA enhances the accuracy and relevance of the recovered traceability links.
+
+## Getting Started
+
+To get started with LiSSA, follow these steps:
+
+1. **Clone the Repository**:
+   ```bash
+   git clone https://github.com/ArDoCo/LiSSA-RATLR
+   cd LiSSA-RATLR
+   ```
+
+2. **Install Dependencies**:
+   Ensure you have Java JDK 21 or later installed. Then, build the project using Maven:
+   ```bash
+   mvn clean package
+   ```
+
+3. **Run LiSSA**:
+   Execute the main application:
+   ```bash
+   java -jar target/ratlr-*-jar-with-dependencies.jar eval -c config.json
+   ```
+
+### Configuration
+
+1. Create a configuration you want to use for evaluation / execution. E.g., you can find configurations [here](https://github.com/ArDoCo/ReplicationPackage-ICSE25_LiSSA-Toward-Generic-Traceability-Link-Recovery-through-RAG/tree/main/LiSSA-RATLR-V2/lissa/configs/req2code-significance). You can also provide a directory containing multiple configurations.
+2. Configure your OpenAI API key and organization in a `.env` file. You can use the provided template file as a template `env-template`.
+3. LiSSA caches requests in order to be reproducible. The cache is located in the cache folder that can be specified in the configuration.
+4. Run `java -jar target/ratlr-*-jar-with-dependencies.jar eval -c configs/....` to run the evaluation. You can provide a JSON or a directory containing JSON configurations.
+5. The results will be printed to the console and saved to a file in the current directory. The name is also printed to the console.
+
+### Results of Evaluation / Execution
+The results will be stored as markdown files.
+A result file can look like below.
+It contains the configuration and the results of the evaluation.
+Additionally, the LiSSA generate CSV files that contain the traceability links as pairs of identifiers.
+
+<details>
+<summary>Example Result</summary>
+
+```json
+## Configuration
+{
+  "cache_dir" : "./cache-r2c/dronology-dd--102959883",
+  "gold_standard_configuration" : {
+    "hasHeader" : false,
+    "path" : "./datasets/req2code/dronology-dd/answer.csv"
+  },
+  "... other configuration parameters ..."
+}
+
+## Stats
+* # TraceLinks (GS): 740
+* # Source Artifacts: 211
+* # Target Artifacts: 423
+## Results
+* True Positives: 283
+* False Positives: 1286
+* False Negatives: 457
+* Precision: 0.18036966220522627
+* Recall: 0.3824324324324324
+* F1: 0.24512776093546992
+```
+
+</details>
+
+## Evaluation
+
+LiSSA has been empirically evaluated on three different TLR tasks:
+
+- Requirements to code
+- Documentation to code
+- Architecture documentation to architecture models
+- Requirements to requirements
+
+The results indicate that the RAG-based approach can significantly outperform state-of-the-art methods in code-related tasks.
+However, further research is needed to enhance its performance for broader applicability.
+
+## Acknowledgments
+
+LiSSA is developed by researchers from the Modelling for Continuous Software Engineering (MCSE) group of KASTEL - Institute of Information Security and Dependability at the Karlsruhe Institute of Technology (KIT).
+
+For more information about the project and related research, visit our [website](https://ardoco.de/).
+
+---
+
+*Note: This README provides a brief overview of the LiSSA project. For comprehensive details, please refer to the [repository](https://github.com/ArDoCo/LiSSA-RATLR)*
diff --git a/docs/_Sidebar.md b/docs/_Sidebar.md
@@ -12,4 +12,4 @@
     1. [SAM-Code](traceability-link-recovery#sam-code)
     1. [SAD-SAM-Code](traceability-link-recovery#sad-sam-code)
 7. [Inconsistency Detection (ID)](Inconsistency-Detection)
-8. [LiSSA](lissa)
+8. [LiSSA-RATLR](lissa)