Skip to content

LDA-based Hierarchical Clustering for Story Point Estimation

Notifications You must be signed in to change notification settings

SOLAR-group/LHC-SE

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 

Repository files navigation

LDA-based Hierarchical Clustering for Story point Estimation (LHC-SE)

Using clustering of textual features to estimate effort in agile projects.

This repository is supplementary to the paper entitled "Investigating the Effectiveness of Clustering for Story Point Estimation", authored by Vali Tawosi, Afnan Al-Subaihin, and Federica Sarro, which is accepted for presentation at the 29th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2022). It contains the R source code and datasets used in the study.

How to train the LDA model:

Rscript generate_lda_model.R [PATH TO DATASET]

Since this step may take long, we provide the best lda_model achieveved with the Tawosi Dataset. The model can be found at ../results/lda_2265.rda in this repository.

How to run the estimation (from terminal):

Extract the dataset fles and run:

Rscript lda_project_specific.R [PATH TO DATASET] [PATH TO LDA MODEL] [Cluster Building Strategy:'MAE', 'MdAE', or 'sil'] [Algorithm Variant: 'LHC-SE', 'LHC-TC-SE', or 'LHC-TC-TFIDF-SE']

About

LDA-based Hierarchical Clustering for Story Point Estimation

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages