A Python script to extract mapping statistics from STAR alignment log files and export them to a CSV file.
- Extracts number of input reads, uniquely mapped reads number, and uniquely mapped reads percentage.
- Parses STAR log files and consolidates results into a CSV file.
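As a rough illustration of the parsing step, the sketch below pulls the three fields out of the text of a `Log.final.out` file. It is not the repository's actual code: the function name `parse_star_log` and the `FIELDS` tuple are hypothetical, and it assumes STAR's usual layout in which each statistic is written as a name, a `|` separator, and a value.

```python
# Hypothetical helper, not taken from extract_mapping_info.py.
FIELDS = (
    "Number of input reads",
    "Uniquely mapped reads number",
    "Uniquely mapped reads %",
)


def parse_star_log(text):
    """Return the three summary fields found in Log.final.out text."""
    stats = {}
    for line in text.splitlines():
        # Assumed layout: "<statistic name> |<tab><value>"
        if "|" not in line:
            continue  # section headings such as "UNIQUE READS:" have no separator
        key, value = line.split("|", 1)
        key = key.strip()
        if key in FIELDS:
            stats[key] = value.strip()
    return stats


# Sample text built from the first row of the example output in this README.
sample = (
    "                          Number of input reads |\t40332686\n"
    "                                    UNIQUE READS:\n"
    "                   Uniquely mapped reads number |\t25507486\n"
    "                        Uniquely mapped reads % |\t63.24%\n"
)

print(parse_star_log(sample))
```

Values are kept as strings so that the percentage keeps its `%` sign, matching the CSV shown further down.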
- Clone the repository:
    git clone https://github.com/tkinley/STAR-Log-Parser.git
    cd STAR-Log-Parser
- Place your STAR log files in the log_files directory.
- Run the script:
    python extract_mapping_info.py log_files
- The results will be saved in mapping_summary.csv.
Filename,Number of input reads,Uniquely mapped reads number,Uniquely mapped reads %
ERR1942975_Log.final.out,40332686,25507486,63.24%
ERR1942976_Log.final.out,40230927,25813120,64.16%

This will create a CSV file named mapping_summary.csv in the same directory, containing the extracted information sorted by filename in ascending order. To sort in descending order instead, modify the sorted function call to include the reverse=True parameter:
import pandas as pd
# data is a list of dictionaries, one per log file, each with a "Filename" key
data_sorted = sorted(data, key=lambda x: x["Filename"], reverse=True)
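To see the effect of `reverse=True` in isolation, here is a small self-contained sketch. It assumes the rows are plain dictionaries keyed by the CSV column names and reuses the two example rows above. Writing through `csv.DictWriter` into an in-memory buffer is an illustrative choice, not necessarily how the script writes its output.

```python
import csv
import io

# Rows copied from the example output above; the dictionary layout is an assumption.
data = [
    {
        "Filename": "ERR1942975_Log.final.out",
        "Number of input reads": "40332686",
        "Uniquely mapped reads number": "25507486",
        "Uniquely mapped reads %": "63.24%",
    },
    {
        "Filename": "ERR1942976_Log.final.out",
        "Number of input reads": "40230927",
        "Uniquely mapped reads number": "25813120",
        "Uniquely mapped reads %": "64.16%",
    },
]

# Descending order by filename (plain string comparison).
data_sorted = sorted(data, key=lambda row: row["Filename"], reverse=True)

# Write to an in-memory buffer so the sketch leaves no file behind.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(data[0]), lineterminator="\n")
writer.writeheader()
writer.writerows(data_sorted)

print(buffer.getvalue())
```

Because the sort key is a string, filenames are compared character by character, so names with unpadded numbers (for example `sample10` and `sample9`) will not sort in numeric order.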