This is a repository of amplicon sequencing data and bioinformatics pipeline code for the nectar microbiome of monkeyflowers.

YingtongAamandaWu/MonkeyFlower_ampliconSeq_DNA_data_codes

This repository contains data, R scripts, and Linux command-line steps for the bioinformatic analysis of amplicon sequencing data from the fungal and bacterial microbiomes of nectar samples from sticky monkeyflower (Diplacus aurantiacus).

The repository comprises four folders:

01_Data includes six files (two CSV and four XLSX):

  1. "sampling_sheet_regional_survey_2015_final_corrected.csv": metadata for the DNA samples, documenting the site ID, plant ID, and flower ID of each sample, as well as the corresponding concentrations of fungal and bacterial colony-forming units (CFUs) in each sample.
  2. "2015_survey_siteinfo_location_envi.csv": documents the environmental data and coordinates for each flower.
  3. "Wu_Metagenome.environmental.1.0_B_N_SUB13559541.xlsx": documents the biosample information for each flower, for sample names starting with B through N.
  4. "Wu_Metagenome.environmental.1.0_O_S_SUB13567828.xlsx": documents the biosample information for each flower, for sample names starting with O through S.
  5. "SRA_metadata_site_started_with_B_N.xlsx": documents the sequencing information for each fastq file, linked to the biosample information in "Wu_Metagenome.environmental.1.0_B_N_SUB13559541.xlsx".
  6. "SRA_metadata_site_started_with_O_S.xlsx": documents the sequencing information for each fastq file, linked to the biosample information in "Wu_Metagenome.environmental.1.0_O_S_SUB13567828.xlsx".

02_Rscripts includes the R scripts used in the bioinformatics analyses:

  1. "make_Map_20230413.Rmd": R code that makes the map in Figure 1, showing the locations and distribution of samples.
  2. "Bioinformatics_ITS1_DADA2_CONSTAXtaxa_20230202.Rmd": R code that implements the DADA2 pipeline on fungal ITS1 sequences.
  3. "Bioinformatics_16S_DADA2_SILVAtaxa_20230207.Rmd": R code that implements the DADA2 pipeline on bacterial 16S sequences.
  4. "make_phyloseq_objects_&_run_CLAM_test_20220203.Rmd": R code that generates separate phyloseq objects for the ITS1 and 16S sequences; the code also uses plating data (densities of bacterial and fungal colony-forming units, CFUs) to classify each nectar sample as (1) a bacteria-dominated flower, (2) a fungi-dominated flower, (3) a co-dominated flower, or (4) a flower with too few microbes to be assigned to any of the other three groups.
  5. "diversity_analyses_fungi_threshold=clam_20230413.Rmd": R code that analyzes the alpha and beta diversity of fungal sequences (ITS1); analyses include pairwise two-sample permutation tests for alpha diversity, permutational multivariate ANOVA for species composition, differential abundance analyses, etc.
  6. "diversity_analyses_bacteria_threshold=clam_20230413.Rmd": R code that analyzes the alpha and beta diversity of bacterial sequences (16S); the analyses are similar to those in "diversity_analyses_fungi_threshold=clam_20230413.Rmd".
  7. "ASVlevel_Co_occurence_network_NetCoMi_pearson_sparcc_r0.1_clam_20220429.rmd": R code that conducts co-occurrence network analyses of fungal and bacterial sequences using the Pearson correlation and SparCC (Sparse Correlations for Compositional data) network methods.
  8. "ASVlevel_co_occurence_network_NetCoMi_spieceasi_r0.1_clam_20220504.rmd": R code that conducts co-occurrence network analyses of fungal and bacterial sequences using the SPIEC-EASI (Sparse InversE Covariance estimation for Ecological Association and Statistical Inference) network method.
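
The pairwise two-sample permutation tests for alpha diversity mentioned for the diversity-analysis scripts can be illustrated with a minimal sketch. The repository's own analyses are written in R; the Python below is only an illustration of the general technique. The function names, the choice of the Shannon index, the difference-in-means test statistic, and the toy counts are all assumptions and are not taken from the repository.

```python
import math
import random


def shannon(counts):
    """Shannon diversity index (natural log) for one sample's ASV read counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def permutation_test(a, b, n_perm=9999, seed=0):
    """Two-sided two-sample permutation test on the absolute difference in means.

    Group labels are shuffled n_perm times; the p-value is the share of
    shuffles whose statistic is at least as large as the observed one
    (with the usual +1 correction so the p-value is never zero).
    """
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        stat = abs(sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b))
        if stat >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy ASV counts for two groups of flowers (not real data).
group_1 = [shannon(s) for s in ([90, 5, 5], [80, 15, 5], [95, 3, 2])]
group_2 = [shannon(s) for s in ([40, 30, 30], [35, 35, 30], [50, 25, 25])]
p_value = permutation_test(group_1, group_2, n_perm=999)
print(f"permutation p-value: {p_value:.3f}")
```

With more than two groups, the same function would be applied to each pair of groups, and the resulting p-values would normally be adjusted for multiple comparisons.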

03_Output includes key output files from the bioinformatics pipeline:

  1. "ITS1.unpooled.ASVs.fa": FASTA file that documents the representative sequence for each fungal ITS1 ASV (amplicon sequence variant).
  2. "16S.unpooled.ASVs.fa": FASTA file that documents the representative sequence for each bacterial 16S ASV.
  3. "Appendix1_ITS1_fungi_ASV.count.ordered.csv": CSV file that documents the total read count, relative abundance across all samples, and species-level taxonomy of each fungal ITS1 ASV.
  4. "Appendix2_16S_bacteria_ASV.count.ordered.csv": CSV file that documents the total read count, relative abundance across all samples, and species-level taxonomy of each bacterial 16S ASV.
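
As a rough illustration of how output files like these can be consumed, the sketch below parses FASTA-formatted ASV sequences and converts total read counts into relative abundances. It is written in Python for illustration only; the ASV names, sequences, and counts are hypothetical, and the actual header format and column names of the repository's files are not assumed here.

```python
import io


def read_fasta(handle):
    """Return {header: sequence} from an open FASTA file handle.

    Sequences wrapped over several lines are joined into one string.
    """
    records, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def relative_abundance(counts):
    """Convert {asv: total_reads} into {asv: fraction of all reads}."""
    total = sum(counts.values())
    return {asv: n / total for asv, n in counts.items()}


# Hypothetical toy input standing in for a real ASV FASTA file.
toy_fasta = io.StringIO(">ASV_1\nACGT\nAC\n>ASV_2\nTTGA\n")
sequences = read_fasta(toy_fasta)

# Hypothetical total read counts per ASV.
abundances = relative_abundance({"ASV_1": 75, "ASV_2": 25})
print(sequences)
print(abundances)
```

To read a real file, the `io.StringIO` object would be replaced by an open file handle, for example `open("ITS1.unpooled.ASVs.fa")`.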

04_Docs includes a file that documents other bioinformatics steps not conducted in R:

  1. "Wu_Monkeyflower_bioinformatics_steps.docx": a docx file that records the bioinformatics steps from demultiplexing to species assignment, run in a Linux environment.
