The MultiAmplicon class is a container that stores at least primer pairs, read files and progressively processed data in an 'amplicon x samples' format. The slots in this object are incrementally filled with by running wrappers functions (mostly around functions from the dada2 package). The object is treated (subsetted etc.) like a (pseudo) matrix, colums are samples, rows are different amplicons.

Subsetting for MultiAmplicon objects should conveniently subset all (potentially) filled slots

MultiAmplicon(
  PrimerPairsSet = PrimerPairsSet(),
  PairedReadFileSet = PairedReadFileSet(),
  .Data = matrix(ncol = 0, nrow = 0),
  stratifiedFiles = list(),
  sampleData = new("sample_data", data.frame(row.names = names(PairedReadFileSet),
    readsF = PairedReadFileSet@readsF, readsR = PairedReadFileSet@readsF)),
  derep = list(),
  dada = list(),
  mergers = list(),
  sequenceTable = list(),
  sequenceTableNoChime = list(),
  taxonTable = list()
)

getPrimerPairsSet(MA)

getPairedReadFileSet(MA)

getRawCounts(MA)

getStratifiedFilesF(MA, simplify = TRUE)

getStratifiedFilesR(MA, simplify = TRUE)

getDerepF(MA, simplify = TRUE)

getDerepR(MA, simplify = TRUE)

getDadaF(MA, simplify = TRUE)

getDadaR(MA, simplify = TRUE)

getMergers(MA, simplify = TRUE)

getSequenceTable(MA, simplify = TRUE)

getSequenceTableNoChime(MA, simplify = TRUE)

getTaxonTable(MA, simplify = TRUE)

getSequencesFromTable(MA)

# S4 method for MultiAmplicon
getSequencesFromTable(MA)

# S4 method for MultiAmplicon
show(object)

# S4 method for MultiAmplicon,index,index,ANY
[(x, i, j, ..., drop = FALSE)

# S4 method for MultiAmplicon,index,missing,ANY
[(x, i, j, ..., drop = FALSE)

# S4 method for MultiAmplicon,missing,index,ANY
[(x, i, j, ..., drop = FALSE)

Arguments

PrimerPairsSet

a set of primer pairs specifiying your amplicons see PrimerPairsSet-class

PairedReadFileSet

a set of paired end sequencing data files PairedReadFileSet-class

.Data

Users should not supply this parameter, the slot is created by sortAmplicons.

stratifiedFiles

Users should not supply this parameter, the slot is created by sortAmplicons.

sampleData

Users should not supply this parameter. It's filled with a sample_data object from phyloseq. The slot is created from sample names (same as colnames(MA)) and more data can be added by addSampleData.

derep

Users should not supply this parameter, the slot is created by derepMulti

dada

Users should not supply this parameter, the slot is created by dadaMulti

mergers

Users should not supply this parameter, the slot is created by mergeMulti

sequenceTable

Users should not supply this parameter, the slot is created by makeSequenceTableMulti

sequenceTableNoChime

Users should not supply this parameter, the slot is created by removeChimeraMulti

taxonTable

Users should not supply this parameter, the slot is created by blastTaxAnnot. It's filled with a list of taxonomyTable objects from phyloseq.

MA

MultiAmplicon-class object

simplify

Should a list of objects be simplified to only one object if it has length one?

object

A MultiAmplicon-class object.

x

MultiAmplicon-class object

i

numeric, logical or names vector for subsetting rows (== amplicons)

j

numeric, logical or names vector for subsetting columns (== read files, corresponding usually to samples)

...

not used

drop

should not be used

Functions

  • MultiAmplicon: Constructor for MultiAmplicon-class

Slots

PrimerPairsSet

The primer pairs used in your experiment to specify amplicons stored in a PrimerPairsSet-class object.

PairedReadFileSet

The (quality filtered) fastq files (one file pair for each sample) that store your sequencing data.

.Data

A numeric matrix of sequencing read counts per amplicon and sample. Created by the function sortAmplicons in the MultiAmplicon pipeline.

sampleData

A sample_data object from phyloseq. The slot is created from sample names (names of the PrimerPairsSet, which have tto be the same as colnames(MA)). More data can be added by addSampleData.

stratifiedFiles

temporary files as a result of stratifying into amplicons and samples using the MultiAmplicon pipeline function sortAmplicons. Forward (sometimes called R1) and reverse (sometimes called R2) files are stored as a (amplicons x samples) matrix of PairedReadFileSet-class objects.

derep

A list of PairedDerep-class objects containing pairs of derep-class objects created by dada2’s derepFastq function or withing the MultiAmplicon pipeline by derepMulti.

dada

A list of PairedDada-class object containing pairs of dada-class objects created by dada2’s dada function. Within the MultiAmplicon pipeline this slot is filled by dadaMulti.

mergers

A list of objects containing merged pairs of forward and reverse reads as created by by dada2’s mergePairs function. Within the MultiAmplicon pipeline this slot is filled by mergeMulti.

sequenceTable

A list of matrix objects created by dada2’s makeSequenceTable. Samples (in rows) and amplified sequence variants (ASVs) in columns. Within the MultiAmplicon pipeline this slot is filled by makeSequenceTableMulti.

sequenceTableNoChime

A list of matrix objects created by dada2’s removeBimeraDenovo. Samples (in rows) and ASVs screened for PCR chimeras in columns. Within the MultiAmplicon pipeline this slot is filled by removeChimeraMulti.

taxonTable

A list of matrix objects created by a function for taxonomical annotation (for example blastTaxAnnot. ASVs are in rows and taxnomical ranks are in columns. MultiAmplicon(PrimerPairsSet, PairedReadFileSet)

See also

Author

Emanuel Heitlinger

Examples

primerF <- c("AGAGTTTGATCCTGGCTCAG", "ACTCCTACGGGAGGCAGC", "GAATTGACGGAAGGGCACC", "YGGTGRTGCATGGCCGYT") primerR <- c("CTGCWGCCNCCCGTAGG", "GACTACHVGGGTATCTAATCC", "AAGGGCATCACAGACCTGTTAT", "TCCTTCTGCAGGTTCACCTAC") PPS <- PrimerPairsSet(primerF, primerR) fastq.dir <- system.file("extdata", "fastq", package = "MultiAmplicon") fastq.files <- list.files(fastq.dir, full.names=TRUE) Ffastq.file <- fastq.files[grepl("F_filt", fastq.files)] Rfastq.file <- fastq.files[grepl("R_filt", fastq.files)] PRF <- PairedReadFileSet(Ffastq.file, Rfastq.file) MA <- MultiAmplicon(PPS, PRF) ## sort into amplicons MA1 <- sortAmplicons(MA, filedir=tempfile(pattern = "dir"))
#> creating directory /tmp/RtmpTPTbQy/dirfd1a1657d6ac
#> finished sorting 0 sequencing reads for S00_F_filt.fastq.gz in #> /localstorage/ele/git_projects/MultiAmplicon/inst/extdata/fastq/S00_F_filt.fastq.gz and #> /localstorage/ele/git_projects/MultiAmplicon/inst/extdata/fastq/S00_R_filt.fastq.gz #> into 0 amplicons and written into /tmp/RtmpTPTbQy/dirfd1a1657d6ac
#> finished sorting 0 sequencing reads for S01_F_filt.fastq.gz in #> /localstorage/ele/git_projects/MultiAmplicon/inst/extdata/fastq/S01_F_filt.fastq.gz and #> /localstorage/ele/git_projects/MultiAmplicon/inst/extdata/fastq/S01_R_filt.fastq.gz #> into 0 amplicons and written into /tmp/RtmpTPTbQy/dirfd1a1657d6ac
#> finished sorting 9348 sequencing reads for S02_F_filt.fastq.gz in #> /localstorage/ele/git_projects/MultiAmplicon/inst/extdata/fastq/S02_F_filt.fastq.gz and #> /localstorage/ele/git_projects/MultiAmplicon/inst/extdata/fastq/S02_R_filt.fastq.gz #> into 4 amplicons and written into /tmp/RtmpTPTbQy/dirfd1a1657d6ac
#> finished sorting 5 sequencing reads for S03_F_filt.fastq.gz in #> /localstorage/ele/git_projects/MultiAmplicon/inst/extdata/fastq/S03_F_filt.fastq.gz and #> /localstorage/ele/git_projects/MultiAmplicon/inst/extdata/fastq/S03_R_filt.fastq.gz #> into 1 amplicons and written into /tmp/RtmpTPTbQy/dirfd1a1657d6ac
#> finished sorting 14 sequencing reads for S04_F_filt.fastq.gz in #> /localstorage/ele/git_projects/MultiAmplicon/inst/extdata/fastq/S04_F_filt.fastq.gz and #> /localstorage/ele/git_projects/MultiAmplicon/inst/extdata/fastq/S04_R_filt.fastq.gz #> into 3 amplicons and written into /tmp/RtmpTPTbQy/dirfd1a1657d6ac
#> finished sorting 51777 sequencing reads for S05_F_filt.fastq.gz in #> /localstorage/ele/git_projects/MultiAmplicon/inst/extdata/fastq/S05_F_filt.fastq.gz and #> /localstorage/ele/git_projects/MultiAmplicon/inst/extdata/fastq/S05_R_filt.fastq.gz #> into 3 amplicons and written into /tmp/RtmpTPTbQy/dirfd1a1657d6ac
#> finished sorting 51777 sequencing reads for S05D_F_filt.fastq.gz in #> /localstorage/ele/git_projects/MultiAmplicon/inst/extdata/fastq/S05D_F_filt.fastq.gz and #> /localstorage/ele/git_projects/MultiAmplicon/inst/extdata/fastq/S05D_R_filt.fastq.gz #> into 3 amplicons and written into /tmp/RtmpTPTbQy/dirfd1a1657d6ac
#> finished sorting 17358 sequencing reads for S06_F_filt.fastq.gz in #> /localstorage/ele/git_projects/MultiAmplicon/inst/extdata/fastq/S06_F_filt.fastq.gz and #> /localstorage/ele/git_projects/MultiAmplicon/inst/extdata/fastq/S06_R_filt.fastq.gz #> into 4 amplicons and written into /tmp/RtmpTPTbQy/dirfd1a1657d6ac
## Only after sorting the MultiAmplicon object is really poplated ## with sensible data, now matrix-like access to different ## amplicons (primer pairs) and different sequencing read files ## (usually samples) is implemented ## the number of amplicons (primer pairs) nrow(MA)
#> [1] 0
## the number of samples (sequencing read file pairs) ncol(MA)
#> [1] 0
## dereplication is currently not supported ## MA2 <- derepMulti(MA1) ### use dada directly after sorting MA3 <- dadaMulti(MA1, selfConsist = TRUE)
#> #> #> amplicon AGAGTTTGATCCTGGCTCAG.CTGCWGCCNCCCGTAGG: dada estimation of sequence variants from 6 of 8 possible sample files
#> calling dada with selfConsist=TRUE parameters
#> Initializing error rates to maximum possible estimate. #> selfConsist step 1 ...... #> selfConsist step 2 #> selfConsist step 3 #> selfConsist step 4 #> selfConsist step 5 #> selfConsist step 6 #> Convergence after 6 rounds. #> Initializing error rates to maximum possible estimate. #> selfConsist step 1 ...... #> selfConsist step 2 #> selfConsist step 3 #> selfConsist step 4 #> selfConsist step 5 #> Convergence after 5 rounds.
#> #> #> amplicon ACTCCTACGGGAGGCAGC.GACTACHVGGGTATCTAATCC: dada estimation of sequence variants from 5 of 8 possible sample files
#> calling dada with selfConsist=TRUE parameters
#> Initializing error rates to maximum possible estimate. #> selfConsist step 1 ..... #> selfConsist step 2 #> selfConsist step 3 #> selfConsist step 4 #> selfConsist step 5 #> selfConsist step 6 #> Convergence after 6 rounds. #> Initializing error rates to maximum possible estimate. #> selfConsist step 1 ..... #> selfConsist step 2 #> selfConsist step 3 #> selfConsist step 4 #> selfConsist step 5 #> Convergence after 5 rounds.
#> #> #> amplicon GAATTGACGGAAGGGCACC.AAGGGCATCACAGACCTGTTAT: dada estimation of sequence variants from 2 of 8 possible sample files
#> calling dada with selfConsist=TRUE parameters
#> Initializing error rates to maximum possible estimate. #> selfConsist step 1 .. #> selfConsist step 2 #> selfConsist step 3 #> selfConsist step 4 #> selfConsist step 5 #> Convergence after 5 rounds. #> Initializing error rates to maximum possible estimate. #> selfConsist step 1 .. #> selfConsist step 2 #> selfConsist step 3 #> selfConsist step 4 #> selfConsist step 5 #> Convergence after 5 rounds.
#> #> #> amplicon YGGTGRTGCATGGCCGYT.TCCTTCTGCAGGTTCACCTAC: dada estimation of sequence variants from 5 of 8 possible sample files
#> calling dada with selfConsist=TRUE parameters
#> Initializing error rates to maximum possible estimate. #> selfConsist step 1 ..... #> selfConsist step 2 #> selfConsist step 3 #> selfConsist step 4 #> selfConsist step 5 #> Convergence after 5 rounds. #> Initializing error rates to maximum possible estimate. #> selfConsist step 1 ..... #> selfConsist step 2 #> selfConsist step 3 #> selfConsist step 4 #> Convergence after 4 rounds.
MA4 <- mergeMulti(MA3, justConcatenate=TRUE)
#> #> merging sequences from 6 samples for amplicon AGAGTTTGATCCTGGCTCAG.CTGCWGCCNCCCGTAGG
#> calling mergePairs with justConcatenate=TRUE parameters
#> #> merging sequences from 5 samples for amplicon ACTCCTACGGGAGGCAGC.GACTACHVGGGTATCTAATCC
#> calling mergePairs with justConcatenate=TRUE parameters
#> #> merging sequences from 2 samples for amplicon GAATTGACGGAAGGGCACC.AAGGGCATCACAGACCTGTTAT
#> calling mergePairs with justConcatenate=TRUE parameters
#> #> merging sequences from 5 samples for amplicon YGGTGRTGCATGGCCGYT.TCCTTCTGCAGGTTCACCTAC
#> calling mergePairs with justConcatenate=TRUE parameters
#> calling makeSequenceTable with default parameters
#> calling makeSequenceTable with default parameters
#> calling makeSequenceTable with default parameters
#> calling makeSequenceTable with default parameters
MA6 <- removeChimeraMulti(MA5, mc.cores=1)
#> calling removeBimeraDenovo with default parameters
#> calling removeBimeraDenovo with default parameters
#> calling removeBimeraDenovo with default parameters
#> calling removeBimeraDenovo with default parameters