Sort different amplicons into a fully stratified samples x amplicons structure based on primer matches.
sortAmplicons(
  MA,
  filedir = "stratified_files",
  n = 1e+06,
  countOnly = FALSE,
  rmPrimer = TRUE,
  ...
)

# S4 method for MultiAmplicon
sortAmplicons(
  MA,
  filedir = "stratified_files",
  n = 1e+06,
  countOnly = FALSE,
  rmPrimer = TRUE,
  ...
)
Argument | Description |
---|---|
MA | MultiAmplicon-class object. |
filedir | path to an existing folder or to a folder that will be newly created on your computer. An existing folder has to be empty. |
n | parameter passed to the yield functions of package ShortRead. This controls the memory consumption during streaming. Lower values result in lower memory requirements but might result in longer processing time, because the sequence files are read in more, smaller I/O operations. |
countOnly | logical; if set to TRUE, only a matrix of read counts is returned |
rmPrimer | logical, indicating whether primer sequences should be removed during sorting |
... | additional parameters to be passed to Biostrings::isMatchingStartingAt. Be careful when using multiple starting positions or allowing mismatches. This could lead to read pairs being assigned to multiple amplicons. |
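
The warning about allowing mismatches can be illustrated with a minimal sketch that calls Biostrings::isMatchingStartingAt directly. The two primers and the read below are hypothetical toy sequences, not taken from the package; they only show how a relaxed `max.mismatch` can let a single read match the primers of more than one amplicon.

```r
library(Biostrings)

# Hypothetical primers for two amplicons; they differ only in the last base.
primers <- c(ampA = "ACGTACGTAC", ampB = "ACGTACGTAG")

# Hypothetical read that starts with the ampA primer.
read <- DNAString("ACGTACGTACTTTTGGGG")

# Test each primer against position 1 of the read for a given mismatch limit.
match_primers <- function(max.mismatch) {
  vapply(
    primers,
    function(p) {
      isMatchingStartingAt(DNAString(p), read,
                           starting.at = 1, max.mismatch = max.mismatch)
    },
    logical(1)
  )
}

exact <- match_primers(0)  # exact matching: the read fits one primer only
loose <- match_primers(1)  # one mismatch allowed: the read fits both primers
```

With exact matching the read is assigned to a single amplicon; once one mismatch is tolerated, both primers match and the assignment becomes ambiguous.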
MultiAmplicon: By default (countOnly=FALSE) a MultiAmplicon-class object is returned with the stratifiedFiles slot populated. Stratified file names are constructed using a unique string created by tempfile and are stored in the given filedir (by default "stratified_files"). If countOnly is set to TRUE, only a numeric matrix of read counts is returned.
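
A minimal base R sketch of the two file-handling rules described here: the output folder must be new or empty, and file names get a unique string from tempfile. The helper `prepare_filedir` and the name pattern `S01_ampliconA_` are hypothetical and only illustrate the idea; the naming scheme actually used by the package is not shown on this page.

```r
# Hypothetical helper, not part of MultiAmplicon: accept a folder that is
# either missing (then create it) or existing and empty (then reuse it).
prepare_filedir <- function(filedir) {
  if (dir.exists(filedir)) {
    existing <- list.files(filedir, all.files = TRUE, no.. = TRUE)
    if (length(existing) > 0) {
      stop("filedir exists but is not empty: ", filedir)
    }
  } else {
    dir.create(filedir, recursive = TRUE)
  }
  filedir
}

# Use a folder below R's session temp directory for this demonstration.
filedir <- prepare_filedir(file.path(tempdir(), "stratified_files_demo"))

# tempfile() only generates a unique name; no file is created by this call.
# The sample and amplicon labels in the pattern are assumptions.
one_file <- tempfile(pattern = "S01_ampliconA_", tmpdir = filedir,
                     fileext = ".fastq.gz")
```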
This function uses isMatchingStartingAt
to match primer sequences at the first position of forward and
reverse sequences. These primer sequences can be removed during
sorting (rmPrimer = TRUE). The remaining sequences of interest are
written to files to allow processing via standard metabarcoding
pipelines.
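
The matching and trimming step can be sketched on a single primer as follows. The three reads are hypothetical toy data, and the use of `fixed = FALSE` (so that IUPAC ambiguity codes in the primer match the bases they stand for) is an assumption of this sketch; how sortAmplicons sets this internally is not stated on this page.

```r
library(Biostrings)

# One forward primer from the Examples; it contains ambiguity codes (Y, R).
primerF <- DNAString("YGGTGRTGCATGGCCGYT")

# Hypothetical forward reads.
reads <- DNAStringSet(c(
  "TGGTGATGCATGGCCGTTACGTACGT",  # starts with the primer (Y = T, R = A)
  "CGGTGGTGCATGGCCGCTGGGTTTAA",  # starts with the primer (Y = C, R = G)
  "ACGTACGTACGTACGTACGTACGTAC"   # does not start with the primer
))

# Match the primer at position 1 of every read. as.vector() flattens the
# result in case a one-row matrix is returned for a set of subjects.
hit <- as.vector(
  isMatchingStartingAt(primerF, reads, starting.at = 1, fixed = FALSE)
)

# Keep the matching reads and cut the primer off their 5' end,
# which is the effect rmPrimer = TRUE is described as having.
trimmed <- subseq(reads[hit], start = length(primerF) + 1)
```

In the real function the same test is applied to the reverse read with the reverse primer, so a read pair is kept for an amplicon when both ends match.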
Emanuel Heitlinger
primerF <- c("AGAGTTTGATCCTGGCTCAG", "ACTCCTACGGGAGGCAGC",
             "GAATTGACGGAAGGGCACC", "YGGTGRTGCATGGCCGYT")
primerR <- c("CTGCWGCCNCCCGTAGG", "GACTACHVGGGTATCTAATCC",
             "AAGGGCATCACAGACCTGTTAT", "TCCTTCTGCAGGTTCACCTAC")

PPS <- PrimerPairsSet(primerF, primerR)

fastq.dir <- system.file("extdata", "fastq", package = "MultiAmplicon")
fastq.files <- list.files(fastq.dir, full.names=TRUE)

Ffastq.file <- fastq.files[grepl("F_filt", fastq.files)]
Rfastq.file <- fastq.files[grepl("R_filt", fastq.files)]

PRF <- PairedReadFileSet(Ffastq.file, Rfastq.file)

MA <- MultiAmplicon(PPS, PRF)
#> Error in sample_names(object@sampleData): could not find function "sample_names"

## sort into amplicons
MA1 <- sortAmplicons(MA)
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'MA' in selecting a method for function 'sortAmplicons': object 'MA' not found