Annotate amplicon sequence variants (ASVs) with taxon labels derived from a BLAST search.

blastTaxAnnot(
  MA,
  db = "nt/nt",
  num_threads = getOption("mc.cores", 1L),
  negative_gilist = system.file("extdata", "uncultured.gi", package = "MultiAmplicon"),
  infasta = paste0(tempfile(), ".fasta"),
  outblast = paste0(tempfile(), ".blt"),
  taxonSQL,
  ...
)

Arguments

MA

A MultiAmplicon object for which taxa should be annotated. It must contain a sequenceTableNoChime slot, so the MultiAmplicon pipeline needs to have been run up to that point.

db

The BLAST database. Either a full path or a path relative to your database directory as set in an environment variable.

num_threads

The number of threads used for the BLAST search.

negative_gilist

A file containing NCBI GI numbers to exclude from BLAST searches.

infasta

A FASTA file generated for the BLAST search. By default, a temporary file in the R session's temporary folder.

outblast

A BLAST tabular output file generated by the BLAST search. By default, a temporary file in the R session's temporary folder.

taxonSQL

An SQL file generated by the package taxonomizr.

...

A string of additional options passed to blastn (see blastn -help in the terminal).

Value

A MultiAmplicon object with the taxonTable slot filled.

Details

Taxonomic labels are assigned to ASVs based on a BLAST search. Currently supported taxonomy levels are c("superkingdom", "phylum", "class", "order", "family", "genus", "species"). For each taxonomic level, the unique best taxon (based on bitscores) is reported. If no unique best taxon exists at a particular level, NA is returned for that level. The function combines evidence for taxonomic annotations in case of separate HSPs for (concatenated) non-merged sequences.
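
The "unique best taxon" rule can be illustrated with a small base R sketch. This is not the package's internal code: the toy hit table, the helper unique_best(), and the choice to combine separate HSPs by summing their bitscores per query/subject pair are all assumptions made for illustration.

```r
# Toy BLAST hits for a single ASV; all values are made up for illustration.
hits <- data.frame(
  query    = c("ASV1", "ASV1", "ASV1", "ASV1"),
  subject  = c("s1", "s1", "s2", "s3"),
  bitscore = c(200, 150, 350, 300),
  genus    = c("Eimeria", "Eimeria", "Eimeria", "Isospora"),
  species  = c("Eimeria falciformis", "Eimeria falciformis",
               "Eimeria vermiformis", "Isospora sp."),
  stringsAsFactors = FALSE
)

# Assumption: evidence from separate HSPs of the same query/subject pair
# is combined by summing their bitscores (s1 has two HSPs: 200 + 150).
per_subject <- aggregate(bitscore ~ query + subject + genus + species,
                         data = hits, FUN = sum)

# Hypothetical helper: return the taxon at `level` if exactly one distinct
# taxon reaches the top combined bitscore, otherwise NA.
unique_best <- function(df, level) {
  top <- unique(df[df$bitscore == max(df$bitscore), level])
  if (length(top) == 1L) top else NA_character_
}

genus_call   <- unique_best(per_subject, "genus")
species_call <- unique_best(per_subject, "species")
```

In this toy table, subjects s1 and s2 tie for the top combined bitscore. Both belong to the same genus, so the genus is called, but they are different species, so the species level is NA.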

Author

Emanuel Heitlinger