Contents

  • Glossary of terms related to MISO
    • Annotations and GFF files
    • Read alignments and BAM files
    • Inference terms

Glossary of terms related to MISO

Terms used in the MISO manual.

Annotations and GFF files

  • Exon-centric: An annotation of alternative events in the genome that is based on inclusion/exclusion of a particular exon in a transcript. For example, an exon-centric annotation of an alternatively skipped exon would contain two isoforms: one containing the skipped exon and its two flanking exons, and another containing only the two flanking exons. This “exon-centric” annotation does not incorporate other exons in the gene, so Ψ values obtained from it correspond only to the inclusion of the alternative exon relative to its two annotated flanking exons, without considering any other parts of the gene’s isoforms.
  • Isoform-centric: Unlike exon-centric annotations, in isoform-centric annotations each whole isoform of a gene is annotated and used as input to MISO. Ψ values obtained this way are vectors, each entry corresponding to the percent inclusion of a whole isoform in the annotated gene.
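The difference between the two annotation styles can be sketched in a few lines of Python. This is only an illustration of what a Ψ vector looks like, not how MISO computes it: MISO infers Ψ from read alignments, whereas the hypothetical helper `psi_from_abundances` below simply normalizes made-up isoform abundances. The exon names are placeholders as well.

```python
# Exon-centric view of a skipped exon: exactly two isoforms.
# Exon names are hypothetical placeholders, not MISO identifiers.
inclusion_isoform = ["upstream_exon", "skipped_exon", "downstream_exon"]
exclusion_isoform = ["upstream_exon", "downstream_exon"]


def psi_from_abundances(abundances):
    """Normalize isoform abundances into a Psi vector that sums to 1.

    Illustrative only: MISO estimates Psi by inference over reads,
    not by normalizing known abundances.
    """
    total = sum(abundances)
    if total <= 0:
        raise ValueError("total abundance must be positive")
    return [a / total for a in abundances]


# Exon-centric: two entries (inclusion, exclusion).
exon_centric_psi = psi_from_abundances([30, 10])

# Isoform-centric: one entry per annotated whole isoform of the gene.
isoform_centric_psi = psi_from_abundances([5, 3, 2])
```

In the exon-centric case the first entry alone describes the event, since the second is its complement; in the isoform-centric case every entry is needed.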

Read alignments and BAM files

  • Paired-end versus single-end: In paired-end sequencing, both ends of a cluster on a flow cell are sequenced. Each mate is guaranteed to have originated from the same molecule. In single-end sequencing, only one end of a molecule is sequenced. MISO supports both paired-end and single-end data. All paired-end data can be run in MISO as single-end by simply omitting the --paired-end parameter. In that case, MISO will treat each mate of a pair independently.
  • Properly paired read pair: This term applies only to paired-end data, and refers to read pairs where both mates are mapped in a way that makes sense given the strandedness of the RNA-Seq protocol and the alignments of the individual mates. When MISO maps read pairs to event annotations in paired-end mode, it only considers properly paired reads. If the mates map to distinct chromosomes, the read pair will not be considered properly paired. Similarly, if one mate maps in the opposite orientation to what is expected given the strandedness setting, the pair will not be considered properly paired. Finally, if one mate maps within the boundaries of an annotated event but the other does not, the read pair will not be considered (though if such cases are common, one can use MISO in single-end mode). MISO will generally look for the BAM flag that encodes whether a read pair is properly paired or not. Otherwise, it pairs mates together from a BAM file using their read IDs.
  • Overhang: Applied to splice junction reads, overhang refers to the minimum number of bases covered by the read on any of the exons involved in the junction. For example, if a junction read of length 30 is aligned to the border of two exons with 10 bases covered on one exon and 20 bases covered on the other, the overhang is defined to be 10 (the smaller of the two numbers). For single-end reads, requiring a considerable overhang, such as 4 or more, helps filter out alignments that appear to be junction reads but are simply artifacts of sequencing errors and/or alignment errors. Overhang is not defined for paired-end reads.
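Two of the definitions above reduce to small calculations, sketched here in Python. The function names are hypothetical and this is not MISO's internal code. The overhang helper follows the definition given above (the smallest per-exon coverage), and the pairing check tests the 0x2 bit that the SAM/BAM format uses to mark a read as mapped in a proper pair; which pairs receive that bit is decided by the aligner.

```python
PROPER_PAIR_FLAG = 0x2  # SAM/BAM flag bit: read mapped in a proper pair


def junction_overhang(bases_per_exon):
    """Overhang of a junction read: the fewest bases it covers on any exon."""
    return min(bases_per_exon)


def passes_overhang_filter(bases_per_exon, min_overhang=4):
    """True if the junction read meets the minimum overhang.

    The default of 4 mirrors the example threshold in the text; it is
    an illustrative choice, not a documented MISO default.
    """
    return junction_overhang(bases_per_exon) >= min_overhang


def is_properly_paired(flag):
    """True if the proper-pair bit is set in a SAM/BAM flag value."""
    return bool(flag & PROPER_PAIR_FLAG)


# The worked example from the text: a 30-base read split 10 / 20.
example_overhang = junction_overhang([10, 20])
```

A read split 2 / 28 across a junction would fail the filter with `min_overhang=4`, which is the kind of alignment the overhang requirement is meant to discard.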

Inference terms

There are a number of technical parameter settings related to Markov chain Monte Carlo (MCMC) inference, on which the MISO engine is based. In virtually all cases, users do not need to change these settings, but they are explained here for completeness. They are configurable from the MISO settings file.

  • Number of (MCMC) chains: The number of independent MCMC chains used by MISO when performing inference. The default is 6, which is considered a conservative setting for the problem. Running several chains, such as 6, helps prevent MISO from getting stuck in suboptimal Ψ values.

  • Lag: The number of MCMC samples to skip over when computing the posterior distribution over Ψ. The default is 10. High settings of this parameter can, in some cases, reduce autocorrelation between the retained MCMC samples.
  • Burn-in: The number of initial MCMC samples to exclude when computing the posterior distribution over Ψ. Large burn-in settings help keep the posterior distribution over Ψ from being closely correlated with the initial random setting of Ψ used by the sampler.
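The way burn-in, lag, and multiple chains shape the set of samples used for the posterior can be sketched as follows. This is a generic MCMC post-processing illustration with hypothetical function names, not MISO's sampler. In particular, it assumes that a lag of 10 means keeping every 10th sample after burn-in and that samples from all chains are pooled; MISO's exact conventions may differ.

```python
def retained_samples(chain, burn_in, lag):
    """Drop the first `burn_in` samples, then keep every `lag`-th one."""
    if burn_in < 0 or lag < 1:
        raise ValueError("burn_in must be >= 0 and lag must be >= 1")
    return chain[burn_in::lag]


def pool_chains(chains, burn_in, lag):
    """Apply burn-in and lag to each independent chain and pool the results."""
    pooled = []
    for chain in chains:
        pooled.extend(retained_samples(chain, burn_in, lag))
    return pooled


# Stand-in for sampler output: 100 samples per chain, 6 chains.
# Real chains would hold sampled Psi values, not integers.
toy_chain = list(range(100))
kept = retained_samples(toy_chain, burn_in=20, lag=10)
pooled = pool_chains([toy_chain] * 6, burn_in=20, lag=10)
```

With these toy numbers each chain contributes 8 samples, so larger burn-in or lag values trade sample count for samples that depend less on the starting point and on each other.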

© Copyright 2010, Yarden Katz, Gábor Csárdi, Eric T. Wang, Edoardo M. Airoldi, Christopher B. Burge. Revision 1ca049fa.
