Crate bio [] [src]

Rust-bio, a bioinformatics library for Rust.

This library provides implementations of many algorithms and data structures that are useful for bioinformatics. All provided implementations are rigorously tested via continuous integration. For installation instructions and a general overview, visit https://rust-bio.github.io.

Currently, rust-bio provides

Example

use bio::alphabets;
use bio::data_structures::suffix_array::suffix_array;
use bio::data_structures::bwt::{bwt, less, Occ};
use bio::data_structures::fmindex::{FMIndex, FMIndexable};

let text = b"ACGGATGCTGGATCGGATCGCGCTAGCTA$";
let pattern = b"ACCG";

// Create an FM-Index for a given text.
let alphabet = alphabets::dna::iupac_alphabet();
let sa = suffix_array(text);
let bwt = bwt(text, &sa);
let less = less(&bwt, &alphabet);
let occ = Occ::new(&bwt, 3, &alphabet);
let fmindex = FMIndex::new(&bwt, &less, &occ);

let interval = fmindex.backward_search(pattern.iter());
let positions = interval.occ(&sa);

Multithreaded Example

use bio::alphabets;
use bio::data_structures::suffix_array::suffix_array;
use bio::data_structures::bwt::{bwt, less, Occ};
use bio::data_structures::fmindex::{FMIndex, FMIndexable};
use std::sync::Arc;
use std::thread;

let text = b"ACGGATGCTGGATCGGATCGCGCTAGCTA$";
let patterns = vec![b"ACCG", b"TGCT"];

// Create an FM-Index for a given text.
let alphabet = alphabets::dna::iupac_alphabet();
let sa = suffix_array(text);
let bwt = Arc::new(bwt(text, &sa));
let less = Arc::new(less(bwt.as_ref(), &alphabet));
let occ = Arc::new(Occ::new(bwt.as_ref(), 3, &alphabet));
let fmindex = Arc::new(FMIndex::new(bwt, less, occ));

// Spawn threads to perform backward searches for each interval
let interval_calculators = patterns.into_iter().map(|pattern| {
    let fmindex = fmindex.clone();
    thread::spawn(move ||
        fmindex.backward_search(pattern.iter())
    )
}).collect::<Vec<_>>();

// Loop through the results, extracting the positions array for each pattern
for interval_calculator in interval_calculators {
    let positions = interval_calculator.join().unwrap().occ(&sa);
}

Documentation and further examples for each module can be found in the module descriptions below.

Modules

alignment

Various alignment and distance computing algorithms.

alphabets

Implementation of alphabets and useful utilities.

data_structures

Various useful data structures.

io

Readers and writers for common bioinformatics file formats.

pattern_matching

This module contains various useful pattern matching algorithms. The implementations are based on the lecture notes "Algorithmen auf Sequenzen", Kopczynski, Marschall, Martin and Rahmann, 2008 - 2015.

scores
seq_analysis

Sequence analysis algorithms.

stats

Mathematical and statistical tools.

utils

Common utilities.