FASTAReader
description
Reads FASTA format and fetches sequences. (Node.js)
installation
$ npm install fastareader
If you haven't installed Node.js yet,
first install nvm and follow the instructions on that page.
preparation
Create FASTA Information JSON file
FASTAReader first scans through the given FASTA file.
This scan takes nearly one minute.
To skip this process, FASTAReader generates JSON of the scanned information.
You can save the JSON like this:
$ fastareader foobar.fasta > foobar.fasta.json
Once the JSON has been generated, it is read automatically if its file name is the original FASTA file name followed by .json.
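The naming rule above can be sketched as a small lookup. This is only an illustration of the convention, not FASTAReader's own code: `resolveInfoJson` is a hypothetical helper, and the only thing taken from this README is that the JSON file name is the FASTA file name plus `.json`.

```javascript
var fs = require('fs');

// Hypothetical helper (not part of FASTAReader's API): returns the path of the
// JSON file that sits next to a FASTA file, or null when no such file exists.
function resolveInfoJson(fastaPath) {
  var jsonPath = fastaPath + '.json'; // e.g. foobar.fasta -> foobar.fasta.json
  return fs.existsSync(jsonPath) ? jsonPath : null;
}

// Prints the JSON path if foobar.fasta.json exists in the current directory.
console.log(resolveInfoJson('foobar.fasta'));
```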
command-line
$ fastareader <fasta file> <rname> <pos> <length>
The sequence is then written to stdout.
AATGATCTATAGTCCATTAATTCAGTTACT
args
name | description | example |
---|---|---|
fasta file | a fasta file to get sequences | hg19.fa |
rname | a reference name to fetch. Must be in the fasta file. | chr12 |
pos | start position of the sequence to fetch (1-based coordinate). | 51417222 |
length | length of the sequence to fetch. | 300 |
options
name | description | example |
---|---|---|
--compl, -c | Gets the complementary strand of the sequence | -c |
--json, -j | a FASTA Information JSON file. When the name is [fasta file].json, the file is automatically read. | --json hg19.fa.json |
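What "complementary strand" means at the base level can be sketched as follows. Both functions are hypothetical helpers written for illustration, not FASTAReader functions. This README does not say whether `--compl` also reverses the sequence, so the sketch shows the plain complement and the reverse complement separately.

```javascript
// Watson-Crick pairs for unambiguous bases; N stays N.
var PAIRS = {
  A: 'T', T: 'A', G: 'C', C: 'G', N: 'N',
  a: 't', t: 'a', g: 'c', c: 'g', n: 'n'
};

// Replaces every base with its partner, leaving unknown characters untouched.
function complement(seq) {
  return seq.split('').map(function (base) {
    return Object.prototype.hasOwnProperty.call(PAIRS, base) ? PAIRS[base] : base;
  }).join('');
}

// The complement read in the opposite direction (the usual minus-strand reading).
function reverseComplement(seq) {
  return complement(seq).split('').reverse().join('');
}

console.log(complement('AATGATC'));        // TTACTAG
console.log(reverseComplement('AATGATC')); // GATCATT
```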
JavaScript API Documentation
- FASTAReader.create(fastafile, jsonfile)
- reader.fetch(rname, start, length, inverse)
- reader.fetchByFormat(format)
- reader.getEndPos(rname)
- reader.hasN(rname, start, length)
FASTAReader.create(fastafile, jsonfile)
Creates an instance of FASTAReader.
- fastafile is a fasta file to get sequence from.
- jsonfile is optional, a FASTA Information JSON file.
Returns an instance of FASTAReader.
reader.fetch(rname, start, length, inverse)
- rname is the reference name.
- start is the start position of the sequence to fetch (1-based).
- length is the length of the sequence to fetch.
- If inverse is true, the complementary strand is fetched.
Here is an example.
var reader = require('fastareader').create('/path/to/fasta.fasta');
var rname = 'chr11';
var start = 36181240;
var length = 420;
var rev = true;
var seq = reader.fetch(rname, start, length, rev);
reader.fetchByFormat(format)
format is a string compatible with the dna library.
An example of the format:
chr2:34100214-34101989,-
Note that this format uses 0-based coordinates.
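To make the difference from `reader.fetch()` concrete, here is a minimal sketch that parses such a string and converts it to 1-based fetch arguments. `parseRegion` and `toFetchArgs` are hypothetical names, not part of FASTAReader or the dna library. The conversion assumes the region is half-open, that is `[start, end)`, and that a missing strand means the plus strand; neither detail is stated in this README.

```javascript
// Hypothetical parser for strings such as "chr2:34100214-34101989,-".
// Returns null when the string does not match that shape.
function parseRegion(format) {
  var m = /^([^:]+):(\d+)-(\d+)(?:,([+-]))?$/.exec(format);
  if (!m) return null;
  return {
    rname: m[1],
    start: parseInt(m[2], 10),
    end: parseInt(m[3], 10),
    strand: m[4] || '+' // ASSUMPTION: a missing strand means the plus strand
  };
}

// ASSUMPTION: the region is 0-based and half-open, i.e. [start, end).
function toFetchArgs(region) {
  return {
    rname: region.rname,
    start: region.start + 1,           // 0-based -> 1-based
    length: region.end - region.start, // length of a half-open interval
    inverse: region.strand === '-'
  };
}

console.log(toFetchArgs(parseRegion('chr2:34100214-34101989,-')));
```

If the end coordinate turns out to be inclusive, the length would be one larger.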
reader.getEndPos(rname)
Gets the last position of rname.
It is the same as the length of the reference.
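One possible use of the end position is to validate a request before fetching. The sketch below assumes `endPos` was obtained from `reader.getEndPos(rname)`; `isWithinReference` is a hypothetical helper, and how FASTAReader itself handles out-of-range requests is not described here.

```javascript
// Hypothetical check: does the 1-based region [start, start + length - 1]
// lie inside a reference whose last position is endPos?
function isWithinReference(endPos, start, length) {
  return start >= 1 && length >= 1 && start + length - 1 <= endPos;
}

console.log(isWithinReference(1000, 991, 10)); // true  (covers 991..1000)
console.log(isWithinReference(1000, 992, 10)); // false (would end at 1001)
```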
reader.hasN(rname, start, length)
Returns true if the region contains N; otherwise returns false. The region is specified by rname, start and length, which have the same meaning as in reader.fetch().
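The check can be pictured on an in-memory string. This is a sketch of the idea only: `regionHasN` is a hypothetical function that takes the whole reference as a string, whereas the real method works on the FASTA file, and matching only uppercase `N` is an assumption.

```javascript
// Hypothetical in-memory version: seq is the whole reference as a string,
// start is 1-based and length counts bases, as in reader.fetch().
function regionHasN(seq, start, length) {
  var region = seq.slice(start - 1, start - 1 + length);
  return region.indexOf('N') !== -1; // ASSUMPTION: only uppercase N counts
}

console.log(regionHasN('ACGTNNACGT', 1, 4)); // false (region is ACGT)
console.log(regionHasN('ACGTNNACGT', 4, 2)); // true  (region is TN)
```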
NOTICE
FASTAReader uses a 1-based coordinate system.
[1-based coordinate system]
A coordinate system in which the first base of a sequence is numbered one. In this coordinate system, a region is specified by a closed interval. For example, the region between the 3rd and 7th bases inclusive is [3, 7]. The SAM, GFF and Wiggle formats use the 1-based coordinate system.
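Since JavaScript strings are indexed from zero, the closed interval above maps onto `String.prototype.slice` as shown in this short sketch. `sliceClosed` and `sliceByLength` are hypothetical helpers for illustration, not FASTAReader functions.

```javascript
// Extracts the 1-based closed interval [first, last] from a string.
function sliceClosed(seq, first, last) {
  // 1-based inclusive [first, last] equals 0-based half-open [first - 1, last).
  return seq.slice(first - 1, last);
}

// The same region expressed as a 1-based start plus a length.
function sliceByLength(seq, start, length) {
  return seq.slice(start - 1, start - 1 + length);
}

var seq = 'ACGTACGTAC';
console.log(sliceClosed(seq, 3, 7));   // GTACG
console.log(sliceByLength(seq, 3, 5)); // GTACG
```

The interval [3, 7] holds 7 - 3 + 1 = 5 bases, which is why the two calls return the same string.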