R_BbiParser: an R interface to BigWig and BigBed

The BigWig and BigBed file formats are commonly used to describe features of genomic regions, such as chromatin modifications or transcription factor binding sites determined by chromatin immunoprecipitation (ChIP). They are binary, indexed and compressed forms of the text-based Wig and Bed formats, and were designed to allow querying and partial file transfers between pairs of web servers. This makes the data difficult to read directly into an R session using the usual functions. Tools exist to convert the binary files to the text-based formats, which can be easily analysed in an R session, but this is wasteful when analysing large amounts of data. To remedy this I wrote a set of tools that allow data from the Big- file formats to be accessed directly from R using the indexing already present in the files. This makes it easy, for example, to obtain histone modifications at all transcriptional start sites, or to obtain information for a given region across a large number of different files.

Bioconductor also has an R interface to the BigWig format in the rtracklayer package. rtracklayer additionally provides a broad range of accessors for other file formats used to describe genomic features, and is more generally useful. BbiParser is much more specialised and was written primarily to be easy to optimise for speed. It is implemented using the Rcpp modules interface, which makes it very easy to extend the functionality on the C++ side.