11 Working with Genomic Coordinates

Standards xkcd 927

11.1 Data File Formats

Genomic coordinates are central to many genomics and bioinformatics analyses. Unfortunately, there has been no coordinated effort to adopt a universal file format, and the number of data file formats has grown along with genomics technology. For instance, the Human Genome Project gave rise to the Browser Extensible Data (BED) format, which was later expanded into a wide variety of BED-like formats. The 1000 Genomes Project came up with the Variant Call Format (VCF) to document genomic variants, while others used the Generic Feature Format (GFF). Fortunately, in R, the rtracklayer and GenomicRanges packages help us work with these formats.
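Whatever the file format, each record boils down to a sequence name, a start, an end, and optionally a strand plus some metadata. The following is a minimal sketch of that idea as a GRanges object, assuming GenomicRanges is installed from Bioconductor. The coordinates, strands, and the score column are made up for illustration.

```r
library(GenomicRanges)

# Three made-up intervals; all values here are illustrative, not real data
gr <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges   = IRanges(start = c(101, 501, 1001), end = c(200, 650, 1100)),
  strand   = c("+", "-", "*"),
  score    = c(10, 20, 30)   # extra arguments become metadata columns
)

gr                 # print the ranges together with their metadata
width(gr)          # interval lengths; start and end are both inclusive
as.character(seqnames(gr))
```

GRanges uses 1-based, closed intervals, so a range from 101 to 200 has width 100.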

11.2 GenomicRanges

Genomes Data import into R

library(rtracklayer)

Under Construction - more to come …

11.2.1 Import Data
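As a rough sketch of what an import can look like, the example below writes a tiny BED file to a temporary location and reads it back with import() from rtracklayer. The file contents and the feature names (featureA, featureB) are hypothetical and exist only for this illustration.

```r
library(rtracklayer)

# Write a small, made-up BED file (tab-separated: chrom, start, end, name, score, strand)
bed_file <- tempfile(fileext = ".bed")
writeLines(c(
  "chr1\t100\t200\tfeatureA\t10\t+",
  "chr1\t500\t650\tfeatureB\t20\t-"
), bed_file)

# import() returns a GRanges object; the format is stated explicitly here,
# although it can usually be guessed from the file extension
bed_gr <- import(bed_file, format = "BED")

bed_gr
start(bed_gr)   # BED starts are 0-based, GRanges starts are 1-based
bed_gr$name     # the BED name column is kept as metadata
```

Note the coordinate shift: BED stores 0-based, half-open intervals, so the start of 100 in the file becomes 101 in the GRanges object, while the end of 200 is unchanged.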