The UCSC Genome Browser can display both
the BAM and CRAM file formats. While a
BAM file contains all of its sequence data, a CRAM
file is smaller because it relies on an additional
external "reference sequence" file. This file
is needed to both compress and decompress the read information.
Since CRAM files are more compact than BAM files, many groups
are switching to the CRAM format to save disk space. For a
CRAM track to load, the md5sum of the reference sequence used to
create the CRAM is expected to be in the CRAM header.
A file with a matching md5sum is also expected to be
accessible from the EBI CRAM Reference Registry.
Please also note that just as a BAM
file requires an associated BAM.bai index file, a CRAM file
requires an associated CRAM.crai index file in the same location to load.
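The two requirements above (an md5sum in the header and an index file next to the data file) can be checked before a track is loaded. The following is a minimal sketch, not part of the browser itself. It assumes the header has already been dumped as SAM-style text (for example with samtools view -H), that the md5sum is stored in the M5 tag of each @SQ line, and that the index is named by appending .crai or .bai to the data file name. The function names missing_m5 and index_url are hypothetical.

```python
def missing_m5(header_text):
    """Return the names of @SQ lines in SAM-style header text that lack an M5 tag."""
    missing = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:value.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        if "M5" not in fields:
            missing.append(fields.get("SN", "?"))
    return missing


def index_url(data_url):
    """Guess the index location by appending .crai or .bai (an assumed naming rule)."""
    if data_url.endswith(".cram"):
        return data_url + ".crai"
    if data_url.endswith(".bam"):
        return data_url + ".bai"
    raise ValueError("expected a URL ending in .bam or .cram")
```

An empty list from missing_m5 suggests every reference sequence in the header carries an md5sum. It does not confirm that the md5sum is available from the EBI CRAM Reference Registry.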
Example One
Here is an example CRAM track that displays around the gene SOD1
on hg19. It can be cut and pasted as text into the
Custom Tracks
page:
track type=bam db=hg19 name=exampleCRAM bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/cramExample.cram
Clicking the following link will also load the above track. The information
following hgct_customText is equivalent to pasting the text into
the Custom Tracks page:
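A link of this kind can also be assembled programmatically by URL-encoding the track line into the hgct_customText parameter. The sketch below is illustrative only. It assumes the hgTracks CGI at genome.ucsc.edu as the endpoint and db as the assembly parameter, and the helper name custom_track_url is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint; a mirror would use its own host name.
HGTRACKS = "https://genome.ucsc.edu/cgi-bin/hgTracks"


def custom_track_url(db, track_line):
    """Build a URL that passes a custom track line via hgct_customText."""
    # urlencode escapes the spaces, '=' signs and '://' inside the track line.
    query = urlencode({"db": db, "hgct_customText": track_line})
    return HGTRACKS + "?" + query


track_line = (
    "track type=bam db=hg19 name=exampleCRAM "
    "bigDataUrl=http://genome.ucsc.edu/goldenPath/help/examples/cramExample.cram"
)
print(custom_track_url("hg19", track_line))
```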
Because CRAM data can only be decoded with the specific reference sequence
used to create the CRAM file, it is very important that the exact same
reference sequence is used for compression and decompression. When a
CRAM file is first loaded on a given chromosome, the browser checks a
special "cramCache" directory for the reference md5sum specified in the
file. If the reference sequence information for the currently viewed
chromosome region is not there, a message will display about the file
not being found, along with a note about downloading
the reference from the EBI CRAM Reference Registry if it is available.
A refresh of the page once the download is
complete will display the CRAM data as if it were a BAM file.
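The cache check described above can be pictured with a small sketch. It assumes the md5sum follows the SAM specification's M5 convention (the sequence in uppercase with whitespace removed) and, purely for illustration, that cached references are stored as files named by their md5sum directly inside the cache directory. The real cramCache layout may differ, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def reference_md5(sequence):
    """md5sum of a reference sequence, uppercased and with whitespace removed."""
    cleaned = "".join(sequence.split()).upper()
    return hashlib.md5(cleaned.encode("ascii")).hexdigest()


def is_cached(cache_dir, md5):
    """True if a file named after the md5sum exists in cache_dir (assumed layout)."""
    return (Path(cache_dir) / md5).is_file()
```

Because the md5sum is computed over the normalized sequence, line wrapping and letter case in a FASTA file do not change it, but any difference in the bases themselves does. This is why a reference that is merely similar to the original cannot be used to decode a CRAM file.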
Example Two
Clicking the following image will load a CRAM file from
the 1000 Genomes Project.
This CRAM display uses the new "density graph"
feature, in which the bam.cram reads are displayed as a bar graph.
To enable it, check the box next to "Display data as a density graph"
on the Custom Track Settings page.
Example Three
The CRAM format is also supported in track hubs. Below is an example
trackDb.txt stanza that would display a CRAM file from the 1000 Genomes
Project. To learn more about using track hubs, see the User Guide and associated Quick Start Guides to building
hubs. Note that type bam is used to display CRAM files in hubs, just
as type bam is used in custom CRAM tracks.
track cram61
type bam
shortLabel HG00361
longLabel This CRAM file is from the 1000 Genomes Project HG00361
visibility pack
bigDataUrl ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/data/HG00361/exome_alignment/HG00361.mapped.ILLUMINA.bwa.FIN.exome.20120522.bam.cram
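When a hub contains many CRAM files, stanzas like the one above can be generated rather than written by hand. The following is a minimal sketch under a few assumptions: settings are written one per line as "name value", the track setting comes first, and type and bigDataUrl are treated as required. The function name trackdb_stanza and these checks are choices made for this example, not rules taken from the hub documentation.

```python
def trackdb_stanza(settings):
    """Render a dict of trackDb settings as one stanza, one 'name value' line each."""
    keys = list(settings)
    if not keys or keys[0] != "track":
        raise ValueError("the stanza must start with a 'track' setting")
    for required in ("type", "bigDataUrl"):
        if required not in settings:
            raise ValueError(f"missing required setting: {required}")
    # Dicts keep insertion order, so lines come out in the order given.
    return "".join(f"{key} {value}\n" for key, value in settings.items())


stanza = trackdb_stanza({
    "track": "cram61",
    "type": "bam",  # type bam is used for CRAM files as well
    "shortLabel": "HG00361",
    "visibility": "pack",
    "bigDataUrl": "http://example.org/HG00361.cram",  # placeholder URL
})
print(stanza)
```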
References
Below is a collection of helpful CRAM references:
Sharing Your Data with Others
If you would like to share your CRAM data track with a colleague, learn
how to create a URL by looking at Example 11 on
this page.
Activating CRAM support for the Genome Browser
For documentation on how to set up CRAM support on a mirror of
the UCSC Genome Browser, please see the
README.cram file.