Schema for gnomAD v3 - Genome Aggregation Database (gnomAD) Genome Variants v3

field      description
chrom      An identifier from the reference genome
pos        The reference position, with the 1st base having position 1
id         Semicolon-separated list of unique identifiers where available
ref        Reference base(s)
alt        Comma-separated list of alternate non-reference alleles called on at least one of the samples
qual       Phred-scaled quality score for the assertion made in ALT, i.e. -10 log10 prob(call in ALT is wrong)
filter     PASS if this position has passed all filters. Otherwise, a semicolon-separated list of codes for the filters that failed
info       Additional information encoded as a semicolon-separated series of short keys with optional comma-separated values
format     If genotype columns are specified in the header, a colon-separated list of short keys starting with GT
genotypes  If genotype columns are specified in the header, a tab-separated set of genotype column values; each value is a colon-separated list of values corresponding to keys in the format column
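
As a rough illustration of how the fields above fit together, the following Python sketch splits one tab-separated VCF data line into the named columns, parses the info string into a dictionary, and converts qual into an error probability. The function names are made up for this example, the ref and alt bases in the sample line are hypothetical, and the use of "." for a missing id or qual follows the general VCF convention rather than anything stated on this page.

```python
def parse_info(info):
    """Parse a semicolon-separated INFO string into a dict.

    Keys without '=' (such as 'lcr') are treated as boolean flags;
    keyed values are split on commas into lists of strings.
    """
    parsed = {}
    for item in info.split(";"):
        if not item:
            continue
        key, has_value, value = item.partition("=")
        parsed[key] = value.split(",") if has_value else True
    return parsed


def qual_to_error_prob(qual):
    """Invert the Phred scale: qual = -10 * log10(p), so p = 10 ** (-qual / 10)."""
    return 10 ** (-qual / 10)


def parse_record(line):
    """Split one VCF data line into its first eight fixed columns."""
    chrom, pos, vid, ref, alt, qual, flt, info = line.rstrip("\n").split("\t")[:8]
    return {
        "chrom": chrom,
        "pos": int(pos),  # 1-based reference position
        "id": [] if vid == "." else vid.split(";"),
        "ref": ref,
        "alt": alt.split(","),
        "qual": None if qual == "." else float(qual),
        "filter": [] if flt == "PASS" else flt.split(";"),  # empty list means all filters passed
        "info": parse_info(info),
    }


# Hypothetical line: the T>C alleles are invented for illustration.
line = "chr1\t10031\t.\tT\tC\t77.00\tAC0;AS_VQSR\tAC=0;AN=53780;lcr;variant_type=snv"
record = parse_record(line)
print(record["filter"], record["info"]["AN"], record["info"]["lcr"])
```

Values in the info dictionary are kept as strings here; converting them to numbers would require the type declarations from the VCF header, which this sketch does not read.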

Columns: chrom, pos, ref, alt, qual, filter, info

chrom: chr1 | pos: 10031 | qual: 77.00 | filter: AC0;AS_VQSR
info: AC=0;AN=53780;AF=0.00000e+00;lcr;variant_type=snv;n_alt_alleles=1;ReadPosRankSum=-1.38000e+00;MQRankSum=-5.72000e-01;RAW_MQ=6.39 ...

chrom: chr1 | pos: 10037 | qual: 180.00 | filter: AS_VQSR
info: AC=2;AN=72762;AF=2.74869e-05;lcr;variant_type=snv;n_alt_alleles=1;ReadPosRankSum=-4.80000e-01;MQRankSum=1.37100e+00;RAW_MQ=2.025 ...

chrom: chr1 | pos: 10043 | qual: 97.00 | filter: AS_VQSR
info: AC=1;AN=81114;AF=1.23283e-05;lcr;variant_type=snv;n_alt_alleles=1;ReadPosRankSum=-8.96000e-01;MQRankSum=1.23100e+00;RAW_MQ=8.174 ...

chrom: chr1 | pos: 10055 | qual: 75.00 | filter: AS_VQSR
info: AC=1;AN=89638;AF=1.11560e-05;lcr;variant_type=snv;n_alt_alleles=1;ReadPosRankSum=-1.10600e+00;MQRankSum=7.15000e-01;RAW_MQ=1.141 ...

chrom: chr1 | pos: 10057 | qual: 264.00 | filter: AS_VQSR
info: AC=3;AN=107374;AF=2.79397e-05;lcr;variant_type=snv;n_alt_alleles=1;ReadPosRankSum=-6.84000e-01;MQRankSum=7.88000e-01;RAW_MQ=2.27 ...

chrom: chr1 | pos: 10061 | qual: 72.00 | filter: AC0
info: AC=0;AN=103816;AF=0.00000e+00;lcr;variant_type=snv;n_alt_alleles=1;ReadPosRankSum=0.00000e+00;MQRankSum=1.05000e-01;RAW_MQ=1.410 ...

chrom: chr1 | pos: 10061 | qual: 1142.00 | filter: AC0;AS_VQSR
ref or alt: TAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCC
info: AC=0;AN=103816;AF=0.00000e+00;lcr;variant_type=indel;n_alt_alleles=1;ReadPosRankSum=-1.02600e+00;MQRankSum=7.36000e-01;RAW_MQ=8. ...

chrom: chr1 | pos: 10064 | qual: 71.00 | filter: AC0
ref or alt: CCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAAA
info: AC=0;AN=140930;AF=0.00000e+00;lcr;variant_type=indel;n_alt_alleles=1;ReadPosRankSum=-7.27000e-01;MQRankSum=7.27000e-01;RAW_MQ=1. ...

chrom: chr1 | pos: 10067 | id: rs1489251879 | qual: 952.00 | filter: PASS
ref or alt: TAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCC
info: AC=2;AN=114200;AF=1.75131e-05;lcr;variant_type=indel;n_alt_alleles=1;ReadPosRankSum=0.00000e+00;MQRankSum=6.76000e-01;RAW_MQ=9.6 ...

chrom: chr1 | pos: 10108 | qual: 89.00 | filter: AS_VQSR
info: AC=1;AN=9128;AF=1.09553e-04;lcr;variant_type=indel;n_alt_alleles=1;ReadPosRankSum=2.10000e+00;MQRankSum=-1.55200e+00;RAW_MQ=7.87 ...
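
In the sample rows, the AC, AN and AF keys of the info column appear to be related by AF = AC / AN, which matches the usual VCF meaning of these keys (alternate allele count, total allele number, allele frequency). The short Python check below tests that relationship against the values shown above; the helper names are invented for this example, and the tolerance simply reflects the six significant digits printed for AF.

```python
import math


def allele_frequency(ac, an):
    """Return AC / AN, or 0.0 when no alleles were called (AN == 0)."""
    return ac / an if an else 0.0


def af_consistent(ac, an, af, rel_tol=1e-5):
    """Check a reported AF against AC / AN, allowing for rounding in the printed value."""
    return math.isclose(allele_frequency(ac, an), af, rel_tol=rel_tol, abs_tol=1e-12)


# (AC, AN, AF) triples copied from the sample rows.
sample_rows = [
    (0, 53780, 0.00000e+00),
    (2, 72762, 2.74869e-05),
    (1, 81114, 1.23283e-05),
    (1, 89638, 1.11560e-05),
    (3, 107374, 2.79397e-05),
    (2, 114200, 1.75131e-05),
    (1, 9128, 1.09553e-04),
]

all_consistent = all(af_consistent(ac, an, af) for ac, an, af in sample_rows)
print(all_consistent)
```

Rows with AC=0 also carry the AC0 filter code in the sample rows, so a frequency of zero there goes together with a failed filter rather than a passing variant.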

Description

The gnomAD v3 track shows variants from 71,702 whole genomes (and no exomes), all mapped to the GRCh38/hg38 reference sequence. Most of the genomes from v2 are included in v3. For more detailed information on gnomAD v3, see the related blog post.

The gnomAD v2 tracks show variants from 125,748 exomes and 15,708 whole genomes, all mapped to the GRCh37/hg19 reference sequence and lifted to the GRCh38/hg38 assembly. The data originate from 141,456 unrelated individuals sequenced as part of various population-genetic and disease-specific studies collected by the Genome Aggregation Database (gnomAD), release 2.1.1. Raw data from all studies have been reprocessed through a unified pipeline and jointly variant-called to increase consistency across projects. For more information on the processing pipeline and population annotations, see the following blog post and the 2.1.1 README.

gnomAD v2 data are based on the GRCh37/hg19 assembly. These tracks display the GRCh38/hg38 lift-over provided by gnomAD on their downloads site.

For questions on the gnomAD data, also see the gnomAD FAQ.

Display Conventions

In dense mode, a vertical line is drawn at the position of each variant.

In pack mode, "ref" and "alt" alleles are displayed to the left of a vertical line with colored portions corresponding to allele counts. Hovering the mouse pointer over a variant pops up a display of alleles and counts.

Data Access

The raw data can be explored interactively with the Table Browser or the Data Integrator. For automated analysis, the data may be queried from our REST API. The genome annotations are stored in files that can be downloaded from our download server, subject to the conditions set forth by the gnomAD consortium (see below). Coverage values for the genome are in bigWig files in the coverage/ subdirectory. Variant VCFs can be found in the vcf/ subdirectory.
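
For the REST API route, a query URL might be assembled along the following lines. This is only a sketch under several assumptions that should be checked against the API documentation: that track data is served from the getData/track endpoint at api.genome.ucsc.edu, that parameters are joined with semicolons, that start is 0-based while end is 1-based, and that this VCF-backed track is available through that endpoint at all. The track name is a placeholder, not the real table name, and the function name is made up for this example. No request is sent.

```python
from urllib.parse import quote

# Assumed endpoint; verify against the REST API documentation.
API_BASE = "https://api.genome.ucsc.edu/getData/track"


def build_track_query(track, chrom, first_pos, last_pos, genome="hg38"):
    """Build a query URL covering 1-based VCF positions first_pos..last_pos inclusive.

    The start parameter is assumed to be 0-based and the end parameter 1-based,
    so the 1-based first position is shifted down by one.
    """
    if first_pos < 1 or last_pos < first_pos:
        raise ValueError("positions must satisfy 1 <= first_pos <= last_pos")
    params = [
        ("genome", genome),
        ("track", track),
        ("chrom", chrom),
        ("start", first_pos - 1),
        ("end", last_pos),
    ]
    return API_BASE + "?" + ";".join(f"{key}={quote(str(value))}" for key, value in params)


# "TRACK_NAME" is a placeholder for the actual track or table name.
url = build_track_query("TRACK_NAME", "chr1", 10031, 10108)
print(url)
```

The resulting string can be passed to any HTTP client; the shape of the response is not covered here.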

The data can also be downloaded directly from the gnomAD downloads page. Please refer to our mailing list archives for questions, or our Data Access FAQ for more information.

Credits

Thanks to the Genome Aggregation Database Consortium for making these data available. The data are released under the ODC Open Database License (ODbL) as described here.

References

Karczewski KJ, Francioli LC, Tiao G, Cummings BB, Alfoldi J, Wang Q, Collins RL, Laricchia KM, Ganna A, Birnbaum DP et al. Variation across 141,456 human exomes and genomes reveals the spectrum of loss-of-function intolerance across human protein-coding genes. doi: https://doi.org/10.1101/531210.

Lek M, Karczewski KJ, Minikel EV, Samocha KE, Banks E, Fennell T, O'Donnell-Luria AH, Ware JS, Hill AJ, Cummings BB et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature. 2016 Aug 17;536(7616):285-91. PMID: 27535533; PMC: PMC5018207