The net file format is used to describe the axtNet data that underlie the
net alignment annotations in the Genome Browser. For a detailed description of the
methods used to generate these data, refer to the Genome Browser description pages
that accompany the
downloadable net alignment tracks.
Each target species chromosome begins with a “net” line of the form:
net chromName chromSize
Example:
net chr2L 23011544
Here chromName is the name of the target species chromosome
and chromSize is the size of that chromosome. The fill and gap lines
for the chromosome follow the net line. Each new target chromosome
starts with its own net line.
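As a minimal sketch of how the net lines partition a file, the following Python groups records under the most recent net line. The function name split_by_chrom and the dictionary layout are illustrative choices, not part of the format or of any Genome Browser tool.

```python
def split_by_chrom(lines):
    """Group net-file lines under their 'net chromName chromSize' header.

    Returns {chromName: {"size": int, "records": [str, ...]}}.
    Leading spaces on fill/gap lines are preserved, since indentation
    carries meaning in this format.
    """
    sections = {}
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        words = line.split()
        if words[0] == "net":
            # A net line opens a new target chromosome section.
            current = {"size": int(words[2]), "records": []}
            sections[words[1]] = current
        elif current is not None:
            current["records"].append(line)
    return sections
```

Reading a file handle line by line and passing it to this function would yield one entry per target chromosome, in file order.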
File indentation: Line indentation level represents the
parent/child relationship between records and is a necessary part of the net
file format. Child records are indented one space from the parent, as seen in the
example net file below.
net chr2L 23011544
 fill 6004 3278 chrXR_group3a - 1396397 2164 id 25606 score 23114 ali 782 qDup 576 type top tN 0 qN 0 tR 36 qR 0 tTrf 0 qTrf 0
  gap 6065 2 chrXR_group3a - 1398498 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
  gap 6096 1485 chrXR_group3a - 1397572 897 tN 0 qN 0 tR 36 qR 0 tTrf 0 qTrf 0
   fill 6096 513 chrU - 5570675 533 id 48675 score 4435 ali 465 qDup 533 type nonSyn tN 0 qN 0 tR 0 qR 13 tTrf 0 qTrf 0
    gap 6116 8 chrU - 5571188 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 6156 5 chrU - 5571156 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 6184 3 chrU - 5571133 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 6212 18 chrU - 5571106 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 6244 9 chrU - 5571092 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 6340 2 chrU - 5570996 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 6515 3 chrU - 5570771 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
  gap 7623 1 chrXR_group3a - 1397530 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
  gap 7664 1007 chrXR_group3a - 1397008 482 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
   fill 7664 382 chrXL_group1e - 8262003 506 id 25608 score 10609 ali 364 qDup 506 type nonSyn tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 7784 4 chrXL_group1e - 8262361 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 7792 3 chrXL_group1e - 8262357 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 7921 2 chrXL_group1e - 8262126 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
    gap 7949 9 chrXL_group1e - 8262092 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
  gap 8693 1 chrXR_group3a - 1396985 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
 fill 9833 1251 chrU - 5562980 1239 id 48675 score 10720 ali 1124 qDup 1094 type top tN 0 qN 0 tR 22 qR 88 tTrf 0 qTrf 0
  gap 9966 7 chrU - 5564075 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
  gap 10015 3 chrU - 5564030 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
  gap 10088 2 chrU - 5563957 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
  gap 10101 8 chrU - 5563946 0 tN 0 qN 0 tR 0 qR 0 tTrf 0 qTrf 0
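Because each child record is indented one space further than its parent, the parent/child relationships can be recovered with a stack keyed on indentation depth. The sketch below is one possible approach, not the Genome Browser's own parser; the names build_tree and sample are hypothetical, and the sample lines are shortened copies of the first records of the example (optional fields trimmed), written with the one-space-per-level indentation described above.

```python
def build_tree(lines):
    """Nest records by counting leading spaces.

    A record's parent is the nearest preceding record with a smaller
    indentation depth. Returns the list of top-level nodes; each node
    is a dict with "line", "depth" and "children" keys.
    """
    root = {"line": None, "depth": -1, "children": []}
    stack = [root]
    for raw in lines:
        text = raw.rstrip()
        if not text:
            continue  # ignore blank lines
        depth = len(text) - len(text.lstrip(" "))
        node = {"line": text.strip(), "depth": depth, "children": []}
        # Pop back to the closest ancestor that is shallower than this record.
        while stack[-1]["depth"] >= depth:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


# Shortened sample records, indented one space per nesting level.
sample = [
    "net chr2L 23011544",
    " fill 6004 3278 chrXR_group3a - 1396397 2164 id 25606 type top",
    "  gap 6065 2 chrXR_group3a - 1398498 0",
    "  gap 6096 1485 chrXR_group3a - 1397572 897",
    "   fill 6096 513 chrU - 5570675 533 id 48675 type nonSyn",
    "    gap 6116 8 chrU - 5571188 0",
]
tree = build_tree(sample)
```

In the resulting tree, the nonSyn fill at 6096 hangs under the gap at 6096, which in turn hangs under the top-level fill at 6004.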
Field definitions
Each fill or gap line consists of 7 fixed fields followed by a set
of optional name/value pair fields. In the descriptions
below, target refers to the reference
species and query refers to the aligning
species.
Fixed fields
- Class -- Either fill or gap. Fill refers to a portion of a chain.
- Start in chromosome -- (target species)
- Size -- (target species)
- Chromosome name -- (query species)
- Relative orientation -- between target and query species.
- Start in chromosome -- (query species)
- Size -- (query species)
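The seven fixed fields listed above map naturally onto a small record type. The following is a minimal sketch; NetRecord, parse_fixed and the attribute names are illustrative choices and do not come from the format description.

```python
from typing import NamedTuple


class NetRecord(NamedTuple):
    cls: str      # "fill" or "gap"
    t_start: int  # start in target chromosome
    t_size: int   # size on target
    q_name: str   # query chromosome name
    strand: str   # relative orientation, "+" or "-"
    q_start: int  # start in query chromosome
    q_size: int   # size on query


def parse_fixed(line):
    """Parse the 7 fixed fields of a fill or gap line (indentation ignored)."""
    w = line.split()
    if len(w) < 7 or w[0] not in ("fill", "gap"):
        raise ValueError(f"not a fill/gap line: {line!r}")
    return NetRecord(w[0], int(w[1]), int(w[2]), w[3], w[4], int(w[5]), int(w[6]))


rec = parse_fixed(
    "gap 6096 1485 chrXR_group3a - 1397572 897 tN 0 qN 0 tR 36 qR 0 tTrf 0 qTrf 0"
)
```

Here rec.t_start is 6096, rec.t_size is 1485 and rec.q_size is 897; any words after the seventh are left for the optional-field handling.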
Optional fields (Name/value pairs)
- id -- ID of associated chain (gapped alignment), if any.
- score -- Score of associated chain.
- ali -- Number of bases in alignments in chain.
- qFar -- For a fill that is on the same chromosome as its parent, how far the fill is from the position predicted by the parent. This helps determine if a rearrangement is local or if a duplication is tandem.
- qOver -- Number of bases overlapping with the parent gap on the query side. Generally, this will be near zero, except for inverts.
- qDup -- Number of bases in the query region that are used twice or more in the net. This helps distinguish between a rearrangement and a duplication.
- type -- One of the following values:
  - top -- Chain is top-level, not a gap filler.
  - syn -- Chain is on the same chromosome and in the same direction as parent.
  - inv -- Chain is on the same chromosome, in the opposite direction from parent.
  - nonSyn -- Chain is on a different chromosome from parent.
- tN -- Number of unsequenced bases (Ns) on target side.
- qN -- Number of unsequenced bases on query side.
- tR -- Number of bases in RepeatMasker masked repeats on target.
- qR -- Number of bases in RepeatMasker masked repeats on query.
- tNewR -- Bases in lineage-specific repeats on target.
- qNewR -- Bases in lineage-specific repeats on query.
- tOldR -- Bases in repeats predating split on target.
- qOldR -- Bases in repeats predating split on query.
- tTrf -- Bases in trf (Tandem Repeat Finder) repeats on target.
- qTrf -- Bases in trf repeats on query.
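Since the optional fields are name/value pairs that follow the seven fixed fields, they can be collected into a dictionary. The sketch below assumes that every value other than type is numeric and falls back to the raw string if that assumption does not hold; parse_optional is a hypothetical helper name.

```python
def parse_optional(line):
    """Collect the name/value pairs that follow the 7 fixed fields."""
    words = line.split()[7:]
    if len(words) % 2 != 0:
        raise ValueError(f"unpaired optional field in: {line!r}")
    pairs = {}
    for name, value in zip(words[0::2], words[1::2]):
        if name == "type":
            pairs[name] = value  # top, syn, inv or nonSyn
            continue
        try:
            pairs[name] = int(value)
        except ValueError:
            pairs[name] = value  # assumption failed: keep the raw text
    return pairs


opts = parse_optional(
    "fill 6096 513 chrU - 5570675 533 id 48675 score 4435 ali 465 "
    "qDup 533 type nonSyn tN 0 qN 0 tR 0 qR 13 tTrf 0 qTrf 0"
)
```

Fields that are absent from a line, such as qFar in this example, are simply missing from the dictionary, so callers can use opts.get("qFar") to treat them as optional.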