10. ID Fields

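This section shows how to format data based on the ID field.

Every data point can set values for formatting parameters such as color, stroke_thickness, and fill_color.

The ID field attaches an annotation to a data point, and the format of each data point can then be computed from that annotation. The ID does not affect how the data is displayed directly. It is a string label that you test in rule blocks.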

10.1 Formatting Tiles

The examples in this section use repeat annotations such as the following:


chr21 30003462 30003712 AluSx SINE Alu
chr21 30003734 30003925 L1MD LINE L1
chr21 30004082 30004207 L1ME4a LINE L1
chr21 30004229 30004286 AT_rich Low_complexity Low_complexity
chr21 30004378 30004615 L1ME4a LINE L1
chr21 30004781 30004872 AT_rich Low_complexity Low_complexity
chr21 30004942 30005099 CT-rich Low_complexity Low_complexity
chr21 30005358 30005634 MER7A DNA MER2_type
chr21 30006113 30006265 L1ME4a LINE L1

Each element has several classifications (e.g. AluSx, SINE, Alu). To use them, parse the data and associate an id parameter with each data point:

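use strict;
use warnings;
$, = " ";    # separate printed fields with a space
$\ = "\n";   # end every printed record with a newline
while (<>) {
    my $id;
    if (/L(\d+)/) {
        # L1, L2, ... become LINE1, LINE2, ...
        $id = "LINE$1";
    } elsif (/S(\d+)/ || /SINE(\d?)/) {
        # SINE elements without a number become SINE0
        $id = "SINE" . ($1 || 0);
    } elsif (/low/i || /simple/i) {
        $id = "SIMPLE";
    } elsif (/LTR/) {
        $id = "LTR";
    } else {
        $id = "OTHER";
    }
    my @tok = split;
    # rename chr21 to hs21 so the chromosome name matches the output shown below
    $tok[0] =~ s/^chr/hs/;
    print @tok[0..2], "id=$id";
}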


hs21 30003462 30003712 id=SINE0
hs21 30003734 30003925 id=LINE1
hs21 30004082 30004207 id=LINE1
hs21 30004229 30004286 id=SIMPLE
hs21 30004378 30004615 id=LINE1
hs21 30004781 30004872 id=SIMPLE
hs21 30004942 30005099 id=SIMPLE
hs21 30005358 30005634 id=OTHER
hs21 30006113 30006265 id=LINE1
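
For readers who want to check the classification outside Perl, the following is a minimal Python sketch of the same logic. The function names classify and add_id are hypothetical and chosen for this illustration only. The regular expressions are copied from the script above, and the chr-to-hs renaming is assumed from the output shown.

```python
import re


def classify(line: str) -> str:
    """Return the id label for one repeat annotation line."""
    m = re.search(r"L(\d+)", line)
    if m:
        return "LINE" + m.group(1)
    # Try S<digits> first, then SINE with an optional digit.
    m = re.search(r"S(\d+)", line) or re.search(r"SINE(\d?)", line)
    if m:
        # An empty capture means no number was present: use 0.
        return "SINE" + (m.group(1) or "0")
    if re.search(r"low|simple", line, re.IGNORECASE):
        return "SIMPLE"
    if "LTR" in line:
        return "LTR"
    return "OTHER"


def add_id(line: str) -> str:
    """Keep chromosome, start, end and append the id parameter."""
    tok = line.split()
    chrom = re.sub(r"^chr", "hs", tok[0])  # assumed renaming, see lead-in
    return " ".join([chrom, tok[1], tok[2], "id=" + classify(line)])


if __name__ == "__main__":
    print(add_id("chr21 30003462 30003712 AluSx SINE Alu"))
```

The first matching branch wins, so the order of the tests matters: an element whose line contains both an L-number and the word SINE would be labelled as a LINE.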

Define a plot block of type tile to display the data:

<plot>
type = tile
file = repeats.withid.txt
r0   = 0.8r
r1   = 0.98r
orientation = in
layers      = 50
thickness   = 20p
padding     = 6p
margin      = 0.001u
color       = black
# for very small tiles a stroke is useful because
# it ensures that tiles associated with very small
# spans will be visible
stroke_thickness = 2p
stroke_color = black
</plot>



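With these settings alone, all tiles share the same format. To format tiles with particular ID values, add rules whose conditions test the id parameter, either with a regular expression or by string equality: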

# test with regular expression
condition = var(id) =~ /LINE/


# test string equality
condition = var(id) eq "LINE"

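For example, the rules below apply different colors to elements whose ID is LINE, SINE, SIMPLE, or OTHER.

LINE elements are colored green, and LINE1 and LINE2 elements additionally get a red stroke_color. The flow = continue setting in the first rule lets the second rule be tested after the first one has matched.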




<rules>
<rule>
condition    = var(id) =~ /LINE/
color        = green
flow         = continue
</rule>

<rule>
condition    = var(id) =~ /LINE[12]/
stroke_color = red
</rule>

<rule>
condition    = var(id) =~ /SINE/
color        = blue
stroke_color = blue
</rule>

<rule>
condition    = var(id) =~ /SIMPLE/
color        = dgrey
</rule>

<rule>
condition    = var(id) =~ /OTHER/
color        = lgrey
</rule>
</rules>
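
The effect of flow = continue is easier to see in a small model. The Python sketch below is a simplified illustration, not Circos's implementation. It assumes rules are tested in the order listed and that testing stops at the first matching rule unless that rule sets flow = continue. The names RULES, DEFAULTS, and apply_rules are hypothetical.

```python
import re

# Each rule: a regex tested against the id, the parameters it sets,
# and whether later rules are still tested after it matches.
RULES = [
    {"pattern": r"LINE", "set": {"color": "green"}, "flow_continue": True},
    {"pattern": r"LINE[12]", "set": {"stroke_color": "red"}},
    {"pattern": r"SINE", "set": {"color": "blue", "stroke_color": "blue"}},
    {"pattern": r"SIMPLE", "set": {"color": "dgrey"}},
    {"pattern": r"OTHER", "set": {"color": "lgrey"}},
]

# Format taken from the tile block before any rule is applied.
DEFAULTS = {"color": "black", "stroke_color": "black"}


def apply_rules(tile_id, defaults=DEFAULTS):
    """Return the format of a tile after applying the rules in order."""
    fmt = dict(defaults)
    for rule in RULES:
        if re.search(rule["pattern"], tile_id):
            fmt.update(rule["set"])
            # Assumed behaviour: stop unless the rule asks to continue.
            if not rule.get("flow_continue", False):
                break
    return fmt


if __name__ == "__main__":
    for tile_id in ("LINE1", "LINE3", "SINE0", "SIMPLE", "OTHER"):
        print(tile_id, apply_rules(tile_id))
```

In this model a LINE1 tile ends up green with a red stroke, while a LINE3 tile is green with the default black stroke, because only the first rule matches it.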



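10.2 Formatting Links

Links can carry an ID field as well. The examples in this section use the following link data: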



linkid5 hs21 30772491 30777591 id=87-69
linkid5 hs21 30602230 30607330 id=87-69
linkid6 hs21 30257977 30263077 id=60-22
linkid6 hs21 30367808 30372908 id=60-22
linkid7 hs21 30079003 30084103 id=54-90
linkid7 hs21 30771970 30777070 id=54-90

First, set the link thickness from the first number in the ID field:

<rule>
# make sure that the id field matches the required number-number format
condition  = var(id) =~ /(\d+)-(\d+)/
# extract the two numbers in 'id' into @match and use the remap() function to map
# the first number from the range 1..100 to a thickness in the range 1..10.
thickness  = eval( my @match = "var(id)" =~ /(\d+)-(\d+)/; remap($match[0],1,100,1,10) )
# so that other rules can trigger too
flow = continue
</rule>


<rule>
# make sure that the id field matches the required number-number format
condition  = var(id) =~ /(\d+)-(\d+)/
# extract the two numbers in 'id' into @match and use the remap_int() function to map
# the first number from the range 1..100 to an integer thickness in the range 1..10.
thickness  = eval( my @match = "var(id)" =~ /(\d+)-(\d+)/; remap_int($match[0],1,100,1,10) )

# use the first number as the z-value (i.e. thick links will be drawn on top)
z          = eval( my @match = "var(id)" =~ /(\d+)-(\d+)/; $match[0] )

# use the second number for the color and transparency
color      = eval( my @match = "var(id)" =~ /(\d+)-(\d+)/;
                   sprintf("spectral-9-div-%d_a%d", remap_int($match[1],1,100,1,9),
                                                    remap_int($match[1],1,100,5,1 ) ) )
</rule>
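
To make the arithmetic concrete, here is a Python sketch of what the rule above computes for one link. It assumes remap() is a plain linear interpolation between the two ranges and that remap_int() rounds that result to the nearest integer. Circos's own functions may differ in details such as clamping or rounding of ties, so treat the numbers as illustrative. The function link_format is hypothetical.

```python
import math
import re


def remap(value, lo, hi, new_lo, new_hi):
    """Linearly map value from [lo, hi] onto [new_lo, new_hi] (assumed behaviour)."""
    return new_lo + (value - lo) / (hi - lo) * (new_hi - new_lo)


def remap_int(value, lo, hi, new_lo, new_hi):
    """Like remap, rounded to the nearest integer (assumed behaviour)."""
    return math.floor(remap(value, lo, hi, new_lo, new_hi) + 0.5)


def link_format(link_id):
    """Compute thickness, z and color from an id such as '87-69'."""
    m = re.search(r"(\d+)-(\d+)", link_id)
    if m is None:
        return None  # the rule's condition would not trigger
    first, second = int(m.group(1)), int(m.group(2))
    return {
        "thickness": remap_int(first, 1, 100, 1, 10),
        "z": first,
        # The reversed target range 5..1 makes larger numbers less transparent.
        "color": "spectral-9-div-%d_a%d" % (
            remap_int(second, 1, 100, 1, 9),
            remap_int(second, 1, 100, 5, 1),
        ),
    }


if __name__ == "__main__":
    for link_id in ("87-69", "60-22", "54-90"):
        print(link_id, link_format(link_id))
```

Under these assumptions the link with id=87-69 gets thickness 9, z 87, and color spectral-9-div-6_a2.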

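Note that the ID field, which is text, is wrapped in quotes. The value of var(id) is substituted into the expression, which is then evaluated as Perl, so the quotes are what make Perl treat the value as a string: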

# correct - "87-69" treated as string
my @match = "87-69" =~ /(\d+)-(\d+)/

# incorrect - 87-69 treated as a numeric expression, not a string
my @match = 87-69 =~ /(\d+)-(\d+)/
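
The difference can be reproduced with a few lines of Python that mimic the substitution step. This is only a sketch: it assumes the value of var(id) is inserted into the expression as plain text, which is what the two expressions above suggest, and the helper substitute is hypothetical.

```python
def substitute(expression, params):
    """Replace each var(name) in the expression with the parameter's text."""
    for name, value in params.items():
        expression = expression.replace("var(%s)" % name, str(value))
    return expression


params = {"id": "87-69"}

# With quotes, the inserted value stays a string literal.
quoted = substitute(r'my @match = "var(id)" =~ /(\d+)-(\d+)/', params)

# Without quotes, the inserted value reads as arithmetic on two numbers.
unquoted = substitute(r'my @match = var(id) =~ /(\d+)-(\d+)/', params)

if __name__ == "__main__":
    print(quoted)
    print(unquoted)
```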

Add -debug_group rules to the command line to print debugging information as the rules are parsed.

10.3 Formatting Text

Formatting text by its ID field works the same way as above. For example, given the following text data:

hs21 30829740 30829740 tb id=64
hs21 30405360 30405360 oe id=31
hs21 30112849 30112849 ps id=74
hs21 30721834 30721834 dg id=25
hs21 30325022 30325022 sj id=22

the rule below sets the color and size of each label from its ID field:

<rule>
condition  = 1
color      = eval(sprintf("set2-4-qual-%d",remap_int(var(id),1,100,1,4)))
label_size = eval(sprintf("%dp",remap_int(var(id),1,100,12,48)))
</rule>

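Here we use condition = 1, so the rule always triggers and the format of the ID field is not checked. Because the ID values are numeric, var(id) does not need to be quoted.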
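
As a worked example, the Python sketch below evaluates the same two expressions for the sample labels. As before, it assumes remap_int() is a linear interpolation rounded to the nearest integer, which may not match Circos exactly. The function text_format is hypothetical.

```python
import math


def remap_int(value, lo, hi, new_lo, new_hi):
    """Linear map from [lo, hi] to [new_lo, new_hi], rounded (assumed behaviour)."""
    mapped = new_lo + (value - lo) / (hi - lo) * (new_hi - new_lo)
    return math.floor(mapped + 0.5)


def text_format(id_value):
    """Color and label size for a label whose id is a number in 1..100."""
    return {
        "color": "set2-4-qual-%d" % remap_int(id_value, 1, 100, 1, 4),
        "label_size": "%dp" % remap_int(id_value, 1, 100, 12, 48),
    }


if __name__ == "__main__":
    for label, id_value in (("tb", 64), ("oe", 31), ("ps", 74), ("dg", 25), ("sj", 22)):
        print(label, text_format(id_value))
```

Under these assumptions the label tb (id=64) is drawn in set2-4-qual-3 at 35p, and sj (id=22) in set2-4-qual-2 at 20p.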


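11. Heatmap Links

You can color links by an associated value to create a heatmap effect.

11.1 Assigning Values to Links

A value cannot be associated with a link in the same way as with the data points above, but you can repurpose one of the link parameters for this. For example, use the value parameter: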

hs12 117427133 132349534 hs2 94056542 114056542 value=2
hs22 33232924 49691432 hs4 88399610 108399610 value=5

11.2 Coloring Links by Value

Now that each link has a value, it can be used to set the color.

The value is available in rules as var(value):


<rules>
<rule>
# always trigger this rule
condition  = 1
# use the link's value to sample from a list of colors
color      = eval((qw(red orange green blue purple))[ var(value) ])
# continue parsing other rules
flow       = continue
</rule>

<rule>
# always trigger this rule
condition  = 1
# add _a3 to the color of the ribbon, giving it 50% transparency (3/6)
color      = eval(sprintf("%s_a3",var(color)))
</rule>
</rules>


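The first rule maps the value to a color, which takes one line of Perl that samples from a list of colors. If the list elements contain no spaces, the qw() operator turns the words into a list: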

qw(red orange green blue purple)

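Perl takes an element from a list with the syntax

( ...list here...)[i]

so the full expression is

(qw(red orange green blue purple))[i]

Here i is the value of the link's value parameter, var(value), used as a zero-based index into the list.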
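
The two rules together can be sketched in Python as a single lookup followed by a suffix. The list has five entries, so only the indices 0 to 4 select a color, and a value of 5, as in the second sample link above, falls outside the list. The function link_color is hypothetical, and returning None for an out-of-range value is an assumption of this sketch rather than documented Circos behaviour.

```python
# Same color list as in the rule, in the same order.
COLORS = ["red", "orange", "green", "blue", "purple"]


def link_color(value):
    """Pick a color by zero-based index and add the _a3 transparency suffix."""
    if not 0 <= value < len(COLORS):
        # Assumption: treat any index outside 0..4 as "no color".
        return None
    return "%s_a3" % COLORS[value]


if __name__ == "__main__":
    for value in (0, 2, 4, 5):
        print(value, link_color(value))
```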


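12. Inverted Links

This section describes how to format inverted links.

A link is considered inverted if its two ends have opposite orientations. For example, a link with ends chrA:start1-end1 and chrB:start2-end2 is inverted if

start1 < end1 && start2 > end2

or

start1 > end1 && start2 < end2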
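
The two conditions translate directly into a small predicate. The Python sketch below is only an illustration of the definition above, and the function name is_inverted is hypothetical.

```python
def is_inverted(start1, end1, start2, end2):
    """True if exactly one end of the link runs from high to low coordinates."""
    return (start1 < end1 and start2 > end2) or (start1 > end1 and start2 < end2)


if __name__ == "__main__":
    print(is_inverted(100, 200, 100, 200))  # both ends forward
    print(is_inverted(200, 100, 100, 200))  # first end reversed
    print(is_inverted(100, 200, 200, 100))  # second end reversed
    print(is_inverted(200, 100, 200, 100))  # both ends reversed
```

Note that under this definition a link with both ends reversed is not inverted, because the two ends still have the same relative orientation.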

12.1 Link Geometry

To keep link ribbons from twisting, set the flat parameter:

flat = yes

However, if you set the twist parameter on a link in the data file,

hs1 100 200 hs2 100 200
# this link's ribbon will be twisted, even if flat=yes is set
hs3 100 200 hs4 100 200 twist=1

then flat = yes has no effect on that link, and its ribbon is still twisted.

hs1 100 200 hs2 100 200
# when a link end has inverted*=1, its start/end coordinates
# are reversed. For the start of the link use inverted1 and
# for the end inverted2.
hs3 100 200 hs4 100 200 inverted1=1

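As this example shows, you can add an inverted1 or inverted2 parameter to a link in the data file, and the corresponding end of the link will have its start and end coordinates swapped.

By default a link is drawn in the order start1 -> end1 -> end2 -> start2. Changing the orientation of a chromosome therefore also affects whether the link is twisted.

12.2 Testing for Inversion

To test whether a link is inverted, use var(rev1) and var(rev2). Each has the value 1 if the start or the end of the link, respectively, is reversed.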


<rule>
condition  = var(rev2)
color      = orange
</rule>


<rules>
<rule>
condition  = var(rev1) && var(rev2)
color      = red
</rule>

<rule>
condition  = var(rev1)
color      = green
</rule>

<rule>
condition  = var(rev2)
color      = orange
</rule>
</rules>

12.3 Defining Inverted Links


# this is a normal link
chr1 100 200 chr2 100 200

# this is an inverted link - its first end is inverted
chr1 200 100 chr2 100 200

# this is an inverted link - its first end is inverted using the 'inverted' flag
chr1 100 200 chr2 100 200 inverted1=1

# this is an inverted link - its second end is inverted
chr1 100 200 chr2 200 100

# this is an inverted link - its second end is inverted using the 'inverted' flag
chr1 100 200 chr2 100 200 inverted2=1
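
To tie the data format to the rules in 12.2, the Python sketch below derives rev1 and rev2 from a link line and picks the color those rules would assign. It rests on two assumptions that are not confirmed here: an end counts as reversed if either its coordinates run from high to low or its inverted flag is set to 1, and a link that matches no rule keeps a default color (black in this sketch). Lines that combine reversed coordinates with an inverted flag on the same end are not handled. The functions parse_link and rule_color are hypothetical.

```python
def parse_link(line):
    """Return (rev1, rev2) for a link line such as 'chr1 100 200 chr2 200 100'."""
    tok = line.split()
    start1, end1 = int(tok[1]), int(tok[2])
    start2, end2 = int(tok[4]), int(tok[5])
    # Trailing tokens are name=value options, e.g. inverted1=1.
    options = dict(t.split("=", 1) for t in tok[6:])
    rev1 = start1 > end1 or options.get("inverted1") == "1"
    rev2 = start2 > end2 or options.get("inverted2") == "1"
    return rev1, rev2


def rule_color(rev1, rev2, default="black"):
    """Apply the three rules in order; the first match decides the color."""
    if rev1 and rev2:
        return "red"
    if rev1:
        return "green"
    if rev2:
        return "orange"
    return default  # assumed color when no rule matches


if __name__ == "__main__":
    for line in (
        "chr1 100 200 chr2 100 200",
        "chr1 200 100 chr2 100 200",
        "chr1 100 200 chr2 100 200 inverted1=1",
        "chr1 100 200 chr2 200 100",
        "chr1 100 200 chr2 100 200 inverted2=1",
    ):
        print(line, "->", rule_color(*parse_link(line)))
```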

