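Colocalisation is one of the more fashionable downstream GWAS analyses: it combines the GWAS and eQTL data for a given region to find the eQTL most likely to be causal. 【It is not limited to this use.】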

a novel statistical methodology to assess whether two association signals are consistent with a shared causal variant 【eQTL is just one application】

Key metric: the posterior probability of a shared causal effect (PP4)

Things that took me, as an outsider, a long time to figure out:

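  1. Colocalisation is about the signal, not about any single SNP.
  2. As the figure below shows, in a GWAS we usually pick the SNP with the most significant p-value. That does not make it the causal SNP, but it does not stop us from using it either.
  3. Take the signal around this lead SNP, usually a peak, and you can determine which gene that signal has a causal effect on.
  4. eQTL analysis is for finding the causal gene; fine mapping is what finds the causal SNP, typically with the help of functional data!!! 【Decide whether you are looking upstream or downstream.】
  5. With genotype and phenotype data you can determine the risk and protective alleles; the eQTL direction then tells you how the risk allele changes expression, and from that you can decide which direction a drug should act in. One pipeline, start to finish.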
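Point 5 above can be sketched in a few lines. This is only a toy illustration of the sign logic under strong assumptions: the GWAS beta is a log odds ratio for the stated effect allele, both studies report alleles on the same strand, and the signals have already been shown to colocalise. All function names and the example alleles and betas are hypothetical.

```python
def risk_allele(effect_allele, other_allele, gwas_beta):
    """Return (risk, protective) alleles; gwas_beta is the log odds ratio of effect_allele."""
    if gwas_beta == 0:
        raise ValueError("a zero effect does not define a risk allele")
    if gwas_beta > 0:
        return effect_allele, other_allele
    return other_allele, effect_allele


def expression_effect_of_risk_allele(risk, eqtl_effect_allele, eqtl_other_allele, eqtl_beta):
    """Re-express the eQTL beta so that it refers to the risk allele."""
    if risk == eqtl_effect_allele:
        return eqtl_beta
    if risk == eqtl_other_allele:
        return -eqtl_beta
    raise ValueError("alleles do not match between the two studies")


def suggested_drug_direction(expr_effect):
    """Risk allele raises expression -> inhibit the gene product; lowers it -> activate."""
    if expr_effect > 0:
        return "inhibit"
    if expr_effect < 0:
        return "activate"
    return "undetermined"


# Made-up example: the GWAS reports allele A with beta 0.2, the eQTL study reports allele G with beta -0.5.
risk, protective = risk_allele("A", "G", 0.2)
effect = expression_effect_of_risk_allele(risk, "G", "A", -0.5)
direction = suggested_drug_direction(effect)
```

Here the eQTL beta is reported for the protective allele, so its sign is flipped before the drug direction is read off. Forgetting this alignment step is the easiest way to get the direction backwards.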

 

Outline

Case study

Methods

Hands-on

 

Case study

We further used a Bayesian method [25] to test for colocalisation between SLE GWAS and eQTL signals (see the Methods section). 【Overall goal and method】

The results showed a high posterior probability for a shared causal effect (PP4=96.1%) between SLE association and CSNK2A2 expression in LCLs (see online supplementary figure S1), suggesting that the signals for SLE GWAS and eQTL were likely driven by the same causal variant. 【The key metric is PP4; note the precise wording used here】

Compared with CSNK2A2 expression in LCLs, we observed a lower posterior probability for a shared causal effect (PP4=79.4%) between SLE association and CCDC113 expression (see online supplementary figure S1). 【Two candidate causal genes: how to choose between them】

The expression level of CCDC113 was also significantly lower than that of CSNK2A2 (paired t-test p<2.2E-16, figure 1E) in LCLs.

Taken together, these results suggest that the putative causal variants may regulate expression of CSNK2A2 through affecting enhancer activities in B lymphocytes. 【Conclusion】

 

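A picture is worth a thousand words; see the figure below.

Within this 0.3 Mb region the SLE GWAS shows many associated SNPs, and the eQTL data also show many associated SNPs for some of the genes in the region. We want to know which variant truly has the causal effect.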

Figure S1 SLE association and eQTL association plots at the CSNK2A2 locus.
The x-axis shows the physical position on the chromosome (Mb).

(A) -log10(P) association p-value for SLE. (B) -log10(P) association p-value for CSNK2A2 expression in LCLs. (C) -log10(P) association p-value for CCDC113 expression in LCLs.
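To make "take the signal around the lead SNP" concrete, here is a minimal sketch of picking the lead SNP and cutting out a window around it. The record layout, the SNP ids, the positions and the p-values are all made up for illustration; the 150 kb half-width is simply chosen to give a 0.3 Mb window like the one in the figure.

```python
def lead_snp(snps):
    """The SNP with the smallest p-value (the 'lead' SNP, not necessarily the causal one)."""
    return min(snps, key=lambda s: s["p"])


def region_around(snps, lead, half_width_bp):
    """All SNPs on the lead SNP's chromosome within half_width_bp of its position."""
    return [
        s for s in snps
        if s["chrom"] == lead["chrom"] and abs(s["pos"] - lead["pos"]) <= half_width_bp
    ]


# Hypothetical summary statistics (ids, positions and p-values are invented).
gwas = [
    {"id": "snp1", "chrom": "16", "pos": 58_000_000, "p": 1e-3},
    {"id": "snp2", "chrom": "16", "pos": 58_100_000, "p": 5e-9},
    {"id": "snp3", "chrom": "16", "pos": 58_240_000, "p": 2e-6},
    {"id": "snp4", "chrom": "16", "pos": 58_400_000, "p": 0.4},
    {"id": "snp5", "chrom": "1", "pos": 58_100_050, "p": 1e-4},
]

lead = lead_snp(gwas)
window = region_around(gwas, lead, half_width_bp=150_000)
window_ids = [s["id"] for s in window]
```

The same window would then be cut out of the eQTL results for each candidate gene, so that both datasets cover the same set of SNPs before any colocalisation test is run.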

 

Source: Identification of ST3AGL4, MFHAS1, CSNK2A2 and CD226 as loci associated with systemic lupus erythematosus (SLE) and evaluation of SLE genetics in drug repositioning


 

Methods

a novel statistical methodology to assess whether two association signals are consistent with a shared causal variant

An application is the integration of disease scans with expression quantitative trait locus (eQTL) studies, but any pair of GWAS datasets can be integrated in this framework.

The introduction below is very concise:

Genome-wide association studies (GWAS) have found a large number of genetic regions (“loci”) affecting clinical end-points and phenotypes, many outside coding intervals. 【From starting point to end-point; most loci are non-coding】

One approach to understanding the biological basis of these associations has been to explore whether GWAS signals from intermediate cellular phenotypes, in particular gene expression, are located in the same loci (“colocalise”) and are potentially mediating the disease signals. 【Gene expression is an intermediate phenotype】

However, it is not clear how to assess whether the same variants are responsible for the two GWAS signals or whether it is distinct causal variants close to each other. 【The core question】

In this context, a natural question to ask is whether two independent association signals at the same locus, typically generated by two GWAS studies, are consistent with a shared causal variant. If the answer is positive, we refer to this situation as colocalised traits, and the probability that both traits share a causal mechanism is greatly increased.

The same questions can also be considered between pairs of eQTLs [9], [10], or pairs of diseases. 【Colocalisation has very broad applications】

 

A key feature of our approach is that it only requires single SNP p-values and their minor allele frequencies (MAFs), or estimated allelic effect and standard error, combined with closed form analytical results that enable quick comparisons, even at the genome-wide scale. 【The advantage of this method】
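The "closed form analytical results" are, as far as I understand, Wakefield-style approximate Bayes factors computed per SNP from the effect estimate and its standard error. A minimal sketch follows. The function name is hypothetical, and the prior standard deviations in the example (0.2 for case-control traits, 0.15 for quantitative traits) are the defaults I recall from the coloc R package, so treat them as an assumption.

```python
import math


def log_abf(beta, se, prior_sd):
    """Log approximate Bayes factor for one SNP (association vs. no association).

    beta     -- estimated allelic effect
    se       -- its standard error
    prior_sd -- standard deviation of the normal prior on the true effect (assumed value)
    """
    v = se ** 2                          # sampling variance of the estimate
    z = beta / se                        # z-score
    r = prior_sd ** 2 / (prior_sd ** 2 + v)  # shrinkage factor
    return 0.5 * (math.log(1 - r) + r * z ** 2)


# Made-up effect sizes: a null SNP, a moderate one, and a strong one.
null_snp = log_abf(beta=0.0, se=0.05, prior_sd=0.2)
moderate = log_abf(beta=0.15, se=0.05, prior_sd=0.2)
strong = log_abf(beta=0.30, se=0.05, prior_sd=0.2)
```

Because only `beta` and `se` (or quantities derivable from p-values and MAFs) go in, no individual-level genotypes are needed. That is why the method can be run on published summary statistics.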

The result of this procedure is five posterior probabilities (PP0, PP1, PP2, PP3 and PP4). A large posterior probability for hypothesis 3, PP3, indicates support for two independent causal SNPs associated with each trait. In contrast, if PP4 is large, the data support a single variant affecting both traits.
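Here is a sketch of how per-SNP log Bayes factors for the two traits could be combined into the five posterior probabilities, assuming at most one causal variant per trait in the region. The function names are hypothetical, and the prior probabilities `p1`, `p2` and `p12` are the defaults I recall from the coloc R package; treat both as assumptions rather than the package's actual implementation.

```python
import math


def logsumexp(xs):
    """log(sum(exp(x))) computed stably."""
    m = max(xs)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))


def logdiffexp(a, b):
    """log(exp(a) - exp(b)), or -inf when the difference is not positive."""
    if b >= a:
        return -math.inf
    d = math.exp(b - a)
    if d >= 1.0:  # difference lost to rounding
        return -math.inf
    return a + math.log1p(-d)


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Return [PP0, PP1, PP2, PP3, PP4] from per-SNP log ABFs of two traits."""
    if len(labf1) != len(labf2):
        raise ValueError("both traits must be scored on the same SNPs")
    both = [a + b for a, b in zip(labf1, labf2)]
    s1, s2, s12 = logsumexp(labf1), logsumexp(labf2), logsumexp(both)
    log_h = [
        0.0,                                                      # H0: no association
        math.log(p1) + s1,                                        # H1: trait 1 only
        math.log(p2) + s2,                                        # H2: trait 2 only
        math.log(p1) + math.log(p2) + logdiffexp(s1 + s2, s12),   # H3: two distinct SNPs
        math.log(p12) + s12,                                      # H4: one shared SNP
    ]
    denom = logsumexp(log_h)
    return [math.exp(h - denom) for h in log_h]


# Toy inputs: both traits peak at the same SNP, versus peaks at different SNPs.
pp_shared = coloc_posteriors([0.0, 20.0, 0.0], [0.0, 18.0, 0.0])
pp_distinct = coloc_posteriors([20.0, 0.0, 0.0], [0.0, 0.0, 18.0])
```

In the first toy case PP4 dominates, and in the second PP3 does, which matches the reading of PP3 and PP4 in the quoted paragraph. The H3 term is the sum over all pairs of different SNPs, obtained by subtracting the same-SNP pairs from the product of the two per-trait sums.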

 

 


Hands-on

https://chr1swallace.github.io/coloc/articles/a01_intro.html

 

To be continued.

 

 


 

 

 


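Copyright notice: this is an original article by leezx, released under the CC 4.0 BY-SA licence. When reposting, please include a link to the original source and this notice.
Article link: https://www.cnblogs.com/leezx/p/14446462.html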