======Novelty checking database======

The database can be accessed at: https://drive.google.com/drive/folders/1eagc2z3RdYgIgyudc5u6ru_lN5MScaC_?usp=sharing


**Which studies are included in the database?**	

We aim to include:
  * All major GWASs for a given phenotype
  * All published cond/conjFDR papers for a given phenotype
  * Other relevant post-GWAS analyses improving GWAS discovery (e.g. MTAG, STAR)

**Steps to complete prior to checking for novelty**

1. Perform an updated literature search, covering the period from the date of the last search (given at the top of each phenotype tab) to the present day, using the search terms listed in the meta-data

2. Double-check the list of studies currently included for any missing studies you are aware of

**Inputting data for new phenotypes or new/missed studies**

1. There is one tab per phenotype - please create a new tab when inputting a new phenotype

2. Update the meta-data at the top of each page:
  * Date of last update
  * Description of studies included (e.g. complete literature search; only cond/conjFDR studies; incomplete cond/conjFDR studies)
  * List of studies included (author, year, journal, DOI) + abbreviated study name (e.g. PGC_bip_2016)

3. Ensure the following columns are complete for all loci:
  * chromosome
  * min bp
  * max bp
  * lead SNP
  * study (abbreviated study name defined in meta data at the top of each tab)
  * method (condFDR, conjFDR, GWAS, MTAG, STAR, Other)
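
The completeness check in step 3 can be sketched in a few lines of Python. This is only an illustration, assuming each locus has been exported from the tab as a dictionary keyed by the column names listed above; the function name ''find_problems'' and the example row (including the rsID and positions) are hypothetical.

```python
# Column names and method labels are taken from the list above.
REQUIRED_COLUMNS = ["chromosome", "min bp", "max bp", "lead SNP", "study", "method"]
ALLOWED_METHODS = {"condFDR", "conjFDR", "GWAS", "MTAG", "STAR", "Other"}


def find_problems(row):
    """Return a list of problems found in one locus row (a dict)."""
    problems = []
    for column in REQUIRED_COLUMNS:
        value = row.get(column)
        # Treat absent keys, None and blank strings as missing.
        if value is None or str(value).strip() == "":
            problems.append(f"missing {column}")
    method = str(row.get("method") or "").strip()
    if method and method not in ALLOWED_METHODS:
        problems.append(f"unknown method: {method}")
    return problems


# Hypothetical example row; the values are made up for illustration.
example_row = {
    "chromosome": 1,
    "min bp": 1_000_000,
    "max bp": 3_000_000,
    "lead SNP": "rs0000001",
    "study": "PGC_bip_2016",
    "method": "GWAS",
}
print(find_problems(example_row))
```

An empty list means the row has all six columns filled in and uses one of the listed method labels.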
  
**N.B.** When inputting data, it is not necessary to check for overlapping loci at this point. We therefore expect there to be overlapping and duplicate loci within each list.

**Strategy for reporting genome-wide significant loci/variants with different definitions**
  * If loci are defined with min and max base pairs → include the reported chromosomal position
  * If only lead SNPs with BP positions are reported → generate min/max BP using the BP position +/- 1 million BPs
  * If a previous genome build was used → lift over the positions using one of the following tools:
    * http://www.ensembl.org/Homo_sapiens/Tools/AssemblyConverter (web-based)
    * https://genome.ucsc.edu/cgi-bin/hgLiftOver
  * If only rsIDs of lead SNPs are provided → use the following resource to obtain the BP position, then generate min/max BP (BP position +/- 1 million BPs): https://www.ncbi.nlm.nih.gov/genome/tools/cyto_convert/
  * If you encounter a scenario that is not covered above → please discuss with the group.
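
The +/- 1 million BP rule for lead SNPs can be written as a small helper. This is a minimal sketch: the function name ''locus_window'' is hypothetical, and clamping the lower bound at position 1 is an assumption (the rule above does not say what to do near the start of a chromosome). The upper bound is not clamped to the chromosome length.

```python
WINDOW_BP = 1_000_000  # +/- 1 million base pairs, as described above


def locus_window(lead_bp, window_bp=WINDOW_BP):
    """Return (min_bp, max_bp) for a locus centred on a lead SNP position."""
    # Assumption: positions are 1-based, so the lower bound is never below 1.
    min_bp = max(1, lead_bp - window_bp)
    max_bp = lead_bp + window_bp
    return min_bp, max_bp


print(locus_window(5_000_000))  # (4000000, 6000000)
print(locus_window(250_000))    # (1, 1250000)
```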

**Checking for novelty**

The database is combined with the GWAS Catalog to enable a two-step strategy for novelty checking. This is described in more detail here: [[biostat:pleiofdr_analysis|PleioFDR Analysis]].

Once you are confident that the list of loci is complete and you have identified an appropriate list of SNPs from the GWAS Catalog, you can use the following script, which checks novelty against both resources:

https://github.com/precimed/pleiofdr/tree/master/fuma
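
To make the idea of checking candidate loci against the database concrete, here is a rough Python illustration of a positional overlap test. It is not the script linked above, which may apply different or additional criteria. It assumes loci are dictionaries with the ''chromosome'', ''min bp'' and ''max bp'' columns described earlier, and that a candidate counts as previously reported if it overlaps any database locus on the same chromosome. The function names and example positions are hypothetical.

```python
def overlaps(a, b):
    """True if loci a and b share a chromosome and their bp ranges intersect."""
    return (
        str(a["chromosome"]) == str(b["chromosome"])
        and a["min bp"] <= b["max bp"]
        and b["min bp"] <= a["max bp"]
    )


def split_by_novelty(candidates, known):
    """Split candidate loci into (novel, previously_reported) lists."""
    novel, reported = [], []
    for candidate in candidates:
        if any(overlaps(candidate, locus) for locus in known):
            reported.append(candidate)
        else:
            novel.append(candidate)
    return novel, reported


# Made-up positions, for illustration only.
known_loci = [{"chromosome": 1, "min bp": 1_000_000, "max bp": 3_000_000}]
candidate_loci = [
    {"chromosome": 1, "min bp": 2_500_000, "max bp": 4_500_000},  # overlaps
    {"chromosome": 2, "min bp": 2_500_000, "max bp": 4_500_000},  # other chromosome
]
novel, reported = split_by_novelty(candidate_loci, known_loci)
print(len(novel), len(reported))  # 1 1
```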

**Maintaining the database**

Whenever a study from the group that reports novel loci (e.g. a cond/conjFDR study) has been accepted, the first author, or the highest-placed author from the group, should update the table with these results.

**Questions/problems with database**

If you require help or would like to report a problem please contact either:
  * Guy (guy.f.hindley@gmail.com) - database related
  * Zillur (zillurbmb51@gmail.com) - script related

