====== Data cleaning ======


**QC**

  *   Reduce the number of clinical variables to fit your needs
  *   Remove missing values (coded as 888 and 999)
  *   Get familiar with the phenotypes you include, and check for outliers
  *   Be aware of possible sample effects: different diagnostic instruments are used across samples.
  * Remove duplicate and excluded IDs. Overview available here: 
//{{:duplicate_ids.xls|}} 
//{{:deleted_ids.xls|}}
  *   Diagnosis QC and clinical phenotypes are only verified for the TOP sample (U01-KSU7599 + K50001-K50999)
  *   Make sure you use "Subj_ID" as the unique identifier. The ID list is available here:  
[[https://www.med.uio.no/norment/english/internal/protocols/database/index.html]]
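
The QC steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure used for the database: the function names ''clean_records'' and ''flag_outliers'', the variable names (''age'', ''bmi'', ''site'') and the example IDs are hypothetical, and it assumes that 888 and 999 are never valid measurements in the variables you keep. The outlier rule (1.5 × IQR) is a common default chosen here for illustration, not something prescribed by this page.

```python
import statistics

# Missing-value codes named in the checklist above.
MISSING_CODES = {888, 999}


def clean_records(records, keep_vars, excluded_ids, id_col="Subj_ID"):
    """Return cleaned copies of `records` (a list of dicts).

    - keeps only `id_col` plus the variables listed in `keep_vars`
    - drops rows whose ID is in `excluded_ids` (duplicate/deleted IDs)
    - replaces missing-value codes with None
    - raises ValueError if an ID still occurs more than once
    """
    cleaned = []
    seen = set()
    for row in records:
        subj_id = row[id_col]
        if subj_id in excluded_ids:
            continue
        if subj_id in seen:
            raise ValueError(f"duplicate {id_col}: {subj_id}")
        seen.add(subj_id)
        out = {id_col: subj_id}
        for var in keep_vars:
            value = row.get(var)
            out[var] = None if value in MISSING_CODES else value
        cleaned.append(out)
    return cleaned


def flag_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]; None is never flagged."""
    present = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(present, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v is not None and not (low <= v <= high) for v in values]


# Hypothetical example data; real IDs come from the ID list linked above.
records = [
    {"Subj_ID": "A1", "age": 30, "bmi": 999, "site": "x"},
    {"Subj_ID": "A2", "age": 888, "bmi": 24.5, "site": "x"},
    {"Subj_ID": "A3", "age": 41, "bmi": 27.0, "site": "y"},
]
cleaned = clean_records(records, keep_vars=["age", "bmi"], excluded_ids={"A3"})
```

Flagged values are only candidates for inspection. Check them against the phenotype definition, and against which sample they come from, before removing anything.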