Course Meeting Times

Sessions: 1 session / week, 2 hours / session

Course Overview

This seminar course will investigate the current state of computational challenges in personal genomics. We will work together to study seminal papers in the current literature, apply existing methods to complete genomes or genotypes, and explore open problems and existing research directions. The ultimate goal is understanding what can be learned computationally from a personal genome and more generally how sequence differences between individuals lead to phenotypic differences in gene expression, disease predisposition, or response to treatment.

Each week, the class will explore a different aspect of computational genomics. Students will read two to three papers in advance of class, submit their impressions the night before, and participate in the discussion during class. Each student will sign up to lead the discussion for one topic of their choice. For each topic, the presenter will work with a mentor in advance of the presentation to prepare, understand the topic and materials, and run through the accompanying lab. During the first 15–20 minutes of class, the presenter will give a broader overview of the topic; the class will then spend 60 minutes discussing the papers in detail. The last 10 to 15 minutes of class will be reserved for students to read through the mini-lab and familiarize themselves with the datasets.


This is a graduate-level special topics course. Students must receive permission from the instructor to enroll.

Paper Reading and Participation

Each week the class will read one to two articles.

During the Reading phase for each paper, only paragraph-level comments and questions will be posted on the course site, not general comments on the paper.

During the Discussion phase, general comments on each paper will be accepted. Please use the document template (PDF) to submit your comments on the papers. The students leading the discussion will then have the weekend to incorporate comments from the class into their presentation for the following week.

Weekly Mini Labs

After each paper discussion, the course staff will hold a short demo on how to apply the learned methods to existing datasets. The staff will help students get set up during class, and programming assignments will help students continue these labs after class.

Term Project

Each student will complete a final project that is planned out across the whole term. It should focus in depth on one or more of the topics discussed in the class. Students may work either alone or with one partner; teams will be expected to undertake more ambitious projects.

The most rewarding project topics are usually the most challenging (and possibly riskiest!). Such a project might involve defining a biological problem, identifying relevant datasets, designing and implementing new algorithms, applying the methods, and interpreting the results. Alternatively, students can select a less risky project, such as comparing several computational biology algorithms for solving the same problem by implementing and applying them and rigorously evaluating the results. Students might also analyze a relevant conference or journal article, including criticism, corrections, and / or improvements. Part of the final project grade will depend on the challenge and originality of the project, so spend enough time to choose carefully.

Project proposals will be due in Session 7, and the final projects are due the night before Session 12, with in-class presentations during Session 12. The goal of the proposal is to get students thinking about their project and to make sure it is appropriate in both its subject matter and its complexity. The goal of the final project is to get students to dive deeper into one of the topics of the course, as a step towards independent research in personal genomics.


There are no required textbooks for the course. The following resources may help as independent outside sources for any points that need clarification:

  • Foulkes, Andrea S. Applied Statistical Genetics with R: For Population-based Association Studies. Springer, 2009. ISBN: 9780387895536. (available as an E-Book)
  • Hartl, Daniel, and Andrew G. Clark. Principles of Population Genetics. Sinauer Associates, Inc., 2006. ISBN: 9780878933082. (3 copies in the MIT library)


SES # Topics
1 Human Variation: Challenges in human genomics, Variation, haplotypes, LD, Common and rare variant association
2 Trait Association: Common and rare variant association, Mendelian vs. Complex, Basics of an association study, Phenotypes
3 Personal Genomics and Ethics: Research studies: Consent, privacy, returning results, Medical applications: Testing, public health issues, obligations to relatives, privacy, Consumer options: Carrier testing, newborn testing, genetic testing, Gene editing: Benefits, ethical issues, where to draw the line
4 Interpreting Coding (Rare) Variants: Rare variant detection from exome sequencing in family studies, Aggregate analyses (e.g. ExAC), Classifying function (PolyPhen), Overlapping constraint
5 Interpreting Non-Coding (Common) Variants: Enriched cell types, Causal variants: Epigenomics, Comparative Genomics, Target genes: HiC, genetic links, activity links, Upstream regulators: Regulatory motifs, TF binding
6 Bayesian Fine-Mapping and Multi-dimensional GWAS: Fine-mapping (Posterior probabilities). Bayesian models for computational fine-mapping, Multi-dimensional GWAS (Multi-Phenotype, Multi-ancestry, Multi-variant)
7 Intermediate Phenotypes and QTLs: eQTLs, meQTLs, and other molecular traits, Allelic activity, Causality and mediation analysis, Mendelian randomization analysis
8 Heritability and Phenotype Prediction: Missing heritability, heritability estimation, heritability partitioning, Polygenic risk scores, Predicting intermediate phenotypes
9 Human Ancestry and Population Genetics: Coalescent theory. Recent Selection, Sweeps, Selection Pressures, Population Stratification. Admixture Mapping, Ancestral genome sequencing and analysis
10 Cancer Genomics and Single-Cell Genomics: Genetic alterations: Somatic mutations, mutational signatures, rearrangements, Epigenetic alterations: Methylation, miRNAs, reprogramming, origin, Convergence at the regulatory region, gene, pathway, cellular level. Guiding convergence with physical, genetic, and activity networks
11 Experimental Manipulations: Genome editing and CRISPR, Experiment multiplexing, Screening and selection
12 Final presentations


Grades in this course will be based on the following:

Participation 20%
Mini labs 20%
Presentation of assigned paper 30%
Term project 30%
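
As a rough illustration of how the percentages above combine, the sketch below computes a weighted course score. It assumes each component is scored on a 0–100 scale and that the components are combined linearly; the syllabus states only the weights, so the scale, the function name weighted_grade, and the example scores are all hypothetical.

```python
# Weights taken from the grading breakdown above (percent of final grade).
WEIGHTS = {
    "participation": 20,
    "mini_labs": 20,
    "paper_presentation": 30,
    "term_project": 30,
}


def weighted_grade(scores):
    """Combine per-component scores (assumed 0-100) into a 0-100 course score.

    The linear combination is an assumption; the syllabus gives only weights.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / total_weight


# Hypothetical scores, for illustration only.
example = {
    "participation": 90,
    "mini_labs": 80,
    "paper_presentation": 85,
    "term_project": 95,
}
print(weighted_grade(example))
```

Because the paper presentation and the term project each carry 30%, a change in either of those scores moves the total 1.5 times as much as the same change in participation or mini labs.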