ChIP-seq
Technology
Park, Peter J. “ChIP–seq: Advantages and Challenges of a Maturing Technology.” Nature Reviews Genetics 10, no. 10 (October 2009) - ChIP-seq review, basics of technology, alignment, peak calling, downstream analysis.
Bailey, Timothy, Pawel Krajewski, Istvan Ladunga, Celine Lefebvre, Qunhua Li, Tao Liu, Pedro Madrigal, Cenny Taslim, and Jie Zhang. “Practical Guidelines for the Comprehensive Analysis of ChIP-Seq Data.” Edited by Fran Lewitter. PLoS Computational Biology 9, no. 11 (November 14, 2013) - ChIP-seq computational workflow - sequencing depth, alignment, QC, peak calling, reproducibility (IDR), narrow/broad peaks, differential binding analysis, annotation, normalization.
Furey, Terrence S. “ChIP–seq and beyond: New and Improved Methodologies to Detect and Characterize Protein–DNA Interactions.” Nature Reviews Genetics, (October 23, 2012) - ChIP-seq technologies, narrow and broad peaks, DNase- and FAIRE-seq, chromatin conformation capture. A gentle introduction to the technology and terminology.
Buenrostro, Jason D, Paul G Giresi, Lisa C Zaba, Howard Y Chang, and William J Greenleaf. “Transposition of Native Chromatin for Fast and Sensitive Epigenomic Profiling of Open Chromatin, DNA-Binding Proteins and Nucleosome Position.” Nature Methods 10, no. 12 (December 2013) - ATAC-seq technology, correspondence to DNase-seq.
Peak calling
Zhang, Yong, Tao Liu, Clifford A. Meyer, Jérôme Eeckhoute, David S. Johnson, Bradley E. Bernstein, Chad Nusbaum, et al. “Model-Based Analysis of ChIP-Seq (MACS).” Genome Biology, (2008) - MACS paper
Kharchenko, Peter V, Michael Y Tolstorukov, and Peter J Park. “Design and Analysis of ChIP-Seq Experiments for DNA-Binding Proteins.” Nature Biotechnology, (December 2008) - SPP - R package for analysis of ChIP-seq and other functional sequencing data. ChIP-seq technology, picture of strand-specific tag distribution. Strand cross-correlation as a method to decide whether tags should be included. Three types of anomalous tags. hms-dbmi/spp. Supplements/ChIP-seq/
Rozowsky, Joel, Ghia Euskirchen, Raymond K Auerbach, Zhengdong D Zhang, Theodore Gibson, Robert Bjornson, Nicholas Carriero, Michael Snyder, and Mark B Gerstein. “PeakSeq Enables Systematic Scoring of ChIP-Seq Experiments Relative to Controls.” Nature Biotechnology, (January 2009) - PeakSeq paper. gersteinlab/PeakSeq
Zhang, Xuekui, Gordon Robertson, Martin Krzywinski, Kaida Ning, Arnaud Droit, Steven Jones, and Raphael Gottardo. “PICS: Probabilistic Inference for ChIP-Seq.” Biometrics, (March 2011) - PICS paper. https://bioconductor.org/packages/release/bioc/html/PICS.html
Zang, Chongzhi, Dustin E. Schones, Chen Zeng, Kairong Cui, Keji Zhao, and Weiqun Peng. “A Clustering Approach for Identification of Enriched Domains from Histone Modification ChIP-Seq Data.” Bioinformatics, (August 1, 2009) - SICER paper. Cut genome into non-overlapping windows and compute a score for each window based on a Poisson model. Identify “islands” vs “non-islands” by thresholding the scores and clustering windows with significant scores. For each island, compute the probability of observing the island with a given score. Constructing score distribution is involved. Excellent statistical description.
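The window-scoring and island-clustering idea summarized in the SICER entry above can be sketched in a few lines of Python. This is a simplified illustration, not the SICER implementation: the function names, the `p_threshold` and `max_gap` parameters, and the use of the genome-wide mean count as the Poisson rate are assumptions made for the example, and the island-significance step (the score distribution the paper derives) is omitted.

```python
import math

def poisson_logpmf(k, lam):
    """Log of the Poisson probability of observing exactly k reads."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    below = sum(math.exp(poisson_logpmf(i, lam)) for i in range(k))
    return max(0.0, 1.0 - below)

def window_counts(read_starts, genome_length, window):
    """Count reads in non-overlapping windows of fixed size."""
    counts = [0] * math.ceil(genome_length / window)
    for pos in read_starts:
        counts[pos // window] += 1
    return counts

def call_islands(counts, window, p_threshold=1e-3, max_gap=1):
    """Threshold windows under a Poisson background, then merge eligible
    windows separated by at most max_gap ineligible windows.
    Returns (start, end, score) tuples; score sums -log P(count) per window."""
    lam = sum(counts) / len(counts)  # assumed background rate: mean reads per window
    islands, current = [], None      # current = [first_idx, last_idx, score]
    for i, c in enumerate(counts):
        if poisson_sf(c, lam) >= p_threshold:
            continue                 # window is not significantly enriched
        score = -poisson_logpmf(c, lam)
        if current is not None and i - current[1] - 1 <= max_gap:
            current[1] = i           # extend the open island across the gap
            current[2] += score
        else:
            if current is not None:
                islands.append(current)
            current = [i, i, score]
    if current is not None:
        islands.append(current)
    return [(s * window, (e + 1) * window, sc) for s, e, sc in islands]
```

Calling `call_islands(window_counts(read_starts, genome_length, 200), 200)` returns candidate islands as half-open coordinate intervals with their summed scores. In the paper, each island is then assessed against the probability of observing an island with that score, which this sketch leaves out.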
Motif detection
D’haeseleer, Patrik. “What Are DNA Sequence Motifs?” Nature Biotechnology, (April 2006)
D’haeseleer, Patrik. “How Does DNA Sequence Motif Discovery Work?” Nature Biotechnology, (August 2006)
Lawrence, C. E., S. F. Altschul, M. S. Boguski, J. S. Liu, A. F. Neuwald, and J. C. Wootton. “Detecting Subtle Sequence Signals: A Gibbs Sampling Strategy for Multiple Alignment.” Science, (October 8, 1993) - Gibbs sampling for alignment of multiple sequences. Statistical definitions.
Bailey, T. L., and C. Elkan. “Fitting a Mixture Model by Expectation Maximization to Discover Motifs in Biopolymers.” ISMB Proceedings (1994) - MM MEME algorithm to find multiple motifs of width W in a set of sequences. Background and motif models of position frequencies of letters. EM algorithm to learn motifs that maximize likelihood of the data. After the first model is found, the procedure is repeated to find other motifs.
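The EM procedure described in the Bailey and Elkan entry above can be illustrated with a small two-component mixture over all width-W subsequences. This is a minimal sketch under stated assumptions, not the MEME code: the function names, the `seed` starting word, the 0.7/0.1 initial frequencies, and the pseudocount are hypothetical choices for the example; the background model is held fixed at the overall letter frequencies; and only a single motif is fitted (no repetition to find further motifs).

```python
ALPHABET = "ACGT"

def wmers_of(seqs, w):
    """All overlapping subsequences of width w from every sequence."""
    return [s[i:i + w] for s in seqs for i in range(len(s) - w + 1)]

def likelihood(wmer, model):
    """Product of per-position letter probabilities under a w x 4 model."""
    p = 1.0
    for pos, letter in enumerate(wmer):
        p *= model[pos][ALPHABET.index(letter)]
    return p

def em_motif(seqs, w, seed, n_iter=30, pseudo=0.1):
    """Fit a motif (position frequency matrix) and mixing weight by EM."""
    wmers = wmers_of(seqs, w)
    total = sum(len(s) for s in seqs)
    bg_row = [sum(s.count(c) for s in seqs) / total for c in ALPHABET]
    background = [bg_row] * w  # same letter frequencies at every position
    # Initialise the motif model around the seed word (assumption of this sketch).
    motif = [[0.7 if c == seed[p] else 0.1 for c in ALPHABET] for p in range(w)]
    mix = len(seqs) / len(wmers)  # start from roughly one site per sequence
    for _ in range(n_iter):
        # E-step: posterior probability that each w-mer came from the motif.
        z = []
        for x in wmers:
            pm = mix * likelihood(x, motif)
            pb = (1.0 - mix) * likelihood(x, background)
            z.append(pm / (pm + pb))
        # M-step: re-estimate letter frequencies and the mixing weight.
        z_sum = sum(z)
        motif = [
            [(sum(zi for zi, x in zip(z, wmers) if x[p] == c) + pseudo)
             / (z_sum + 4 * pseudo) for c in ALPHABET]
            for p in range(w)
        ]
        mix = z_sum / len(wmers)
    return motif, mix

def consensus(motif):
    """Most probable letter at each motif position."""
    return "".join(ALPHABET[row.index(max(row))] for row in motif)
```

Given a list of DNA strings `seqs`, `em_motif(seqs, 7, seed)` returns the fitted frequency matrix and the estimated fraction of w-mers that are motif occurrences. Supplying the seed by hand keeps the sketch short; a fuller version would try several starting words and keep the fit with the highest likelihood.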
ChIP-seq statistics
Pei Fen Kuan et al., “A Statistical Framework for the Analysis of ChIP-Seq Data,” Journal of the American Statistical Association, (September 2011) - MOSAiCS R package (MOdel-based one and two Sample Analysis and Inference for ChIP-Seq) - regression framework that explicitly models mappability and GC biases for one- and two-sample analyses. Evaluated on non-crosslinked, non-IP DNA. The negative binomial is a better fit for tag counts.
Li, Qunhua, James B. Brown, Haiyan Huang, and Peter J. Bickel. “Measuring Reproducibility of High-Throughput Experiments.” The Annals of Applied Statistics, (September 2011) - IDR - irreproducible discovery rate theoretical paper.
ChIP-seq resources
ATAC-seq resources
Wei, Zheng, Wei Zhang, Huan Fang, Yanda Li, and Xiaowo Wang. “EsATAC: An Easy-to-Use Systematic Pipeline for ATAC-Seq Data Analysis.” Bioinformatics, March 7, 2018 - esATAC R package for full ATAC-seq data processing and analysis.
Other-seq resources
Liu, Yongjing, Liangyu Fu, Kerstin Kaufmann, Dijun Chen, and Ming Chen. “[A Practical Guide for DNase-Seq Data Analysis: From Data Management to Common Applications](https://doi.org/10.1093/bib/bby057).” Briefings in Bioinformatics, July 12, 2018 - DNase-seq analysis guide. Tools for QC, peak calling, analysis, footprint detection, motif analysis, visualization, all-in-one tools (Table 2).
Skene, Peter J, and Steven Henikoff. “An Efficient Targeted Nuclease Strategy for High-Resolution Mapping of DNA Binding Sites.” Genes and Chromosomes, eLife, Jan 12, 2017 - CUT&RUN technology, chromatin profiling strategy, antibody-targeted controlled cleavage by micrococcal nuclease. Cost-efficient, low input requirements, easier.