ChIP-Part implements a partitioning algorithm for large-scale ChIP-Seq data sets. It is designed to find very large signal-enriched regions, such as those in histone modification maps where the mark spreads over broad domains (e.g. H3K36me3).
The input to ChIP-Part is a set of tag positions from a ChIP-Seq experiment, mapped to a reference genome. The working format is SGA, a simplified GFF format that is sorted by sequence name and position. In addition to SGA, ChIP-Part accepts BED, GFF, BAM, and FPS input. Input compressed in gzip or zip format is also accepted.
ChIP-Part returns a list of signal-enriched regions, each defined by a start and an end position. The default output is a two-line-oriented SGA file in which each edge of a signal-enriched region is represented by one SGA line: *+* for the start and *-* for the end. Single-line-oriented GFF, BED, and FPS output formats are also available. For supported genome assemblies, a direct link to the UCSC genome browser is provided for rapid comparison with genome annotations. Optionally, sequences around the boundaries of signal-enriched regions can be extracted to a file in FASTA format; sequence extraction is carried out using the FPS-formatted output.
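The two-line-oriented SGA output can be turned back into one record per region by pairing each *+* line with the *-* line that follows it. The Python sketch below illustrates the idea and is not part of ChIP-Part. It assumes tab-separated SGA lines with the sequence name in column 1, the position in column 3, and the *+*/*-* marker in column 4. The function name `pair_region_edges`, the feature name, and the example coordinates are hypothetical.

```python
from typing import Iterable, Iterator, Optional, Tuple

Region = Tuple[str, int, int]  # (sequence name, start, end)


def pair_region_edges(lines: Iterable[str]) -> Iterator[Region]:
    """Pair '+' (start) and '-' (end) SGA lines into regions.

    Assumed layout (not confirmed by the text above): tab-separated
    fields with sequence name, feature, position, strand marker, count.
    """
    pending: Optional[Tuple[str, int]] = None  # start edge awaiting its end
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 4:
            raise ValueError(f"too few fields: {line!r}")
        seq, pos, marker = fields[0], int(fields[2]), fields[3]
        if marker == "+":
            if pending is not None:
                raise ValueError(f"start without matching end before: {line!r}")
            pending = (seq, pos)
        elif marker == "-":
            if pending is None or pending[0] != seq:
                raise ValueError(f"end without matching start: {line!r}")
            yield (seq, pending[1], pos)
            pending = None
        else:
            raise ValueError(f"unexpected marker {marker!r}: {line!r}")
    if pending is not None:
        raise ValueError("last region has a start but no end")


# Hypothetical example data, not real ChIP-Part output.
example = [
    "chr1\tH3K36me3\t1000\t+\t1",
    "chr1\tH3K36me3\t25000\t-\t1",
    "chr2\tH3K36me3\t500\t+\t1",
    "chr2\tH3K36me3\t9000\t-\t1",
]
regions = list(pair_region_edges(example))
```

Because SGA is sorted by sequence name and position, a single pass with one pending start edge is enough; the sketch raises an error rather than guessing when the edges do not alternate.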
Parent program: chip-seq
ChIP-Seq combines chromatin immunoprecipitation with massively parallel DNA sequencing to identify the cis-acting targets of DNA-associated proteins or factors on a genome scale. It can be used to precisely map genome-wide binding sites for any protein of interest. Previously, ChIP-on-Chip was the most common technique used to identify transcription factor-DNA interactions. ChIP-Seq is also used to study epigenetic events such as histone modifications and DNA methylation.