### Dataset S1. Analyzed Data

The analyzed data contains a single value for each probe position across the genome. All 3 nucleosomal and 3 control CEL files were imported into Affymetrix Tiling Analysis Software (TAS) as a 2-sample analysis and analyzed with the following parameters:

- Chip normalized (quantile normalization)
- Control and treatment normalized together
- Bandwidth = 20
- Signal scale = log2
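For readers unfamiliar with quantile normalization, the idea can be sketched in a few lines. This is a minimal illustration of the general technique, not TAS's implementation: the function name `quantile_normalize` and the toy intensities are hypothetical, ties are broken by position, and every array is assumed to hold the same number of probes.

```python
def quantile_normalize(arrays):
    """Give every array the same intensity distribution (rank-wise mean)."""
    n = len(arrays[0])
    # Sort each array independently, then average across arrays at each rank.
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [sum(s[i] for s in sorted_arrays) / len(arrays) for i in range(n)]

    result = []
    for a in arrays:
        # Indices of this array's probes ordered from lowest to highest intensity.
        order = sorted(range(n), key=lambda i: a[i])
        normalized = [0.0] * n
        for rank, idx in enumerate(order):
            normalized[idx] = rank_means[rank]
        result.append(normalized)
    return result


# Toy example: two arrays with three probes each (made-up intensities).
normalized = quantile_normalize([[5, 2, 3], [4, 1, 6]])
```

After normalization each array contains the same set of values, assigned according to the original rank of each probe within its array.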

The text output is separated into sections, one for each chromosome represented on the array. Each row within a section is tab-delimited, with the chromosome coordinate in the first column and the log2 signal ratio in the second.
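A file laid out this way could be read with a short parser along the following lines. This is a sketch under an assumption: the exact form of the per-chromosome section header is not described here, so the code treats any non-empty line that is not a two-column numeric row as a header naming the section. The function name `parse_analyzed_text` and the example lines are hypothetical.

```python
from collections import defaultdict


def parse_analyzed_text(lines):
    """Group (coordinate, log2 ratio) rows by chromosome section."""
    data = defaultdict(list)
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        fields = line.split("\t")
        try:
            if len(fields) != 2:
                raise ValueError("not a data row")
            coord = int(fields[0])
            ratio = float(fields[1])
        except ValueError:
            # Assumption: anything that is not a data row starts a new section.
            current = line.lstrip("#").strip()
            continue
        if current is None:
            raise ValueError("data row found before any section header")
        data[current].append((coord, ratio))
    return dict(data)


# Made-up lines in the assumed layout; real files may label sections differently.
example = ["# chrI", "101\t0.52", "105\t-1.25", "# chrII", "98\t0.10"]
parsed = parse_analyzed_text(example)
```

For a real file, `parse_analyzed_text(open(path))` would work the same way, since the function only needs an iterable of lines.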

#### Download analyzed whole-genome data:

### Dataset S2. Clustering Analysis Data

Probe signal intensity ratios for the ~800 bp surrounding each of the ~5000 verified transcription segments were extracted from the analyzed data file. Each line of data represents the nucleosome structure surrounding a transcription start site; these lines were clustered using k-means clustering with a Euclidean distance metric.
The K=4 clustering was used in the manuscript for Figure 4.
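As an illustration of k-means with a Euclidean distance metric, the sketch below implements the standard Lloyd iteration in plain Python. It is not the software used for the manuscript, which is not named here. The initialization (the first `k` rows as starting centroids), the function names, and the toy profiles are assumptions made only to keep the example deterministic and small.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(rows, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, centroids)."""
    # Assumption: seed with the first k rows. Real analyses usually use
    # random or k-means++ seeding with several restarts.
    centroids = [list(r) for r in rows[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each row joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: euclidean(r, centroids[j])) for r in rows
        ]
        if new_labels == labels:
            break  # converged: assignments no longer change
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy profiles (made-up values); each row stands in for one line of the data.
profiles = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.2, 4.9]]
labels, centroids = kmeans(profiles, k=2)
```

With the real data, each row would be the vector of log2 ratios across the ~800 bp window and `k` would be 4.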

#### Download clustering data:

### Dataset S3. Raw Microarray Data

All raw data is available as Affymetrix CEL files, which list array coordinates and intensity values. Each pair of control and nucleosomal samples was prepared and hybridized together and came from the same culture; the only difference is that the nucleosomal chromatin was digested with micrococcal nuclease prior to purification.

#### Download raw data:

### Dataset S4. Transcription factor binding site occupancy

This file contains the source data used in Figure 5: two-sample t-test comparisons of nucleosome occupancy between functional and non-functional binding sites for 103 distinct transcription factors. It has 6 columns in this format:

- Label - a gene name for a transcription factor
- EqualVar - whether nucleosome occupancy in functional sites has the same variance as in non-functional sites
- MeanDiff - the difference in means between the two distributions
- T-stat - the T statistic
- p-val - two-tailed p-value
- NullHyp - whether the null hypothesis that the two distributions are drawn from the same population is rejected, based on p < 0.05
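To show how the EqualVar, MeanDiff, and T-stat columns relate, the sketch below computes a two-sample t statistic in either its pooled-variance (equal variance) or Welch (unequal variance) form. It is an illustration only: the function name `two_sample_t` and the occupancy values are hypothetical, and the sign convention (functional minus non-functional) is an assumption, since the file description does not state it. Converting the statistic to a two-tailed p-value additionally requires the t distribution with the returned degrees of freedom, which the Python standard library does not provide.

```python
import math
from statistics import mean, variance


def two_sample_t(a, b, equal_var):
    """Return (mean difference, t statistic, degrees of freedom)."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)  # sample variances (n - 1 denominator)
    mean_diff = mean(a) - mean(b)
    if equal_var:
        # Student's t-test: pool the two variances.
        pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
        se = math.sqrt(pooled * (1 / na + 1 / nb))
        df = na + nb - 2
    else:
        # Welch's t-test with Welch-Satterthwaite degrees of freedom.
        se = math.sqrt(va / na + vb / nb)
        df = (va / na + vb / nb) ** 2 / (
            (va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1)
        )
    return mean_diff, mean_diff / se, df


# Made-up occupancy values for one hypothetical transcription factor.
functional = [1, 2, 3, 4, 5]
non_functional = [2, 4, 6, 8, 10]
diff_pooled, t_pooled, df_pooled = two_sample_t(functional, non_functional, True)
diff_welch, t_welch, df_welch = two_sample_t(functional, non_functional, False)
```

With equal group sizes the two forms give the same t statistic and differ only in the degrees of freedom, which is why the EqualVar flag still matters for the resulting p-value.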

#### Download transcription factor occupancy data: