<p>Review Article</p><p>The Evolution of Transcriptional Regulation in Eukaryotes</p><p>Gregory A. Wray, Matthew W. Hahn, Ehab Abouheif, James P. Balhoff, Margaret Pizer,</p><p>Matthew V. Rockman, and Laura A. Romano</p><p>Department of Biology, Duke University</p><p>Gene expression is central to the genotype-phenotype relationship in all organisms, and it is an important component of</p><p>the genetic basis for evolutionary change in diverse aspects of phenotype. However, the evolution of transcriptional</p><p>regulation remains understudied and poorly understood. Here we review the evolutionary dynamics of promoter, or cis-</p><p>regulatory, sequences and the evolutionary mechanisms that shape them. Existing evidence indicates that populations</p><p>harbor extensive genetic variation in promoter sequences, that a substantial fraction of this variation has consequences for</p><p>both biochemical and organismal phenotype, and that some of this functional variation is sorted by selection. As with</p><p>protein-coding sequences, rates and patterns of promoter sequence evolution differ considerably among loci and among</p><p>clades for reasons that are not well understood. Studying the evolution of transcriptional regulation poses empirical and</p><p>conceptual challenges beyond those typically encountered in analyses of coding sequence evolution: promoter or-</p><p>ganization is much less regular than that of coding sequences, and sequences required for the transcription of each locus</p><p>reside at multiple other loci in the genome. Because of the strong context-dependence of transcriptional regulation,</p><p>sequence inspection alone provides limited information about promoter function. 
Understanding the functional con-</p><p>sequences of sequence differences among promoters generally requires biochemical and in vivo functional assays.</p><p>Despite these challenges, important insights have already been gained into the evolution of transcriptional regulation, and</p><p>the pace of discovery is accelerating.</p><p>1 Introduction</p><p>A gene embedded in random DNA is inert. In the</p><p>absence of sequence motifs and proteins capable of di-</p><p>recting transcription, the protein it encodes will remain in-</p><p>visible to selection. Every gene with a phenotypic impact is</p><p>flanked by regulatory sequences that, in conjunction with</p><p>the expression and activity of proteins encoded elsewhere,</p><p>regulate when expression occurs, at what level, under what</p><p>environmental conditions, and in which cells or tissues.</p><p>Transcriptional regulatory sequences are as important for</p><p>gene function as the coding sequences that determine the</p><p>linear array of amino acids in a protein.</p><p>Transcriptional regulation is also a crucial contribu-</p><p>tor to evolutionary change in the genotype-phenotype</p><p>relationship. Understanding the dynamic link between</p><p>genotype and phenotype remains a central challenge in</p><p>evolutionary biology (Wright 1982; Raff 1996; Wilkins</p><p>2002). Enormous advances have been made during the past</p><p>few decades in understanding the dynamics of alleles</p><p>within populations, the role of genes during development,</p><p>and the evolution of phenotype. Although these studies</p><p>have progressed along nearly independent paths (for his-</p><p>torical perspectives, see Raff [1996] and Wilkins [2002]),</p><p>they have recently begun to intersect in fruitful and exciting</p><p>ways in studies of gene expression. 
This work is making</p><p>substantial contributions to the understanding of how the</p><p>genotype-phenotype relationship evolves.</p><p>The goal of this review is to bring transcriptional</p><p>regulation into the mainstream of molecular evolution.</p><p>We are concerned here with promoters (cis-regulatory se-</p><p>quences that influence transcription) and transcription</p><p>factors (proteins that interact with these sequences).</p><p>Throughout, we emphasize three general points. First,</p><p>changes in transcriptional regulation comprise a quantita-</p><p>tively and qualitatively significant component of the</p><p>genetic basis for evolutionary change. Second, understand-</p><p>ing how transcriptional regulation evolves requires a clear</p><p>grasp of how the relevant macromolecules interact and</p><p>function in living cells. And third, studying the evolution of</p><p>transcriptional regulation poses unique and significant</p><p>challenges to both empirical and analytical approaches.</p><p>These challenges are balanced, however, by extraordinary</p><p>opportunities to extend and deepen our understanding of</p><p>the genetic basis for phenotypic evolution.</p><p>2 Why Promoter Evolution Matters</p><p>Several recent reviews have argued that changes in</p><p>transcriptional regulation constitute a major component of</p><p>the genetic basis for phenotypic evolution (Doebley and</p><p>Lukens 1998; Carroll 2000; Stern 2000; Tautz 2000;</p><p>Theissen et al. 2000; Purugganan 2000; Wray and Lowe</p><p>2000; Carroll, Grenier, and Weatherbee 2001; Davidson</p><p>2001; Wilkins 2002). Although the authors reached similar</p><p>conclusions, they provided limited evidence to support the</p><p>claim that mutations affecting transcriptional regulation</p><p>have important evolutionary consequences. 
In this section we therefore review the theoretical arguments and empirical evidence that transcriptional regulation plays a pervasive and important role in evolution.</p><p>Key words: binding site, enhancer, evolution of development, genotype-phenotype relationship, promoter, transcription factor.</p><p>E-mail: gwray@duke.edu.</p><p>Mol. Biol. Evol. 20(9):1377–1419. 2003</p><p>DOI: 10.1093/molbev/msg140</p><p>Molecular Biology and Evolution, Vol. 20, No. 9, © Society for Molecular Biology and Evolution 2003; all rights reserved.</p><p>Downloaded from https://academic.oup.com/mbe/article/20/9/1377/976747 by guest on 07 August 2024</p><p>2.1 Theoretical Arguments: Why Promoters Ought to Contribute to Phenotypic Evolution</p><p>Before direct evidence was available, a few far-sighted biologists argued on the basis of first principles that changes in gene expression should constitute an important part of the genetic basis for phenotypic change (Jacob and Monod 1961; Wallace 1963; Zuckerkandl 1963; Britten and Davidson 1969, 1971; King and Wilson 1975; Wilson 1975; Jacob 1977; Raff and Kaufman 1983). Their arguments were based in part on the realization that the phenotypic impact of a gene is a function of two distinct components: the biochemical activity of the protein it encodes and the specific conditions under which that protein is expressed and is therefore able to exert its activity. During subsequent decades, the field of molecular evolution focused on the evolutionary implications of the first component of function, while developmental biologists were more concerned with the functional implications of the second. 
The revival of ‘‘evo-devo’’ has focused attention on</p><p>a more integrative view that encompasses both protein</p><p>function and gene expression (Raff 1996; Wilkins 2002).</p><p>Four additional considerations suggest that transcrip-</p><p>tional regulation ought to be evolutionarily important. (1)</p><p>Significant phenotypes. Many authors have commented</p><p>on the direct relationship between when or where a gene</p><p>is expressed and the functionally significant phenotypes</p><p>that might result from changing these parameters (Raff</p><p>and Kaufman 1983; Gerhart and Kirschner 1997; Carroll</p><p>2000; Davidson 2001; Wilkins 2002). For instance, ear-</p><p>lier expression of a hormone might result in accelerated</p><p>growth, whereas ectopic expression of a transcription</p><p>factor might result in a duplicated structure. Importantly,</p><p>these phenotypic transformations can be independent of</p><p>changes in protein sequences. Changes in precisely how</p><p>transcription is regulated can also have significant</p><p>phenotypic consequences (Paigen 1989). For instance,</p><p>synthesizing a digestive enzyme in response to feeding or</p><p>resource availability might prove advantageous compared</p><p>with continuous production (Jacob and Monod 1961).</p><p>Such changes may form the basis of polyphenism and</p><p>phenotypic plasticity (Schlichting and Pigliucci 1998;</p><p>Gilbert 2001). (2) Coordinated pleiotropy. Because the</p><p>proteins that regulate transcription interact with batteries</p><p>of functionally related genes, a mutation affecting the</p><p>function or expression of a transcription factor can</p><p>potentially produce</p><p>engrailed: Jaynes and O’Farrell 1991). (2) A</p><p>transcription factor may interact with another transcrip-</p><p>tion factor before or as it binds to DNA, in a variety of</p><p>functional contexts. 
Several transcription factors bind DNA only as homodimers or heterodimers (e.g., many nuclear receptor family members: Benecke, Gaudon, and Gronemeyer 2001); others can bind DNA only when they are not bound to a cofactor (e.g., myoD and Id: Benezra et al. 1990); and still others can bind DNA alone, but their specificity and/or association kinetics change when complexed with a cofactor (e.g., many homeodomain proteins: Lufkin 2001). Other transcription factors bind as heterodimers with a variety of partners, with distinct consequences for transcription (e.g., homeodomain proteins: Pinsonneault et al. 1997; max and partners myc/mad: Grandori et al. 2000).</p>
<table>
<caption>Table 3. Size of Selected Transcription Factor Families in Five Eukaryotes</caption>
<tr><th>Transcription Factor Family<sup>a</sup></th><th>Saccharomyces cerevisiae</th><th>Caenorhabditis elegans</th><th>Drosophila melanogaster</th><th>Homo sapiens</th><th>Arabidopsis thaliana</th></tr>
<tr><td>Homeodomain</td><td>9</td><td>109</td><td>148</td><td>267</td><td>118</td></tr>
<tr><td>Nuclear receptor</td><td>1</td><td>183</td><td>25</td><td>59</td><td>4</td></tr>
<tr><td>Zn-finger</td><td>121</td><td>437</td><td>357</td><td>706</td><td>1,049</td></tr>
<tr><td>Runt-domain</td><td>0</td><td>2</td><td>4</td><td>3</td><td>0</td></tr>
<tr><td>Basic HLH</td><td>7</td><td>41</td><td>84</td><td>131</td><td>106</td></tr>
<tr><td>Paired box</td><td>0</td><td>23</td><td>28</td><td>38</td><td>2</td></tr>
<tr><td>Myb</td><td>15</td><td>17</td><td>18</td><td>32</td><td>243</td></tr>
</table>
<p><sup>a</sup> Tallies of number of genes in each family from Venter et al. (2001) and Lander et al. (2001).</p><p>(3) A transcription factor bound to DNA may physically inhibit binding of a different transcription factor to a nearby site. 
For steric hindrance</p><p>to work, the two binding sites must be near each other</p><p>(usually on the same face of the DNA strand), and the</p><p>affinity of the blocking protein for its binding site or its</p><p>concentration must exceed that of the blocked protein.</p><p>Because steric hindrance involves nonspecific protein-</p><p>protein interactions, in principle any transcription factor</p><p>can operate in this way. (4) A transcription factor bound to</p><p>DNA may alter chromatin structure. Some transcription</p><p>factors maintain local chromatin in a decondensed state</p><p>(e.g., trithorax: Mahmoudi and Verrijzer 2001), and others</p><p>condense it (e.g., groucho: Chen and Courey 2000;</p><p>polycomb: Jacobs and van Lohuizen 1999). These</p><p>transcription factors recruit multiprotein complexes such</p><p>as the SWI/SNF complex (Varga-Weisz 2001), enzymes</p><p>that acetylate, deacetylate, methylate, or demethylate</p><p>histones (Vogelauer et al. 2000; Richards and Elgin</p><p>2002), or enzymes that methylate or demethylate DNA</p><p>(Jones and Takai 2001). Chromatin remodeling is highly</p><p>dynamic and is apparently regulated on spatial scales as</p><p>small as promoters, or even regions within a promoter</p><p>(Kadosh and Struhl 1998; Wolffe 2001). Although</p><p>chromatin condensation probably overrides most protein-</p><p>DNA interactions by physically blocking access to binding</p><p>sites, some transcription factors can associate with DNA in</p><p>partially condensed chromatin (Narlikar, Fan, and King-</p><p>ston 2002). (5) A transcription factor bound to DNA may</p><p>stabilize the bending or looping of DNA. Some proteins</p><p>facilitate local bending of DNA, allowing other bound</p><p>proteins that are near each other but not in contact to</p><p>interact (e.g., Sox: Scaffidi and Bianchi 2001; Tcf/Lef-1:</p><p>Love et al. 1995; Sp1: Sjottem, Andersen, and Johansen</p><p>1997). 
Other proteins stabilize DNA loops by forming</p><p>homodimers, facilitating interactions among transcription</p><p>factors bound at distant sites (e.g., GCF1: Zeller et al.</p><p>1995; RIP60: Houchens et al. 2000). Some of these so-</p><p>called architectural proteins may be necessary, rather than</p><p>sufficient, to activate or repress transcription; others,</p><p>however, play a direct role in modulating the frequency</p><p>of transcriptional initiation (e.g., Tcf/Lef-1: Fry and</p><p>Farnham 1999). (6) Transcription cofactors that do not</p><p>bind to DNA can mediate interactions between DNA-</p><p>binding proteins and the transcriptional machinery or</p><p>chromatin-remodeling enzymes. Some proteins influence</p><p>transcription by mediating specific interactions between</p><p>transcription factors and effector proteins, primarily</p><p>cofactors of the polymerase II complex or chromatin-</p><p>remodeling complexes. Many such transcription cofactors</p><p>have been identified. Some seem to interact with a re-</p><p>stricted set of transcription factors (e.g., OCA-B: Gstaiger</p><p>et al. 1995; TIF-1: Glass, Rose, and Rosenfeld 1997), but</p><p>others interact with a variety of phylogenetically unrelated</p><p>transcription factors (e.g., CBP/p300 interacts with CREB,</p><p>myoD, Myb, Jun, Fos, nuclear receptor, and AP-1 family</p><p>members: Shikama, Lyon, and La Thangue 1997, Wolffe</p><p>2001). Promoter sequences contain no direct evidence of</p><p>cofactor interactions.</p><p>3.4.6 Many Transcription Factors Act Primarily as</p><p>Activators or Repressors of Transcription</p><p>The presence of particular protein-protein interaction</p><p>domains dictates to a large extent what effect a given</p><p>transcription factor will have once it is bound to DNA (see</p><p>section 3.4.5). 
A variety of transcriptional activation domains have been identified that mediate direct interaction with TBP or indirect interaction by means of a TAF (Triezenberg 1995). Some transcription factors contain more than one activation domain (e.g., GAL4: Gill and Ptashne 1987; CREB: White 2001). Likewise, various repressor domains are known, although their mechanisms of operation are less well understood (Hanna-Rose and Hansen 1996; Latchman 1998). “Domain-swapping” experiments demonstrate that these domains alone are sufficient to turn a transcription factor from an activator into a repressor and vice versa.</p><p>3.4.7 The Effect of Some Transcription Factors Is Context Dependent</p><p>The activity of many transcription factors depends on post-translational covalent modifications, most commonly phosphorylation (e.g., Oct-1: Segil, Roberts, and Heintz 1991), acetylation (e.g., p53: Gu and Roeder 1997), and glycosylation (e.g., Sp1: Jackson and Tjian 1988). These modifications often provide an important point of control over transcription, and phosphorylation in particular is often dynamically regulated (Roberts, Segil, and Heintz 1991). The effect of a transcription factor may be strongly context dependent, even once it is bound to DNA and despite the presence of an activation or repression domain (Biggin and McGinnis 1997; Yamamoto et al. 
1998; Fry and Farnham 1999; see section 3.5.3). Some transcription factors require other bound proteins to function; others interact synergistically, producing a much stronger effect on transcription in combination than alone (Sauer, Hansen, and Tjian 1995; Thanos and Maniatis 1995).</p>
<table>
<caption>Table 4. Overlapping Binding Site Specificities</caption>
<tr><th>Binding Site</th><th>Transcription Factors that Bind<sup>a</sup></th><th>Reference<sup>b</sup></th></tr>
<tr><th colspan="3">Paralogs<sup>c</sup></th></tr>
<tr><td>ATTA</td><td>Engrailed, even-skipped, fushi-tarazu</td><td>1</td></tr>
<tr><td>GGATTA</td><td>Orthodenticle, goosecoid</td><td>2</td></tr>
<tr><td>CACGTG</td><td>Myc, Mad</td><td>3</td></tr>
<tr><th colspan="3">Unrelated Proteins<sup>d</sup></th></tr>
<tr><td>CCATATTTGG</td><td>SRE, YY1<sup>e</sup></td><td>4</td></tr>
<tr><td>TCAATGT</td><td>IRE-ABP, C/EBP<sup>e</sup></td><td>5</td></tr>
<tr><td>GGGGCGTGGGCTG</td><td>Sp1, Egr<sup>e</sup></td><td>4</td></tr>
</table>
<p><sup>a</sup> Even though these proteins all bind specifically to the sequences shown, binding kinetics may differ. Proteins are probably recognizing a subset of nucleotides in the longer target sites, based on their known specificities for other sequences.</p><p><sup>b</sup> (1) Biggin and McGinnis (1997); (2) Angerer et al. (2001); (3) James and Eisenman (2002); (4) Fry and Farnham (1999); (5) Buggs et al. (1998).</p><p><sup>c</sup> Proteins belonging to the same family; not necessarily the closest paralogs.</p><p><sup>d</sup> Proteins with no discernible genealogical relationship.</p><p><sup>e</sup> Overline is binding site of first protein listed, underline of second protein.</p><p>More dramatically, some transcription factors function as either activators or repressors in different contexts. This can happen in at least four ways: (1) activation and repression domains may be present in the same protein (e.g., Dorsal: Flores-Saaib, Jia, and Courey 2001); (2) a protein may interact with different partners which contain distinct interaction domains (e.g., runt: Wheeler et al. 
2000; many</p><p>homeodomain proteins: Knoepfler and Kamps 1995 and</p><p>Pinsonneault et al. 1997; many nuclear receptor proteins:</p><p>Benecke, Gaudon, and Gronemeyer 2001); (3) any DNA-</p><p>binding protein can act as a repressor if it masks the</p><p>binding site of a transcriptional activator, an effect that</p><p>does not require a specialized repressor domain; and (4)</p><p>because transcription factors can influence the expression</p><p>of other transcription factors, a transcriptional activator</p><p>can repress other genes through the intermediate step of</p><p>activating a repressor, or vice versa (e.g., RPD: Bernstein</p><p>et al. 2000).</p><p>3.4.8 Transcription Factors Are Not Intrinsically Limited</p><p>to Specific Developmental or Regulatory Roles</p><p>The aspects of organismal phenotype affected by</p><p>a particular transcription factor are determined by the</p><p>downstream, or target, genes it regulates (and the genes</p><p>affected by their activity, and so forth). Unlike metabolic</p><p>enzymes and structural proteins, whose biochemical</p><p>activities determine phenotype, there is nothing about</p><p>a transcription factor that intrinsically links it to a particular</p><p>aspect of phenotype. Some transcription factors have</p><p>names that imply dedicated functions (e.g., Eyeless in</p><p>eye development, APETALA1 in floral patterning), but</p><p>these proteins have additional regulatory roles unrelated to</p><p>their eponymous ones. Many transcription factors have</p><p>extended evolutionary associations with particular de-</p><p>velopmental processes, most famously Hox proteins with</p><p>anteroposterior patterning in animals (Gerhart and Kirsch-</p><p>ner 1997) and MADS-box proteins with floral patterning in</p><p>plants (Lawton-Rauh et al. 2000). 
In principle, however,</p><p>any transcription factor could bind to the promoter of any</p><p>gene and regulate its expression, so long as an appropriate</p><p>binding site is present; conversely, a gene’s transcription</p><p>could, in principle, be regulated by any transcription factor.</p><p>This lack of an obligate connection between a specific</p><p>transcription factor and a specific aspect of phenotype is</p><p>evident on both developmental and evolutionary time</p><p>scales: during development most transcription factors are</p><p>expressed in several temporally and spatially distinct</p><p>phases, where they may regulate the expression of different</p><p>downstream genes that influence completely unrelated</p><p>aspects of phenotype (Duboule and Wilkins 1998;</p><p>Davidson 2001), and during the course of evolution the</p><p>downstream genes regulated by a transcription factor can</p><p>change dramatically (Keys et al. 1999; Davidson 2001;</p><p>Wilkins 2002).</p><p>3.5 Promoter Function</p><p>The transcriptional output of a promoter is not</p><p>a simple function of which binding sites are present. The</p><p>relative position, orientation, and nucleotide sequences</p><p>of these binding sites, as well as the expression profiles</p><p>of their cognate transcription factors and cofactors, all</p><p>interact to produce the total transcription profile of a gene.</p><p>These interactions are complex, nonlinear, and often</p><p>strongly context dependent. At least for the near term,</p><p>we lack the ability to predict transcription profiles from</p><p>sequence inspection.</p><p>3.5.1 Transcription Is ‘‘Off ’’ by Default</p><p>Native chromatin is impervious to the RNA poly-</p><p>merase II complex, and even a decondensed basal</p><p>promoter cannot efficiently direct transcription in the</p><p>absence of specific transcription factors (Carey and Smale</p><p>2000; Courey 2001). 
Because transcription is “off” by default, all promoters contain binding sites for activators of transcription but only some contain binding sites for negative regulators (Arnone and Davidson 1997; Davidson 2001). Although not ubiquitous, repression is common and can be important for modulating the level of transcription, for restricting expression from inappropriate regions, and for adjusting gene expression in response to extracellular signals (Gray and Levine 1996; Shore and Sharrocks 2001). Activating transcription requires decondensing the chromatin surrounding the basal promoter and around some transcription factor binding sites, followed by DNA binding by specific transcription factors capable of recruiting the RNA polymerase II complex onto the basal promoter. In practice, activating transcription at a single locus requires dozens of specific interactions among macromolecules (Thanos and Maniatis 1995; Reinberg et al. 1998; Wolffe 2001) (fig. 1B).</p><p>3.5.2 Binding Site Position and Orientation Are Functionally Tied More Closely to Nearby Binding Sites than to the Basal Promoter</p><p>Some protein-protein interactions depend on precise spacing and relative orientation of binding sites (e.g., Hanes et al. 1994). In particular, steric hindrance and some cases of cooperative binding require binding sites to be in specific positions relative to each other. These interactions involve binding sites that typically lie no farther apart than the size of the proteins that they bind (in practice, up to a few tens of base pairs apart). Some interactions are precisely phased to lie on the same side of nucleosomes (~40-bp multiples) or completely decondensed DNA (~10-bp multiples) (Lewin 2000; White 2001). 
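</p><p>The phasing rule just described can be illustrated with a minimal sketch. It assumes, as a simplification, that two binding sites lie on the same face when the distance between their centers is close to a whole multiple of the phasing period (about 10 bp for decondensed DNA, about 40 bp for nucleosomal DNA, as stated above). The function name, the coordinate convention, and the 2-bp tolerance are hypothetical choices made for illustration and are not taken from the studies cited.</p>

```python
def same_face(site_a_bp: int, site_b_bp: int,
              period_bp: float = 10.0, tolerance_bp: float = 2.0) -> bool:
    """Return True if two binding-site centers are roughly in phase.

    period_bp: assumed phasing period (about 10 bp for decondensed DNA,
               about 40 bp for nucleosomal DNA, following the text).
    tolerance_bp: hypothetical allowance for deviation from exact phasing.
    """
    if period_bp <= 0:
        raise ValueError("period_bp must be positive")
    distance = abs(site_a_bp - site_b_bp)
    offset = distance % period_bp
    # Deviation from the nearest whole multiple of the period.
    deviation = min(offset, period_bp - offset)
    return deviation <= tolerance_bp


# Sites 20 bp apart are in phase on decondensed DNA; sites 25 bp apart are not.
print(same_face(100, 120))                 # True
print(same_face(100, 125))                 # False
# Sites 41 bp apart are in phase under the assumed nucleosomal period.
print(same_face(0, 41, period_bp=40.0))    # True
```

<p>A check of this kind addresses only spacing; it says nothing about orientation or about whether the bound proteins actually interact.</p><p>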
In contrast, many protein-protein interactions take place through DNA looping and are relatively insensitive to position and orientation. Binding sites that lie more than a few tens of base pairs away from the basal promoter must interact with it via DNA bending or looping, and they often tolerate changes in position relative to the transcription unit (indeed, this was part of the original operational definition of an “enhancer”: Serfling, Jasin, and Schaffner 1985; Atchison 1988). Binding sites that interact with chromatin remodeling complexes may also have only moderate functional constraints on position and orientation. This combination of position sensitivity for local interactions and position insensitivity for interactions with effector complexes may underlie the evolutionary origin and maintenance of promoter modules (see section 3.5.4).</p><p>3.5.3 The Regulatory Role of a Binding Site Is Often Context Dependent</p><p>Some binding sites have discrete functions within promoters, in the sense that when they are occupied by a protein they consistently have the same effect on transcription. In many instances, however, the consequences of transcription factor binding are strongly context dependent (fig. 4) (Biggin and McGinnis 1997; Fry and Farnham 1999; Lemon and Tjian 2000; Courey 2001). (1) The presence or absence of cofactors is often important. Many transcription factors interact with transcriptional or chromatin-remodeling complexes through cofactors (see section 3.4.5). 
For instance, the transcriptional activator</p><p>CREB requires the cofactor CBP to recruit the RNA</p><p>polymerase II complex (Shikama, Lyon, and La Thangue</p><p>1997), whereas the repressor protein groucho requires one</p><p>of several cofactors to initiate chromatin condensation</p><p>(Chen and Courey 2000). (2) Some binding sites bind</p><p>different transcription factors under different circum-</p><p>stances. The consensus binding sequences of many</p><p>transcription factors overlap (table 4). The effects of</p><p>different proteins that interact with the same binding site</p><p>can differ depending on the protein-protein interaction</p><p>domains they contain. For instance, the locus that encodes</p><p>the transcription factor CREM in mice produces six distinct</p><p>isoforms, all of which can recognize some of the same</p><p>binding site sequences; some isoforms contain protein-</p><p>protein interaction domains that activate transcription,</p><p>while others lack these domains and block the activator</p><p>isoforms from binding (Foulkes and Sassone-Corsi 1992).</p><p>(3) Some binding sites are positioned sufficiently near each</p><p>other that only one protein can bind at a time. Changes in</p><p>the relative concentrations of transcription factors that</p><p>interact with adjacent binding sites can have a significant</p><p>impact on transcription (see section 3.4.5).</p><p>(4) Some</p><p>transcription factors interact synergistically during or</p><p>after binding to DNA. Several cases are known where</p><p>different transcriptional activators individually have little</p><p>or no effect on transcription, but in combination they</p><p>produce a strong effect. 
For instance, activation of human interferon-beta (IFN-β) transcription requires that several proteins be present, none of which can activate transcription alone (Thanos and Maniatis 1995).</p><p>3.5.4 Binding Sites Are Sometimes Organized into Functional Modules</p><p>Clusters of nearby transcription factor binding sites sometimes operate as functionally coherent modules (Dynan 1989; Kirchhamer, Yuh, and Davidson 1996; Arnone and Davidson 1997). A module is operationally defined as a cluster of binding sites that produces a discrete aspect of the total transcription profile. A single module typically contains about 6 to 15 binding sites and binds 4 to 8 different transcription factors (Arnone and Davidson 1997; Davidson 2001). Although many promoters contain two or more clearly distinct modules (fig. 2D and E, 2H, 2J and K, 2P), others apparently lack modular organization (fig. 2A–C and 2I). Modules are often, but not necessarily, physically separated around a locus (compare fig. 2H, 2J and 2K with fig. 2D). 
A single module may carry out one</p><p>or a combination of the following: (1) initiate transcrip-</p><p>tion, often in a highly specific manner such as within</p><p>a single cell type or region of an embryo; (2) boost</p><p>transcription rate without being able to initiate it; (3)</p><p>mediate signals from outside the cell, by binding</p><p>a transcription factor that either contains a receptor for</p><p>a hormone or that is post-translationally modified by</p><p>a signal transduction system; (4) repress transcription</p><p>under specific conditions or in specific regions or cell</p><p>types; (5) restrict the effect of another module to a single</p><p>basal promoter through an ‘‘insulator’’ function (see</p><p>section 3.3.6); (6) selectively ‘‘tether’’ other modules, by</p><p>bringing them into proximity with a single basal promoter</p><p>(see section 3.3.6); or (7) integrate the status of other</p><p>modules by influencing transcription differently, depend-</p><p>ing on what proteins are bound elsewhere (Yuh, Bolouri,</p><p>and Davidson 1998). The most common term for</p><p>a promoter module in the literature is an ‘‘enhancer.’’</p><p>Enhancers were originally defined operationally as seg-</p><p>ments of DNA capable of elevating transcription in</p><p>a position-independent and orientation-independent man-</p><p>ner (Serfling, Jasin, and Schaffner 1985; Atchison 1988).</p><p>The term has since been applied much more broadly, to</p><p>any region of DNA that produces a specific aspect of</p><p>a transcription profile, sometimes even including regions</p><p>that repress transcription. Further ambiguities stem from</p><p>the fact that it is not always possible to assign a single</p><p>function to a region of a promoter (see section 3.5.3). The</p><p>terms enhancer, booster, activator, insulator, repressor,</p><p>locus control region, upstream activating sequence, and</p><p>upstream repressing sequence, all refer to various kinds of</p><p>modules. 
Although these terms are descriptive of function, they may be misleading if later studies demonstrate multiple functions or context dependence of function. For these reasons, we use the more general term module (Dynan 1989; Arnone and Davidson 1997).</p><p>3.5.5 Activator and Repressor Modules Are Often Additive in Effect</p><p>Experiments often reveal that deleting a single module eliminates a specific aspect of the expression profile without disrupting the remainder (e.g., DiLeone, Russell, and Kingsley 1998; Yuh, Bolouri, and Davidson 1998; Kammandel et al. 1999; Sackerson, Fujioka, and Goto 1999). Conversely, predictable artificial expression profiles can be built by experimentally combining modules from different promoters (e.g., Kirchhamer, Bogarad, and Davidson 1996). These experimental results are the primary basis for the claim that the modularity of promoters contributes to their “evolvability” (Stern 2000; Wilkins 2002). In contrast, the effects of experimentally deleting insulator, tethering, or integrator modules are epistatic rather than additive (Ohtsuki, Levine, and Cai 1998; Yuh, Bolouri, and Davidson 1998; Calhoun, Stathopoulos, and Levine 2002).</p><p>3.5.6 Collaboration Among Modules Produces the Total Transcription Profile</p><p>Two aspects of promoter function are reminiscent of analog logic circuits (Yuh, Bolouri, and Davidson 1998; Davidson 2001; Yuh, Bolouri, and Davidson 2001). (1) Individual modules can function as Boolean (off/on) or scalar (quantitative) elements whose interactions have predictable, additive effects on transcription. Multiple modules are sometimes required to produce a single phase of expression. 
For instance, the seven stripes of even-skipped transcription in the early embryo of Drosophila are controlled by six modules (Sackerson, Fujioka, and Goto 1999; fig. 2H). Conversely, a single module may be involved in several different phases of expression. For example, module A in the Endo16 promoter of the sea urchin Strongylocentrotus purpuratus (fig. 2D) activates transcription in the embryo, synergistically elevates transcription in the larva, and is required for the function of repressor modules (Yuh, Stathopoulos, and Levine 1998; Yuh, Bolouri, and Davidson 2001). (2) Promoters integrate multiple, diverse inputs and produce a single, scalar output: the rate of transcriptional initiation. A familiar analogy is a neuron, which receives input from many sources but whose output is simply how often it fires. In many promoters, signal integration happens at the basal promoter, through specific interactions between bound transcription factors and components of the RNA polymerase II enzyme complex (Latchman 1998; Lee and Young 2000). In some promoters, however, a distinct module may integrate signals from other modules. For instance, module A of the Endo16 promoter relays the status of the other five modules to the basal promoter (fig. 2D; Yuh, Stathopoulos, and Levine 1998; Yuh, Bolouri, and Davidson 2001).</p><p>3.6 Gene Networks</p><p>All genes are components of immensely complex networks of interacting loci. The binding of a transcription factor to a promoter is one of several physical determinants of gene network architecture. Understanding the organization of gene networks (Lee et al. 2002; Milo et al. 2002) will be necessary for understanding how they evolve (von Dassow et al. 
2000; Davidson 2001; Wagner 2001).</p><p>3.6.1 Most Transcription Factors Have Numerous Downstream Target Genes</p><p>A simple calculation demonstrates that most eukaryotic transcription factors must bind to the promoters of many downstream genes. Eukaryotic genomes contain on the order of 0.5–5 × 10<sup>4</sup> genes, only a small fraction of which encode transcription factors (table 3 provides a partial list for several species). Because the expression of all genes requires that transcription factors bind to their promoters, and because most promoters contain binding sites for at least five different transcription factors (and often many more), transcription factors must on average interact with the promoters of tens to hundreds of genes. These rough calculations agree with studies that have used experimental approaches to estimate the number of direct downstream targets of a specific transcription factor. In Drosophila melanogaster, Ubx isoform Ia alone regulates an estimated 85–170 direct downstream targets (Mastick et al. 1995), whereas eve and ftz together appear to regulate the majority of genes in the Drosophila genome (Liang and Biggin 1998). The number and identity of direct downstream targets have been assayed by in vivo binding for many transcription factors in Saccharomyces cerevisiae (Iyer et al. 2001; Lieb et al. 2001, 2002), and for some transcription factors these analyses have been carried out on cells grown under more than one environmental condition (Ren et al. 2000). Even using conservative criteria for recognizing interactions, these analyses indicate that most transcription factors directly regulate a few percent of the genes in the Saccharomyces genome. 
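</p><p>The rough calculation described in this section can be made explicit with a short Python sketch. The gene counts (0.5–5 × 10<sup>4</sup>) and the minimum of five transcription factors per promoter come from the text; the percentages of genes assumed to encode transcription factors (2%, 5%, and 10%) are purely illustrative, and the function names are hypothetical. The output should therefore be read as an order-of-magnitude check, not as an estimate for any real genome.</p>

```python
def mean_targets_per_factor(n_genes, n_factors, factors_per_promoter=5):
    """Average number of promoters each transcription factor must bind.

    Every promoter is bound by `factors_per_promoter` distinct factors, so the
    genome holds n_genes * factors_per_promoter factor-promoter links, which
    are shared among n_factors transcription factors.
    """
    if n_factors <= 0:
        raise ValueError("n_factors must be positive")
    return n_genes * factors_per_promoter / n_factors


def survey(gene_counts=(5_000, 50_000), tf_percentages=(2, 5, 10)):
    """Tabulate the mean over the gene range given in the text.

    tf_percentages are ASSUMED, illustrative values: the text says only that
    a small fraction of genes encode transcription factors.
    """
    rows = []
    for n_genes in gene_counts:
        for pct in tf_percentages:
            n_factors = n_genes * pct // 100  # assumed number of factor genes
            mean = mean_targets_per_factor(n_genes, n_factors)
            rows.append((n_genes, pct, mean))
    return rows


if __name__ == "__main__":
    for n_genes, pct, mean in survey():
        print(f"{n_genes:>6} genes, {pct:>2}% TFs: ~{mean:.0f} targets per factor")
```

<p>Under these assumptions the average depends only on the ratio of factors per promoter to the fraction of genes that encode transcription factors, which is why the result stays in the range of tens to hundreds of targets regardless of genome size.</p><p>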
Genetic networks are therefore highly connected, with each node that is represented by a transcription factor linked to many other nodes. This high degree of connectivity may be responsible in large part for the classical genetic phenomena of epistasis, polygeny, and pleiotropy (Gibson 1996).</p><p>3.6.2 Transcription Is Often Modulated by Feedback Loops</p><p>The expression profile of a gene is a system property, in that it is sensitive to changes in the expression and activity of gene products encoded by many other loci. Thus, even if a mutation in a promoter region alters transcription, the network of functionally interacting genes and gene products may modulate this effect (von Dassow et al. 2000). For instance, a mutation that doubles transcription rate may not result in twice as much protein being produced if there is feedback from the cytoplasm to the nucleus that is sensitive to protein level or to the functional consequences of protein activity (such as accumulation of a particular metabolite). Feedback loops are probably rather common components of gene networks (Lee et al. 2002; Milo et al. 2002) and may mask some functionally significant mutations in promoters.</p><p>3.6.3 A Significant Fraction of the Genome Is Involved in Transcriptional Regulation</p><p>Several large-scale interspecific sequence comparisons have estimated that the number of conserved intergenic nucleotides is similar to the number of conserved coding nucleotides (Shabalina and Kondrashov 1999; Onyango et al. 2000; Bergman and Kreitman 2001; Frazer et al. 2001; Shabalina et al. 2001). 
This striking result suggests that the number of functional noncoding nucleotides is approximately equal to the number of protein-coding nucleotides, and that approximately half of all functionally or phenotypically penetrant molecular evolution involves noncoding sequences. The hundreds of transcription units encoding general and specific transcriptional factors, chromatin remodeling complexes, and transcription cofactors add to the sequences involved in transcriptional regulation. A substantial fraction of a eukaryotic genome is devoted to extracting information from itself.</p><p>4 The Rich Phenomenology of Promoter Evolution</p><p>The literature relevant to promoter evolution is diffuse, and little of it was written by or for evolutionary biologists. Nonetheless, the available information provides a foundation on which to build some initial generalizations. Parallels with, and differences from, the evolution of other regions of eukaryotic genomes are evident. In this section, we document many ways in which promoters and mechanisms of transcriptional regulation have interesting, informative, and varied evolutionary histories. Figure 5 provides an overview of how this section is organized.</p><p>4.1 Promoter Sequences Evolve at Different Rates</p><p>For reasons that remain poorly understood, coding sequences evolve at markedly uneven rates among lineages and within genomes (Gillespie 1991; Li 1997). Rate variation appears to be a prominent feature of promoter sequence evolution as well. (1) Long-term conservation. 
Similar clusters of transcription factor binding sites are sometimes present in the promoters of orthologous genes of species that diverged up to 10<sup>7</sup>–10<sup>8</sup> years ago (e.g., Aparicio et al. 1995; Frasch, Chen, and Lufkin 1995; Gerard, Zakany, and Duboule 1997; Beckers and Duboule 1998; Margarit et al. 1998; Papenbrock et al. 1998; Shashikant et al. 1998; Thompson et al. 1998; Plaza, Saule, and Dozier 1999; Xu et al. 1999; Tümpel et al. 2002). Although sequence similarities in promoters are routinely interpreted as conserved features, another possibility is independent origins of binding sites for the same transcription factor (Cavener 1992; Stone and Wray 2001). Both long-term conservation and parallel origins of binding sites for the same transcription factor suggest constraints on promoter function and imply that stabilizing selection is operating on the gene expression profile. (2) Rapid divergence. Promoter sequences can also diverge extensively among even relatively closely related species, and they may include gains and losses of multiple binding sites and changes in the position of regulatory sequences relative to the transcription start site (Wu and Brennan 1993; Takahashi et al. 1999; Wolff et al. 1999; Liu, Wu, and He 2000; Romano and Wray 2003). A comparison of 20 well-characterized regulatory regions in mammals revealed that approximately one-third of binding sites in humans are probably not functional in rodents (Dermitzakis and Clark 2002). Promoter sequence differences may or may not alter transcription (see section 4.3) or organismal phenotype (see section 4.5), depending on genetic background and environmental conditions.</p><p>4.2 Functional Changes in Promoters Arise from a Variety of Mutations</p><p>Mutations affecting transcription (fig. 
6) fall into several distinct classes. (1) Small-scale, local mutations can modify, eliminate, and generate binding sites and alter their spacing. Promoter function can be directly altered by the most abundant kinds of mutations: single base substitutions, small indels, and changes in repeat number (e.g., Gonzalez et al. 1995; Shashikant et al. 1998; Segal et al. 1999; Takahashi et al. 2001; Rockman and Wray 2002; Streelman and Kocher 2002). Point mutations can modulate or eliminate transcription factor binding, generate binding sites de novo, or result in binding by a different transcription factor (“transcription factor switching”: Rockman and Wray 2002). Insertions and deletions can change spacing between binding sites, as well as eliminate binding sites or generate new ones (Ludwig and Kreitman 1995; Belting, Shashikant, and Ruddle 1998). Changes in microsatellite structure can affect spacing between binding sites and alter the number of binding sites, sometimes with functional consequences (Trefilov et al. 2000; Rockman and Wray 2002; Streelman and Kocher 2002). (2) New regulatory sequences can be inserted into promoters through transposition. This phenomenon has been reviewed extensively (Britten 1997; Kidwell and Lisch 1997; Brosius 1999). For instance, B2 SINEs in Mus musculus contain sequences capable of acting as basal promoters (Ferrigno et al. 2001), and some Alu elements in humans contain binding sites for nuclear hormone receptors and exert an influence on transcription (Babich et al. 1999). (3) Retroposition may assemble new promoters. Retroposition can create novel genes that are subsequently expressed (e.g., jingwei and sphinx: Long, Wang, and Zhang 1999; Wang et al. 2002). 
This process occurs at appreciable frequencies within the genus Drosophila (Betrán et al. 2002). The molecular mechanisms underlying retroposition preclude transfer of the basal promoter and virtually all cis-regulatory sequences (the exception being those within exons). Because no gene can function without transcriptional regulatory sequences, it seems likely that novel genes that arise through retroposition either fortuitously insert near existing cis-regulatory sequences and come under their regulation without disrupting existing regulatory functions or persist long enough that novel cis-regulatory sequences arise through transposition, recombination, or small-scale local mutations. Remarkably, novel genes that arise through retroposition are often expressed in tissue-specific patterns similar to those of a parent locus (Betrán et al. 2002).</p><p>FIG. 5.—Varieties of evolutionary change in transcriptional regulation. The diversity of evolutionary patterns and mechanisms in transcriptional regulation can be organized by their genomic location (cis or trans) and functional consequence (silent, biochemical, expression, organismal, fitness). The numbers in this diagram refer to sections of the text that discuss that category of evolutionary change.</p><p>(4) Gene duplications may fragment or recombine promoter sequences. Although analyses of gene duplication typically focus on coding sequences, the associated promoters are clearly also important for gene function. 
If the breakpoints do not include cis-regulatory sequences, then the duplicated copy is likely to be transcriptionally inert in its new location and become a pseudogene even before it accumulates stop codons or frameshifts. If only part of the promoter is duplicated, the transcription profile of the new copy may differ from the original (e.g., nNOS: Korneev and O’Shea 2002). In principle, a duplication could also fortuitously combine sequences from two different promoters to create a hybrid cis-regulatory region with a novel transcription profile. Gene duplications that persist are frequently followed by divergence in expression (Li and Noll 1994; Gu et al. 2002; Stauber, Prell, and Schmidt-Ott 2002) and may be followed by loss of complementary promoter modules (Ferris and Whitt 1979; Force et al. 1999). (5) Gene conversion can spread regulatory elements within a gene family. Examples from humans include growth hormone (Giordano et al. 1997), beta and gamma globins (Chiu et al. 1997; Patrinos et al. 1998), and major histocompatibility complex (MHC) genes (Cereb and Yang 1994). Gene conversion is an ongoing process in RNA polymerase I–transcribed genes (which encode the 40S pre-rRNA that is processed to form 18, 5.8, and 28S rRNA) including associated transcriptional regulatory sequences, but not among the more heterogeneous RNA polymerase III–transcribed genes (White 2001). (6) Sequences that have no prior function in regulating gene expression can become fortuitous promoters. In one case, gene duplications resulted in a former exon functioning in transcriptional regulation (Sdic: Nurminsky et al. 1998). A second example involves functional transcription factor binding sites within an exon (nonA: Sandrelli et al. 2001). 
These “hopeful monster” promoters demonstrate that rare events can assemble functional cis-regulatory sequences from seemingly unpromising material.</p><p>4.3 Changes in Promoter Sequence Differ Widely in Their Effects on Transcription</p><p>Relatively little information exists about the functional consequences of naturally occurring differences in promoter sequences. A few studies have directly examined the biochemical impact of sequence differences on protein binding (e.g., Ruez, Payre, and Vincent 1998; Singh, Barbour, and Berge 1998; Wolff et al. 1999; Shaw et al. 2002). Most of what we know about functional consequences, however, comes from cases in which the resulting transcription profile has been examined (e.g., Ross, Fong, and Cavener 1994; Odgers, Healy, and Oakeshott 1995; Tournamille et al. 1995; Belting, Shashikant, and Ruddle 1998; Indovina et al. 1998; Ludwig, Patel, and Kreitman 1998; Wang et al. 1999; Romano and Wray 2003). In some of these cases, specific promoter sequence differences are correlated with phenotypic consequences; in most, however, the presence of multiple sequence differences makes it difficult to infer the precise basis for evolutionary changes in transcription. Divergence in promoter sequence and transcription profile are often poorly correlated: very similar promoters can produce substantially different transcription profiles (e.g., parvoviruses: Storgaard et al. 1993; TNFA in primates: Haudek et al. 1998; MMSP in diptera: Christophides et al. 2000), whereas highly divergent promoter sequences can produce very similar transcription profiles (e.g., runt in Drosophila: Wolff et al. 1999; brachyury in ascidians: Takahashi et al. 1999; yp in Drosophila: Piano et al. 
1999; Endo16 in sea urchins: Romano and Wray 2003).</p><p>FIG. 6.—Mutations affecting promoter structure and function. (A) Modifications in the sequence of a transcription factor binding site. The simplest such changes involve single nucleotide substitutions, insertions, or deletions. Repeat length changes can also affect binding sites. (B) Modifications in the spacing of binding sites. These changes can arise in several ways: from insertions and deletions of other genomic segments or mobile elements (as shown), from the accumulation of small indels, and from the expansion or contraction of repeats. (C) Modifications in the presence or absence of functional binding sites. Binding sites can arise or be lost through local point mutations, insertions, or deletions. (D) Large-scale changes in promoter organization. Several kinds of reorganization are possible, just two of which are shown. Mobile element insertions can “import” functional binding sites into a promoter (see section 4.2). Small chromosomal rearrangements can alter the orientation or location of clusters of binding sites. The functional consequence of each kind of change can range from no effect on gene expression to altered expression to loss of expression (see section 4.3).</p><p>The latter situation is not unusual (for additional examples, see section 4.7). Indeed, many changes in promoter sequence do not alter transcription within the limits of experimental assays. Sequence changes might be functionally silent for several reasons. (1) Substitutions and indels between transcription factor binding sites may not affect DNA-protein interactions. It is probably generally true that nucleotides within binding sites are more functionally constrained than those that lie between binding sites. However, it is difficult to rule out the possibility that a supposed nonbinding site nucleotide might in fact be part of an unrecognized binding site (see section 5.2). Furthermore, small indels that do not directly involve binding sites may disrupt protein-protein interactions by placing proteins on opposite sides of the DNA helix or by changing the spacing of binding sites (see section 3.5.2). (2) Changes in spacing between distant binding sites will be neutral in many cases. Interactions among proteins associated with binding sites more than ~50 bp apart are probably mediated by DNA bending or looping, which may to a large degree be insensitive to differences in spacing. (3) Some within-consensus nucleotide substitutions in binding sites may be functionally neutral. Certain changes in binding site sequence can preserve a particular DNA-protein interaction (see section 3.3.3). Not all such changes will be neutral, however, as binding kinetics may differ, in turn altering transcription (see section 3.3.4). Very little is known about the evolution of binding site consensuses, so sequence comparisons alone may be a poor guide as to which nucleotide substitutions within binding sites are likely to be functionally neutral. (4) Eliminating an entire binding site may be functionally neutral. Many promoters contain multiple copies of the same binding site, raising the possibility of functional redundancy. Cases of probable binding site turnover (Ludwig and Kreitman 1995; Hancock et al. 1999; Piano et al. 1999; Liu, Wu, and He 2000; Dermitzakis and Clark 2002; Scemama et al. 
2002) may have been possible because of functional redundancy. Nevertheless, multiply represented binding sites within the same promoter are not always functionally redundant (see section 5.5).</p><p>4.4 Gene Expression Profiles Evolve Frequently and in Diverse Ways</p><p>The literature of comparative gene expression has emphasized similarities and generally interpreted them as conserved features (DeRobertis and Sasai 1996; Holland and Holland 1999; Carroll, Grenier, and Weatherbee 2001). When comparing distantly related taxa, however, similarities in gene expression are often outweighed by apparently nonhomologous features (Wray and Lowe 2000; Davidson 2001; Wilkins 2002). The abundance of population-level variation in promoter function (see section 2.3) means that expression differences could evolve quite rapidly under some conditions, and indeed substantial differences in gene expression can exist even between recently duplicated genes (Gu et al. 2002) or closely related species (Parks et al. 1988; Ross, Fong, and Cavener 1994; Swalla and Jeffery 1996; Grbic, Nagy, and Strand 1998; Kissinger and Raff 1998; Brunetti et al. 2001; Ferkowicz and Raff 2001). Several functional classes of evolutionary change in gene expression are evident. (1) Changes in timing of gene expression. Temporal changes have been documented from many taxa (e.g., Dickinson 1988; Wray and McClay 1989; Swalla and Jeffery 1996; Kim, Kerr, and Min 2000; Skaer, Pistillo, and Simpson 2002). Heterochronies are a common pattern of anatomical evolution (McKinney and McNamara 1991), and must, at some level, involve heritable changes in the timing of gene expression. (2) Changes in spatial extent of gene expression. 
Many studies have found interspecific differences in the spatial extent of regulatory gene expression (e.g., Schiff et al. 1992; Abzhanov and Kaufman 2000; Brunetti et al. 2001; Scemama et al. 2002). Such changes are of particular interest when they affect regulatory genes, because of the relatively direct consequences for body proportions, organ size and number, and a great many other anatomical features (see section 2.2). (3) Changes in level of gene expression. Evolutionary differences in transcription rate have also been documented (Regier and Vlahos 1988; Crawford, Segal, and Barnett 1999; Wang et al. 1999). Such comparisons have been simplified with the advent of microarray technologies (e.g., Jin et al. 2001; Schadt et al. 2003). Because of this approach, we know more about differences in transcript abundance than about any other kind of evolutionary change in gene expression. (4) Changes in responsiveness of gene expression to external cues. Evolutionary changes in transcriptional responses to physiological status, environmental conditions, and pheromones have also been documented (Brakefield et al. 1996; Cooper 1999; Abouheif and Wray 2002). Such changes are a necessary component in the evolution of polyphenism and phenotypic plasticity and are therefore of considerable ecological interest. (5) Sex-specific expression. Evolutionary changes in differential gene expression among sexes have been documented (Schiff et al. 1992; Saccone et al. 1998; Christophides et al. 2000; Kopp, Duncan, and Carroll 2000). Microarray surveys suggest that populations can harbor variation in which genes are expressed in a sex-specific manner (Jin et al. 2001). (6) Gains and losses of particular phases of gene expression. 
In multicellular organisms, many genes are expressed in a succession of spatially and temporally distinct phases during the life cycle (for examples see Gerhart and Kirschner [1997]; Carroll, Grenier, and Weatherbee [2001]; and Davidson [2001]). A gene whose expression requires a particular transcription factor during a specific phase of expression may be “abandoned” by that regulator if the regulator is no longer expressed in the appropriate region. Examples include several independent losses of patterning roles for homeodomain transcription factors in arthropods (Dawes et al. 1994; Falciani et al. 1996; Grbic, Nagy, and Strand 1998; Mouchel-Vielh et al. 2002). Conversely, a new regulatory linkage may be established if a promoter acquires a binding site for a different transcription factor, a process known as recruitment or co-option (Duboule and Wilkins 1998; Wilkins 2002). Many likely cases have been identified (Lowe and Wray 1997; Saccone et al. 1998; Keys et al. 1999; Brunetti et al. 2001; reviewed in Wilkins 2002). Evolutionary gains and losses of particular phases of gene expression may be facilitated by the modular organization of promoters (see section 2.1).</p><p>4.5 Changes in Gene Expression Differ Widely in Their Effects on Organismal Phenotype</p><p>Promoter function has both a biochemical phenotype, the gene expression profile, and an organismal phenotype, involving features such as anatomy, physiology, life history, and behavior. These biochemical and organismal effects are evolutionarily dissociable to some extent, because some changes in gene expression appear to have no consequence for organismal phenotype. 
Such changes in gene expression are analogous to conservative amino acid replacements in a protein (table 5), many of which are likewise thought to have no impact on organismal phenotype (Kimura 1983; Gillespie 1991). Several cases are known where the timing or spatial extent of gene expression differs among species without any obvious phenotypic consequence (e.g., gld in Drosophila: Schiff et al. 1992; Ross, Fong, and Cavener 1994; Cy gene family in sea urchins: Fang and Brandhorst 1996; Kissinger and Raff 1998; msp130 in sea urchins: Wray and Bely 1994).</p><p>Although it may be difficult to demonstrate beyond any doubt that a particular difference in transcription is phenotypically silent, the opposite case is easier to establish. Differences in gene expression have been linked to diverse aspects of organismal phenotype, including: (1) anatomy (Burke et al. 1995; Averof and Patel 1997; Stern 1998; Wang et al. 1999; Lettice et al. 2002); (2) physiology (Abraham and Doane 1978; Matsuo and Yamazaki 1984; Dudareva et al. 1996; Sinha and Kellogg 1996; Stockhaus et al. 1997; Segal, Barnett, and Crawford 1999; Lerman et al. 2003); (3) behavior (Trefilov et al. 2000; Caspi et al. 2002; Enard et al. 2002b; Fang, Takahashi, and Wu 2002; Hariri et al. 2002; Saito et al. 2002); (4) disease susceptibility (Tournamille et al. 1995; Shin et al. 2000; Bamshad et al. 2002; Meyer et al. 2002); (5) polyphenism (Brakefield et al. 
1996; Abouheif and Wray 2002); and (6) life history (Allendorf, Knudsen, and Phelps 1982; Allendorf, Knudsen, and Leary 1983; Anisimov et al. 2001; Streelman and Kocher 2002).</p><p>4.6 Mutations in trans Can Alter Transcription in Several Ways</p><p>The genetic basis for an observed difference in the expression of a particular gene in some cases does not reside in cis, but rather within one of the loci encoding transcription factors that interact with it. Three classes of mutations can underlie these trans effects. (1) Mutations affecting the expression profile of an upstream transcription factor. Numerous experiments demonstrate that this trans effect is pervasive: manipulating the expression of a transcription factor typically alters the expression of its downstream targets (Gilbert 2000; Alberts et al. 2002). Although many evolutionary differences in the expression profiles of transcription factors are known (see section 4.4), few studies have investigated the effect of these changes on the transcription of downstream targets. Indirect evidence of an evolutionary role comes from phenotypic correlates of interspecific differences in transcription factor expression (e.g., Burke et al. 1995; Averof and Patel 1997; Stern 1998; Beldade, Brakefield, and Long 2002) and from expression assays that test regulatory sequences of one species in another (e.g., Manzanares et al. 2000). (2) Mutations affecting the DNA-binding domain of an upstream transcription factor. Amino acid substitutions in DNA binding domains of transcription factors can affect the expression of downstream genes (e.g., Conlon et al. 2001; D’Elia et al. 2002) and produce phenotypic consequences (Brickman et al. 2001). 
Such changes are apparently relatively rare, as the amino acid sequences of DNA binding domains are usually highly conserved (Duboule 1994; Latchman 1998). Nonetheless, variants are sometimes found within populations (e.g., Brickman et al. 2001). Interspecies gene-swapping experiments support the view that these domains are generally conserved: in a surprising number of cases, a vertebrate gene encoding a transcription factor can restore a somewhat wild-type phenotype to a fly that is homozygous for a null allele of the orthologous gene (e.g., Lutz et al. 1996; Gerard, Zakany, and Duboule 1997). Few gene swaps rescue phenotype perfectly and some fail almost completely, however, which may be due in part to changes in DNA binding specificity. (3) Mutations affecting the presence or sequence of a protein-protein interaction domain in an upstream transcription factor. Again, experiments provide evidence of this third class of trans effects on transcription (Hope and Struhl 1986; Dawson, Morris, and Latchman 1996). Functional changes in protein-protein interaction domains have evolved in Hox transcription factors within the Arthropoda (Galant and Carroll 2002; Ronshaugen, McGinnis, and McGinnis 2002) and in serum response transcription factors between arthropods and chordates (Avila et al. 2002), while a difference in a phosphorylation site has evolved in the FOXP2 transcription factor along the lineage separating humans from the other great apes (Enard et al. 2002b). Sequence comparisons suggest that amino acid substitutions in protein-protein interaction domains can evolve rapidly under positive selection (Sutton and Wilkinson 1997; Barrier, Robichaux, and Purugganan 2001). 
All three classes of trans effects mentioned above are likely to be highly pleiotropic because of the large number of downstream target genes that would be affected (see section 3.6.1).</p>
<p>Table 5. Functional Categories of Nucleotide Substitution</p>
<table>
<tr><th>Probable Impact</th><th>Coding Sequence<sup>a</sup></th><th>Promoter Sequence</th></tr>
<tr><td>Neutral</td><td>Synonymous nt substitution</td><td>Nonbinding site substitution</td></tr>
<tr><td>Low</td><td>Conservative AA replacement</td><td>Within consensus substitution</td></tr>
<tr><td>Medium to high</td><td>Nonconservative AA replacement</td><td>Nonconsensus substitution</td></tr>
<tr><td>Loss-of-function</td><td>Frameshift or stop codon</td><td>Deletion of basal promoter or key activator binding site</td></tr>
</table>
<p><sup>a</sup> AA = amino acid.</p>
<p>Downloaded from https://academic.oup.com/mbe/article/20/9/1377/976747 by guest on 07 August 2024</p>
<p>4.7 Many Modes of Selection Operate on Promoter Sequences</p>
<p>The classic modes of natural selection that operate on coding sequences and morphology also operate on promoter sequences (also see section 2.4). (1) Negative (purifying) selection. Many deleterious promoter alleles have been identified in humans, involving a wide range of genes and phenotypic consequences (summarized in Cooper 1999). Cases of long-term conservation of binding sites (see section 4.1) suggest persistent negative selection. (2) Positive selection. Some promoter alleles appear to be under directional selection (FY: Hamblin and DiRienzo 2000; P450: Daborn et al. 2002; hsp70: Lerman et al. 2003). (3) Overdominant selection. Likely cases include some histocompatibility loci in humans and mice (Guardiola et al. 1996; Cowell et al. 1998). 
Reasons for overdominant selection on coding sequences of these loci are reasonably well understood, and transcription profiles should be under selection for variation in the cell type in which they are expressed (Guardiola et al. 1996). Other possible cases include some β-thalassemias in humans (Kazazian 1990), anthocyanin pigment synthesis in maize (r locus: Li et al. 2001), and dispersal behavior in rhesus macaques (serotonin transporter: Trefilov et al. 2000). (4) Balancing selection. Environmental heterogeneity within the range of a single species can result in local adaptation and balancing selection. Cases are known from a teleost (LDH: Crawford, Segal, and Barnett 1999; Segal, Barnett, and Crawford 1999) and from humans (CCR5: Bamshad et al. 2002). (5) Stabilizing selection. When binding sites within a promoter differ but the resulting expression profile is unchanged, stabilizing selection may be operating. This situation appears to be relatively common, with examples now known from diverse metazoans and genes (ADH: Wu and Brennan 1993; Esterase-5B and -6: Odgers, Healy, and Oakeshott 1995; Tamarina, Ludwig, and Richmond 1997; unc-116: Maduro and Pilgrim 1996; even-skipped: Ludwig, Patel, and Kreitman 1998; Patel et al. 2000; yolk protein: Piano et al. 1999; runt: Wolff et al. 1999; brachyury: Takahashi et al. 1999; achaete-scute complex: Skaer and Simpson 2000; Hoxb2: Scemama et al. 2002; mating-type loci in budding yeast: Sjostrand, Kegel, and Astrom 2002; Endo16: Romano and Wray 2003). (6) Compensatory selection. An interesting case in humans involves a hypomorphic allele within the coding sequence of CFTR that causes cystic fibrosis. 
Some</p><p>haplotypes contain a second mutation within the promoter</p><p>that adds a third Sp1 binding site, elevating transcription</p><p>and resulting in an improved prognosis (Romey et al.</p><p>1999, 2000). The third Sp1 site never occurs in haplotypes</p><p>that produce wild-type protein, suggesting that it may be</p><p>under positive selection as a result of its compensatory</p><p>effect.</p><p>5 Challenges in Studying Promoter Evolution</p><p>The structure and function of promoter sequences are</p><p>profoundly different from those of coding sequences (table</p><p>6). These differences impose nontrivial challenges for</p><p>studying the evolution of transcriptional regulation.</p><p>5.1 Coding and Regulatory Sequences Differ in Structure</p><p>and Function</p><p>Coding sequences have a regular, direct, precise, and</p><p>easily interpreted relationship with their proximate (bio-</p><p>chemical) phenotype, a specific sequence of amino acids.</p><p>In contrast, promoters have an idiosyncratic, indirect,</p><p>nonlinear, and context-dependent relationship with their</p><p>proximate phenotype, a particular transcription profile.</p><p>Furthermore, the transcription profile generated by a pro-</p><p>moter depends on other loci that encode transcription</p><p>factors that bind to it, and on the loci encoding the</p><p>transcription factors that regulate these immediate up-</p><p>stream regulators, and so forth. This regress transcends</p><p>generations, in that maternally loaded transcription factors</p><p>or their mRNAs are required to activate early zygotic gene</p><p>expression. Environmental influences on gene expression</p><p>add a further layer of complexity. Although the amino acid</p><p>sequence of a protein rarely changes during the course of</p><p>development or in response to environmental conditions,</p><p>the transcription profile of most genes is modulated during</p><p>the life cycle and in response to changing external</p><p>conditions. 
(Even when differential splicing produces</p><p>distinct protein isoforms from the same locus under</p><p>different circumstances, the relationship between DNA</p><p>and protein sequence remains direct.)</p><p>It is important to recognize that sequence data alone</p><p>cannot reveal the organization of binding sites within</p><p>a promoter; nor can they show what proteins bind to them,</p><p>or how they function, or what transcription profile they</p><p>generate. This is partly a matter of missing information: for</p><p>instance, the full matrix of binding sequences is not yet</p><p>known for most transcription factors, even in well-studied</p><p>species. But it is mostly an inescapable consequence of the</p><p>way transcription is regulated: many potential binding sites</p><p>have no influence on transcription in vivo, sequences</p><p>essential for transcription always reside both cis and trans,</p><p>and transcription can be strongly influenced by genetic</p><p>background, physiological status, and environmental</p><p>conditions.</p><p>5.2 Identifying Functional Binding Sites Requires</p><p>Biochemical Data and in Vivo Functional Assays</p><p>Because the sequences bound by transcription factors</p><p>are short and imprecise (see section 3.3), literally hundreds</p><p>of potential binding sites lie near every locus. Only</p><p>a fraction of these binding sites actually influence</p><p>transcription (Latchman 1998; Weinzierl 1999; Li and</p><p>Johnston 2001; Lee et al. 2002). Potential binding sites may</p><p>not function for a variety of reasons (see section 3.5.3; fig.</p><p>4). Which sites actually influence transcription, and are</p><p>therefore possible targets of selection, can</p><p>only be</p><p>determined experimentally. Biochemical characterizations</p><p>can identify binding sites precisely and are the only way to</p><p>determine whether consensus sequences differ among</p><p>species. 
The most common methods are footprinting and mobility shift assays (Carey and Smale 2000). Because these assays are carried out in vitro, they cannot reveal the influence of chromatin modulation on protein binding or transcription. Assays of in vivo binding sites (Walter and Biggin 1996; Ren et al. 2000; Iyer et al. 2001; Lee et al. 2002) provide a more accurate representation but are technically more demanding and undercount real binding sites. The only definitive means of identifying a binding site with a role in regulating transcription is to modify its sequence and assay transcription in vivo, typically by transient or stable transformation with a reporter gene (see section 5.5). All of the methods mentioned above require considerable effort when used to test a potential binding site at multiple phases of the life cycle and under a variety of environmental conditions. In practice, most promoters have only been searched for potential binding sites at a restricted phase of the life cycle and under uniform culture conditions. For this reason, experimentally verified binding sites are nearly always an underestimate, and the physical extent of a promoter is rarely well defined.</p><p>The resulting difficulties for studying promoter evolution are substantial. (1) Information about promoter structure is almost always incomplete. Few promoters have been subjected to thorough searches for binding sites, and some binding sites probably remain unidentified even in carefully studied cases. 
Information about the functional</p><p>consequences of binding site differences among species is</p><p>limited to just a few cases (e.g., Singh and Berger 1998;</p><p>Wolff et al. 1999; Shaw et al. 2002). (2) The information</p><p>that does exist is almost always biased. Because of the</p><p>way promoter function is typically studied, some kinds of</p><p>binding sites are naturally less likely to be discovered.</p><p>These include binding sites that mediate responses to</p><p>physiological status and environmental conditions (be-</p><p>cause most assays are carried out under uniform</p><p>conditions), binding sites that act at restricted times during</p><p>the life cycle (because typically only part of the life cycle</p><p>is assayed), and binding sites of weak effect (because of</p><p>assay insensitivity). In addition, most studies measure</p><p>either quantitative or spatial aspects of transcription and</p><p>some ignore temporal changes; as a result, the binding</p><p>sites that are identified are often biased with respect to their</p><p>effects on time, space, and level of transcription.</p><p>Because empirical validation of binding sites is</p><p>laborious, attempts have been made to increase the</p><p>reliability of informatic approaches to binding site</p><p>identification. We discuss here a few of the many</p><p>approaches which have been developed (for additional</p><p>information, see Hardison [2000]; Stormo [2000]; Ohler</p><p>and Niemann [2001]; and Markstein and Levine [2002]).</p><p>Most informatic approaches apply either to a specific locus</p><p>or to a complete genome. In the former category are</p><p>programs that use databases of known binding site</p><p>matrices to scan a sequence for potential binding sites</p><p>(e.g., TRANSFAC: Wingender et al. 2001; EPD: Praz et al.</p><p>2002; PlantCARE: Lescot et al. 2002; SCPD: Zhu and</p><p>Zhang 1999). 
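</p><p>As a rough illustration of how matrix-based scanners of this kind work, the following Python sketch converts a count matrix into log-odds scores and slides it along one strand of a sequence. It is a simplified assumption about the general technique, not the algorithm of any of the programs cited above; the 4-bp count matrix, the uniform background, the pseudocount, the threshold, and the function names are all hypothetical.</p>

```python
import math

# Hypothetical counts of each base at the four positions of a toy motif
# (ten aligned example sites per position).
COUNTS = [
    {"A": 1, "C": 0, "G": 0, "T": 9},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 9, "C": 0, "G": 1, "T": 0},
    {"A": 0, "C": 9, "G": 0, "T": 1},
]


def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Turn per-position base counts into log2 odds against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append(
            {
                base: math.log2(((column[base] + pseudocount) / total) / background)
                for base in "ACGT"
            }
        )
    return matrix


def scan(sequence, matrix, threshold):
    """Return (offset, window, score) for every window scoring at or above threshold."""
    width = len(matrix)
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits


hits = scan("AATGACGT", log_odds_matrix(COUNTS), threshold=5.0)
for offset, window, score in hits:
    print(offset, window, score)
```

<p>With this toy matrix the only window in AATGACGT that reaches the threshold is TGAC at offset 2. Only the given strand is scanned here.</p><p>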
However, many of the potential binding sites</p><p>these programs identify have no biological function and</p><p>are simply spurious matches to a binding site (see previous</p><p>paragraphs and section 3.3.4). A complementary approach</p><p>involves comparisons with the homologous chromosomal</p><p>region from other species, a method known as ‘‘phyloge-</p><p>netic footprinting.’’ The rationale is that nucleotides within</p><p>binding sites are more likely to be conserved by natural</p><p>selection. This method can successfully identify previously</p><p>unknown binding sites (Loots et al. 2000; Wasserman et al.</p><p>2000; Yuh et al. 2002). The effectiveness of this method is</p><p>limited, however, because nucleotides can be conserved by</p><p>chance, because real binding sites can turn over even when</p><p>the transcriptional output is maintained, and because some</p><p>aspects of transcription are species-specific (e.g., Ludwig,</p><p>Patel, and Kreitman 1998; Dermitzakis and Clark 2002).</p><p>The first problem leads to false positives, whereas the</p><p>second and third generate false negatives. When a complete</p><p>genome sequence is available, several additional methods</p><p>can be applied to identify unknown binding sites. 
These algorithms rely on large data sets to identify overrepresented sequence motifs (e.g., Sinha and Tompa 2002), clusters of binding sites (e.g., Berman et al. 2002; Rebeiz, Reeves, and Posakony 2002), or correlations with expression data (e.g., Birnbaum, Benfey, and Shasha 2001). For all of these methods, both false positives and false negatives remain a significant issue. Although methods for informatic detection of binding sites are becoming more sophisticated, for now the results are best viewed as a starting point for empirical validation rather than as a definitive identification of transcription factor binding sites.</p>
<p>Table 6. Structural and Functional Differences in Coding and Promoter Sequences for a Protein-Coding Locus</p>
<table>
<tr><th></th><th>Coding</th><th>Promoter</th></tr>
<tr><td>Physical boundaries</td><td>Defined by sequence</td><td>Not defined by sequence</td></tr>
<tr><td>Start</td><td>ATG (sometimes multiple)</td><td>None</td></tr>
<tr><td>End</td><td>TAA, TAG, or TGA</td><td>None</td></tr>
<tr><td>Internal<sup>a</sup></td><td>[C/A]AG|GU[A/G]AGU; (U/C)nNAG|G/A</td><td>None</td></tr>
<tr><td>Physical organization</td><td>Discontinuous, colinear</td><td>Discontinuous, nonlinear</td></tr>
<tr><td>Physical unit</td><td>Exon</td><td>Module<sup>b</sup></td></tr>
<tr><td>Typical unit size</td><td>~20–2,000 bp</td><td>~200–2,000 bp</td></tr>
<tr><td>Number of units</td><td>1–10 (rarely more)</td><td>1–10 (rarely more?)</td></tr>
<tr><td>Organization and function colinear</td><td>Yes</td><td>No</td></tr>
<tr><td>Relative order of units</td><td>Consistent</td><td>Not consistent</td></tr>
<tr><td>Modules correspond to functions</td><td>Sometimes</td><td>Often</td></tr>
<tr><td>Functional organization</td><td>Direct, local</td><td>Indirect, distributed</td></tr>
<tr><td>Direct functional output</td><td>Protein sequence</td><td>Transcription profile</td></tr>
<tr><td>Unit of information</td><td>Codon</td><td>Binding site</td></tr>
<tr><td>Number of units</td><td>~150–1,000 (rarely more)</td><td>~6–60 (more?)</td></tr>
<tr><td>Information content</td><td>0.3–2.0 kb</td><td>0.08–0.8 kb<sup>c</sup></td></tr>
<tr><td>Spacing between units</td><td>Doesn’t matter</td><td>Sometimes matters</td></tr>
<tr><td>Mapping</td><td>Precise (1 AA)<sup>d</sup></td><td>Imprecise (>1 TxF)<sup>d</sup></td></tr>
<tr><td>Degeneracy</td><td>Precise (same AA)</td><td>Imprecise (different TxF)</td></tr>
<tr><td>Consequence</td><td>Qualitative (1 codon: 1 AA)</td><td>Quantitative (level of Tx)<sup>d</sup></td></tr>
<tr><td>Order of units</td><td>Usually matters</td><td>Sometimes matters</td></tr>
<tr><td>Genetic basis</td><td>cis only</td><td>cis and trans required</td></tr>
</table>
<p><sup>a</sup> Type I introns; other splice junction sequences exist.</p>
<p><sup>b</sup> Cluster of transcription factor binding sites (also known as enhancer, UAS, etc.).</p>
<p><sup>c</sup> cis-regulatory sequences only; additional information necessary for transcription is encoded in the sequences and expression profiles of trans regulators and, in some cases, in the environment.</p>
<p><sup>d</sup> AA = amino acid; Tx = transcription; TxF = transcription factor.</p>
<p>5.3 Identifying the Complement of Transcription Factors that Interact with a Binding Site Requires Biochemical Data</p>
<p>A binding site may be occupied by different transcription factors (or by none) at different times or places during development (see section 3.5.3; see also table 4), with distinct functional consequences (Fry and Farnham 1999; Lemon and Tjian 2000; Courey 2001). The extent of “binding site sharing” by different transcription factors remains poorly understood, because nearly all functional studies of promoters in multicellular organisms have examined a single phase of the life cycle (typically embryos or differentiated cells in culture) under uniform culture conditions.</p><p>Overlapping binding site specificities have important implications for evolutionary studies: (1) A transcription factor might not influence transcription, even if a consensus binding site for it exists and is known to bind protein. The presence of a binding site is necessary but not sufficient for transcription factor binding. 
Demonstrating an interaction</p><p>between a specific regulator and a binding site requires</p><p>some form of biochemical characterization, the most</p><p>common of which are ‘‘supershift’’ assays and in vivo</p><p>footprinting (Carey and Smale 2000). (2) A binding site</p><p>might be occupied by different transcription factors under</p><p>different circumstances. Indeed, the protein bound most of</p><p>the time may not be the one whose consensus recognition</p><p>motif is the closest match. Recognizing cases of varied</p><p>binding site occupancy requires testing nuclear extracts</p><p>across developmental stages, among cell types, and under</p><p>diverse environmental conditions using supershift assays</p><p>or in vivo footprinting. (3) The protein that occupies</p><p>a binding site can change evolutionarily. Likely cases of</p><p>‘‘transcription factor switching’’ have been identified</p><p>within humans (Rockman and Wray 2002). In addition,</p><p>an interaction that has been biochemically validated in one</p><p>species may not occur in another, even if the sequence of</p><p>the binding site is perfectly conserved. Demonstrating</p><p>a conserved or altered protein-DNA interaction requires</p><p>comparative biochemical data.</p><p>5.4 Comparisons of Promoter Sequence Are Not Always</p><p>Straightforward</p><p>Once functional binding sites have been mapped, the</p><p>next step is identifying homologous binding sites among</p><p>species or alleles. Promoter sequences can usually be</p><p>aligned rather easily within a species, although binding</p><p>sites that fall within repeats can be problematic. 
In the</p><p>most straightforward interspecific comparisons, potential</p><p>binding sites that occupy similar positions, spacing, and</p><p>orientation relative to the start site of transcription and</p><p>relative to each other are likely to be homologous.</p><p>Complications can arise for a variety of reasons: binding</p><p>site spacing is often functionally unconstrained (see</p><p>section 3.3.5), transposition can introduce similar binding</p><p>sites (see section 4.2), and random point mutations can</p><p>generate new binding sites at an appreciable frequency</p><p>(Stone and Wray 2001). (We use the term homologous in</p><p>its usual, phylogenetic, sense to denote the hypothesis that</p><p>a binding site is present in two living species because it</p><p>was present in their latest common ancestor and has</p><p>persisted since. Sequence similarity, in contrast, is simply</p><p>an observation, and can be due to either homology or</p><p>homoplasy.)</p><p>Once homologous binding sites have been identified,</p><p>routine methods of comparative analysis can be applied to</p><p>polarize character state transformations, identify reversals</p><p>and parallel transformations, and reconstruct ancestral</p><p>states. Most published comparisons of promoter structure</p><p>involve just two species, with the emphasis typically on</p><p>identifying conserved binding sites. By surveying more</p><p>taxa and incorporating functional data, it becomes possible</p><p>to identify origins, losses, and turnover of binding sites. 
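</p><p>The ancestral-state step described above can be sketched with Fitch parsimony, one routine comparative method among several. In this minimal Python sketch the tree topology, the taxon names A to D, and the presence/absence scores for a binding site are hypothetical, and the function handles only strictly bifurcating trees written as nested tuples.</p>

```python
def fitch(tree, states):
    """Fitch down-pass: return (possible states at this node, minimum number of changes).

    tree   -- a leaf name (str) or a 2-tuple of subtrees
    states -- dict mapping each leaf name to its observed character state
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # The two descendants agree on at least one state: no extra change needed.
        return shared, left_changes + right_changes
    # Otherwise keep both possibilities and count one change.
    return left_set | right_set, left_changes + right_changes + 1


# Hypothetical example: a binding site scored as present (1) or absent (0).
tree = ((("A", "B"), "C"), "D")
site_present = {"A": 1, "B": 1, "C": 0, "D": 0}

root_states, changes = fitch(tree, site_present)
print(root_states, changes)
```

<p>For this toy data set the root is reconstructed as lacking the site, and a single change (a gain on the branch leading to A and B) is sufficient to explain the observations.</p><p>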
As</p><p>with all comparative analyses, dense phylogenetic sam-</p><p>pling provides a more robust understanding of evolution-</p><p>ary transformations within promoters, particularly in cases</p><p>of rapid sequence divergence.</p><p>5.5 Promoter Function Can Only Be Determined</p><p>Experimentally</p><p>The only way to determine the expression profile</p><p>produced by a promoter haplotype is to assay it in vivo, in</p><p>its normal chromosomal and cell biological contexts. This</p><p>is most easily accomplished by examining the spatial and</p><p>temporal distribution of transcripts using in situ hybrid-</p><p>ization, RNA gel blots, or quantitative PCR. Because even</p><p>small differences in promoter sequence can alter transcrip-</p><p>tion (see section 4.2), interpreting the functional con-</p><p>sequences of such differences among alleles or between</p><p>species also requires assaying transcription.</p><p>Similarly, the only way to understand the contribution</p><p>of specific sequence differences within a promoter is to</p><p>carry out comparative functional tests. The most common</p><p>kind of experiment involves coupling a test regulatory</p><p>region to a reporter gene whose product is easily detected,</p><p>and then placing this construct in embryos or cells where it</p><p>is exposed to the array of transcription factors encountered</p><p>by the endogenous promoter (Carey and Smale 2000).</p><p>Further experiments, such as testing the consequences of</p><p>nucleotide substitutions within a specific binding site,</p><p>deleting a binding site, altering spacing or orientation</p><p>between binding sites, or testing restricted portions of the</p><p>promoter can be immensely informative. Although such</p><p>experiments are laborious, they provide almost the only</p><p>reliable information about binding site function. Fortu-</p><p>nately, expression assays are feasible in a growing number</p><p>of organisms. 
For comparative analyses, it is important to carry out reciprocal functional tests, because transcription is a product of both cis and trans sequences. This, too, is now possible in some taxa (Tümpel et al. 2002; Romano and Wray 2003).</p><p>Comparative experimental tests, although unusual in the literature, are necessary for analyzing the evolution of promoter function. Issues of particular interest include the following. (1) Identifying changes in binding site function. A binding site whose sequence is conserved between two species may nonetheless function differently in them, because the transcription factors and cofactors that interact with it are expressed differently or because an adjacent binding site for a cofactor has changed. (2) Determining the function of multiply-represented binding sites. Multiple binding sites for the same transcription factor within a single promoter (fig. 3) may be functionally redundant, additive, or synergistic (e.g., Small, Blair, and Levine 1992; Yuh and Davidson 1998, 2001), with distinct consequences for selection (Ludwig 2002). (3) Understanding the functional significance of binding site organization. The position, spacing, and orientation of individual binding sites in some cases matter a great deal and in other cases not at all (see section 3.5.2), again with distinct consequences for selection. (4) Determining when, where, and under what conditions a binding site functions. Although some binding sites may function continuously and ubiquitously, most probably do so only during part of the life cycle, in certain cell types, or in response to particular environmental conditions. 
Binding site function may also be context dependent, changing under different circumstances (see section 3.5.3). (5) Identifying the genetic basis for a known difference in transcription. Interspecific differences in transcription profiles might be due to changes in cis or trans (see sections 4.2 and 4.6). The functional consequences of changes in cis can be identified by means of in vivo expression assays (examples reviewed in Paigen [1989] and Cavener [1992]).</p><p>The difficulty of carrying out comparative expression assays imposes severe practical constraints on analyses of promoter evolution. (1) In general, it will be more difficult to obtain comparative information on the proximate function of promoter sequences than of coding sequences. Characterizing promoter function involves techniques that are labor intensive and unfamiliar to most molecular evolutionists. Yet without this information, it is difficult to interpret comparative sequence data meaningfully. (2) At least for the near term, comparative information on binding site function will remain limited. Few promoters have been analyzed biochemically or functionally in more than one species, and even in these cases analyses have been limited to a fraction of the complete cis-regulatory region. (3) Predicting the proximate functional consequences of mutations in promoter sequences will be more difficult than in coding sequences. 
Because promoters lack</p><p>a general organizing ‘‘code,’’ because the function of</p><p>a binding site can be strongly context-dependent (see</p><p>section 3.5.3), and because promoter function depends on</p><p>the sequences and expression of transcription factors</p><p>encoded elsewhere, the biochemical phenotypic conse-</p><p>quences of sequence differences in a promoter region are</p><p>very difficult to interpret without functional tests. The</p><p>relative magnitude of likely functional consequences of</p><p>a mutation within a promoter can be organized into a very</p><p>rough rank order, as it can be more precisely within exons</p><p>(table 5). Overall, however, the considerably less regular</p><p>structure-function relationship within promoters will make</p><p>it much more difficult to discern general patterns of</p><p>sequence evolution.</p><p>5.6 Standard Classed Tests of Molecular Evolution Must</p><p>Be Modified for Analyzing Promoters</p><p>Although tests of selection on promoters are not</p><p>fundamentally different from tests on coding regions, they</p><p>must be applied with caveats. The major problems arise</p><p>when applying tests that use classes of nucleotide</p><p>substitutions to promoter data (e.g., Ka/Ks or McDonald-</p><p>Kreitman tests; McDonald and Kreitman 1991). These</p><p>tests classify coding mutations as synonymous or non-</p><p>synonymous, and they test for selection under the</p><p>assumption that synonymous sites evolve neutrally. To</p><p>apply these tests to promoter sequences, most authors</p><p>classify promoter mutations as occurring within binding</p><p>sites or within non-binding-site nucleotides and assume</p><p>that nonbinding sites are evolving neutrally. However, the</p><p>functional consequences of mutations in promoters cannot</p><p>be classified without additional functional data (see section</p><p>5.5; Jenkins, Ortori, and Brookfield 1995). 
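</p><p>To make the logic of a classed test concrete, the sketch below arranges counts of fixed differences and polymorphisms in two site classes into a 2 × 2 table and applies a two-sided Fisher's exact test, in the spirit of a McDonald-Kreitman comparison. The counts are invented, the function names are hypothetical, and treating non-binding-site nucleotides as the neutral class is exactly the assumption questioned in the surrounding text.</p>

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def table_probability(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = table_probability(a)
    smallest = max(0, col1 - row2)
    largest = min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(
        p
        for p in (table_probability(x) for x in range(smallest, largest + 1))
        if p <= observed * (1 + 1e-9)
    )


def classed_test(fixed_binding, poly_binding, fixed_other, poly_other):
    """Compare fixed:polymorphic ratios between binding-site and other nucleotides."""
    return fisher_exact_two_sided(fixed_binding, poly_binding, fixed_other, poly_other)


# Hypothetical counts: 8 fixed and 2 polymorphic differences inside binding sites,
# 1 fixed and 5 polymorphic differences at non-binding-site nucleotides.
p_value = classed_test(8, 2, 1, 5)
print(f"p = {p_value:.4f}")
```

<p>For these invented counts the test gives p of about 0.035. The arithmetic is simple; the difficulty lies in assigning each sequence difference to the correct class.</p><p>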
More specif-</p><p>ically, practical difficulties in identifying binding sites (see</p><p>section 5.2) mean that most evolutionary analyses of</p><p>promoters will only be able to rely on functional</p><p>information from a single species. Binding sites absent</p><p>in the species for which functional data are available but</p><p>present in all other species will be missed, while those sites</p><p>functional in the reference species but not in all other</p><p>species may mistakenly be considered present in all. Both</p><p>types of error will result in some sequence differences</p><p>being classed incorrectly and will degrade the signal-to-</p><p>noise ratio in tests for selection. It follows that sequence</p><p>comparisons among the promoters of closely related</p><p>species, or classed tests that only use data from one</p><p>species (e.g., Hahn, Rausher, and Cunningham 2002), will</p><p>generally be the most informative and accurate. Even</p><p>fewer data from comparative studies are usually available</p><p>about the functional consequences of sequence differences</p><p>within a binding site. Only rarely will it be possible to</p><p>reliably detect the effect of nucleotide differences on</p><p>changes in binding specificity between species (see section</p><p>5.5).</p><p>A second problem with tests that rely on classes of</p><p>sites relates to the mechanism by which binding sites arise</p><p>and are presumably selected for. Although an excess of</p><p>nonsynonymous substitutions relative to synonymous</p><p>substitutions is good evidence for positive selection, it is</p><p>difficult to imagine a situation in which an excess of</p><p>binding-site substitutions relative to nonbinding-site sub-</p><p>stitutions can be interpreted in the same way. This follows</p><p>from three features of promoters. 
First, binding sites are sometimes not functionally restricted to a specific position. Individual binding sites may therefore turn over by changing position without positive selection (Ludwig 2002). Second, sequences which have no binding affinity for any transcription factor often need only a single base-pair change in order to become a functional binding site (Stone and Wray 2001). A single point mutation can therefore establish a new functional site consisting of several nucleotides. Third, most nucleotide substitutions within a binding site modulate or eliminate its function, whereas relatively few mutations will change it into a binding site for a different transcription factor. Rarely will an excess of substitutions within a binding site be a signal of positive selection, because binding sites often simply cease to function after multiple substitutions. None of these three features precludes selection for changes in binding sites, but together they may significantly reduce the ability of classed tests to detect this selection in practice.</p><p>All classed tests of selection suffer from these problems, but non-classed tests of neutrality (e.g., the Hudson-Kreitman-Aguade test [HKA], Tajima’s D, Fu and Li’s D: Hudson, Kreitman, and Aguade 1987; Tajima 1989; Fu and Li 1993) can be interpreted in the standard way. Often a combination of these tests, as well as studies of geographic structure of allele frequencies, may be necessary to detect the action of selection in promoters (e.g., Hamblin and DiRienzo 2000; Bamshad et al. 2002; Fullerton et al. 
2002).</p><p>6 Hypotheses About the Evolution of Transcriptional</p><p>Regulation</p><p>No general framework exists for understanding,</p><p>interpreting, and predicting how transcription evolves. In</p><p>this section, we present an initial attempt at providing such</p><p>a framework, in the form of testable hypotheses derived</p><p>from three sources: models of molecular evolution,</p><p>mechanisms of promoter function, and empirical evidence.</p><p>Our hope is that these hypotheses will encourage in-</p><p>vestigators to dig a little deeper into their data to address</p><p>a broad range of questions about promoter evolution.</p><p>Thus, we emphasize hypotheses that can be tested with</p><p>available techniques. The first six categories of predictions</p><p>are based on the neutral model of sequence evolution</p><p>(Kimura 1983) and are organized as shown in figure 7; the</p><p>last four categories address promoter ‘‘design’’ principles</p><p>and macroevolution.</p><p>6.1 Promoter Sequences Have Characteristic</p><p>Evolutionary Dynamics</p><p>Because promoters are organized and function</p><p>differently from other regions of the genome, they are</p><p>subject to distinct functional constraints. Predictable</p><p>patterns of sequence evolution should therefore distinguish</p><p>promoters from sequences that lack a role in transcriptional</p><p>regulation. (1) Overall substitution and indel frequencies in</p><p>promoters should be higher than in coding sequences and</p><p>lower than in introns and in nonregulatory intergenic</p><p>regions. Because most nucleotides in promoters do not</p><p>affect transcription, most substitutions and many indels</p><p>should have no functional consequence and should</p><p>therefore evolve without constraint. These predictions are</p><p>generally supported empirically (Jordan and McDonald</p><p>1998; Jareborg, Birney, and Durbin 1999; Waterston et al.</p><p>2002). 
Exceptions might include the promoters of genes that encode proteins tolerating many amino acid substitutions, that are under balancing selection, or whose introns contain regulatory sequences (e.g., IL-4 and IL-13 in mammals: Loots et al. 2000). (2) Indel size spectrum in promoters should be more continuous than that in coding sequences but similar to that in introns, UTRs, and nonregulatory intergenic regions. Three factors may contribute to a distinctive evolutionary dynamic of length variation within promoter regions: the lack of a reading frame, the low density of functionally important nucleotides, and the ability of many binding sites to operate in a position-independent manner. Indels should be more common, the frequency spectrum of indel size should not be biased toward multiples of three (as it is in coding sequences), and repeat variation and large indels should be much more common. These patterns are evident in some cases (e.g., hairy: Kim 2001). (3) The order of binding sites and modules within promoters should be less conserved than the order of exons within transcription units. Codons have an obligate colinearity with the amino acid sequence of their protein product, whereas binding sites and modules within promoters can often function to some extent independently of position and order (see section 3.3.5). Gross organizational changes within promoters</p><p>a coordinated phenotypic response (Raff and Kaufman 1983; Gerhart and Kirschner 1997; Carroll, Grenier, and Weatherbee 2001; Wilkins 2002). Mutations in the expression of transcriptional regulators are therefore not simply more pleiotropic; they are more likely to produce functionally integrated phenotypic consequences.
(3) The “Hox paradox.” The discovery that many developmental regulatory genes and their expression profiles are phylogenetically widespread within the plant and animal kingdoms (Gerhart and Kirschner 1997; Carroll et al. 2001) raises an obvious problem: How do orthologous regulatory proteins pattern anatomically disparate organisms? At least part of the answer seems to lie in evolutionary reorganization of gene networks, such that many interactions between these proteins and the collection of genes that they regulate have changed since flies and mice last shared a common ancestor (Wray and Lowe 2000; Davidson 2001; Wilkins 2002). (4) Evolvability. Promoters may be more “evolvable” than coding regions (Gerhart and Kirschner 1997; Stern 2000; Carroll, Grenier, and Weatherbee 2001; Wilkins 2002). Many promoters are organized into functional modules, each of which produces a discrete aspect of the overall expression profile (Arnone and Davidson 1997), confining pleiotropy and allowing selection to modify discrete aspects of the overall expression profile independently. In addition, many promoter alleles are likely to be codominant and thus immediately visible to selection, increasing the efficiency with which beneficial alleles are fixed and deleterious ones are eliminated.</p><p>2.2 Mutations in Transcriptional Regulation Influence Phenotype</p><p>Transcriptional regulation is an integral component of the way genotype is converted into phenotype.
Many mutants that have emerged from genetic screens for developmentally important genes involve defects in transcriptional regulation (Wilkins 1993, 2002; Gilbert 2000). The four-winged fly that results from certain mutations in Ubx in Drosophila is perhaps the most famous: some mutations located in regulatory sequences affect the transcription profile, and others located in exons alter the function of the protein in regulating the transcription of other genes (Bender et al. 1983; Simon et al. 1990). The phenotypic consequences of some Ubx promoter mutations are so distinct that they were originally thought to represent separate genes (Lewis 1978).</p><p>Numerous studies have documented correlations between gene expression and anatomy. (1) Induced mutations. The phenotypes of some induced mutations mimic natural differences between species. Examples include homeotic mutations in Drosophila melanogaster that mimic segment and appendage number and identity characteristic of other insects (Raff and Kaufman 1983; Carroll 1995), mutations in Arabidopsis thaliana and Antirrhinum majus that mimic the floral anatomy of other angiosperms (Lawton-Rauh et al. 2000), and mutations in Caenorhabditis elegans that mimic the tail anatomy of other nematodes (Fitch 1997). Because most of these induced mutations do not replicate the genetic basis for natural phenotypic differences (Carroll 1995; Budd 1999), however, convincing evidence of the evolutionary significance of changes in transcriptional regulation must come from natural cases. (2) Comparisons of expression. In many cases, a gene required for the development of a trait in one species shows a difference in expression in other species that correlates with a difference in that trait (e.g., Burke et al.
1995; Brakefield et al. 1996; Dudareva et al. 1996; Sinha and Kellogg 1996; Averof and Patel 1997; Stockhaus et al. 1997; Abzhanov and Kaufman 2000; Kopp et al. 2000; Yamamoto and Jeffery 2000; Beldade, Brakefield, and Long 2002; Bharathan et al. 2002; Hariri et al. 2002). A causal relationship is plausible but not proven in these cases, because comparisons of gene expression cannot by themselves demonstrate that a change in transcriptional regulation is the genetic basis for a phenotypic difference. (3) Quantitative genetics. Anatomical changes that accompanied the domestication of maize from teosinte are due in part to changes within the inferred promoter region of a single gene encoding the transcription factor teosinte-branched (Wang et al. 1999). Although this is a case of artificial selection, it involved natural (rather than induced) genetic variation. Some differences in bristle patterns among Drosophila species are attributable to changes in promoter sequences (Stern 1998; Skaer and Simpson 2000; Sucena and Stern 2000). In other cases, genetic variation in gene expression levels shows strong associations with specific organismal phenotypes (Gerber, Fabre, and Planchon 2000; Karp et al. 2000; Beldade, Brakefield, and Long 2002). Unfortunately, because of the confounding effects of linkage disequilibrium, quantitative genetics generally lacks the resolution to identify precise sequence differences that are responsible for particular phenotypes. When quantitative genetics is combined with experimental tests or case associations, however, specific sequence variants can be identified (Cooper 1999).
Using this approach, more than 160 segregating promoter variants that influence transcription have been identified in humans (Cooper 1999; Rockman and Wray 2002), and several have been identified in Drosophila melanogaster (e.g., Robin et al. 2002).</p><p>2.3 Natural Populations Harbor Considerable Functional Variation in Gene Expression</p><p>Many examples of variation in gene expression are known from natural populations. (1) Spatial extent of expression. In rainbow trout, an allele of PGM1 conferring expression in the liver is associated with faster prehatching growth (Allendorf, Knudsen, and Phelps 1982; Allendorf, Knudsen, and Leary 1983). The spatial expression of amylase in the midgut varies within both Drosophila melanogaster and D. pseudoobscura; the genetic basis in both cases is trans, and the trait responds to artificial selection in D. pseudoobscura (Abraham and Doane 1978; Powell 1979; Powell and Lichtenfels 1979). The spatial extent of expression of the transcription factor Distal-less within the wing of the butterfly Bicyclus anynana varies in correlation with wing color pattern, and it also responds to artificial selection (Beldade, Brakefield, and Long 2002). (2) Level of expression. Intraspecific differences in expression have been noted for GPDH in both larvae and adults of D. melanogaster (Laurie-Ahlberg and Bewley 1983); β-glucuronidase in Mus domesticus (Pfister et al. 1982; Bush and Paigen 1992); Cyp6g1, a cytochrome P450 family gene, in D. melanogaster (Daborn et al. 2002); and prolactin in the teleost Oreochromis niloticus (Streelman and Kocher 2002). In all four cases, most or all of the polymorphisms described are in cis.
Many additional examples are known from humans, where nearly two-thirds of the known functional polymorphisms in cis-regulatory sequences have a greater than twofold impact on transcription rates (Rockman and Wray 2002). (3) Inducibility of expression. Inducibility of amylase expression in response to a starch diet varies within D. melanogaster and responds to artificial selection (Matsuo and Yamazaki 1984; Klarenberg, Sikkema, and Scharloo 1987); expression of β-glucuronidase in response to androgen varies within Mus domesticus (Bush and Paigen 1992); and three different mobile element insertions into the promoter of hsp70 reduce transcription in response to thermal stress in D. melanogaster populations (Lerman et al. 2003). Several other examples of variation in inducibility are known from humans (Rockman and Wray 2002). In the human and hsp70 cases, the genetic basis is known to reside in cis.</p><p>Additional studies have estimated the extent of heritable genetic variation in gene expression within populations. (1) Protein-based surveys. Several studies have</p><p>should be limited largely by mutation, whereas in coding regions such changes should be limited primarily by functional constraints. Small-scale inversions in promoters can exist within populations (e.g., IGHA1 in humans: Denizot et al. 2001). (4) “Module shuffling” should be relatively common. “Domain shuffling” has been an important part of the evolution of many gene families (Lander et al. 2001). The analogous process of module shuffling within promoters may occur at a higher frequency. Several examples of mobile element insertions that have brought functional binding sites into range of a gene are known (Britten 1997; Kidwell and Lisch 1997).
The modular organization of many promoters means that a transcription profile could be dramatically modified in a functionally integrated way.</p><p>6.2 Selection Acts Primarily on the Sequence and Spatial Arrangement of Binding Sites</p><p>The output of a promoter derives from the nucleotide sequences and spatial arrangement of transcription factor binding sites (see section 3). It follows that sequences that lie between binding sites should be free to vary, at most showing weak biases that reflect mutational processes or weak selection to maintain overall base composition or conformational properties. (1) Negative selection should operate primarily on nucleotide identity within binding sites. This is the basic idea underlying “phylogenetic footprinting” as a method of identifying binding sites (see section 5.2). Several difficulties beset tests for negative selection on binding sites: some nucleotide positions assumed to be “non-binding sites” may in fact be part of binding sites that have not yet been identified, some nucleotide substitutions within known binding sites may be functionally tolerated, and binding sites may turn over within a promoter. Nonetheless, some studies have found evidence for preferential conservation of binding sites (e.g., teashirt: Core et al. 1997; Otx: Yuh et al. 2002). (2) Negative selection should operate on the spacing between nearby binding sites. Protein-protein interactions associated with adjacent binding sites often rely on precise spacing (see section 3.5.2), and small changes in spacing can dramatically affect transcription (e.g., bicoid sites: Hanes et al.
1994; protein C: Spek, Bertina, and Reitsma 1999). These functional constraints on length variation are likely to be absent in regions between modules or, more generally, between distantly located binding sites. (3) Negative selection should eliminate spurious binding sites. Because they are small and imprecise, and exist for many different transcription factors, binding sites will appear through random mutation at appreciable rates in large populations (Stone and Wray 2001). Where new binding sites interfere with transcription, selection should eliminate them. There is evidence for such selection in prokaryotic genomes, although the strength of selection against such binding sites is estimated to be quite weak (Hahn, Stajich, and Wray 2003).</p><p>6.3 Selection Can Discriminate Among Binding Sites Within a Promoter</p><p>Some binding sites are more important than others for promoter function. The imprint of different levels of functional constraint among binding sites within a single promoter should be evident in sequence comparisons. (1) Essential binding sites should evolve relatively slowly. Functional analyses often reveal that one or a few binding sites are absolutely necessary for activating transcription and that others (usually a greater number) either modulate or have no detectable impact on transcription. Although within-consensus nucleotide substitutions introduce a complication, comparisons should generally reveal lower rates of turnover or loss for essential binding sites. (2) Binding sites for repressors should evolve faster than those for activators.
There are many ways to repress transcription but relatively few ways to activate it (Latchman 1998; Carey and Smale 2000); furthermore, the consequences of failing to repress transcription may be generally less severe than those of failing to activate it (see the next section for more on this point). It follows that the binding sites within a promoter that activate expression may experience stronger negative selection than those that bind repressors. Because binding site function is often context dependent (see section 3.4.7), this pattern may be weak. (3) Multiply-represented binding sites should evolve faster than unique ones. In some cases, multiply-represented sites are functionally redundant or each has a minor impact on the overall transcription profile. Thus, selection may tolerate more nucleotide substitutions and turnover of multiply-represented binding sites than unique ones. Some multiply-represented binding sites either function synergistically (e.g., hunchback: Ma et al. 1996) or have distinct functions (e.g., Endo16: Yuh et al. 2002), which will weaken this prediction. Although several cases of binding site turnover have been identified (Ludwig and Kreitman 1995; Liu, Wu, and He 2000; Dermitzakis and Clark 2002; Scemama et al. 2002), direct comparisons of turnover rates in unique versus multiply-represented binding sites have not been made. (4) Loss of one binding site may be followed by loss of another if their cognate proteins interact. Binding sites that are occupied by proteins that must interact in order to function will either both be present or both be absent.
For promoters with a modular organization, loss of a crucial binding site may lead to eventual loss of the entire module, because the clumped distribution of binding sites into modules is probably a consequence of interactions among the proteins that bind them. In general, the functional interdependence among promoter nucleotides makes these sequences candidates for evolution according to a covarion (Fitch and Markowitz 1970) or fluctuating neutral space (Takahata 1987) model.</p><p>6.4 Binding Sites Can Evolve Neutrally</p><p>Many point mutations in exons are functionally neutral or near neutral, and some of these are fixed through drift (Kimura 1983; Ohta 1992). The same should be true of promoter sequences. Three categories of sequence change in binding sites, described below, should be effectively neutral (fig. 8). Together with neutral changes in nucleotides between binding sites (see section 6.2), this means that at least four distinct kinds of neutral sequence evolution should be evident within promoter regions. (1) Some sequence differences within binding sites should evolve because they do not alter the transcription profile. Nucleotide substitutions in binding sites that do not alter binding kinetics should not affect transcription, and at the same time, some substitutions that do alter binding kinetics may not affect the transcription profile. Such transcriptionally silent changes within binding sites will accumulate by drift. Because many transcription factors can bind with high specificity to two or more variants</p><p>FIG. 7. A neutral model of promoter evolution.
This figure outlines the organization of the first six groups of predictions about promoter evolution (see sections 6.1–6.6), using a neutral model of promoter evolution (i.e., the notion that the rate of evolution at a nucleotide position is inversely related to its functional importance). (A) Schematic diagram of a locus, defining the regions referred to below. Using Kimura’s (1983) neutral model of sequence evolution, we can model the substitution rate as the fraction of neutral mutations multiplied by the total mutation rate: $K = f\mu_{\mathrm{total}}$ ($0 \le f \le 1$). (B) Relationship of sections 6.1–6.6 to the neutral model. For instance, section 6.1 treats expected differences in patterns of variation among genomic partitions.</p><p>of a binding sequence (see section 3.4.4), functionally neutral substitutions within binding sites may be relatively common. (2) Some differences in the complement of binding sites should evolve because they do not alter the transcription profile. In experimental assays, removing a binding site does not always change the resulting transcription profile, at least within the limits of assay sensitivity (e.g., SpHE: Wei et al. 1995). Some evolutionary gains and losses of binding sites may therefore represent transcriptionally neutral changes that were fixed by drift. Functional redundancy provides one avenue: if a new binding site evolves by random walk (Stone and Wray 2001), mutations in an existing site may be tolerated. The new binding site could potentially even bind a different protein, so long as the functional consequence is the same.
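</p><p>As a rough numerical reading of the neutral model summarized in the caption of figure 7 (substitution rate equals the fraction of neutral mutations multiplied by the total mutation rate), the following Python sketch may be useful. The total mutation rate and the per-partition neutral fractions are hypothetical placeholders, chosen only to reproduce the ordering predicted in section 6.1 (coding sequences slowest, promoters intermediate, introns fastest); they are not estimates from any data set, and the function and variable names are illustrative.</p>

```python
def substitution_rate(neutral_fraction: float, total_mutation_rate: float) -> float:
    """Return K = f * mu_total for one genomic partition (0 <= f <= 1)."""
    if not 0.0 <= neutral_fraction <= 1.0:
        raise ValueError("neutral fraction f must lie between 0 and 1")
    return neutral_fraction * total_mutation_rate


# Hypothetical total mutation rate per site per generation (an assumption).
MU_TOTAL = 1e-8

# Hypothetical neutral fractions f for three partitions (illustration only).
NEUTRAL_FRACTION = {"coding": 0.3, "promoter": 0.6, "intron": 0.9}

rates = {
    name: substitution_rate(f, MU_TOTAL)
    for name, f in NEUTRAL_FRACTION.items()
}

# Partitions ordered from slowest to fastest expected substitution rate.
slowest_to_fastest = sorted(rates, key=rates.get)

for name in slowest_to_fastest:
    print(f"{name}: K = {rates[name]:.1e}")
```

<p>Because the total mutation rate is held constant across partitions in this sketch, differences in K arise only from differences in f, which is the sense in which the model ties evolutionary rate to functional constraint.</p><p>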
(3) Some sequence differences should evolve because they affect transcription but not fitness. A difference in gene expression need not influence organismal phenotype or fitness. Lack of a fitness consequence for a difference in transcription could lead to the evolution of “sloppy” or “gratuitous” expression (Gerhart and Kirschner 1997). Neutral variation in gene expression is evolutionarily relevant, because it could later interact with polymorphisms elsewhere in the genome or with changes in the environment to produce phenotypic consequences.</p><p>6.5 Selection Can Discriminate Among Variants of a Binding Site</p><p>Although many mutations within promoter regions are probably phenotypically neutral, many others certainly are not. Many sequence changes within binding sites affect protein interactions, and this might alter the transcription profile, which could in turn affect fitness. (1) Specific binding sites may be constrained to a subset of the total binding site matrix of their cognate protein for functional reasons. The precise affinity of a binding site for a particular transcription factor is sometimes functionally important. In cases where high-affinity and low-affinity variants of the binding site matrix have different phenotypic consequences, negative selection will eliminate variants that bind protein but result in lower fitness. In interspecific comparisons, therefore, some binding sites for a particular protein may show more variation than others. Conversely, specific variants that confer a fitness advantage should be under positive selection (e.g., ftz: Jenkins, Ortori, and Brookfield 1995). (2) Binding sites for transcription factors with many downstream targets should evolve more rapidly than those for factors with few targets.
The consensus binding sites for general transcription factors (TBP, TAFs, etc.; fig. 1B) should be fairly broad because they are present in the promoters of many genes: a very narrow consensus would impose a high genetic load because most point mutations in these sites would interfere with binding and thus would likely be deleterious. Thus, binding sites for general factors should exhibit higher levels of variation than binding sites for transcription factors that regulate the specificity of transcription, each of which will be present in fewer genes. This argument can be extended to levels of variation in binding sites for different specific transcription factors: those that interact with many targets should have broader consensuses than those that interact with only a few targets. (3) Binding-site specificity of strong activators and repressors should be relatively strict. Because binding of strong activator or strong repressor proteins is more likely to have a large impact on transcription, selection may operate to narrow the consensus binding sites for these proteins. There are at least three ways, not mutually exclusive, in which this might happen: a requirement for a cofactor that also requires a specific binding site, a relatively large binding site, and a relatively narrow consensus binding matrix.</p><p>6.6 Selection Can Discriminate Among Regions and Modules Within a Promoter</p><p>Selection should be able to discriminate degrees of functional constraint within a promoter, and the result should be evident as distinct regional patterns of sequence evolution. (1) Distal regions of large promoters tend to evolve faster than proximal regions.
Binding sites required for initially activating transcription often lie within the first few hundred bases 5′ of the basal promoter, whereas booster, repressor, and tissue-specific modules are often more distant. Physical proximity of activator binding sites to the basal promoter may provide a more reliable or efficient means of initiating transcription (although there are many exceptions). A comparison of human-mouse orthologs found that sequence conservation generally decayed rapidly with distance from the start of transcription (Jareborg, Birney, and Durbin 1999).</p><p>FIG. 8. Distinct consequences of variants in noncoding sequences. A representation of the variation (or, for interspecies comparisons, fixed differences) in noncoding sequences near a locus. From the total pool of variation, some variants will lie within transcription factor binding sites; a fraction of these will alter protein-DNA interaction; some of these will affect transcription; a subset of these will affect organismal phenotype (anatomy, physiology, behavior, etc.); and some of these will have fitness consequences. These ratios and the kinds of variants that contribute to each are poorly understood for cis-regulatory regions.</p><p>(2) Activator modules should evolve more slowly than repressor modules. The loss of an activator module will, in many cases, be analogous to a stop codon in that it abolishes gene function. In contrast, loss of repressor function is less likely to be incompatible with gene function. Furthermore, there are many ways to repress transcription, but activation requires a series of specific steps.
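</p><p>Prediction (1) of this section, that sequence conservation declines with distance from the start of transcription, can be examined on any pairwise promoter alignment by computing identity in consecutive windows. The Python sketch below is a minimal illustration under stated assumptions: the two aligned sequences are invented toy strings rather than real orthologs, column 0 is assumed to be the position nearest the basal promoter, gaps are written as "-", and the helper name windowed_identity is hypothetical.</p>

```python
from typing import List, Optional


def windowed_identity(seq_a: str, seq_b: str, window: int) -> List[Optional[float]]:
    """Fraction of identical, ungapped columns in consecutive alignment windows.

    Index 0 is assumed to be the column nearest the basal promoter, so later
    windows are more distal. A window with no ungapped columns yields None.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if window <= 0:
        raise ValueError("window must be positive")
    identities: List[Optional[float]] = []
    for start in range(0, len(seq_a), window):
        compared = 0
        matches = 0
        for x, y in zip(seq_a[start:start + window], seq_b[start:start + window]):
            if x == "-" or y == "-":
                continue  # skip alignment gaps rather than count them as mismatches
            compared += 1
            if x == y:
                matches += 1
        identities.append(matches / compared if compared else None)
    return identities


# Toy alignment (invented for illustration), proximal window first.
species_a = "ACGTACGTAC" + "GGCTAAGCTT" + "TTGACCATGA"
species_b = "ACGTACGTAC" + "GGCAAAGGTT" + "TAGTCGATCA"

profile = windowed_identity(species_a, species_b, window=10)
print(profile)
```

<p>A profile that declines from the first window outward would be consistent with prediction (1). A real analysis would also require an explicit alignment method and a neutral baseline for comparison, neither of which this sketch addresses.</p><p>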
(3) Booster modules should evolve somewhat more rapidly than other modules. Differences in transcript abundance of twofold or greater are common within populations (see section 2.3). Promoter modules that modulate transcription level, but that provide no spatial or temporal control, may therefore experience fewer functional constraints on average than most other kinds of modules. (4) Modules used for multiple phases of expression should evolve more slowly than those used once. On average, such modules should be under greater functional constraint, which should be apparent as a greater degree of sequence conservation. (5) Integrator and tethering modules should evolve more slowly than other categories of module. Promoter modules that function epistatically (see section 3.5.4) may be among the most functionally constrained modules within their respective promoters.</p><p>6.7 Structural Complexity in Promoters Reflects Functional Complexity</p><p>Genes differ in their functional requirements for regulation: constitutive versus inducible expression, constant level versus modulated level, one versus multiple phases of expression, few inputs versus many inputs, and so forth. These diverse regulatory requirements should be reflected in a similar diversity of functional and organizational complexity in promoters. (By “complex promoter” we mean one with relatively many binding sites and regulatory inputs.) (1) Genes that are constitutively expressed should have simple promoters. In principle, a promoter that is always and everywhere “on” need contain only one binding site for a ubiquitous transcriptional activator.
Additional binding sites might be present, however, to add robustness, to set levels of transcription precisely, or to modulate levels in response to extreme conditions such as heat shock. (2) Regulatory genes expressed early in development should have complex promoters. The promoters of genes that operate in early embryos drive temporally and spatially precise transcription, despite the fact that pattern formation is ongoing and spatial reference points are not yet well defined. The promoters of these genes often use cooperative protein binding to sharpen boundaries of transcription domains, and this requires additional binding sites. Furthermore, these promoters typically contain binding sites for several positive and negative regulators, as they must integrate multiple spatial and temporal inputs (Arnone and Davidson 1997). (3) Genes with several distinct expression domains should have complex promoters with modular organization. Promoters that drive multiphased expression profiles should be more complex, on average, because they respond to many inputs, and that response, in turn, requires interaction with a greater variety of transcription factors. Multiphased expression is very common for genes encoding developmental regulatory proteins. (4) Genes expressed exclusively in a single differentiated cell type should have simple promoters. Genes encoding the specialized products of terminally differentiated cells often have relatively simple promoters even though they produce spatially complex expression profiles (e.g., CyIIa: Arnone, Martin, and Davidson 1998). The promoters of these genes are typically activated by one or a few tissue-specific transcription factors, and sometimes they lack binding sites for repressors (Davidson 2001).
Several so-called master regulators of differentiation are known, including myoD in muscle cells (Yun and Wold 1996) and achaete and neuroD in neurons (Lee 1997). (A corollary of this relationship between dedicated regulators and their downstream targets is discussed in section 6.9.) (5) Genes that produce more than one isoform should have complex promoters. Loci that produce multiple isoforms of a protein may generally have more complex promoters, simply because they regulate what is likely to be, on average, a more complex overall expression profile. In addition, alternate transcriptional start sites are often part of the way in which distinct isoforms are generated, adding complexity to such promoters. (6) Genes with aspects of expression that are contingent on external/extracellular conditions should have more complex promoters. Signal transduction systems communicate changing conditions in the cytoplasm or at the cell surface to the nucleus, often by phosphorylation or dephosphorylation of a specific transcription factor already present in the nucleus. Contingent regulation of transcription should therefore require additional binding sites for these factors. (7) Genealogically unrelated genes that are coordinately regulated should share some binding sites. The promoters of genes that are expressed in similar spatial and temporal patterns share similar functional requirements and should therefore sometimes contain binding sites that evolved independently yet function in a similar manner. Possible cases include insect chorion genes (Mitsialis and Kafatos 1985; Cavener 1992) and vertebrate crystallin genes (Tomarev et al. 1994), among many other examples (Arnone and Davidson 1997; Bernstein, Tong, and Schreiber 2000; Berman et al. 2002).
Although few cases of convergence in promoter structure and function have been identified as such, this situation may prove to be common given the ease with which binding sites can be gained (Cavener 1992; Stone and Wray 2001).</p><p>6.8 Rates of Promoter Evolution Depend on Many Factors</p><p>The diversity of organization and function in eukaryotic promoters (fig. 2; see also the preceding section) should expose them to different modes and degrees of natural selection, which should in turn be reflected in a range of rates and patterns of sequence evolution (see section 4.1). In addition, rates of promoter sequence evolution may be poorly correlated with function, for a variety of reasons. (1) Promoters containing few binding sites should evolve relatively slowly. The function of relatively simple promoters may be particularly sensitive to sequence change because they depend on very few proteins for activation and lack multiply-represented binding sites that might confer some functional redundancy. For these and other reasons, the level of functional constraint per binding site may be inversely correlated with the total number of binding sites in a promoter. (2) Rates of promoter evolution should correlate negatively with codon usage bias at the same locus. A bias in codon usage is generally interpreted to mean that a gene must be translated rapidly at some point in the life cycle (Akashi 2001). Another way to produce protein rapidly is to increase rates of mRNA synthesis. Thus, loci with codon usage bias should have promoters that can direct high rates of transcription.
Just as the codons of such genes are</p><p>biased to a subset of the synonymous possibilities, so too</p><p>might transcription factor binding sites be restricted to</p><p>a subset of the binding site matrix that results in high-</p><p>affinity binding. (3) A particular mechanism of regulation</p><p>will sometimes be the target of selection. In some cases,</p><p>natural selection may favor a particular mechanism of</p><p>regulating a conserved transcription profile. For instance,</p><p>selection might favor stable transcription rates despite</p><p>environmental perturbations, something that might require</p><p>additional binding sites beyond the minimum necessary to</p><p>generate the expression profile under constant environ-</p><p>mental conditions. In such cases, the rate of promoter</p><p>evolution might not be correlated with that of the</p><p>transcription profile. (4) Rates of divergence in promoter</p><p>structure and phenotype should often be uncorrelated.</p><p>Within coding sequences, a large discrepancy can exist</p><p>between the magnitude of a change in genotype and the</p><p>resulting change in phenotype. For a variety of reasons</p><p>(see sections 4.3 and 4.6) a similar situation is likely to</p><p>exist within promoters.</p><p>6.9 Some Evolutionary Changes in the Architecture of</p><p>Gene Networks Are More Likely than Others</p><p>The architecture of a gene network (the nature and</p><p>organization of interactions between genes and gene</p><p>products) can change during the course of evolution</p><p>(Wilkins 2002). Some general evolutionary patterns</p><p>should be evident in how linkages are altered, added</p><p>(recruitment or co-option), or lost (abandonment). (1) The</p><p>genetic basis for an evolutionary change in transcription</p><p>is likely to reside in cis. 
Although a change in transcription</p><p>could arise in several ways, in practice the genetic basis</p><p>is likely to reside in the cis-regulatory sequence of the</p><p>downstream gene. This is because most transcription</p><p>factors regulate the expression of many target genes (see</p><p>section 3.6.1): thus, a change in the binding specificity or</p><p>expression profile of a transcription factor will affect the</p><p>expression profiles of many of its downstream targets,</p><p>whereas a change in a single binding site for that</p><p>transcription factor is likely to be much less pleiotropic.</p><p>Assuming that highly pleiotropic mutations are less likely</p><p>to become fixed (Fisher 1930), this fundamental asymmetry</p><p>means that changes in transcription of a given gene are</p><p>more likely to reside in its promoter than in the amino acid</p><p>sequences of its upstream regulators (Stern 2000). Protein</p><p>and microarray expression studies in mice and humans</p><p>support this prediction (Klose et al. 2002; Schadt et al.</p><p>2003), although a similar study in yeast is equivocal (Brem</p><p>et al. 2002). (2) Recruitment of “top” and “intermediate”</p><p>regulators should occur more frequently than recruitment</p><p>of terminal regulators. Several dramatic evolutionary</p><p>changes at the top of gene networks are known (sex</p><p>determination: Hodgkin 1992, Wilkins 2002; embryonic</p><p>patterning: Stauber, Jackle, and Schmidt-Ott 1999). In</p><p>each case, the plesiomorphic function of the recruited</p><p>transcription factor is quite different from the more</p><p>famous, apomorphic role. Several models have been</p><p>presented to explain why recruitment of an additional</p><p>regulator should be tolerated functionally at the beginning</p><p>or middle of a gene network more often than at its</p><p>termini (Gehring and Ikeo 1999; Davidson 2001; Wilkins</p><p>2002). 
(3) Recruitment of a new regulator is more likely to</p><p>occur for structures that develop in regions and at times</p><p>where that regulator is already being expressed. Transcriptional</p><p>regulators are sometimes expressed in analogous</p><p>structures, most famously Pax-6 in eyes (Quiring et al.</p><p>1994) and Dlx in appendages (Panganiban et al. 1997).</p><p>Most of these cases appear to involve parallel recruitment</p><p>rather than mistaken interpretations of comparative</p><p>anatomy (reviewed by Davidson 2001 and Wilkins</p><p>2002). Recruitment to additional regulatory roles is more</p><p>probable in regions or cell types where the transcription</p><p>factor is already expressed (Davidson 2001). For instance,</p><p>Pax-6 is transcribed in photosensitive neurons throughout</p><p>the Metazoa, including organisms that lack image-forming</p><p>eyes. It will therefore be expressed in any organ that</p><p>contains photoreceptors, and it is more likely to be</p><p>recruited to additional roles in eye development than</p><p>transcription factors associated with, for example, muscle</p><p>cells. (4) Regulatory linkages to structural genes should be</p><p>more conservative than those between regulatory genes.</p><p>Some of the most widely conserved associations between</p><p>transcriptional regulators and specific downstream target</p><p>genes involve tissue-specific activators and structural</p><p>genes characteristic of those tissues (Gerhart and Kirschner</p><p>1997; Davidson 2001). Although the presence of these</p><p>associations in distantly related taxa suggests very long-term</p><p>(> 0.5 billion year) conservation, denser phylogenetic</p><p>sampling is needed to distinguish this possibility</p><p>from independent recruitment of downstream genes. (5)</p><p>Transcription factor “switching” should be a common</p><p>basis for altered transcription profiles. 
Because many</p><p>transcription factors have overlapping binding specificities</p><p>(table 4), some point mutations will shift the equilibrium in</p><p>favor of binding by a different protein. Several cases of</p><p>such “transcription factor switching” have been identified</p><p>as polymorphisms within human populations (Rockman</p><p>and Wray 2002). (6) Negative and quantitative changes in</p><p>expression should be “easier” to achieve than activation</p><p>of novel expression domains. Because there are many ways</p><p>to repress transcription and relatively few ways to activate</p><p>it, a random change in promoter sequence is more likely to</p><p>modulate or abolish an existing phase of gene expression</p><p>than it is to activate a new phase of expression. (7) The</p><p>organization of gene networks is robust to perturbations.</p><p>Simulations suggest that gene networks are organized in</p><p>such a way that they produce consistent transcriptional</p><p>outputs across a range of transcription factor concentrations</p><p>and transcription factor binding site interactions</p><p>(von Dassow et al. 2000). If this robustness proves to be</p><p>a general feature of real gene networks, it may be the result</p><p>of natural selection to canalize or stabilize transcription</p><p>against environmental variation and genetic background.</p><p>6.10 Promoters Display Complex Macroevolutionary Properties</p><p>Reconstructing the macroevolutionary history of</p><p>promoter structure and function represents an outstanding</p><p>challenge for studies of developmental evolution (see</p><p>section 7.2). These changes probably lie at the heart of</p><p>many important anatomical transformations and innovations</p><p>(Raff 1986; Carroll, Grenier, and Weatherbee 2001;</p><p>Davidson 2001; Wilkins 2002). 
(1) Promoters are built</p><p>from a mixture of small-scale mutations and rearrange-</p><p>ments. Populations harbor abundant small-scale variation</p><p>within promoter regions (nucleotide substitutions, indels,</p><p>and tandem repeat variants) (see section 2.3). These</p><p>common forms of mutation can alter transcription, with</p><p>consequences for fitness (e.g., FY: Hamblin and Di Rienzo</p><p>2000; CCR5: Bamshad et al. 2002; P450: Daborn et al.</p><p>2002). This ordinary, small-scale variation is likely to be</p><p>the primary contributor to interspecific differences in</p><p>promoter sequences. Larger-scale mutations (transposition</p><p>and chromosomal rearrangement) are less commonly</p><p>found within populations but can also alter transcription,</p><p>leading to fitness consequences (e.g., hsp70: Lerman et al.</p><p>2003). (2) The binding sites that exist within a single</p><p>promoter often have different times of origin. Comparisons</p><p>between species suggest that promoters often evolve by</p><p>binding site accretion and turnover (e.g., Ludwig and</p><p>Kreitman 1995; Rockman and Wray 2002). Importation of</p><p>intact regulatory modules by transposition or recombina-</p><p>tion is probably rarer. (3) Genomic rearrangements can</p><p>lead to novel expression profiles. Gene duplications may</p><p>lead to functional divergence not just of the encoded</p><p>protein but also of cis-regulatory sequences (Ferris and</p><p>Whitt 1979; Paigen 1989; Force et al. 1999), with several</p><p>cases now well documented (e.g., gooseberry and</p><p>paralogs: Li and Noll 1994; Hox3 duplication within</p><p>Diptera: Stauber, Prell, and Schmidt-Ott 2002). The</p><p>duplication-degeneration-complementation (DDC) model</p><p>of promoter evolution (Force et al. 
1999) proposes that</p><p>selection can maintain functionally redundant coding</p><p>sequences after gene duplication if each copy loses</p><p>a different promoter module due to random mutation.</p><p>Although gene families can expand through large-scale</p><p>processes that duplicate entire promoter regions (poly-</p><p>ploidization, chromosomal nondisjunction, and large</p><p>translocations), local inversions and tandem duplications</p><p>are more common events (Seoighe et al. 2000). These</p><p>relatively small genomic rearrangements will often omit</p><p>some of the cis-regulatory sequences surrounding a gene,</p><p>producing truncated or hybrid promoters at their inception</p><p>(e.g., nNOS: Korneev and O’Shea 2002) and considerably</p><p>expanding the range of functional outcomes following</p><p>gene duplication. (4) The modular structure of promoters</p><p>facilitates modular changes in expression. In principle,</p><p>modular promoter function should allow selection to</p><p>operate on a discrete aspect of transcription with minimal</p><p>impact on other aspects of the total transcription profile.</p><p>Modular promoters may therefore be more evolvable</p><p>(Gerhart and Kirschner 1997; Stern 2000; Carroll, Grenier,</p><p>and Weatherbee 2001). (5) Regulatory recruitment is an</p><p>important mechanism underlying the evolution of novelty.</p><p>Deciphering the genetic basis for anatomical novelty</p><p>presents a significant challenge in evolutionary biology</p><p>(Müller and Wagner 1991). One possibility is that new</p><p>structures require the origin of new genes; the other</p><p>extreme is that new structures are built entirely by re-</p><p>organizing the activities or interactions among existing</p><p>genes. 
The improbability of de novo origins of functional</p><p>genes, combined with the observation that evolution generally</p><p>operates by incremental tinkering (Darwin 1859;</p><p>Jacob 1977), suggests that the latter process predominates.</p><p>Several authors have proposed that regulatory changes in</p><p>development are a crucial and perhaps ubiquitous</p><p>component in the origin of evolutionary innovations</p><p>(Britten and Davidson 1971; Duboule and Wilkins 1998;</p><p>Carroll, Grenier, and Weatherbee 2001; Wilkins 2002).</p><p>This “new roles for old genes” hypothesis (Wray and</p><p>Lowe 2000) proposes that existing regulatory proteins,</p><p>including transcription factors, are recruited to build novel</p><p>features. The requisite genetic variation, namely gains and</p><p>losses of individual transcription factor binding sites and</p><p>changes in interactions among proteins bound to these</p><p>sites, is common within populations (see section 2.3).</p><p>7 Future Directions</p><p>Despite the challenges unique to studying promoter</p><p>evolution (see section 5), considerable progress is being</p><p>made on several fronts. Our ability to study the evolution of</p><p>transcriptional regulation has increased enormously during</p><p>the past few years. Biochemical characterizations and expression</p><p>assays are increasingly feasible in nonmodel organisms,</p><p>allowing evolutionary comparisons to move beyond</p><p>sequence inspection to highly informative functional tests.</p><p>In addition, the number of noncoding sequences available</p><p>for comparison is increasing exponentially. Whole genome</p><p>assemblies from related species are proving enormously</p><p>useful, providing many orthologous intergenic regions for</p><p>comparison. Importantly, many current areas of ignorance</p><p>about the evolution of transcriptional regulation are the</p><p>result of neglect rather than technical limitations. 
Below we</p><p>list several important but poorly understood issues that can</p><p>be addressed through methods that are practical today.</p><p>7.1 Intraspecific Variation</p><p>Considerable effort has gone into characterizing</p><p>general patterns of intraspecific variation in exon and</p><p>intron sequences and into understanding the mechanisms</p><p>that shape this variation (Gillespie 1991; Li 1997). Our</p><p>understanding of variation in these partitions of the</p><p>genome is now highly quantitative, precise, and, in some</p><p>cases, predictive. In contrast, most parameters of intraspecific</p><p>variation have been measured from just one or</p><p>two promoters, and several basic parameters have never</p><p>been estimated. An important goal for the near term is to</p><p>characterize intraspecific variation in promoter sequences:</p><p>(1) the nature, level, and scope of variation and how it</p><p>compares to that in other partitions of the genome such as</p><p>coding sequences, introns, UTRs, and nonregulatory</p><p>intergenic regions; (2) how much and what components</p><p>of this variation influence proximate phenotype (transcription</p><p>profile), organismal phenotype (anatomical, physiological,</p><p>behavioral, etc.), and fitness (fig. 8); (3) the</p><p>frequencies and heterozygosities of functionally silent</p><p>versus transcriptionally penetrant and of neutral versus</p><p>non-neutral allelic variation in promoters; and (4) the</p><p>relative contributions of mutation, drift, and selection of</p><p>various kinds to standing variation in promoter sequences.</p><p>At present, by far the most information on population</p><p>variation in promoter sequences is available for humans</p><p>(Cooper 1999; Rockman and Wray 2002). 
Comparable</p><p>information is needed from other organisms, not only to</p><p>determine general patterns of variation but also to enable</p><p>follow-up studies on functional and selective conse-</p><p>quences that cannot be carried out in humans for ethical</p><p>and practical reasons.</p><p>7.2 Reconstructing History</p><p>Few studies have analyzed evolutionary changes in</p><p>promoter structure or function in detail. The frequencies</p><p>and patterns of evolutionary change in several features are</p><p>of interest: (1) the spatial distribution for different kinds</p><p>of binding sites (activator, repressor, architectural factor,</p><p>general transcription factor), for binding sites of high</p><p>and low affinity, and among different kinds of modules</p><p>(activator, repressor, booster, insulator, tetherer, integra-</p><p>tor); (2) the composition of binding sites within a promoter</p><p>(gains, losses, and replacement of individual binding sites),</p><p>binding site ‘‘switches’’ to higher affinity for a different</p><p>transcription factor, and correlations of such changes with</p><p>the number of instances of a binding site within a given</p><p>module or promoter; (3) promoter organization, including</p><p>changes in spacing and orientation of modules relative to</p><p>each other and to the basal promoter, the frequency of</p><p>gains and losses of entire modules, and possible relation-</p><p>ships between module turnover and function (activator,</p><p>repressor, etc.); (4) changes in proximate promoter</p><p>function including level, timing, and location of expres-</p><p>sion, as well as gains and losses of entire phases of</p><p>expression; (5) changes in such modes of transcriptional</p><p>regulation as constitutive, metabolically inducible, stress</p><p>inducible, and sexually dimorphic, among others; and (6)</p><p>changes in the array of upstream regulators that interact</p><p>with a promoter. 
Such analyses will be most informative</p><p>when based on dense taxon sampling and carried out</p><p>within the context of a robust phylogenetic framework. No</p><p>promoter has been subjected to a detailed analysis of this</p><p>kind.</p><p>7.3 Structure-Function Relationships</p><p>The evolutionary relationship between genetic and</p><p>phenotypic changes in transcription remains poorly un-</p><p>derstood, despite a handful of cases where intriguing, but</p><p>limited, information has been uncovered. Several issues</p><p>need to be addressed: (1) the proportions of variation (or</p><p>interspecific differences) in transcription that are heritable,</p><p>stochastic, and environmentally determined; (2) the extent</p><p>of maternal influences on gene expression, particularly in</p><p>embryos; (3) what kinds of mutations give rise to various</p><p>differences in transcription profiles, such as altered level,</p><p>altered timing, new location, and environmental sensitivity;</p><p>(4) the proportion of genetically based variation in</p><p>transcription that resides in cis (within promoter sequences)</p><p>and trans (in the expression or function of regulators that</p><p>bind to those sequences); (5) the heritability of particular</p><p>components of transcription, including location, time of</p><p>onset and cessation, level, and response to inducers or</p><p>environmental conditions; (6) the degree to which co-</p><p>ordinately expressed loci are genetically correlated, for</p><p>instance through common upstream regulators or the same</p><p>signal transduction system.</p><p>7.4 Relationship to Organismal Phenotype</p><p>Despite assertions that mutations within promoter</p><p>regions constitute the most ‘‘relevant’’ (Stern 2000) or</p><p>‘‘important’’ (Carroll 2000) source of genetic variation, the</p><p>fraction of phenotypic changes due to mutations within</p><p>regulatory versus coding sequences is not known, even to a</p><p>very rough approximation. 
Estimating this ratio represents</p><p>an important challenge in molecular evolution. Ultimately,</p><p>we would like to gauge the broad impact of transcriptional</p><p>regulation on the evolution of organismal function. Several</p><p>issues stand out: (1) determining what kinds of phenotypic</p><p>consequences result from mutations in promoters versus</p><p>coding sequences, and how these relate to functional</p><p>classes of encoded proteins (enzyme, transcription factor,</p><p>ion channel, etc.); (2) conversely, knowing what kinds of</p><p>mutations in promoters contribute to organismal pheno-</p><p>types of various kinds (local mutation, chromosomal re-</p><p>arrangement, and transposition, and various classes within</p><p>each); (3) evaluating how hard it is to achieve a change in</p><p>a gene expression profile, in particular the number and</p><p>kind of mutations it takes to shift features such as the</p><p>timing, level, location, or environmental sensitivity of</p><p>transcription or to establish a novel phase of transcription;</p><p>and (4) establishing what fraction of organismal pheno-</p><p>typic changes are due to mutations within regulatory</p><p>versus coding sequences.</p><p>8 Conclusions</p><p>Promoter sequences represent for evolutionary biolo-</p><p>gists a vast and largely uncharted territory within the</p><p>genome. First principles and a growing body of empirical</p><p>evidence point squarely toward the evolutionary impor-</p><p>tance of these regulatory sequences. Because transcriptional</p><p>regulation is complex, indirect, idiosyncratic, and context</p><p>dependent, understanding the evolutionary mechanisms</p><p>that shape promoter sequences will require a thorough</p><p>appreciation of molecular mechanisms as well as the use of</p><p>comparative data from promoter sequences, biochemical</p><p>assays, and functional tests. 
The conceptual and empirical</p><p>challenges to studying promoter evolution are significant,</p><p>but well worth tackling. The insights into evolutionary</p><p>history and mechanisms that will emerge from detailed</p><p>analyses of promoter evolution are potentially enormous.</p><p>This information will be essential for a complete understanding</p><p>of the evolution of the genotype-phenotype</p><p>relationship. Changes in promoter function are likely to be</p><p>an important component of reproductive isolation, the</p><p>evolution of morphology and physiology, the origin of</p><p>phenotypic plasticity, and the genetic basis of evolutionary</p><p>novelties. Yet to date nearly everything we know about the</p><p>evolution of promoters has come from biologists with</p><p>relatively little background or interest in evolutionary</p><p>biology. It is time for the molecular evolution community to</p><p>seize the opportunities that promoters offer to expand our</p><p>understanding and appreciation of the evolution of genomes</p><p>and organisms.</p><p>Acknowledgments</p><p>Lynn Angerer, Ann Rouse, David Des Marais, Daven</p><p>Presgraves, Tania Rehse, and Jay Storz provided perceptive</p><p>comments and helped locate references. Two</p><p>anonymous reviewers offered constructive advice. Research</p><p>in G.A.W.’s lab is supported by the National</p><p>Science Foundation and by the National Aeronautics and</p><p>Space Administration.</p><p>Literature Cited</p><p>Abouheif, E. H., and G. A. Wray. 2002. The developmental</p><p>genetic basis for the evolution of wing polyphenism in ants.</p><p>Science 297:249–252.</p><p>Abraham, I., and W. W. Doane. 1978. Genetic regulation of</p><p>tissue-specific expression of amylase structural genes in</p><p>Drosophila melanogaster. Proc. Natl. Acad. Sci. 
USA</p><p>75:4446–4450.</p><p>Abu-Shaar, M., H. D. Ryoo, and R. S. Mann. 1999. Control of</p><p>the nuclear localization of extradenticle by competing nuclear</p><p>import and export signals. Genes Dev. 13:935–945.</p><p>Abzhanov, A., and T. C. Kaufman. 2000. Crustacean (malaco-</p><p>stracan) Hox genes and the evolution of the arthropod trunk.</p><p>Development 127:2239–2248.</p><p>Akashi, H. 2001. Gene expression and molecular evolution. Curr.</p><p>Opin. Genet. Dev. 11:660–666.</p><p>Alberts, B., A. Johnson, J. Lewis, M. Raff, K. Roberts, and</p><p>P. Walter. 2002. The molecular biology of the cell. Garland</p><p>Publishing, New York.</p><p>Allendorf, F. W., K. L. Knudsen, and R. F. Leary. 1983.</p><p>Adaptive significance of differences in the tissue-specific</p><p>expression of a phosphoglucomutase gene in rainbow trout.</p><p>Proc. Natl. Acad. Sci. USA 80:1397–1400.</p><p>Allendorf, F. W., K. L. Knudsen, and S. R. Phelps. 1982.</p><p>Identification of a gene regulating the tissue expression of</p><p>a phosphoglucomutase locus in rainbow trout. Genetics</p><p>102:259–268.</p><p>Andersson, C. R., E. O. Jensen, D. J. Llewellyn, E. S. Dennis,</p><p>and W. J. Peacock. 1996. A new hemoglobin gene from</p><p>soybean: a role for hemoglobin in all plants. Proc. Natl. Acad.</p><p>Sci. USA 93:5682–5687.</p><p>Angerer, L. M., D. W. Oleksyn, A. M. Levine, X. Li, W. H.</p><p>Klein, and R. C. Angerer. 2001. Sea urchin goosecoid</p><p>function links fate specification along the animal-vegetal</p><p>and oral-aboral embryonic axes. Development 128:4393–</p><p>4404.</p><p>Anisimov, S. V., M. V. Volkova, L. V. Lenskaya, V. K.</p><p>Khavinson, D. V. Solovieva, and E. I. Schwartz. 2001. Age-</p><p>associated accumulation of the apolipoprotein C-III gene</p><p>T-455C polymorphism C allele in a Russian population. J.</p><p>Gerontol. A. Biol. Sci. Med. Sci. 56:B27–B32.</p><p>Aparicio, S., A. Morrison, A. Gould, J. Gilthorpe, C. Chaudhuri,</p><p>P. Rigby, R. Krumlauf, and S. Brenner. 
1995. Detecting</p><p>conserved regulatory elements with the model genome of the</p><p>Japanese pufferfish, Fugu rubripes. Proc. Natl. Acad. Sci. USA</p><p>92:1684–1688.</p><p>Arbeitman, M. N., E. E. Furlong, F. Imam, E. Johnson, B. H.</p><p>Null, B. S. Baker, M. A. Krasnow, M. P. Scott, R. W. Davis,</p><p>and K. P. White. 2002. Gene expression during the life cycle</p><p>of Drosophila melanogaster. Science 297:2270–2275.</p><p>Arnone, M. I., and E. H. Davidson. 1997. The hardwiring of</p><p>development: organization and function of genomic regulatory</p><p>systems. Development 124:1851–1864.</p><p>Arnone, M. I., E. L. Martin, and E. H. Davidson. 1998. Cis-</p><p>regulation downstream of cell type specification: a single</p><p>compact element controls the complex expression of the CyIIa</p><p>gene in sea urchin embryos. Development 125:1381–1395.</p><p>Atchison, M. L. 1988. Enhancers: mechanisms of action and cell</p><p>specificity. Annu. Rev. Cell Biol. 4:127–153.</p><p>Averof, M., and N. H. Patel. 1997. Crustacean appendage</p><p>evolution associated with changes in Hox gene expression.</p><p>Nature 388:682–686.</p><p>Avila, S., M. C. Casero, R. Fernandez-Canton, and L. Sastre.</p><p>2002. Transactivation domains are not functionally conserved</p><p>between vertebrate and invertebrate serum response factors.</p><p>Eur. J. Biochem. 269:3669–3677.</p><p>Babich, V., N. Aksenov, V. Alexeenko, S. L. Oei, G. Buchlow,</p><p>and N. Tomilin. 1999. Association of some potential hormone</p><p>response elements in human genes with the Alu family</p><p>repeats. Gene 239:341–349.</p><p>Bamshad, M. J., S. Mummidi, E. Gonzalez et al. (11 co-authors).</p><p>2002. A strong signature of balancing selection in the 5′ cis-</p><p>regulatory region of CCR5. Proc. Natl. Acad. Sci. USA</p><p>99:10539–10544.</p><p>Barrier, M., R. H. Robichaux, and M. D. Purugganan. 2001.</p><p>Accelerated regulatory gene evolution in an adaptive</p><p>radiation. Proc. Natl. Acad. Sci. 
USA 98:10208–10213.</p><p>Beckers, J., and D. Duboule. 1998. Genetic analysis of a conserved</p><p>sequence in the HoxD complex: regulatory redundancy</p><p>or limitations of the transgenic approach? Dev. Dyn.</p><p>213:1–11.</p><p>Beldade, P., P. M. Brakefield, and A. D. Long. 2002. Contribution</p><p>of Distal-less to quantitative variation in butterfly</p><p>eyespots. Nature 415:315–318.</p><p>Bell, A. C., and G. Felsenfeld. 1999. Stopped at the border:</p><p>boundaries and insulators. Curr. Opin. Genet. Dev. 9:191–198.</p><p>Bell, S. D., and S. P. Jackson. 1998. Transcription in Archaea.</p><p>Cold Spring Harbor Symp. Quant. Biol. 63:41–51.</p><p>Belting, H.-G., C. S. Shashikant, and F. H. Ruddle. 1998.</p><p>Modification of expression and cis-regulation of Hoxc8 in the</p><p>evolution of diverged axial morphology. Proc. Natl. Acad.</p><p>Sci. USA 95:2355–2360.</p><p>Benbrook, D. M., and N. C. Jones. 1990. Heterodimer formation</p><p>between CREB and JUN proteins. Oncogene 5:295–302.</p><p>Bender, W., M. Akam, F. Karch, P. A. Beachy, M. Peifer, P.</p><p>Spierer, E. B. Lewis, and D. S. Hogness. 1983. Molecular-</p><p>genetics of the bithorax complex in Drosophila melanogaster.</p><p>Science 221:23–29.</p><p>Benecke, A., C. Gaudon, and H. Gronemeyer. 2001. Transcriptional</p><p>integration of hormone and metabolic signals by</p><p>nuclear receptors. Pp. 167–214 in J. Locker, ed. Transcription</p><p>factors. Academic Press, San Diego, Calif.</p><p>Benezra, R., R. L. Davis, D. Lockshon, D. L. Turner, and H.</p><p>Weintraub. 1990. The protein Id: a negative regulator of helix-</p><p>loop-helix DNA-binding proteins. Cell 61:49–59.</p><p>Bergman, C. M., and M. Kreitman. 2001. Analysis of conserved</p><p>noncoding DNA in Drosophila reveals similar constraints in</p><p>intergenic and intronic sequences. Genome Res. 11:1335–1345.</p><p>Berman, B. P., Y. Nibu, B. D. Pfeiffer, P. Tomancak, S. E.</p><p>Celniker, M. Levine, G. M. Rubin, and M. B. Eisen. 
2002.</p><p>Exploiting transcription factor binding site clustering to</p><p>identify cis-regulatory modules involved in pattern formation</p><p>in the Drosophila genome. Proc. Natl. Acad. Sci. USA 99:757–767.</p><p>Bernstein, B. E., J. K. Tong, and S. L. Schreiber. 2000. Genome-</p><p>wide studies of histone deacetylase function in yeast. Proc.</p><p>Natl. Acad. Sci. USA 97:13708–13713.</p><p>Berthelsen, J., V. Zappavigna, E. Feretti, F. Mavilio, and F. Blasi.</p><p>1998. The novel homeoprotein Prep1 modulates Pbx-Hox</p><p>protein cooperativity. EMBO J. 17:1434–1445.</p><p>Betran, E., K. Thornton, and M. Long. 2002. Retroposed new</p><p>genes out of the X in Drosophila. Genome Res. 12:1854–1859.</p><p>Bharathan, G., T. E. Goliber, C. Moore, S. Kessler, T. Pham, and</p><p>N. R. Sinha. 2002. Homologies in leaf form inferred from</p><p>KNOXI gene expression during development. Science 296:1858–1860.</p><p>Bharathan, G., B.-J. Janssen, E. A. Kellogg, and N. Sinha. 1997.</p><p>Did homeodomain proteins duplicate before the origin of</p><p>angiosperms, fungi, and metazoa? Proc. Natl. Acad. Sci. USA</p><p>94:13749–13753.</p><p>Biggin, M. D., and W. McGinnis. 1997. Regulation of segmentation</p><p>and segmental identity by Drosophila homeoproteins:</p><p>the role of DNA binding in functional activity and</p><p>specificity. Development 124:4425–4433.</p><p>Birnbaum, K., P. N. Benfey, and D. E. Shasha. 2001. cis element/</p><p>transcription factor analysis (cis/TF): a method for discovering</p><p>transcription factor/cis element relationships. Genome Res.</p><p>11:1567–1573.</p><p>Blumenthal, T. 1998. Gene clusters and polycistronic transcription</p><p>in eukaryotes. Bioessays 20:480–487.</p><p>Bonifer, C. 2000. 
Developmental regulation of eukaryotic gene</p><p>loci: which cis-regulatory information is required? Trends</p><p>Genet. 16:310–315.</p><p>Brakefield, P. M., J. Gates, D. Keys, F. Kesbeke, P. J. Wijngaarden,</p><p>A. Monteiro, V. French, and S. B. Carroll. 1996. Development,</p><p>plasticity and evolution of butterfly eyespot patterns. Nature</p><p>384:236–242.</p><p>Brem, R. B., G. Yvert, R. Clinton, and L. Kruglyak. 2002.</p><p>Genetic dissection of transcriptional regulation in budding</p><p>yeast. Science 296:752–755.</p><p>Brickman, J. M., M. Clements, R. Tyrell, D. McNay, K. Woods,</p><p>J. Warner, A. Stewart, R. S. P. Beddington, and M. Dattani.</p><p>2001. Molecular effects of novel mutations in Hesx1/HESX1</p><p>associated with human pituitary disorders. Development 128:5189–5199.</p><p>Britten, R. J. 1997. Mobile elements inserted in the distant past</p><p>have taken on important functions. Gene 205:177–182.</p><p>Britten, R. J., and E. H. Davidson. 1969. Gene regulation for</p><p>higher cells: a theory. Science 165:349–357.</p><p>Britten, R. J., and E. H. Davidson. 1971. Repetitive and non-repetitive DNA sequences and</p><p>a speculation on the origins of evolutionary novelty. Q. Rev.</p><p>Biol. 46:111–138.</p><p>Brosius, J. 1999. RNAs from all categories generate retrosequences</p><p>that may be exapted as novel genes or regulatory</p><p>elements. Gene 238:115–134.</p><p>Brunetti, C. R., J. E. Selegue, A. Monteiro, V. French, P. M.</p><p>Brakefield, and S. B. Carroll. 2001. The generation and</p><p>diversification of butterfly eyespot color patterns. Curr. Biol.</p><p>11:1578–1585.</p><p>Buckwold, V. E., Z. C. Xu, T. S. B. Yen, and J. H. Ou. 1997.</p><p>Effects of a frequent double-nucleotide basal core promoter</p><p>mutation and its putative single-nucleotide precursor mutations</p><p>on hepatitis B virus gene expression and replication. J.</p><p>Gen. Virol. 78:2055–2065.</p><p>Budd, G. E. 1999. 
Does evolution in body patterning genes</p><p>drive morphological change—or vice versa? Bioessays 21:</p><p>326–332.</p><p>Buggs, C., N. Nasrin, A. Mode, P. Tollet, H.-F. Zhao, J.-Å.</p><p>Gustafsson, and M. Alexander-Bridges. 1998. IRE-ABP</p><p>(insulin response element-A binding protein), an SRY-like</p><p>protein, inhibits C/EBPα (CCAAT/enhancer-binding</p><p>protein α)-stimulated expression of the sex-specific</p><p>cytochrome P450 2C12 gene. Mol. Endocrinol. 12:1294–</p><p>1309.</p><p>Bürglin, T. 1997. Analysis of TALE superclass homeobox genes</p><p>(MEIS, PBC, KNOX, Iroquois, TGIF) reveals a novel domain</p><p>conserved between plants and animals. Nucleic Acids Res.</p><p>25:4173–4180.</p><p>Burke, A. C., C. E. Nelson, B. A. Morgan, and C. Tabin. 1995.</p><p>Hox genes and the evolution of vertebrate axial morphology.</p><p>Development 121:333–346.</p><p>Burstin, J., D. De Vienne, P. Dubreuil, and C. Damerval. 1994.</p><p>Molecular markers and protein quantities as genetic descrip-</p><p>tors in maize. I. Genetic diversity among 21 inbred lines.</p><p>Theor. Appl. Genet. 89:943–950.</p><p>Bush, R. M., and K. Paigen. 1992. Evolution of β-glucuronidase</p><p>regulation in the genus Mus. Evolution 46:1–15.</p><p>Calhoun, V. C., A. Stathopoulos, and M. Levine. 2002.</p><p>Promoter-proximal tethering elements regulate enhancer-</p><p>promoter specificity in the Drosophila Antennapedia complex.</p><p>Proc. Natl. Acad. Sci. USA 99:9243–9247.</p><p>Carey, M., and S. T. Smale. 2000. Transcriptional regulation in</p><p>eukaryotes: concepts, strategies, and techniques. Cold Spring</p><p>Harbor Laboratory Press, Cold Spring Harbor, New York.</p><p>Carrión, A. M., W. A. Link, F. Ledo, B. Mellström, and J. R.</p><p>Naranjo. 1999. DREAM is a Ca2+-regulated transcriptional</p><p>repressor. Nature 398:80–84.</p><p>Carroll, S. B. 1995. Homeotic genes and the evolution of</p><p>arthropods and chordates. Nature 376:479–485.</p><p>———. 2000. 
Endless forms: the evolution of gene regulation</p><p>and morphological diversity. Cell 101:577–580.</p><p>Carroll, S. B., J. K. Grenier, and S. D. Weatherbee. 2001. From</p><p>DNA to diversity: molecular genetics and the evolution of</p><p>animal design. Blackwell Science, Malden, Mass.</p><p>Caspi, A., J. McClay, T. E. Moffitt, J. Mill, J. Martin, I. W. Craig,</p><p>A. Taylor, and R. Poulton. 2002. Role of genotype in the cycle</p><p>of violence in maltreated children. Science 297:851–854.</p><p>Cavalieri, D., J. P. Townsend, and D. L. Hartl. 2000. Mani-</p><p>fold anomalies in gene expression in a vineyard isolate of</p><p>Saccharomyces cerevisiae revealed by DNA microarray</p><p>analysis. Proc. Natl. Acad. Sci. USA 97:12369–12374.</p><p>Cavener, D. R. 1992. Transgenic animal studies on the evolution</p><p>of genetic regulatory circuitries. Bioessays 14:237–244.</p><p>Cereb, N., and S. Y. Yang. 1994. The regulatory complex of</p><p>HLA class I promoters exhibits locus-specific conservation</p><p>with limited allelic variation. J. Immunol. 152:3873–3883.</p><p>Chen, G., and A. J. Courey. 2000. Groucho/TLE family proteins</p><p>and transcriptional repression. Gene 249:1–16.</p><p>Chiu, C. H., H. Schneider, J. L. Slightom, D. L. Gumucio, and</p><p>M. Goodman. 1997. Dynamics of regulatory evolution in</p><p>primate beta-globin gene clusters: cis-mediated acquisition of</p><p>simian gamma fetal expression patterns. Gene 205:47–57.</p><p>Choo, Y., and A. Klug. 1997. Physical basis of a protein-DNA</p><p>recognition code. Curr. Opin. Struct. Biol. 7:117–125.</p><p>Christophides, G. K., I. Livadras, C. Savakis, and K. Komito-</p><p>polou. 2000. Two medfly promoters that have originated by</p><p>recent gene duplications drive distinct sex, tissue and temporal</p><p>expression patterns. Genetics 156:173–182.</p><p>Chung, Y. D., H. C. Kwon, K.W. Chung, S. J. Kim, K. J. Kim, and</p><p>C. C. Lee. 1996. 
Identification of ovarian enhancer-binding</p><p>factors which bind to ovarian enhancer 1 of the Drosophila</p><p>genes yp1 and yp2. Mol. Gen. Genet. 251:347–351.</p><p>Clark, A. G. 1990. Genetic components of variation in energy</p><p>storage in Drosophila melanogaster. Evolution 44:637–650.</p><p>Coller, H. A., C. Grandori, P. Tamayo, T. Colbert, E. S. Lander,</p><p>R. N. Eisenman, and T. R. Golub. 2000. Expression analysis</p><p>with oligonucleotide microarrays reveals that MYC regulates</p><p>genes involved in growth, cell cycle, signaling, and adhesion.</p><p>Proc. Natl. Acad. Sci. USA 97:3260–3265.</p><p>Conlon, F. L., L. Fairclough, B. M. J. Price, E. S. Casey, and</p><p>J. C. Smith. 2001. Determinants of T box protein specificity.</p><p>Development 128:3749–3758.</p><p>Cooper, D. N. 1999. Human gene evolution. Academic Press,</p><p>San Diego, Calif.</p><p>Core, N., B. Charroux, A. McCormick, C. Vola, L. Fasano, M. P.</p><p>Scott, and S. Kerridge. 1997. Transcriptional regulation of</p><p>the Drosophila homeotic gene teashirt by the homeodomain</p><p>protein Fushi tarazu. Mech. Dev. 68:157–172.</p><p>Costa, P., and C. Plomion. 1999. Genetic analysis of needle</p><p>proteins in maritime pine. 2. Variation in protein accumula-</p><p>tion. Silvae Genet. 48:146–150.</p><p>Courey, A. J. 2001. Regulatory transcription factors and cis-</p><p>regulatory regions. Pp. 17–34 in J. Locker, ed. Transcription</p><p>factors. Academic Press, San Diego, Calif.</p><p>Cowell, L. G., T. B. Kepler, M. Janitz, R. Lauster, and N. A.</p><p>Mitchison. 1998. The distribution of variation in regulatory</p><p>gene segments, as present in MHC class II promoters.</p><p>Genome Res. 8:124–134.</p><p>Cowen, L. E., D. Sanglard, D. Calabrese, C. Sirjusingh, J. B.</p><p>Anderson, and L. M. Kohn. 2000. 
Evolution of drug re-</p><p>sistance in experimental populations of Candida albicans.</p><p>J. Bacteriol. 182:1515–1522.</p><p>Cowles, C. R., J. N. Hirshhorn, D. Altshuler, and E. S. Lander.</p><p>2002. Detection of regulatory variation in mouse genes. Nat.</p><p>Genet. 32:432–437.</p><p>Crawford, D. L., J. A. Segal, and J. L. Barnett. 1999.</p><p>Evolutionary analysis of TATA-less proximal promoter</p><p>function. Mol. Biol. Evol. 16:194–207.</p><p>Czerny, T., G. Schaffner, and M. Busslinger. 1993. DNA</p><p>sequence recognition by pax proteins: bipartite structure of</p><p>the paired domain and its binding site. Genes Dev. 7:2048–</p><p>2061.</p><p>Daborn, P. J., J. L. Yen, M. R. Bogwitz, G. L. Goff, E. Feil, S.</p><p>Jeffers, N. Tijet, T. Perry, D. Heckel, P. Batterham, R.</p><p>Feyereisen, T. G. Wilson, and R. H. French-Constant. 2002. A</p><p>single P450 allele associated with insecticide resistance in</p><p>Drosophila. Science 297:2253–2256.</p><p>Dailey, L., and C. Basilico. 2001. Coevolution of HMG domains</p><p>and homeodomains and the generation of transcriptional</p><p>regulation by Sox/POU complexes. J. Cell. Physiol. 186:315–</p><p>328.</p><p>Damerval, C., A. Maurice, J. M. Josse, and D. De Vienne. 1994.</p><p>Quantitative trait loci underlying gene product variation:</p><p>a novel perspective for analyzing regulation of genome</p><p>expression. Genetics 137:289–301.</p><p>Darwin, C. 1859. On the origin of species by means of natural</p><p>selection. John Murray, London.</p><p>Davidson, E. H. 2001. Genomic regulatory systems: development</p><p>and evolution. Academic Press, San Diego, Calif.</p><p>Dawes, R., I. Dawson, F. Falciani, G. Tear, and M. Akam. 1994.</p><p>Dax, a locust hox gene related to fushi-tarazu but showing no</p><p>pair-rule expression. Development 120:1561–1572.</p><p>Dawson, S. J., P. J. Morris, and D. S. Latchman. 1996. A single</p><p>amino acid change converts a repressor into an activator.</p><p>J. Biol. Chem. 
271:11631–11633.</p><p>De Vienne, D., B. Bost, J. Fievet, M. Zivy, and C. Dillmann.</p><p>2001. Genetic variability of proteome expression and</p><p>metabolic control. Plant Physiol. Biochem. 39:271–283.</p><p>D’Elia, A. V., G. Tell, I. Paron, L. Pellizzari, R. Lonigro, and G.</p><p>Damante. 2002. Missense mutations of human homeoboxes:</p><p>a review. Hum. Mutat. 18:361–374.</p><p>Denizot, Y., E. Pinaud, C. Aupetit, C. Le Morvan, E.</p><p>Magnoux, J. C. Aldigier, and M. Cogne. 2001. Polymor-</p><p>phism of the human alpha1 immunoglobulin gene 3′ enhancer</p><p>hs1,2 and its relation to gene expression. Immunology</p><p>103:35–40.</p><p>Dermitzakis, E. T., and A. G. Clark. 2002. Evolution of</p><p>transcription factor binding sites in mammalian gene regula-</p><p>tory regions: conservation and turnover. Mol. Biol. Evol.</p><p>19:1114–1121.</p><p>DeRobertis, E. M., and Y. Sasai. 1996. A common plan for</p><p>dorsoventral patterning in Bilateria. Nature 380:37–40.</p><p>Di Gregorio, A., J. C. Corbo, and M. Levine. 2001. The</p><p>regulation of forkhead/HNF-3 beta expression in the Ciona</p><p>embryo. Dev. Biol. 229:31–43.</p><p>Dickinson, W. J. 1988. On the architecture of regulatory systems:</p><p>evolutionary insights and implications. Bioessays 8:204–208.</p><p>DiLeone, R. J., L. B. Russell, and D. M. Kingsley. 1998. An</p><p>extensive 3′ regulatory region controls expression of Bmp5 in</p><p>specific anatomical structures of the mouse embryo. Genetics</p><p>148:401–408.</p><p>Dillon, N., and P. Sabbattini. 2000. Functional gene expression</p><p>domains: defining the functional unit of eukaryotic gene</p><p>regulation. Bioessays 22:657–665.</p><p>Dobzhansky, T. 1936. Studies on hybrid sterility. II. Localization</p><p>of sterility factors in Drosophila pseudoobscura hybrids.</p><p>Genetics 21:113–135.</p><p>Doebley, J., and L. Lukens. 1998. Transcriptional regulators and</p><p>the evolution of plant form. Plant Cell 10:1075–1082.</p><p>Droge, P., and B. 
Muller-Hill. 2001. High local protein con-</p><p>centrations at promoters: strategies in prokaryotic and eu-</p><p>karyotic cells. Bioessays 23:179–183.</p><p>Duboule, D. 1994. Guidebook to the homeobox genes. Oxford</p><p>University Press, Oxford.</p><p>Duboule, D., and A. S. Wilkins. 1998. The evolution of</p><p>‘bricolage.’ Trends Genet. 14:54–59.</p><p>Dudareva, N., L. Cseke, V. M. Blanc, and E. Pichersky. 1996.</p><p>Evolution of floral scent in Clarkia: novel patterns of S-</p><p>linalool synthase gene expression in the C. breweri flower.</p><p>Plant Cell 8:1137–1148.</p><p>Dynan, W. S. 1989. Modularity in promoters and enhancers. Cell</p><p>58:1–4.</p><p>Enard, W., P. Khaitovich, J. Klose et al. (13 co-authors). 2002a.</p><p>Intra- and interspecific variation in primate gene expression</p><p>patterns. Science 296:340–343.</p><p>Enard, W., M. Przeworski, S. E. Fisher, C. S. L. Lai, V. Wiebe,</p><p>T. Kitano, A. P. Monaco, and S. Pääbo. 2002b. Molecular</p><p>evolution of FOXP2, a gene involved in speech and language.</p><p>Nature 418:869–872.</p><p>Fairall, L., and J. W. R. Schwabe. 2001. DNA binding by</p><p>transcription factors. Pp. 65–84 in J. Locker, ed. Transcription</p><p>factors. Academic Press, San Diego, Calif.</p><p>Falciani, F., B. Hausdorf, R. Schröder, M. Akam, D. Tautz, R.</p><p>Denell, and S. Brown. 1996. Class 3 Hox genes in insects</p><p>and the origin of zen. Proc. Natl. Acad. Sci. USA 93:8479–</p><p>8484.</p><p>Fang, H., and B. P. Brandhorst. 1996. Expression of the actin</p><p>gene family in embryos of the sea urchin Lytechinus pictus.</p><p>Dev. Biol. 173:306–317.</p><p>Fang, S., A. Takahashi, and C.-I. Wu. 2002. A mutation in the</p><p>promoter of desaturase 2 is correlated with sexual isolation</p><p>between Drosophila behavioral races. Genetics 162:781–784.</p><p>Ferea, T. L., D. Botstein, P. O. Brown, and R. F. Rosenzweig.</p><p>1999. Systematic changes in gene expression patterns</p><p>following adaptive evolution in yeast. Proc. 
Natl. Acad. Sci.</p><p>USA 96:9721–9726.</p><p>Ferkowicz, M. J., and R. Raff. 2001. Wnt gene expression in sea</p><p>urchin development: heterochronies associated with the</p><p>evolution of developmental mode. Evol. Dev. 3:24–33.</p><p>Ferrigno, O., T. Virolle, Z. Djabari, J. P. Ortonne, R. J. White,</p><p>and D. Aberdam. 2001. Transposable B2 SINE elements can</p><p>provide mobile RNA polymerase II promoters. Nat. Genet.</p><p>28:77–81.</p><p>Ferris, S. D., and G. S. Whitt. 1979. Evolution of the differential</p><p>regulation of duplicate genes following polyploidization. J.</p><p>Mol. Evol. 12:267–317.</p><p>Fisher, R. A. 1930. The genetical theory of natural selection.</p><p>Clarendon Press, Oxford.</p><p>Fitch, D. H. A. 1997. Evolution of male tail development in</p><p>rhabditid nematodes related to Caenorhabditis elegans. Syst.</p><p>Biol. 46:145–179.</p><p>Fitch, W. M., and E. Markowitz. 1970. An improved method for</p><p>determining codon variability in a gene and its application to</p><p>the rate of fixation of mutations in evolution. Biochem. Genet.</p><p>4:579–593.</p><p>Flores-Saaib, R. D., S. Jia, and A. J. Courey. 2001. Activation</p><p>and repression by the C-terminal domain of Dorsal. De-</p><p>velopment 128:1869–1879.</p><p>Force, A., M. Lynch, F. B. Pickett, A. Amores, Y.-L. Yan, and</p><p>J. Postlethwait. 1999. Preservation of duplicate genes by</p><p>complementary, degenerative mutations. Genetics 151:1531–</p><p>1545.</p><p>Foulkes, N. S., and P. Sassone-Corsi. 1992. More is better:</p><p>activators and repressors from the same gene. Cell 68:411–</p><p>414.</p><p>Frasch, M., X. Chen, and T. Lufkin. 1995. 
Evolutionary-</p><p>conserved enhancers direct region-specific expression of the</p><p>murine Hoxa-1 and Hoxa-2 loci in both mice and Drosophila.</p><p>Development 121:957–974.</p><p>Frazer, K. A., J. B. Sheehan, R. P. Stokowski, X. Chen, R.</p><p>Hosseini, J. F. Cheng, S. P. Fodor, D. R. Cox, and N. Patil.</p><p>2001. Evolutionarily conserved sequences on human chro-</p><p>mosome 21. Genome Res. 11:1651–1659.</p><p>Fry, C. J., and P. J. Farnham. 1999. Context-dependent</p><p>transcriptional regulation. J. Biol. Chem. 274:29583–29586.</p><p>Fu, Y.-X., and W.-H. Li. 1993. Statistical tests of neutrality of</p><p>mutations. Genetics 133:693–709.</p><p>Fullerton, S. M., A. Bartoszewicz, G. Ybazeta, Y. Horikawa,</p><p>G. I. Bell, K. K. Kidd, N. J. Cox, R. R. Hudson, and A. Di</p><p>Rienzo. 2002. Geographic and haplotype structure of can-</p><p>didate type 2 diabetes-susceptibility variants at the calpain-10</p><p>locus. Am. J. Hum. Genet. 70:1096–1106.</p><p>Galant, R., and S. B. Carroll. 2002. Evolution of a transcriptional</p><p>repression domain in an insect Hox protein. Nature 415:910–</p><p>913.</p><p>Gehring, W. J., and K. Ikeo. 1999. Pax6: mastering eye mor-</p><p>phogenesis and eye evolution. Trends Genet. 15:371–377.</p><p>Gerard, M., J. Zakany, and D. Duboule. 1997. Interspecies</p><p>exchange of a Hoxd enhancer in vivo induces premature</p><p>transcription and anterior shift of the sacrum. Dev. Biol.</p><p>190:32–40.</p><p>Gerber, S., F. Fabre, and C. Planchon. 2000. Genetics of seed</p><p>quality in soybean analysed by capillary gel electrophoresis.</p><p>Plant Sci. 152:181–189.</p><p>Gerhart, J., and M. Kirschner. 1997. Cells, embryos, and</p><p>evolution: toward a cellular and developmental understanding</p><p>of phenotypic variation and evolutionary adaptability. Black-</p><p>well Science, Malden, Mass.</p><p>Gibson,</p><p>G. 1996. Epistasis and pleiotropy as natural properties of</p><p>transcriptional regulation. Theor. Popul. Biol. 
49:58–89.</p><p>Gilbert, S. F. 2000. Developmental biology. Sinauer Associates,</p><p>Sunderland, Mass.</p><p>———. 2001. Ecological developmental biology: developmental</p><p>biology meets the real world. Dev. Biol. 233:1–12.</p><p>Gill, G., and M. Ptashne. 1987. Mutants of GAL4 protein altered</p><p>in an activation function. Cell 51:121–126.</p><p>Gillespie, J. H. 1991. The causes of molecular evolution. Oxford</p><p>University Press, New York.</p><p>Giordano, M., C. Marchetti, E. Chiorboli, G. Bona, and P.</p><p>Momigliano Richiardi. 1997. Evidence for gene conver-</p><p>sion in the generation of extensive polymorphism in the</p><p>promoter of the growth hormone gene. Hum. Genet. 100:</p><p>249–255.</p><p>Glass, C. K., D. W. Rose, and M. G. Rosenfeld. 1997. Nuclear</p><p>receptor coactivators. Curr. Opin. Cell Biol. 9:222–232.</p><p>Gonzalez, P., P. V. Rao, S. B. Nunez, and J. S. Zigler, Jr. 1995.</p><p>Evidence for independent recruitment of zeta-crystallin/</p><p>quinone reductase (CRYZ) as a crystallin in camelids and</p><p>hystricomorph rodents. Mol. Biol. Evol. 12:773–781.</p><p>Goodyer, C. G., G. Zogopolos, G. Schwartzbauer, H. Zheng,</p><p>G. N. Hendy, and R. K. Menon. 2001. Organization and</p><p>evolution of the human growth hormone receptor 5′-flanking</p><p>region. Endocrinology 142:1923–1934.</p><p>Grandori, C., S. M. Cowley, L. P. James, and R. N. Eisenman.</p><p>2000. The Myc/Max/Mad network and the transcriptional</p><p>control of cell behavior. Annu. Rev. Cell. Dev. Biol. 16:653–</p><p>699.</p><p>Gray, S., and M. Levine. 1996. Transcriptional repression in</p><p>development. Curr. Opin. Cell Biol. 8:358–364.</p><p>Grbic, M., L. M. Nagy, and M. R. Strand. 1998. Polyembryonic</p><p>insect development: insect pattern formation in a cellularised</p><p>environment. Dev. Genes Evol. 208:69–81.</p><p>Grosveld, F., M. Antoniou, M. Berry et al. (16 co-authors). 1993.</p><p>The regulation of human globin gene switching. Phil. Trans.</p><p>R. Soc. Lond. 
Ser. B 339:183–191.</p><p>Gstaiger, M., L. Knoepfl, O. Georgiev, W. Schaffner, and C. M.</p><p>Hovens. 1995. A B-cell co-activator of octamer-binding</p><p>transcription factors. Nature 373:360–362.</p><p>Gu, W., and R. G. Roeder. 1997. Activation of p53 sequence-</p><p>specific DNA binding by acetylation of the p53 C-terminal</p><p>domain. Cell 90:595–606.</p><p>Gu, Z., D. Nicolae, H. H.-S. Lu, and W.-H. Li. 2002. Rapid</p><p>divergence in expression between duplicate genes inferred</p><p>from microarray data. Trends Genet. 18:609–613.</p><p>Guardiola, J., A. Maffei, R. Lauster, N. A. Mitchison, R. S.</p><p>Accolla, and S. Sartoris. 1996. Functional significance of</p><p>polymorphism among MHC class II gene promoters. Tissue</p><p>Antigens 48:615–625.</p><p>Hahn, M. W., M. D. Rausher, and C. W. Cunningham. 2002.</p><p>Distinguishing between selection and population expansion</p><p>in an experimental lineage of bacteriophage T7. Genetics</p><p>161:11–20.</p><p>Hahn, M. W., J. E. Stajich, and G. A. Wray. 2003. The effects of</p><p>selection against spurious transcription factor binding sites.</p><p>Mol. Biol. Evol. (in press).</p><p>Hamblin, M. T., and A. DiRienzo. 2000. Detection of the</p><p>signature of natural selection in humans: evidence from the</p><p>Duffy blood group locus. Am. J. Hum. Genet. 66:1669–1679.</p><p>Hancock, J., P. Shaw, F. Benneton, and G. Dover. 1999. High</p><p>sequence turnover in the regulatory regions of the de-</p><p>velopmental gene hunchback in insects. Mol. Biol. Evol.</p><p>16:253–265.</p><p>Hanes, S. D., G. Riddihough, D. Ish-Horowicz, and R. Brent.</p><p>1994. Specific DNA recognition and intersite spacing are</p><p>critical for action of the bicoid morphogen. Mol. Cell. Biol.</p><p>14:3364–3375.</p><p>Hanna-Rose, W., and U. Hansen. 1996. Active repression</p><p>mechanisms of eukaryotic transcription factors. Trends Genet.</p><p>12:229–234.</p><p>Hardison, R. C. 2000. 
Conserved noncoding sequences are</p><p>reliable guides to regulatory elements. Trends Genet. 16:369–</p><p>372.</p><p>Hariri, A. R., V. S. Mattay, A. Tessitore, B. Kolachana, F. Fera,</p><p>D. Goldman, M. F. Egan, and D. R. Weinberger. 2002.</p><p>Serotonin transporter genetic variation and the response of the</p><p>human amygdala. Science 297:400–403.</p><p>Harrison, S. C. 1991. A structural taxonomy of DNA-binding</p><p>domains. Nature 353:715–719.</p><p>Haudek, S. B., B. E. Natmessnig, H. Redl, G. Schlag, and B. P.</p><p>Giroir. 1998. Genetic sequences and transcriptional regulation</p><p>of the TNFA promoter: comparison of human and baboon.</p><p>Immunogenetics 48:202–207.</p><p>Hill, T. A., C. D. Day, S. C. Zondlo, A. G. Thackeray, and V. F.</p><p>Irish. 1998. Discrete spatial and temporal cis-acting elements</p><p>regulate transcription of the Arabidopsis floral homeotic gene</p><p>APETALA3. Development 125:1711–1721.</p><p>Hizver, J., H. Rozenberg, F. Frolow, D. Rabinovich, and Z.</p><p>Shakked. 2001. DNA bending by an adenine-thymine tract</p><p>and its role in gene regulation. Proc. Natl. Acad. Sci. USA</p><p>98:8490–8495.</p><p>Hodgkin, J. 1992. Genetic sex determination mechanisms and</p><p>evolution. Bioessays 14:253–261.</p><p>Holland, N. D., and L. Z. Holland. 1999. Amphioxus and the</p><p>utility of molecular genetic data for hypothesizing body part</p><p>homologies between distantly related animals. Am. Zool.</p><p>39:630–640.</p><p>Holstege, F. C. P., E. G. Jennings, J. J. Wyrick, T. I. Lee, C. J.</p><p>Hengartner, M. R. Green, T. R. Golub, E. S. Lander, and</p><p>R. A. Young. 1998. Dissecting the regulatory circuitry of</p><p>a eukaryotic genome. Cell 95:717–728.</p><p>Hope, I. A., and K. Struhl. 1986. 
Functional dissection of</p><p>a eukaryotic transcriptional activator, GCN4 of yeast. Cell</p><p>46:885–894.</p><p>Houchens, C. R., W. Montigny, L. Zeltser, L. Dailey, J. M.</p><p>Gilbert, and N. H. Heintz. 2000. The dhfr ori beta-binding</p><p>protein RIP60 contains 15 zinc fingers: DNA binding and</p><p>looping by the central three fingers and an associated proline-</p><p>rich region. Nucleic Acids Res. 28:570–581.</p><p>Hudson, R. R., M. Kreitman, and M. Aguade. 1987. A test</p><p>of neutral molecular evolution based on nucleotide data.</p><p>Genetics 116:153–159.</p><p>Hunt, G. M., D. Johnson, and C. T. Tiemesse. 2001. Character-</p><p>isation of the long terminal repeat regions of South African</p><p>human immunodeficiency virus type 1 isolates. Virus Genes</p><p>23:27–34.</p><p>Indovina, P., F. Megiorni, P. Ferrante, I. Apollonio, F.</p><p>Petronzelli, and M. C. Mazzilli. 1998. Different binding of</p><p>NF-Y transcriptional factor to DQA1 promoter variants. Hum.</p><p>Immunol. 59:758–767.</p><p>Iyer, V. R., C. E. Horak, C. S. Scafe, D. Botstein, M. Snyder, and</p><p>P. O. Brown. 2001. Genomic binding sites of the yeast</p><p>cell-cycle transcription factors SBF and MBF. Nature 409:</p><p>533–538.</p><p>Jackson, S. P., and R. Tjian. 1988. O-glycosylation of eukaryotic</p><p>transcription factors: implications for mechanisms of tran-</p><p>scriptional regulation. Cell 55:125–133.</p><p>Jackson-Fisher, A. J., C. Chitikila, M. Mitra, and B. F. Pugh.</p><p>1999. A role for TBP dimerization in preventing unregulated</p><p>gene expression. Mol. Cell 3:717–727.</p><p>Jacob, F. 1977. Evolution and tinkering. Science 196:1161–</p><p>1166.</p><p>Jacob, F., and J. Monod. 1961. On the regulation of gene activity.</p><p>Cold Spring Harbor Symp. Quant. Biol. 26:193–211.</p><p>Jacobs, J. J., and M. Van Lohuizen. 1999. Cellular memory of</p><p>transcriptional states by polycomb-group proteins. Semin.</p><p>Cell. Dev. Biol. 10:227–235.</p><p>James, L., and R. N. Eisenman. 
2002. Myc and Mad bHLHZ</p><p>domains possess identical DNA-binding specificities but only</p><p>partially overlapping functions in vivo. Proc. Natl. Acad. Sci.</p><p>USA 99:10429–10434.</p><p>Jareborg, N., E. Birney, and R. Durbin. 1999. Comparative</p><p>analysis of noncoding regions of 77 orthologous mouse and</p><p>human gene pairs. Genome Res. 9:815–824.</p><p>Jaynes, J. B., and P. H. O’Farrell. 1991. Active repression of</p><p>transcription by the engrailed homeodomain protein. EMBO</p><p>J. 10:1427–1433.</p><p>Jeeninga, R. E., M. Hoogenkamp, M. Armand-Ugon, M. De</p><p>Baar, K. Verhoef, and B. Berkhout. 2000. Functional</p><p>differences between the long terminal repeat transcriptional</p><p>promoters of human immunodeficiency virus type 1 subtypes</p><p>A through G. J. Virol. 74:3740–3751.</p><p>Jenkins, D. L., C. A. Ortori, and J. F. Y. Brookfield. 1995. A</p><p>test for adaptive change in DNA sequences controlling</p><p>transcription. Proc. R. Soc. Lond. Ser. B Biol. Sci. 261:203–</p><p>207.</p><p>Jin, W., R. M. Riley, R. D. Wolfinger, K. P. White, G. Passador-</p><p>Gurgel, and G. Gibson. 2001. The contributions of sex,</p><p>genotype and age to transcriptional variance in Drosophila</p><p>melanogaster. Nat. Genet. 29:389–395.</p><p>Johnson, N. A., and A. H. Porter. 2000. Rapid speciation via</p><p>parallel, directional selection on regulatory genetic pathways.</p><p>J. Theor. Biol. 205:527–542.</p><p>Jones, P. A., and D. Takai. 2001. The role of DNA methylation in</p><p>mammalian epigenetics. Science 293:1068–1070.</p><p>Jones, S., P. Van Heyningen, H. M. Berman, and J. M. Thornton.</p><p>1999. Protein–DNA interactions: a structural analysis. J. Mol.</p><p>Biol. 287:877–896.</p><p>Jordan, I. K., and J. F. McDonald. 1998. Interelement selection in</p><p>the regulator region of the copia retrotransposon. J. Mol. Evol.</p><p>47:670–676.</p><p>Kadosh, D., and K. Struhl. 1998. 
Targeted recruitment of the</p><p>Sin3-Rpd3 histone deacetylase complex generates a highly</p><p>localized domain of repressed chromatin in vivo. Mol. Cell.</p><p>Biol. 18:5121–5127.</p><p>Kajiya, Y., K. Hamasaki, K. Nakata et al. (11 co-authors). 2001.</p><p>A long-term follow-up analysis of serial core promoter and</p><p>precore sequences in Japanese patients chronically infected by</p><p>hepatitis B virus. Digest. Dis. Sci. 46:509–515.</p><p>Kammandel, B., K. Chowdhury, A. Stoykova, S. Aparicio, S.</p><p>Brenner, and P. Gruss. 1999. Distinct cis-essential modules</p><p>direct the time-space pattern of the Pax6 gene activity. Dev.</p><p>Biol. 205:79–97.</p><p>Karp, C. L., A. Grupe, E. Schadt et al. (13 co-authors). 2000.</p><p>Identification of complement factor 5 as a susceptibility</p><p>locus for experimental allergic asthma. Nat. Immunol. 1:</p><p>221–226.</p><p>Kayo, T., D. B. Allison, R. Weindruch, and T. A. Prolla. 2001.</p><p>Influences of aging and caloric restriction on the transcrip-</p><p>tional profile of skeletal muscle from rhesus monkeys. Proc.</p><p>Natl. Acad. Sci. USA 98:5093–5098.</p><p>Kazazian, H. H. 1990. The thalassemia syndromes: molecular</p><p>basis and prenatal diagnosis in 1990. Semin. Hematol. 27:</p><p>209–228.</p><p>Keys, D. N., D. L. Lewis, J. E. Selegue, B. J. Pearson, L. V.</p><p>Goodrich, R. L. Johnson, J. Gates, M. P. Scott, and S. B.</p><p>Carroll. 1999. Recruitment of a hedgehog regulatory circuit in</p><p>butterfly eyespot evolution. Science 283:532–534.</p><p>Kidwell, M. G., and D. Lisch. 1997. Transposable elements as</p><p>sources of variation in animals and plants. Proc. Natl. Acad.</p><p>Sci. USA 94:7704–7711.</p><p>Kim, J. 2001. Macro-evolution of the hairy enhancer in</p><p>Drosophila species. J. Exp. Zool. 291:175–185.</p><p>Kim, J., J. Q. Kerr, and G.-S. Min. 2000. Molecular heterochrony</p><p>in the early development of Drosophila. Proc. Natl. Acad. Sci.</p><p>USA 97:212–216.</p><p>Kimura, M. 1983. 
The neutral theory of molecular evolution.</p><p>Cambridge University Press, Cambridge.</p><p>King, M. C., and A. C. Wilson. 1975. Evolution at two levels in</p><p>humans and chimpanzees. Science 188:107–116.</p><p>Kirchhamer, C. V., L. D. Bogarad, and E. H. Davidson. 1996.</p><p>Developmental expression of synthetic cis-regulatory systems</p><p>composed of spatial control elements from two different</p><p>genes. Proc. Natl. Acad. Sci. USA 93:13849–13854.</p><p>Kirchhamer, C. V., C.-H. Yuh, and E. H. Davidson. 1996.</p><p>Modular cis-regulatory organization of developmentally ex-</p><p>pressed genes: two genes transcribed territorially in the sea</p><p>urchin embryo, and additional examples. Proc. Natl. Acad.</p><p>Sci. USA 93:9322–9328.</p><p>Kissinger, J. C., and R. A. Raff. 1998. Evolutionary changes in</p><p>sites and timing of actin gene expression in embryos of</p><p>the direct- and indirect-developing sea urchins, Heliocidaris</p><p>erythrogramma and H. tuberculata. Dev. Genes Evol.</p><p>208:82–93.</p><p>Klarenberg, A. J., K. Sikkema, and W. Scharloo. 1987.</p><p>Functional significance of regulatory map and structural amy</p><p>variants in Drosophila melanogaster. Heredity 58:383–389.</p><p>Klein, W. H., and X. T. Li. 1999. Function and evolution of Otx</p><p>proteins. Biochem. Biophys. Res. Commun. 258:229–233.</p><p>Klose, J., C. Nock, M. Herrmann et al. (13 co-authors). 2002.</p><p>Genetic analysis of the mouse brain proteome. Nat. Genet.</p><p>30:385–393.</p><p>Kmita, M., T. Kondo, and D. Duboule. 2000. Targeted inversion</p><p>of a polar silencer within the HoxD complex re-allocates</p><p>domains of enhancer sharing. Nat. Genet. 26:451–454.</p><p>Knoepfler, P. S., and M. P. Kamps. 1995. 
The pentapeptide motif</p><p>of hox proteins is required for cooperative DNA binding with</p><p>pbx1, physically contacts pbx1 and enhances DNA binding by</p><p>pbx1. Mol. Cell. Biol. 15:5811–5819.</p><p>Kopp, A., I. Duncan, and S. B. Carroll. 2000. Genetic control and</p><p>evolution of sexually dimorphic characters in Drosophila.</p><p>Nature 408:553–559.</p><p>Korneev, S., and M. O’Shea. 2002. Evolution of nitric oxide</p><p>synthase regulatory genes by DNA inversion. Mol. Biol. Evol.</p><p>19:1228–1233.</p><p>Kramer, S. G., T. M. Jinks, P. Schedl, and J. P. Gergen. 1999.</p><p>Direct activation of Sex-lethal transcription by the Drosophila</p><p>runt protein. Development 126:191–200.</p><p>Kuras, L., and K. Struhl. 1999. Binding of TBP to promoters in</p><p>vivo is stimulated by activators and requires Pol II holo-</p><p>enzyme. Nature 399:609–613.</p><p>Lander, E. S., L. M. Linton, and B. Birren et al. (242 co-authors).</p><p>2001. Initial sequencing and analysis of the human genome.</p><p>Nature 409:860–921.</p><p>Latchman, D. S. 1998. Eukaryotic transcription factors. Aca-</p><p>demic Press, San Diego, Calif.</p><p>Laurie-Ahlberg, C. C., and G. C. Bewley. 1983. Naturally</p><p>occurring genetic variation affecting the expression of</p><p>sn-glycerol-3-phosphate dehydrogenase in Drosophila</p><p>melanogaster. Biochem. Genet. 21:943–961.</p><p>Laurie-Ahlberg, C. C., G. Maroni, G. C. Bewley, J. C.</p><p>Lucchesi, and B. S. Weir. 1980. Quantitative genetic</p><p>variation of enzyme activities in natural populations of</p><p>Drosophila melanogaster. Proc. Natl. Acad. Sci. USA</p><p>77:1073–1077.</p><p>Lawton-Rauh, A. L., E. R. Alvarez-Buylla, and M. D.</p><p>Purugganan. 2000. Molecular evolution of flower develop-</p><p>ment. Trends Ecol. Evol. 15:144–149.</p><p>Lee, H., S. N. Cho, H. E. Bang, J. H. Lee, G. H. Bai, S. J. Kim,</p><p>and J. D. Kim. 2000. 
Exclusive mutations related to isoniazid</p><p>and ethionamide resistance among Mycobacterium tubercu-</p><p>losis isolates from Korea. Eur. J. Clin. Microbiol. Infect. Dis.</p><p>17:508–511.</p><p>Lee, J. E. 1997. Basic helix-loop-helix genes in neural de-</p><p>velopment. Curr. Opin. Neurobiol. 7:13–20.</p><p>Lee, T. I., N. J. Rinaldi, R. Robert et al. (20 co-authors). 2002.</p><p>Transcriptional regulatory networks in Saccharomyces cer-</p><p>evisiae. Science 298:799–804.</p><p>Lee, T. I., and R. A. Young. 2000. Transcription of eukaryotic</p><p>protein-coding genes. Annu. Rev. Genet. 34:77–137.</p><p>Lemon, B., and R. Tjian. 2000. Orchestrated response: a</p><p>symphony of transcription factors for gene control. Genes</p><p>Dev. 14:2551–2569.</p><p>Lerman, D. N., P. Michalak, A. B. Helin, B. R. Bettencourt, and</p><p>M. E. Feder. 2003. Modification of heat-shock gene ex-</p><p>pression in Drosophila melanogaster populations via trans-</p><p>posable elements. Mol. Biol. Evol. 20:135–144.</p><p>Lescot, M., P. Dehais, G. Thijs, K. Marchal, Y. Moreau, Y. van</p><p>de Peer, P. Rouze, and S. Rombauts. 2002. PlantCARE,</p><p>a database of plant cis-acting regulatory elements and a portal</p><p>to tools for in silico analysis of promoter sequences. Nucleic</p><p>Acids Res. 30:325–327.</p><p>Lettice, L. A., T. Horikoshi, S. J. H. Heaney et al. (18 co-</p><p>authors). 2002. Disruption of a long-range cis-acting regulator</p><p>for Shh causes preaxial polydactyly. Proc. Natl. Acad. Sci.</p><p>USA 99:7548–7553.</p><p>Lewin, B. 2000. Genes VII. Oxford University Press, Oxford.</p><p>Lewis, E. B. 1978. Gene complex controlling segmentation in</p><p>Drosophila. Nature 276:565–570.</p><p>Li, Q. M., and S. A. Johnston. 2001. Are all DNA binding and</p><p>transcription regulation by an activator physiologically</p><p>relevant? Mol. Cell. Biol. 21:2467–2474.</p><p>Li, W.-H. 1997. Molecular evolution. Sinauer Associates,</p><p>Sunderland, Mass.</p><p>Li, W. W., M. M. Dammerman, J. 
D. Smith, S. Metzger, J. L.</p><p>Breslow, and T. Leff. 1995. Common genetic variation in the</p><p>promoter of the human apo CIII gene abolishes regulation by</p><p>insulin and may contribute to hypertriglyceridemia. J. Clin.</p><p>Invest. 96:2601–2605.</p><p>Li, X.,</p><p>measured levels of variation in gene expression from</p><p>1- or 2-dimensional protein gels in a variety of organisms:</p><p>Zea mays (Burstin et al. 1994; Damerval et al. 1994; de</p><p>Vienne et al. 2001), Pinus pinaster (Costa and Plomion</p><p>1999), Glycine max (Gerber, Fabre, and Planchon 2000),</p><p>Mus musculus (Klose et al. 2002), and Homo sapiens</p><p>(Enard et al. 2002a). Studies with the first three organisms</p><p>documented that protein abundance has a strong genetic</p><p>component, and all of these studies found that populations</p><p>contain considerable variation in expression level for most</p><p>of the proteins surveyed. In D. melanogaster, chromo-</p><p>some substitution lines show substantial levels of variation</p><p>in gene expression as measured by enzyme activities</p><p>(Laurie-Ahlberg et al. 1980; Wilton et al. 1982; Clark</p><p>1990). Although protein abundance and enzyme activity</p><p>are indirect indices of transcription, these results sug-</p><p>gest considerable genetic variation for gene expression in</p><p>general. (2) mRNA-based surveys. More direct estimates of</p><p>variation in transcription come from microarray analyses</p><p>that survey thousands of loci. Studies in mice (Karp et al.</p><p>2000; Schadt et al. 2003), humans (Schadt et al. 2003), the</p><p>teleost Fundulus heteroclitus (Oleksiak, Churchill, and</p><p>Crawford 2002), D. melanogaster (Jin et al. 2001; Rifkin,</p><p>Kim, and White 2003), Zea mays (Schadt et al. 2003), and</p><p>Saccharomyces cerevisiae (Cavalieri, Townsend, and</p><p>Hartl 2000; Brem et al. 2002) all indicate that genetic</p><p>variation in transcript abundance is pervasive within</p><p>populations. 
Much of this variation may be heritable.</p><p>Schadt et al. (2003) found that 33% of the 23,574 loci</p><p>surveyed from a cross of two inbred strains of mice</p><p>showed a genetic component for expression differences</p><p>within the liver, 29% of the 2,726 loci surveyed from 56</p><p>humans belonging to four families showed a heritable</p><p>difference in expression within lymphoblasts, and 18,805</p><p>genes consistently differed in transcription within ear leaf</p><p>tissue among progeny from a cross of two maize strains.</p><p>What proportion of the genetic basis for this variation</p><p>resides in the promoters of the genes showing transcrip-</p><p>tional variation (cis) or in the sequences or expression</p><p>profiles of their upstream regulators (trans) has been</p><p>examined in a few cases. Quantitative trait loci (QTL)</p><p>underlying variation in expression of at least 32% of 570</p><p>variably expressed transcripts in yeast mapped in cis</p><p>(Brem et al. 2002), whereas the comparable fraction of</p><p>genes with cis-acting QTL in mouse liver is even higher</p><p>(Schadt et al. 2003). Reverse transcriptase polymerase</p><p>chain reaction (RT-PCR) offers more reliable quantitation</p><p>than microarrays, and it also provides a means of directly</p><p>comparing transcription rates among alleles. In a pre-</p><p>liminary survey of 69 loci in four inbred lines of Mus</p><p>musculus, Cowles et al. (2002) found quantitative and</p><p>tissue-specific variation among alleles at 4 loci. Using</p><p>a similar approach, Yan et al. (2002) found evidence of</p><p>variation in gene expression at 6 of 13 loci examined in</p><p>humans. 
Taken together, microarray and RT-PCR surveys</p><p>of mRNA levels provide solid evidence of abundant</p><p>genetic variation in transcriptional regulation in diverse</p><p>species, and they suggest that much of this variation</p><p>resides in cis-regulatory sequences. (3) Detailed analyses</p><p>of promoter function. The most extensive direct evidence</p><p>of functional variation in promoter sequences now avail-</p><p>able comes from humans, where many specific poly-</p><p>morphisms have been identified through direct functional</p><p>studies (Cooper 1999). Although the human genome is not</p><p>particularly polymorphic, a typical individual is estimated</p><p>to be heterozygous for a functional promoter polymor-</p><p>phism at ~40% of all loci (Rockman and Wray 2002).</p><p>Comparable data do not yet exist for other species, but RT-</p><p>PCR surveys (Cowles et al. 2002; Yan et al. 2002) provide</p><p>a rapid means of estimating heterozygosity that affects</p><p>transcription at many loci.</p><p>2.4 Natural Selection Operates on Allelic Variation in</p><p>Promoters</p><p>Evidence for natural selection on eukaryotic promoter</p><p>alleles comes from a variety of sources (also see section 4.7).</p><p>(1) Human populations. Promoter polymorphisms at</p><p>numerous loci in humans have functional consequences</p><p>that influence diverse aspects of physiology, behavior,</p><p>anatomy, and life history (Cooper 1999; Rockman and Wray</p><p>2002). Some of these promoter alleles have likely fitness</p><p>consequences (for examples, see next paragraph and section</p><p>4.7). (2) Wild populations. 
A latitudinal cline of LDH</p><p>promoter allele frequencies in the teleost Fundulus hetero-</p><p>clitus is probably maintained by temperature differences</p><p>(Crawford, Segal, and Barnett 1999; Segal, Barnett, and</p><p>Crawford 1999). Two other cases, mentioned earlier, are</p><p>known from D. melanogaster: promoter alleles segregating</p><p>at both Cyp6G1 and hsp70 appear to be under selection in</p><p>wild populations (Daborn et al. 2002; Lerman et al. 2003).</p><p>(3) Artificial selection and experimental evolution. Domes-</p><p>tication of maize involved selection on the inferred</p><p>regulatory region of the tb locus (Wang et al. 1999). Studies</p><p>with yeast point to regulation of transcription as a critical</p><p>component of adaptive change. Adaptation of Saccharo-</p><p>myces cerevisiae to glucose limitation was accompanied by</p><p>twofold or greater changes in the abundance of transcripts</p><p>from nearly 10% of all genes, consistently across replicates</p><p>(Ferea et al. 1999). The evolution of drug resistance in</p><p>experimental populations of Candida albicans correlated</p><p>with overexpression of the four known resistance genes</p><p>(Cowen et al. 2000). (4) Sequence comparisons. More</p><p>extensive, but less direct, evidence that natural selection acts</p><p>on promoters comes from cases of apparent evolutionary</p><p>conservation of cis-regulatory sequences among distantly</p><p>related species (for examples, see section 4.1). Consistent</p><p>underrepresentation of specific sequence motifs provides</p><p>evidence for genome-wide selection to remove spurious</p><p>transcription initiation sequences in a broad diversity of</p><p>prokaryotes (Hahn, Stajich, and Wray 2003).</p><p>Several examples of natural selection operating on</p><p>transcriptional regulation involve pathogen-host interac-</p><p>tions. 
For instance, some promoter alleles in Mycobacte-</p><p>rium tuberculosis and hepatitis B alter transcription to</p><p>the pathogen’s benefit and may be under positive selection</p><p>(Buckwold et al. 1997; Rinder et al. 1998; Lee et al. 2000;</p><p>Kajiya et al. 2001). The origin and subsequent fixation of</p><p>these mutations in separate host individuals demonstrate</p><p>the ability of positive selection to operate in a predictable</p><p>way on genetic variation within a promoter. Specific</p><p>variants within the human immunodeficiency virus (HIV)</p><p>promoter, including gains of binding sites for host nuclear</p><p>factor kappa-B (NF-κB) and upstream stimulatory factor</p><p>(USF), as well as functional modifications in the basal</p><p>promoter, cause differences in the level of viral transcrip-</p><p>tion (Montano et al. 1997; Jeeninga et al. 2000). The C</p><p>subtype of HIV has significantly increased transcription</p><p>rates and has gone to near fixation locally in southern</p><p>Africa; it is associated with increased levels of secondary</p><p>infections and may be under positive selection to the</p><p>pathogen’s advantage (Montano et al. 2000; Hunt,</p><p>Johnson, and Tiemesse 2001). Conversely, human pop-</p><p>ulations harbor promoter variants that influence suscepti-</p><p>bility to pathogens or disease progression after infection.</p><p>Because human generation times are much longer than</p><p>those of pathogens, signatures of selection are more</p><p>difficult to detect. Nonetheless, promoter alleles at TNFα,</p><p>IL-4, IL-10, FY, CCR5, and TGFβ influence mortality</p><p>from a variety of viral, bacterial, and protoctistan</p><p>pathogens and are likely to be under selection (Tourna-</p><p>mille et al. 1995; Hamblin and Di Rienzo 2000; Shin et al.</p><p>2000; Thurz 2001; Bamshad et al. 2002; Meyer et al.</p><p>2002; Nakayama et al.</p><p>and M. Noll. 1994. 
Evolution of distinct developmental</p><p>functions of three Drosophila genes by acquisition of different</p><p>cis-regulatory regions. Nature 367:83–87.</p><p>Li, Y., J. P. Bernot, C. Illingworth et al. (10 co-authors). 2001.</p><p>Gene conversion within regulatory sequences generates maize</p><p>r alleles with altered gene expression. Genetics 159:</p><p>1727–1740.</p><p>Liang, Z., and M. D. Biggin. 1998. Eve and ftz regulate a wide</p><p>array of genes in blastoderm embryos: the selector homeo-</p><p>proteins directly or indirectly regulate most genes in Dro-</p><p>sophila. Development 125:4471–4482.</p><p>Lieb, J. D., X. Liu, D. Botstein, and P. O. Brown. 2001.</p><p>Promoter-specific binding of Rap1 revealed by genome-wide</p><p>maps of protein–DNA association. Nat. Genet. 28:327–334.</p><p>Liu, T., J. Wu, and F. He. 2000. Evolution of cis-acting elements</p><p>in 5′ flanking regions of vertebrate actin genes. J. Mol. Evol.</p><p>50:22–30.</p><p>Locker, J. 2001. Transcription factors. Academic Press, San</p><p>Diego, Calif.</p><p>Long, M., W. Wang, and J. Zhang. 1999. Origin of new genes</p><p>and source for N-terminal domain of the chimerical gene,</p><p>jingwei, in Drosophila. Gene 238:135–141.</p><p>Loots, G. G., R. M. Locksley, C. M. Blankespoor, Z. E. Wang,</p><p>W. Miller, E. M. Rubin, and K. A. Frazer. 2000. Identification</p><p>of a coordinate regulator of interleukins 4, 13, and 5 by cross-</p><p>species sequence comparisons. Science 288:136–140.</p><p>Love, J. J., X. Li, D. A. Case, K. Geise, R. Grosschedl, and P. E.</p><p>Wright. 1995. Structural basis for DNA bending by the</p><p>architectural transcription factor LEF-1. Nature 376:791–795.</p><p>Lowe, C. J., and G. A. Wray. 1997. Radical alterations in the</p><p>roles of homeobox genes during echinoderm evolution.</p><p>Nature 389:718–721.</p><p>Ludwig, M. Z. 2002. Functional evolution of noncoding DNA.</p><p>Curr. Opin. Genet. Dev. 12:634–639.</p><p>Ludwig, M. Z., C. Bergman, N. H. Patel, and M. 
Kreitman. 2000.</p><p>Evidence for stabilizing selection in a eukaryotic enhancer</p><p>element. Nature 403:564–567.</p><p>Ludwig, M. Z., and M. Kreitman. 1995. Evolutionary dynamics</p><p>of the enhancer region of even-skipped in Drosophila. Mol.</p><p>Biol. Evol. 12:1002–1011.</p><p>Ludwig, M. Z., N. H. Patel, and M. Kreitman. 1998. Functional</p><p>analysis of eve stripe 2 enhancer evolution in Drosophila:</p><p>rules governing conservation and change. Development 125:</p><p>949–958.</p><p>Lufkin, T. 2001. Developmental control by Hox transcriptional</p><p>regulators and their cofactors. Pp. 215–235 in J. Locker, ed.</p><p>Transcription factors. Academic Press, San Diego, Calif.</p><p>Lutz, B., H. C. Lu, G. Eichele, D. Miller, and T. C. Kaufman.</p><p>1996. Rescue of Drosophila labial null mutant by the</p><p>chicken ortholog Hoxb-1 demonstrates that the function of</p><p>Hox genes is phylogenetically conserved. Genes Dev.</p><p>10:176–184.</p><p>Ma, X., D. Yuan, K. Diepold, T. Scarborough, and J. Ma. 1996.</p><p>The Drosophila morphogenic protein bicoid binds DNA</p><p>cooperatively. Development 122:1195–1206.</p><p>Maduro, M., and D. Pilgrim. 1996. Conservation of function and</p><p>expression of unc-119 from two Caenorhabditis species</p><p>despite divergence of non-coding DNA. Gene 183:77–85.</p><p>Mahmoudi, T., and C. P. Verrijzer. 2001. Chromatin silencing</p><p>and activation by Polycomb and trithorax group proteins.</p><p>Oncogene 20:3055–3066.</p><p>Manzanares, M., H. Wada, N. Itasaki, P. A. Trainor, R.</p><p>Krumlauf, and P. W. Holland. 2000. Conservation and</p><p>elaboration of Hox gene regulation during evolution of the</p><p>vertebrate head. Nature 408:854–857.</p><p>Margarit, E., A. Guillén, C. Bebordosa, J. Vidal-Taboada, M.</p><p>Sánchez, F. Ballesta, and R. Oliva. 1998. 
Identification of</p><p>conserved potentially regulatory sequences of the SRY gene</p><p>from 10 different species of mammals. Biochem. Biophys.</p><p>Res. Commun. 245:370–377.</p><p>Markstein, M., and M. Levine. 2002. Decoding cis-regulatory</p><p>DNAs in the Drosophila genome. Curr. Opin. Genet. Dev.</p><p>12:601–606.</p><p>Mastick, G. S., R. McKay, T. Oligino, K. Donovan, and A. J.</p><p>López. 1995. Identification of target genes regulated by</p><p>homeotic proteins in Drosophila melanogaster through</p><p>genetic selection of Ultrabithorax protein-binding sites in</p><p>yeast. Genetics 139:349–363.</p><p>Mathias, J. R., H. L. Zhong, H. H. Jin, and A. K. Vershon. 2001.</p><p>Altering the DNA-binding specificity of the yeast Mat alpha 2</p><p>homeodomain protein. J. Biol. Chem. 276:32696–32703.</p><p>Matsuo, Y., and T. Yamazaki. 1984. Genetic analysis of natural</p><p>populations of Drosophila melanogaster in Japan. IV. Natural</p><p>selection on the inducibility, but not on the structural genes, of</p><p>amylase loci. Genetics 108:879–896.</p><p>McDonald, J. H., and M. Kreitman. 1991. Adaptive protein</p><p>evolution at the Adh locus in Drosophila. Nature 351:</p><p>652–654.</p><p>McKinney, M. L., and K. J. McNamara. 1991. Heterochrony: the</p><p>evolution of ontogeny. Plenum Press, New York.</p><p>Metherall, J. E., F. P. Gillespie, and B. G. Forget. 1988. Analyses</p><p>of linked beta-globin genes suggest that nondeletion forms</p><p>of hereditary persistence of fetal hemoglobin are bona fide</p><p>switching mutants. Am. J. Hum. Genet. 42:476–481.</p><p>Meyer, C. G., J. May, A. J. Luty, B. Lell, and P. G. Kremsner.</p><p>2002. TNFA–308A associated with shorter intervals of Plasmo-</p><p>dium falciparum reinfections. Tissue Antigens 59:287–292.</p><p>Milo, R., S. Shen-Orr, S. Itzkovitz, D. Kashtan, D. Chklovskii,</p><p>and U. Alon. 2002. Network motifs: simple building blocks of</p><p>complex networks. Science 298:824–827.</p><p>Mitsialis, S. A., and F. C. 
Kafatos. 1985. Regulatory elements</p><p>controlling chorion gene expression are conserved between</p><p>flies and moths. Nature 317:453–456.</p><p>Miyashita, N. T. 2001. DNA variation in the 5′ upstream region</p><p>of the Adh locus of the wild plants Arabidopsis thaliana and</p><p>Arabis gemmifera. Mol. Biol. Evol. 18:164–171.</p><p>Mody, M., Y. Cao, Z. Cui et al. (10 co-authors). 2001. Genome-</p><p>wide gene expression profiles of the developing mouse</p><p>hippocampus. Proc. Natl. Acad. Sci. USA 98:8862–8867.</p><p>Montano, M. A., C. P. Nixon, T. Ndung’u, H. Bussmann, V. A.</p><p>Novitsky, D. Dickman, and M. Essex. 2000. Elevated tumor</p><p>necrosis factor-alpha activation of human immunodeficiency</p><p>virus type 1 subtype C in southern Africa is associated with an</p><p>NF-kappaB enhancer gain-of-function. J. Infect. Dis. 181:</p><p>76–81.</p><p>Montano, M. A., V. A. Novitsky, J. T. Blackard, N. L. Cho, D.</p><p>A. Katzenstein, and M. Essex. 1997. Divergent transcriptional</p><p>regulation among expanding human immunodeficiency virus</p><p>type 1 subtypes. J. Virol. 71:8657–8665.</p><p>Morishima, A. 1998. Identification of preferred binding sites of</p><p>a light-inducible DNA-binding factor (MNF1) within 5′-</p><p>upstream sequence of c4-type phosphoenolpyruvate carbox-</p><p>ylase gene in maize. Plant Mol. Biol. 38:633–646.</p><p>Mouchel-Vielh, E., M. Blin, C. Rigolot, and J. S. Deutsch. 2002.</p><p>Expression of a homologue of the fushi-tarazu ( ftz) gene in</p><p>a cirriped crustacean. Evol. Dev. 4:76–85.</p><p>Müller, G. B., and G. P. Wagner. 1991. Novelty in evolution:</p><p>restructuring the concept. Annu. Rev. Ecol. Syst. 22:229–256.</p><p>Muller, H. J. 1942. Isolating mechanisms, evolution, and tem-</p><p>perature. Biol. Symp. 6:71–125.</p><p>Naganawa, S., H. N. Ginsberg, R. M. Glickman, and G. S.</p><p>Ginsburg. 1997. 
Intestinal transcription and synthesis of</p><p>apolipoprotein AI is regulated by five natural polymorphisms</p><p>upstream of the apolipoprotein CIII gene. J. Clin. Invest.</p><p>99:1958–1965.</p><p>Nakayama, E. E., L. Meyer, A. Iwamoto, A. Persoz, Y. Nagai, C.</p><p>Rouzioux, J. F. Delfraissy, P. Debre, D. McIlroy, I.</p><p>Theodorou, T. Shioda, and S. S. Group. 2002. Protective</p><p>effect of interleukin-4–589T polymorphism on human</p><p>immunodeficiency virus type 1 disease progression: relation-</p><p>ship with virus load. J. Infect. Dis. 185:1183–1186.</p><p>Narlikar, G. J., H.-Y. Fan, and R. E. Kingston. 2002. Cooperation</p><p>between complexes that regulate chromatin structure and</p><p>transcription. Cell 108:475–487.</p><p>Naylor, L. H., and E. M. Clark. 1990. d(TG)n.d(CA)n sequences</p><p>upstream of the rat prolactin gene form Z-DNA and inhibit</p><p>gene transcription. Nucleic Acids Res. 18:1595–1601.</p><p>Neznanov, N., A. Umezawa, and R. G. Oshima. 1997. A</p><p>regulatory element within a coding exon modulates keratin 18</p><p>gene expression in transgenic mice. J. Biol. Chem. 272:</p><p>27549–27557.</p><p>Nielsen, L. B., D. Kahn, T. Duell, H. U. Weier, S. Taylor, and</p><p>S. G. Young. 1998. Apolipoprotein B gene expression in</p><p>a series of human apolipoprotein B transgenic mice</p><p>generated with recA-assisted restriction endonuclease cleav-</p><p>age-modified bacterial artificial chromosomes. An intestine-</p><p>specific enhancer element is located between 54 and 62</p><p>kilobases 5′ to the structural gene. J. Biol. Chem.</p><p>273:21800–21807.</p><p>Nurminsky, D. I., E. N. Moriyama, E. R. Lozovskaya, and D. L.</p><p>Hartl. 1996. Molecular phylogeny and genome evolution</p><p>in the Drosophila virilis species group: duplications of the</p><p>alcohol dehydrogenase gene. Mol. Biol. Evol. 13:132–149.</p><p>Nurminsky, D. I., M. V. Nurminskaya, D. Deaguiar, and D. L.</p><p>Hartl. 1998. 
Selective sweep of a newly evolved sperm-</p><p>specific gene in Drosophila. Nature 396:572–575.</p><p>Odgers, W. A., M. J. Healy, and J. G. Oakeshott. 1995.</p><p>Nucleotide polymorphism in the 5′ promoter region of</p><p>esterase 6 in Drosophila melanogaster and its relationship</p><p>to enzyme activity variation. Genetics 141:215–222.</p><p>Ohler, U., and H. Niemann. 2001. Identification and analysis of</p><p>eukaryotic promoters: recent computational approaches.</p><p>Trends Genet. 17:56–60.</p><p>Ohta, T. 1992. The nearly neutral theory of molecular evolution.</p><p>Annu. Rev. Ecol. Syst. 23:263–286.</p><p>Ohtsuki, S., M. Levine, and H. N. Cai. 1998. Different core</p><p>promoters possess distinct regulatory activities in the Dro-</p><p>sophila embryo. Genes Dev. 12:547–556.</p><p>Oleksiak, M. F., G. A. Churchill, and D. L. Crawford. 2002.</p><p>Variation in gene expression within and among natural pop-</p><p>ulations. Nat. Genet. 32:261–266.</p><p>Onyango, P., W. Miller, J. Lehoczky, C. T. Leung, B. Birren,</p><p>S. Wheelan, K. Dewar, and A. P. Feinberg. 2000. Sequence</p><p>and comparative analysis of the mouse 1-megabase region</p><p>orthologous to the human 11p15 imprinted domain. Genome</p><p>Res. 10:1697–1710.</p><p>Orphanides, G., T. Lagrange, and D. Reinberg. 1998. The general</p><p>transcription factors of RNA polymerase II. Genes Dev.</p><p>10:2657–2683.</p><p>Orr, H. A., and D. C. Presgraves. 2000. Speciation by postzygotic</p><p>isolation: forces, genes and molecules. Bioessays 22:1085–</p><p>1094.</p><p>Osada, S., H. Yamamoto, T. Nishihara, and M. Imagawa. 1996.</p><p>DNA binding specificity of the CCAAT/enhancer-binding</p><p>protein transcription factor family. J. Biol. Chem. 271:3891–</p><p>3896.</p><p>Paigen, K. 1989. 
Experimental approaches to the study of</p><p>regulatory evolution. Am. Nat. 134:440–458.</p><p>Panganiban, G., S. M. Irvine, C. Lowe et al. (14 co-authors).</p><p>1997. The origin and evolution of animal appendages. Proc.</p><p>Natl. Acad. Sci. USA 94:5162–5166.</p><p>Papenbrock, T., R. L. Peterson, R. S. Lee, T. Hsu, A. Kuroiwa,</p><p>and A. Awgulewitsch. 1998. Murine Hoxc-9 gene contains</p><p>a structurally and functionally conserved enhancer. Dev. Dyn.</p><p>212:540–547.</p><p>Paquette, J., N. Giannoukakis, C. Polychronakos, P. Vafiadis,</p><p>and C. Deal. 1998. The INS 5′ variable number of tandem</p><p>repeats is associated with IGF2 expression in humans. J. Biol.</p><p>Chem. 273:14158–14164.</p><p>Parks, A. L., B. A. Parr, J.-E. Chin, D. S. Leaf, and R. A. Raff.</p><p>1988. Molecular analysis of heterochronic changes in the</p><p>evolution of direct developing sea urchins. J. Evol. Biol.</p><p>1:27–44.</p><p>Patrinos, G. P., P. Kollia, A. Loutradi-Anagnostou, D. Louko-</p><p>poulos, and M. N. Papadakis. 1998. The Cretan type of</p><p>non-deletional hereditary persistence of fetal hemoglobin [A</p><p>gamma-158C→T] results from two independent gene</p><p>conversion events. Hum. Genet. 102:629–634.</p><p>Petronzelli, F., A. Kimura, P. Ferrante, and M. C. Mazzilli. 1995.</p><p>Polymorphism in the upstream regulatory region of DQA1</p><p>gene in the Italian population. Tissue Antigens 45:258–263.</p><p>Pfister, K., K. Paigen, G. Watson, and V. Chapman. 1982.</p><p>Expression of beta-glucuronidase haplotypes in prototype and</p><p>congenic mouse strains. Biochem. Genet. 20:519–536.</p><p>Piano, F., M. J. Parisi, R. Karess, and M. P. Kambysellis. 1999.</p><p>Evidence for redundancy but not trans factor-cis element co-</p><p>evolution in the regulation of Drosophila Yp genes. Genetics</p><p>152:605–616.</p><p>Pinsonneault, J., B. Florence, H. Vaessin, and W. McGinnis.</p><p>1997. 
A model for extradenticle function as a switch that</p><p>changes Hox proteins from repressors to activators. EMBO J.</p><p>16:2032–2042.</p><p>Pirkkala, L., P. Nykanen, and L. Sistonen. 2001. Roles of the heat</p><p>shock transcription factors in regulation of the heat shock</p><p>response and beyond. FASEB J. 15:1118–1131.</p><p>Plaza, S., S. Saule, and C. Dozier. 1999. High conservation of</p><p>cis-regulatory elements between quail and human for the Pax-</p><p>6 gene. Dev. Genes Evol. 209:165–173.</p><p>Powell, J. R. 1979. Population genetics of Drosophila amylase.</p><p>II. Geographic patterns in D. pseudoobscura. Genetics 92:</p><p>613–622.</p><p>Powell, J. R., and J. M. Lichtenfels. 1979. Population genetics</p><p>of Drosophila amylase. I. Genetic control of tissue-specific</p><p>expression in D. pseudoobscura. Genetics 92:603–612.</p><p>Praz, V., R. Perier, C. Bonnard, and P. Bucher. 2002. The</p><p>Eukaryotic Promoter Database, EPD: new entry types and</p><p>links to gene expression data. Nucleic Acids Res. 30:322–</p><p>324.</p><p>Pugh, B. F. 2001. RNA polymerase II transcription machinery.</p><p>Pp. 1–16 in J. Locker, ed. Transcription factors. Academic</p><p>Press, San Diego, Calif.</p><p>Purugganan, M. D. 2000. The molecular population genetics of</p><p>regulatory genes. Mol. Ecol. 9:1451–1461.</p><p>Quiring, R., U. Walldorf, U. Kloter, and W. J. Gehring. 1994.</p><p>Homology of the eyeless gene of Drosophila to the small eye</p><p>gene in mice and aniridia in humans. Science 265:785–789.</p><p>Raff, R. A. 1996. The shape of life: genes, development, and the</p><p>evolution of animal form. The University of Chicago Press,</p><p>Chicago.</p><p>Raff, R. A., and T. C. Kaufman. 1983. Embryos, genes, and</p><p>evolution: the developmental-genetic basis of evolutionary</p><p>change. Macmillan, New York.</p><p>Rebeiz, M., N. L. Reeves, and J. W. Posakony. 2002. 
SCORE:</p><p>a computational approach to the identification of cis-</p><p>regulatory modules and target genes in whole-genome</p><p>sequence data. Proc. Natl. Acad. Sci. USA 99:9888–</p><p>9893.</p><p>Regier, J. C., and N. S. Vlahos. 1988. Heterochrony and the</p><p>introduction of novel modes of morphogenesis during the</p><p>evolution of moth choriogenesis. J. Mol. Evol. 28:19–31.</p><p>Reinberg, D., G. Orphanides, R. Ebright et al. (29 co-authors).</p><p>1998. The RNA polymerase II general transcription factors:</p><p>past, present, and future. Cold Spring Harbor Symp. Quant.</p><p>Biol. 63:83–103.</p><p>Ren, B., F. Robert, J. J. Wyrick et al. (14 co-authors). 2000.</p><p>Genome-wide location and function of DNA binding proteins.</p><p>Science 290:2306–2309.</p><p>Richards, E. J., and S. C. R. Elgin. 2002. Epigenetic codes for</p><p>heterochromatin silencing: rounding up the usual suspects.</p><p>Cell 108:489–500.</p><p>Riechmann, J. L., M. Wang, and E. M. Meyerowitz. 1996. DNA-</p><p>binding properties of Arabidopsis Mads domain homeotic</p><p>proteins APETALA1, APETALA3, PISTILLATA and</p><p>AGAMOUS. Nucleic Acids Res. 24:3134–3141.</p><p>Rifkin, S. A., J. Kim, and K. P. White. 2003. Evolution of gene</p><p>expression in the Drosophila melanogaster subgroup. Nat.</p><p>Genet. 33:138–144.</p><p>Rinder, H., A. Thomschke, S. Rusch-Gerdes, G. Bretzel, K.</p><p>Feldmann, M. Rifai, and T. Loscher. 1998. Significance of</p><p>ahpC promoter mutations for the prediction of isoniazid</p><p>resistance in Mycobacterium tuberculosis. Eur. J. Clin.</p><p>Microbiol. Infect. Dis. 17:508–511.</p><p>Roberts, S. B., N. Segil, and N. Heintz. 1991. Differential</p><p>phosphorylation of the transcription factor Oct1 during the</p><p>cell cycle. Science 253:1022–1026.</p><p>Robin, C., R. F. Lyman, A. D. Long, C. H. Langley, and T. F. C.</p><p>Mackay. 2002. hairy: a quantitative trait locus for Drosophila</p><p>sensory bristle number. Genetics 162:155–164.</p><p>Rockman, M. V., and G. 
A. Wray. 2002. Abundant raw material</p><p>for cis-regulatory evolution in humans. Mol. Biol. Evol. 19:</p><p>1981–1990.</p><p>Romano, L. A., and G. A. Wray. 2003. Conservation of Endo16</p><p>expression in sea urchins despite divergence in both cis and</p><p>trans-acting components of transcriptional regulation.</p><p>Development (in press).</p><p>Romey, M. C., C. Guittard, J. P. Chazalette et al. (14 co-authors).</p><p>1999. Complex allele [–102T>A+S549R (T>G)] is associated</p><p>with milder forms of cystic fibrosis than allele S549R (T>G)</p><p>alone. Hum. Genet. 105:145–150.</p><p>Romey, M. C., N. Pallares-Ruiz, A. Mange, C. Mettling, R.</p><p>Peytavi, J. Demaille, and M. Claustres. 2000. A naturally</p><p>occurring sequence variation that creates a YY1 element is</p><p>associated with increased cystic fibrosis transmembrane</p><p>conductance regulator gene expression. J. Biol. Chem. 275:</p><p>3561–3567.</p><p>Ronshaugen, M., N. McGinnis, and W. McGinnis. 2002. Hox</p><p>protein mutation and macroevolution of the insect body plan.</p><p>Nature 415:914–917.</p><p>Ross, J. L., P. P. Fong, and D. R. Cavener. 1994. Correlated</p><p>evolution of the cis-acting regulatory elements and</p><p>developmental expression of the Drosophila Gld gene in</p><p>seven species from the subgroup melanogaster. Dev. Genet.</p><p>15:38–50.</p><p>Rothenburg, S., F. Koch-Nolte, A. Rich, and F. Haag. 2001. A</p><p>polymorphic dinucleotide repeat in the rat nucleolin gene</p><p>forms Z-DNA and inhibits promoter activity. Proc. Natl.</p><p>Acad. Sci. USA 98:8985–8990.</p><p>Ruez, C., F. Payre, and A. Vincent. 1998. Transcriptional control</p><p>of Drosophila bicoid by Serendipity delta: cooperative</p><p>binding sites, promoter context, and co-evolution. Mech.</p><p>Dev. 78:125–134.</p><p>Saccone, G., I. Peluso, D. 
Artiaco, E. Giordano, D. Bopp, and</p><p>L. C. Polito. 1998. The Ceratitis capitata homologue of the</p><p>Drosophila sex-determining gene Sex-lethal is structurally</p><p>conserved, but not sex-specifically regulated. Development</p><p>125:1495–1500.</p><p>Sackerson, C., M. Fujioka, and T. Goto. 1999. The even-skipped</p><p>locus is contained in a 16-kb chromatin domain. Dev. Biol.</p><p>211:39–52.</p><p>Saito, T., H. M. Lachman, L. Diaz et al. (12 co-authors). 2002.</p><p>Analysis of monoamine oxidase a (MAOA) promoter poly-</p><p>morphism in Finnish male alcoholics. Psych. Res. 109:</p><p>113–119.</p><p>Sandrelli, F., S. Campesan, M. G. Rossetto, C. Benna, E.</p><p>Zieger, A. Megighian, M. Couchman, C. P. Kyriacou, and</p><p>R. Costa. 2001. Molecular dissection of the 5′ region of no-on-</p><p>transientA of Drosophila melanogaster reveals cis-</p><p>regulation by adjacent dGpi1 sequences. Genetics 157:</p><p>765–775.</p><p>Sauer, F., S. K. Hansen, and R. Tjian. 1995. DNA Template</p><p>requirement and activator-coactivator requirements for tran-</p><p>scriptional synergism by Drosophila bicoid. Science 270:</p><p>1825–1827.</p><p>Scaffidi, P., and M. E. Bianchi. 2001. Spatially precise DNA</p><p>bending is an essential activity of the Sox2 transcription</p><p>factor. J. Biol. Chem. 276:47296–47302.</p><p>Scemama, J. L., M. Hunter, J. McCallam, V. Prince, and E.</p><p>Stellwag. 2002. Evolutionary divergence of vertebrate Hoxb2</p><p>expression patterns and transcriptional regulatory loci. J. Exp.</p><p>Zool. 294:285–299.</p><p>Schadt, E. E., S. A. Monks, T. A. Drake et al. (14 co-authors).</p><p>2003. Genetics of gene expression surveyed in maize, mouse</p><p>and man. Nature 422:297–302.</p><p>Schiff, N. M., Y. Feng, J. A. Quine, P. A. Krasney, and D. R.</p><p>Cavener. 1992. Evolution of the expression of the Gld gene in</p><p>the reproductive tract of Drosophila. Mol. Biol. Evol. 9:1029–</p><p>1049.</p><p>Schlichting, C. D., and M. Pigliucci. 1998. 
Phenotypic evolution:</p><p>a reaction norm perspective. Sinauer Associates, Sunderland,</p><p>Mass.</p><p>Segal, J. A., J. L. Barnett, and D. L. Crawford. 1999. Functional</p><p>analysis of natural variation in Sp1 binding sites of a TATA-</p><p>less promoter. J. Mol. Evol. 49:736–749.</p><p>Segil, N., S. B. Roberts, and N. Heintz. 1991. Mitotic</p><p>phosphorylation of the Oct-1 homeodomain and regulation</p><p>of Oct-1 DNA binding activity. Science 254:1814–1816.</p><p>Seoighe, C., N. Federspiel, T. Jones et al. (29 co-authors). 2000.</p><p>Prevalence of small inversions in yeast gene order evolution.</p><p>Proc. Natl. Acad. Sci. USA 97:14433–14437.</p><p>Serfling, E., M. Jasin, and W. Schaffner. 1985. Enhancers and</p><p>eukaryotic gene-transcription. Trends Genet. 1:224–230.</p><p>Shabalina, S. A., and A. S. Kondrashov. 1999. Pattern of</p><p>selective constraint in C. elegans and C. briggsae genomes.</p><p>Genet. Res. 74:23–30.</p><p>Shabalina, S. A., A. Y. Ogurtsov, V. A. Kondrashov, and A. S.</p><p>Kondrashov. 2001. Selective constraint in intergenic regions</p><p>of human and mouse genomes. Trends Genet. 17:373–376.</p><p>Shashikant, C. S., C. B. Kim, M. A. Borbely, W. C. Wang, and F.</p><p>H. Ruddle. 1998. Comparative studies on mammalian Hoxc8</p><p>early enhancer sequence reveal a baleen whale–specific</p><p>deletion of a cis-acting element. Proc. Natl. Acad. Sci. USA</p><p>95:12364–12369.</p><p>Shaw, P. J., N. S. Wratten, A. P. McGregor, and G. A. Dover.</p><p>2002. Coevolution in bicoid-dependent promoters and the</p><p>inception of regulatory incompatibilities among species of</p><p>higher Diptera. Evol. Dev. 4:265–277.</p><p>Shikama, N., J. Lyon, and N. B. La Thangue. 1997. The p300/</p><p>CBP family: integrating signals with transcription factors and</p><p>chromatin. Trends Cell Biol. 7:230–236.</p><p>Shin, H. D., C. Winkler, J. C. Stephens et al. (15 co-authors).</p><p>2000. 
Genetic restriction of HIV-1 pathogenesis to AIDS by</p><p>promoter alleles of IL10. Proc. Natl. Acad. Sci. USA 97:</p><p>14467–14472.</p><p>Shore, P., and A. D. Sharrocks. 2001. Regulation of transcription</p><p>by extracellular signals. Pp. 113–135 in J. Locker, ed.</p><p>Transcription factors. Academic Press, San Diego, Calif.</p><p>Simon, J., M. Peifer, W. Bender, and M. O’Connor. 1990.</p><p>Regulatory elements of the bithorax complex that control ex-</p><p>pression along the anterior-posterior axis. EMBO J. 9:3945–</p><p>3956.</p><p>Singh, N., K. W. Barbour, and F. G. Berger. 1998. Evolution of</p><p>transcriptional regulatory elements within the promoter of</p><p>a mammalian gene. Mol. Biol. Evol. 15:312–325.</p><p>Singh, N., and F. G. Berger. 1998. Evolution of a mammalian</p><p>promoter through changes in patterns of transcription factor</p><p>binding. J. Mol. Evol. 46:639–648.</p><p>Sinha, N. R., and E. A. Kellogg. 1996. Parallelism and diversity</p><p>in multiple origins of C4 photosynthesis in the grass family.</p><p>Am. J. Bot. 83:1458–1470.</p><p>Sinha, S., and M. Tompa. 2002. Discovery of novel transcription</p><p>factor binding sites by statistical overrepresentation. Nucleic</p><p>Acids Res. 30:5549–5560.</p><p>Sjostrand, J. O., A. Kegel, and S. U. Astrom. 2002. Functional</p><p>diversity of silencers in budding yeasts. Eukaryot. Cell 1:548–</p><p>557.</p><p>Sjottem, E., C. Andersen, and T. Johansen. 1997. Structural and</p><p>functional analysis of DNA bending by Sp1 family transcrip-</p><p>tion factors. J. Mol. Biol. 267:490–504.</p><p>Skaer, N., D. Pistillo, and P. Simpson. 2002. Transcriptional</p><p>heterochrony of scute and changes in bristle pattern between</p><p>two closely related species of blowfly. Dev. Biol. 252:</p><p>31–45.</p><p>Skaer, N., and P. Simpson. 2000. Genetic analysis of bristle loss</p><p>in hybrids between Drosophila melanogaster and D. 
simulans provides evidence for divergence of cis-regulatory sequences in the achaete-scute gene complex. Dev. Biol. 221:148–167.</p><p>Smale, S. T., A. Jain, J. Kaufmann, K. H. Emamai, K. Lo, and I. P. Garraway. 1998. The initiator element: a paradigm for core promoter heterogeneity within metazoan protein-coding genes. Cold Spring Harbor Symp. Quant. Biol. 63:21–31.</p><p>Small, S., A. Blair, and M. Levine. 1992. Regulation of even-skipped stripe-2 in the Drosophila embryo. EMBO J. 11:4047–4057.</p><p>Spek, C. A., R. M. Bertina, and P. H. Reitsma. 1999. Unique distance- and DNA-turn-dependent interactions in the human protein C gene promoter confer submaximal transcriptional activity. Biochem. J. 340:513–518.</p><p>Stauber, M., H. Jackle, and U. Schmidt-Ott. 1999. The anterior determinant bicoid of Drosophila is a derived Hox class 3 gene. Proc. Natl. Acad. Sci. USA 96:3786–3789.</p><p>Stauber, M., A. Prell, and U. Schmidt-Ott. 2002. A single Hox3 gene with composite bicoid and zerknüllt expression characteristics in non-Cyclorrhaphan flies. Proc. Natl. Acad. Sci. USA 99:274–279.</p><p>Stern, D. L. 1998. A role of Ultrabithorax in morphological difference between Drosophila species. Nature 396:463–466.</p><p>———. 2000. Perspective: evolutionary developmental biology and the problem of variation. Evolution 54:1079–1091.</p><p>Stockhaus, J., U. Schlue, M. Koczor, J. A. Chitty, W. C. Taylor, and P. Westhoff. 1997. The promoter of the gene encoding the C4 form of phosphoenolpyruvate carboxylase directs mesophyll-specific expression in transgenic C4 Flaveria spp. Plant Cell 9:479–489.</p><p>Stone, J. R., and G. A. Wray. 2001. 
Rapid evolution of cis-regulatory sequences via local point mutations. Mol. Biol. Evol. 18:1764–1770.</p><p>Storgaard, T., J. Christensen, B. Aasted, and S. Alexandersen. 1993. Cis-acting sequences in the Aleutian mink disease parvovirus late promoter important for transcription: comparison to the canine parvovirus and minute virus of mice. J. Virol. 67:1887–1895.</p><p>Stormo, G. D. 2000. DNA binding sites: representation and discovery. Bioinformatics 16:16–23.</p><p>Stougaard, J., N. N. Sandal, A. Grøn, A. Kuhle, and K. A. Marcker. 1987. 5′ analysis of the soybean leghaemoglobin lbc3 gene: regulatory elements required for promoter activity and organ specificity. EMBO J. 6:3565–3569.</p><p>Streelman, J. T., and T. D. Kocher. 2002. Microsatellite variation associated with prolactin expression and growth of salt-challenged Tilapia. Physiol. Genomics 9:1–4.</p><p>Struhl, K. 1999. Fundamentally different logic of gene regulation in eukaryotes and prokaryotes. Cell 98:1–4.</p><p>Sucena, E., and D. L. Stern. 2000. Divergence of larval morphology between Drosophila sechellia and its sibling species caused by cis-regulatory evolution of ovo/shaven-baby. Proc. Natl. Acad. Sci. USA 97:4530–4534.</p><p>Suske, G. 1999. The Sp-family of transcription factors. Gene 238:291–300.</p><p>Sutton, K. A., and M. Wilkinson. 1997. Rapid evolution of a homeodomain: evidence for positive selection. J. Mol. Evol. 45:579–588.</p><p>Swalla, B. J., and W. R. Jeffery. 1996. Requirement of the Manx gene for expression of chordate features in a tailless ascidian larva. Science 274:1205–1208.</p><p>Tajima, F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.</p><p>Takahashi, A., S. C. Tsaur, J. A. Coyne, and C. I. Wu. 2001. 
The</p><p>nucleotide changes governing cuticular hydrocarbon variation</p><p>and their evolution in Drosophila melanogaster. Proc. Natl.</p><p>Acad. Sci. USA 98:3920–3925.</p><p>Takahashi, H., Y. Mitani, G. Satoh, and N. Satoh. 1999.</p><p>Evolutionary alterations of the minimal promoter for noto-</p><p>chord-specific Brachyury expression in ascidian embryos.</p><p>Development 126:3725–3734.</p><p>Takahata, N. 1987. On the overdispersed molecular clock.</p><p>Genetics 116:169–179.</p><p>Tamarina, N. A., M. Z. Ludwig, and R. C. Richmond. 1997.</p><p>Divergent and conserved features in the spatial expression of</p><p>the Drosophila pseudoobscura esterase-5B gene and the</p><p>esterase-6 gene of Drosophila melanogaster. Proc. Natl.</p><p>Acad. Sci. USA 94:7735–7741.</p><p>Tautz, D. 2000. Evolution of transcriptional regulation. Curr.</p><p>Opin. Genet. Dev. 10:575–579.</p><p>Thanos, D., and T. Maniatis. 1995. Virus induction of human</p><p>IFN beta gene expression requires the assembly of an en-</p><p>hanceosome. Cell 83:1091–1100.</p><p>Theissen, G., A. Becker, A. Di Rosa, A. Kanno, J. T. Kim,</p><p>T. Munster, K. U. Winter, and H. Saedler. 2000. A short</p><p>history of MADS-box genes in plants. Plant Mol. Biol.</p><p>42:115–149.</p><p>Thompson, J. R., S. W. Chen, L. Ho, A. W. Langston, and L. J.</p><p>Gudas. 1998. An evolutionary conserved element is essential</p><p>for somite and adjacent mesenchymal expression of the</p><p>Hoxa1 gene. Dev. Dyn. 211:97–108.</p><p>Thurz, M. 2001. Genetic susceptibility in chronic viral hepatitis.</p><p>Antiviral Res. 52:113–116.</p><p>Ting, C. T., S. C. Tsaur, M. L. Wu, and C. I. Wu. 1998. A rapidly</p><p>evolving homeobox at the site of a hybrid sterility gene.</p><p>Science 282:1501–1504.</p><p>Tomarev, S. I., M. K. Duncan, H. J. Roth, A. Cvekl, and J.</p><p>Piatagorsky. 1994. Convergent evolution of crystallin gene</p><p>regulation in squid and chicken: the AP-1/ARE connection.</p><p>J. Mol. Evol. 
39:134–143.</p><p>Torchia, J., C. K. Glass, and M. G. Rosenfeld. 1998. Co-</p><p>activators and co-repressors in the integration of transcrip-</p><p>tional responses. Curr. Opin. Cell Biol. 10:373–383.</p><p>Tournamille, C., Y. Colin, J. P. Cartron, and C. Le Van Kim.</p><p>1995. Disruption of a GATA motif in the Duffy gene promoter</p><p>abolishes erythroid gene expression in Duffy-negative indi-</p><p>viduals. Nat. Genet. 10:224–228.</p><p>Trefilov, A., J. Berard, M. Krawczak, and J. Schmidtke. 2000.</p><p>Natal dispersal in rhesus macaques is related to serotonin</p><p>transporter gene promoter variation. Behav. Genet. 30:295–</p><p>301.</p><p>Treisman, J., P. Gönczy, M. Vashishtha, E. Harris, and C.</p><p>Desplan. 1989. A single amino acid can determine the DNA</p><p>binding specificity of homeodomain proteins. Cell 59:553–</p><p>562.</p><p>Triezenberg, S. J. 1995. Structure and function of transcriptional</p><p>activation domains. Curr. Opin. Genet. Dev. 5:190–196.</p><p>Tümpel, S., M. Maconochie, L. M. Wiedemann, and R.</p><p>Krumlauf. 2002. Conservation and diversity in the cis-</p><p>regulatory networks that integrate information controlling</p><p>expression of Hoxa2 in hindbrain and cranial neural crest cells</p><p>in vertebrates. Dev. Biol. 246:45–56.</p><p>Varga-Weisz, P. 2001. ATP-dependent chromatin remodeling</p><p>factors: nucleosome shufflers with many missions. Oncogene</p><p>20:3076–3085.</p><p>Venter, J. C., M. D. Adams, E. W. Myers et al. (274 co-authors).</p><p>2001. The sequence of the human genome. Science</p><p>291:1304–1351.</p><p>Vidigal, P., J. J. Gemner, and N. N. Zein. 2002. Poly-</p><p>morphisms in the interleukin-10, tumor necrosis factor-a,</p><p>and transforming growth factor-b1 genes in chronic</p><p>hepatitis C patients treated with interferon and ribavarin.</p><p>J. Hepatol. 36:271–277.</p><p>Vogelauer, M., J. Wu, N. Suka, and M. Grunstein. 2000. Global</p><p>histone acetylation and deacetylation in yeast. 
Nature 408:495–498.</p><p>Von Dassow, G., E. Meir, E. M. Munro, and G. M. Odell. 2000. The segment polarity network is a robust developmental module. Nature 406:188–192.</p><p>Wagner, A. 2001. The yeast protein interaction network evolves rapidly and contains few duplicate genes. Mol. Biol. Evol. 18:1283–1292.</p><p>Wallace, B. 1963. Genetic diversity, genetic uniformity, and heterosis. Can. J. Genet. Cytol. 5:239–253.</p><p>Walter, J., and M. D. Biggin. 1996. DNA binding specificity of two homeodomain proteins in vitro and in Drosophila embryos. Proc. Natl. Acad. Sci. USA 93:2680–2685.</p><p>Wang, R. L., A. Stec, J. Hey, L. Lukens, and J. Doebley. 1999. The limits of selection during maize domestication. Nature 398:236–239.</p><p>Wang, W., F. G. Brunet, E. Nevo, and M. Long. 2002. Origin of sphinx, a young chimeric RNA gene in Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 99:4448–4453.</p><p>Wassermann, W. W., M. Palumbo, W. Thompson, J. W. Fickett, and C. E. Lawrence. 2000. Human-mouse genome comparisons to locate regulatory sites. Nat. Genet. 26:225–228.</p><p>Waterston, R. H., K. Lindblad-Toh, E. Birney et al. (219 co-authors). 2002. Initial sequencing and comparative analysis of the mouse genome. Nature 420:520–562.</p><p>Wei, Z., L. M. Angerer, M. L. Gagnon, and R. C. Angerer. 1995. Characterization of the SpHE promoter that is spatially regulated along the animal-vegetal axis of the sea urchin embryo. Dev. Biol. 171:195–211.</p><p>Weinzierl, R. O. J. 1999. Mechanisms of gene expression. Imperial College Press, London.</p><p>West, R. J., R. Yocum, and M. Ptashne. 1984. 
Saccharomyces</p><p>cerevisiae GAL1-GAL10 divergent promoter region: location</p><p>and function of the upstream activating sequence UASG. Mol.</p><p>Cell. Biol. 4:2467–2478.</p><p>Wheeler, J. C., K. Shigesada, J. P. Gergen, and Y. Ito. 2000.</p><p>Mechanisms of transcriptional regulation by Runt domain</p><p>proteins. Semin. Cell. Dev. Biol. 11:369–375.</p><p>White, K. P., S. A. Rifkin, P. Hurban, and D. S. Hogness. 1999.</p><p>Microarray analysis of Drosophila development during meta-</p><p>morphosis. Science 286:2179–2184.</p><p>White, R. J. 2001. Gene transcription: mechanisms and control.</p><p>Blackwell Science, Malden, Mass.</p><p>Wilkins, A. S. 1993. Genetic analysis of animal development.</p><p>Wiley-Liss, Inc.,</p><p>New York.</p><p>———. 2002. The evolution of developmental pathways.</p><p>Sinauer Associates, Sunderland, Mass.</p><p>Wilson, A. C. 1975. Evolutionary importance of gene regulation.</p><p>Stadler Symp. 7:117–134.</p><p>Wilton, A. N., C. C. Laurie-Ahlberg, T. H. Emigh, and J. W.</p><p>Curtsinger. 1982. Naturally occurring enzyme activity varia-</p><p>tion in Drosophila melanogaster. II. Relationships among</p><p>enzymes. Genetics 102:207–221.</p><p>Wingender, E., X. Chen, E. Fricke et al. (14 co-authors). 2001.</p><p>The TRANSFAC system on gene expression regulation.</p><p>Nucleic Acids Res. 29:281–283.</p><p>Wolff, C., M. Pepling, P. Gergen, and M. Klingler. 1999.</p><p>Structure and evolution of a pair-rule interaction element: runt</p><p>regulatory sequences in D. melanogaster and D. virilis. Mech.</p><p>Dev. 80:87–99.</p><p>Wolffe, A. P. 1994. Insulating chromatin. Curr. Biol. 4:85–87.</p><p>———. 2001. Chromatin structure and the regulation of</p><p>transcription. Pp. 35–64 in J. Locker, ed. Transcription</p><p>factors. Academic Press, San Diego, Calif.</p><p>Wray, G. A., and A. E. Bely. 1994. The evolution of echinoderm</p><p>development is driven by several distinct factors. Develop-</p><p>ment 120(Suppl):97–106.</p><p>Wray, G. 
A., and C. J. Lowe. 2000. Developmental regulatory</p><p>genes and echinoderm evolution. Syst. Biol. 49:28–51.</p><p>Wray, G. A., and D. R. McClay. 1989. Molecular heterochronies</p><p>and heterotopies in early echinoid development. Evolution</p><p>43:803–813.</p><p>Wright, C. E., F. Haddad, A. X. Quin, P. W. Bodell, and K. M.</p><p>Baldwin. 1999. In vivo regulation of b-MGC gene in rodent</p><p>heart: role of T3 and evidence for an upstream enhancer. Am.</p><p>J. Physiol. 276:C883–C891.</p><p>Wright, S. 1982. Character change, speciation, and the higher</p><p>taxa. Evolution 36:427–443.</p><p>Wu, C. Y., and M. D. Brennan. 1993. Similar tissue-specific</p><p>expression of the Adh genes from different Drosophila species</p><p>is mediated by distinct arrangements of cis-acting sequences.</p><p>Mol. Gen. Genet. 240:58–64.</p><p>Xu, P.-X., X. Zhang, S. Heaney, A. Yoon, A. M. Michelson, and</p><p>R. L. Maas. 1999. Regulation of Pax6 expression is conserved</p><p>between mice and flies. Development 126:383–395.</p><p>Yamamoto, K. R., B. D. Darimont, R. L. Wagner, and J. A.</p><p>Iñiguez-Lluhı́. 1998. Building transcriptional regulatory com-</p><p>plexes: signals and surfaces. Cold Spring Harbor Symp.</p><p>Quant. Biol. 63:587–598.</p><p>Yamamoto, Y., and W. R. Jeffery. 2000. Central role for the lens</p><p>in cave fish eye degeneration. Science 289:631–633.</p><p>Yan, H., W. S. Yuan, V. E. Velculescu, B. Vogelstein, and K. W.</p><p>Kinzler. 2002. Allelic variation in human gene expression.</p><p>Science 297:1143–1143.</p><p>Yu, H., S. H. Yang, and C. J. Goh. 2002. Spatial and temporal</p><p>expression of the orchid floral homeotic gene DOMADS1 is</p><p>mediated by its upstream regulatory regions. Plant Mol. Biol.</p><p>49:225–237.</p><p>Yuh, C. H., H. Bolouri, and E. H. Davidson. 1998. Genomic cis-</p><p>regulatory logic: experimental and computational analysis of</p><p>a sea urchin gene. Science 279:1896–1902.</p><p>———. 2001. 
Cis-regulatory logic in the endo16 gene: switching from a specification to a differentiation mode of control. Development 128:617–629.</p><p>Yuh, C.-H., C. T. Brown, C. B. Livi, L. Rowen, P. J. C. Clarke, and E. H. Davidson. 2002. Patchy interspecific sequence similarities efficiently identify positive cis-regulatory elements in the sea urchin. Dev. Biol. 246:148–161.</p><p>Yun, K. S., and B. Wold. 1996. Skeletal muscle determination and differentiation: story of a core regulatory network and its context. Curr. Opin. Cell Biol. 8:877–889.</p><p>Zeller, R. W., J. D. Griffith, J. G. Moore, C. V. Kirchhamer, R. J. Britten, and E. H. Davidson. 1995. A multimerizing transcription factor of sea urchin embryos capable of looping DNA. Proc. Natl. Acad. Sci. USA 92:2989–2993.</p><p>Zerucha, T., T. Stuhmer, G. Hatch, B. K. Park, Q. M. Long, G. Y. Yu, A. Gambarotta, J. R. Schultz, J. L. R. Rubenstein, and M. Ekker. 2000. A highly conserved enhancer in the Dlx5/Dlx6 region is the site of cross-regulatory interactions between Dlx genes in the embryonic forebrain. J. Neuro. 20:709–721.</p><p>Zhu, J., and M. Q. Zhang. 1999. SCPD: a promoter database of the yeast Saccharomyces cerevisiae. Bioinformatics 15:607–611.</p><p>Zuckerkandl, E. 1963. Perspectives in molecular anthropology. Pp. 256–274 in A. Rich and N. Davidson, eds. Structural chemistry and molecular biology. W.H. Freeman, San Francisco.</p><p>William Jeffery, Associate Editor</p><p>Accepted April 1, 2003</p><p>2002; Vidigal, Gemner, and Zein 2002). Some promoter alleles confer protection from one pathogen while increasing susceptibility to another (e.g., TNFα−380A: Meyer et al. 
2002), raising the possibility of balanced polymorphisms.</p><p>2.5 Divergence in Promoter Function May Contribute to Reproductive Isolation</p><p>Changes in transcriptional regulation may also be important in speciation. The Dobzhansky-Muller model of speciation requires interspecific differences at pairs of interacting loci (Dobzhansky 1936; Muller 1942). Because of the large number of highly specific interactions that occur between proteins and DNA within promoters, these regions represent likely sites for postzygotic isolation resulting from multilocus epistasis (Johnson and Porter 2000). Empirical support comes from genetic loci that are involved in reproductive isolation. Only four such loci have been identified definitively, and all have turned out to involve changes in transcriptional regulation: the coding sequence of the transcription factor Odysseus within the genus Drosophila (Ting et al. 1998); promoter sequences of Xmk2 and CKDN2X within the teleost genus Xiphophorus (reviewed in Orr and Presgraves 2000); and a promoter polymorphism in desaturase 2 of D. melanogaster that is correlated with intraspecific differences in mating behavior and may be involved in premating isolation (Fang, Takahashi, and Wu 2002).</p><p>3 Transcriptional Regulation in Eukaryotes</p><p>The familiar regularities that characterize coding sequences, in particular the genetic code, are absent from promoters. Understanding the functional consequences of evolutionary differences in promoter sequences therefore requires a clear knowledge of the mechanisms of transcriptional regulation. 
In this section, we review the</p><p>structure and function of eukaryotic promoters. The</p><p>literature on this topic is vast, and the emphasis here is</p><p>on features directly pertinent to promoter evolution. Our</p><p>focus is on the transcription of protein-coding loci, which</p><p>comprise the majority of genes in eukaryotic genomes and</p><p>about which the most information is available. Transcrip-</p><p>tional regulation in Eubacteria is distinct in many ways</p><p>(Struhl 1999; Lewin 2000), whereas in Archaea it is not</p><p>particularly well understood (although the latter shares</p><p>many features with eukaryotic regulation: Bell and</p><p>Jackson 1998; Weinzierl 1999). Neither prokaryotic group</p><p>is covered in this review. For more detailed reviews of</p><p>mechanisms of eukaryotic transcriptional regulation see</p><p>Latchman (1998), Weinzierl (1999); Carey and Smale</p><p>(2000), Lee and Young (2000), Lewin (2000), Davidson</p><p>(2001), Locker (2001), and White (2001).</p><p>3.1 Promoters and Gene Expression</p><p>Only some of the genes in a eukaryotic cell are</p><p>expressed at any given moment. The proportion and</p><p>composition of transcribed genes changes considerably</p><p>during the life cycle, among cell types, and in response to</p><p>fluctuating physiological and environmental conditions</p><p>(e.g., White et al. 1999; Iyer et al. 2001; Kayo et al. 2001;</p><p>Mody et al. 2001; Arbeitman et al. 2002). 
Given that</p><p>eukaryotic genomes contain on the order of 0.5 to 53 104</p><p>genes, regulating this differential gene expression requires</p><p>an exceptionally complex array of specific physical in-</p><p>teractions among macromolecules.</p><p>3.1.1 Most Regulation of Gene Expression Occurs at the</p><p>Level of Transcription</p><p>Eukaryotes employ diverse mechanisms to regulate</p><p>gene expression, including chromatin condensation, DNA</p><p>methylation, transcriptional initiation, alternative splicing</p><p>of RNA, mRNA stability, translational controls, several</p><p>forms of post-translational modification, intracellular</p><p>trafficking, and protein degradation (Lewin 2000; Alberts</p><p>et al. 2002). Of these broad categories, the most common</p><p>point of control is the rate of transcriptional initiation</p><p>(Latchman 1998; Carey and Smale 2000; Lemon and Tjian</p><p>2000; White 2001). For virtually every eukaryotic gene</p><p>where relevant information exists, transcriptional initiation</p><p>appears to be the primary determinant, or one of the most</p><p>important determinants, of the overall gene expression</p><p>profile.</p><p>3.1.2 Transcriptional Regulation Is Primarily Gene-</p><p>Specific</p><p>To a first approximation, the transcription of each</p><p>gene in a eukaryotic genome is controlled independently.</p><p>Operons (multi-locus transcripts regulated by a single</p><p>promoter) are unusual in eukaryotes, a contrast with most</p><p>prokaryotes. (Eukaryotic exceptions include the protozoan</p><p>Trypanosoma brucei and the nematode Caenorhabditis</p><p>elegans, where a substantial fraction of genes are</p><p>transcribed as polycistronic mRNAs: Blumenthal 1998).</p><p>Even paralogs within gene families are typically regulated</p><p>independently and often have quite different expression</p><p>profiles (e.g., Ferris and Whitt 1979; Fang and Brandhorst</p><p>1996; Christophides et al. 2000; Gu et al. 2002). 
Although</p><p>a regulatory region sometimes directly influences the</p><p>transcription of two loci (for examples, see section 3.3.7</p><p>and fig. 2), such cases apparently are uncommon.</p><p>Distributed transcriptional regulation allows selection to</p><p>fine-tune the expression profile of each gene independently.</p><p>3.1.3 Gene Expression Profiles Are Complex</p><p>Most genes are differentially transcribed across the</p><p>life cycle, according to environmental conditions, in dif-</p><p>ferent cell types and regions, and among sexes. Transcrip-</p><p>tional regulation is a highly dynamic process: rates of RNA</p><p>synthesis can fluctuate by orders of magnitude, change</p><p>over time scales of minutes, and differ among adjacent</p><p>cells. Most genes have spatially and temporally heteroge-</p><p>neous expression profiles. Genes encoding regulatory</p><p>proteins possess some of the most complex expression</p><p>profiles. In metazoans and metaphytes, most such genes</p><p>are expressed in several distinct domains (Gerhart and</p><p>Kirschner 1997; Davidson 2001). For instance, the</p><p>transcription factor Pax-6 is expressed at different times</p><p>and at different levels in the telencephalon, hindbrain, and</p><p>spinal cord of the central nervous system; in the lens,</p><p>cornea, neural and pigmented retina, lacrimal gland, and</p><p>conjunctiva of the eye; and in the pancreas (Kammandel et</p><p>al. 1999). Where data are available, they link distinct</p><p>phases of these complex expression profiles to distinct</p><p>regulatory functions (Wray and Lowe 2000; Davidson</p><p>2001; Wilkins 2002). 
Although the transcription profiles of</p><p>‘‘housekeeping’’ genes are generally much simpler, most</p><p>are transcribed at different levels among cell types and are</p><p>shut down in response to extreme environmental con-</p><p>ditions such as heat shock.</p><p>3.1.4 Promoters Integrate Information and Alter</p><p>Transcription Accordingly</p><p>At its most fundamental level, the function of</p><p>a promoter is to integrate information about the status</p><p>of the cell in which it resides, and to alter the rate of</p><p>transcriptional initiation of a single gene accordingly. The</p><p>inputs that a promoter integrates can take many forms. The</p><p>promoters of genes expressed during early development</p><p>integrate spatial and temporal inputs to produce highly</p><p>dynamic patterns of transcription in specific regions of the</p><p>embryo (Davidson 2001; Wilkins 2002). The promoters of</p><p>genes encoding housekeeping proteins are constitutively</p><p>active, but they can shut down in response to specific</p><p>conditions, such as heat shock or starvation (Pirkkala,</p><p>Nykanen, and Sistonen 2001). Other promoters are off by</p><p>default, but they can be activated in response to specific</p><p>hormonal, physiological, or environmental cues (Benecke,</p><p>Gaudon, and Gronemeyer 2001; Shore and Sharrocks</p><p>2001). These diverse inputs eventually reach promoters in</p><p>the form of transcription factors, proteins that bind in</p><p>Evolution of Transcriptional Regulation 1381</p><p>D</p><p>ow</p><p>nloaded from</p><p>https://academ</p><p>ic.oup.com</p><p>/m</p><p>be/article/20/9/1377/976747 by guest on 07 August 2024</p><p>a sequence-specific manner to the DNA near a gene,</p><p>altering rates of transcriptional initiation. 
The shifting</p><p>array of active transcription</p><p>factors within the nucleus</p><p>determines whether a gene is transcribed or not and how</p><p>much mRNA is produced from it.</p><p>3.2 Promoter Structure</p><p>The organization of promoters is much less regular</p><p>than that of coding sequences and lacks an equivalent</p><p>of the genetic code or other sequence features that pro-</p><p>vide a consistent relationship to function. This fact has</p><p>far-reaching implications for studying the evolution</p><p>of promoter structure and function (see section 5).</p><p>3.2.1 Promoters Lack Universal Structural Features</p><p>No consistent sequence motifs exist for promoters of</p><p>protein-coding genes. Two functional features are always</p><p>present (fig. 1A), although they cannot always be</p><p>recognized from sequence information alone. One is</p><p>a basal promoter (or core promoter), the site upon which</p><p>the enzymatic machinery of transcription assembles.</p><p>Although necessary for transcription, the basal promoter</p><p>is apparently not a common point of regulation, and it</p><p>cannot by itself generate functionally significant levels of</p><p>mRNA (Kuras and Struhl 1999; Lee and Young 2000;</p><p>Lemon and Tjian 2000). The other functional feature is</p><p>a collection of diverse transcription factor binding sites</p><p>that confer specificity of transcription. Proteins bound to</p><p>these sites produce a scalar response, the frequency with</p><p>which new transcripts are initiated (Latchman 1998;</p><p>Davidson 2001; Locker 2001).</p><p>3.2.2 The Transcriptional Machinery Assembles on the</p><p>Basal Promoter</p><p>Eukaryotic genes that encode proteins are transcribed</p><p>by the RNA polymerase II holoenzyme complex, which is</p><p>composed of 10 to 12 proteins (Orphanides, Lagrange, and</p><p>Reinberg 1998; Lee and Young 2000). 
This transcriptional</p><p>machinery assembles on the basal promoter, a ;100-bp</p><p>region whose functions are to provide a docking site for</p><p>the transcription complex and to position the start of</p><p>transcription relative to coding sequences (Reinberg et al.</p><p>1998; Lee and Young 2000; Pugh 2001). Basal promoter</p><p>sequences differ among genes. For many genes, the critical</p><p>binding site is a TATA box, usually located about 25–30</p><p>bp 59 of the transcription start site. However, many genes</p><p>lack a TATA box and instead contain an initiator element</p><p>spanning the transcription start site. So-called null basal</p><p>promoters exist that contain neither a TATA box nor an</p><p>initiator element, and some basal promoters that contain</p><p>one or the other also contain additional protein binding</p><p>sites for general transcription factors (Carey and Smale</p><p>2000; Ohler and Niemann 2001). A gene may have more</p><p>than one basal promoter, each of which initiates tran-</p><p>scription at a distinct position (fig. 2J and K), and both</p><p>TATA and TATA-less basal promoters can be associated</p><p>with alternate start sites of the same gene (Goodyer et al.</p><p>2001). The functional consequences of differences in basal</p><p>promoter structure are not well understood, although genes</p><p>with TATA-less basal promoters may generally be</p><p>FIG. 1.—Promoter structure and function. (A) Organization of a generalized eukaryotic gene, showing the relative position of the transcription unit,</p><p>basal promoter region (black box with bent arrow), and transcription factor binding sites (vertical bars). The position of transcription factor binding sites</p><p>differs enormously between loci; although they often reside within a few kb 59 of the start site of transcription (as shown here), many other</p><p>configurations are possible (fig. 2). (B) Idealized promoter in operation. 
Initiating transcription requires several dozen different proteins which interact with each other in specific ways. These include the RNA polymerase II holoenzyme complex (~15 proteins); TATA-binding protein (TBP; 1 protein); TAFs (TBP-associated factors, also known as general transcription factors; ~8 proteins); transcription factors (precise composition and number bound differs among loci and varies in space and time and according to environmental conditions, but several to many any time transcription is active); transcription cofactors (again, precise composition and number will vary); and chromatin remodeling complexes (which can contain a dozen or more proteins).</p><p>FIG. 2.—A bestiary of promoters. The known cis-regulatory regions of several genes from diverse eukaryotes are shown (partial promoters in panels M and N). These locus maps are drawn to the same scale (upper right): black boxes = precisely mapped regulatory regions; gray boxes = regions containing regulatory sequences that have not been mapped precisely (the actual extent of regulatory sequences is likely to be much smaller); white boxes = exons (UTRs and coding sequences); bent arrows = transcription start sites; numbers = distinct regulatory regions whose contribution to the total transcription profile has been defined experimentally; dashed lines indicate interactions between a module and more than one locus or a nonadjacent locus. Note the wide range in the spatial extent and position of cis-regulatory sequences. The smallest promoters of polymerase II–transcribed genes are in the range of 200–300 bp (A, B, J); in exceptional cases regulatory modules may lie more than 200 kb from the start of transcription (Q). 
Transcription</p><p>factor binding sites generally reside in 5′ flanking sequences (E–H), but may also lie in the 5′ UTR (P: Scr module 1), introns (L: Otx modules 7–11),</p><p>and 3′ flanking sequences (Q: BMP5 modules 1–5). Nearly all promoters are compact in Saccharomyces cerevisiae (J), but promoter size differs</p><p>enormously between loci in the sea urchin Strongylocentrotus purpuratus (compare A, E, L). Many promoters are highly modular, with different</p><p>regulatory regions producing discrete components of the transcription profile (D–F, I, K). Some modules regulate transcription at more than one time</p><p>and place during development (G, K: APETALA3 module 3 and eve modules 8–11). Conversely, some expression domains are regulated by more than</p><p>one module (I: eve modules 3–7, 10, and 11 are required to produce the seven embryonic stripes of eve transcription; K: modules 1 and 4 are required</p><p>for transcription of PAX6 in the retina). Genes expressed in similar patterns sometimes have rather different promoter organization (P: module 2 of ftz</p><p>produces seven embryonic stripes of transcription that are very similar to the ones produced by modules 3–7, 10, and 11 of eve). Although the cis-regulatory</p><p>sequences of a given locus generally lie between it and the two flanking loci, in unusual cases there may be an intervening transcription unit.</p><p>For instance, ftz lies between cis-regulatory sequences that interact only with Scr (P: module 4). In some cases, regulatory regions influence</p><p>transcription at more than one locus. These may be divergently or convergently transcribed tandem paralogs (M, N: Yp1/Yp2 and Dlx6/Dlx4,</p><p>respectively) or even genealogically unrelated adjacent loci (J, M: GAL10/GAL1 and APOC3/APOA1). Loci, protein product, taxon, and references: (A)</p><p>SpHE (metalloendoprotease) of the sea urchin Strongylocentrotus purpuratus (Wei et al. 
1995); (B) DQA1 (histocompatibility protein) of Homo</p><p>sapiens (Petronzelli et al. 1995); (C) bMHC (myosin heavy chain) of Rattus rattus (Wright et al. 1999); (D) lbc3 (leghemoglobin) of Glycine max</p><p>(Stougaard et al. 1987); (E) Endo16 (cell adhesion protein) of S. purpuratus (Yuh, Bolouri, and Davidson 1998, 2001); (F) forkhead (winged-helix</p><p>transcription factor) of the urochordate Ciona intestinalis (Di Gregorio, Corbo, and Levine 2001); (G) APETALA3 (MADS-box transcription factor) of</p><p>Arabidopsis thaliana (Hill et al. 1998); (H) DOMADS1 (MADS-box transcription factor) of the orchid Dendrobium cv Madame Thong-In (Yu, Yang,</p><p>and Goh 2002); (I) even-skipped (homeodomain transcription factor) of D. melanogaster (Sackerson, Fujioka, and Goto 1999); (J) GAL10 and GAL1</p><p>(genealogically unrelated metabolic enzymes) of Saccharomyces cerevisiae (West, Yocum, and Ptashne 1984); (K) PAX6 (paired-box transcription</p><p>factor) of Mus musculus (Kammandel et al. 1999); (L) Otx (homeodomain transcription factor) of S. purpuratus (Yuh et al. 2002); (M) Yp1 and Yp2</p><p>(paralogous yolk proteins) of D. melanogaster (Chung et al. 1996); (N) Dlx6 and Dlx4 (paralogous homeodomain transcription factors) of Danio rerio</p><p>(Zerucha et al. 2000; the intron/exon structure of the loci is not known in detail; only shared regulatory elements are shown); (O) APOC3 and APOA1</p><p>(genealogically unrelated lipid carrier proteins; only shared regulatory elements are shown) of H. sapiens (Li et al. 1995; Naganawa et al. 1997);</p><p>(P) ftz and Scr (paralogous homeodomain transcription factors) of D. melanogaster (Calhoun, Stathopoulos, and Levine 2002); (Q) BMP5</p><p>(signaling protein) of M. 
musculus (DiLeone, Russell, and Kingsley 1998; the position of exon 3 is not known precisely; splice patterns are omitted for</p><p>simplicity).</p><p>transcribed constitutively at relatively low levels (Pugh</p><p>2001). A key early step in transcriptional initiation is</p><p>attachment of TATA-binding protein (TBP) to DNA</p><p>(Jackson-Fisher et al. 1999; Kuras and Struhl 1999). In</p><p>promoters lacking TATA boxes, proteins that associate</p><p>with other basal promoter motifs facilitate TBP association</p><p>with DNA in a sequence-independent manner. Once TBP</p><p>binds, several TBP-associated factors (TAFs) guide the</p><p>RNA polymerase II holoenzyme complex onto the DNA</p><p>(fig. 1B). This step, which can be positively or negatively</p><p>modulated by transcription factors bound at other sites, is</p><p>one of the most important points of transcriptional</p><p>regulation (Latchman 1998; Lee and Young 2000; Lemon</p><p>and Tjian 2000).</p><p>3.2.3 The Start Site of Transcription Varies in Both</p><p>Sequence and Position</p><p>The start site of transcription, unlike the start site of</p><p>translation, does not require a specific sequence motif and</p><p>cannot be identified from sequence data. After the RNA</p><p>polymerase II holoenzyme complex assembles onto DNA,</p><p>a second contact is established ~30 bp downstream. This</p><p>second contact point is the start site of transcription. It is</p><p>thus the physical size of the transcriptional machinery,</p><p>together with the particular composition of binding sites that facilitate its</p><p>binding to the basal promoter, that determines where</p><p>transcription begins (fig. 1B). 
Spacing between the start</p><p>sites of transcription and translation differs considerably</p><p>among genes, ranging from ~10<sup>1</sup> to 10<sup>4</sup> bp; the 5′</p><p>untranslated region (UTR) can also contain introns that</p><p>alter its length post-transcriptionally. The functional consequences</p><p>of differences in 5′ UTR length are not well</p><p>understood.</p><p>3.2.4 Basal Promoters Provide Limited Transcriptional</p><p>Activity and Specificity</p><p>By itself, a basal promoter initiates transcription at</p><p>a very low rate, even when the local chromatin is suitably</p><p>decondensed (Jackson-Fisher et al. 1999; Kuras and Struhl</p><p>1999; Lemon and Tjian 2000). Furthermore, most of the</p><p>proteins that bind to basal promoter motifs are ubiquitously</p><p>expressed and therefore provide little regulatory</p><p>specificity (Carey and Smale 2000; Lee and Young 2000;</p><p>Lemon and Tjian 2000). These proteins are known as</p><p>general transcription factors. A few tissue-specific isoforms</p><p>of these proteins are known, however, and may</p><p>exert some degree of transcriptional regulation (Holstege</p><p>et al. 1998; Smale et al. 1998). Additional mechanisms of</p><p>transcriptional regulation involving the basal promoter are</p><p>discussed later (see section 3.3.6).</p><p>3.2.5 Specificity of Transcription Is Controlled by</p><p>Proteins that Bind to Discrete, Idiosyncratic Sites</p><p>Producing functionally significant levels of mRNA</p><p>requires the sequence-specific association of transcription</p><p>factors with DNA sequences outside the basal promoter</p><p>(Weinzierl 1999; Carey and Smale 2000; Lemon and Tjian</p><p>2000). The composition and organization of these</p><p>transcription factor binding sites vary enormously among</p><p>genes (fig. 2). The nucleotide sequences of these binding</p><p>sites determine which transcription factors are capable of</p><p>associating with the promoter of a given gene. 
Which</p><p>transcription factors actually do so depends on which of</p><p>them are present in the nucleus in an active form and, in</p><p>many cases, on the presence of cofactors as well (Locker</p><p>2001). The complement of active transcription factors</p><p>within the nucleus differs during the course of development,</p><p>in response to environmental conditions, across</p><p>regions of the organism, and among cell types (Latchman</p><p>1998; Davidson 2001). This changing array of transcription</p><p>factors provides nearly all of the control over when,</p><p>where, at what level, and under what circumstances a particular</p><p>gene is transcribed. Thus, the genetic basis for the</p><p>expression profile of each gene resides in part within its</p><p>promoter and in part within the many other segments of</p><p>the genome that encode specific transcription factors that</p><p>bind to the promoter.</p><p>3.3 Transcription Factor Binding Sites</p><p>The composition and configuration of transcription</p><p>factor binding sites near a gene are major determinants of</p><p>its expression profile, and they therefore constitute an</p><p>important class of sequences that are potential targets of</p><p>natural selection on gene expression.</p><p>3.3.1 Promoters Contain Numerous Transcription Factor</p><p>Binding Sites</p><p>Identifying genuine binding sites is not straightforward</p><p>for a variety of reasons (see sections 3.3.3 and 5.2;</p><p>Weinzierl 1999; Carey and Smale 2000). It is difficult to</p><p>be certain that all functional binding sites within a promoter</p><p>have been identified, and it is prudent to assume that some</p><p>binding sites remain uncharacterized even within well-studied</p><p>promoters. Because of this uncertainty, the range</p><p>and average number of binding sites found in a typical</p><p>promoter are not known, much less any correlations</p><p>between these parameters and the nature of the gene</p><p>product or mode of expression. 
Nonetheless, a perusal of</p><p>well-characterized eukaryotic promoters suggests that</p><p>numbers on the order of 10–50 binding sites for 5–15</p><p>different transcription factors are not unusual (for examples,</p><p>see Arnone and Davidson [1997] and Wilkins [2002]).</p><p>3.3.2 Transcription Factor Binding Sites Are Distributed</p><p>Sparsely and Unevenly</p><p>Binding sites typically comprise a minority of the</p><p>nucleotides within a promoter region. This fraction ranges</p><p>from roughly 10% to just over 20% within relatively well-studied regulatory</p><p>regions (table 1, fig. 3). These regions are often interspersed</p><p>with regions that contain no binding sites (fig. 2).</p><p>Disjunct regulatory regions often produce discrete portions</p><p>of the total transcription profile (see section 3.5.4).</p><p>Nucleotides that do not affect the specificity of transcription</p><p>factor binding are generally assumed to be nonfunctional</p><p>with respect to transcription. In some cases,</p><p>however, these nucleotides may influence the local</p><p>conformation of DNA, with direct consequences for</p><p>protein binding (e.g., Naylor and Clark 1990; Hizver</p><p>et al. 2001; Rothenburg et al. 2001). Spacing between</p><p>binding sites varies enormously, from partial overlap to</p><p>tens of kilobases (figs. 2 and 3). Functional constraints on</p><p>binding site spacing are often related to protein interactions</p><p>that take place during DNA binding (see section</p><p>3.5.2).</p><p>3.3.3 Transcription Factor Binding Sites Are Short and</p><p>Imprecise</p><p>Because of the way transcription factors interact with</p><p>DNA, several different criteria are used to define binding</p><p>sites. (1) Physical contact versus binding specificity. 
The</p><p>segment of DNA protected from nuclease digestion by</p><p>a transcription factor (its “footprint”) is typically wider</p><p>than the nucleotides that confer binding specificity (its</p><p>binding site). Most transcription factor binding sites span</p><p>5–8 bp (table 2), whereas footprints are typically 10–20</p><p>bp. (2) Single versus multiple sequences. Most binding</p><p>sites can tolerate at least one, and often more, specific</p><p>nucleotide substitution without completely losing functionality</p><p>(Latchman 1998; Courey 2001). This is evident</p><p>from comparing different binding sites known to bind</p><p>the same transcription factor and from in vitro assays</p><p>of protein-DNA binding (see section 3.4.4; for examples</p><p>within a single promoter, see fig. 3). The full range of</p><p>sequences (in practice, often poorly understood) that can</p><p>bind a particular transcription factor with significantly</p><p>higher specificity than random DNA under physiological</p><p>conditions is often described by a position weight matrix,</p><p>in which the probability that each position in the binding</p><p>site will be represented by a particular nucleotide is</p><p>tabulated. When binding site matrices are factored in, the</p><p>number of nucleotides required for specific protein binding</p><p>drops to about 4–6 bp for a typical binding site (table 2).</p><p>Although binding site matrices are generally composed of</p><p>related sequences, some transcription factors bind to rather</p><p>different sequences in association with different binding</p><p>partners (e.g., jun/jun, fos/jun, CRE-BP1/jun dimers:</p><p>Latchman 1998; Fairall and Schwabe 2001). (3) Informatic</p><p>versus functional consensus. The term consensus sequence</p><p>refers to the single “best” variant of the binding site matrix</p><p>or to a degenerate sequence that captures most of the</p><p>binding site matrix (table 2). 
Two rather different criteria</p><p>are used to define consensus sequences: sequence comparisons</p><p>(most commonly, simply the average sequence of</p><p>multiple instances of binding sites for the same protein) and</p><p>biochemical assays (the single variant with the highest</p><p>affinity for the protein in vitro).</p><p>3.3.4 Many Potential Binding Sites Are Nonfunctional</p><p>Given that there are many different transcription</p><p>factors with different binding matrices, and given that</p><p>binding sites are short and imprecise, every kilobase of</p><p>genomic DNA contains many dozens of potential transcription</p><p>factor binding sites on the basis of random</p><p>similarity (Carroll, Grenier, and Weatherbee 2001; Stone</p><p>and Wray 2001). For a variety of reasons (fig. 4), many of</p><p>these consensus matches do not bind protein in vivo and</p><p>have no influence on transcription (Biggin and McGinnis</p><p>1997; Weinzierl 1999; Li and Johnston 2001). Identifying</p><p>the potential binding sites that actually bind protein</p><p>requires biochemical and experimental tests (see sections</p><p>5.2 and 5.3).</p><p>3.3.5 Variants Within a Binding Site Matrix Can Differ</p><p>Functionally</p><p>Although most transcription factors can bind to several</p><p>distinct sequences, they may do so with different kinetics</p><p>(Czerny, Schaffner, and Busslinger 1993; Carey and Smale</p><p>2000). Differences in binding affinities are particularly</p><p>important when two binding sites overlap physically or are</p><p>located very near each other, because only one binding site</p><p>can be occupied by protein at a time (fig. 3A: Otx, Z, and</p><p>CG binding sites). In such cases, differences in protein</p><p>concentrations and binding kinetics will determine which</p><p>binding site is occupied most of the time. 
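</p><p>As a rough illustration of how concentration and affinity jointly decide which of two overlapping sites is occupied, the competition can be sketched with a simple equilibrium model. This is an assumption introduced here for illustration and is not taken from the studies cited: binding is treated as mutually exclusive and at equilibrium, each protein is described only by its free concentration and dissociation constant, and the function name and the numerical values are hypothetical.</p>

```python
def site_occupancy(conc_a: float, kd_a: float, conc_b: float, kd_b: float) -> dict:
    """Equilibrium occupancy of two overlapping sites bound by protein A or B.

    Assumes three mutually exclusive states (empty, A bound, B bound), so the
    statistical weights are 1, [A]/Kd_A, and [B]/Kd_B. Concentrations and
    dissociation constants must share the same units (for example, molar).
    """
    weight_a = conc_a / kd_a
    weight_b = conc_b / kd_b
    total = 1.0 + weight_a + weight_b
    return {
        "empty": 1.0 / total,
        "A": weight_a / total,
        "B": weight_b / total,
    }


# Hypothetical values: equal concentrations, but A binds tenfold more tightly.
example = site_occupancy(conc_a=1e-9, kd_a=1e-10, conc_b=1e-9, kd_b=1e-9)
for state, fraction in example.items():
    print(f"{state}: {fraction:.2f}")
```

<p>Under these assumptions, raising the concentration of the weaker binder can compensate for its lower affinity, which is one way to read the statement that both concentration and binding properties determine occupancy.</p><p>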
Differences in</p><p>kinetics can also be important for binding sites not near each</p><p>other, because active promoters compete for a single pool of</p><p>transcription factors within each nucleus and there are</p><p>typically fewer transcription factors present than there are</p><p>binding sites in a genome.</p><p>3.3.6 Transcription Factor Binding Sites Occupy a Wide</p><p>Range of Positions Relative to the Transcription Unit</p><p>Although transcription factor binding sites sometimes</p><p>occupy a single, discrete region near the start site of</p><p>transcription (fig. 2A–E), in many cases they are dispersed</p><p>Table 1</p><p>Density of Binding Site Nucleotides in Promoter Regions</p>
<table>
<tr><th>Locus/Species</th><th>Region<sup>a</sup></th><th>Binding/Nonbinding<sup>b</sup></th><th>Proportion Binding</th><th>Reference<sup>c</sup></th></tr>
<tr><td>Endo16, Strongylocentrotus purpuratus</td><td>Module A</td><td>33/130</td><td>0.22</td><td>1</td></tr>
<tr><td>leghemoglobin, Glycine max</td><td></td><td>23/191</td><td>0.11</td><td>2</td></tr>
<tr><td>Adh, Arabidopsis thaliana</td><td></td><td>60/440</td><td>0.12</td><td>3</td></tr>
<tr><td>Adh, Drosophila melanogaster</td><td></td><td>345/1155</td><td>0.23</td><td>4</td></tr>
<tr><td>even-skipped, Drosophila melanogaster</td><td>Stripe 2 module</td><td>285/1430</td><td>0.10</td><td>5</td></tr>
</table>
<p><sup>a</sup> Few promoters have been analyzed in sufficient detail that nucleotides over their entire extent can be confidently assigned</p><p>to binding sites versus nonbinding sites (see section 5.2). For all the examples shown here, the promoter is larger (in some cases</p><p>much larger) than the region for which detailed information is available (fig. 2).</p><p><sup>b</sup> Nucleotides identified by the authors as involved in specific binding of transcription factors, as a fraction of all nucleotides</p><p>comprising the module or promoter region. Somewhat different criteria were used to identify binding sites in these studies,</p><p>and tallies of binding site nucleotides are likely to be underestimates (see section 5.2).</p><p><sup>c</sup> References: (1) Yuh, Bolouri, and Davidson 2001; (2) Stougaard et al. 1987, Andersson et al. 1996; (3) Miyashita</p><p>2001; (4) Nurminsky et al. 
1996; (5) Small, Blair, and Levine 1992.</p><p>Evolution of Transcriptional Regulation 1385</p><p>D</p><p>ow</p><p>nloaded from</p><p>https://academ</p><p>ic.oup.com</p><p>/m</p><p>be/article/20/9/1377/976747 by guest on 07 August 2024</p><p>into several distinct clusters (fig. 2I, K–L, and P). The</p><p>physical extent of cis-regulatory regions varies by nearly</p><p>three orders of magnitude, from a few hundred base pairs</p><p>to .100 kb (fig. 2). An extreme example of physical</p><p>dispersion is a regulatory module of the Shh locus in</p><p>humans and mice that lies ;800 kb distant from the start</p><p>site of transcription (Lettice et al. 2002). The position of</p><p>transcription factor binding sites relative to the transcrip-</p><p>tion unit also differs enormously among genes. They often</p><p>lie within a few kb 59 of the basal promoter (fig. 2A–G),</p><p>but they can occupy a wide range of other positions: . 30</p><p>kb 59 of the basal promoter (e.g., Ubx in D. melanogaster:</p><p>Simon et al. 1990; Pax-6 in mouse: Kammandel et al.</p><p>1999; APOB in humans: Nielsen et al. 1998); within the 59</p><p>UTR (Scr in D. melanogaster: Calhoun, Stathopoulos, and</p><p>Levine 2002); within introns (Otx in the sea urchin</p><p>Strongylocentrotus purpuratus: Yuh et al. 2002; CCR5</p><p>in humans: Bamshad et al. 2002); . 30 kb 39 of the</p><p>transcription unit (BMP5 in mouse: DiLeone, Russell, and</p><p>Kingsley 1998); and, in rare instances, even within</p><p>a coding exon (keratin 18 in humans: Neznanov,</p><p>Umezawa, and Oshima 1997; nonA in Drosophila:</p><p>Sandrelli et al. 2001). This diversity of positions is</p><p>possible because DNA looping allows interaction between</p><p>FIG. 3.—Examples of binding site organization. Transcription factor binding sites within three cis-regulatory regions are shown to scale. 
Boxes</p><p>indicate nucleotides that contribute to binding specificity (with the exception of Gt, where footprints are shown); transcription factor names or binding</p><p>site motifs are shown above binding sites; tx initiation = start site of transcription; tl initiation = start site of translation; solid bars under inset maps of</p><p>locus organization indicate the approximate position of the sequence shown. Note that nucleotides contributing to protein binding comprise only a small</p><p>fraction of the total, even within those regions where binding sites are relatively dense (see also table 1). Multiply-represented binding sites are often</p><p>present in both orientations and represent variations of the consensus (CG and GCF1 of Endo16; Kr and Bc of eve; CTCTT nodulin motif of lbc3).</p><p>Spacing between binding sites provides clues to function: those centered ~10 bp apart may bind proteins that interact on the same side of the DNA</p><p>helix (Otx and CG of Endo16; upstream nodulin motifs of lbc3); those that overlap may operate as a switch, with one protein preventing the other from</p><p>binding under some conditions (Otx, Z, and CG of Endo16; Kr and Bc of eve); whereas those more than about 20 bp apart probably bind proteins that</p><p>either do not interact or do so through DNA bending or looping. (A) Module A and basal promoter of Endo16 from the sea urchin Strongylocentrotus</p><p>purpuratus (Yuh, Bolouri, and Davidson 1998). Module A contains 11 binding sites for five transcription factors, two of which interact with multiple</p><p>binding sites. (B) Stripe 2 module from eve of D. melanogaster (Small et al. 1992). This module contains 17 binding sites for four transcription factors,</p><p>all of which interact with multiple binding sites. Note that boxed nucleotides for Gt are footprints rather than binding sites. (C) Proximal promoter</p><p>region of lbc3 of Glycine max (Stougaard et al. 1987). 
This region contains five binding sites for at least three different transcription factors. Multiple</p><p>instances of two different nodulin motifs are present; these binding sites are found upstream of several genes that are expressed in root nodules of</p><p>legumes.</p><p>proteins associated with DNA at distant binding sites (fig.</p><p>1B) (see section 3.4.5). Binding sites may even lie on the</p><p>far side of an adjacent locus (fig. 2O). The position of</p><p>binding sites for some transcription factors may be</p><p>functionally constrained. For instance, CCAAT binding</p><p>sites for the transcription factor CBP (CREB binding</p><p>protein) are generally located 50–100 bp 5′ of the</p><p>transcription start site, and those for Sp1 are often located</p><p>near the basal promoter of many mammalian genes. For</p><p>most transcription factors, however, binding sites lack any</p><p>obvious spatial restriction relative to other features of the</p><p>locus. In general, the functional consequences of binding</p><p>site position are poorly understood.</p><p>3.3.7 Specific Sequences Limit the Regulatory Influence of</p><p>Binding Sites</p><p>Because binding sites can interact with basal</p><p>promoters that are tens or even hundreds of kilobases</p><p>distant, they are potentially able to influence transcription</p><p>at more than one locus. At least three mechanisms spatially</p><p>restrict this influence. (1) Insulator sequences. Some, and</p><p>perhaps many, promoters are bounded by insulator</p><p>sequences (also known as boundary elements) (Wolffe</p><p>1994; Bell and Felsenfeld 1999; Dillon and Sabbattini</p><p>2000). Mechanisms of insulator function are not well</p><p>understood but appear to involve chromatin modulation</p><p>(Bell and Felsenfeld 1999). 
(2) Basal promoter selectivity.</p><p>Some regulatory sequences interact preferentially with</p><p>TATA or TATA-less basal promoters, even if a basal</p><p>promoter of the other kind is closer to them (Ohtsuki,</p><p>Levine, and Cai 1998). (3) Selective tethering. Sequences</p><p>immediately 5′ of a basal promoter may help selectively</p><p>recruit transcription factor complexes bound at distant</p><p>sites. For instance, an activator module (enhancer) located</p><p>close to the ftz locus in Drosophila associates only with the</p><p>more distant basal promoter of Scr (fig. 2O) (Calhoun,</p><p>Stathopoulos, and Levine 2002).</p><p>3.3.8 Some Binding Sites Affect Transcription at More</p><p>than One Locus</p><p>Although most binding sites directly influence the</p><p>expression of just one gene, many exceptions are known.</p><p>One manifestation is a “divergent promoter,” where</p><p>binding sites regulate transcription of paralogous loci that</p><p>lie on opposite strands of DNA with their 5′ ends centrally</p><p>located (fig. 2M). Binding site “sharing,” or cross-regulation,</p><p>of adjacent loci also occurs in other contexts:</p><p>paralogs that are transcribed convergently (fig. 2N) or in</p><p>parallel (e.g., beta-globin: Grosveld et al. 1993; Hox</p><p>complex: Ohtsuki, Levine, and Cai 1998; Kmita, Kondo,</p><p>and Duboule 2000) and even among genealogically unrelated</p><p>loci that lie near each other (fig. 2J and O). Single</p><p>mutations in single binding sites may affect the transcription</p><p>of more than one gene. In humans, for example, segregating</p><p>variants are known that simultaneously influence transcription</p><p>of the genes encoding beta-globin and gamma-globin</p><p>(Metherall, Gillespie, and Forget 1988; Grosveld et al.</p><p>1993), the insulin and IGF2 genes (Paquette et al. 1998),</p><p>and the APOA1 and APOCIII genes (Li et al. 1995;</p><p>Naganawa et al. 1997). In the last case (fig. 
2O), nucleotide</p><p>variants have distinct effects on each locus: the rare</p><p>haplotype downregulates APOA1 in the colon but upregulates</p><p>APOCIII in the liver. Cross-regulation may be the</p><p>reason for the long-term physical linkage of genes in the</p><p>Hox complexes of animals (Lufkin 2001). The general</p><p>prevalence of cross-regulation remains uncertain (Bonifer</p><p>2000). Even where cross-regulation is known to occur,</p><p>however, the involved loci are sometimes each regulated by</p><p>unique regulatory sequences as well as shared ones, providing</p><p>some degree of differential regulation.</p><p>3.4 Transcription Factors</p><p>The transcription of every gene is regulated by</p><p>transcription factors and cofactors that interact with its</p><p>promoter. The distant and dispersed regions of the genome</p><p>Table 2</p><p>Size and Information Content of Transcription Factor Binding Sites</p>
<table>
<tr><th>Transcription Factor</th><th>Consensus Binding Site<sup>a</sup></th><th>Information Content<sup>b</sup></th><th>Reference<sup>c</sup></th></tr>
<tr><td>C/EBP</td><td>RTTGCGYAAY</td><td>17 bits</td><td>1</td></tr>
<tr><td>Runt</td><td>TGYGGTY</td><td>12 bits</td><td>2</td></tr>
<tr><td>Krox-20</td><td>GCGGGGCG</td><td>16 bits</td><td>3</td></tr>
<tr><td>Otx</td><td>RGATTA</td><td>11 bits</td><td>4</td></tr>
<tr><td>eve</td><td>ATTA</td><td>8 bits</td><td>5</td></tr>
<tr><td>Pax-5</td><td>RNNCANTGNNGCGKRACSR</td><td>23 bits</td><td>6</td></tr>
<tr><td>AP1</td><td>CCWWWWWWGG</td><td>14 bits</td><td>7</td></tr>
<tr><td>Myc-bHLH</td><td>CACGTG</td><td>12 bits</td><td>8</td></tr>
<tr><td>TATA-binding protein</td><td>TATAWAW</td><td>12 bits</td><td>3</td></tr>
<tr><td>MNF1</td><td>CCRCCC</td><td>11 bits</td><td>9</td></tr>
<tr><td>Jun/Fos heterodimer</td><td>TGAGTCA</td><td>14 bits</td><td>10</td></tr>
<tr><td>Jun/CREB heterodimer</td><td>TGACGTCA</td><td>16 bits</td><td>10</td></tr>
<tr><td>PBC/Hox heterodimer</td><td>TGATNNATTA</td><td>16 bits</td><td>11</td></tr>
</table>
<p><sup>a</sup> Consensus sequence, as reported by the authors of the reference in the right-hand column. R = G/A, W = A/T, Y = C/T,</p><p>K = G/T, S = C/G, N = A/C/G/T.</p><p><sup>b</sup> Each nonredundant nucleotide position contains two bits of information (i.e., can be represented by two binary states);</p><p>similarly, twofold redundant positions contain one bit.</p><p><sup>c</sup> (1) Osada et al. (1996); (2) Kramer et al. 
(1999); (3) Latchman (1998); (4) Klein and Li (1999); (5) Biggin and McGinnis</p><p>(1997); (6) Czerny, Schaffner, and Busslinger (1993); (7) Riechmann, Wang, and Meyerowitz (1996); (8) Coller et al.</p><p>(2000); (9) Morishima (1998); (10) Benbrook and Jones (1990); (11) Lufkin (2001).</p><p>that encode these proteins constitute a second important</p><p>class of sequences that are potentially the target of natural</p><p>selection on the transcription profile of a particular gene.</p><p>3.4.1 Transcription Factors Belong to a Relatively Small</p><p>Number of Gene Families</p><p>Most transcription factors belong to gene families</p><p>(Latchman 1998; Locker 2001). The size of each transcription</p><p>factor gene family differs considerably among</p><p>genomes (table 3), but the reasons and functional</p><p>consequences of these differences are not understood.</p><p>Existing paralogs are the result of duplications that</p><p>occurred across a wide range of times, from before the</p><p>divergence of eukaryotic kingdoms to much more recently</p><p>(Duboule 1994; Bharathan et al. 1997; Dailey and Basilico</p><p>2001; Stauber, Prell, and Schmidt-Ott 2002). There are</p><p>approximately 12 to 15 structurally distinct DNA-binding</p><p>domains known from eukaryotic transcription factors</p><p>(Harrison 1991; Fairall and Schwabe 2001). For intensively</p><p>studied organisms, the known transcription factor families</p><p>may constitute a nearly complete list. Far less is known</p><p>about the diversity and evolutionary history of transcription</p><p>cofactors, proteins that bind to transcription factors but not</p><p>to DNA (fig. 
1B; see the following section and section</p><p>3.4.5).</p><p>3.4.2 Transcription Factors Are Structurally and</p><p>Functionally Modular Proteins</p><p>Most transcription factors contain several distinct</p><p>functional domains. These may include almost any</p><p>combination of the following. (1) DNA-binding domains.</p><p>The amino acids that comprise the DNA binding region</p><p>may be contiguous (e.g., homeodomain, MADS box) or</p><p>dispersed within the primary sequence (e.g., Zn-fingers).</p><p>Some transcription factors contain two distinct DNA</p><p>binding regions (e.g., many Pax family members contain</p><p>both a homeodomain and a paired-box domain). (2)</p><p>Protein-protein interaction domains. Transcription factors</p><p>engage in a variety of interactions with other proteins (see</p><p>section 3.4.5). Most transcription factors contain from one</p><p>to several such domains. Interaction domains, which</p><p>generally are more difficult to recognize from sequence</p><p>inspection than DNA binding domains, include leucine</p><p>zippers and the pentapeptide motif of homeodomain</p><p>proteins (Latchman 1998). (3) Domains that act as</p><p>intracellular trafficking signals. Many transcription factors</p><p>contain a nuclear localization signal. In some cases, the</p><p>activity of a transcription factor may be modulated by</p><p>controlling the ratio of cytoplasmic-to-nuclear localization</p><p>(e.g., Exd: Abu-Shaar, Ryoo, and Mann 1999). (4)</p><p>Ligand-binding domains. Some transcription factors can bind</p><p>ligands, such as specific steroid hormones, that</p><p>modulate their activity. Most known cases belong to the</p><p>nuclear receptor family (Benecke, Gaudon, and Gronemeyer</p><p>2001), but an unrelated Ca<sup>2+</sup>-binding transcription</p><p>factor has recently been discovered (Carrión et al. 1999).</p><p>Many protein-DNA binding domains predate the</p><p>divergence of plants and animals (e.g., homeodomain:</p><p>Bharathan et al. 
1997), as do some protein-protein</p><p>interaction domains (Bürglin 1997). The evolutionary</p><p>history of transcription factor gene families includes many</p><p>examples of “domain shuffling” and loss of specific</p><p>domains. For instance, a paralog may retain a DNA-binding</p><p>domain but lose a protein-protein interaction</p><p>domain responsible for transcriptional activation; the</p><p>resulting protein will function as a repressor if it competes</p><p>for binding sites with a paralog that contains an activation</p><p>domain (e.g., Sp family: Suske 1999). Transcription</p><p>cofactors, by definition, lack a DNA-binding domain, but</p><p>they typically contain domains that mediate a specific</p><p>protein-protein association with a transcription factor and</p><p>directly or indirectly interact with effector complexes</p><p>(either the transcriptional machinery or chromatin remodeling</p><p>complexes).</p><p>3.4.3 Transcription Factor Structure Determines DNA</p><p>Binding Specificity</p><p>The DNA binding domain of most transcription</p><p>factors is a short motif, most commonly an alpha helix but</p><p>sometimes a beta-strand or a less organized loop, that</p><p>inserts into the major groove of double-stranded DNA</p><p>(Choo and Klug 1997; Jones et al. 1999; Fairall and</p><p>Schwabe 2001). A single amino acid substitution within</p><p>the binding domain can alter binding specificity (Treisman</p><p>et al. 1989; Mathias et al. 2001). DNA binding domains</p><p>are often highly conserved evolutionarily (Duboule 1994;</p><p>FIG. 4. Context-dependence of transcriptional regulation. The</p><p>function of a transcription factor binding site is always context dependent</p><p>to some extent. 
(A) The binding site for a protein that activates</p><p>transcription, for instance, will not function under several different</p><p>conditions: (B) when the transcription factor is absent; (C) when local</p><p>chromatin is condensed, whether or not the transcription factor is present;</p><p>(D) when an adjacent binding site is occupied, masking the binding site of</p><p>interest; (E) when the transcription factor is present but in an inactive</p><p>form; or (F) when a different protein is present that has a higher affinity</p><p>for the binding site. (G) Many transcription factors interact with cofactors</p><p>to exert their influence on transcription. In such cases, additional situ-</p><p>ations contribute to context dependence and the binding site will not</p><p>function or will function differently: (H) when the cofactor is absent; (I)</p><p>when a different cofactor is present that alters binding specificity; or</p><p>(J) when a different cofactor is present that allows binding but alters</p><p>subsequent protein interactions.</p><p>1388 Wray et al.</p><p>D</p><p>ow</p><p>nloaded from</p><p>https://academ</p><p>ic.oup.com</p><p>/m</p><p>be/article/20/9/1377/976747 by guest on 07 August 2024</p><p>Dailey and Basilico 2001), although functional poly-</p><p>morphisms that lead to differences in binding kinetics are</p><p>known (e.g., Brickman et al. 2001). Sequence-specific</p><p>protein-DNA contacts rarely extend across more than 5 bp,</p><p>and for some motifs, such as Zn-fingers, they extend only</p><p>3 bp. The extent of this physical interaction is not</p><p>sufficient to provide much sequence specificity, as a given</p><p>5-bp sequence occurs on average every 1,024 bp. 
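</p><p>The 1,024-bp figure is simply 4 raised to the fifth power: with four equally likely bases, one specific 5-bp sequence is expected by chance about once per 4^5 positions. The Python sketch below works through that arithmetic for the 3-bp and 5-bp contact lengths mentioned above and for a longer 10-bp footprint. It is an illustration under simplifying assumptions (random sequence, equal and independent base frequencies, one strand only), not a model taken from the article; the function names and the 10^8-bp genome size are hypothetical choices made for the example.</p>

```python
def expected_spacing_bp(motif_length: int, alphabet_size: int = 4) -> int:
    """Mean spacing (bp) between chance matches to one specific motif.

    Assumes random sequence with equal, independent base frequencies and
    counts one strand only; real genomes depart from both assumptions.
    """
    if motif_length < 1:
        raise ValueError("motif_length must be at least 1")
    return alphabet_size ** motif_length


def expected_random_sites(motif_length: int, genome_bp: int) -> float:
    """Approximate number of chance matches in genome_bp of random sequence."""
    return genome_bp / expected_spacing_bp(motif_length)


if __name__ == "__main__":
    # Hypothetical genome size, chosen only as a round number for illustration.
    genome_bp = 10**8
    # 3 bp and 5 bp are the contact lengths cited in the text; 10 bp is an
    # assumed longer footprint included to show how quickly chance matches drop.
    for k in (3, 5, 10):
        spacing = expected_spacing_bp(k)
        sites = expected_random_sites(k, genome_bp)
        print(f"{k}-bp motif: one chance match per {spacing} bp, "
              f"about {sites:.0f} chance matches in {genome_bp} bp")
```

<p>Under these assumptions, each additional specified nucleotide reduces the expected number of chance matches fourfold.</p><p>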
Three</p><p>structural features can increase DNA binding specificity</p><p>(Latchman 1998; Fairall and Schwabe 2001): (1) multiple</p><p>DNA binding domains can exist within a single transcription</p><p>factor (e.g., most Pax family members contain both</p><p>paired-box and homeodomain DNA binding domains,</p><p>whereas all Zn-finger transcription factors contain multiple</p><p>Zn-fingers); (2) additional structural features can bind</p><p>nearby nucleotides through minor groove contacts (e.g.,</p><p>many homeodomain and GATA factors); and (3) binding</p><p>to DNA may require homodimerization or heterodimerization</p><p>(e.g., myc/mad/max, fos/jun, and most nuclear</p><p>receptor family members). All three structural features</p><p>effectively increase the number of specific nucleotides</p><p>required for efficient binding and typically involve noncontiguous</p><p>nucleotides within promoters (table 2).</p><p>3.4.4 Transcription Factors Bind to More than One Sequence, Although They Do So with Different Affinities</p><p>Transcription factors bind relatively tightly to double-stranded</p><p>DNA (Kd is typically in the range of 10⁻⁹ to</p><p>10⁻¹⁰), with a high degree of sequence specificity (Biggin</p><p>and McGinnis 1997; Carey and Smale 2000). Because of</p><p>their sequence specificity and binding kinetics, and</p><p>because many potential target sites are present in a genome,</p><p>eukaryotic transcription factors need to be present in copy</p><p>numbers of ~5–20 × 10³ per nucleus in order to bind</p><p>efficiently (Dröge and Müller-Hill 2001). Although they</p><p>associate in a sequence-specific manner, most transcription</p><p>factors bind a range of motifs rather than a single one (see</p><p>section 3.3.3). The extent of this binding site matrix differs</p><p>considerably among transcription factors (table 2). Binding</p><p>specificity may be strongly influenced by cofactors. 
For</p><p>instance, some Hox transcription factors interact with</p><p>TALE family proteins, resulting in more efficient binding</p><p>or in binding to a narrower consensus (Knoepfler and</p><p>Kamps 1995; Berthelsen et al. 1998). Post-translational</p><p>modifications, most commonly phosphorylation, can also</p><p>modulate binding specificity. Several enzymes, including</p><p>the MAP and Janus kinases, fine-tune the phosphorylation</p><p>state of transcription factors, exerting a significant influence</p><p>on overall transcription patterns (Shore and</p><p>Sharrocks 2001). Paralogous transcription factors may</p><p>interact with the same binding site (table 4), although their</p><p>binding kinetics may differ. The consensus sequence for</p><p>most transcription factors is not yet well defined, with</p><p>most consensus determinations based on sequence comparisons</p><p>rather than direct biochemical or functional</p><p>analyses. Surprisingly little information exists about</p><p>evolutionary changes in consensus sequences.</p><p>3.4.5 Transcription Factors Influence Transcription Through Protein-Protein Interactions</p><p>All proteins that regulate transcription directly or</p><p>indirectly influence the frequency with which the polymerase</p><p>II complex assembles onto the basal promoter. This</p><p>influence is exerted through a wide variety of protein-protein</p><p>interactions, the most common of which are</p><p>summarized in figure 1 and discussed below (Latchman</p><p>1998; Courey 2001; Shore and Sharrocks 2001). In</p><p>general, the protein-protein interaction domains of transcription</p><p>factors are not as well characterized as their</p><p>DNA-binding domains. (1) A transcription factor bound to</p><p>DNA can interact with components of the basal transcriptional</p><p>machinery, facilitating or inhibiting its association</p><p>with the basal promoter and resulting in an increase or</p><p>decrease in overall transcription rates. 
These interactions</p><p>are specific and take place through protein-protein</p><p>interaction domains (Triezenberg 1995; Torchia, Glass,</p><p>and Rosenfeld 1998). Some transcription factors contain</p><p>activation domains that associate directly with one of the</p><p>TAFs (TBP-associated factors, also known as general</p><p>transcription factors) to increase the frequency with which</p><p>the RNA polymerase II complex initiates transcription</p><p>(e.g., GAL4: Gill and Ptashne 1987), whereas others</p><p>contain repression domains that have the opposite effect</p><p>(e.g.,</p>