Caiping Cai, Fan Zhou, Weixi Li, Yujia Yu, Zhihan Guan, Baohong Zhang, Wangzhen Guo
a State Key Laboratory of Crop Genetics & Germplasm Enhancement and Utilization, Nanjing Agricultural University, Nanjing 210095, Jiangsu, China
b Engineering Research Center of Ministry of Education for Cotton Germplasm Enhancement and Application, Nanjing Agricultural University, Nanjing 210095, Jiangsu, China
c Department of Biology, East Carolina University, Greenville, NC 27858, USA
Keywords: Cotton; Petal color; R2R3-MYB transcription factor; LTR-RT insertion; Flavonoid/anthocyanin biosynthesis; Recessive epistasis
ABSTRACT Although a few cases of genetic epistasis in plants have been reported, combined analyses of phenotypic segregation and the underlying molecular mechanism remain rare. Here, we identified a gene (named GaPC) that controls petal coloration in Gossypium arboreum and follows a recessive epistatic genetic model. Petal coloration is controlled by a single dominant gene, GaPC. A loss-of-function mutation of GaPC produces the recessive allele Gapc, which masks the phenotype of other color genes and thus shows recessive epistasis. Map-based cloning showed that GaPC encodes an R2R3-MYB transcription factor. A 4814-bp long terminal repeat retrotransposon (LTR-RT) insertion in the second exon led to GaPC loss of function and abolished petal coloration. GaPC controlled petal coloration by regulating the anthocyanin and flavone biosynthesis pathways. Expression of core genes in the phenylpropanoid and anthocyanin pathways was higher in colored than in white petals. Petal color was conferred by flavonoids and anthocyanins, with red and yellow petals rich in anthocyanin and flavonol glycosides, respectively. This study provides new insight into the molecular mechanism of recessive epistasis and has potential breeding value: engineering GaPC could be used to develop colored petals or fibers for multifunctional utilization of cotton.
Following Mendel’s genetic laws, two or more genes may interact with each other to influence their phenotypic expression. Among such non-allelic gene interactions, epistasis refers to the masking by one gene of the phenotypes of other genes. Epistasis can be divided into dominant and recessive epistasis according to whether a dominant or a recessive gene plays the masking role [1]. Although a few cases of epistasis have been described on the basis of phenotypic investigation, simultaneous analysis of the genetic segregation of non-allelic gene epistasis and the associated molecular mechanism is rarely reported.
Flower color is an important agronomic trait, and flower pigments are widely studied. Flowers, the major reproductive organs of plants, are composed of sepals, stamens, pistils, and petals. Brightly colored petals attract insect pollinators, promoting pollination efficiency and increasing yields in cross-pollinated crops [2]. Flower petals also have ornamental or edible value. Petal colors are determined by the pigments in petal cells, mainly flavonoids, anthocyanins, and carotenoids. Flavonoids are secondary metabolites that are present in plants mostly in the form of glycosides. Depending on the chemical substituents on their skeletons, they are divided into anthocyanins, flavonols, and other classes [3-6]. The four main kinds of flavonol glycosides are derived from kaempferol, quercetin, myricetin, and isorhamnetin, and most are yellow. Six main kinds of anthocyanin glycosides (derived from pelargonidin, cyanidin, delphinidin, peonidin, petunidin, and malvidin) can present red, pink, blue, or purple colors [4,5].
The flavonoid/anthocyanin biosynthetic pathway, which underlies pigment formation, is well understood. In the general phenylpropanoid pathway, phenylalanine is converted to 4-coumaroyl-CoA by phenylalanine ammonia lyase (PAL), cinnamate 4-hydroxylase (C4H), and 4-coumarate:CoA ligase (4CL). The flavonoid/anthocyanin pathway starts from one molecule of 4-coumaroyl-CoA (p-coumaroyl-CoA) and three molecules of malonyl-CoA, which chalcone synthase (CHS) condenses into naringenin chalcone. CHS and other enzymes, including chalcone isomerase (CHI), flavanone 3-hydroxylase (F3H), flavonoid 3′-hydroxylase (F3′H), flavonoid 3′,5′-hydroxylase (F3′5′H), dihydroflavonol reductase (DFR), flavonol synthase (FLS), anthocyanidin synthase (ANS), and UDP-flavonoid glucosyl transferase (UFGT), function in the flavonoid and anthocyanin biosynthetic pathway. Flavonol and anthocyanin glycosides are transported to and stored in vacuoles with the help of glutathione S-transferase (GST) and multidrug and toxic compound extrusion (MATE) proteins, producing a wide range of colors [4-7]. CHS, CHI, F3H, F3′H and F3′5′H are early enzymes or genes in the flavonoid/anthocyanin pathway, and FLS, DFR, ANS and UFGT are late enzymes or genes. Previous studies [6,8] showed that MBW ternary complexes (containing R2R3-MYB and bHLH transcription factors and a WD40 protein) activate the core enzymes or genes of the flavonoid and anthocyanin pathway. Subgroup 6 R2R3-MYBs of Arabidopsis are central components of MBW ternary complexes. Subgroup 6 R2R3-MYBs (AtMYB75, AtMYB90, AtMYB113 and AtMYB114) and their homologs have been reported to be key factors determining the anthocyanin glycoside pigmentation patterns of flowers by regulating the anthocyanin biosynthetic pathway in crops and ornamental plants such as Medicago truncatula [8], Brassica juncea [9], and Xanthoceras sorbifolium [10]. NiorMYB113-1 and NiorMYB113-2 function cooperatively to specify formation of complex petal color patterns in Nigella orientalis [11]. Ectopic expression of Eutrema salsugineum EsMYB90 in tobacco and Arabidopsis increased pigmentation and anthocyanin accumulation in various organs [12].
Cotton (Gossypium spp.) provides natural textile fibers, edible oil and protein, and colorful flower petals with potential ornamental value. Wild diploid and tetraploid cotton species have brilliant petal colors and red petal-base spots. The three cultivated cotton species G. herbaceum, G. arboreum, and G. barbadense also have relatively rich petal colors, whereas the petal color of G. hirsutum, which accounts for 90% of global planting area, is mostly light yellow, a color unattractive to pollinators. In previous studies [2,13-15], subgroup 6 R2R3-MYB members were found to be involved in the red plant phenotype, cotton gland pigmentation, and purple spot formation at the base of cotton flower petals. However, the genetic basis of petal color segregation and the related molecular mechanism of cotton petal coloration remain largely unknown.
In this study, five G. arboreum accessions with differing petal colors were selected to clarify the genetic and molecular mechanism of petal coloration in cotton. Genetic analysis indicated that petal coloration was controlled by a key switch gene, and loss of function of this gene revealed a recessive epistatic genetic model of petal color. Map-based cloning identified the gene responsible for petal coloration. Transcriptome and metabolite analyses were then combined to investigate the expression patterns and metabolites associated with the different flower colors. The results not only contribute to understanding the regulatory mechanism of petal coloration, but also have potential breeding value for developing colored petals or fibers in upland cotton by engineering genes in the flavonoid/anthocyanin biosynthetic pathway.
Five G. arboreum accessions with differing petal colors and petal-base spot colors were used (Fig. 1A, B): red petals with red petal-base spots (R_R), red yellow petals with red petal-base spots (RY_R), yellow petals with yellow petal-base spots (Y_Y), white petals with red petal-base spots (W_R), and white petals with no petal-base spots (W_W). For genetic analysis and map-based cloning of petal coloration genes, two F2 populations (Y_Y × W_R) and (RY_R × W_W) were developed. All materials were planted at the Wanjiang Base of Nanjing Agricultural University, Maanshan, Anhui province, China (31°N, 118°E).

Fig. 1. Flower phenotypes of five G. arboreum accessions. (A) Whole flower. (B) Different parts of a flower. R_R, red petals with red petal-base spots; RY_R, red yellow petals with red petal-base spots; Y_Y, yellow petals with yellow petal-base spots; W_R, white petals with red petal-base spots; W_W, white petals with no petal-base spots.
For population 1 (Pop1; Y_Y × W_R), Y_Y was crossed with W_R and the F1 plants were self-pollinated to produce 737 segregating F2 plants. For population 2 (Pop2; RY_R × W_W), RY_R was crossed with W_W and the F1 plants were self-pollinated to produce 149 segregating F2 plants. Pop1 and Pop2 were used for genetic analysis of petal color. Pop1 was also used for BSA-seq analysis and map-based cloning of candidate genes.
Petal colors of plants in the two segregating F2 populations were recorded during the flowering stage. The χ² test for goodness of fit was used to test for Mendelian 3:1 (in Pop1) or 9:3:3:1 (in Pop2) inheritance of petal color. Fresh leaves from plants of the five accessions and both F2 populations were sampled for genomic DNA extraction. At the peak flowering period, petals without the petal base region were collected from the five G. arboreum accessions at 0 DPA (days post anthesis) around 11:00 a.m. for quantitative real-time PCR (qRT-PCR), transcriptome, and metabolite analyses. All collected flower samples were immediately frozen in liquid nitrogen and stored at -70 °C.
The two parents of Pop1 (Y_Y and W_R) and two bulks of Pop1 F2 plants with contrasting phenotypes were selected for BSA-seq analysis. The two bulks (Y and W bulks) were constructed from 52 F2 plants with yellow petals and 50 F2 plants with white petals, respectively. Equal quantities of high-quality genomic DNA from the plants in each bulk were mixed for DNA library construction. The DNA libraries were sequenced with an Illumina HiSeq2500. Library construction and sequencing were performed by Genepioneer Biotechnologies, Nanjing, Jiangsu, China (https://www.genepioneer.com/).
Clean reads from the Y and W bulks and both parents were separately mapped to the G. arboreum reference genome [16] with BWA software (Burrows-Wheeler Alignment, version 0.7.17) [17] (Table S1). SNPs and small InDels were called with GATK (version 3.7-0-gcfedb67) [18] and SAMtools [19] after local realignment and base recalibration. SNP-index values were calculated in 1-Mb windows with a 100-kb sliding step. Δ(SNP-index) was calculated as the difference between the SNP-index of the W bulk and that of the Y bulk. SNP density was estimated in 20-kb windows, and regions exceeding the 99% confidence interval were considered candidate regions for the target trait [20]. Δ(InDel-index) was calculated in the same way as Δ(SNP-index).
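The windowed Δ(SNP-index) calculation described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the input layout, and the toy values are hypothetical, and the subtraction direction (W bulk minus Y bulk) is an assumption, since the text states only that the two indices are compared. Only the 1-Mb window and 100-kb step come from the text.

```python
def snp_index(alt_reads, total_reads):
    """Fraction of reads at a site that carry the non-reference allele."""
    return alt_reads / total_reads


def delta_snp_index_windows(sites, chrom_len, window=1_000_000, step=100_000):
    """Average delta(SNP-index) in sliding windows along one chromosome.

    sites: list of (position, index_w_bulk, index_y_bulk) tuples.
    Assumption: delta = W bulk index - Y bulk index.
    Returns (window_start, window_end, mean_delta) for windows containing sites.
    """
    results = []
    start = 0
    while start < chrom_len:
        end = min(start + window, chrom_len)
        deltas = [w - y for pos, w, y in sites if start <= pos < end]
        if deltas:  # windows without any called site are skipped
            results.append((start, end, sum(deltas) / len(deltas)))
        start += step
    return results


# Toy example: three sites on a hypothetical 1.2-Mb chromosome.
toy_sites = [(50_000, 1.0, 0.3), (150_000, 0.9, 0.3), (1_050_000, 0.5, 0.5)]
toy_windows = delta_snp_index_windows(toy_sites, chrom_len=1_200_000)
```

Windows whose mean Δ(SNP-index) exceeds a chosen confidence threshold would then be merged into candidate intervals; that thresholding step is not shown here.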
To narrow the interval containing the target gene, InDel loci with InDel length >3 bp and ED value >0.5 in the BSA-seq candidate interval were selected for primer development. InDel primers were designed with Primer 5.0 software using the 300-bp genome sequences upstream and downstream of each InDel locus. Predicted amplified fragment lengths were between 200 and 300 bp and annealing temperatures were about 58 °C. All InDel primer information is listed in Table S2. InDel primers were used for polymorphism detection between the parents, and polymorphic InDel markers were further used to genotype members of the F2 populations. Genomic DNA was isolated by the CTAB method [21].
Total RNAs were extracted from frozen petals using the RNAprep Pure Plant Plus Kit (DP441) (Tiangen Biotech Co., Ltd., Beijing, China) and reverse-transcribed into cDNAs using the HiScript III RT SuperMix for qPCR (+gDNA wiper) (R323) (Vazyme Biotech Co., Ltd., Nanjing, Jiangsu, China) according to the manufacturer’s protocol. Gene-specific qRT-PCR primers were designed with Beacon Designer 7.0 (Premier Biosoft International, San Francisco, CA, USA). Primer information is shown in Table S2. qRT-PCR was performed on an ABI 7500 instrument (Applied Biosystems) using qPCR SYBR Master Mix (Yeasen Biotechnology Co., Ltd., Shanghai, China) according to the manufacturer’s instructions, with three biological replicates and three technical replicates. GhHIS3 (AF024716) was used as a reference gene. Relative expression levels were calculated using the 2^(−ΔΔCT) method [22].
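For readers unfamiliar with the 2^(−ΔΔCT) calculation, a minimal sketch is given below. The function name and the CT values are hypothetical; the reference gene plays the role that GhHIS3 plays in this study, and technical replicates are simply averaged before the calculation, which is an assumption about the workflow rather than a description of it.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCT) method.

    Each argument is a list of CT values from technical replicates.
    dCT  = mean CT(target) - mean CT(reference), per condition
    ddCT = dCT(sample) - dCT(control)
    """
    def mean(values):
        return sum(values) / len(values)

    d_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    dd_ct = d_ct_sample - d_ct_control
    return 2 ** (-dd_ct)


# Hypothetical CT values: the target amplifies two cycles later in the sample
# than in the control, relative to the reference gene.
fold = relative_expression(
    ct_target_sample=[24.0, 24.0, 24.0], ct_ref_sample=[18.0, 18.0, 18.0],
    ct_target_control=[22.0, 22.0, 22.0], ct_ref_control=[18.0, 18.0, 18.0],
)
```

With these toy values ΔΔCT is 2, so the target is expressed at one quarter of the control level.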
The VIGS assay was performed as described previously [23-24]. A 298-bp fragment corresponding to positions 417-714 of the GaPC open reading frame (ORF) was amplified to construct a pCLCrV:GaPC vector. A recombinant primer pair was designed using CE Design V1.03 (Table S2). The recombinant pCLCrV:GaPC was transformed into A. tumefaciens strain LBA4404. Agrobacterium cultures containing pCLCrVB and pCLCrV:GaPC were mixed in a 1:1 ratio. The mixture was injected into the abaxial side of cotton seedling cotyledons with a needleless syringe. The injected plants were held at 23/20 °C (day/night) in a growth chamber with a 16 h light/8 h dark cycle until flowering. Plants injected with empty vector (pCLCrVA and pCLCrVB) and with pCLCrV:Su and pCLCrVB were used as mock treatments and positive controls, respectively. Su (Sulfur) encodes a component of the magnesium chelatase complex, and silencing of Su in cotton causes yellow leaves [25]. RNA was extracted from GaPC-silenced and mock-treated petals without the petal base region to measure GaPC expression.
The coding sequence (CDS) of GaPC without a stop codon was amplified and inserted into the pBinGFP4 vector to produce a pBin-35S:GaPC-eGFP4 construct. GaPC-eGFP4 fusion proteins were transiently expressed in Nicotiana benthamiana epidermal cells by the Agrobacterium infiltration method [26]. The nuclear-localized protein H2B-mCherry was used as a control and visualized as red signal [27]. Fluorescent signal was observed under a confocal microscope (LSM 780, Zeiss, Germany).
GaFLS (Ga05G2477) and GaUFGT (Ga02G0536) promoter fragments were amplified with primers (Table S2) and ligated into the reporter vector pGreen II 0800-LUC. pBI121-35S:GaPC and empty pBI121 vector were used as effector and negative control, respectively. The recombinant vectors were transformed separately into A. tumefaciens strain GV3101. The A. tumefaciens-mediated transformation of tobacco leaf cells, incubation of infiltrated tobacco, and observation methods were as previously described [28].
At the peak flowering period, petals without the petal base region were collected from the five G. arboreum accessions at 0 DPA around 11:00 a.m. Each accession was represented by three biological replicates. Total RNAs were isolated and cDNA libraries were constructed. The 15 cDNA libraries were sequenced on the Illumina HiSeq 2000 platform. Raw data were transformed, filtered, and mapped to the G. arboreum reference genome [16] using Hisat2 [29] (Table S3). Expression levels were calculated as transcripts per kilobase of exon model per million mapped reads (TPM). The count files were used for differential expression analysis with the R package DESeq2 [30], with a fold-change threshold of 1 and a P-value threshold of 0.05. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed with the R package ClusterProfiler [31]. The raw data can be found in the NCBI Sequence Read Archive database under accession number PRJNA904836.
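As an illustration of the expression measure used here, the sketch below computes TPM from read counts and gene lengths and applies the log2(TPM + 1) transform used later for the heat map in Fig. 7. It assumes the standard TPM definition (length-normalized counts rescaled to sum to one million); the gene names, counts, and lengths are hypothetical.

```python
from math import log2


def tpm(counts, lengths_bp):
    """Transcripts per million from raw read counts and gene lengths.

    counts: dict gene -> mapped read count
    lengths_bp: dict gene -> exon-model length in base pairs
    """
    # Step 1: reads per kilobase corrects for gene length.
    rpk = {g: counts[g] / (lengths_bp[g] / 1000) for g in counts}
    # Step 2: rescale so that all genes in the sample sum to one million.
    scale = sum(rpk.values()) / 1_000_000
    return {g: value / scale for g, value in rpk.items()}


# Hypothetical two-gene library: gene_b has twice the reads but twice the length.
toy_tpm = tpm({"gene_a": 100, "gene_b": 200}, {"gene_a": 1000, "gene_b": 2000})
toy_log = {g: log2(v + 1) for g, v in toy_tpm.items()}
```

Because TPM values sum to the same total in every library, they are convenient for comparing a gene's relative abundance across the 15 samples, whereas the differential expression test itself works on the raw counts.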
Gene co-expression networks were constructed by weighted gene co-expression network analysis (WGCNA) [32]. TPM values of differentially expressed genes (DEGs) were used as the input data, with a soft threshold (β = 9) chosen to give a scale-free topology fit index R² > 0.8. The weighted adjacency matrix was converted to a topological overlap matrix, and genes were hierarchically clustered based on topological overlap similarity. Co-expression networks were visualized with Cytoscape v3.9.1 (https://cytoscape.org/) [33].
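The conversion from correlations to a weighted adjacency matrix and then to a topological overlap matrix can be sketched in plain Python. This is an illustration of the standard unsigned-network formulas, not the authors' analysis, which used the WGCNA package; the function names and toy data are hypothetical, and only β = 9 is taken from the text. Genes with constant expression would need to be filtered first, since their correlation is undefined.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def soft_threshold_adjacency(expr, beta=9):
    """Unsigned adjacency a_ij = |cor(i, j)|^beta, with a zero diagonal."""
    genes = list(expr)
    n = len(genes)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a = abs(pearson(expr[genes[i]], expr[genes[j]])) ** beta
            adj[i][j] = adj[j][i] = a
    return genes, adj


def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n = len(adj)
    k = [sum(row) for row in adj]  # connectivity; the diagonal is zero
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            t = (shared + adj[i][j]) / (min(k[i], k[j]) + 1 - adj[i][j])
            tom[i][j] = tom[j][i] = t
    return tom


# Hypothetical adjacency for three genes.
toy_tom = topological_overlap([[0.0, 1.0, 0.5], [1.0, 0.0, 0.5], [0.5, 0.5, 0.0]])
```

Hierarchical clustering is then run on the dissimilarity 1 − TOM to define modules; that step is omitted here.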
Petals of four G. arboreum accessions (R_R, RY_R, Y_Y, and W_W) without the petal base region were collected to measure flavonoids/anthocyanins and carotenoids by high-performance liquid chromatography (HPLC). Sample collection was the same as for transcriptome analysis. Each accession was represented by three biological replicates. Based on orthogonal partial least squares discriminant analysis (OPLS-DA) models, variable importance in projection (VIP) was used to identify differentially expressed metabolites (DEMs), with a fold change ≥2 or ≤0.5 and P value <1. By combining the KEGG enrichment analyses of DEGs and DEMs, we identified pathways jointly enriched in transcripts and metabolites. Sample extraction, identification, and data analysis were performed by MetWare Biotechnology Co., Ltd. (Wuhan, Hubei, China, https://www.metware.cn). Flavonoid and anthocyanin pathways were drawn using KEGG information (https://www.kegg.jp/).
To investigate the inheritance of petal color phenotypes in cotton, we constructed two F2 populations, (Y_Y × W_R) and (RY_R × W_W). In Pop1 (Y_Y × W_R), plants with yellow petals and white petals numbered 564 and 173, fitting a 3:1 segregation ratio (χ² = 0.84 < χ²(0.05,1) = 3.84) and suggesting that the petal coloration phenotype was controlled by a single dominant gene, which was named GaPC (G. arboreum petal coloration). Interestingly, there were three petal color phenotypes in Pop2 (RY_R × W_W), comprising 92 plants with red yellow petals, 22 with yellow petals, and 35 with white petals, fitting a 9:3:4 ratio (χ² = 2.20 < χ²(0.05,2) = 5.99). These results showed that non-allelic gene interactions determine petal color. When a key switch gene undergoes loss of function, the recessive allele masks the phenotype of the other color genes, indicating a recessive epistasis genetic model for petal color control.
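The two goodness-of-fit statistics reported above can be reproduced with a short calculation. The sketch below is an illustration only: the function name is hypothetical, and the use of Yates' continuity correction for the one-degree-of-freedom test is an assumption, adopted because it reproduces the reported value of 0.84 (the uncorrected statistic would be about 0.92). The counts, ratios, and critical values are those given in the text.

```python
def chi_square_gof(observed, ratio, yates=False):
    """Chi-square goodness-of-fit statistic against an expected ratio.

    observed: list of observed class counts
    ratio: expected ratio, e.g. [3, 1] or [9, 3, 4]
    yates: subtract 0.5 from each |O - E| (continuity correction, used for df = 1)
    """
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    stat = 0.0
    for o, e in zip(observed, expected):
        diff = abs(o - e)
        if yates:
            diff = max(diff - 0.5, 0.0)
        stat += diff ** 2 / e
    return stat


# Pop1: 564 yellow vs 173 white, tested against 3:1 (df = 1, critical value 3.84).
chi2_pop1 = chi_square_gof([564, 173], [3, 1], yates=True)

# Pop2: 92 red yellow, 22 yellow, 35 white, tested against 9:3:4
# (df = 2, critical value 5.99).
chi2_pop2 = chi_square_gof([92, 22, 35], [9, 3, 4])
```

Both statistics fall below their critical values, so neither the 3:1 ratio in Pop1 nor the 9:3:4 ratio in Pop2 is rejected at the 0.05 level.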
To fine-map the GaPC gene, we first performed BSA-seq analysis using Pop1. Based on the Δ(SNP/InDel-index) method, a region of 14.71-42.48 Mb on chromosome 11 was identified as the GaPC candidate region (Fig. S1; Table S4). Eleven InDel markers polymorphic between the parents were developed to screen the 173 white-petal plants in the Pop1 F2 population. The region was narrowed to a 6.57-Mb interval between markers I_20.03 and I_26.60 containing 168 genes (Fig. 2A; Table S5). Transcriptome analysis of petals of the two parents Y_Y and W_R showed that 10 of these genes were differentially expressed, and the transcript abundance of Ga11G1228 was significantly lower in W_R than in Y_Y. This gene was accordingly assigned as the candidate gene for GaPC (Fig. 2B; Table S6).
To confirm that Ga11G1228 was the causal gene of GaPC, we isolated Ga11G1228 genomic and cDNA sequences from the two parents (Y_Y and W_R). Comparison of Ga11G1228 between Y_Y and W_R revealed a 4814-bp long terminal repeat retrotransposon (LTR-RT) insertion in the second exon in white-petal W_R, leading to the absence of Ga11G1228 expression and loss of function (Fig. 2C; Table S7). The cloned ORF of Ga11G1228 in Y_Y was 714 bp long and encoded 237 amino acids (Table S7). qRT-PCR analysis confirmed that Ga11G1228 was highly expressed in yellow petals and almost not expressed in white petals (Fig. 2D). Subcellular localization showed that Ga11G1228 was located in the nucleus (Fig. S2).
Ga11G1228 encodes a putative R2R3-MYB transcription factor orthologous to AtMYB113 (AT1G66370). In Arabidopsis, AtMYB75, AtMYB90, AtMYB113 and AtMYB114 belong to subgroup 6 R2R3-MYB and control anthocyanin biosynthesis [7]. There were five orthologs of AtMYB113 in the G. arboreum genome, of which three tandemly duplicated genes were located on chromosome 7 and the other two on chromosomes 11 and 13 (Fig. 3A).

Fig. 3. Sequence identification and expression of GaPC in five G. arboreum accessions. (A) Phylogenetic tree of four subgroup 6 R2R3-MYB members in A. thaliana and five AtMYB113 orthologs in G. arboreum. (B) Agarose gel electrophoresis of GaPC full-length genomic amplification in five G. arboreum accessions. (C) Gene structure of GaPC in five G. arboreum accessions. W_R and W_W had a 4814-bp LTR-RT insertion in the second exon of GaPC. (D) Expression of GaPC in petals of five G. arboreum accessions. Different letters indicate significant differences (P <0.01). (E) PAGE of the HDW1228 marker in five G. arboreum accessions. The F-primer of HDW1228 is in the second exon region and the R-primer is at the front of the LTR-RT. Y_Y, R_R and RY_R lacked the LTR-RT insertion and yielded no amplification product.
Ga11G1228 genomic sequences were further isolated from the other three G. arboreum accessions. Ga11G1228 carried the LTR-RT insertion in the other white-petal accession, W_W, but not in the R_R or RY_R accessions (Figs. 3B, C, S3). The CDSs of Ga11G1228 in the three colored-petal accessions were identical (Fig. S4; Table S7). Expression of Ga11G1228 in petals was highest in R_R and RY_R, lower in Y_Y, and almost undetectable in the two white-petal accessions (W_R and W_W) (Fig. 3D).
To further clarify whether the LTR-RT insertion in Ga11G1228 led to the white petal color, a pair of specific primers (HDW1228), with the F-primer in the second exon region and the R-primer at the front of the LTR-RT, was designed (Table S2). The 266-bp product was amplified in the W_R and W_W parents (Fig. 3E) and in all 208 white-petal F2 plants (173 in Pop1 and 35 in Pop2; Fig. S5), showing complete co-segregation of the insertion with the white-petal phenotype.
CLCrV-VIGS technology [23] was used for rapid functional identification of GaPC. CLCrV is derived from the bipartite geminivirus Cotton leaf crumple virus, and gene silencing induced by CLCrV lasts for the whole life cycle of plants [34,35]. However, CLCrV does not work in G. arboreum accessions. With Su, which encodes a component of the magnesium chelatase complex, as positive control, we injected a 1:1 mixture of pCLCrV:Su and pCLCrVB into diploid G. arboreum Y_Y and tetraploid G. hirsutum TM-1 at the same time. Three weeks after agroinfiltration, silencing of Su in TM-1 plants resulted in yellow mottling in newly emerged leaves, but no change was observed in Y_Y (Fig. S6A). Because GaPC orthologs in tetraploid cotton share high sequence identity with GaPC, two tetraploid cotton accessions, G. barbadense Hai 7124 with yellow petals and G. hirsutum R1 with red petals, were selected for VIGS to investigate the functional role of GaPC orthologs in petal coloration.
The CDS and protein similarities of GaPC with its orthologous genes in G. barbadense and G. hirsutum exceeded 98% and 97%, respectively (Fig. S6B, C). We selected a 298-bp specific fragment at the 3′ end of the gene from Y_Y and inserted it into pCLCrVA to construct the pCLCrV:GaPC vector. The constructed vectors were infiltrated individually into G. barbadense Hai 7124 and G. hirsutum R1 seedlings to transiently silence the endogenous genes. Three months later, pCLCrV:Su-silenced plants showed yellow mottled leaves. GaPC-silenced petals displayed albino areas in G. barbadense Hai 7124 and G. hirsutum R1 (Fig. 4A, B). qRT-PCR analysis showed that GaPC orthologs in the albino petal areas were significantly down-regulated compared with the control (Fig. 4C, D).

Fig. 4. Phenotype of GaPC-ortholog-silenced plants in G. barbadense Hai 7124 (left, yellow petals) and G. hirsutum R1 (right, red petals). (A, B) The petals of GbPC- or GhPC-silenced plants in G. barbadense Hai 7124 and G. hirsutum R1, respectively. Albinism in yellow and red petals is shown. Yellow mottled leaves are visible at the flower and boll stage in Su-silenced plants. (C, D) Expression of the target gene in GbPC- or GhPC-silenced petals. In GbPC- or GhPC-silenced plants, the albino regions of the petals were sampled to extract RNA for qRT-PCR detection, with petals of mock-treated plants as control. GhHIS3 (AF024716) was used as a reference gene. Error bars represent standard deviation. *, P <0.05; **, P <0.01 (Student’s t-test).
Taken together, these results indicate that GaPC and its orthologs in tetraploid cotton are necessary for cotton petal coloration.
To investigate the regulation of petal color, transcriptome sequencing, differential gene expression analysis, and pathway enrichment analysis were performed using four accessions with different petal colors: R_R, RY_R, Y_Y and W_W (Fig. S7). DEGs were identified for each of the six pairwise combinations of the four. The greatest number of DEGs, 17,363 (5459 upregulated and 11,904 downregulated), was found in group R_R/Y_Y. The DEGs in groups R_R/W_W, RY_R/Y_Y, Y_Y/W_W, RY_R/W_W, and R_R/RY_R numbered 13,205, 12,931, 12,126, 5211 and 4503, respectively (Fig. 5A).

Fig. 5. Transcriptome and co-expression network analysis. (A) Numbers of DEGs from comparisons of four petal colors in the R_R, RY_R, Y_Y, and W_W accessions. (B) Correlation heat map of four modules. The turquoise and yellow modules are correlated in that genes in both were highly expressed in Y_Y. The GaPC gene is in the brown module. (C) Bar plot of KEGG pathway enrichment for the 3141 genes co-expressed with GaPC. (D) Co-expression network of GaPC with the 500 most heavily weighted genes in the brown module. The WGCNA method was used to construct the co-expression network. (E) GaPC activates the promoters of GaFLS and GaUFGT. The LUC signals from co-transformation of effector and reporter were much stronger than those from transformation of the reporter construct alone. The LUC activities were normalized to REN, and values relative to the empty vector control are shown. Error bars represent standard deviation of three biological replicates. Significance was assessed using Student’s t-tests (**, P <0.01).
After removal of duplicate genes from the six groups, a total of 22,489 genes were subjected to WGCNA. Four modules were identified, with the number of genes in each module ranging from more than 1000 to more than 10,000 (Fig. S8). The brown module contained 3142 genes, which were highly expressed in red petals (R_R) and weakly expressed in yellow (Y_Y) and white (W_W) petals. The blue module contained 5897 genes, which were highly expressed in white petals (W_W) and weakly expressed in red petals (R_R). The turquoise module contained 11,676 genes, which were highly expressed in yellow petals (Y_Y) and weakly expressed in R_R and RY_R. The yellow module contained 1465 genes, which were also highly expressed in yellow petals (Y_Y) but weakly expressed in white petals (W_W). Genes in both the turquoise and yellow modules were highly expressed in Y_Y, consistent with the high correlation between the two modules (Figs. 5B, S9A). KEGG pathway enrichment showed that metabolic pathways, biosynthesis of secondary metabolites, flavonoid biosynthesis, phenylpropanoid biosynthesis, and glutathione metabolism were enriched in the brown, turquoise, and yellow modules (Fig. S9B).
The GaPC gene was clustered in the brown module and co-expressed with the other 3141 genes in that module. KEGG enrichment analysis showed that those genes were involved in flavonoid biosynthesis, phenylpropanoid biosynthesis, metabolic pathways, and biosynthesis of secondary metabolites (Fig. 5C). With GaPC as the hub gene, a co-expression network of the 500 most heavily weighted genes was constructed (Fig. 5D), suggesting a central role of GaPC in the anthocyanin and flavone biosynthesis pathways.
It has been reported [2,7] that MYB113 controls anthocyanin biosynthesis by activating the expression of genes encoding FLS (flavonol synthase), DFR (dihydroflavonol reductase), ANS (anthocyanidin synthase) and UFGT (UDP-flavonoid glucosyl transferase). To investigate the function of GaPC in anthocyanin and flavone biosynthesis, we examined the interaction of GaPC with the promoters of GaFLS (Ga05G2477) and GaUFGT (Ga02G0536). In comparison with the control, LUC activity was significantly higher when GaPC and pGaUFGT or pGaFLS were co-transformed into tobacco leaves, indicating that GaPC acts on the promoters of both GaUFGT and GaFLS to activate their expression (Fig. 5E).
To identify the pigment composition of the four differently colored petals, flavonoids/anthocyanins and carotenoids were measured in R_R, RY_R, Y_Y and W_W. There was no difference in the composition or content of carotenoids in the petals of the four accessions, indicating that the petal colors depended only on flavonoids and anthocyanins (Fig. S10). Of 108 tested compounds, 52 were detected in cotton petals. They were divided into eight categories: flavonoid, cyanidin, delphinidin, procyanidin, pelargonidin, peonidin, petunidin, and malvidin (Fig. 6A; Table S8). The total flavonoid/anthocyanin content of R_R was more than 3100 μg g⁻¹ of fresh weight and represented 47 compounds in the eight categories, followed by RY_R (2636 μg g⁻¹) with 46 compounds in eight categories and Y_Y (1104 μg g⁻¹) with 31 compounds in seven categories. The pigment content of W_W was lowest (330 μg g⁻¹), only about 1/9 that of R_R, and included 26 compounds in six categories. Fig. 6B shows the names and contents of the 14 most abundant compounds in the four petal-color accessions. The 14 compounds correspond to 4, 6, 1, 1 and 2 compounds in the flavonoid, cyanidin, delphinidin, pelargonidin and procyanidin categories, respectively. The most abundant pigments in petals of R_R and RY_R were flavonoid, cyanidin and delphinidin, whereas the most abundant pigment in petals of Y_Y and W_W was flavonoid. These results indicated that the yellow petal color was produced by flavonoids, the red yellow color was generated by superimposition of cyanidin and delphinidin, and the color of red petals was deepened by increased content of the latter two categories. White petals contained only a small amount of flavonoid.

Fig.6.Flavonoid and anthocyanin metabolites in petals of four petal-color accessions.(A)Total content of flavonoids/anthocyanins.(B)The 14 most abundant compounds of flavonol and anthocyanin glycosides in petals.(C)DEMs identified by comparison of petal colors of R_R,RY_R,Y_Y,and W_W accessions.DEMs from six pairwise combination groups are identified.
DEMs were further identified in petals of four accessions compared in six pairwise groups.There were more upregulated than downregulated components in all six groups,with 17-39 upregulated metabolites and 1-9 downregulated metabolites (Fig.6C).KEGG annotation and enrichment analysis showed that the same six pathways were enriched in four groups,R_R/W_W,RY_R/Y_Y,Y_Y/W_W,and R_R/Y_Y: flavonoid biosynthesis (ko00941),anthocyanin biosynthesis (ko00942),metabolic pathways (ko01100),biosynthesis of secondary metabolites (ko01110),flavone and flavonol biosynthesis (ko00944),and isoflavonoid biosynthesis(ko00943).The RY_R/Y_Y group was not enriched in ko00944 and the R_R/RY_R group was not enriched in ko00943 or ko00944 (Fig.S11;Table S9).
Most of the DEG- and DEM-enriched pathways were the same. The core genes in the anthocyanin pathway were enriched in flavonoid biosynthesis (ko00941), metabolic pathways (ko01100), and biosynthesis of secondary metabolites (ko01110) (Table S9). To identify the mechanism of formation of cotton petal colors, we analyzed the expression patterns of core genes in the anthocyanin biosynthesis pathway and the content of the main metabolites in four G. arboreum accessions (Fig. 7). From the key enzymes or genes (PAL, C4H, 4CL) of the phenylpropanoid pathway, to the early enzymes or genes of the flavonoid and anthocyanin pathway (CHS, CHI, F3H, F3′H, F3′5′H), and then to the late enzymes or genes (FLS, DFR, ANS, UFGT), expression was generally higher in red and red yellow petals than in yellow and white petals. Yellow and white petals contained flavonol glycosides, including kaempferol (C21833: kaempferol-3-O-rutinoside) and quercetin (C05623: quercetin-3-O-glucoside and C05625: rutin) glycosides, but no myricetin. Red and red yellow petals contained both flavonol and anthocyanin glycosides. The anthocyanin glycosides were mainly those of cyanidin (C08604: cyanidin-3-O-glucoside, C08647: cyanidin-3-O-galactoside) and delphinidin (C12138: delphinidin-3-O-glucoside), with a little pelargonidin (C12137: pelargonidin-3-O-glucoside). The compositions of flavonol glycosides were the same in red and red yellow petals as in yellow and white petals. These results indicated that different compositions of anthocyanins and flavonols produced the different petal colors.

Fig. 7. Flavonoid metabolism pathway in cotton petals. In the heat map, core gene expression (rectangle) and main metabolite content (ellipse) are shown. The four cotton accessions are R_R, RY_R, Y_Y and W_W. The values in ellipses and rectangles have been normalized by log2(TPM+1) for each gene and log2(content+1) for each metabolite. Enzymes and genes: PAL, phenylalanine ammonia lyase; C4H, cinnamate 4-hydroxylase; 4CL, 4-coumarate:CoA ligase; CHS, chalcone synthase; CHI, chalcone isomerase; F3H, flavanone 3-hydroxylase; F3′H, flavonoid 3′-hydroxylase; F3′5′H, flavonoid 3′,5′-hydroxylase; DFR, dihydroflavonol reductase; FLS, flavonol synthase; ANS, anthocyanidin synthase; UFGT, UDP-flavonoid glucosyl transferase. Metabolites: C21833, kaempferol-3-O-rutinoside; C05623, quercetin-3-O-glucoside; C05625, rutin; C12137, pelargonidin-3-O-glucoside; C08604, cyanidin-3-O-glucoside; C08647, cyanidin-3-O-galactoside; four other glycosides of cyanidin (cyanidin-3-O-arabinoside; cyanidin-3-O-xyloside; C12095, cyanidin-3-O-(6-O-p-coumaroyl)-glucoside; C08639, cyanidin-3,5-O-diglucoside); C12138, delphinidin-3-O-glucoside.
In summary, we identified a switch gene, GaPC, that controls petal coloration in cotton. When GaPC was normally expressed, it regulated petal color by activating key enzymes or genes in the anthocyanin and flavone pathways.
Mendel’s laws of segregation and independent assortment are the basis of classical genetics and are of great significance in genetic research and breeding. When two different genes affect the same character, a phenotypic ratio different from 9:3:3:1 can be observed. When two or more genes affect the same character, these genes may have complex hierarchical relationships. When one gene masks the effects of one or more other genes, the interaction is called epistasis. When a recessive gene plays the epistatic role, it is called recessive epistasis. Several cases of recessive epistasis among non-allelic genes are listed in textbooks, such as the color of the protein layer in maize endosperm and coat color pigmentation in mice (https://cnx.org/contents/cUeevuaC@2/Laws-of-Inheritance). However, the combined analysis of non-allelic gene epistasis and the responsible molecular mechanisms is seldom reported. Here we have described a case of recessive epistasis in cotton petal coloration. In an F2 segregating population generated by crossing accessions with red yellow petals (RY_R) and white petals (W_W), three petal colors were observed, with a phenotypic segregation of 9 red yellow : 3 yellow : 4 white petals. Genetic analysis revealed non-allelic gene interactions controlling petal coloration.
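The 9:3:4 ratio expected under recessive epistasis can be reproduced by enumerating the F2 genotypes of a two-locus cross. The sketch below is illustrative only: the allele symbols `P`/`p` stand for GaPC/Gapc, while the second locus `R`/`r` is a hypothetical placeholder for an unnamed color gene and is not a locus identified in this study.

```python
from collections import Counter
from itertools import product


def phenotype(pc_alleles, color_alleles):
    """Map a two-locus genotype to a petal color under recessive epistasis.

    pc_alleles: two alleles at the epistatic locus ("P" = GaPC, "p" = Gapc).
    color_alleles: two alleles at a hypothetical second color locus ("R"/"r").
    """
    if "P" not in pc_alleles:
        # Homozygous Gapc masks the second locus entirely.
        return "white"
    if "R" in color_alleles:
        return "red yellow"
    return "yellow"


def f2_ratio():
    """Count F2 phenotypes from selfing a doubly heterozygous F1 (PpRr)."""
    gametes = list(product("Pp", "Rr"))  # four equally likely gamete types
    counts = Counter()
    for (p1, r1), (p2, r2) in product(gametes, repeat=2):
        counts[phenotype(p1 + p2, r1 + r2)] += 1
    return counts


counts = f2_ratio()
print(counts)
```

Because the 16 gamete combinations are equally likely under independent assortment, the counts give the expected phenotypic ratio directly.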
By map-based cloning and transcriptome analysis, we cloned the causal gene GaPC controlling petal coloration, which encodes a putative R2R3-MYB transcription factor. To our knowledge, this is the first report of a petal-coloration switch gene in cotton. Sequence comparison of GaPC in five G. arboreum accessions with differing petal colors indicated that a 4814-bp LTR-RT insertion caused loss of function of GaPC in the W_R and W_W accessions. When GaPC lost its function and could no longer activate color gene expression in the anthocyanin and flavone biosynthesis pathways, the mutated gene acted as the recessive gene Gapc in Mendelian genetic analysis and followed a recessive epistatic genetic model for petal color. Several reports have indicated that LTR-RT insertions change gene expression and cause loss of gene function [36,37]. A glabrous mutation was caused by insertion of a 5188-bp LTR-RT in the fourth exon of the HD-ZIP gene CsGL in cucumber [38], and a 5-kb LTR-RT insertion in an HD-ZIP gene (At_HD1) led to glabrous cotton stems [39].
Subgroup 6 R2R3-MYB genes controlling petal color or pattern formation have been investigated in model and ornamental plants [40-44]. However, there have been few reports in field crops. In this study, we confirmed that GaPC is a regulator of petal anthocyanin metabolism and pigment accumulation in cotton. Transcriptome profiling and metabolic pathway analysis revealed that petal colors in cotton were derived from flavonoids and anthocyanins and not from carotenoids. WGCNA revealed that GaPC was a hub gene in the brown module and was co-expressed with all 3141 genes of this module. Genes of the brown module were highly expressed in red petals (R_R) and weakly expressed in yellow (Y_Y) and white (W_W) petals. This finding was consistent with the metabolomics results: the composition and content of metabolites in Y_Y and W_W were more similar to each other than those in R_R and Y_Y or in R_R and W_W. Among the DEMs and co-expressed DEGs identified by comparing petal colors, phenylpropanoid biosynthesis, flavonoid biosynthesis, anthocyanin biosynthesis, and flavone and flavonol biosynthesis were enriched. The expression levels of core genes in the phenylpropanoid, flavonoid, and anthocyanin biosynthetic pathways, including PAL, C4H, 4CL, CHS, CHI, F3H, F3′H, F3′5′H, FLS, DFR, ANS, and UFGT, were generally higher in red and red yellow petals than in yellow and white petals. The dual-luciferase assay further confirmed that GaPC bound the promoters of both GaFLS and GaUFGT to activate their expression.
Flower colors are produced by pigments. The content of flavonoids and anthocyanins in petals determines petal coloration. In many ornamental plants, large amounts of flavonol and anthocyanin glycosides have been identified in investigations of the pigment composition of differing petal colors [45-48]. In this study, we investigated the flavonol and anthocyanin composition of cotton petals, identifying 52 compounds, with most found in R_R and RY_R, fewer in Y_Y, and fewest in W_W. There were many flavonoid substances in all four accessions, including quercetin and some kaempferol, while anthocyanins were found mainly in R_R and RY_R. Red and red yellow petals contained the same kinds of flavonoid glycosides and also contained three kinds of anthocyanin glycosides: cyanidin, delphinidin, and a little pelargonidin. Zhang et al. [49] measured anthocyanin contents in flowers with and without red spots in two G. arboreum materials, identifying 17 compounds, of which cyanidin and delphinidin accounted for more than 90% of total anthocyanins in the petal red spot. Taken together, red color was produced mainly by anthocyanin glycosides and yellow color mainly by flavonol glycosides in cotton.
In this study, we identified an R2R3-MYB gene, GaPC, controlling cotton petal coloration. GaPC was orthologous to the Arabidopsis gene AtMYB113 (AT1G66370). In A. thaliana, AtMYB75, AtMYB90, AtMYB113, and AtMYB114 belong to subgroup 6 of the R2R3-MYB family and regulate late biosynthetic genes in the anthocyanin biosynthesis pathway [7]. In the G. arboreum genome, there are five orthologs of AtMYB113: three tandemly duplicated genes (Ga07G0898, Ga07G0899, and Ga07G0903) are located on chromosome 7, and Ga11G1228 and Ga13G1293 are located on chromosomes 11 and 13, respectively (Table S10). No AtMYB75, AtMYB90, or AtMYB114 orthologs were detected in cotton. In previous studies [2,13-15], functional roles of Ga07G0899 and Ga07G0903 orthologs in cotton have been reported. Using a (T586 × Yumian 1) RIL population, a red plant gene, GhPAP1D, was cloned, which was the ortholog of Ga07G0899 in tetraploid cotton. A 228-bp tandem repeat in the promoter region of GhPAP1D was responsible for the red plant phenotype in T586. Overexpression of GhPAP1D led to increased anthocyanin accumulation in transgenic tobacco and cotton [13]. Using a G. hirsutum cv. Emian22 × G. barbadense acc. 3-79 F2 mapping population, Wang et al. [14] also cloned the gene for the red leaf trait and verified that a repeat sequence in the promoter region increased the promoter activity of the gene and its expression. Overexpressing the gene in fiber using the fiber elongation-specific promoter PGbEXPA2 produced a brown fiber phenotype in transgenic lines. Gao et al. [15] reported that Cotton Gland Pigmentation 1 (CGP1), the Ga07G0903 ortholog in tetraploid cotton, regulated gland pigmentation but not morphogenesis. Knockdown and knockout of CGP1 generated a glandless-like phenotype. Abid et al. [2] cloned a Beauty Mark gene, GbBM, also an ortholog of Ga07G0903, that controlled purple spot formation at the base of flower petals in G. barbadense. In the present study, we located a new gene, GaPC, controlling petal coloration. The gene was Ga11G1228, an ortholog of AtMYB113 in subgroup 6 of the R2R3-MYB family in Arabidopsis. Plant-specific R2R3-MYB genes play the most specialized roles in plant-specific developmental processes [7]. Combining previous studies with the present one shows that subgroup 6 R2R3-MYB members in cotton have differentiated functionally, with roles including promoting the formation of red plants and regulating gland pigmentation, petal purple spot, and petal coloration. Potentially, colorful petals and fibers could be developed by engineering GaPC or other subgroup 6 R2R3-MYB members to alter anthocyanin biosynthesis, with the aims of attracting insect pollinators, increasing the ornamental value of flower petals, and developing cotton cultivars with colorful fibers.
We identified a case of recessive epistasis in cotton petal coloration and cloned GaPC, the gene controlling petal coloration, by map-based cloning. GaPC, encoding an R2R3-MYB transcription factor, controlled petal color by regulating the anthocyanin and flavone biosynthesis pathways. A 4814-bp LTR-RT insertion in the second exon of GaPC led to failure of petal coloration and produced the recessive white-petal phenotype. GaPC activates enzymes or genes of the flavonoid and anthocyanin pathways, leading to the formation of multiple petal colors. Subgroup 6 R2R3-MYB members in cotton play specialized roles in plant-specific developmental processes with high functional differentiation. This study not only contributes a genetic model of recessive epistasis and the associated molecular mechanism of petal color regulation, but also suggests engineering such genes to develop colored cotton petals or fibers for commercial application.
Caiping Cai: Formal analysis, Visualization, Writing - original draft. Fan Zhou: Formal analysis, Visualization, Investigation. Weixi Li: Software, Visualization. Yujia Yu: Validation. Zhihan Guan: Investigation. Baohong Zhang: Writing - review & editing. Wangzhen Guo: Supervision, Conceptualization, Writing - review & editing.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
We are grateful to Dr. Zuoren Yang from the Institute of Cotton Research of the Chinese Academy of Agricultural Sciences for providing the CLCrV-VIGS vectors. This work was supported by the Fundamental Research Funds for the Central Universities (KYZZ2022003) and the Jiangsu Collaborative Innovation Center for Modern Crop Production project (No. 10).
Supplementary data for this article can be found online at https://doi.org/10.1016/j.cj.2023.03.013.