Xuemeng Wang, Han Zhang, Rui Song, Ming Sun, Ping Liu, Peixin Tian, Peisheng Mao, Shangang Jia
College of Grassland Science and Technology, China Agricultural University, Beijing 100193, China
Keywords: Hard seed; Multispectral imaging; Transcriptomics; Metabolomics; ABA
ABSTRACT Physical dormancy (PY), which is common in the seeds of higher plants, is believed to cause germination failure in hard seeds of legume species through an impermeable seed coat, rather than physiological dormancy (PD). In this study, a non-destructive approach based on multispectral imaging was used to distinguish hard seeds from non-hard seeds in Medicago sativa, with an accuracy of 96.8%-99.0%. We further adopted multi-omics strategies to investigate the physiological, metabolomic, methylomic, and transcriptomic differences in alfalfa hard seeds, with non-hard seeds as the control. The hard seeds showed dramatically increased antioxidants and 125 significantly different metabolites in the non-targeted metabolomics analysis, which were enriched in the biosynthesis pathways of flavonoids, lipids, and hormones; in particular, ABA, a hormone known to induce dormancy, was significantly higher. In the transcriptomics results, enrichment of the “response to abscisic acid” pathway among differentially expressed genes (DEGs) supported the key role of ABA indicated by the metabolomics results. The methylome analysis identified 54,899, 46,216, and 54,452 differentially methylated regions for the CpG, CHG, and CHH contexts, respectively, and 344 DEGs might be regulated by hypermethylation or hypomethylation of promoter and exon regions, including four ABA- and JA-responsive genes. Among the 8% of seeds in the seed lots that were hard, 24.5% still did not germinate after scarification of the seed coat and were named non-PY hard seeds. Compared with hard seeds, significantly higher ratios of ABA/IAA and ABA/JA were found in non-PY hard seeds, which indicated the potential presence of PD. In summary, the significantly changed metabolites, gene expression, and methylation all suggested the involvement of ABA responses in hard seeds, and germination failure of alfalfa hard seeds was caused by combinational dormancy (PY+PD) rather than PY alone.
Alfalfa (Medicago sativa) is the most important perennial forage crop worldwide, with high biomass yield and protein content. Despite being a dicot and a legume, it is known as the “King of Grass” because of its feed value and significance in the forage industry [1]. Alfalfa is widely used as fodder for cattle and other livestock, and new uses have extended to alfalfa protein foods for pets, fish, and even humans, such as alfalfa meal [2]. The benefits provided by alfalfa include food production, soil nitrogen fixation and conservation, biodiversity, and other ecological benefits [3]. High-quality alfalfa seeds are essential for the reproduction and production of alfalfa, and their international trade is also large in the forage and crop industries. It is widely known that hard seeds in alfalfa cannot be avoided, and this trait reduces seed price and value because of its negative effect on germination under field conditions. Hard seeds with impermeable coats remain dry and firm during imbibition [4]. Hard seeds can benefit the seed industry to some extent, for example by extending seed longevity, enhancing seed preservation, and making seed transportation more convenient, and they are also beneficial for seed storage in germplasm banks [5]. Unfortunately, hard seeds cause serious problems for agricultural production, such as prevention of rapid imbibition and synchronous germination, nonuniform seedling establishment, increased weed competition, and even failure of germination [6].
Seed dormancy, i.e., delayed germination after embryogenesis, is a normal, developmentally controlled, and necessary aspect of the life cycle of most higher plants. However, the degree of dormancy varies; it can confer an inability to germinate under suitable environmental conditions and can seriously affect the yield and quality of crops [7]. Seed dormancy is usually divided into five categories: two morphology-related ones, morphological dormancy (MD) and morphophysiological dormancy (MPD); hormone-mediated physiological dormancy (PD); hardseededness, or physical dormancy (PY); and combinational dormancy (PY+PD) [8]. MD and MPD are related to underdeveloped embryos, and germination takes longer in MPD than in MD. PD is the major type of seed dormancy in higher plants, such as Arabidopsis thaliana, Nicotiana plumbaginifolia, and Echinochloa crusgalli, and it is controlled by hormones, including GA (gibberellin), ABA (abscisic acid), IAA (indoleacetic acid), JA (jasmonic acid), and ETH (ethylene) [9]. The balance of ABA and other hormones affects the degree of seed dormancy. For example, the ABA/GA balance influences seed germination, as higher ABA levels lead to seed dormancy and lower levels lead to dormancy release [10]. Furthermore, ethylene promotes germination by reducing the sensitivity of seeds to ABA, while the interaction between JA and ABA inhibits seed germination [11]. PY is also called hard seed or hardseededness and exists widely in higher plants; it is found mainly in Malvaceae, Convolvulaceae, Rosaceae, and especially Leguminosae. Traditionally, the dormancy of alfalfa seeds has been attributed to PY [12]. Hard seeds are beneficial to wild plant species, where longer-term dormancy is an environmental adaptation, but problematic for crop species, where rapid, uniform germination is more valuable.
In contrast to the abundant information on physiological dormancy in seeds of A. thaliana, physiological and molecular research on hard seeds of legumes is very limited and focuses mostly on morphological characteristics and the contents of individual components [13]. For example, seed coat components, including polyphenols, flavonoids, lignin, and lignans, were highly enhanced in hard seeds of fava bean (Vicia faba) and pea (Pisum sativum) [14]. In a previous study, phenolic compounds were strongly related to hard seeds, and epicatechin played a positive role in hard seeds of wild soybean (Glycine soja) under various levels of water and gas [15], while the contents of hydroxylated fatty acids were higher in hard seeds of wild soybean than in non-hard seeds [16]. Quercetin content also showed a positive correlation with hard seeds in pea [4]. Some transcription factors, such as KNOX4 and KCS12, were suggested to contribute to hardseededness in Medicago truncatula, and irregular wrinkles in the outermost cuticle could not be found in imbibed hard seeds [16,17]. The distribution of DNA methylation in plant genomes is centrally involved in seed development, maturation, and germination. Because regulation by DNA methylation does not involve changes in DNA sequence, it may give seeds the potential to adapt to abiotic and biotic stresses [18]. A previous study found that dynamic changes in DNA methylation influenced seed dormancy [19]. To date, the seed coat composition and molecular mechanisms affecting the degree of dormancy remain unclear in alfalfa hard seeds.
Because there is no visible morphological difference between alfalfa hard and non-hard seeds, it is difficult to distinguish them, and the most convenient and effective way is to soak the seeds in water [20]. Other traditional methods, such as phenol detection and nitroblue tetrazolium staining, all damage the seed structure [15]. However, imbibition is very time-consuming, and moreover, non-hard seeds cannot maintain their original state because they germinate. These limitations have impeded further physiological and molecular comparisons of hard and non-hard seeds. Multispectral imaging (MSI) captures the spatial and spectral information of target samples and shows great potential for screening hard seeds [12]. In this study, we distinguished hard seeds from non-hard seeds using MSI and then compared their physiological, transcriptomic, metabolomic, and methylomic differences. Previous studies have generally reported PY in legume seeds, but by combining multi-omics analysis and MSI we showed a PY+PD pattern in alfalfa hard seeds, as 24.5% of hard seeds still could not germinate after scarification of the seed coat.
Seeds of alfalfa cultivar “Zhongmu No.1” (Fig. 1A) were harvested in 2019 at the full ripe stage from fields in Jiuquan, Gansu, China (latitude 39°37′N, longitude 98°30′E, elevation 1480 m) and stored in the laboratory of China Agricultural University (25 °C and 35% relative humidity) for the following experiments conducted in 2021, including physiological examination, metabolomics, methylomics, and transcriptomics.

Fig. 1. Overview of the discrimination of hard and non-hard seeds by MSI. (A) Seed appearance under three modes (RGB, Gray, and nCDA). (B) Reflectance at 19 wavelengths in hard and non-hard seeds. * and ** on the lines indicate significant differences at P < 0.05 and P < 0.01, respectively, and NS indicates no significant difference. (C) nCDA images of hard (lower panel) and non-hard (upper panel) seeds, corresponding to their actual germination status (D).
In the germination test, 25 seeds were placed in an 11-cm-diameter petri dish with three filter papers wetted with 10 mL of distilled water. The dishes were placed in a 20 °C incubator with a cycle of 16 h light and 8 h darkness. Occasionally, 1.5 mL of distilled water was added to keep the filter papers damp. After 10 days, seeds that had failed to germinate were classified as hard, and hard (H) and non-hard (N) seeds were recorded and counted. In the verification test of physiological dormancy, >400 hard seeds, which had been identified by MSI screening, were placed between two flat sandpapers to scarify their seed coats. The scarified hard seeds were then moved to petri dishes for germination tests under the same conditions and procedures as above. The seeds that could not germinate after scarification of the seed coat were named non-PY hard seeds and were used for hormone measurements.
The seeds harvested at the full ripe stage were scanned in a petri dish with a VideometerLab4 (Videometer A/S, Herlev, Denmark), according to a previous method [21]. Briefly, 25 seeds were placed in a single petri dish for MSI signal recording, and >400 seeds in total were examined before the germination test. In addition to the 19 wavelengths, the contents of chlorophyll a and b were recorded based on excitation/emission fluorescence at 630/700 nm and 405/600 nm, respectively, in the VideometerLab4. The morphological and spectral data were exported from the VideometerLab4 and used for multivariate analysis with five models, i.e., PCA (principal component analysis), ELM (extreme learning machine), RF (random forest), SVM (support vector machine), and LDA (linear discriminant analysis). These models were implemented with the R packages FactoMineR [22], elmNNRcpp [23], randomForest [24], e1071 [25], and MASS [21], respectively. In addition, nCDA (normalized canonical discriminant analysis), performed in Videometer software version 4, was employed to verify the prediction of hard and non-hard seeds by the above five models.
The hard and non-hard seeds identified by MSI were used to measure the activities of antioxidant enzymes. The activity of superoxide dismutase (SOD) (EC 1.15.1.1) was measured based on absorbance at 560 nm in 3 mL of reaction mixture containing 1.5 mL of 50 mmol L⁻¹ potassium phosphate buffer (pH 7.8), 0.3 mL of ethylenediaminetetraacetic acid (EDTA), 0.2 mL of distilled water, 0.3 mL of nitroblue tetrazolium (NBT), 0.3 mL of riboflavin, 0.3 mL of methionine (Met), and 0.1 mL of enzyme. The activity of guaiacol peroxidase (POD) (EC 1.11.1.7) was determined from the change in absorbance at 470 nm within 1 min in 2.97 mL of reaction mixture containing 0.02 mL of enzyme, 0.15 mL of guaiacol, 0.1 mL of H2O2, and 2.7 mL of 100 mmol L⁻¹ potassium phosphate buffer (pH 6.0). The activity of catalase (CAT) (EC 1.11.1.6) was determined from the change in absorbance at 240 nm in 3 mL of reaction mixture containing 2.95 mL of H2O2 and 0.05 mL of enzyme.
Malondialdehyde (MDA) concentration was calculated as a measure of lipid peroxidation by measuring absorbance at 450 nm, 532 nm, and 600 nm in a reaction mixture containing 0.5 mL of 0.5% thiobarbituric acid (TBA), 1% trichloroacetic acid (TCA), and 0.6 mL of supernatant. Bovine serum albumin (BSA) was used to make the standard curve, and the absorbance at 595 nm of samples with Coomassie brilliant blue G-250 dye was used to measure the total soluble protein content. The contents of ABA, GA, IAA, and JA were measured by enzyme-linked immunosorbent assay (ELISA).
All the hard and non-hard seeds identified by multispectral imaging were subjected to non-target metabolomic analysis. The seeds, kept at -80 °C, were ground into powder in liquid nitrogen before being submitted to Novogene Company (Beijing, China) for the non-target metabolomics survey. Briefly, 100 mg of powder was resuspended in prechilled 80% methanol. The suspension was incubated on ice for 5 min and then centrifuged at 15,000 × g and 4 °C for 20 min. The supernatant was diluted with LC-MS grade water to a final concentration of 53% methanol, transferred to a fresh Eppendorf tube, and centrifuged again at 15,000 × g and 4 °C for 20 min. Finally, the supernatant was injected into the LC-MS/MS system for analysis [26]. The raw LC-MS data were processed with Compound Discoverer 3.1 (CD3.1, ThermoFisher) for peak management and metabolite determination. The metabolites were annotated against three databases: Kyoto Encyclopedia of Genes and Genomes (KEGG) (https://www.genome.jp/kegg/pathway.html), Human Metabolome Database (HMDB) (https://hmdb.ca/metabolites), and LIPID Maps (https://www.lipidmaps.org/). PCA and partial least squares discriminant analysis (PLS-DA) were performed with metaX software [27]. Differential metabolites were determined based on the cutoffs of variable influence on projection (VIP) > 1, P < 0.05, and fold change > 2 or < 0.5. Volcano and KEGG enrichment plots were made with the R package ggplot2.
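The three cutoffs for differential metabolites can be illustrated with a minimal Python sketch. It is not the pipeline used in the study (which relied on metaX and R); the class name `MetaboliteStat`, the function `classify_metabolite`, the metabolite labels, and all numeric values are hypothetical, and the fold change is assumed to be expressed as hard / non-hard abundance.

```python
from dataclasses import dataclass


@dataclass
class MetaboliteStat:
    name: str           # hypothetical metabolite label
    vip: float          # variable influence on projection from PLS-DA
    p_value: float      # P-value of the hard vs non-hard comparison
    fold_change: float  # assumed direction: hard / non-hard abundance


def classify_metabolite(m: MetaboliteStat) -> str:
    """Apply VIP > 1, P < 0.05, and fold change > 2 or < 0.5."""
    if m.vip <= 1.0 or m.p_value >= 0.05:
        return "ns"    # fails the VIP or P-value cutoff
    if m.fold_change > 2.0:
        return "up"    # more abundant in hard seeds
    if m.fold_change < 0.5:
        return "down"  # less abundant in hard seeds
    return "ns"        # significant but fold change too small


# Made-up values for illustration only, not data from the study.
demo = [
    MetaboliteStat("metabolite_A", vip=1.8, p_value=0.003, fold_change=3.1),
    MetaboliteStat("metabolite_B", vip=1.4, p_value=0.020, fold_change=0.3),
    MetaboliteStat("metabolite_C", vip=0.7, p_value=0.001, fold_change=4.0),
    MetaboliteStat("metabolite_D", vip=1.6, p_value=0.200, fold_change=2.5),
    MetaboliteStat("metabolite_E", vip=1.2, p_value=0.010, fold_change=1.3),
]
labels = {m.name: classify_metabolite(m) for m in demo}
print(labels)
```

All three criteria must hold at once, so a metabolite with a large fold change but a VIP below 1 (metabolite_C) is not counted as differential.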
The hard and non-hard dry seeds identified by MSI were used for total RNA extraction, and RNA purity and concentration were evaluated according to a previous method [28]. Total RNA was used for complementary DNA (cDNA) synthesis and sequencing library construction, and the libraries were sequenced on an Illumina NovaSeq platform. Three biological replicates were used for the RNA-Seq experiments. Differentially expressed genes (DEGs) were determined with the R package DESeq2 [29]. PCA, heatmap, and volcano plots were made with the R packages ggbiplot, complexheatmap, and ggplot2, respectively. KEGG and Gene Ontology (GO) enrichment analyses were performed with the R package pathview.
The hard and non-hard dry seeds identified by MSI were used for DNA extraction. Ten seeds each of the hard and non-hard groups were pooled and ground into fine powder in liquid nitrogen. DNA was extracted using the CTAB method [28] and was sequenced on an Illumina NovaSeq6000 (PE150) after sodium bisulfite treatment and library construction. Differentially methylated regions (DMRs) were identified using Bismark [30] alignment and the methylKit pipeline, based on a window of 1000 bp [31]. Quantitative real-time PCR (qRT-PCR) was conducted to assess the expression of genes related to DNA methylation, with the primers listed in Table S3, according to our previous method [28].
Fourteen morphological features were retrieved from the MSI data of all the seeds (Table S1) and compared between hard and non-hard seeds. In total, eight morphological features, such as area, length, and hue, differed significantly (P < 0.05) between hard and non-hard seeds, whereas no differences were found in the other six features.
A significant difference in mean reflectance was observed between hard and non-hard seeds: the reflectance of non-hard seeds (N) was higher than that of hard seeds (H) across the whole wavelength region of 365 nm to 970 nm, especially from 490 nm to 970 nm (Fig. 1B). Additionally, significant differences in the fluorescence of chlorophyll a and b were detected (Fig. S1A, B, D, E). The fluorescence intensity in hard seeds was significantly lower than that in non-hard seeds (Fig. S1C, F).
Six models (i.e., PCA, SVM, LDA, RF, ELM, and nCDA) were established to classify hard and non-hard seeds based on the morphological and spectral features. The PCA showed that the first two principal components (i.e., PC1 and PC2) explained 50.1% of the original variance, and only a slight spatial separation was found between the two groups (Fig. S2A). The variable correlation plot showed that the spectral data contributed more to the PCA than the morphological data (Fig. S2B), especially for 490-780 nm in the visible region. The second principal component was mainly correlated with the wavelengths from 365 to 450 nm, and the third principal component was mostly associated with the morphological data.
Unlike the poor performance of PCA, the LDA visualization showed that the hard seeds were clearly separated from non-hard seeds and dead seeds (Fig. 2A). The average accuracy of the SVM, LDA, RF, and ELM models in classifying hard and non-hard seeds was 97.5%, 98.4%, 99.0%, and 96.8%, respectively (Table S2). The sensitivity of the four models ranged from 93.5% to 99.7%, with the highest value for the LDA model. The specificity ranged from 97.2% to 100.0%, with the highest value for ELM. We combined the predictions of the SVM, LDA, RF, and ELM models to define hard and non-hard seeds, together with the color plotting of the nCDA model as a double check. With red and blue colors for non-hard and hard seeds, respectively, in the nCDA model (Fig. 1C), we found that the predictions of the nCDA model were always consistent with the actual germination results (Fig. 1D).
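For readers less familiar with these metrics, the following minimal Python sketch shows how accuracy, sensitivity, and specificity follow from predicted and actual seed classes. It assumes that hard seeds ("H") are treated as the positive class, which the text does not state explicitly; the function name `binary_metrics` and the ten example labels are hypothetical and are not data from Table S2.

```python
def binary_metrics(actual, predicted, positive="H"):
    """Compute accuracy, sensitivity, and specificity for two classes."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    return {
        "accuracy": (tp + tn) / len(pairs),  # all correct calls
        "sensitivity": tp / (tp + fn),       # hard seeds found
        "specificity": tn / (tn + fp),       # non-hard seeds found
    }


# Made-up labels: H = hard, N = non-hard (germination result vs model call).
actual = ["H", "H", "H", "H", "N", "N", "N", "N", "N", "N"]
predicted = ["H", "H", "H", "N", "N", "N", "N", "N", "N", "H"]
metrics = binary_metrics(actual, predicted)
print(metrics)
```

With strongly unbalanced classes, as in a seed lot with few hard seeds, accuracy alone can look high even when many hard seeds are missed, which is why sensitivity and specificity are reported alongside it.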

Fig. 2. LDA and PCA plots for MSI, RNA-seq, and non-target metabolomics. (A) LDA plot based on the MSI data. (B) PCA plot based on the transcriptome data. (C) PCA plot based on the metabolome data in negative ion mode. (D) PCA plot based on the metabolome data in positive ion mode.
To discover the physiological differences between hard and non-hard seeds, we conducted a comparative analysis of several important physiological parameters. The activities of SOD, POD, and CAT in hard seeds were higher than those in non-hard seeds (P < 0.01) (Fig. 3A-C), while the contents of soluble protein and MDA in hard seeds were significantly lower than those in non-hard seeds (P < 0.01) (Fig. 3D, E).

Fig. 3. Physiological differences between hard and non-hard seeds. (A) Activity of SOD. (B) Activity of POD. (C) Activity of CAT. (D) Content of soluble protein. (E) Content of MDA. ** indicates a significant difference at P < 0.01, based on an independent t-test.
To evaluate the differences in metabolite composition between hard and non-hard seeds identified by MSI, we conducted a non-target metabolomic survey. First, the PLS-DA results separated the two groups of hard and non-hard seeds (Fig. 2C, D). In total, 937 metabolites were detected, including 345 in negative mode and 591 in positive mode, among which 63 and 62 differential metabolites (P < 0.05) were identified for the negative and positive ion modes, respectively (Tables S4, S5). In the negative ion mode, 25 and 38 metabolites were significantly up-regulated and down-regulated, respectively (Fig. S3A), while in the positive ion mode, 25 and 37 metabolites were significantly up-regulated and down-regulated, respectively (Fig. S3B).
Differential metabolites were clustered into two groups, those up-regulated and those down-regulated in hard seeds. The heatmaps depict the distribution of the differential metabolites between hard and non-hard seeds (Fig. 4A, B). In the negative ion mode, the contents of gibberellin A4 and 3-indoleacrylic acid were higher in hard seeds than in non-hard seeds, and the content of jasmonic acid was lower in hard seeds (Fig. 4A). In the positive ion mode, the contents of daidzin, daidzein, and abscisic acid were higher in hard seeds than in non-hard seeds (Fig. 4B). In addition, we conducted a metabolite-metabolite correlation analysis to reveal the mutual relationships among the differential metabolites, as metabolites are often closely linked. Among the top 20 differential metabolites in the negative ion mode, organic acids dominated the list of significant metabolites, and quercetin was positively correlated only with 5-aminovaleric acid, LPG 16:1, and sinapinic acid (Fig. S4A), all of which had enhanced contents in hard seeds (Fig. 4A). Among the top 20 differential metabolites in the positive ion mode, isoliquiritigenin was positively correlated only with daidzin (Fig. S4B), and the levels of both were significantly increased in hard seeds (Fig. 4B). According to the LIPID MAPS annotation of these differential metabolites (Fig. S5), there were five categories in the negative ion mode: fatty acids (FA), glycerophospholipids (GP), polyketides (PK), prenol lipids (PR), and sterol lipids (ST). “Fatty acids and conjugates” was the largest enriched category (Fig. S5A). The most significantly enriched classes were “fatty acids and conjugates” in FA, glycerophosphocholines in GP, flavonoids in PK, isoprenoids in PR, and steroids in ST (Fig. S5B).

Fig. 4. Differential metabolites and KEGG enrichment between hard and non-hard seeds in the non-target metabolomics analysis. (A) Differential metabolites in negative ion mode. H, hard seeds with four replicates; N, non-hard seeds with four replicates. (B) Differential metabolites in positive ion mode. (C) KEGG enrichment bubble chart based on differential metabolites in negative (left) and positive (right) ion modes. The color of the dots represents the P-value in the hypergeometric test. The circle size represents the number of differential metabolites.
In the KEGG enrichment results, the metabolic pathways enriched in hard seeds were mostly related to biosynthesis and degradation (Fig. 4C). In total, 76 and 72 metabolic pathways were identified in the negative and positive ion modes, respectively. Biosynthesis of secondary metabolites was a significant metabolic pathway shared by both ion modes, involving metabolites such as rutin, quercetin, L-tryptophan, daidzein, nicotinic acid, formononetin, isoliquiritigenin, and bilirubin (Tables S6, S7). The most significantly enriched metabolic pathways in the negative ion mode were “alpha-Linolenic acid metabolism” and “flavone and flavonol biosynthesis”, which had the smallest P-values (Table S6). It is known that α-linolenic acid regulates JA synthesis and has been suggested to be involved in breaking dormancy [32]. The majority of the identified metabolites are components of the flavonoid biosynthesis pathway, as quercetin-3-o-β-D-glucopyranosyl-6-acetate, rutin, daidzein, daidzin, isoliquiritigenin, quercetin, glycitin, and formononetin were all more abundant in hard seeds than in non-hard seeds. The contents of 13(S)-HOTrE and JA in α-linolenic acid metabolism were lower in hard seeds (Fig. 4A). Isoflavonoid biosynthesis was the most significant metabolic pathway in the positive ion mode, as the contents of daidzein, formononetin, glycitin, and daidzin in hard seeds were higher than those in non-hard seeds (Fig. 4C). In conclusion, the significant differences between hard and non-hard seeds may result from hormones, lipids, and flavonoids.
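The P-values in the enrichment bubble chart come from a hypergeometric test. As a rough illustration of how such a P-value is obtained, the Python sketch below computes the upper-tail probability of seeing at least k pathway members among the differential metabolites. The function name `hypergeom_enrichment_p` and the example counts are hypothetical, and the exact background set used in the study is not specified in the text.

```python
from math import comb


def hypergeom_enrichment_p(N, K, n, k):
    """Upper-tail hypergeometric P-value, P(X >= k).

    N: annotated metabolites in the background
    K: background metabolites that belong to the pathway
    n: differential metabolites
    k: differential metabolites that belong to the pathway
    """
    total = comb(N, n)
    upper = min(K, n)
    # Sum the probabilities of drawing i pathway members, for i = k..upper.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Made-up counts for illustration only, not values from Tables S6 or S7.
p_demo = hypergeom_enrichment_p(N=20, K=5, n=4, k=2)
print(round(p_demo, 4))
```

A small P-value means that the pathway contains more differential metabolites than expected from random sampling of the background, which is what the dot color in an enrichment bubble chart encodes.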
Dry seeds maintain a low level of transcriptional activity despite their glassy state, and some of the RNA transcripts may originate from the developing seeds before harvest. To uncover the transcriptional differences between hard and non-hard seeds at the full ripe stage, we performed transcriptome sequencing of the two groups with three replicates. A total of 478 genes were identified as differentially expressed genes (DEGs) at |log2FC| > 1 and P < 0.05, including 121 up-regulated and 357 down-regulated DEGs. The PCA clearly separated the hard seeds from the non-hard seeds (Fig. 2B). GO analysis enriched the DEGs mainly in eight categories, including “response to wounding”, “response to abscisic acid”, and “phosphoprotein phosphatase activity” (Fig. 5A), while KEGG enrichment placed these DEGs into three pathways, “protein processing in endoplasmic reticulum”, “glycolysis or gluconeogenesis”, and “pyruvate metabolism” (Fig. 5B).

Fig. 5. Transcriptome analysis in hard and non-hard seeds. (A) GO enrichment for DEGs between hard and non-hard seeds. (B) KEGG enrichment for DEGs. (C) Two down-regulated genes in the KEGG pathways of signal transduction of ABA (MsG0380015362.01.T01, i.e., PP2C) and JA (MsG0280008436.01.T01, i.e., JAZ). (D) Expression heatmap of key DEGs related to hormones. H, hard seed; N, non-hard seed.
We further found that alfalfa hard seeds exhibited significant changes in the transcription of genes encoding enzymes in hormone signaling pathways, consistent with the results of the non-target metabolomics analysis. This suggested that ABA responses are prominent in hard seeds. For example, protein phosphatase 2C (PP2C; KEGG ID k14497), which acts as a central component and key negative regulator of ABA signaling, showed significantly decreased expression in hard seeds (Fig. 5C). Nine more ABA-related genes were also down-regulated in hard seeds (Fig. 5D; Table S8), and they appeared to be related to negative regulation of the abscisic acid-activated signaling pathway. JASMONATE ZIM DOMAIN (JAZ) genes (KEGG ID k13464), which are involved in JA responses as transcriptional repressors, were significantly down-regulated in hard seeds (Fig. 5C; Table S8). Strong down-regulation of the gibberellin inactivation gene GA2ox (k04125), which is related to GA deactivation, was suggested in hard seeds, while two DEGs, MsG0880042725.01.T01 (gibberellin biosynthetic process) and MsG0880046135.01.T01 (response to gibberellin), were significantly up-regulated (Table S8); all of these are consistent with the gibberellin increase in the non-target metabolomics analysis. Two DEGs related to auxin were slightly down-regulated, i.e., MsG0280010584.01.T01 (indoleacetic acid biosynthetic process) and MsG0580024862.01.T01 (methyl indole-3-acetate esterase). We also observed DEGs in the pathways of ethylene signaling and metabolism. For example, ethylene-responsive element binding protein (EREBP, k09286), which acts as a positive ethylene-dependent transcription factor, showed significantly decreased expression in hard seeds (MsG0380015649.01.T01 and MsG0580029378.01.T01) (Fig. 5D; Table S8).
To investigate the overall methylation patterns underlying seed dormancy, we conducted whole-genome bisulfite sequencing (WGBS) of hard and non-hard seeds and then decoded the corresponding methylation status. More than 300 million sequencing reads with a read length of 150 bp were obtained for each sample. Methylation analysis revealed that mCpG levels (~70%) were about two-fold higher than mCHG levels (~40%) and seven-fold higher than mCHH levels (~10%) in hard and non-hard seeds (Fig. S6A). We assessed the average methylation levels in the three contexts, and CpG and CHG exhibited higher methylation levels than CHH in both hard and non-hard seeds (Fig. S7). Differentially methylated regions (DMRs) were determined between hard and non-hard seeds (Fig. 6A) for the three contexts of CpG, CHG, and CHH. In total, 54,899, 46,216, and 54,452 DMRs were identified for CpG, CHG, and CHH, respectively, and in the CpG and CHH contexts there were more hyper-DMRs than hypo-DMRs (Fig. 6B). We further identified 344 DEGs (P < 0.05, 71.8% of all the DEGs) in these DMRs (Table S9). Notably, 17 DEGs were associated with selected hormone and lipid metabolites in hard seeds (Table S10). Compared with non-hard seeds, the expression of two DEGs related to ABA signaling was down-regulated in hard seeds, and their promoter regions were methylated in the CHH context (Table S10). Among them, the gene encoding ninja-family protein AFP3 (MsG0780036552) works as a negative regulator of the ABA response, which was consistent with its down-regulated expression (Fig. 5D) and the increased ABA in the metabolomics results (Fig. 4D) in hard seeds. Similarly, JA-related genes (MsG0280007019 and MsG0280008436) were methylated, and their expression was inhibited accordingly. This suggested that hypermethylation and hypomethylation during seed dormancy partially contributed to the regulation of gene expression and hardseededness. The qRT-PCR results also showed significantly decreased expression of DNA methyltransferases and DNA demethylases in hard seeds (Fig. S6B-E), including Clsy1, DME, DRM, and MET, which suggested that DNA methylation differs between hard and non-hard seeds.
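The three sequence contexts are defined by the bases following a cytosine: CpG when the next base is G, CHG when the next base is H (A, C, or T) and the one after is G, and CHH when both following bases are H. The Python sketch below illustrates this classification on the forward strand only, together with a read-count-weighted methylation level per context. It is an illustration under these assumptions, not the Bismark or methylKit implementation; the function names, the toy sequence, and the read counts are hypothetical.

```python
def cytosine_context(seq, i):
    """Return 'CpG', 'CHG', 'CHH', or None for position i (forward strand)."""
    seq = seq.upper()
    if i >= len(seq) or seq[i] != "C":
        return None
    nxt = seq[i + 1:i + 3]  # up to two downstream bases
    if len(nxt) >= 1 and nxt[0] == "G":
        return "CpG"
    if len(nxt) < 2 or nxt[0] not in "ACT":
        return None  # too close to the sequence end, or ambiguous base
    if nxt[1] == "G":
        return "CHG"
    if nxt[1] in "ACT":
        return "CHH"
    return None


def context_levels(seq, counts):
    """Weighted methylation level per context.

    counts maps position -> (methylated reads, unmethylated reads).
    """
    totals = {}
    for pos, (meth, unmeth) in counts.items():
        ctx = cytosine_context(seq, pos)
        if ctx is None:
            continue
        m, t = totals.get(ctx, (0, 0))
        totals[ctx] = (m + meth, t + meth + unmeth)
    return {ctx: m / t for ctx, (m, t) in totals.items() if t > 0}


# Toy sequence and made-up read counts, for illustration only.
toy_seq = "ACGCAGCTTC"
contexts = {i: cytosine_context(toy_seq, i)
            for i, base in enumerate(toy_seq) if base == "C"}
levels = context_levels(toy_seq, {1: (14, 6), 3: (4, 6), 6: (1, 9)})
print(contexts, levels)
```

The last cytosine of the toy sequence has no downstream bases, so its context cannot be assigned and it is excluded from the per-context levels.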

Fig. 6. Hyper- and hypo-DMRs in a comparison of hard and non-hard seeds. (A) Volcano plot for DMRs. The red (P < 0.05 and methylation difference value > 20) and blue dots (P < 0.05 and methylation difference value < -20) indicate the hyper- and hypo-DMRs in hard seeds compared with non-hard seeds, respectively, and the grey dots indicate the non-differentially methylated regions. (B) Numbers of hyper- and hypo-DMRs in hard seeds compared with non-hard seeds for CpG, CHG, and CHH.
Based on the significant changes in hormones indicated by the metabolomics and transcriptomics results, including ABA, JA, IAA, and GA, we hypothesized that the dormancy of alfalfa hard seeds results from “PY+PD” rather than PY alone. To validate this hypothesis, we performed hormone measurements in alfalfa seeds. We counted the hard seeds among all seeds and obtained a final ratio of 8%. We then scarified the seed coats of the hard seeds and found that 24.5% still could not germinate (Fig. 7A, B); these were named non-PY hard seeds. Physiological dormancy of seeds depends strongly on hormonal balance, i.e., a higher ABA content relative to other hormones, which was found in non-PY hard seeds. The ABA/GA ratio in hard seeds was significantly higher than that in non-hard seeds (Fig. 7C), suggesting its potential contribution to hardseededness. Strikingly, the ratios of ABA/IAA and ABA/JA were significantly higher in non-PY hard seeds, suggesting the presence of hormone-dependent seed dormancy (Fig. 7C), while ABA/GA was significantly decreased in non-PY hard seeds. Hard seeds thus appear to maintain a complicated physiological status in terms of the balance of ABA and other phytohormones, with an overall trend of increased ABA. In conclusion, our multi-omics results strongly suggested a combination of PY and PD in hard seed dormancy, which are related to the impermeable seed coat and to the balance of ABA and other plant hormones, respectively (Fig. 7D).
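To put the two reported percentages on a common scale, the short Python sketch below partitions a notional seed lot into non-hard seeds, hard seeds that germinate after scarification, and non-PY hard seeds. The lot size of 10,000 seeds, the function name `partition_seed_lot`, and the label "PY-only hard" are assumptions made for illustration; only the 8% and 24.5% figures come from the text.

```python
def partition_seed_lot(n_seeds, hard_fraction, non_py_fraction_of_hard):
    """Split a seed lot into non-hard, PY-only hard, and non-PY hard seeds."""
    hard = n_seeds * hard_fraction
    non_py_hard = hard * non_py_fraction_of_hard
    return {
        "non_hard": n_seeds - hard,
        "py_only_hard": hard - non_py_hard,  # germinate once scarified
        "non_py_hard": non_py_hard,          # still dormant after scarification
    }


# 8% hard seeds, of which 24.5% fail to germinate after scarification.
lot = partition_seed_lot(10_000, 0.08, 0.245)
share_non_py = lot["non_py_hard"] / 10_000
print({k: round(v) for k, v in lot.items()}, round(share_non_py, 4))
```

Under these figures, non-PY hard seeds make up about 1.96% of the whole lot (0.08 × 0.245), so the seeds showing the putative physiological component are a small but measurable fraction.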

Fig. 7. Hormone changes for physiological dormancy in non-PY hard seeds. (A) The ratio of non-PY hard seeds to hard seeds is 30:100 in this example petri dish. (B) Ratio of hard seeds in all seeds (~8%) and of non-PY hard seeds in hard seeds (24.5%). (C) ABA/GA, ABA/IAA, and ABA/JA in the three groups of non-hard, hard, and non-PY hard seeds. Different letters in columns indicate statistically significant differences at P < 0.05 (Duncan’s test). (D) Flow chart for the verification of PY+PD in hard seed dormancy in alfalfa.
To date, most studies of leguminous hard seeds have reported physical dormancy (PY), which is caused by an impermeable seed coat [13]. Germination tests can identify hard and non-hard seeds, but the latter cannot maintain their original state after imbibition. Therefore, few studies have shown the detailed differences between hard and non-hard seeds, although the physical limitations and dormancy caused by the hard seed trait are an area of active research. For example, physiological and biochemical pathways are reported to be involved in the production of the impermeable seed layer, and genes have been found to contribute to the PY of hard seeds in M. truncatula [17]. We used MSI and multivariate analysis to distinguish hard and non-hard seeds with high accuracy and without any destruction. This allowed us to investigate the metabolomic, transcriptomic, and methylomic differences between hard and non-hard seeds.
MSI has been successfully used in seed research, such as seed screening for different cultivars in alfalfa and rice [33,34], seed germination and vitality detection in alfalfa and soybean [21], and determination of seed health in cowpea and wheat [35,36]. It has been confirmed that MSI can discriminate hard seeds in some legumes, with an accuracy of 89.17%-90.0% in alfalfa, based on PCA, LDA, and SVM models [12]. In this study, we showed a much better performance of MSI (almost 100%), as the accuracy of distinguishing alfalfa hard and non-hard seeds reached 96.8%-99.0% by combining LDA, SVM, RF, and ELM, and nCDA provided transformed spectral images to double check the seed types. Multispectral features of seeds are significantly related to variations in physiological status and substance content, which have been the basis of seed classification [21]. For example, many studies suggested that phenolic substances conferring impermeability, especially epicatechin, showed a significant positive correlation with hardseededness and seed coat color [15]. In this study, the morphology-based MSI analysis showed that three color features differed very significantly (Table S1). In addition, the contents of lipid, sugar, and soluble protein in hard seeds were higher than those in non-hard seeds, based on the results of the physiological and non-targeted metabolomics analyses. An earlier study revealed that variations in the visible and near-infrared regions were due to changes in the color and physicochemical properties of seeds, respectively [37]. Therefore, MSI has great potential for linking spectral features to the physiological and molecular differences between hard and non-hard seeds [21,36], and the correlation between multispectral imaging data and seed components, for example amino acids [38], needs to be further explored in the future.
Resistance to stress is usually related to a high content of secondary metabolites [39]. In this study, we found many secondary metabolites with higher accumulation in hard seeds, including flavonoids, sesquiterpenoids, and phenolic acids, and these results are consistent with previous research on hard seeds [40]. Flavonoids are known for their antioxidant, radical-scavenging ability, and the levels of different flavonoids, such as quercetin, luteolin, and isoflavonoids, are involved in resistance to environmental stress [41,42]. In this study, many membrane glycerolipids were also detected (Table S11), including PE (phosphatidylethanolamine), PC (phosphatidylcholine), PI (phosphatidylinositol), PG (phosphatidylglycerol), and PA (phosphatidic acid), which might contribute to physical and biochemical defense in hard seeds. Lipids are known to be involved in stress defense and water permeability [43], both of which are related to hard seeds and their dormancy.
It is widely accepted that phytohormones are important for germination and physiological dormancy in seeds [44]. Of note, the dormancy of alfalfa hard seeds has been defined as PY [15,45], but this study also points to physiological dormancy: the metabolome showed that the ABA content in alfalfa hard seeds was significantly higher than that in non-hard seeds, and its interaction with GA, IAA, and JA was further confirmed by independent hormone measurement. ABA is the major regulator of seed dormancy, and seeds with higher ABA content have reduced germination [46]. Seeds harvested from ABA-deficient mutants germinated faster than the wild type, and transgenic plants overexpressing genes related to ABA biosynthesis showed deep dormancy [47]. The balance between ABA and other phytohormones, such as the ABA/GA ratio, has been widely studied in the control of seed dormancy [48]. A larger ABA/GA ratio lowered seed germination and induced seed dormancy [46]. In our study, ABA/GA was significantly higher in hard seeds than in non-hard seeds. Recent studies showed that the interaction of ABA and JA also resulted in delayed seed germination [11,46], and the non-dormant Arabidopsis ecotype Col-0 had a significantly higher endogenous JA content than the dormant ecotype Cvi [49]. In addition, an increase in IAA was reported to be required during seed germination, and a higher ABA/IAA ratio is expected to suppress seed germination and promote seed dormancy [50].
Typically, breaking the seed coat turns impermeable seeds into permeable ones. However, we found that 24.5% of alfalfa hard seeds still failed to germinate even after their seed coat was scarified (Fig. 7A), which suggests that factors other than seed coat limitation also contribute to seed dormancy. The evolution from PY to “PY+PD” has been proposed within Anacardiaceae, and phylogenetic and fossil evidence also seems to support the same evolutionary strategy in Leguminosae [51]. In this study, we found significant differences in ABA/GA, ABA/IAA, and ABA/JA in hard seeds and non-PY hard seeds, compared to non-hard seeds, which together support the presence of physiological dormancy and a PY+PD pattern in alfalfa hard seeds. The regulatory network underlying hormonal interactions, for example between ABA and JA, needs to be investigated in the future to dissect their complicated crosstalk patterns.
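As a back-of-the-envelope illustration of the proportions discussed above, the sketch below combines the 24.5% non-germination rate after scarification with the roughly 8% share of hard seeds in seed lots stated in the abstract. The variable names are hypothetical, and the calculation assumes that the 24.5% applies uniformly to that 8% fraction; it is not part of the original analysis.

```python
# Illustrative arithmetic only; variable names are hypothetical.
hard_fraction = 0.08        # share of hard seeds in a seed lot (from the abstract)
non_py_within_hard = 0.245  # share of hard seeds not germinating after scarification

# Share of the whole seed lot that remains dormant after scarification
non_py_in_lot = hard_fraction * non_py_within_hard

# Share of the whole seed lot whose dormancy is released by scarification
released_in_lot = hard_fraction * (1 - non_py_within_hard)

print(f"non-PY hard seeds in lot: {non_py_in_lot:.2%}")
print(f"hard seeds germinating after scarification: {released_in_lot:.2%}")
```

Under these assumptions, non-PY hard seeds would make up about 2% of a seed lot, with the remaining roughly 6% being hard seeds whose germination is restored by scarification.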
At higher JA content, the interaction of JA with JAZ and ABI5 (which activates ABA responses) results in seed germination [11]. This implies that the four down-regulated JA-related genes in hard seeds might be related to seed dormancy (Table S8). Besides hormone-related genes, genes involved in several other processes were also differentially expressed in hard seeds. For example, chitinases in soybean seeds are active in plant defense [52], and we found up-regulation of a chitinase gene (MsG0480020857.01.T01) in hard seeds, suggesting its potential role in hypoxia response. A high content of lipids is one of the key factors distinguishing hard seeds from non-hard seeds [17], and we found that four genes involved in lipid metabolic process (MsG0280007667.01.T01, MsG0280011074.01.T01, MsG0380015877.01.T01, and MsG0880044742.01.T01) were up-regulated. GmHs1-1 controls seed coat impermeability in wild soybean [53]. In Medicago truncatula, the KNOX4 and KCS12 genes are expressed in the seed coat and involved in the formation of PY, as PY was lost in mutants lacking their expression [17,45]. We searched for the corresponding genes of KNOX4 and KCS12 in our RNA-seq results and found that they were not expressed in alfalfa hard seeds at all. This suggests that different genes may be responsible for the formation of the impermeable seed coat in alfalfa, compared to M. truncatula. Our results provide a basis for further studies on the genes underlying seed coat limitation in alfalfa hard seeds.
In this study, MSI was successfully employed to predict hard and non-hard seeds in alfalfa, with a high accuracy of 96.8%-99.0%. Furthermore, we carried out multiple omics analyses, which revealed significant differences in physiology, transcriptomics, metabolomics, and methylome between hard seeds and non-hard seeds. Notably, we identified significant differences in hormone contents, including ABA, JA, GA, and IAA, in alfalfa hard seeds, which might inhibit seed germination. The transcriptomic analysis supported these hormonal changes, especially for ABA, which might be partly attributable to DNA methylation. Our results indicate that the dormancy of alfalfa hard seeds is caused by “PY+PD”, rather than PY alone.
Xuemeng Wang: Data curation, Methodology, Software, Investigation, Formal analysis, Visualization, Writing - original draft, Writing - review & editing. Han Zhang: Methodology, Data curation, Investigation, Validation, Writing - review & editing. Rui Song: Investigation, Validation. Ming Sun: Investigation, Methodology. Ping Liu: Validation. Peixin Tian: Investigation. Peisheng Mao: Resources, Funding acquisition, Supervision. Shangang Jia: Conceptualization, Methodology, Project administration, Supervision, Funding acquisition, Writing - review & editing.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
We thank Dr. David Holding at the University of Nebraska-Lincoln for polishing the manuscript. This research was supported by the earmarked fund for CARS (CARS-34), the National Key Research and Development Program of China (2022YFD1300804), and the Key R&D Project of Sichuan Science and Technology Program (2023YFSY0012).
Supplementary data for this article can be found online at https://doi.org/10.1016/j.cj.2023.03.003.