High-fidelity (HiFi) sequencing has facilitated the assembly and analysis of the most repetitive region of the genome, the centromere. Nevertheless, our current understanding of human centromeres is based on a relatively small number of telomere-to-telomere assemblies, which have not yet captured their full diversity. In this study, we investigated the genomic diversity of human centromere higher-order repeats (HORs) using both HiFi reads and haplotype-resolved assemblies from hundreds of samples drawn from ongoing pangenome-sequencing projects, reprocessing them with a novel HOR annotation pipeline, HiCAT-human. We used this wealth of data to provide a global survey of the centromeric HOR landscape; in particular, we found that 23 HORs showed significant copy number variability between populations. We detected three centromere genotypes with unbalanced population frequencies on chromosomes 5, 8, and 17. An inter-assembly comparison of HOR loci further revealed that while HOR array structures are diverse, they nevertheless tend to form a number of specific landscapes, each exhibiting different levels of HOR subunit expansion and possibly reflecting a cyclical evolutionary transition from homogeneous to nested structures and back.
Shenghan Gao, Yimeng Zhang, Stephen J. Bush, Bo Wang, Xiaofei Yang, Kai Ye
Rice (Oryza sativa) is a significant crop worldwide with a genome shaped by various evolutionary factors. Rice centromeres are crucial for chromosome segregation and contain some unreported genes. Because the centromere region is diverse and complex, a comprehensive understanding of rice centromere structure and function at the population level is needed. We constructed a high-quality centromere map based on the rice super pangenome, consisting of a 251-accession panel comprising both cultivated and wild species of Asian and African rice. We showed that rice centromeres contain diverse CentO satellite repeats, which vary across chromosomes and subpopulations, reflecting their distinct evolutionary patterns. We also revealed that long terminal repeats (LTRs), especially young Gypsy-type LTRs, are abundant in the peripheral CentO-enriched regions and drive rice centromere expansion and evolution. Furthermore, high-quality genome assemblies and a complete telomere-to-telomere (T2T) reference genome enable us to obtain more centromeric genome information, even though mapping and cloning of centromere genes remain challenging. We investigated the association between structural variations and gene expression in the rice centromere. A centromere gene, OsMAB, which positively regulates rice tiller number, was further confirmed by expression quantitative trait loci, haplotype analysis, and clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 methods. By revealing new insights into the evolutionary patterns and biological roles of rice centromeres, our findings will facilitate future research on centromere biology and crop improvement.
Non-B-form DNA differs from the classic B-DNA double helix structure and plays a crucial regulatory role in replication and transcription. However, the role of non-B-form DNA in centromeres, especially in polyploid wheat, remains elusive. Here, we systematically analyzed seven non-B-form DNA motif profiles (A-phased DNA repeat, direct repeat, G-quadruplex, inverted repeat, mirror repeat, short tandem repeat, and Z-DNA) in hexaploid wheat. We found that three of these non-B-form DNA motifs were enriched at centromeric regions, especially at the CENH3-binding sites, suggesting that non-B-form DNA may create a favorable loading environment for the CENH3 nucleosome. To investigate the dynamics of centromeric non-B-form DNA during the alloploidization process, we analyzed DNA secondary structure using CENH3 ChIP-seq data from newly formed allotetraploid wheat and its two diploid ancestors. We found that newly formed allotetraploid wheat formed more non-B-form DNA in centromeric regions than its parents, suggesting that non-B-form DNA is related to the localization of the centromeric regions in newly formed wheat. Furthermore, non-B-form DNA enriched in the centromeric regions was found to form preferentially on young LTR retrotransposons, explaining CENH3's tendency to bind to younger LTRs. Collectively, our study describes the landscape of non-B-form DNA in the wheat genome and sheds light on its potential role in the evolution of polyploid centromeres.
Congyang Yi, Qian Liu, Yuhong Huang, Chang Liu, Xianrui Guo, Chaolan Fan, Kaibiao Zhang, Yang Liu, Fangpu Han
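As an illustration of one of the seven motif classes named in the abstract above, putative G-quadruplex motifs are often located with a simple pattern scan: four runs of at least three guanines separated by loops of 1 to 7 nucleotides. The sketch below is a minimal example under that assumption; it is not necessarily the method used in the study. The function names `find_g4_motifs` and `motif_density` are hypothetical, and only the given strand is scanned (C-rich motifs on the complementary strand are not reported).

```python
import re

# Widely used heuristic for putative G-quadruplexes (an assumption here,
# not the study's stated method): four G-runs of length >= 3, separated
# by three loops of 1-7 nucleotides each.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def find_g4_motifs(seq):
    """Return (start, end, motif) for non-overlapping matches on the given strand.

    Coordinates are 0-based, end-exclusive. Input case is ignored.
    """
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


def motif_density(seq, hits):
    """Fraction of the sequence covered by the reported motif hits."""
    if not seq:
        return 0.0
    covered = sum(end - start for start, end, _ in hits)
    return covered / len(seq)


if __name__ == "__main__":
    example = "TT" + "GGGAGGGTGGGCGGG" + "AA"  # toy sequence, not real data
    hits = find_g4_motifs(example)
    for start, end, motif in hits:
        print(start, end, motif)
    print(round(motif_density(example, hits), 3))
```

Comparing such a density between centromeric windows and the rest of the genome is one simple way to express the kind of enrichment the abstract describes.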
Dear Editor, Centromeres, the basis for cell division, offer essential insights into cell dynamics, genome stability, and evolutionary processes (McKinley and Cheeseman, 2016). Because of their ultra-high complexity, high-quality sequences of centromeric regions have long been difficult to obtain, hindering studies of centromere function, evolution, and variation.
Dear Editor, Pumpkin (Cucurbita maxima) belongs to the Cucurbita genus of the Cucurbitaceae family and is a globally cultivated vegetable crop with great economic significance. The global production and planting area of pumpkin, including C. maxima, C. moschata, and C. pepo, reached 7.38 million tons and more than 0.39 million hectares in 2022 (https://www.fao.org/).
Qingguo Zeng, Minghua Wei, Shuai Li, Haiyan Wang, Changjuan Mo, Li Yang, Xinzheng Li, Zhilong Bie, Qiusheng Kong