Chinese Journal of Vector Biology and Control ›› 2022, Vol. 33 ›› Issue (1): 54-61. DOI: 10.11853/j.issn.1003.8280.2022.01.010

• Experimental Research •

  • Corresponding authors: WANG Jian-guo, E-mail: jgwang@jxau.edu.cn; WANG Chang-lu, E-mail: changluw@rutgers.edu
  • About the author: LI Ting, female, master's student, engaged in research on agricultural insects and pest control, E-mail: tinglilsy@qq.com

Transcriptome analysis of Cimex hemipterus (Hemiptera: Cimicidae)

LI Ting1, LIAO Song1, XU Ye1, WANG Chang-lu2, WANG Jian-guo1   

  1. Laboratory of Invasion Biology, School of Agricultural Sciences, Jiangxi Agricultural University, Nanchang, Jiangxi 330045, China;
    2. Department of Entomology, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA
  • Received: 2021-08-16 Online: 2022-02-20 Published: 2022-02-17
  • Supported by:
    Jiangxi Province 2018 "Double Thousand Talents" Plan (No. jxsq2018102116); Jiangxi Province National Foreign Experts Program (No. G20200222010, G2021022002)

Abstract: Objective To obtain transcriptome data for Cimex hemipterus and provide a basis for subsequent studies of gene function. Methods A mixed sample of C. hemipterus was sequenced on the Illumina NovaSeq 6000 high-throughput sequencing platform, and the valid sequencing data were assembled into unigenes with Trinity. The unigenes were aligned against and functionally annotated with the non-redundant protein database (NR), Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), Evolutionary Genealogy of Genes: Non-supervised Orthologous Groups (eggNOG), and Swiss-Prot. Simple sequence repeat (SSR) loci were identified with MISA, and single nucleotide polymorphisms (SNPs) were detected with Samtools and GATK. Results Quality control of the raw sequencing data yielded 5.49 Gb of high-quality data. Assembly produced 21 619 unigenes with a total length of 29 322 540 bp and an average length of 1 356 bp. Alignment against the five public databases yielded 45 362 functionally annotated unigenes; in addition, 10 906 coding sequences (CDSs) were predicted, and 7 754 SSR loci and 33 144 SNP loci were detected. Conclusion A transcriptome database of C. hemipterus was established, and 21 619 unigenes were assembled and functionally annotated. These data lay a foundation for future work on C. hemipterus, including functional gene studies, development of SSR and SNP molecular markers, genetic diversity analysis, and genetic map construction.
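
The assembly summary in the Results (unigene count, total length, average length) can be recomputed from any unigene FASTA file. The following is a minimal sketch, not the authors' pipeline: the function names and the toy sequences are hypothetical, and only the two totals in the final check come from the abstract.

```python
from io import StringIO


def read_fasta_lengths(handle):
    """Yield the length of each sequence in a FASTA-formatted text stream."""
    length = None  # None until the first header line is seen
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif length is not None:
            length += len(line)
    if length is not None:
        yield length


def assembly_stats(lengths):
    """Return sequence count, total length and mean length in bp."""
    lengths = list(lengths)
    count = len(lengths)
    total = sum(lengths)
    mean = total / count if count else 0.0
    return {"count": count, "total_bp": total, "mean_bp": mean}


# Toy input (hypothetical sequences, for illustration only).
toy_fasta = StringIO(">u1\nACGTACGT\nACGT\n>u2\nACGTAC\n>u3\nAC\n")
stats = assembly_stats(read_fasta_lengths(toy_fasta))
print(stats)

# Consistency check on the figures reported in the abstract:
# total length divided by unigene count should match the reported mean.
reported_mean = 29_322_540 / 21_619
print(round(reported_mean))  # 1356
```

Dividing the reported total length by the reported unigene count gives about 1 356.3 bp, which agrees with the stated average of 1 356 bp.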

Key words: Cimex hemipterus, Transcriptome, Bioinformatics, Microsatellite, Single nucleotide polymorphism
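
To illustrate what the SSR (microsatellite) screening mentioned in the abstract looks for, here is a simplified regex-based sketch. It is not MISA and does not reproduce its output: the minimum-repeat thresholds are assumed placeholder values (the abstract does not state the settings used), the function names and test sequence are hypothetical, and compound or overlapping SSRs are not handled.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem repeats.
# These are illustrative values, not the settings used in the study.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. ATAT)."""
    k = len(motif)
    return not any(
        k % d == 0 and motif[:d] * (k // d) == motif for d in range(1, k)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (1-based start, motif, repeat count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if not is_primitive(motif):
                continue  # already reported under its shorter unit, if long enough
            hits.append((m.start() + 1, motif, len(m.group(0)) // k))
    return sorted(hits)


# Hypothetical test sequence containing (AT)7 and (AGC)5.
example = "GGC" + "AT" * 7 + "CCGT" + "AGC" * 5 + "TT"
print(find_ssrs(example))  # [(4, 'AT', 7), (22, 'AGC', 5)]
```

Skipping non-primitive motifs keeps a run such as (AT)10 from also being counted as (ATAT)5, which is the main source of double counting in a naive per-motif-length scan.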
