A FRAMEWORK FOR IDENTIFICATION OF RELATIONSHIP BETWEEN GENE AND DISEASE CAUSING MUTATION USING BIOLOGICAL TEXT MINING

A. Murali Krishna, Dr. S. Jyothi

Abstract


We reviewed various papers describing mutations and their associated diseases, which are being published at a rapid pace. Previous studies show a need to collect knowledge about disease-causing gene mutations and their associations. This need cannot be met manually and has to be automated, so our study develops a framework that gathers disease-associated mutation information for sharing with doctors and researchers. Our work uses text mining and NLP to extract disease-causing mutations and their associations from previously published abstracts. The proposed system, the DML tool, consists of NLP modules that process input text using semantic and syntactic patterns to identify disease-mutation associations. Evaluated on three datasets of disease-associated mutations, DML achieves high recall and precision, with F-scores of 0.87, 0.89 and 0.91. DML also includes a dedicated module that extracts mutation mentions together with the gene text associated with them.
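As an illustration of pattern-based extraction of mutation mentions and their associated gene text, the following minimal Python sketch pairs point-mutation mentions with gene names found in the same sentence. It is not the DML implementation: the regular expression, the `GENE_LEXICON` set, and the function name `extract_gene_mutation_pairs` are assumptions made for this example, and sentence-level co-occurrence stands in for the semantic and syntactic patterns described above.

```python
import re

# One-letter amino acid codes; the pattern matches mentions such as "V600E".
AA = "ACDEFGHIKLMNPQRSTVWY"
MUTATION_RE = re.compile(rf"\b([{AA}])(\d+)([{AA}])\b")

# Hypothetical gene lexicon (assumption); a real system would load a curated list.
GENE_LEXICON = {"BRAF", "TP53", "KRAS"}


def extract_gene_mutation_pairs(text):
    """Return (gene, mutation) pairs that co-occur in the same sentence."""
    pairs = []
    # Naive sentence split: terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        # Candidate gene tokens are upper-case alphanumeric words in the lexicon.
        genes = [tok for tok in re.findall(r"\b[A-Z0-9]+\b", sentence)
                 if tok in GENE_LEXICON]
        mutations = [m.group(0) for m in MUTATION_RE.finditer(sentence)]
        # Pair every mutation mention with every gene in the same sentence.
        for mutation in mutations:
            for gene in genes:
                pairs.append((gene, mutation))
    return pairs
```

Sentence-level co-occurrence is the simplest possible association rule and tends to over-generate pairs when a sentence names several genes, which is one reason pattern-based systems add syntactic constraints.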

Various types of datasets have been evaluated on our framework, and its performance has been checked with performance metrics. The obtained results show better performance than existing approaches to disease-mutation association and also address the low precision of those approaches. LMA is applied to large data sets of different types of abstracts in PubMed; it extracts associated disease-mutations together with related information on patients, population data, and its type and size.
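The performance metrics named in the abstract (precision, recall and F-score) can be computed from a set of predicted associations and a gold-standard set. The sketch below uses the standard definitions; the function name `precision_recall_f1` and the tuple representation of an association are assumptions for illustration, not details taken from the paper.

```python
def precision_recall_f1(predicted, gold):
    """Compute precision, recall and F1 for predicted vs. gold associations."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Because F1 is a harmonic mean, it stays low unless both precision and recall are high, which makes it a stricter summary than a simple average.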

The results of our work are stored in a database that can be accessed through query processing. We conclude that text mining can increase throughput, which benefits research and assists researchers in identifying disease-causing mutations and the diseases they are associated with.
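One way to store extracted associations in a queryable database is sketched below with Python's built-in `sqlite3` module. The table name `association`, its columns, and the helper names `build_store` and `mutations_for_disease` are hypothetical; the abstract does not describe the actual schema or database engine.

```python
import sqlite3


def build_store(records):
    """Create an in-memory database and load (source_id, gene, mutation, disease) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE association ("
        "source_id TEXT, gene TEXT, mutation TEXT, disease TEXT)"
    )
    conn.executemany("INSERT INTO association VALUES (?, ?, ?, ?)", records)
    conn.commit()
    return conn


def mutations_for_disease(conn, disease):
    """Return (gene, mutation) rows recorded for the given disease, sorted."""
    cursor = conn.execute(
        "SELECT gene, mutation FROM association "
        "WHERE disease = ? ORDER BY gene, mutation",
        (disease,),
    )
    return cursor.fetchall()
```

A query such as `mutations_for_disease(conn, "melanoma")` would then return every gene and mutation pair stored for that disease. Parameterized queries (the `?` placeholders) keep user-supplied disease names from being interpreted as SQL.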


Keywords


Mutation; Extraction; NLP; Text Mining; PubMed; Gene






Copyright © 2012 - 2023, All rights reserved.| ijitr.com

Creative Commons License
International Journal of Innovative Technology and Research is licensed under a Creative Commons Attribution 3.0 Unported License. Based on a work at IJITR. Permissions beyond the scope of this license may be available at http://creativecommons.org/licenses/by/3.0/deed.en_GB.