C-Gemis: A computational tool for gene expression data analysis for gastric cancer
Keywords:
Gastric cancer, TCGA, GEO, Survival analysis
Abstract
Background: Computational tools dedicated to the analysis of microarray and RNA-seq transcript data can support the search for biomarkers related to diagnosis and prognosis in different neoplasms. They do so by automating the computational steps needed to explore, visualize, and analyze gene expression data. Objective: This paper describes C-Gemis, a new tool for gene expression data analysis. Methods: C-Gemis is a free online computational tool that performs differential gene expression and survival analyses and visualizes the results. Results: C-Gemis streamlines the search for gastric cancer (GC) biomarkers in data from public databases and stands out for its usability, objectivity, and easy-to-understand graphics. Results are presented according to the Laurén and WHO classifications and the TCGA molecular classification. The tool is available at www.cgemis.com.br. Conclusions: C-Gemis provides an easy way to automate the analysis of microarray and RNA-seq data. Next steps include incorporating other cancer types and adding finer detail on cancer classifications and subclassifications.