Dimensionality Reduction Techniques and their Applications in Cancer Classification: A Comprehensive Review
Abstract
Dimensionality reduction techniques have become a vital tool for investigating high-dimensional data such as gene expression profiles in cancer research. In this review, we provide a comprehensive overview of dimensionality reduction techniques and their applications in cancer classification. First, we introduce the concepts and approaches of dimensionality reduction, and then we explore several methods for reducing dimensionality. These techniques include Linear Discriminant Analysis (LDA), Principal Component Analysis (PCA), and t-Distributed Stochastic Neighbour Embedding (t-SNE). We then review applications of these techniques in cancer classification, including lung cancer, colon cancer, breast cancer, and leukemia. Moreover, we discuss the advantages and disadvantages of different dimensionality reduction techniques in cancer classification, as well as their limitations and future directions. Finally, we summarize the current state of the field and offer some recommendations for future studies. Overall, this review highlights the importance of dimensionality reduction techniques in cancer classification and provides a valuable resource for researchers working in this field.
Keywords: Feature extraction, cancer classification, dimensionality reduction, feature selection
INTRODUCTION
A massive volume of digital data has been produced continuously over the last few years across
a variety of application areas. Moreover, data are becoming exponentially larger and more
heterogeneous, complex, and high-dimensional. Applications of High Dimensional Data (HDD) have
been found in a variety of fields, such as education, biomedicine, the web, health, social media, and business.