Essential cancer protein identification using graph-based random walk with restart
Rout T., Mohapatra A., Kar M., Muduly D.K.
Article, Computer Methods in Biomechanics and Biomedical Engineering, 2026, DOI Link
Protein-protein interaction (PPI) network analysis holds significant promise for cancer diagnosis and drug target identification. This paper introduces a novel random walk-based method called essential cancer protein identification using graph-based random walk with restart (EPI-GBRWR). The proposed method incorporates local and global topological features of proteins, enhancing the accuracy of essential protein identification in PPI networks. The study starts with meticulous preprocessing of cancer gene datasets from NCBI, covering breast, lung, colorectal, and ovarian cancers, and identifies a core set of common genes. The proposed method constructs PPI networks from these common cancer genes to capture complex protein interactions. A topological analysis, including a centrality measures matrix, is then performed to identify essential nodes. The study identified 40 essential proteins across breast, colorectal, lung, and ovarian cancers, showcasing the potency of integrative methodologies in unravelling cancer complexity. The findings have direct clinical relevance and contribute to precision medicine by guiding personalized treatment strategies.
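As a rough illustration of the random walk with restart idea named in this abstract, the sketch below iterates the standard update p ← (1 − r)·W·p + r·p0 on a small graph. It is not the EPI-GBRWR method itself: the toy network, the node labels, the seed choice, and the restart probability of 0.3 are all assumptions made for the example.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-12, max_iter=10_000):
    """Return steady-state visiting probabilities for every node.

    adj:   dict mapping node -> set of neighbours (undirected, no isolated nodes).
    seeds: nodes the walker jumps back to with probability `restart`.
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for n in nodes:
            # Each neighbour m spreads its probability evenly over its edges.
            inflow = sum(p[m] / len(adj[m]) for m in adj[n])
            nxt[n] = (1 - restart) * inflow + restart * p0[n]
        # Stop once the distribution no longer changes (L1 distance).
        if sum(abs(nxt[n] - p[n]) for n in nodes) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical toy network; the labels are placeholders, not real proteins.
toy_ppi = {
    "P1": {"P2", "P3"},
    "P2": {"P1", "P3"},
    "P3": {"P1", "P2", "P4"},
    "P4": {"P3", "P5"},
    "P5": {"P4"},
}
rwr_scores = random_walk_with_restart(toy_ppi, seeds={"P1"})
rwr_ranking = sorted(rwr_scores, key=rwr_scores.get, reverse=True)
print(rwr_ranking)
```

Nodes close to the seed and well connected to it accumulate more probability, which is what makes the steady-state vector usable as a ranking score.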
Disease Diagnosis and Management Using Bioinformatics and Cyber-Physical Systems
Rout T., Mohapatra A., Kar M., Muduly D.K.
Book chapter, Studies in Big Data, 2025, DOI Link
Cancer is one of the most prevalent diseases worldwide. Its extreme heterogeneity and complexity make it a challenging subject of study. The use of ‘omics’ technologies, such as genomics, proteomics, transcriptomics, and metabolomics, has revolutionized the understanding of cancer. By integrating data from these ‘omics’ technologies and conducting comprehensive analyses, researchers can gain a fuller understanding of the molecular basis of cancer. This knowledge can lead to the identification of essential cancer proteins for early detection, prognosis, and the development of targeted therapies. Understanding cancer at the molecular level is crucial, as it allows for the development of targeted therapies and personalized treatment strategies. This book chapter presents a comprehensive overview of cancer protein identification and the application of the PPI network in cancer research. It emphasizes the fundamentals of PPI network topology and PPI network biology-based approaches in cancer research. PPI network topology helps researchers identify key nodes that play critical roles in cancer development. The network biology approach involves integrating data from various high-throughput omics technologies, such as genomics, proteomics, transcriptomics, and metabolomics, into a single conceptual framework. Network biology-based approaches uncover the intricate relationships between genes, proteins, and other biomolecules involved in cancer development. The chapter explores the central role that PPI networks play as a biological network model. These PPI networks intricately map the interactions between proteins, unraveling the complexity of cellular processes and signaling pathways. Understanding the critical significance of protein interactions lays the foundation for their application in cancer diagnosis.
The role of PPI network in cancer disease diagnosis offers insights into the evolving landscape of cancer diagnosis and precision medicine, highlighting the potential of PPI networks to revolutionize early detection, patient stratification, and personalized therapeutic interventions in the fight against cancer. Furthermore, we address the challenges and future directions in harnessing PPI networks for enhanced cancer disease diagnosis, underscoring their importance in the pursuit of more precise, effective, and timely diagnostic strategies.
Centrality-Based Approach for Identifying Essential Cancer Proteins in PPI Networks
Rout T., Mohapatra A., Kar M., Muduly D.K.
Article, SN Computer Science, 2025, DOI Link
Protein–protein interaction (PPI) networks serve as invaluable repositories, shedding light on the intricate web of protein interactions within living organisms. Traditional methods for identifying essential proteins fail to capture the complex dynamics of these networks. This research aims to introduce a novel quantum-inspired centrality-based approach for identifying essential proteins in PPI networks (Quantum-EPI), leveraging quantum walk principles and a sequential pattern algorithm. A composite score for each protein is calculated by leveraging centrality measures, including degree, betweenness, closeness, and clustering coefficient. The algorithm, inspired by quantum interference and superposition, employs composite scores as a sorting mechanism. A quantum walk-inspired thresholding strategy distinguishes essential proteins, introducing a quantum-inspired probabilistic selection. The approach is exemplified using a set of centrality values and validated through permutation and enrichment tests. The Quantum-EPI approach prioritizes candidate cancer proteins and identifies nine proteins as the most significant disease-causing proteins through local and global topological analysis, correlation analysis, and ranking strategies. The simulation results, along with the validation through permutation and enrichment tests, demonstrate the superiority of the Quantum-EPI approach over existing state-of-the-art methods, providing a strong foundation for its efficacy in pinpointing essential proteins crucial to PPI network connectivity. The Quantum-EPI approach not only contributes to the field of network medicine but also sets the stage for innovative methodologies in protein identification and pathway analysis. The findings of this research hold significant promise for cancer research, drug design, and disease prevention, offering a new paradigm for essential protein discovery in biological systems.
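The composite-score step described in this abstract can be pictured with a small sketch: each centrality measure is rescaled to [0, 1], the rescaled values are averaged per protein, and proteins are sorted by the result. The centrality values and protein labels below are made up, and the mean-based cut-off is a plain stand-in chosen for this example; it does not reproduce the quantum walk-inspired thresholding used by Quantum-EPI.

```python
def min_max(values):
    """Rescale a sequence of numbers to [0, 1]; a constant sequence maps to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical centrality values per placeholder protein:
# (degree, betweenness, closeness, clustering coefficient)
centrality = {
    "P1": (12, 0.40, 0.70, 0.60),
    "P2": (9, 0.25, 0.62, 0.45),
    "P3": (5, 0.10, 0.50, 0.30),
    "P4": (2, 0.00, 0.35, 0.10),
}

names = list(centrality)
columns = list(zip(*centrality.values()))  # one tuple per centrality measure
scaled = [min_max(col) for col in columns]

# Composite score: unweighted mean of the rescaled measures (an assumption).
composite = {
    name: sum(col[i] for col in scaled) / len(scaled)
    for i, name in enumerate(names)
}

ranked = sorted(composite, key=composite.get, reverse=True)
threshold = sum(composite.values()) / len(composite)  # assumed cut-off: the mean
essential = [p for p in ranked if composite[p] > threshold]
print(ranked, essential)
```

Equal weighting is the simplest choice; a weighted mean would fit the same structure if some measures were considered more informative than others.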
Essential Cancer Proteins and Key Pathway Identification of Cancer through Bioinformatic Analysis
Rout T., Mohapatra A., Muduly D., Kar M.
Conference paper, Procedia Computer Science, 2025, DOI Link
Pathway analysis is crucial for deciphering the molecular mechanisms underlying cancer by identifying essential proteins within biological pathways. This study introduces a comprehensive framework for pathway analysis to uncover proteins associated with cancer pathogenesis. A computational approach exploring protein-protein interaction (PPI) networks and cancer gene datasets is applied to identify critical proteins involved in cancer progression. The methodology includes preprocessing gene data, using centrality metrics to detect vital proteins, and conducting pathway enrichment analysis to uncover dysregulated cancer pathways. This study identifies 44 critical proteins in PPI networks linked to breast, lung, colorectal, and ovarian cancers. These proteins span ten key pathways for regulating cell cycles, growth, and differentiation. The research emphasizes the significant roles of proteins like TP53, BRCA1, RPS27A, PCNA, CDK1, and CTNNB1 in cancer development, offering important insights into their functional impact on cancer-related pathways. The pathway analysis approach elucidates the molecular basis of cancer and advances precision oncology research. The study examines major cellular pathways, highlighting potential targets for therapeutic intervention and personalized treatment strategies. Future endeavours aim to translate identified essential cancer proteins into clinical practice through multidisciplinary collaboration and implementation strategies, promising the development of effective cancer treatment strategies.
A Novel Hybrid Architecture for Breast Cancer Detection Using Machine Learning and Deep Learning
Sonali K., Chakravarty S., Rout T., Patro A.S., Dalai A., Mallick S.R.
Conference paper, 2025 2nd International Conference on Circuits, Power, and Intelligent Systems, CCPIS 2025, 2025, DOI Link
Breast cancer is one of the main causes of death for women globally, and better treatment outcomes depend on early and precise identification. This study compares machine learning (ML) and deep learning (DL) approaches for classifying breast cancer from medical imaging data. In the ML approach, features extracted using the Gray Level Co-occurrence Matrix (GLCM) were classified with Support Vector Machine (SVM), Random Forest (RF), and K-Nearest Neighbors (KNN) algorithms. Among these techniques, the best accuracy attained was 85.17%. In the DL approach, pre-trained architectures such as VGG16 and ResNet16 used convolutional layers to extract features automatically. These models performed well, with a maximum accuracy of 96.55%. The findings show that deep learning models outperform conventional machine learning methods that depend on handcrafted features, since they can automatically learn intricate patterns from images. This study supports the incorporation of DL into computer-aided diagnostic systems for clinical use and emphasizes its potential to improve breast cancer diagnosis.
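For readers unfamiliar with GLCM features, the sketch below builds a co-occurrence matrix for one pixel offset (each pixel paired with its right-hand neighbour) and derives the contrast feature from it. The 4x4 image with four gray levels is a made-up example, and the single horizontal offset and the non-symmetric counting are assumptions; the abstract does not state which offsets or features were used.

```python
def glcm_horizontal(image, levels):
    """Count how often gray level i has gray level j immediately to its right."""
    glcm = [[0] * levels for _ in range(levels)]
    for row in image:
        for left, right in zip(row, row[1:]):
            glcm[left][right] += 1
    return glcm


def contrast(glcm):
    """Contrast feature: sum of p(i, j) * (i - j)^2 over the normalized matrix."""
    total = sum(sum(row) for row in glcm)
    return sum(
        count / total * (i - j) ** 2
        for i, row in enumerate(glcm)
        for j, count in enumerate(row)
    )


# Hypothetical 4x4 image with gray levels 0..3.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
glcm = glcm_horizontal(image, levels=4)
glcm_contrast = contrast(glcm)
print(glcm, glcm_contrast)
```

Features such as contrast, computed over several offsets, form the handcrafted feature vector that a classifier like SVM, RF, or KNN would then consume.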
A systematic review of graph-based explorations of PPI networks: methods, resources, and best practices
Rout T., Mohapatra A., Kar M.
Review, Network Modeling Analysis in Health Informatics and Bioinformatics, 2024, DOI Link
This systematic review aims to provide a comprehensive overview of graph-based methodologies utilized in the analysis of protein–protein interaction (PPI) networks. The primary objective is to synthesize existing literature and identify key methodologies, resources, and best practices in the field, with a focus on their application in uncovering essential cancer proteins. A systematic literature search was conducted across various databases to identify relevant studies focusing on graph-based explorations of PPI networks. The selected articles were critically reviewed, and data were extracted regarding the methodologies employed, resources utilized, and best practices identified. The review proceeds to outline a workflow that illustrates the systematic process from the compilation of gene/protein datasets to the generation of essential cancer proteins. A case study on “uncovering essential cancer proteins in breast cancer” was included to exemplify the application of graph-based methodologies in a real-world scenario. The review revealed various graph-based methodologies utilized in PPI network analysis, including centrality measures, pathway enrichment analyses, and network visualization techniques. Essential resources such as databases, software tools, and repositories were identified, along with best practices for data preprocessing, network construction, and analysis. The synthesis of findings, complemented by the case study, provides researchers with a comprehensive understanding of the current landscape of graph-based PPI network analysis and its application in cancer research. This systematic review contributes to the field by offering a holistic overview of graph-based explorations in PPI network research, with a specific focus on cancer protein identification. 
By synthesizing existing knowledge and identifying essential resources and best practices, this review serves as a valuable resource for researchers, facilitating informed decision-making and enhancing research quality and reproducibility. The inclusion of the case study underscores the practical application of graph-based methodologies in uncovering essential cancer proteins.
Essential proteins in cancer networks: a graph-based perspective using Dijkstra’s algorithm
Rout T., Mohapatra A., Kar M., Muduly D.K.
Article, Network Modeling Analysis in Health Informatics and Bioinformatics, 2024, DOI Link
Identifying essential proteins within cancer-related PPI networks is a significant challenge due to the heterogeneity and complexity of cancer diseases. Identifying these proteins is crucial for developing effective therapeutic strategies and understanding cancer biology. This study introduces a novel graph-based approach to identify essential cancer proteins within PPI networks, focusing on breast, lung, colorectal, and ovarian cancers. The proposed methodology involves a multi-step process beginning with identifying and preprocessing common genes associated with breast, colorectal, lung, and ovarian cancers. The PPI networks are constructed using these common genes. The PPI networks are analyzed to find the shortest paths using centrality measures. Centrality measures, particularly betweenness centrality, prioritize proteins with the highest impact on cancer progression. Betweenness centrality is used as a threshold to exclude nonessential proteins. The identified proteins are validated and categorized into cancer-related pathways through permutation and enrichment tests. The proposed approach successfully identified 64 essential proteins across breast, lung, colorectal, and ovarian cancers. These proteins were categorized into 14 cancer-related pathways, including cell cycle regulation, Wnt/β-Catenin signaling, RTK/RAS/RAF/MEK/ERK signaling, and PI3K/AKT/mTOR signaling. The identified pathways highlight complex interactions of these proteins, pivotal functions in cancer progression, and therapeutic targets. The validation process, through permutation and enrichment tests, confirmed the robustness and relevance of these findings, indicating their significant impact on understanding and potentially treating cancer. Identifying essential cancer proteins using this novel graph-based approach has significant clinical relevance, particularly for precision medicine. 
These findings can guide personalized treatment strategies and enhance the understanding of cancer biology. Future research will extend this methodology to other cancer types and evaluate its clinical applicability.
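The link between shortest paths and betweenness centrality described in this abstract can be sketched as follows. The code uses Brandes' algorithm with breadth-first search, which finds the same shortest paths as Dijkstra's algorithm when every edge has unit weight; that substitution, the toy network, and the mean-based threshold are assumptions made for the example and not the published pipeline.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness centrality for an undirected, unweighted graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Forward pass: BFS from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Backward pass: accumulate pair dependencies from the farthest nodes.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every unordered pair was visited from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


# Hypothetical toy network; the labels are placeholders, not real proteins.
toy_ppi = {
    "P1": {"P2", "P3"},
    "P2": {"P1", "P3"},
    "P3": {"P1", "P2", "P4"},
    "P4": {"P3", "P5"},
    "P5": {"P4"},
}
bc = betweenness(toy_ppi)
bc_threshold = sum(bc.values()) / len(bc)  # assumed cut-off: mean betweenness
bottlenecks = sorted(p for p in bc if bc[p] > bc_threshold)
print(bc, bottlenecks)
```

Proteins above the threshold sit on many shortest paths between other proteins, which is the sense in which betweenness flags bottleneck candidates.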
Essential Protein Identification in Cancer: A Graph-Based Approach Integrating Topological and Biological Features in PPI Networks
Rout T., Mohapatra A., Kar M., Muduly D.K.
Article, SN Computer Science, 2024, DOI Link
Essential protein identification in protein–protein interaction (PPI) networks has crucial applications in cancer diagnosis and drug target identification. This study uses a graph-based approach to identify essential proteins in PPI networks. Despite significant advancements in cancer research, identifying essential cancer proteins within PPI networks remains a major challenge. PPI networks capture the interconnectedness of cancer proteins and make it possible to prioritize those with the most significant impact on cancer progression. The proposed approach introduces an innovative way of identifying essential cancer proteins within PPI networks associated with breast, lung, colorectal, and ovarian cancers. The study begins with an organized sequence of analytical procedures applied to cancer gene datasets from the National Center for Biotechnology Information (NCBI) for breast, lung, colorectal, and ovarian cancers. A novel method, essential protein identification using graph-based random walk with restart (EPI-GBRWR), is introduced; it integrates topological and biological properties within PPI networks and sheds light on the hierarchical influence of proteins within the network. The functional implications of the identified proteins are substantiated and contextualized through rigorous statistical assessments, including permutation and enrichment tests. Pathway analysis of these findings illuminates interconnected molecular pathways in cancer. This work underscores the potency of integrative methodologies in deciphering the complexity of cancer. The computational results confirm EPI-GBRWR’s efficiency in predicting essential proteins.
Compared with other state-of-the-art methods for identifying essential proteins, EPI-GBRWR performs better across various evaluation criteria, marking a significant advancement in precision oncology.
Bioinformatics in Cancer: Key Proteins and Pathway Analysis
Rout T., Chakravarty S., Muduly D.K.
Conference paper, Proceedings of the 2024 International Conference on Artificial Intelligence and Emerging Technology, Global AI Summit 2024, 2024, DOI Link
Pathway analysis is essential for understanding cancer's molecular mechanisms by identifying key proteins within biological pathways. This study presents a comprehensive framework using protein-protein interaction (PPI) networks and cancer gene datasets to uncover essential proteins involved in cancer progression. The methodology includes preprocessing cancer gene data, applying centrality measures to pinpoint crucial proteins, and conducting pathway enrichment analysis to explore dysregulated pathways. Results reveal 44 essential cancer proteins linked with breast, ovarian, lung, and colorectal cancers, spread across ten crucial pathways for cell cycle regulation, growth, and differentiation. Notable proteins like TP53, BRCA1, RPS27A, PCNA, CDK1, and CTNNB1 are highlighted for their significant roles in cancer pathogenesis. This approach provides a deeper understanding of the functional roles of these proteins and their impact on cancer pathways, advancing precision oncology by identifying potential therapeutic targets and personalized treatment strategies. Future efforts will focus on translating these findings into clinical practice for effective cancer treatment.
Bioinformatics Approaches to Cancer: Protein and Pathway Studies
Rout T., Chakravarty S., Muduly D.K.
Conference paper, 2024 International Conference on Augmented Reality, Intelligent Systems, and Industrial Automation, ARIIA 2024, 2024, DOI Link
Pathway analysis is crucial for revealing the molecular mechanisms of cancer, as it emphasizes the important proteins that participate in various biological pathways. This study introduces a comprehensive framework leveraging protein-protein interaction (PPI) networks and cancer gene datasets to uncover essential proteins in cancer progression. The methodology encompasses preprocessing cancer gene data, applying centrality measures to identify critical proteins, and conducting pathway enrichment analysis to explore dysregulated pathways. The research identifies 44 essential cancer proteins associated with breast, lung, colorectal, and ovarian cancers, which are distributed across ten vital pathways related to cell cycle control, growth, and differentiation. Notable proteins such as TP53, BRCA1, RPS27A, PCNA, CDK1, and CTNNB1 are highlighted for their significant contributions to cancer initiation and progression. This approach enhances our understanding of the functional roles of these proteins and their influence on cancer pathways, advancing precision oncology by identifying potential therapeutic targets and personalized treatment strategies. Future efforts will translate these findings into clinical practice for more effective cancer treatment.
Machine Learning for Breast Cancer Detection: SVM-Based Predictive Modeling
Swain R.P., Rout T., Sahoo P., Sahoo M.
Conference paper, Proceedings - 2024 International Conference on Artificial Intelligence and Quantum Computing, AIQC 2024, 2024, DOI Link
Cancer is a heterogeneous disease that can spread to any body part. Breast cancer is one of the most dangerous cancers among women, contributing significantly to mortality worldwide. This research proposes a novel method for predicting breast cancer using Support Vector Machine (SVM) classification. The dataset, which originates from the University of Wisconsin Hospitals, is obtained from the University of California Irvine (UCI) Machine Learning Repository. It comprises 699 instances, of which 458 are benign and 241 are malignant, with 11 key attributes. The Radial Basis Function (RBF) kernel is used for the SVM classification to map the input data into a higher-dimensional space. The model achieved an impressive accuracy of 97%, showing its potential to detect breast cancer early and thus offering a valuable tool for early diagnosis and treatment. Python was utilized for all implementations, enabling efficient processing and analysis of the dataset. This approach highlights the effectiveness of SVM in medical diagnostics, particularly for breast cancer prediction.
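The RBF kernel mentioned in this abstract can be written in a few lines. The sketch shows only the kernel function K(x, y) = exp(−γ‖x − y‖²), which an SVM uses in place of an explicit higher-dimensional mapping; it is not the paper's model, and the feature vectors and the value γ = 0.5 are made up for illustration.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Similarity in (0, 1]: 1 for identical points, decaying with distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Hypothetical feature vectors (not taken from the dataset).
sample_a = [1.0, 2.0, 3.0]
sample_b = [1.0, 2.0, 3.5]  # close to sample_a
sample_c = [9.0, 0.0, 1.0]  # far from sample_a

sim_ab = rbf_kernel(sample_a, sample_b)
sim_ac = rbf_kernel(sample_a, sample_c)
print(sim_ab, sim_ac)
```

A larger γ makes the similarity fall off faster with distance, which tightens the decision boundary around individual training points.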
Centrality Measures and Their Applications in Network Analysis: Unveiling Important Elements and Their Impact
Rout T., Mohapatra A., Kar M., Patra S., Muduly D.
Conference paper, Procedia Computer Science, 2024, DOI Link
The applications of centrality measures in protein-protein interaction (PPI) network analysis are diverse and encompass fundamental biological insights, cancer-related discoveries, and practical implications for drug development. This multidimensional approach in PPI network analysis provides a comprehensive understanding of the pivotal elements and their impact on biological systems. Analyzing centrality measures in PPI networks enables the identification of essential proteins, including hub and bottleneck proteins that occupy strategic positions within the PPI network structure. Essential proteins are significant elements in maintaining PPI network integrity and functionality. Studying centrality measures can reveal hidden patterns and relationships within these PPI networks. This paper identifies proteins with a high degree of connectivity ("hubs") and proteins with high betweenness centrality ("bottlenecks"), along with closeness centrality and clustering coefficient. The significance of these measures in PPI networks has implications for various fields. The proposed approach successfully identified and characterized influential proteins and found the top 20 essential proteins. These proteins likely hold significant functional importance as hubs and bottlenecks and serve as potential targets for further investigation. This approach has the potential to identify essential proteins involved in cancer. Leveraging centrality measures in the analysis of PPI networks offers a multifaceted approach to understanding cancer biology and its implications for personalized medicine, drug design, and the development of innovative cancer therapies.
Multi-Centrality and Path-Based Analysis for Essential Cancer Protein Detection in PPI Networks
Rout T., Chakravarty S., Mohanta R.K., Jyothi R.S.S., Varadarajan V.S.D., Patro P.
Conference paper, 2nd International Conference on Signal Processing, Communication, Power and Embedded Systems, SCOPES 2024, 2024, DOI Link
This paper presents an innovative graph-based method for identifying essential proteins within PPI networks, explicitly targeting cancer-related genes in breast, lung, colorectal, and ovarian cancers. The proposed approach explores centrality measures, utilizing betweenness centrality to distinguish essential proteins from non-essential ones. The methodology begins by identifying common genes through a sequential pattern algorithm, a crucial step that helps select genes for further analysis. This is followed by constructing a PPI network for each cancer type. Network analysis is conducted to extract essential cancer proteins. A total of 15 top essential proteins are identified, with validation performed through permutation analyses to ensure the reliability of the findings. These proteins were mapped to significant cellular pathways, playing vital roles in cancer progression. Grouping these proteins into specific pathways highlights their functional importance and the underlying molecular mechanisms driving cancer. The results of this study have significant implications for cancer research, particularly in the realm of precision medicine. The identification of essential proteins not only enhances our understanding of cancer but also opens avenues for developing targeted, personalized treatment strategies.
Identification of Essential Cancer Protein in PPI Network Using Graph-Based Approach
Rout T., Choudhury N.R.R., Panda S.K., Mohapatra A., Kar M., Muduly D.K.
Conference paper, IEEE Region 10 Humanitarian Technology Conference, R10-HTC, 2023, DOI Link
Identifying essential cancer proteins is critical in cancer research, as targeting these proteins can lead to effective cancer treatments. This work proposes a novel graph-based method for identifying essential cancer proteins in the PPI network. The method finds shortest paths on a biological network. The network is analysed using centrality measures, and a threshold value of betweenness centrality filters out nonessential proteins from the list of candidate proteins. Using this method, 50 essential proteins are selected from the list of candidates. Finally, a permutation test is performed to validate the findings. The proposed graph-based method identifies disease-related proteins in various contexts by performing common gene identification, PPI network construction, network analysis, and validation. This method can be adapted and applied to different cancer types by adjusting the gene selection criteria, network construction methods, and validation strategies.
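One common form of the permutation test mentioned in this abstract compares the mean score of the selected proteins with the means of randomly drawn protein sets of the same size. The sketch below follows that form under stated assumptions: the scores, the selected set, the number of permutations, and the add-one correction are illustrative choices, and the abstract does not specify which test statistic was actually used.

```python
import random


def permutation_p_value(scores, selected, n_perm=2000, seed=0):
    """Estimate how often a random set of equal size scores at least as high."""
    rng = random.Random(seed)
    k = len(selected)
    observed = sum(scores[p] for p in selected) / k
    pool = list(scores)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(pool, k)
        if sum(scores[p] for p in sample) / k >= observed:
            hits += 1
    # The add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical betweenness scores for 20 placeholder proteins.
perm_scores = {f"P{i}": float(i) for i in range(1, 21)}
perm_selected = ["P20", "P19", "P18"]  # assumed output of a thresholding step
p_value = permutation_p_value(perm_scores, perm_selected)
print(p_value)
```

A small p-value indicates that a randomly chosen set of the same size rarely reaches the observed mean score, which supports the selection being non-random with respect to that score.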
Notice of Removal: Big data and its applications: A review
Rout T., Garanayak M., Senapati M.R., Kamilla S.K.
Retracted, International Conference on Electrical, Electronics, Signals, Communication and Optimization, EESCO 2015, 2015, DOI Link
Big data is an all-encompassing term for any collection of data sets so large and complex that it is difficult to process them using traditional data processing applications. The challenges include analysis, capture, search, sharing, storage, transfer, visualization, and privacy violations. The trend toward larger data sets is due to the additional information that can be derived from analyzing a single large set of related data, as compared with separate smaller sets with the same total amount of data; this allows correlations to be found that reveal business trends, help prevent diseases, reduce crime, and so on. Big data describes any huge amount of structured, semi-structured, and unstructured data that has the potential to be mined for information. The term does not refer to any specific quantity; it is often used when speaking about petabytes and exabytes of data.