Shivam's paper is published in the journal npj Systems Biology and Applications.

  • Writer: complex analysis
  • Nov 29
  • 1 min read

Shivam's paper, titled "Feature learning augmented with sampling and heuristics (FLASH) improves model performance and biomarker identification", has been published in the journal npj Systems Biology and Applications.

Large biological datasets, such as gene expression profiles, often contain redundant features that degrade model performance and limit generalization across independent datasets, particularly when the data are complicated by class imbalance or hidden sub-clusters. To overcome these challenges, we present FLASH, a novel feature selection method that combines filtration with heuristic-based systematic elimination.

FLASH generates random samples of the data and computes a p-value for each feature using multiple statistical tests (t-test, ANOVA, Wilcoxon rank-sum, Brunner-Munzel, Mann-Whitney). Each feature is scored by aggregating its significant p-values across samples. The filtered features are then ranked by their coefficients in the machine learning model that achieves the highest accuracy on them. Recursive elimination with cross-validation removes features systematically while monitoring accuracy, and the final subset is the one that gave the highest performance during elimination.

We show that our method preserves predictive performance on independent datasets. Within the scope of our tested datasets and evaluation settings, FLASH outperformed the compared feature selection methods: dRFE, Mutual Information, MRMR, ElasticNet, NeuralNet, Permutation test and SAGA. Additionally, features selected by FLASH showed greater biological relevance, as evidenced by higher overlap with disease-associated genes from DisGeNET in an independent dataset.
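To make the sampling and scoring step described above more concrete, here is a minimal sketch in plain Python. It is not the code from the paper. It assumes two classes labelled 0 and 1, and it uses only one of the listed tests, Mann-Whitney, with a normal approximation and no tie or continuity correction. It also assumes that "aggregating significant p-values" means counting how many random subsamples give p < alpha. The function names and the parameters `n_samples`, `frac`, `alpha` and `seed` are hypothetical, chosen for illustration.

```python
import math
import random


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_p(a, b):
    """Two-sided Mann-Whitney U p-value via the normal approximation.

    Simplification for this sketch: no tie or continuity correction.
    """
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))


def significance_scores(X, y, n_samples=50, frac=0.7, alpha=0.05, seed=0):
    """Count, per feature, how many random subsamples give p < alpha.

    X is a list of rows (one list of feature values per sample);
    y is a list of 0/1 class labels (an assumption of this sketch).
    """
    rng = random.Random(seed)
    n_rows, n_feats = len(X), len(X[0])
    size = max(2, round(frac * n_rows))
    scores = [0] * n_feats
    for _ in range(n_samples):
        rows = rng.sample(range(n_rows), size)
        g0 = [r for r in rows if y[r] == 0]
        g1 = [r for r in rows if y[r] == 1]
        if not g0 or not g1:
            continue  # skip draws that miss a class entirely
        for f in range(n_feats):
            p = mann_whitney_p([X[r][f] for r in g0], [X[r][f] for r in g1])
            if p < alpha:
                scores[f] += 1
    return scores
```

Calling `significance_scores(X, y)` returns one count per feature. Features with high counts would be the ones kept after filtration. The later steps (ranking by model coefficients and recursive elimination with cross-validation) are not covered by this sketch.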

 
 
 



Computational and Mathematical Biology Centre, BRIC-THSTI, NCR Biotech Science Cluster, Faridabad-121001, India

0129 2876 491

© 2035 by Complex Analysis Group, THSTI
