Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide

YBS 2021 | May 23, 2021 | Virtual Event

Virtual Viewing Hall

Student Challenge Presentations

Presentation 01: Using Single-Cell Sequencing to Identify Changes in Cellular Identity in Aged Mice

  • Aaron Lewis

Short Abstract: Age is the most important risk factor for a number of diseases, such as cancer, heart disease, and dementia. Delaying the onset of these diseases and reducing the time spent in morbidity is a goal of longevity research and would yield substantial savings in healthcare costs. A popular theory of aging, the Information Theory of Aging, postulates that aging is caused by changes in epigenetic information, which create aging phenotypes and changes in cellular identity as we age. I propose to use a recent innovation in sequencing and bioinformatics, single-cell sequencing, to gather data about changes in cellular identity at a more granular level than has been tested so far. I would use the existing single-cell dataset "Tabula Muris" to analyze changes in the cellular identity of specific cell types, which would give a greater understanding of the Information Theory of Aging.


Presentation 02: Phylogenetic Analysis of SNHG1 non-coding gene overexpressed in several cancers using Bioinformatics.

  • Bhagyashree Mishra

Short Abstract: Small nucleolar RNA host gene 1 (SNHG1) is a non-coding gene that has been found to be overexpressed in several cancer types. To understand its evolution and ancestry, we derive phylogenetic trees from similar sequences retrieved by a BLAST search and analyze these trees. Specifically, we will build both distance-based and maximum likelihood trees in order to reach a consensus. We will present our results, preliminary as they are.
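To make the idea of a distance-based tree concrete, here is a minimal sketch. The abstract does not say which distance-based method or software was used, so this assumes UPGMA clustering on simple p-distances (the fraction of differing sites) as one illustrative choice. The sequences and the labels seqA to seqD are invented toy data, not SNHG1 homologs, and the function names are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of sites that differ between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(seqs):
    """Cluster sequences with UPGMA and return the topology as a Newick string.

    Branch lengths are omitted to keep the sketch short.
    """
    sizes = {name: 1 for name in seqs}  # cluster label -> number of leaves
    dist = {
        frozenset((a, b)): p_distance(seqs[a], seqs[b])
        for a, b in combinations(seqs, 2)
    }
    while len(sizes) > 1:
        # Merge the closest pair of clusters.
        a, b = sorted(min(dist, key=dist.get))
        merged = f"({a},{b})"
        new_size = sizes[a] + sizes[b]
        # Distance to the merged cluster is the size-weighted average.
        for other in sizes:
            if other in (a, b):
                continue
            d = (
                dist[frozenset((a, other))] * sizes[a]
                + dist[frozenset((b, other))] * sizes[b]
            ) / new_size
            dist[frozenset((merged, other))] = d
        # Drop every distance that still refers to the two merged clusters.
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        del sizes[a], sizes[b]
        sizes[merged] = new_size
    return next(iter(sizes)) + ";"


# Toy aligned sequences (hypothetical, for illustration only).
toy = {
    "seqA": "ACGTACGTAC",
    "seqB": "ACGTACGTAT",
    "seqC": "ACGAACGGAT",
    "seqD": "TCGAAGGGTT",
}
tree = upgma(toy)
print(tree)
```

In practice the input would be a multiple sequence alignment of the BLAST hits, and the resulting topology would be compared against a maximum likelihood tree built from the same alignment.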


Presentation 03: OncoML: A Multi-omics-based Machine Learning Approach for Targeted Cancer Drug Prediction

  • Darsh Mandera

Short Abstract: The current approach to cancer treatment is one-size-fits-all: it fails to account for tumor heterogeneity, resulting in 75% ineffectiveness of cancer treatment. Recent research has focused on modeling drug prediction by applying machine learning to genetic mutations or to microRNA (miRNA), a key biomarker of cancer. Although these approaches demonstrate improved potential for targeted drug prediction, they have limitations. Gene mutations have been shown to account for only a small subset of candidate biomarkers, and while miRNA-based gene expression is regarded as offering more predictive modalities, both can be complemented by the integration and analysis of the multi-omic view of cancer. In this research, a machine learning model was trained and tested with over 80% of cancer types using gene mutation, miRNA, and drug response data from The Cancer Genome Atlas. Feature selection using ExtraTreesClassifier identified 945 gene mutations and miRNAs as key features out of over 18,000 features. The model was tested with multiple machine learning classifiers, including DecisionTree, K-NearestNeighbors, and the ensemble learning-based approaches AdaBoostClassifier and OneVsRestClassifier. OneVsRestClassifier, when combined with cross-validation, outperformed the other approaches and was able to predict drugs for cancer patients based on their gene mutation and miRNA data with an accuracy of 83%.
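The workflow described above (tree-based feature selection followed by a one-vs-rest classifier scored with cross-validation) might look roughly like the following scikit-learn sketch. It is not the author's code. The data are synthetic stand-ins generated with make_classification rather than TCGA mutation or miRNA data, and the base estimator wrapped by OneVsRestClassifier, the fold count, and every hyperparameter are assumptions, since the abstract does not state them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: 300 "patients", 200 features, 4 "drug" classes.
X, y = make_classification(
    n_samples=300,
    n_features=200,
    n_informative=15,
    n_classes=4,
    random_state=0,
)

pipeline = make_pipeline(
    # Keep features whose ExtraTrees importance exceeds the mean importance.
    SelectFromModel(ExtraTreesClassifier(n_estimators=100, random_state=0)),
    # One binary classifier per class; a decision tree is an assumed base model.
    OneVsRestClassifier(DecisionTreeClassifier(random_state=0)),
)

# 5-fold cross-validated accuracy; feature selection is refit inside each fold.
scores = cross_val_score(pipeline, X, y, cv=5)
print("mean CV accuracy:", scores.mean())

# Fit once on all data to inspect how many features were retained.
pipeline.fit(X, y)
n_selected = int(pipeline.named_steps["selectfrommodel"].get_support().sum())
print("features kept:", n_selected, "of", X.shape[1])
```

Placing the selector inside the pipeline means the features are chosen from the training folds only, which avoids leaking information from the held-out fold into the accuracy estimate.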


Presentation 04: Pairing a wearable multi-sensor patch with machine learning algorithms to recognize early signs of health complications such as stroke, seizures, and heart attacks

  • Ishan Doma

Short Abstract: Throughout life, the body suffers from many health issues, such as high blood pressure or abnormal blood sugar, which can lead to events such as a heart attack or diabetic shock. Although some of these attacks can be prevented by a healthy lifestyle, others may appear without warning or apparent cause, especially in old age. Seizures can be especially dangerous, as an unforeseen seizure without a place to lie down can lead to additional injuries. An inexpensive preventative system to warn patients at risk for these attacks would be extremely useful, especially in geriatric populations, who are at heightened risk for sudden health conditions. An efficient monitoring system can come in the form of a wearable smart patch. Patch-based sensors for blood pressure, heart rate, blood oxygen saturation, and blood glucose content already exist and can be integrated into the smart patch. However, processing the data from these sensors for a large population, such as that of a hospital, can be a challenge. Machine learning algorithms can assist in this task and significantly reduce the amount of monitoring needed for the smart patch. Previous research has established that vital signs, including oxygen saturation and heart rate, often exhibit specific patterns well before an episode. Multilayer perceptron-style networks, which are capable of recognizing features in images of numerals from the MNIST dataset, can easily recognize features of plots of vitals over time and associate them with either normal patterns or a warning that a patient may be at immediate risk for an attack (and then alert doctors to give care to the patient). Because multiple vitals are collected at one time, patterns involving multiple vitals can also be recognized by a machine learning network. The volume of data needed for this can come from the use of the smart patch on hospital patients, and the dataset will only grow as the smart patch is used in more institutions.
This technology is very versatile, and ML algorithms that have learned from hospital data can also be used to analyze the data of patients in less intensive medical environments, such as assisted living facilities or even at home. With other sensors, the smart patch can also be extended to less common diseases. Patients with liver failure often cannot process ammonia, and their blood ammonium content must be carefully monitored. Monitoring blood ammonia content along with other vitals can train a machine learning network to recognize patterns occurring before a spike in ammonium content (signaling doctors to provide care). Overall, a smart multi-sensor patch monitoring a patient’s vitals is a versatile tool, and one into which other sensors can be incorporated easily. A machine learning network that recognizes patterns preceding an attack that needs an immediate response sharply increases the smart patch’s utility and will ensure that patients receive the care they need at critical times.


Presentation 05: BLASTing CpG rich DNA sequences generated by elementary cellular automata reveals significant similarity to bacterial species.

  • Jigar Sheth

Short Abstract: As an example, we consider the rule 30 elementary cellular automaton with a simple initial condition and the binary encoding 01 = CG with 1 degree of freedom, noting that applying both the reverse and the complement to 01 gives 01 again (self-similarity). With 2 degrees of freedom we consider 0 = A and 1 = T, or vice versa, consistent with the fact that (cytosine, guanine) and (adenine, thymine) are complementary nucleotide bases. When generating the DNA sequences from the initial condition of a single black cell in rule 30, we include the immediate left white cell at every step of the rule 30 progression, which ensures a minimal threshold of CpG sites. Along the lines of Shannon-Fano encoding, internal "01" pairs are again encoded as CpG sites, leaving the remaining 0s and 1s free to be encoded as As and Ts. In this formalism, we append the t-suffix of each such binary encoded sequence to the t-prefix, followed by the rest of the binary sequence, in the subsequent generation, and so on up to a finite number of desired steps. We then perform a nucleotide BLAST of these sequence structures, potentially for all 256 elementary cellular automata rules, in order to decipher a phylogenetic interrelationship among the BLAST hits, which are mostly bacterial species, as is evident from such an exploratory exercise performed with rule 30 binary sequences. A more far-reaching task in this direction is to cluster those rules that provide us with pathogenic bacterial hints under BLASTn, with random as well as simple initial conditions. This paves the way for predicting nearest-neighborhood-dependent nucleotide substitutions and pseudo-genome segment assemblies driven by greedy alignments, which have predictive implications in line with the theme of "Healthcare improvements due to Bioinformatics".
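The encoding step can be illustrated with a short sketch. It reflects one reading of the abstract and is not the author's implementation: each rule 30 generation is grown from a single black cell, one white cell (0) is prepended on the left, every "01" pair is read left to right as "CG", and the leftover bits are mapped 0 = A and 1 = T. The rule for joining generations through t-suffixes and t-prefixes is not reproduced here because the abstract does not specify it fully. The function names are hypothetical.

```python
def rule30_rows(steps):
    """Return the first `steps` generations of rule 30 from a single black cell.

    Each row holds only the cells that can be non-zero, so row t has 2*t + 1 cells.
    """
    rows = []
    row = [1]
    for _ in range(steps):
        rows.append(row)
        padded = [0, 0] + row + [0, 0]
        # Rule 30: new cell = left XOR (center OR right).
        row = [
            padded[i - 1] ^ (padded[i] | padded[i + 1])
            for i in range(1, len(padded) - 1)
        ]
    return rows


def encode_row(bits):
    """Encode bits as DNA: each '01' pair becomes 'CG', leftover 0 -> 'A', 1 -> 'T'."""
    s = "".join(str(b) for b in bits)
    out = []
    i = 0
    while i < len(s):
        if s[i:i + 2] == "01":
            out.append("CG")
            i += 2
        else:
            out.append("A" if s[i] == "0" else "T")
            i += 1
    return "".join(out)


def cpg_sequences(steps):
    """Encode each generation after prepending the immediate left white cell."""
    return [encode_row([0] + row) for row in rule30_rows(steps)]


for seq in cpg_sequences(4):
    print(seq)
```

Because the leftmost active cell of rule 30 is always black, the prepended white cell guarantees that every generation starts with a CG pair, which is how this reading obtains at least one CpG site per step.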


Presentation 06: A Multi-Network Deep Learning Algorithm for Comprehensive Thoracic Radiograph Classification

  • Rohan Bhansali

Short Abstract: Chest X-rays are the most frequently performed medical imaging procedure, supported by their critical role in diagnosing thoracic diseases, including lung carcinoma, pneumonia, and tuberculosis. Despite their ubiquitous nature, they are considered to be among the most difficult radiographs to interpret, with the challenge exacerbated by human limitations such as cognitive or perceptual biases. Furthermore, with the overwhelming majority of the global population lacking access to radiologists, there exists a stark shortage of qualified diagnostic experts. Recent advances in deep learning algorithms, specifically convolutional neural networks, have demonstrated significant promise towards automated, large-scale chest X-ray classification, alleviating this deficiency while additionally decreasing healthcare costs and diagnostic delays even beyond resource-deficient regions. However, these algorithms fall short in incorporating different view positions and patient symptoms, which are essential for rigorous diagnosis. Accordingly, we developed a multi-network model that concurrently classifies posteroanterior, anteroposterior, and lateral chest X-rays and outputs a unified diagnosis from fourteen disease classifications. We optimized our model’s hyperparameters using the MIMIC-CXR dataset, a collection of 377,110 chest X-ray images sourced from 227,835 imaging studies from 65,379 patients at the Beth Israel Deaconess Medical Center Emergency Department between 2011 and 2016. The images were passed through the Laplacian filter to highlight meaningful features within the scans, thereby reducing computational expense while boosting performance. The Laplacian is a second-order differential operator defined as the divergence of the gradient field; its magnitude is large where the gradient of the measured value changes rapidly.
This results in the filter acting like an edge detector when applied to images, as edges in an image would have a large, abrupt change in pixel values, resulting in white areas on the transformed images; other areas on the image would have gradual change or no change at all in the pixel values, resulting in darker areas on the transformed image. We then utilized the processed images to train three distinct 121-layer convolutional neural networks and subsequently concatenated them to provide a fused prediction. We found that applying the Laplacian filter significantly increased performance across the board, enabling our model’s classification accuracy to increase substantially from 89.2% to 92.8% when validated on a testing set of 74,384 chest X-rays. Our model’s performance exceeded that of practicing radiologists in both efficiency and accuracy. By comparison, radiologists attained an average accuracy of 78% across the same fourteen disease classifications and required diagnosis times that were longer by multiple orders of magnitude. The implications of these results are twofold: they reaffirm the developments of previous research integrating deep learning and clinical diagnosis while concurrently suggesting the newfound efficacy of the Laplacian filter, namely its potential for application in other medical imaging modalities. We describe an inexpensive, efficient, and reliable screening tool for cardiopulmonary diseases capable of reading and interpreting the nearly two billion chest X-rays taken annually. Its versatile nature allows it to be deployed in diverse environments, from aiding developing countries plagued with inadequate healthcare to streamlining metropolitan hospitals brimming with patients.
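The edge-detecting behaviour of the Laplacian described above can be seen on a toy image. The sketch below assumes the common 4-neighbour discrete kernel and leaves border pixels at zero. The abstract does not say which discretisation or library the authors used, so the kernel, the border handling, and the function name are illustrative assumptions.

```python
# 4-neighbour discrete Laplacian: sum of the four neighbours minus 4x the centre.
KERNEL = [
    [0, 1, 0],
    [1, -4, 1],
    [0, 1, 0],
]


def laplacian_magnitude(image):
    """Return |Laplacian| of a 2D list of pixel values; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            response = sum(
                KERNEL[dy + 1][dx + 1] * image[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            )
            out[y][x] = abs(response)
    return out


# Toy image: dark left half, bright right half, so one vertical edge.
image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
edges = laplacian_magnitude(image)
for row in edges:
    print(row)
```

Flat regions produce a zero response and only the two columns on either side of the intensity step light up, which matches the description of bright edges on a dark background.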




International Society for Computational Biology
525-K East Market Street, RM 330
Leesburg, VA, USA 20176
