Monday, October 14
 

12:00pm

Conference Registration and Networking
Very light refreshments will be provided

Monday October 14, 2019 12:00pm - 1:00pm

1:00pm

Welcome
Speakers
Lydia E. Kavraki

Director Ken Kennedy Institute, Noah Harding Professor of Computer Science, Professor of Bioengineering, Professor of Electrical and Computer Engineering, Professor of Mechanical Engineering, Rice University
Lydia E. Kavraki is the Noah Harding Professor of Computer Science, professor of Bioengineering, professor of Electrical and Computer Engineering, and professor of Mechanical Engineering at Rice University. She received her B.A. in Computer Science from the University of Crete in...
Jan E. Odegard

Executive Director Ken Kennedy Institute/ Associate Vice President Research Computing, Rice University
Jan E. Odegard is Executive Director of the Ken Kennedy Institute for Information Technology and Associate Vice President, Research Computing & Cyberinfrastructure at Rice University. Dr. Odegard joined Rice University in 2002 and has over 15 years of experience supporting and enabling research...


Monday October 14, 2019 1:00pm - 1:15pm
BRC 103

1:15pm

Towards Machine Learning Health Systems - Big Data Is Not Enough
Digital systems have enabled dramatic improvements in our ability to communicate, share information, and control machines. In parallel, there has been dramatic growth in our ability to measure a myriad of quantities in our daily lives and in our practice of medicine. The variety and velocity of these newly measured quantities, combined with our ability to record them, are far outstripping our ability to extract real benefit. In addition, these measurements are not collected in isolation but carry rich contextual ‘meta-data’ detailing the social, temporal, and spatial conditions with which they are associated. While there is substantial excitement about the potential for this ‘big data’ to advance our understanding and management of human health, the reality is that we are not yet equipped with the mindset, skillset, or toolset to realize this promise. In this overview, the presenter highlights the need to rethink our approach to managing health-related data from a social and technological perspective and offers illustrative vignettes of a future in which socio-technological solutions are adapted to facilitate the paradigm of a learning health system.

Speakers
David Jaffray

Chief Technology and Digital Officer, MD Anderson Cancer Center
In the summer of 2019, Dr. David Jaffray was named MD Anderson Cancer Center's new Chief Technology and Digital Officer. In this role, Jaffray will oversee MD Anderson’s Information Services division and Information Security department. He also will hold a faculty appointment as professor...


Monday October 14, 2019 1:15pm - 2:00pm
BRC 103

2:00pm

Artificial Intelligence in Medicine with Examples in Digital Pathology and Computed Tomography Perfusion
Artificial Intelligence (AI) in medicine is a rapidly expanding field, with applications ranging from surgical robotics to visualization, segmentation, diagnostics, and beyond. Worldwide, we see a strong increase every year in the amount of medical data and information collected for diagnostic purposes. However, the number of human specialists, such as radiologists and pathologists, is not increasing correspondingly. Specifically, we have a large increase in tissue samples, histological stains, at laboratories throughout the western world, but the number of trained pathologists is reported to be decreasing. At the same time, there has been tremendous development within different branches of AI in recent years, for a wide range of applications. In particular, the development of deep neural network (DNN) architectures has shown important results throughout the world of image processing and computer vision and has also made its way into medical imaging applications. In this talk we will focus on some of the challenges and possibilities of AI in medicine through examples from digital pathology and Computed Tomography (CT) perfusion. In digital pathology, histological stains are scanned by microscopy scanners, producing high-resolution whole slide images (WSI) at 400 times magnification. We will look at examples of AI and DNN architectures used for the analysis of whole slide images from urinary bladder cancer and melanoma, for the purposes of tissue classification and segmentation as well as diagnostics, predicting cancer grade, and extracting prognostic information. CT perfusion of the head is fast and painless, and it produces 4D image data, i.e. 3D + time. It is a useful technique for measuring blood flow to the brain and an important tool for assessing patients with cerebral stroke. In this talk we will look at some AI approaches using image processing and DNNs for analysing CT perfusion for stroke patients.

Speakers
Kjersti Engan

Professor of Electrical Engineering and Computer Science, University of Stavanger, Norway
Kjersti Engan is a full professor at the Electrical Engineering and Computer Science department at the University of Stavanger (UiS). She received the BE degree in electrical engineering from Bergen University College in 1994 and the M.Sc. and Ph.D. degrees in 1996 and 2000 respectively...


Monday October 14, 2019 2:00pm - 2:30pm
BRC 103

2:30pm

The Rice Data to Knowledge (D2K) Lab
The Rice D2K Lab is a campus hub for data science experiential learning and research that brings together students and faculty from across the university and connects them with real-world problems and opportunities in data science.  In this talk, we will introduce the D2K Lab's programs and events and present our new experiential data science curriculum, highlighting some example projects and their impact.

Speakers
Genevera Allen

Professor ECE, Founder and Faculty Director of the Rice D2K Lab, Rice University
Genevera Allen is the Founder and Faculty Director of the Rice D2K Lab, an Associate Professor of Statistics, Computer Science, and Electrical and Computer Engineering at Rice University, and an investigator at the Neurological Research Institute at Baylor College of Medicine. Her...
Cara Tan

D2K Lab Project, Undergraduate Student in Computer Science (BS) and Statistics (BA), Rice University
Cara Tan is a Rice undergraduate student in Computer Science (BS) and Statistics (BA), graduating in May 2020. Cara was on a student team in the D2K Learning Lab (Spring 2019) that was sponsored by Bill.com. She will be a D2K Undergraduate Fellow for the Fall 2019 semester.


Monday October 14, 2019 2:30pm - 3:00pm
BRC 103

3:00pm

Break
Substantial refreshments will be provided

Monday October 14, 2019 3:00pm - 3:30pm

3:30pm

Sharing Resources in the Age of Deep Learning
Communication patterns of Data Science workloads have become increasingly relevant for Cloud Service Providers, High-Performance Computing (HPC) centers, and Database hosting infrastructure. For users, contention for shared resources manifests as slow access to filesystems and run-to-run performance variability. Distributed training of Deep Neural Networks (DNNs) requires high-bandwidth, low-latency networks, high-performance filesystems, and GPU resources. Distributed training is known to impact other users' applications, but this impact has yet to be quantified. We use the Global-Performance-and-Congestion-Network-Tests (GPCNeT) to model common Data Science applications running alongside proxied distributed training instances to quantify the potential impact. We compare state-of-the-art interconnect features against previous generations and demonstrate that Congestion Management and Adaptive Routing can mitigate common performance issues experienced in multi-user environments, reducing the impact by up to 5x in some cases.

Speakers
Jacob Balma

Presenter, Cray, Inc
Richard Walsh

Cray, Inc
Nick Hill

Cray, Inc



Monday October 14, 2019 3:30pm - 3:45pm
BRC 280

3:30pm

Identification of Kernels in a Convolutional Neural Network: Connections Between Level Set Equation and Deep Learning for Image Segmentation
Two common techniques for image segmentation - level set methods and convolutional neural networks (CNN) - rely on alternating convolutions with nonlinearities to describe image features: neural networks with mean-zero convolution kernels can be treated as upwind finite difference discretizations of differential equations. Such a comparison provides a well-established framework for proving properties of CNNs, such as stability and approximation accuracy. We test this relationship by constructing a level set network, a CNN where forward-propagation is equivalent to solving the level set equation. The level set network achieves comparable segmentation accuracy to solving the level set equation, while not obtaining the accuracy of a CNN. We therefore analyze which convolution filters are present in our CNN, to see whether finite difference stencils are learned during training. We observe certain patterns form in the decoding layers of the network, where kernels cannot be accounted for by finite difference stencils alone.

Speakers
Jonas Actor

Presenter, Rice University
David Fuentes

UT MD Anderson Cancer Center
Beatrice Riviere

Noah Harding Chair and Professor, Rice University



Monday October 14, 2019 3:30pm - 3:45pm
BRC 103

3:45pm

Data-Driven Super-Parametrization Using Deep Learning: Climate Models that Do Not Affect Our Climate
Certain physical processes that play key roles in the weather-climate system occur at such small spatial scales and fast time scales that resolving them can be very expensive. These subgrid-scale processes (denoted by the variable Y) are parameterized using semi-empirical schemes as a function of the large-scale, slow variables (X) that are explicitly solved. Multi-scale numerical modeling, dubbed super-parameterization (SP), improves the simulation of climate variability and extremes but is computationally prohibitive for many applications. Here, we show that recurrent neural networks (RNNs) can be used for data-driven super-parameterization (DDSP): X is solved numerically at low resolution, and Y is emulated at higher numerical resolutions using RNNs. With a multi-scale Lorenz 96 chaotic system as the testbed, and examining both short-term trajectory prediction (weather forecasting) and the reproduction of long-term statistics (climate simulation), we show that DDSP outperforms the state of the art and can achieve the accuracy of SP at a much lower computational cost.
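
For readers unfamiliar with the testbed, the slow variables X of a Lorenz 96 system can be stepped roughly as follows. This is a minimal sketch and not the presenters' code: it uses the standard single-scale Lorenz 96 tendency, and the function names, the forcing value 8.0, and the `coupling` argument (a hypothetical stand-in for the subgrid contribution that an emulator of Y would supply) are all assumptions.

```python
from typing import List, Optional


def l96_tendency(x: List[float], forcing: float = 8.0,
                 coupling: Optional[List[float]] = None) -> List[float]:
    """Standard Lorenz 96 tendency dX_k/dt on a cyclic domain.

    coupling[k] is a hypothetical additive term standing in for the effect
    of the fast subgrid variables on X_k; it is zero when omitted.
    """
    n = len(x)
    if coupling is None:
        coupling = [0.0] * n
    # Negative indices wrap around in Python, which gives the cyclic boundary.
    return [
        (x[(k + 1) % n] - x[k - 2]) * x[k - 1] - x[k] + forcing + coupling[k]
        for k in range(n)
    ]


def rk4_step(x: List[float], dt: float, forcing: float = 8.0) -> List[float]:
    """Advance X by one fourth-order Runge-Kutta step (no subgrid coupling)."""
    def shifted(base: List[float], slope: List[float], h: float) -> List[float]:
        return [b + h * s for b, s in zip(base, slope)]

    k1 = l96_tendency(x, forcing)
    k2 = l96_tendency(shifted(x, k1, dt / 2), forcing)
    k3 = l96_tendency(shifted(x, k2, dt / 2), forcing)
    k4 = l96_tendency(shifted(x, k3, dt), forcing)
    return [
        xi + dt / 6 * (a + 2 * b + 2 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]
```

A uniform state with every X_k equal to the forcing is a fixed point of this system, which makes a convenient sanity check before perturbing one component to trigger chaotic behavior.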

Speakers
Ashesh Chattopadhyay

Presenter, Rice University
Adam Subel

Rice University
Pedram Hassanzadeh

Rice University
Krishna Palem

Rice University



Monday October 14, 2019 3:45pm - 4:00pm
BRC 280

3:45pm

Prediction Models for Integer and Count Data
We propose a simple yet powerful framework for modeling integer-valued data, such as counts, scores, and rounded data. The integer-valued data are modeled by Simultaneously Transforming And Rounding (STAR) a continuous-valued process, where the transformation may be known or learned from the data. STAR produces a flexible class of integer-valued processes, which can account for zero-inflation, bounded or censored data, and over- or underdispersion. Scalable computation is available via an efficient MCMC algorithm, which provides a mechanism for direct adaptation of successful Bayesian methods for continuous data to the integer-valued data setting. Using the STAR framework, we develop new additive models and Bayesian Additive Regression Trees (BART) for integer-valued data. The predictive and inferential capabilities of STAR are illustrated using a medical utilization dataset and an animal abundance dataset, with exceptional predictive and computational performance.
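
The transform-then-round idea can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' STAR implementation: the log transformation, the floor-based rounding rule, and the names `star_round` and `simulate_star` are all chosen here for illustration, and no MCMC fitting is attempted.

```python
import math
import random
from typing import List


def star_round(y_star: float) -> int:
    """Rounding step: map a continuous value to a non-negative integer.

    Assumed rule: take the floor, and send everything below 1 to 0, which
    is one way a rounded process can produce an excess of zeros.
    """
    return max(0, math.floor(y_star))


def simulate_star(mu: float, sigma: float, n: int, seed: int = 0) -> List[int]:
    """Draw n integers: Gaussian latent value -> inverse transform -> round."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        z = rng.gauss(mu, sigma)   # continuous-valued latent process
        y_star = math.exp(z)       # inverse of an assumed log transformation
        draws.append(star_round(y_star))
    return draws
```

With a small `mu`, a large share of the latent values falls below 1 and is rounded to zero, which hints at how a rounded continuous process can represent zero-inflated counts.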

Speakers
Daniel Kowal

Presenter, Rice University
Antonio Canale

University of Padova



Monday October 14, 2019 3:45pm - 4:00pm
BRC 103

4:00pm

Middle-Distant Reading: Big Data Meets Big Humanities Scholarship
Computational Textual Analysis (CTA) is a controversial sub-field in the digital humanities. Previous work has shown that critics misunderstand the field’s objectives but also that practitioners overstate their results. Using the JSTOR/Folger Shakespeare dataset, I demonstrate that supercomputing on big humanities data can help both camps to understand one another. 
I designed and optimized: an HPC job using Python and OpenMPI to transform the Shakespeare dataset from a citational network with 623,428 edges to a co-citational network with 29,256,101 entries (5,080 CPU hours); and an HTC job (16,000 CPU hours) that reduced this dataset for a Shakespeare recommendation system. 
The resulting recommendation system powers a simple intertextual reading interface. Its recommendations qualitatively outperform standard CTA approaches. Computational analysis of humanities data can make better use of big data, especially the quantifiable interpretive activities of trained practitioners.
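
The citational-to-co-citational transformation can be sketched on a single machine as below. This is an illustrative assumption about the data layout, not the author's HPC job: each edge is taken to link a citing article to a cited passage, the identifiers are hypothetical, and no MPI parallelism is shown.

```python
from collections import Counter, defaultdict
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple


def co_citation_counts(edges: Iterable[Tuple[str, str]]) -> Counter:
    """Count how often two passages are cited by the same article.

    edges: (citing_article, cited_passage) pairs, i.e. the citational network.
    Returns a Counter keyed by sorted passage pairs, i.e. the co-citational
    network with edge weights.
    """
    cited_by: Dict[str, Set[str]] = defaultdict(set)
    for article, passage in edges:
        cited_by[article].add(passage)

    counts: Counter = Counter()
    for passages in cited_by.values():
        # Every unordered pair of passages cited together gains one count.
        for pair in combinations(sorted(passages), 2):
            counts[pair] += 1
    return counts
```

Because each article contributes a number of pairs that grows quadratically with the number of passages it cites, the co-citational network is far larger than the citational one, which is consistent with the growth from 623,428 edges to 29,256,101 entries reported above.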

Speakers
John Mulligan

Presenter, Rice University



Monday October 14, 2019 4:00pm - 4:15pm
BRC 280

4:00pm

E^2-Train: Energy-Efficient Deep Network Training with Data-, Model-, and Algorithm-Level Savings
Convolutional neural networks (CNNs) have been increasingly deployed to edge devices. Hence, many efforts have been made towards efficient CNN inference on resource-constrained platforms. This paper attempts to explore an orthogonal direction: how to conduct more energy-efficient training of CNNs? We strive to reduce the energy cost during training from three complementary levels: stochastic mini-batch dropping on the data level; selective layer update on the model level; and sign prediction for low-cost, low-precision back-propagation, on the algorithm level. Extensive simulations and ablation studies, with real energy measurements from an FPGA board, confirm the superiority of our proposed strategies and demonstrate remarkable energy savings for training. For example, when training ResNet-110 on CIFAR-100, an over 84% training energy saving comes at the small accuracy costs of 2% (top-1) and 0.1% (top-5).

Speakers
Yue Wang

Presenter, Rice University
Ziyu Jiang

Texas A&M University
Xiaohan Chen

Texas A&M University
Pengfei Xu

Rice University
Yang Zhao

Rice University
Zhangyang Wang

Texas A&M University
Yingyan Lin

Rice University



Monday October 14, 2019 4:00pm - 4:15pm
BRC 103

4:15pm

Dynamic Pricing with Hybrid Linear Contextual Bandit
In recent times, many industries have become interested in developing algorithms for dynamically varying prices based on product, customer and other market related features, to drive profitability and revenue growth. However, in many real-world use cases, where historical prices show very little or no variation, typical methods for dynamic pricing by estimating purchase probabilities are not applicable. In this study, we develop practical approaches for dynamic pricing based on reinforcement learning. We propose a hybrid contextual bandit model considering the customer or product features. The pricing setting presents significant challenges to the application of the multi-armed bandit framework since the arms are highly correlated. To capture correlation across arms, we consider a hybrid model with both arm independent and dependent features. We demonstrate using simulations that these methods can efficiently discover the optimal prices and provide a revenue lift of up to 7.6% as compared to the current practices.

Speakers
Yan Xu

Presenter, PROS Inc
Ravi Kumar

PROS Inc
Justin Silver

Senior Scientist, PROS Inc



Monday October 14, 2019 4:15pm - 4:30pm
BRC 280

4:15pm

Graph Convolutional Networks: Bringing the Deep Learning Revolution to Graphs
Deep learning (particularly deep convolutional neural networks) has revolutionized the field of data-driven modeling. Through increases in data coverage, computational horsepower, and model complexity, we have regularly seen records set and broken in computer vision.

One fundamental limitation of deep convolutional networks is the data they apply to. A “convolution” is in essence a dense matrix-matrix multiply. It operates on dense, regularly structured data (e.g. a digital image with RGB values for every pixel in a regular grid). 

This presentation offers an overview of graph convolutional networks (GCNs) – an extension to classical convolutional networks for irregular data. It includes an overview of GCNs and a walkthrough of two example applications of GCNs (one unsupervised and one semi-supervised) on a real world streaming dataset from the GDELT Project.
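
One widely used formulation of a graph convolutional layer computes ReLU(Â H W), where Â is the adjacency matrix with self-loops added and symmetrically normalized by node degree. The sketch below follows that formulation in plain Python; it is not necessarily the variant covered in the talk, and the function names and toy inputs are assumptions.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(adj: Matrix) -> Matrix:
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(adj: Matrix, features: Matrix, weights: Matrix) -> Matrix:
    """One graph-convolution layer: aggregate neighbors, project, apply ReLU."""
    propagated = matmul(normalized_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weights)]
```

Unlike an image convolution, the aggregation pattern here comes from the adjacency matrix, so the same layer applies to graphs where each node has a different number of neighbors.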

Speakers
Max Grossman

Presenter, 7pod Technologies LLC



Monday October 14, 2019 4:15pm - 4:30pm
BRC 103

4:30pm

RL in Vehicle Routing and Scheduling
Historically, transportation companies have relied on either (1) scheduling solutions rooted in heuristic-based linear programming, where a small set of rules reduces overwhelming computational complexity, or (2) rooms full of human schedulers with spreadsheets, 3-ring binders, and post-it notes. In either case, companies struggle to maximize revenue miles while efficiently managing empty vehicle movement, personnel turnover, and cost, and while meeting market demand. In this discussion we will review the use of multi-agent reinforcement learning to significantly improve on current market solutions. We will discuss the simulation of transportation requests based on historical data, the state and action spaces, rewards assignment, and supporting technology infrastructure.

Speakers
Brian Thompson

Presenter, Expero Inc.
Ryan Brady

Expero Inc.
Daniel Fay

Expero Inc.
Emmett Bertram

Expero Inc.



Monday October 14, 2019 4:30pm - 4:45pm
BRC 280

4:30pm

HodgeNet: Flow Interpolation with Graph Neural Networks
Recently, neural networks have been generalized to process data on graphs, with cutting-edge results in traditional tasks such as node classification and link prediction. These methods have all been formulated in a way suited only to data on the nodes of a graph, based on spectral graph theory. Using tools from algebraic topology, it is possible to reason about oriented data on higher-order structures by relying on the so-called Hodge Laplacian. Our goal is to develop techniques for applying the Hodge Laplacian to process data on higher-order graph structures using graph neural networks. To illustrate the practical value of this framework, we tackle the problem of flow interpolation: Given observations of flow over a subset of the edges of a graph, how can flow over the unobserved edges be inferred? We propose an architecture based on recurrent neural networks for performing flow interpolation, and demonstrate it on urban traffic data.
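
The Hodge Laplacian acting on edge flows is commonly written as L1 = B1ᵀB1 + B2B2ᵀ, where B1 is the node-to-edge incidence matrix and B2 is the edge-to-triangle incidence matrix. The sketch below only builds that operator for a small oriented graph; it does not reproduce the neural architecture described in the abstract. The function names are assumptions, and edges and triangles are assumed to list their nodes in increasing order.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]
Triangle = Tuple[int, int, int]
Matrix = List[List[int]]


def node_edge_incidence(n_nodes: int, edges: Sequence[Edge]) -> Matrix:
    """B1: column j has -1 at the tail and +1 at the head of edge j."""
    b1 = [[0] * len(edges) for _ in range(n_nodes)]
    for j, (u, v) in enumerate(edges):
        b1[u][j] = -1
        b1[v][j] = 1
    return b1


def edge_triangle_incidence(edges: Sequence[Edge],
                            triangles: Sequence[Triangle]) -> Matrix:
    """B2: boundary of triangle (a, b, c) is (a,b) + (b,c) - (a,c)."""
    index = {edge: j for j, edge in enumerate(edges)}
    b2 = [[0] * len(triangles) for _ in edges]
    for t, (a, b, c) in enumerate(triangles):
        b2[index[(a, b)]][t] = 1
        b2[index[(b, c)]][t] = 1
        b2[index[(a, c)]][t] = -1
    return b2


def hodge_laplacian(n_nodes: int, edges: Sequence[Edge],
                    triangles: Sequence[Triangle] = ()) -> Matrix:
    """L1 = B1^T B1 + B2 B2^T, an edges-by-edges matrix."""
    b1 = node_edge_incidence(n_nodes, edges)
    b2 = edge_triangle_incidence(edges, triangles)
    m = len(edges)
    lap = [[0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            down = sum(b1[k][i] * b1[k][j] for k in range(n_nodes))
            up = sum(b2[i][t] * b2[j][t] for t in range(len(triangles)))
            lap[i][j] = down + up
    return lap
```

For a single unfilled triangle, the flow circulating around the cycle lies in the null space of this matrix, which is the kind of structure that makes the operator useful for reasoning about flows on edges.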

Speakers
T. Mitchell Roddenberry

Presenter, Rice University
Santiago Segarra

Rice University



Monday October 14, 2019 4:30pm - 4:45pm
BRC 103

4:45pm

A Real-Time Streaming Analytic Pipeline for the Auto-Classification of High-Dimensional Celestial Data Using Innovative Hybrid Machine Learning Techniques
The automatic detection and classification of celestial data on ingest is of growing importance as the volume and velocity of survey images increase. Here we present a multistage data pipeline with an integrated machine learning classifier based on the Kohonen self-organizing map (SOM). SOMs are simple to implement, and they learn in an unsupervised manner. This allows a SOM artificial neural network (ANN) to be trained without pre-classification of the training data set, rendering SOM results free from human bias. In turn, the open source data pipeline is built for absolute speed and efficiency. Data is collected, coalesced, and classified based on a real-time streams framework consisting of Apache Flink and Apache Spark. Interesting radio sources are collected in a deep object store for further analysis and review, while data labeled as interference is discarded.

Speakers
Theresa Melvin

Presenter, HPE


Monday October 14, 2019 4:45pm - 5:00pm
BRC 280

4:45pm

Imitate Like a Baby: The Key to Efficient Exploration in Deep Reinforcement Learning
Mimicking the behavior of an expert player in a Reinforcement Learning (RL) environment to enhance the training of a novice agent from scratch is called Imitation Learning (IL). In most RL environments, the sequence of states an agent encounters follows a Markov Decision Process. This makes mimicking difficult, as a new agent is unlikely to encounter state sequences similar to an expert's. Prior research in IL proposes learning a mapping between the expert's states and actions, which requires a considerable number of state-action pairs to achieve good results. We propose an alternative to IL that augments the novice's action space with frequent action sequences of the expert. This modification improves exploration and significantly outperforms alternatives like Dataset-Aggregation. We experiment with popular Atari games and show significant and consistent growth in the score that the new agents achieve using just a few expert action sequences.

Speakers
Tharun Medini

Rice University
Anshumali Shrivastava

Presenter, Rice University
Anshumali Shrivastava is an Assistant Professor in the Department of Computer Science at Rice University with joint appointments in the Statistics and ECE departments. His broad research interests include large scale machine learning, randomized algorithms for big data systems and...



Monday October 14, 2019 4:45pm - 5:00pm
BRC 103

5:00pm

Networking Gala
Substantial hors d'oeuvres, beer, and wine will be provided

Monday October 14, 2019 5:00pm - 7:00pm
 
Tuesday, October 15
 

7:30am

Breakfast
Breakfast tacos will be provided

Tuesday October 15, 2019 7:30am - 8:30am

8:30am

Welcome
Speakers
Jan E. Odegard

Executive Director Ken Kennedy Institute/ Associate Vice President Research Computing, Rice University
Jan E. Odegard is Executive Director of the Ken Kennedy Institute for Information Technology and Associate Vice President, Research Computing & Cyberinfrastructure at Rice University. Dr. Odegard joined Rice University in 2002 and has over 15 years of experience supporting and enabling research...
Moshe Y. Vardi

Rice University


Tuesday October 15, 2019 8:30am - 8:45am
BRC 103

8:45am

Data in the Cloud
The cloud has forced a rethinking of database architectures. Does this offer an opportunity to address the siloed nature of data management systems? The question is especially important given the rise of machine learning and data governance. In this talk, I'll discuss these issues through the lens of the Microsoft data journey, both internal and external.

Speakers
Raghu Ramakrishnan

CTO for Data, Microsoft
Raghu Ramakrishnan is CTO for Data, and a Technical Fellow at Microsoft. From 1987 to 2006, he was a professor at University of Wisconsin-Madison, where he wrote the widely used text “Database Management Systems”. In 1999, he founded QUIQ, a company powering crowd-sourced question-answering...



Tuesday October 15, 2019 8:45am - 9:30am
BRC 103

9:30am

Using Deep Learning to Understand Documents
Bill.com is working to build a paperless future. We parse 60M documents a year, including invoices, contracts, receipts, and a variety of other types. Understanding those unstructured documents is critical to building intelligent products for our users. While the field of optical character recognition (OCR) has been around for almost half a century, document parsing and field extraction from images remain an open research topic and a challenging problem to date. Computer vision algorithms utilizing deep learning have progressed rapidly and allow image segmentation at the region and pixel level in domains such as self-driving cars and scenery embeddings. We utilize an end-to-end deep learning and OCR architecture to predict regions of interest within documents and extract their text.

Speakers
Eitan Anzenberg

Director of Data Science, Bill.com
Eitan Anzenberg is the Director of Data Science at Bill.com. He has 7+ years of experience in Data Science with a background in machine learning, applied statistics, modeling and engineering. Eitan was a Postdoctoral Scholar at Lawrence Berkeley National Lab, received his PhD in Physics from Boston University and his B.S. in Astrophysics fr...


Tuesday October 15, 2019 9:30am - 10:00am
BRC 103

10:00am

Break
Tuesday October 15, 2019 10:00am - 10:30am

10:30am

Forecast the Future of US Oil and Gas Supply
We use Monte Carlo simulation to capture the details of the oil and gas production process in order to forecast the US oil and gas supply. This differs from a pure machine learning model where the process is treated as a black box, and the underlying mechanism is not known. The model utilizes rig count and historical well production data to forecast oil and gas production within the United States. The model first computes the number of first production wells from the rig count by analyzing the drilling process. Then, it forecasts the future production of each well by fitting Arps’ equation. The model forecasts production at the basin level for both tight and conventional wells for a given scenario. A scenario is a set of parameters including future rig count, drill time, completion time, idle time, backlog probability, and initial production.
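
The decline-curve step can be illustrated with the Arps rate equation, q(t) = q_i / (1 + b·D_i·t)^(1/b), which reduces to exponential decline q_i·exp(−D_i·t) when b = 0. The sketch below is an assumption-laden illustration rather than the presenters' model: the parameter values are hypothetical, every well shares one curve, and neither the curve fitting nor the Monte Carlo sampling of scenarios is shown.

```python
import math
from typing import List, Sequence


def arps_rate(q_i: float, d_i: float, b: float, t: float) -> float:
    """Production rate at time t under Arps decline.

    q_i: initial rate, d_i: initial decline rate, b: decline exponent
    (b = 0 exponential, 0 < b < 1 hyperbolic, b = 1 harmonic).
    """
    if b == 0:
        return q_i * math.exp(-d_i * t)
    return q_i / (1.0 + b * d_i * t) ** (1.0 / b)


def basin_forecast(start_months: Sequence[int], q_i: float, d_i: float,
                   b: float, horizon: int) -> List[float]:
    """Sum per-well Arps curves for wells that start in the given months."""
    total = [0.0] * horizon
    for start in start_months:
        for month in range(start, horizon):
            total[month] += arps_rate(q_i, d_i, b, month - start)
    return total
```

In a scenario-based forecast, `start_months` would come from the rig count and drilling-time assumptions, and repeating the calculation over sampled parameters would give a distribution of outcomes rather than a single curve.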

Speakers
Jiangchuan Huang

Presenter, ConocoPhillips
Darryl Buswell

ConocoPhillips
James Pearson

ConocoPhillips



Tuesday October 15, 2019 10:30am - 10:45am
BRC 103

10:30am

A Sequential Decision-Making Model for Diagnosis and Medication
Speakers
Yejin Kim

University of Texas Health Science Center at Houston


Tuesday October 15, 2019 10:30am - 10:45am
BRC 280

10:45am

Machine Learning and the Internet of Things Enable Steam Flood Optimization for Improved Oil Production
Recently developed machine learning techniques, in association with the Internet of Things (IoT) allow for the implementation of a method of increasing oil production from heavy-oil wells. In contrast to traditional simulations of steam flood, a widely used enhanced oil recovery technique based on principles of classic physics, we introduce here an approach using cutting-edge machine learning techniques that have the potential to provide a better way to describe the performance of steam flood. We propose a workflow to address a category of time-series data that can be analyzed with supervised machine learning algorithms and IoT. We demonstrate the effectiveness of the technique for forecasting oil production in steam flood scenarios. Moreover, we build an optimization system that recommends an optimal steam allocation plan, and show that it leads to a 3% improvement in oil production. We develop a minimum viable product on a cloud platform.

Speakers
Mi Yan

Data Scientist, ExxonMobil
Mi Yan is a data scientist at ExxonMobil where he focuses on the application of machine learning in the oil and gas industry ranging from upstream to downstream. Mi received his PhD in physics from Rice University, and then served as a geophysicist at CGG. Later Mi joined Citibank...
Jonathan MacDonald

Imperial Oil Resources Ltd.
Chris Reaume

ExxonMobil
Wesley Cobb

ExxonMobil
Tamas Toth

ExxonMobil



Tuesday October 15, 2019 10:45am - 11:00am
BRC 103

10:45am

Application of Deep Learning to Automated Diagnosis of Lymphoma with Digital Pathology Images
Recent studies have shown promising results in using Deep Learning to detect malignancy in whole slide imaging. 

Previous studies have been limited to predicting a positive or negative finding for a specific malignancy. We attempted to build a diagnostic model for four diagnostic categories of lymphoma.

Our Deep Learning software, a convolutional neural network, was written in Python. We obtained digital whole-slide images of 128 cases, with 32 cases for each diagnostic category. Four sets of 5 representative images, 40x40 pixels in dimension, were taken for each case. For each test set of 5 images, the predicted diagnosis was combined from the predictions of the five images. The test results showed excellent diagnostic accuracy: 95% for image-by-image prediction and 100% for set-by-set prediction.

This preliminary study provided a proof of concept for incorporating automated lymphoma diagnostic screen into future pathology workflow to augment the pathologists’ productivity.

Speakers
Andy Nguyen

Presenter, University of Texas Health Science Center at Houston
Hanadi El Achi

University of Texas Health Science Center at Houston
Tatiana Belousova

University of Texas Health Science Center at Houston
Lei Chen

University of Texas Health Science Center at Houston
Amer Wahed

University of Texas Health Science Center at Houston
Iris Wang

University of Texas Health Science Center at Houston
Zhihong Hu

University of Texas Health Science Center at Houston
Zeyad Kanaan

University of Texas Health Science Center at Houston
Adan Rios

University of Texas Health Science Center at Houston



Tuesday October 15, 2019 10:45am - 11:00am
BRC 280

11:00am

Near Real-Time Hydraulic Fracture Event Recognition using Deep Learning Methodology
Historically, the real-time hydraulic fracturing analytics system (Real-Time Completion system, RTC) has relied heavily on manually labeled data. The manual tasks, including fracture stage start/end labeling and ball pumpdown/seat event labeling, suffer from human bias and inconsistent errors, and can easily take days to review and correct. This paper presents the development and technical details of the automated stage-wise KPI report generator that fills the manual task gaps and provides industry-leading performance. The generator is constructed with two machine learning models that detect the stage start and end and identify the ball pumpdown and seat operations. These tasks are performed based on the reliably available measurements of slurry rate and wellhead pressure, which enable real-time automated stage-wise KPI analysis, and they also lay the foundation for further advanced analysis regarding real-time hydraulic fracture operational decision making.

Speakers
Yuchang Shen

Anadarko Petroleum Corporation
Dingzhou Cao

Presenter, Anadarko Petroleum Corporation
Kate Ruddy

Anadarko Petroleum Corporation



Tuesday October 15, 2019 11:00am - 11:15am
BRC 103

11:00am

Measuring Neurodegeneration with Brain Imaging and Digital Device Interaction
Speakers
Luca Giancardo

Presenter, University of Texas Health Science Center at Houston
Luca Giancardo is an Assistant Professor at the Center for Precision Health, School of Biomedical Informatics (SBMI), UTHealth, with co-appointments at the Diagnostic and Interventional Imaging, McGovern Medical School, UTHealth and the Institute for Stroke and Cerebrovascular...



Tuesday October 15, 2019 11:00am - 11:15am
BRC 280

11:15am

Automated Formation Top Labeling and Well Depth Matching by Machine Learning
Depth matching of multiple logging curves is essential to any well evaluation or reservoir characterization and can be applied to various measurements of a single or multiple logging curves from multiple wells within the same field. As many drilling advisory projects have been launched to digitalize the well log analysis, accurate depth matching becomes an important factor in improving well evaluation, production, and recovery. It is a challenge, though, due to the unpredictable structure of the geological formations. We conduct a study on the alignment of multiple gamma-ray well logs by using machine learning techniques. The objective is to automate the depth matching task with minimum human intervention. A novel multitask learning approach is presented to optimize the depth matching strategy that correlates gamma-ray logs. The proposed approach can be extended to other applications as well, such as automatic formation top labeling for an ongoing well given a reference well.

Speakers
SW

Shirui Wang

Presenter, University of Houston
QS

Qiuyang Shen

University of Houston
XW

Xuqing Wu

University of Houston
JC

Jiefu Chen

University of Houston



Tuesday October 15, 2019 11:15am - 11:30am
BRC 103

11:15am

Brain Tissue Analytics for Accelerating Drug Discovery
While international brain mapping initiatives remain focused on the structure and working of the healthy brain, the need to map the unhealthy brain is compelling and urgent. Pointedly, traumatic brain injury (TBI) inflicts pathological alterations in all types of brain cells, at scales ranging from individual cells to multi-cellular functional units, to the layered brain cytoarchitecture. Furthermore, it inflicts a mix of alterations due to the primary injury, secondary injuries, regenerative processes, inflammation, tissue remodeling, drug treatments, and drug side effects. Many of these alterations can be subtle and/or latent, only discernible by sensing changes in cell morphology and/or the expression and/or intra-cellular distribution of specific molecular markers, and can be distant from the injury/damage site. Unfortunately, current immunohistochemistry (IHC) methods reveal only a fraction of these alterations, and do not provide quantitative readouts.

We present an approach for comprehensive pathological brain tissue mapping with a focus on rational therapeutics development. Our method is based on imaging and analyzing highly multiplexed whole brain sections using 10 – 50 molecular markers. Analyzing these images is challenging due to their complexity, variability, and size. We describe a combination of morphology-driven signal reconstruction, deep cell detection and segmentation, and data analysis methods to generate quantitative readouts of cellular alterations at multiple scales ranging from individual cells to multi-cellular units, cortical layers, and atlas-defined brain regions. The results can be used for testing hypotheses, screening combination drug therapies, and system-level studies.



Tuesday October 15, 2019 11:15am - 11:30am
BRC 280

11:30am

Object Detection with Deep Learning in the Oil and Gas Industry
Oil and gas companies routinely analyze many types of 2D images. For example, to date rocks, many microfossils are identified and counted on large microscope images. This research is applied to microfossil detection and could benefit other use cases.

To automate this cumbersome work, we suggest a hybrid system consisting of two steps: 1) a heuristic over-segmentation to localize regions of interest (ROIs) with traditional computer vision; 2) a Convolutional Neural Network (CNN) trained to classify ROIs. This hybrid system presents two advantages compared to state-of-the-art object detection approaches such as those applied to ImageNet. First, data management of supervised CNN classifiers is more flexible because they are trained on ROIs and not on the overall input image. Second, researchers have focused more on CNN classifiers because of their simplicity.

Finally, we study the detection quality of this system on our microfossil detection application.

Speakers

Tuesday October 15, 2019 11:30am - 11:45am
BRC 103

11:30am

Composed Relation-Based Learning (CoRL): Predictive Modeling of Drug Side-Effect Relationships
Drug side-effects are unfortunately common and are often undetected until after a drug has been released. The current state of the art relies on observing a disproportionate co-occurrence of a drug with a potential side-effect for detection. This methodology does not consider relational or causal information, or similarity between drugs and side-effects, for classification. In this work, embeddings are learned from literature-derived relational connections, entangled together for pairs of interest, and leveraged with supervised machine learning. This composed, relation-based learning (CoRL) produces state-of-the-art performance on two widely used, manually curated reference standards for drug safety monitoring.

Speakers
JM

Justin Mower

Presenter, Rice University
TC

Trevor Cohen

University of Washington
DS

Devika Subramanian

Rice University
My research interests are in artificial intelligence and machine learning and their applications in computational systems biology, neuroscience of human learning, assessments of hurricane risks, network analysis of power grids, mortality prediction in cardiology, conflict forecasting...



Tuesday October 15, 2019 11:30am - 11:45am
BRC 280

11:45am

Automated Salt Top Interpretation
The goal of this work is to demonstrate the detection and extraction of salt tops from seismic data via the application of deep learning. Automated salt top extraction is a growing necessity in the field of petroleum exploration. Synthetic data is used for automatic label generation and for training a convolutional neural network capable of predicting, with higher accuracy, the salt top in data unseen during training. Several experiments were performed and evaluated to explore the effects of changing various parameters during training. The best model produced in this study provides excellent results when compared with the interpretation.

Speakers
GL

German Larrazabal

Technology Lab - Geophysics Repsol
FP

Freddy Perozo

Technology Lab – Advance Mathematics Repsol
PG

Pablo Guillen-Rondon

Presenter, University of Houston


Tuesday October 15, 2019 11:45am - 12:00pm
BRC 103

11:45am

Real-Time Metabolic and Molecular Imaging of Cancer Systems
Pressure Injuries (PrI) and Deep Tissue Pressure Injuries (DTPrI) are serious conditions among persons with spinal cord injury (SCI), resulting in tremendous personal and societal cost. Primary PrI prevention is the first line of defense, but it is challenging because there are many risk factors to consider, ranging from the individual’s environment to local tissue health. A comprehensive data repository is needed for this purpose. This paper presents SCIPUDSphere, a data repository that extracts, integrates, and stores data on a wide range of PrI risk factors and provides a user query interface for subgroup identification and hypothesis generation. We extracted a total of 268,562 records containing 282 ICD9 codes related to PrI among 36,626 individuals with SCI from the Veterans Administration's VA Informatics and Computing Infrastructure (VINCI) electronic health records. They are integrated into SCIPUDSphere, which can be utilized as a data source for researchers.

Speakers
PB

Pratip Bhattacharya

Presenter, The University of Texas MD Anderson Cancer Center
TS

Travis Salzillo

The University of Texas MD Anderson Cancer Center
CM

Caitlin McCowan

The University of Texas MD Anderson Cancer Center & Rice University
SP

Shivanand Pudakalakatti

The University of Texas MD Anderson Cancer Center
PD

Prasanta Dutta

The University of Texas MD Anderson Cancer Center
MT

Mark Titus

The University of Texas MD Anderson Cancer Center
NZ

Niki Zacharias

The University of Texas MD Anderson Cancer Center


Tuesday October 15, 2019 11:45am - 12:00pm
BRC 280

12:00pm

Lunch
Boxed lunches will be provided

Tuesday October 15, 2019 12:00pm - 1:00pm

1:00pm

Building Data Science Fluency
Imagine a restaurant where all the employees are cooks, with no waiters, greeters or bussers. Or another restaurant in which a single employee does every job required to bring a dish to the table: planting produce, milking cows, preparing ingredients, taking and cooking orders, plating and serving the final dish. It is unlikely that either of these extremes will work well in practice. In a similar way, the interdisciplinary nature of data science means that complex projects usually require a team of people to guarantee success. Building data science skills effectively across an organization requires providing targeted learning opportunities and resources. This talk will discuss three key audiences for data science education and some ideas about how to increase data science fluency for each of these audiences.

Speakers

Alena Crivello

Data Scientist, Independent
Dr. Alena Crivello is a Statistician and Data Scientist. After postdoctoral work in clinical trials at the University of Michigan, Alena has spent the last decade working in the energy industry in Houston. She has worked on data science challenges in a wide variety of applications...



Tuesday October 15, 2019 1:00pm - 1:30pm
BRC 103

1:30pm

Biomedical Informatics in a Data-driven World
Irrespective of data size and complexity, query and exploration tools for accessing data resources remain a central linkage for human-data interaction. A fundamental barrier in making query interfaces easier to use for health data, ultimately as easy as online shopping, is the lack of faceted, interactive capabilities. This talk will present a research program and progress made to repurpose existing ontologies by transforming them into nested facet systems (NFS) to support human-data interaction. Two basic research topics arise: one is that the structure and quality of biomedical ontologies need to be examined and elevated for the purpose of NFS; the second is that mappings from data-source specific metadata to a corresponding NFS need to be developed to support this new generation of NFS-enabled web-interfaces. I will motivate the concept of NFS using an array of data resource examples, provide a preliminary order-theoretic formulation for NFS, and demonstrate NFS interfaces that have been deployed to illustrate the conceptual and design considerations involving NFS.

Speakers

GQ Zhang

Vice President and Chief Data Scientist, UT Health Science Center at Houston
Dr. Zhang is Vice President and Chief Data Scientist for UTHealth. He is a Professor in the Department of Neurology, McGovern Medical School and Co-Director, Texas Institute for Restorative Neurotechnologies. Prior to joining UTHealth, he was Professor of Internal Medicine and Computer...



Tuesday October 15, 2019 1:30pm - 2:15pm
BRC 103

2:15pm

Huge Cohorts, Genomics, and Clinical Data to Personalize Medicine
Precision medicine offers the promise of improved diagnosis and more effective, patient-specific therapies. Typically, such studies have been pursued using research cohorts. At Vanderbilt and in the Electronic Medical Records and Genomics (eMERGE) Network, we have studied the genomic basis of disease and drug response using real-world clinical data from the electronic health record (EHR). The EHR also gave birth to a “reverse genetics” experiment (starting with a genotype and discovering all the phenotypes with which it is associated) via the phenome-wide association study. By looking for clusters of diseases and symptoms through phenotype risk scores, we find unrecognized genetic variants associated with common disease. The foundation of clinical data combined with participant-collected data has become a platform for a new era of huge cohorts such as the UK Biobank, Million Veteran Program, and the All of Us Research Program. All of Us launched May 6, 2018 and currently has nearly 200,000 participants who have contributed biospecimens, health surveys, and a willingness to share their EHR. Participants in All of Us will also have the option of receiving research results back.

Speakers

Joshua C. Denny

M.D., M.S., F.A.C.M.I. Professor of Biomedical Informatics and Medicine Director, Center for Precision Medicine, Vice President for Personalized Medicine Vanderbilt University Medical Center Department of Biomedical Informatics, Vanderbilt University Medical Center
Josh Denny is a Professor of Biomedical Informatics and Medicine, Director of the Center for Precision Medicine, and Vice President for Personalized Medicine. His research interests include natural language processing, accurate phenotype identification from electronic medical record...


Tuesday October 15, 2019 2:15pm - 3:00pm
BRC 103

3:00pm

Poster Session and Networking
Light hors d'oeuvres, beer, and wine will be provided

Tuesday October 15, 2019 3:00pm - 5:00pm

3:01pm

A Transfer Learning Approach to Real-time Seizure Prediction
Speakers
YH

Yan Huang

Presenter, University of Kentucky
XL

Xiaojin Li

University of Texas Health Science Center at Houston
ST

Shiqiang Tao

McGovern Medical School
GZ

Guo-Qiang Zhang

McGovern Medical School


Tuesday October 15, 2019 3:01pm - 5:00pm

3:01pm

An Integrative Data Repository for Studying Risk Factors Associated with Pressure Injuries Resulting from Spinal Cord Injury
Speakers
NZ

Ningzhou Zeng

Presenter, University of Kentucky
SR

Steve Roggenkamp

University of Kentucky
KB

Kath Bogie

Louis Stokes Cleveland Department of Veterans Affairs Medical Center
ST

Shiqiang Tao

McGovern Medical School
GZ

Guo-Qiang Zhang

McGovern Medical School


Tuesday October 15, 2019 3:01pm - 5:00pm

3:01pm

Applying Graph Convolutional Neural Networks for Drug Metabolism Prediction
Speakers

Lydia E. Kavraki

Director Ken Kennedy Institute, Noah Harding Professor of Computer Science, Professor of Bioengineering, Professor of Electrical and Computer Engineering, Professor of Mechanical Engineering, Rice University
Lydia E. Kavraki is the Noah Harding Professor of Computer Science, professor of Bioengineering, professor of Electrical and Computer Engineering, and professor of Mechanical Engineering at Rice University. She received her B.A. in Computer Science from the University of Crete in...


Tuesday October 15, 2019 3:01pm - 5:00pm

3:01pm

Be a Copycat: Uncharted Rewards by Mimicking Expert Action Sequences
Speakers

Anshumali Shrivastava

Presenter, Rice University
Anshumali Shrivastava is an Assistant Professor in the Department of Computer Science at Rice University with joint appointments in the Statistics and ECE departments. His broad research interests include large scale machine learning, randomized algorithms for big data systems and...
TM

Tharun Medini

Rice University


Tuesday October 15, 2019 3:01pm - 5:00pm

3:01pm

Can Deep Learning Predict Chaos?
Speakers
DS

Devika Subramanian

Rice University
My research interests are in artificial intelligence and machine learning and their applications in computational systems biology, neuroscience of human learning, assessments of hurricane risks, network analysis of power grids, mortality prediction in cardiology, conflict forecasting...
KP

Krishna Palem

Rice University
PH

Pedram Hassanzadeh

Rice University


Tuesday October 15, 2019 3:01pm - 5:00pm

3:01pm

Forecasting Time Series of Counts using Dynamic Linear Models
Speakers
DK

Daniel Kowal

Presenter, Rice University


Tuesday October 15, 2019 3:01pm - 5:00pm

3:01pm

Hospital Length of Stay Analytics
Speakers

Tuesday October 15, 2019 3:01pm - 5:00pm

3:01pm

Identifying genetic markers associated with Alzheimer’s Disease progression through image phenotyping
Speakers

Luca Giancardo

Presenter, University of Texas Health Science Center at Houston
Luca Giancardo is an Assistant Professor at the Center for Precision Health, School of Biomedical Informatics (SBMI), UTHealth with co-appointments at the Diagnostic and Interventional Imaging, McGovern Medical School, UTHealth and the Institute for Stroke and Cerebrovascular...


Tuesday October 15, 2019 3:01pm - 5:00pm

3:01pm

SmartExchange: Trading Higher-cost Memory Storage/Access for Lower-cost Computation
Speakers
YZ

Yang Zhao

Rice University
XC

Xiaohan Chen

Texas A&M University
PX

Pengfei Xu

Rice University
YW

Yue Wang

Presenter, Rice University
ZW

Zhangyang Wang

Texas A&M University
YL

Yingyan Lin

Rice University


Tuesday October 15, 2019 3:01pm - 5:00pm

3:01pm

Understanding the dynamics of reservoir computing in chaotic dynamical systems
Speakers
AS

Adam Subel

Rice University
AC

Ashesh Chattopadhyay

Presenter, Rice University
PH

Pedram Hassanzadeh

Rice University


Tuesday October 15, 2019 3:01pm - 5:00pm

3:01pm

Web-based Interactive Visualization of Non-Lattice Subgraphs in SNOMED CT
Speakers
ST

Shiqiang Tao

McGovern Medical School

GQ Zhang

Vice President and Chief Data Scientist, UT Health Science Center at Houston
Dr. Zhang is Vice President and Chief Data Scientist for UTHealth. He is a Professor in the Department of Neurology, McGovern Medical School and Co-Director, Texas Institute for Restorative Neurotechnologies. Prior to joining UTHealth, he was Professor of Internal Medicine and Computer...


Tuesday October 15, 2019 3:01pm - 5:00pm