Monday, October 14 • 4:45pm - 5:00pm
A Real-Time Streaming Analytic Pipeline for the Auto-Classification of High-Dimensional Celestial Data Using Innovative Hybrid Machine Learning Techniques



The automatic detection and classification of celestial data on ingest is of growing importance as the volume and velocity of survey images increase. Here we present a multistage data pipeline with an integrated machine learning classifier based on the Kohonen self-organizing map (SOM). SOMs are simple to implement and learn in an unsupervised manner. This allows a SOM artificial neural network (ANN) to be trained without pre-classification of the training data set, so the results are free from the human bias that manual labeling can introduce. The open source data pipeline, in turn, is built for speed and efficiency. Data is collected, coalesced, and classified using a real-time streaming framework consisting of Apache Flink and Apache Spark. Interesting radio sources are collected in a deep object store for further analysis and review, while data labeled as interference is discarded.
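
To illustrate the unsupervised training the abstract refers to, here is a minimal sketch of a Kohonen SOM in plain Python. It is not the presenter's implementation: the function names (`train_som`, `best_matching_unit`), the grid size, and the linearly decaying learning rate and neighborhood radius are all assumptions chosen for brevity. It shows only the core idea, that each unlabeled sample pulls its closest node and that node's grid neighbors toward it.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the (row, col) of the node whose weight vector is closest to x."""
    best, best_d2 = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d2 = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d2 < best_d2:
                best, best_d2 = (r, c), d2
    return best


def train_som(data, rows=4, cols=4, epochs=20, lr0=0.5, seed=0):
    """Train a small SOM on unlabeled vectors; returns a rows x cols grid of weights.

    Hypothetical helper for illustration only; the decay schedules are assumptions.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # Random initial weight vector for every node on the grid.
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    sigma0 = max(rows, cols) / 2.0  # initial neighborhood radius
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        samples = list(data)
        rng.shuffle(samples)
        for x in samples:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                # learning rate decays to 0
            sigma = sigma0 * (1.0 - frac) + 1e-3   # radius shrinks, stays positive
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighborhood measured on the grid, not in data space.
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2.0 * sigma ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights
```

After training, `best_matching_unit(weights, x)` maps a new sample to a grid node. Deciding which nodes correspond to interesting sources and which to interference is a separate step that this sketch does not cover.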


Theresa Melvin

Presenter, HPE

Monday October 14, 2019 4:45pm - 5:00pm CDT
BRC 280
