Event Title

Honey, I Shrunk the Supercomputer: Scalable Big Data Analysis Using a Raspberry Pi Computing Cluster and Apache Hadoop

Location

101 Science & Nursing Building

Start Date

27-4-2018 10:00 AM

End Date

27-4-2018 10:30 AM

Description

Big data refers to collections of data sets so large and complex that they become difficult to process with traditional data processing applications or methods and cannot be managed as a single instance. Big data analytics is the process of examining these large and varied sets of information to build a more coherent understanding of the data by uncovering hidden correlations, patterns, and trends. In a typical large-scale data analysis operation, the analysis is performed by a "supercomputer" or by several high-performance computers (HPCs) linked together to pool computing resources. This arrangement is called a "cluster," and each individual computer is a "node"; the nodes work in parallel to solve problems or, in this instance, to analyze big data.
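
To illustrate how work is divided across such a cluster, the sketch below is a minimal Hadoop MapReduce job written in Java, modeled on the standard word-count example. The class and member names (SkillCount, TokenMapper, SumReducer) and the command-line input/output paths are hypothetical illustrations, not taken from the paper; the Hadoop classes used (Mapper, Reducer, Job, etc.) are the framework's standard API.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SkillCount {

  // Mapper: each node tokenizes its share of the input and emits (term, 1).
  public static class TokenMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString().toLowerCase());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reducer: sums the counts produced for each term across all nodes.
  public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "skill count");
    job.setJarByClass(SkillCount.class);
    job.setMapperClass(TokenMapper.class);
    job.setCombinerClass(SumReducer.class);
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // e.g., HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // e.g., HDFS output directory
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Hadoop splits the input data across the nodes' mappers, shuffles the intermediate (term, count) pairs by key, and lets the reducers aggregate them in parallel; this is the same division of labor whether the nodes are rack-mounted HPCs or Raspberry Pis.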

This paper addresses the planning and construction of a small-scale supercomputer (smaller than a breadbox) used to analyze data sets pertaining to employable skills for students entering the technology job field.
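
For context, a Hadoop cluster of this kind is typically assembled by giving each board a stable hostname and listing the worker nodes in the master's configuration. The sketch below assumes a hypothetical four-node Raspberry Pi setup; the hostnames and addresses are illustrative and not taken from the paper.

# /etc/hosts on every node (illustrative addresses)
192.168.1.10  pi-master
192.168.1.11  pi-worker1
192.168.1.12  pi-worker2
192.168.1.13  pi-worker3

# $HADOOP_HOME/etc/hadoop/workers on the master (named "slaves" in Hadoop 2.x)
pi-worker1
pi-worker2
pi-worker3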
