Data Analysis at Scale
This two-day course explores cloud-based analytics platforms, demonstrating highly scalable, hosted batch and stream processing solutions. The curriculum covers the Amazon Web Services (AWS) and Google Cloud Platform (GCP) cloud platforms as well as essential open source solutions, including Hadoop, Spark, and Kafka. It also covers cloud-based data processing services, including big data storage facilities, ETL tools, and batch and stream-based machine learning and analytics processing solutions. Upon completion of the course, attendees will have a deeper understanding of the streaming and batch-based data processing technologies in common use and of how to leverage the unique features of the big data platforms in the cloud.
Who Should Attend
Developers, Architects, DevOps engineers, Quality Assurance (QA) personnel, and Professional Services staff
What Attendees Will Learn
This intensive, hands-on course is designed to provide participants with a foundational understanding of big data streaming and processing in the cloud from a developer’s perspective. Learning modules include:
- Big Data in the cloud: data acquisition and processing, Apache Hadoop, cloud-based data warehouses
- Streaming and stream processing: data streaming, stream processing, stream analytics, Apache Spark, machine learning
Prerequisites
Knowledge of basic programming concepts is required; knowledge of a specific programming language is not required.