In this course, you will learn the basics of the Scala programming language and how Apache Spark operates on a cluster. You will set up discretized streams with Spark Streaming and transform them as data is received, analyze streaming data over sliding windows of time, and maintain stateful information across streams of data. You will connect Spark Streaming to highly scalable sources of data, including Kafka, Flume, and Kinesis, dump streams of data in real time to NoSQL databases such as Cassandra, and run SQL queries on streamed data in real time. You will also train machine learning models on streaming data in real time and use them to make predictions that keep getting better over time, and finally package, deploy, and run self-contained Spark Streaming code on a real Hadoop cluster using Amazon Elastic MapReduce.
This course is very hands-on, filled with achievable activities and exercises to reinforce your learning. By the end of this course, you will be confidently creating Spark Streaming scripts in Scala and be prepared to tackle massive streams of data in a whole new way. You will be surprised at how easy Spark Streaming makes it!
All the code and supporting files for this course are available at https://github.com/packtpublishing/streaming-big-data-with-spark-stream…-
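As a taste of the kind of script the course builds toward, here is a minimal sketch of a discretized stream with a sliding-window word count. It is not taken from the course materials; the local master, the socket source on localhost:9999, and the batch, window, and slide durations are illustrative assumptions.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object WindowedWordCount {
  def main(args: Array[String]): Unit = {
    // Run locally with two threads: one to receive data, one to process it.
    val conf = new SparkConf().setMaster("local[2]").setAppName("WindowedWordCount")
    // Each micro-batch of the discretized stream covers one second of data.
    val ssc = new StreamingContext(conf, Seconds(1))

    // Hypothetical source: lines of text arriving on a local socket (e.g. started with `nc -lk 9999`).
    val lines = ssc.socketTextStream("localhost", 9999)
    val words = lines.flatMap(_.split("\\s+"))

    // Count each word over the last 30 seconds of data, recomputed every 10 seconds.
    val windowedCounts = words
      .map(word => (word, 1))
      .reduceByKeyAndWindow((a: Int, b: Int) => a + b, Seconds(30), Seconds(10))

    windowedCounts.print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```

The same basic structure, building a StreamingContext, defining transformations on DStreams, and then starting the context, carries through to the Kafka, Cassandra, Spark SQL, and machine learning topics covered later in the course.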
Process large amounts of real-time data using the Spark Streaming module
Create efficient Spark applications using the Scala programming language
Integrate Spark Streaming with various data sources
Integrate Spark Streaming with Spark SQL to query your data in real time
Train machine learning models with streaming data, and use them for real-time predictions
Maintain stateful data across a continuous stream of input data (a minimal sketch follows this list)
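The last outcome above, keeping state across batches, can be sketched with Spark Streaming's updateStateByKey. This is a minimal illustration rather than the course's own code; the socket source, the checkpoint directory, and the running-total logic are assumptions made for the example.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StatefulWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("StatefulWordCount")
    val ssc = new StreamingContext(conf, Seconds(1))
    // updateStateByKey requires a checkpoint directory so state can be recovered between batches.
    ssc.checkpoint("checkpoint")

    // Fold each batch's new counts for a word into that word's running total.
    def updateTotal(newValues: Seq[Int], runningTotal: Option[Int]): Option[Int] =
      Some(runningTotal.getOrElse(0) + newValues.sum)

    // Hypothetical source: lines of text arriving on a local socket.
    val totals = ssc.socketTextStream("localhost", 9999)
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .updateStateByKey(updateTotal _)

    totals.print() // running totals since the stream started, not just the current batch
    ssc.start()
    ssc.awaitTermination()
  }
}
```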