Since its inception, Apache Spark has seen rapid adoption by enterprises across a wide range of industries, so mastering it opens up many professional opportunities. If you are a software engineer or architect who wants to design or build your own projects, this is the right course for you.
This is a hands-on, example-driven, advanced course with demonstrations and coding sessions. It will help you understand real-time stream processing using Apache Spark and apply that knowledge to build real-time stream processing solutions.
This course starts from scratch with installing Apache Spark and setting up and running Apache Kafka. It then introduces stream processing and shows how to work with files and directories. You will also explore Kafka serialization and deserialization for Spark and how to work with the Kafka Avro source. Finally, the course wraps up with streaming watermarks and outer joins.
By the end of this course, you will be able to design and develop big data engineering projects. You will be able to create real-time stream processing applications with Apache Spark. This course will also help you further your growth in real-time stream processing.
All resources and code files are placed here: https://github.com/PacktPublishing/Spark-Streaming-In-Scala
Create arbitrary streaming sinks
Explore Kafka Source and integrate Spark with Kafka
Learn stateless and stateful streaming transformations
Learn to handle memory problems with streaming joins
Learn to work with file streams
Explore windowing aggregates using Spark Streaming