Overview
Sapphire Global Spark and Scala Training helps you become proficient in Spark and Scala. Enroll now for Spark and Scala online training and learn how Spark processes data by keeping the working set in memory.
Course Curriculum
Key features
This course will prepare you to:
- Explain the architecture of Spark and Scala.
- Configure and use new functionality in Spark and Scala.
Dive into Scala
- What is Scala
- Setup and configuration of Scala
- Develop and run basic Scala Programs
- Scala operations
- Functions and procedures in Scala
- Different Scala APIs for common operations
- Loops and collections: Array, Map, List, Tuple
- Pattern matching for advanced operations
- Eclipse with Scala
Object Oriented and Functional Programming
- Introduction to object oriented programming
- Core OOP concepts
- Constructor, getter, setter, singleton, overloading and overriding
- Nested Classes, Visibility Rules
- Functional Structures
- Functional programming constructs
- Call by Name, Call by Value
Big Data and need for Spark
- Introduction to Big Data
- Challenges with old Big Data solutions
- Batch vs. real-time vs. in-memory processing
- MapReduce and its limitations
- Apache Storm and its limitations
- Need for general purpose solution – Apache Spark
Deep Dive in Apache Spark
- What is Apache Spark?
- Internals of Spark architecture
- Apache Spark design principles
- Spark features and characteristics
- Apache Spark ecosystem components and their roles
Deploy Spark in Local mode
- Setup of Spark Environment
- Install and configure prerequisites
- Installation of Apache Spark in local mode
- Work with Spark in local mode
- Troubleshooting common problems
Deploy Apache Spark in different modes
- Installation of Spark in standalone mode
- Installation of Spark in YARN mode
- Installation & configuration of Spark on a real multi-node cluster
- Play with Spark in cluster mode
- Best practices for Spark deployment
Demystify Apache Spark
- Play with Spark shell
- Execute Scala and Java statements in shell
- Understand SparkContext and the driver
- Read data from local filesystem
- Integrate Spark with HDFS
- Cache the data in memory for further use
- Distributed persistence
- Testing and troubleshooting
Basic Abstraction: RDDs
- What is RDD in Spark
- How RDDs make Spark a feature-rich framework
- Transformations in Apache Spark RDDs
- Spark RDD actions and persistence
- Lazy operations in Spark: transformations and caching
- Fault tolerance in Spark
- Load data and create RDD in Spark
- Persist RDD in memory or disk
- Pair operations and key-value in Spark
- Spark Integration with Hadoop
- Apache Spark practicals and workshops
Spark Streaming
- Need for stream analytics
- Comparison with Storm and S4
- Real-time data processing using Spark Streaming
- Fault tolerance and checkpointing
- Stateful Stream Processing
- DStream and window operations
- Spark Streaming execution flow
- Connection to various source systems
- Performance optimizations in Spark
Spark SQL
- What is Spark SQL
- Apache Spark SQL Features and Data flow
- Spark SQL architecture and components
- Hive and Spark SQL together
- Play with DataFrames and Datasets
- Data loading techniques in Spark
- Hive Queries through Spark
- Various Spark SQL DDL and DML operations
- Performance tuning in Spark
Spark MLlib and Spark GraphX
- Need for Machine Learning
- Introduction to Spark machine learning
- Various Spark ML libraries
- Algorithms for clustering, statistical analytics, classification, etc.
- Introduction to GraphX
- Need for a different graph processing engine
- Graph handling using Apache Spark
Real Life Spark Project
- Live Apache Spark project based on real industry scenarios. Work on a real-life use case and solve a real-world problem with live datasets.
Practice Test and Interview Questions