Apache Spark is an open-source distributed general-purpose cluster-computing framework. Originally developed at the University of California, Berkeley's AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since.
The following are the things covered under Spark.
Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs.
Security in Spark is OFF by default. This could mean you are vulnerable to attack by default.
Spark comes with several sample programs. Scala, Java, Python and R examples are in the examples/src/main directory. To run one of the Java or Scala sample programs, use bin/run-example <class> [params] in the top-level Spark directory.
The Spark cluster mode overview explains the key concepts in running on a cluster. Spark can run both by itself, or over several existing cluster managers.
customize Spark via its configuration system
processing structured data streams with relation queries (using Datasets and DataFrames, newer API than DStreams)
The following are the course contents offered for Spark
Download Spark course plan