Apache Spark Table of Contents


Getting Started

  • The Course Overview
  • Setting Up an AWS Account
  • Launching a Spark Cluster on EC2
  • Setting Up Your Environment
  • Running a Test Application

Working with RDDs

  • Creating RDDs
  • Actions
  • Transformations
  • Joins, Set, and Numeric Operations
  • Shared Variables

DataFrames

  • Installing Jupyter Notebook
  • RDDs and DataFrames
  • DataFrame Row Operations
  • DataFrame Column Operations
  • DataFrame Manipulation

Spark SQL

  • Views
  • Schemas
  • SQL Operations
  • I/O Options
  • Hive

Machine Learning Fundamentals

  • Basic Statistics
  • Pipelines
  • Feature Extractors
  • Feature Transformers
  • Feature Selectors

Machine Learning Models

  • Classification
  • Regression
  • Clustering
  • Collaborative Filtering
  • Model Selection and Tuning

Streaming

  • DStreams
  • DStream Window Operations
  • Structured Streaming
  • Window Operations
  • Joining Batch and Streaming Data


Apply for certification

https://www.vskills.in/certification/big-data/apache-spark-certificate
