Spark Project on Cloudera Hadoop (CDH) and GCP for Beginners
Last updated 4/2021
MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 KHz
Language: English | Size: 6.54 GB | Duration: 10h 54m
Building Data Processing Pipeline Using Apache NiFi, Apache Kafka, Apache Spark, Cassandra, MongoDB, Hive and Zeppelin
What you'll learn
Complete Spark Project Development on a Cloudera Hadoop and Spark Cluster
Fundamentals of Google Cloud Platform (GCP)
Setting up a Cloudera Hadoop and Spark Cluster (CDH 6.3) on GCP
Features of Spark Structured Streaming using Spark with Scala
Features of Spark Structured Streaming using Spark with Python (PySpark)
Fundamentals of Apache NiFi
Fundamentals of Apache Kafka
How to use NoSQL databases like MongoDB and Cassandra with Spark Structured Streaming
How to build Data Visualization using Python
Fundamentals of Apache Hive and how to integrate it with Apache Spark
Features of Apache Zeppelin
Fundamentals of Docker and Containerization
Requirements
Basic understanding of a programming language
Basic understanding of Apache Hadoop
Basic understanding of Apache Spark
No worries: Apache Hadoop and Apache Spark basics are covered for the benefit of absolute beginners
Most important of all, a willingness to learn
Description
In the retail business, stores and eCommerce websites generate large amounts of data in real time. This data needs to be processed in real time to generate insights that business users can act on to increase sales and deliver a better customer experience. Because the data is large and arrives continuously, we need to choose the right architecture, with scalable storage and computation frameworks. Hence we build a data processing pipeline using Apache NiFi, Apache Kafka, Apache Spark, Apache Cassandra, MongoDB, Apache Hive and Apache Zeppelin to generate insights from this data.

The Spark project is built using Apache Spark with Scala and PySpark on a Cloudera Hadoop (CDH 6.3) cluster running on Google Cloud Platform (GCP).

Apache Spark is an open-source unified analytics engine for large-scale data processing. Spark provides an interface for programming clusters with implicit data parallelism and fault tolerance.

Apache Kafka is a distributed event store and stream-processing platform. It is an open-source system developed by the Apache Software Foundation, written in Java and Scala. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds.

Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model.

A NoSQL (originally referring to "non-SQL" or "non-relational") database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases.
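The MapReduce programming model mentioned above can be illustrated in plain Python, independent of Hadoop: a map phase emits key-value pairs, a shuffle groups those pairs by key, and a reduce phase aggregates each group. This is a minimal single-process sketch of the classic word-count example, not Hadoop's distributed implementation; the sample input lines are invented for illustration:

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in records:
        for word in line.lower().split():
            yield (word, 1)

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as Hadoop does
    between the map and reduce stages."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: aggregate each group; here, sum the counts per word."""
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["spark streams retail data", "retail data drives decisions"]
counts = reduce_phase(shuffle_phase(map_phase(lines)))
print(counts["retail"], counts["data"])  # prints: 2 2
```

In Hadoop the same three stages run across many machines, with the framework handling data distribution and fault tolerance; Spark generalizes this model with in-memory RDD and DataFrame operations.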
Overview
Section 1: Introduction
Lecture 1 Course Introduction
Section 2: Big Data and Apache Hadoop Concepts
Lecture 2 Introduction to Big Data
Lecture 3 Introduction to Apache Hadoop
Lecture 4 Understanding Hadoop Distributed File System (HDFS) and MapReduce
Section 3: Apache Spark Concepts
Lecture 5 Introduction to Apache Spark
Lecture 6 Spark Architecture
Section 4: Environment Setup
Lecture 7 Workaround for setting up Cloudera CDH on GCP
Lecture 8 Environment Setup Overview
Lecture 9 Create Free Trial Account in Google Cloud Platform (GCP)
Lecture 10 Create VM instance using Compute Engine in GCP
Lecture 11 Setting Up Single Node Cloudera Hadoop CDH 6.3 Cluster in GCP
Lecture 12 Install Apache NiFi on Single Node CDH 6.3 Cluster
Lecture 13 Install Apache Kafka on Single Node CDH 6.3 Cluster
Lecture 14 Install Apache Cassandra on Single Node CDH 6.3 Cluster
Lecture 15 Install MongoDB on Single Node CDH 6.3 Cluster
Lecture 16 Install and Configure PyCharm Community Edition for PySpark Application
Lecture 17 Install & Configure IntelliJ Community Edition for Spark with Scala Application
Section 5: Apache Spark Practical using Spark with Scala and PySpark
Lecture 18 Resilient Distributed Datasets (RDD) Transformation Operations
Lecture 19 Resilient Distributed Datasets (RDD) Action Operations
Lecture 20 Spark DataFrame Operations
Lecture 21 Spark SQL Concepts with Hands-On
Section 6: Fundamentals of Apache NiFi
Lecture 22 Introduction to Apache NiFi
Lecture 23 Apache NiFi Core Terminologies
Lecture 24 Apache NiFi Concepts with Hands-On - Part 1
Lecture 25 Apache NiFi Concepts with Hands-On - Part 2
Section 7: Fundamentals of Apache Kafka
Lecture 26 Introduction to Apache Kafka
Lecture 27 Key Concepts in Apache Kafka
Lecture 28 Apache Kafka Architecture
Lecture 29 Kafka Producer with Hands-On
Lecture 30 Kafka Consumer with Hands-On
Section 8: Fundamentals of Apache Hive
Lecture 31 Introduction to Apache Hive
Lecture 32 Hive Table Concepts with Hands-On
Lecture 33 Hive Joins Concepts with Hands-On
Lecture 34 Partitioning and Bucketing Concepts in Hive with Hands-On
Section 9: Spark Project Development using Spark with Scala and PySpark on CDH 6.3 Cluster
Lecture 35 Project Architecture (Building Data Processing Pipeline)
Lecture 36 Generate Retail Data using Apache NiFi Data Pipeline (eCommerce Data Simulator)
Lecture 37 Spark Structured Streaming and Apache Kafka Integration
Lecture 38 Building Data Processing Pipeline with Spark Structured Streaming and Cassandra
Lecture 39 Building Data Processing Pipeline with Spark Structured Streaming and MongoDB
Lecture 40 Building Data Visualization using Python
Lecture 41 Project Demo
Lecture 42 How to Install Apache Zeppelin in CDH 6.3 Cluster
Lecture 43 Data Analysis using Spark SQL in Apache Zeppelin
Section 10: Bonus Tutorial
Lecture 44 Introduction to Docker
Lecture 45 Install Docker on Ubuntu Operating System
Lecture 46 Install Docker on Windows Operating System
Lecture 47 Docker Practical Tutorial
Who this course is for:
Beginners who want to learn the Apache Spark/Big Data project development process and architecture
Entry- and intermediate-level Data Engineers and Data Scientists
Data Engineering and Data Science aspirants
Data enthusiasts who want to learn how to develop and run a Spark application on a CDH cluster
Anyone who is willing to become a Big Data/Spark developer
Homepage
https://www.udemy.com/course/spark-project-on-cloudera-hadoop-cdh-and-gcp-for-beginners/