Data Engineering on GCP

Course Overview

This course will show participants how to design data processing systems, build end-to-end data pipelines, analyze data, and implement machine learning on Google Cloud. It covers structured, unstructured, and streaming data. At the end of this course, participants will be able to:

• Design and build data processing systems on Google Cloud
• Leverage unstructured data using Spark and ML APIs on Cloud Dataproc
• Process batch and streaming data by implementing autoscaling data pipelines on Cloud Dataflow
• Derive business insights from extremely large datasets using Google BigQuery
• Train, evaluate, and predict with machine learning models using TensorFlow and Cloud ML
• Enable instant insights from streaming data

Audience Profile

This course is intended for experienced developers who are responsible for managing big data transformations, including:

• Extracting, loading, transforming, cleaning, and validating data
• Designing pipelines and architectures for data processing
• Creating and maintaining machine learning and statistical models
• Querying datasets, visualizing query results and creating reports


Participants should have:

• Completed the Google Cloud Big Data and Machine Learning Fundamentals course, or equivalent experience
• Basic proficiency with a common query language such as SQL
• Experience with data modeling and extract, transform, load (ETL) activities
• Experience developing applications using a common programming language such as Python
• Familiarity with basic statistics

Course Outline

  • Introduction to Data Engineering
  • Building a Data Lake & Data Warehouse
  • Introduction to Building Batch Data Pipelines
  • Executing Spark on Cloud Dataproc
  • Serverless Data Processing with Cloud Dataflow
  • Manage Data Pipelines with Cloud Data Fusion and Cloud Composer
  • Introduction to Processing Streaming Data
  • Serverless Messaging with Cloud Pub/Sub
  • Cloud Dataflow Streaming Features
  • High-Throughput BigQuery and Bigtable Streaming Features
  • Advanced BigQuery Functionality and Performance
  • Introduction to Analytics and AI
  • Prebuilt ML model APIs for Unstructured Data
  • Big Data Analytics with Cloud AI Platform Notebooks
  • Production ML Pipelines with Kubeflow
  • Custom Model Building with SQL in BigQuery ML & Cloud AutoML


Code: GDE2
Duration: 4 days
Full Price: RM 7,200
Early Bird Price: RM 6,400
