Applied Data Engineering
Course Overview
This course teaches data engineering concepts, techniques, and applications, including data preparation, storage and warehousing, analysis, mining, and visualization.
Upon completion of this course, participants will be familiar with the relevant components of Big Data. They will also be able to perform data analysis, build real-time stream analytics, carry out complex data management tasks, build apps, and create visualizations on top of data.
Audience Profile
This course prepares participants for their data engineering journey and sets them on the path to becoming Data Scientists. To get the most out of this course, participants should preferably have some knowledge of Python, SQL, HTML, or JavaScript.
Course Outline
- Introduction to Big Data
- ETL/ELT Best Practices
- Managing Metadata
- Consolidating Multiple Data Sources
- Data Cleansing & Transformation
- Scheduling Data Refresh
- Introduction to Hadoop
- Ingesting Flat Files into Hadoop
- Integration between RDBMS & Hadoop
- Data Processing using Hive
- Interactive Query using Impala
- Processing Log Files
- Collecting External Data from the Internet
- Introduction to Spark
- Processing Data using Spark
- Querying Data in Spark
- Real-Time Data Processing using Spark Streaming
- Job Orchestration & Workflow using Oozie
- Troubleshooting an ETL Job
- Performance Optimization
- Developing Data Visualizations using D3.js
COURSE INFORMATION
Code | ADE |
Duration | 5 Days |
Full Price | RM 7,200 |
Early Bird Price | RM 6,500 |