Course

Fundamentals of Data Engineering

Faculty
Commerce & Business Administration
Department
Computing Studies & Information Systems
Course Code
CSIS 3600
Credits
3.00
Semester Length
15 Weeks
Max Class Size
35
Method(s) Of Instruction
Lecture
Seminar
Course Designation
None
Industry Designation
APICS
Typically Offered
To be determined

Overview

Course Description
This course covers the concepts, systems, processes, practices, and tools of data engineering. Students will learn how to select and configure suitable infrastructure for various data engineering use cases, and how to develop, implement, and manage pipelines for ingesting data from different data sources. Students will also learn how to securely provision ingested data to downstream data consumers. Throughout the course, contemporary data engineering tools will be used for hands-on class demonstrations, exercises, and projects.
Course Content
  1. Introduction to data engineering
    • Data engineering principles
    • Data engineering lifecycle
  2. Data engineering infrastructure
    • Data infrastructure, including cloud infrastructure services, such as those
      provided by Amazon, Google, and Microsoft
    • Modern data architecture
    • Data infrastructure strategy
  3. Building data pipelines
    • Data pipeline patterns and types of data pipelines
    • Building batch data pipelines with tools such as Apache NiFi and Apache Airflow
    • Building streaming data pipelines with tools such as Apache Kafka or Amazon Kinesis
    • Integrating batch and streaming data pipelines (e.g., mini-batch data streams)
  4. Managing data pipelines
    • Orchestrating data pipelines with orchestration tools, such as Apache Airflow
    • Handling changes in source systems and broken data pipelines
    • Monitoring and measuring pipeline performance
  5. Provisioning data for downstream data consumers
    • Cleaning and transforming data
    • Data validation
    • Serving data for downstream data consumers
    • Managing data security and privacy
    • Data governance
Learning Activities

Lectures, seminars, demonstrations, and hands-on exercises/projects

Means of Assessment

Assessment will be based on course objectives and will be carried out in accordance with the Douglas College Evaluation Policy.

Labs                    0-10%
Project(s)              15-25%
Midterm Examination*    30-35%
Final Examination*      30-40%
Total                   100%

 Some of these assessments may involve group work.

 * Practical hands-on computer exam
 
To pass the course, students must receive an overall course grade of at least 50% and must also achieve a grade of at least 50% on the combined weighted examination components (including quizzes, tests, and exams).
Learning Outcomes

At the end of this course, the successful student will be able to:

• Explain data engineering concepts and processes.
• Depict and describe the data engineering lifecycle.
• Evaluate trade-offs among data engineering techniques and design alternatives within the context of specific data engineering application domains.
• Select, install, and/or configure suitable data engineering infrastructure (e.g., cloud infrastructure services, such as those provided by Amazon, Google, and Microsoft) and tools for various data engineering use cases.
• Build working data pipelines to ingest data from various sources.
• Manage data pipelines for optimal performance.
• Clean, transform, and validate messy ingested data.
• Securely make data available for downstream data consumers.

Textbook Materials
  • Reis, Joe and Housley, Matt. Fundamentals of Data Engineering: Plan and Build Robust Data Systems, O’Reilly Media. Latest edition.
  • Crickard, Paul. Data Engineering with Python: Work with massive datasets to design data models and automate data pipelines using Python, Packt Publishing. Latest edition.
  • Custom courseware, class notes provided by the instructor, and online resources or other textbooks as approved by the department.

Requisites

Course Transfers

These are for current course guidelines only. For a full list of archived courses please see https://www.bctransferguide.ca

Institution Transfer Details for CSIS 3600
There are no applicable transfer credits for this course.

Course Offerings

Summer 2024