Data Engineer — resources

roadmap.sh: https://roadmap.sh/data-engineer

Books

  • Fundamentals of Data Engineering (Joe Reis & Matt Housley) — the canonical lifecycle-first overview every data engineer should start with.
  • Designing Data-Intensive Applications (Martin Kleppmann) — deep foundations on storage, distribution, and consistency behind every data system.
  • The Data Warehouse Toolkit (Ralph Kimball & Margy Ross) — definitive guide to dimensional modeling, star schemas, and slowly changing dimensions (SCDs).
  • Streaming Systems (Tyler Akidau, Slava Chernyak & Reuven Lax) — the reference for event-time, windowing, and batch/stream unification.

Courses / practice

  • DataTalksClub Data Engineering Zoomcamp — free, hands-on end-to-end course covering Docker, Terraform, workflow orchestration, BigQuery, dbt, Spark, and Kafka.
  • Databricks Academy — practical Spark, Delta Lake, and lakehouse training with free self-paced tracks.
  • dbt Learn — official tutorials and certification path for analytics engineering and transformation modeling.