This section provides an overview of the various educational methods for learning more about data engineering. Members of our project group collect interesting material on a Pinterest pinboard about data engineering. The group then reviews this material and sorts it into one of the following clusters.
For our working group, we will search for existing videos on the following video platforms: YouTube, Vimeo, and TED. This list is a starting point and might grow over time. In our PhUSE repository, we will embed the videos we find and introduce them with comments and metadata.
Examples of videos we have already found as introductions to data engineering:
Introduction to Graph Databases (contains explicit language)
This video gives a great introduction to graph databases. It is a recording from the Nodevember conference in 2016. The speaker, Ed Finkler, explains some of the basic maths behind graph databases and the difference between relational databases and graph databases. He also presents reasons why one should use a graph database over a relational database or another NoSQL database. The video is about 30 minutes long and worth a look.
Data Engineering and Data Science: bridging the gap
Josh Wills is the Head of Data Engineering. Earlier in his career he also led data science teams. In his talk at DataEDGE 2016, he discusses how we can bridge the gap between data engineering and data science. The video is about 30 minutes long.
Textbooks or articles
For our working group, we will search the PhUSE archive, other conference and scientific archives, and textbooks. This list is a starting point and might grow over time. In our PhUSE repository, we will embed the articles and textbooks we find and introduce them with comments and metadata.
Examples of textbooks and articles we have already found as introductions to data engineering:
Designing Data-Intensive Applications
This book is primarily about the architecture of data systems and the ways they are integrated into data-intensive applications. It describes many of the technologies used in data engineering. It is not written at an entry level, though, and requires prior knowledge of programming languages and system applications.
Many data engineers regularly post about their work. We follow a lot of blogs and share relevant material here. As with the other clusters, we will embed the blogs we find and introduce them with comments and metadata.
A Beginner's Guide to Data Engineering
Robert Chang wrote this brilliant introduction to data engineering. It consists of three parts: a high-level introduction, a part with more technical detail, and a final part on more advanced data engineering topics. Together they give a very good introduction to the topic. The series is published on Medium, a portal where authors share ideas and perspectives and where you can always find fresh thinking.
Data Science 101
Data Science 101 is a blog about general data science topics, and it also contains some posts about data engineering. If you want to start with a high-level introduction to the difference between data science and data engineering, this introductory post is worth a read.
Social Media Channels
Several eLearning platforms also offer courses on data engineering. We will regularly look for new training courses on platforms such as Coursera or Udacity. PhUSE and this working group accept no liability for these courses. These professional courses may also incur costs, which are not covered by our working group.
Executive Data Science - Johns Hopkins University on Coursera.org
The course “Executive Data Science” from Johns Hopkins University, which is available on Coursera, teaches how to build and develop a team with complementary skill sets and roles in the area of data science. It describes the structure of the data science pipeline, explains how to assemble and lead a data science enterprise, and helps you understand strategies for keeping your team on target. The target audience for this course is group leads and managers of data science teams.