Exploring the term Data Engineering opens the door to the vast range of opportunities and roles available today, all with the overarching goal of optimising the use of data in day-to-day business operations. A simple internet search for “what is data engineering?” returns many posts, each offering its own understanding of Data Engineering, with some variation.


What is clear, however, is that Data Engineering encompasses the many considerations involved in optimally curating, transforming, securing and disseminating data so that it is suitable for analysis. As technology and tools have become more advanced, building such a platform and its infrastructure requires engineers and architects with both general and specialised expertise. The Data Engineer combines knowledge of areas such as software development, infrastructure, data architecture, data warehousing, cloud technology and data cleaning to design, build and test the data pipelines that run throughout the enterprise, making the data accessible to the organisation. [1] [2] [3]
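As a rough illustration of the curate, transform, secure and disseminate stages described above, the following minimal Python sketch chains them into a single pipeline. It is not taken from any of the cited sources: the record layout, the field names (`customer_email`, `amount`, `customer_id`) and the function names are all hypothetical, and the unsalted hash stands in for whatever protection a real platform would apply.

```python
import hashlib

# Hypothetical raw records; the field names are illustrative assumptions.
RAW_RECORDS = [
    {"customer_email": "Ann@Example.com ", "amount": "12.50"},
    {"customer_email": "", "amount": "3.00"},  # incomplete: no email
    {"customer_email": "bob@example.com", "amount": "7"},
]


def curate(records):
    """Keep only records whose required fields are all present."""
    return [r for r in records if r["customer_email"].strip() and r["amount"]]


def transform(records):
    """Normalise the email text and convert amounts to numbers."""
    return [
        {
            "customer_email": r["customer_email"].strip().lower(),
            "amount": float(r["amount"]),
        }
        for r in records
    ]


def secure(records):
    """Replace the email with a SHA-256 digest so it is not exposed.

    An unsalted hash is used here for illustration only; it is not a
    recommendation for protecting real personal data.
    """
    return [
        {
            "customer_id": hashlib.sha256(
                r["customer_email"].encode("utf-8")
            ).hexdigest(),
            "amount": r["amount"],
        }
        for r in records
    ]


def run_pipeline(records):
    """Chain the stages; the result is what would be disseminated."""
    return secure(transform(curate(records)))


ANALYSIS_READY = run_pipeline(RAW_RECORDS)
```

Keeping each stage as a separate function means a stage can be tested, replaced or rerun on its own, which is one simple way to reduce the manual, repetitive work a pipeline would otherwise involve.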

Optimised Data Engineering appropriately balances the efficiency of an automated process against the cost of developing and maintaining that process, so that repetitive tasks requiring humans to write code, press keys, cut and paste, or update documents are minimised or eliminated.


[1] Aghabozorgi, Saeed, and Polong Li. “Data Scientist vs Data Engineer, What’s the Difference?” Cognitive Class Blog, 2016, https://cognitiveclass.ai/blog/data-scientist-vs-data-engineer/.

[2] Paruchuri, Vik. “Data Engineering Series.” Dataquest, 15 Dec. 2017, www.dataquest.io/blog/what-is-a-data-engineer/.

[3] Schulz, Wade. “Baikal – Implementing and Deploying Clinical Models with a Real-Time Data Lake.” PhUSE SDE. Focus on the Patients - Bridging Data to Solutions, 26 July 2018, Ridgefield, Boehringer Ingelheim.