Databricks for Data Engineering

Type:

  • Webcast

Topic(s):

  • Databricks
  • Data Engineering
  • Data Transformation

Whether you’re looking to transform and clean large volumes of data or collaborate with colleagues to build advanced analytics jobs that can be scaled and run automatically, Databricks offers a Unified Analytics Platform that promises to make your life easier.

Built by the team that created Apache Spark, and backed by strong partnerships with both Microsoft Azure and AWS, Databricks is designed to take the pain out of managing a cloud-scale analytics platform, letting you focus on valuable analysis.

In the second of two webcasts, Thorogood consultants Jon Ward and Robbie Shaw introduce the vital role Databricks can play in your organization’s cloud data architecture as the primary tool for data transformation.

They showcase Databricks’ data transformation and data movement capabilities, show how the tool aligns with cloud computing services, and highlight its security, flexibility, and collaboration features. They also look at Databricks Delta Lake and how it offers improved storage for both large-scale datasets and real-time streaming data.


Across these two webcasts, we have looked at two key use cases for Databricks:

In August’s part one, we used demos based on real customer use cases to introduce some of Databricks’ key features for data science, such as the ability to automatically scale analysis based on the workload, and the option to switch between SQL, R, and Python depending on the task at hand.

We looked at how Databricks MLflow supports the analytics project lifecycle, and considered how you can use it in combination with other tools to automate analytics and present outputs to the key decision-makers in your organization.

Watch a recording of part one here