Training: Data science with Spark

Do you work with complex machine learning models and large amounts of data? Then Apache Spark, a cluster computing engine that performs its calculations in memory, can give you a significant performance boost. Spark lets you run data queries even in big data environments, and its machine learning libraries and numerous interfaces make it one of the leading analytics technologies.

Whether you are consolidating different data sources, running interactive analyses, or working with real-time data, Spark processes large amounts of data quickly and in parallel, which also supports complex machine learning algorithms.


Our Spark training: Introduction to Data Science

In this training course we teach you the basics of working with Spark, with a focus on how Spark interacts with the data science languages Python and R. We recommend the course for experienced Python and R users and beginners alike.


Course Content

  • Reading and repartitioning data on a Spark cluster
  • Introduction to data management
  • Exchange between local R/Python sessions and cluster operations
  • Introduction to machine learning with Spark



Recommended course length: One day

Date and time

You can book this course as individual training on a date of your choice.


We can hold this training at your site or remotely.


Please contact us; we will be pleased to send you an individual offer for this training.

Get started now

    Meltem Hekim

    Contact Data Science Training

    Tel. +49 561 87948-370