Training: Data science with Spark
Do you work with complex machine learning models and large amounts of data? Then Apache Spark, a cluster computing engine with in-memory processing, can give you a performance boost. Spark lets you query data even in big data environments, and its machine learning libraries and numerous interfaces have made it one of the leading analytics technologies.
Whether you are consolidating different data sources, running interactive analyses, or handling real-time data, Spark processes large amounts of data quickly and in parallel, which makes it well suited to even complex machine learning algorithms.
Our Spark training: Introduction to Data Science
In our training course, we teach you the basics of working with Spark and focus on how Spark interacts with the data science languages Python and R. We recommend this course for experienced Python and R users and beginners alike.
- Reading and repartitioning data on a Spark cluster
- Introduction to data management
- Exchanging data between local R/Python sessions and cluster operations
- Introduction to machine learning with Spark
Recommended course length: One day
Date and time
You can book this course as individual training on a date of your choice.
We can conduct this training at your site or remotely.
Please contact us; we will be pleased to send you an individual offer for this training.