Benefit

Establish a reproducible, centralized development environment for R and Python.

Challenge

Decentralized data science initiatives in an international corporation without a common standardized development environment.

Toolset

RStudio products, Kubernetes, IAM, Terraform, Ansible


A centralized development environment for distributed data science initiatives at Covestro AG

Challenge

Covestro, the leading German manufacturer of polymer materials, is pushing ahead with digitalization and related initiatives in data science and AI. To advance these initiatives, a common, standardized development environment was required. At an internationally operating enterprise like Covestro, data science is driven forward in a decentralized manner by different departments and teams.

This decentralization complicated development work and led to high administrative effort as well as compliance problems. In addition, the differing environments posed challenges for the data scientists, because the compatibility of the developed products within the company could not be guaranteed.

Goal

Covestro wants to provide its data scientists with a centralized development environment for R and Python in order to reduce their administrative effort and promote productive work. Furthermore, the new analysis infrastructure should be scalable and replicable.

Solution

As part of eoda | analytic infrastructure consulting, eoda supports Covestro from the initial design of the architecture (see below) through implementation to the ongoing operation of the analysis environment.

RStudio products form the core of the infrastructure. These include RStudio Workbench for development, RStudio Connect for sharing and deploying applications, and RStudio Package Manager for package management. In addition, a Kubernetes backend takes over the compute workloads, which allows the environment to scale horizontally. The new analytics environment integrates with the existing AWS infrastructure at Covestro.

Moreover, the existing management tools, such as Identity and Access Management (IAM), continue to serve as the central administration instance in the company, and the new environment does not generate high additional costs. To meet the reproducibility requirement, the infrastructure was scripted with Terraform and Ansible. This infrastructure-as-code approach ensures that the setup and configuration of the environment are transparent and can be carried out quickly.
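The idea behind an infrastructure-as-code approach can be illustrated with a minimal Python sketch. It is not Terraform or Ansible code and does not reflect the actual Covestro setup. The resource names and settings are hypothetical and chosen only for illustration. The sketch shows the underlying principle: the desired environment is described as data, a plan lists the differences to the current state, and applying the plan a second time changes nothing.

```python
# Minimal sketch of the declarative principle behind infrastructure as code.
# All resource names and settings below are hypothetical examples.

DESIRED = {
    "workbench": {"kind": "service", "replicas": 2},
    "connect": {"kind": "service", "replicas": 1},
    "package-manager": {"kind": "service", "replicas": 1},
}


def plan(desired, actual):
    """List the (action, name) steps needed to turn `actual` into `desired`."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply(desired, actual):
    """Return a new state in which every planned step has been carried out."""
    new_state = dict(actual)
    for action, name in plan(desired, actual):
        if action == "delete":
            del new_state[name]
        else:
            new_state[name] = dict(desired[name])
    return new_state


if __name__ == "__main__":
    # Current state: one outdated resource and one that is no longer wanted.
    current = {
        "workbench": {"kind": "service", "replicas": 1},
        "legacy-server": {"kind": "vm"},
    }
    print(plan(DESIRED, current))
    current = apply(DESIRED, current)
    # A second plan is empty: the same description always yields the same setup.
    print(plan(DESIRED, current))
```

Because the description is plain data, it can be kept under version control and reviewed, which is what makes a setup of this kind transparent and repeatable.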

Result

With the help of eoda, a reproducible, centralized development environment for R and Python was created. In addition to facilitating compliance, the central analysis environment ensures more efficient collaboration in the context of Covestro’s decentralized working model.

Source: Covestro AG
