Using automation to improve the maintenance and reusability of infrastructure is now the norm. However, automating big data applications poses unique challenges, because it requires deeper integration with the applications under management through their APIs. Hence, the automation framework is not just a collection of scripts but a fully fledged application that controls other applications: the meta-application.
The talk will describe how Bigstep automatically controls complex application clusters using the Bigstep DataLab service, and the lessons we learned along the way. While not everybody needs to scale Spark clusters automatically, the way we approached certain aspects can help operators of smaller deployments and application developers choose the best architecture and tools for the job. We will cover Mesos, Marathon, and Docker, and how they can help build robust, scalable, and easy-to-maintain software-defined infrastructures.