A High-Level Paradigm for Reliable Large-scale Server Software

Description

The exponential growth in the number of cores requires radically new software development technologies. Many expect 100,000-core platforms to become commonplace, and the best predictions are that core failures on such an architecture will be common, perhaps one per hour. Hence we require programming models that are not only highly scalable but also reliable.

The project aim is to scale the radical concurrency-oriented programming paradigm to build reliable general-purpose software, such as server-based systems, on massively parallel machines. The trend-setting language we will use is Erlang/OTP, which has concurrency and robustness designed in. Currently Erlang/OTP has inherently scalable computation and reliability models, but in practice scalability is constrained by aspects of the language and virtual machine. Moreover, existing profiling and debugging tools don't scale.

The RELEASE consortium is uniquely qualified to tackle these challenges, and we propose to work at three levels:
  • evolving the Erlang virtual machine so that it can work effectively on large-scale multicore systems;
  • evolving the language to Scalable Distributed (SD) Erlang, and adapting the OTP framework to provide both constructs such as locality control and reusable coordination patterns, so that SD Erlang can effectively describe computations on large platforms while preserving performance portability;
  • developing a scalable Erlang infrastructure to integrate multiple, heterogeneous clusters.

We will develop state-of-the-art tools that allow programmers to understand the behaviour of massively parallel SD Erlang programs. We will demonstrate the effectiveness of the RELEASE approach using demonstrators and two large case studies on a Blue Gene.

Erlang is a beacon language for distributed computing, influencing other languages as well as actor libraries and frameworks. Hence we expect the project to make a strong and enduring impact on computing practice in the next two decades.

KEY DATES
  • Status: Completed
  • Project launch: 01 October 2011
  • Project completed: 28 February 2015
ICT server software core