Making Scientific Software Installation Reproducible On ... ?· Making Scientific Software Installation…

  • Published on
    09-Sep-2018

  • View
    215

  • Download
    3

Transcript

Making Scientific Software Installation ReproducibleOn Cray Systems Using EasyBuildPetar ForaiResearch Institute ofMolecular PathologyDr Bohrgasse 7A-1030 Vienna, Austriapetar.forai@imp.ac.atKenneth HosteGhent UniversityKrijgslaan 281, S9B-9000 Ghent, Belgiumkenneth.hoste@ugent.beGuilherme Peretti-PezziSwiss NationalSupercomputing CentreVia Trevano, 1316900 Lugano, Switzerlandperetti@cscs.chBrett BodeNational Center forSupercomputing ApplicationsUniversity of Illinois1205 W. Clark St.Urbana, IL 61801brett@illinois.eduABSTRACTCray provides a tuned and supported OS and programmingenvironment (PE), including compilers and libraries inte-grated with the modules system. While the Cray PE isupdated frequently, tools and libraries not in it quickly be-come outdated. In addition, the amount of tools, librariesand scientific applications that HPC user support teams areexpected to provide support for is increasing significantly.The uniformity of the software environment across Cray sitesmakes it an attractive target for to share this ubiquitousburden, and to collaborate on a common solution.EasyBuild is an open-source, community-driven frameworkto automatically and reproducibly install (scientific) soft-ware on HPC systems. This paper presents how EasyBuildhas been integrated with the Cray PE, in order to leveragethe provided optimized components and support an easyway to deploy tools, libraries and scientific applications onCray systems.We will discuss the changes that needed to be made to Easy-Build to achieve this, and outline the use case of providingthe Python distribution and accompanying Python packageson top of the Cray PE, to obtain a fully featured batter-ies included optimized Python installation that integrateswith the Cray-provided software stack. In addition, we willoutline how EasyBuild was deployed at the Swiss NationalSupercomputing Centre CSCS and how it is leveraged toobtain a consistent software stack and installation workflowacross multiple Cray (XC, CS-Storm) and non-Cray HPCsystems.1. INTRODUCTIONThe job of High-Performance Computing (HPC) supportteams is to enable the productive and efficient use of HPCresources. This ranges from helping out users by resolvingissues like login problems or unexpected software crashes,providing detailed answers to both simple and deeply tech-nical questions, installing requested software libraries, toolsand applications, to providing services such as performanceanalysis and optimization of scientific software being devel-oped.One particularly time-consuming task for HPC support teamsis installing (scientific) software. Due to the advanced na-ture of supercomputers (i.e., multiple multi-core processors,high performance network interconnect, need for most re-cent compilers and libraries, etc.), compiling software fromsource on the actual operating system and for the systemarchitecture that it is going to be used on is typically highlypreferred, if not required, as opposed to using readily avail-able binary packages that were built in a generic way.Support teams at HPC sites worldwide typically invest largeamounts of time and manpower tackling this tedious andtime consuming task of installing the (scientific) softwaretools, libraries, and applications that researchers are re-questing, while trying to maintain a coherent software stack.EasyBuild is a recent software build and installation frame-work that intends to relieve HPC support teams from theubiquitous burden of installing scientific software on HPCsystems. It supports fully automating the (often complex)installation procedure of scientific software, and includes fea-tures specifically targeted towards HPC systems while pro-viding a flexible yet powerful interface. As such, it hasquickly grown to become a platform for collaboration be-tween HPC sites worldwide. We discuss EasyBuild in moredetail in Section 2.The time-consuming problem of getting scientific softwareinstalled also presents itself on Cray systems, despite theextensive programming environment that Cray usually pro-vides. Both HPC support teams and end users of Craysystems struggle on a daily basis with getting the requiredtools and applications up and running (let alone doing soin a somewhat optimal way). While a policy for providingsoftware installations in a consistent way to end users is acommon goal, the reality more often than not is that short-cuts are being taken to speed up the process, resulting in asoftware stack that is organised in a suboptimal (and some-times outright confusing) manner, with little consistency inthe software stacks provided on different systems beyondwhat is provided by Cray. Hence, it is clear that EasyBuildcould also be useful on Cray systems.Originally, the main target of EasyBuild was the standardGNU/Linux 64-bit x86 system architecture that is common-place on todays HPC systems. On these systems there typ-ically is an abundant lack of a decent (recent) basic softwarestack, i.e., compilers and libraries for MPI, linear algebra,etc., to build scientific applications on top of. EasyBuild wasdesigned to deal with this, by supporting the installationof compilers and accompanying libraries, and through thetoolchain mechanism it employs (see Section 2.2.4). As such,it was not well suited to Cray systems, where a well equippedprogramming environment is provided that is highly recom-mended to be used.In this paper, we present the changes that had to made toEasyBuild to provide stable integration with the availableprogramming environment on Cray systems (see Section 3).In addition, we outline the use case of installing Python andseveral common Python libraries using EasyBuild on Craysystems (Section 4), and discuss how EasyBuild deployedat the Swiss National Supercomputing Centre CSCS (Sec-tion 5).2. EASYBUILDEasyBuild [1719] is a tool for building and installing scien-tific software on HPC systems, implemented in Python. Itwas originally created by the HPC support team at GhentUniversity (Belgium) in 2009, and was publicly released in2012 under an open-source software license (GPLv2). Soonafter, it was adopted by various HPC sites and an activecommunity formed around it. Today, it is used by institu-tions worldwide that provide HPC services to the scientificresearch community (see Section 2.4).2.1 GoalsThe main goal of EasyBuild is to fully automate the taskof building and installing (scientific) software on HPC sys-tems, including taking care of the required dependencies andgenerating environment module files that match the instal-lations. It aims to be an expert system with which all as-pects of software installation procedures can be performedautonomously, adhering to the provided specifications.In addition, EasyBuild serves as a platform for collaborationwhere different HPC support teams, who likely apply differ-ent site policies concerning scientific software installations,can efficiently work together to implement best practices andbenefit from each others expertise.It aims to support achieving reproducibility in science, byenabling to easilyreproduce software installations that wereperformed previously. It allows sharing of installation recipesin order to enable others to perform particular software in-stallations. Several features are available in EasyBuild tofacilitate that; see also Section 2.3.5.EasyBuild intends to be flexible where needed so that HPCsupport teams can implement their own site policies, rangingfrom straightforward aspects like the installation prefix forsoftware and module files, to the module naming schemeand the particular compiler and libraries being employed;see also Section 2.3.6.2.2 TerminologyBefore going into more detail, we introduce some terminol-ogy specific to EasyBuild that will be used throughout thispaper.2.2.1 EasyBuild frameworkThe EasyBuild framework is the core of the tool that pro-vides the functionality that is commonly needed for buildingand installing scientific software on HPC systems. It consistsof a collection of Python modules that implement: the eb command line interface an abstract software installation procedure, split upinto different steps including configuring, building, test-ing, installing, etc. (see Section 2.3.1) functions to perform common tasks like downloadingand unpacking source tarballs, applying patch files, au-tonomously running (interactive) shell commands andcapturing output & exit codes, generating module files,etc. an interface to interact with the modules tool, to checkwhich modules are available, to load modules, etc. a mechanism to define the environment in which theinstallation will be performed, based on the compiler,libraries and dependencies being used (see also Sec-tion 2.2.4)2.2.2 EasyblocksThe implementation of a particular software installation pro-cedure is done in an easyblock, a Python module that defines,extends and/or replaces one or more of the steps of the ab-stract procedure defined by the EasyBuild framework. Easy-blocks leverage the supporting functionality provided by theEasyBuild framework, and can be viewed as plugins.A distinction is made between software-specific and genericeasyblocks. Software-specific easyblocks implement a proce-dure that is entirely custom to one particular software pack-age (e.g., OpenFOAM), while generic easyblocks implementa procedure using standard tools (e.g., CMake, make).Each easyblock must define the configuration, build and in-stall steps in one way or another; the EasyBuild frameworkleaves these steps purposely unimplemented since their im-plementation heavily depends on the tools being used in theinstallation procedure. Since easyblocks are implemented inan object-oriented scheme, the step methods implementedby a particular easyblock can be reused in others throughinheritance, enabling code reuse across easyblocks.For each software package being installed, the EasyBuildframework will determine which easyblock should be used,based on the value of the easyblock parameter, or the nameof the software package if no easyblock is specified.2.2.3 Easyconfig filesEasyconfig files contain sets of key-value definitions for theparameters that are supported by the EasyBuild framework,which are also referred to as easyconfig parameters.They specify what software package and version should beinstalled, which compiler toolchain should be used to per-form the installation (see also Section 2.2.4), and which spe-cific versions of the required dependencies should be madeavailable. In addition, they provide some basic metadata(short description, project homepage, . . . ), allow to specifycustom settings for the build, and so on.An example of an easyconfig file is shown in Listing 3.2.2.4 ToolchainsA compiler toolchain, or simply toolchain for short, is a set ofcompilers (typically for C, C++, & Fortran) typically com-bined with special-purpose libraries to support specific func-tionality, for example an MPI library for distributed com-puting, and libraries that provide heavily tuned routines forcommonly used mathematical operations (e.g., BLAS, LA-PACK, FFT).For each software package being installed with EasyBuild,a particular toolchain is being used. The EasyBuild frame-work prepares the build environment for the different toolchaincomponents, by loading their respective modules and defin-ing environment variables to specify compiler commands(e.g., via $CC), compiler and linker options (e.g., via $CFLAGSand $LDFLAGS), etc. Easyblocks can query which toolchaincomponents are being used and steer the installation proce-dure to be performed accordingly.2.3 FeaturesWe briefly present the main features of EasyBuild below;for a more detailed overview we refer to [1719] and theEasyBuild documentation [2].2.3.1 Step-wise installation procedureEach installation is performed in several steps that are de-fined in the EasyBuild framework, see Figure 1. The wholechain of steps is executed for each of the provided easyconfigfiles and, if necessary and desired, also for each of the miss-ing dependencies (see Section 2.3.3). If a problem occursduring one of the steps, the remaining steps are cancelled.After parsing the easyconfig file, the build directory is cre-ated by obtaining and unpacking the source files, and ap-plying specified patch files (if any). Next, the build envi-ronment is set up based on the toolchain and dependen-cies being used. Subsequently, the common configure-build-install cycle is performed; if a test mechanism to verify thebuild is supported (e.g., make check) it is run before theinstall step. For software packages that support the notionof extensions, e.g., Python (packages), Perl (modules), R(libraries), etc., the listed extensions are also installed. Thesanity check step performs a couple of small checks to ver-ify the installation, i.e., checking whether prescribed filesand directories are present, and whether simple check com-mands (e.g., importing a Python package) can be executedsuccessfully. If the sanity check passes, the build directoryis cleaned up and an environment module file that matchesthe installation is generated. Finally, a couple of final (op-tional) steps are performed: changing of permissions (e.g.,to protect the installation for a particular group of users, orthe make it read-only), creating a package (e.g., an RPM)for the installed software, and running test cases (if any arespecified) just like a user would.Important to note is that each installation is done in a sepa-rate installation prefix, enabling multiple versions and buildsof the same software package to be installed side-by-side.2.3.2 Generating module filesFor each successful software installation the EasyBuild frame-work generates a corresponding module file, which specifiesthe changes to the environment that are required to use thesoftware. In addition to being the canonical way of givingusers access to the installed software, these modules are alsoemployed by EasyBuild itself to resolve dependencies (seeSection 2.3.3).The generated module file specifies updates to environmentvariables like $PATH, $LD_LIBRARY_PATH, etc. to make bina-ries and libraries available. The EasyBuild framework doesthis automatically by checking for standard sub-directorieslike bin, lib, etc. Additional non-standard paths to be con-sidered can be specified in easyblocks or easyconfig files.Likewise, additional environment variables that should bemanipulated by the generated module file can be specified.The module naming scheme to be used can be implementedin a simple Python module that prescribes how a modulename should be constructed based on a set of easyconfig pa-rameters. A couple of common module naming schemes areprovided out-of-the-box, including the one used by default(/--).2.3.3 Dependency resolutionThe dependency resolution mechanism that is part of theEasyBuild framework makes it trivial to install a softwareapplication and all requires dependencies (including com-piler toolchain) with a single command.For each dependency, EasyBuild will check whether a corre-sponding module file is available to resolve that dependency.If not, and the robot mode is enabled, it will search for amatching easyconfig file for each missing dependency, in aset of locations that can be controlled by the user. The easy-config files for the missing dependencies are then processedin the order prescribed by the directed graph that was con-structed based on the dependency specifications, to performthe required installations.2.3.4 LoggingEasyBuild keeps a thorough log of how an installation wasperformed. This can be used in case of problems during theinstallation to debug the problem at hand, or to re-evaluatea successful installation procedure later, for example whenproblems emerged with the installed software. During theinstallation, the log file is kept in a temporary location thatis clearly mentioned in the output of the eb command.After a successfull installation, the log file is copied to asub-directory of the installation directory.EasyBuild log files are well structured: each log messagestarts with a specific prefix tag (to distinguish it from out-put of shell commands being executed), and includes a timestamp and the location in the code where the log mes-sage was sent from. Additional useful information is pro-vided in particular cases, like the location where shell com-mands were run, environment variables that were defined,exit codes of shell commands, etc.2.3.5 ReproducibilityOne of the design goals of EasyBuild is to allow for easily re-producing an installation that was performed earlier, a baserequirement for reproducible science. Although there arepotential external influences that EasyBuild does not con-trol (yet), all major aspects of the installation procedure arestrictly defined somewhere. Global settings like the prefixIV: unpack sourcesV: apply patchesVI: prepareVII: configureVIII: buildIX: testX: installXI: extensionsXII: sanity checkXIII: cleanupXIV: env. moduleIII: check readiness XV: permissionsII: fetch sources XVI: packagingI: parse easyconfig XVII: test casesFigure 1: Step-wise installation procedure performed by EasyBuild.path for software installations are specified in the EasyBuildconfiguration, while aspects specific to the particular soft-ware being installed and its installation procedure are en-coded in the easyblock and easyconfig file. The easyconfigfile may in addition also define parameters that are specificto the particular software version or toolchain being used.With this approach it becomes very easy to reproduce a par-ticular installation, since it basically comes down to usingthe same easyconfig file as was used before. For this rea-son, the easyconfig file is copied to the installation directoryfor each successful installation. In addition, it implies thatsharing easyconfig files and updates to easyblocks is typi-cally sufficient to enable others to replicate an installation.This has been a major factor influencing the growth of theEasyBuild community (see also Section 2.4).2.3.6 FlexibilityAnother important design goal of EasyBuild is to provideample flexibility so HPC sites can implement their own sitepolicies. This is reflected in several ways: how EasyBuildcan be configured, by the dynamic nature of the EasyBuildframework, and in the different features that provide controlover the installation procedures that will be performed.First of all, EasyBuild can be configured in different ways:via system-level and user-level configuration files, throughenvironment variables, and using the command line interface(CLI). These different configuration levels precede one an-other in an intuitive way, i.e., environment variables overrulesettings defined in configuration files, while settings speci-fied using the CLI get preference over the value of the cor-responding environment variables. This allows putting acentral EasyBuild configuration in place, without limitingthe ability to reconfigure EasyBuild easily on a per (shell)session basis for testing or experimental purposes.Next, the EasyBuild framework is very dynamic in the sensethat it can be easily extended or enhanced at runtime, byproviding additional or customized Python modules that canoverrule existing ones. Through this mechanism, specificeasyblocks, compiler toolchains, module naming schemes,etc. can be dynamically included, which will be picked up asneeded by the EasyBuild framework, regardless of whetherthey are part of the EasyBuild installation being used or not.Finally, EasyBuild supports different ways to tweak its de-fault behaviour. This ranges from providing a wealth ofconfiguration settings to control different aspects of the in-stallation procedure, to being able to specify alternate pathsthat should be considered (first) when searching for easycon-fig files, to even redefining or altering particular functionalityof the EasyBuild framework using custom implementations;for example, the module naming scheme being used.2.3.7 TransparencyA common concern is that EasyBuild may be too much ofa black box. To address this, the EasyBuild frameworkprovides a number of features to improve transparency. Inparticular, it supports the notion of performing a dry runinstallation, i.e., reflecting what EasyBuild would do given acurrent configuration and (set of) provided easyconfig file(s).The dependency resolution mechanism can be queried to getan overview of the full dependency graph, to see where thevarious dependencies come into play and for which of themmatching module files are already available. Additionally,the installation procedure that would be performed by Easy-Build for a given easyconfig file can be consulted in detail,in a matter of seconds. Not only does this provide a wayto gain confidence in EasyBuild itself and in the employedworkflow using the tool, it is also useful when developing,debugging and evaluating easyblocks and easyconfig files.2.4 EasyBuild CommunityOne other, non-technical, feature of EasyBuild is its activeand quickly growing community. Shortly after the first pub-lic release in 2012, EasyBuild was picked up by HPC usersupport teams desperately looking for a better way to dealwith their daily struggles of getting scientific software in-stalled.Today, EasyBuild is used by HPC sites around the world,ranging from small sites all across Europe, large Europeansites like Julich Supercomputing Centre (Germany) and theSwiss National Supercomputing Centre (CSCS, see also Sec-tion 5), several institutions outside of Europe including Stan-ford University (US), Ottawa Hospital Research Institute(OHRI, Canada), New Zealand eScience Infrastructure (NeSI),etc., to even large commercial companies like Bayer (Ger-many).Not only is the EasyBuild userbase already substantial andever expanding, a lot of the HPC sites using it are also activein the community by participating in discussions, attendingmeetings, etc., and are actively contributing back in various0"20"40"60"80"100"120"140"160"180"Feb+12" Aug+12" Mar+13" Oct+13" Apr+14" Nov+14" May+15" Dec+15"#"subscribers"Figure 2: Number of subscribers to the EasyBuild mail-ing list, over time.0"10"20"30"40"50"60"70"80"90"Feb/12" Aug/12" Mar/13" Oct/13" Apr/14" Nov/14" May/15" Dec/15"#"unique"contributors"framework" easyblocks" easyconfigs"Figure 3: Number of (unique) contributors to the differ-ent central EasyBuild repositories, over time.ways, by reporting and fixing bugs, sending back enhance-ments to easyblocks/easyconfigs and adding new ones, andalso sharing ideas for (and implementations of) additionalfeatures in the EasyBuild framework.Figures 2 and 3 give a quantified view of the growth ofthe EasyBuild community, in terms of subscribers to theEasyBuild mailing list and unique contributors to the cen-tral EasyBuild repositories where resp. the framework, easy-blocks and easyconfig files are being hosted.2.4.1 Features for and by the communityThe community-oriented aspect of EasyBuild is also reflectedin some of its features. As already discussed in Sections 2.1and 2.3, various features were included to provide amplycontrol over EasyBuilds behaviour and to support imple-menting local site policies.Additionally, there are also several features that were imple-mented as a reaction to the many contributions being sentback. In particular, there is a tight integration with theGitHub platform where the EasyBuild code repositories arehosted. Support is available to obtain easyconfig files fromGitHub pull requests, to send back test reports for those pullrequests, review pull requests, and even to create and updatepull requests straight from the eb command line interface,in an attempt to lower the bar to entry and streamline theworkflow for contributing back.Many of the features in place today were also either imple-mented by popular request from the community, or even bymembers from the community themselves. This confirmsthat EasyBuild has grown beyond a tool for the community,into a tool by the community and truly serving as a platformfor collaboration.3. EASYBUILD ON CRAY SYSTEMSEasyBuild supports bringing up an entire software stackfrom the ground up, including compilers and accompanyinglibraries. However, this is not the approach recommendedby the system vendor on platforms where a well equippedand optimized software stack of basic tools and libraries isreadily provided.On Cray systems, the Cray Programming Environment (CrayPE) provides a robust software environment that is tuned tothe underlying hardware, including different compiler suites,the Cray MPT library, and various additional tools and li-braries that were optimized by Cray.In this section, we will discuss in detail what changes neededto be made in EasyBuild to provide stable support for in-stalling scientific software on top of the readily available soft-ware stack provided on Cray systems.3.1 Cray Programming EnvironmentBefore going into detail, we outline the aspects of the CrayProgramming Environment (or Cray PE, for short), some-times also referred to as Cray Application Developers En-vironment (CADE), that are relevant to the subsequent dis-cussion. For more details, we refer to the Cray documenta-tion [20].3.1.1 Modules interfaceThe Cray PE is implemented through the well-establishedconcept of environment module files [14,15,21,24]. Althoughthe logic encoded in these modules can be quite complex, endusers are not exposed to this thanks to the simple interfaceprovided by the module command. Except maybe throughthe large number of modules that are loaded by default whenlogging in to a Cray system; unlike other HPC systems,having at least 20 modules loaded at any given time is verycommon.Even though the set of module files provided by the CrayPE is well structured and reasonable in size, it can be daunt-ing to navigate this environment, for a number of reasons.First of all, the provided modules form a large combinato-rial space since multiple versions are typically available formost of them. Since different (sets of) modules also interactwith each other in complex ways, extensive manual load-ing, unloading and swapping of these modules sometimesleads to unexpected problems. Also, as the Cray PE releasegets updated as often as monthly, additional modules willbe added and older ones may be removed (depending on sitepolicies). Cray does a good job of providing a sane, coor-dinated default set of modules, but users are on their ownpiecing together the components for a non-default release,and are expected to constantly re-evaluate which modulesthey employ, as sites typically recommend recompiling soft-ware after a new Cray PE release is rolled out.In our experience, significant care must be taken when in-teracting with these modules. We found out the hard waythat using module purge to start afresh does not work asexpected, in the sense that it is close to impossible to manu-ally reconstruct a properly working environment afterwards.Additionally, reloading modules that are already loaded mayhave unexpected side effects, so this should be avoided at alltimes.3.1.2 The programming environment module PrgEnvThe most prominent set of modules that is provided, at leastfor the sake of this discussion, are the PrgEnv modules.For each of the compiler suites supported by Cray, a corre-sponding PrgEnv module is available: PrgEnv-cray, for Cray Compiling Environment (CCE) PrgEnv-gnu, for GNU Compiler Collection (GCC) PrgEnv-intel, for Intel compiler suite PrgEnv-pgi, for PGI compiler suiteEach of these modules prepare the environment for using therespective compiler suite, by configuring the generic Craycompiler wrappers and loading additional modules that ei-ther provide additional software components or further tweakthe environment as needed. For example, the craype mod-ule that corresponds to that particular version of the PrgEnvmodule gets loaded, which results in (re)defining a set of$PE-prefixed environment variables that influence the se-mantics of other modules being loaded afterwards.Depending on the particular PrgEnv module being loaded,a module for the corresponding compiler suite will also beloaded. For example, loading PrgEnv-gnu will result in theCray-provided gcc module being loaded as well; likewise forthe other supported compiler suites. In addition, the cray-libsci module will also be loaded, which provides optimizednumerical routines (BLAS, LAPACK, etc.), among others.Note that this is fairly similar to the EasyBuild toolchainmechanism explained in Section 2.2.4; however there aresome key differences, see Section 3.2.3.It is important to note here that although multiple versionsof the different PrgEnv modules are available, the compilerand cray-libsci modules they load are always loaded with-out specifying a particular version, i.e., whichever version isthe default version for those modules (as indicated by thesite maintainers) will be loaded. As we will discuss in Sec-tion 3.3.4, this raises concerns w.r.t. reproducibility, sincethe default module versions tend to change over time asCray PE updates are being made available.3.1.3 Additional tools and librariesNext to the basic programming environment, additional mod-ules are available for libraries and 3rd party software. Aprominent one is the cray-mpich module for the Cray MPTlibrary providing support for distributed computations. Otherexamples include the fftw module that provides librarieswith optimized FFT routines, and the cray-hdf5 and cray-netcdf modules providing tuned versions of the establishedHDF5 and netCDF I/O libraries, respectively.Not only do these modules make additional tools and li-braries available, they also further configure the compilerwrappers provided via the PrgEnv module by defining vari-ous environment variables.3.2 Integrating EasyBuild with the Cray PEIn order to integrate EasyBuild with the Cray PE, a numberof issues needed to be dealt with. We briefly outline themin this section, before presenting how these were resolved inSection 3.3.3.2.1 Leveraging the Cray PE modulesThe most significant required feature was to enable Easy-Build to leverage the modules provided by the Cray PE.As mentioned in Section 2.3.2, EasyBuild not only gener-ates module files for the installations being performed, italso heavily relies on those modules for resolving dependen-cies and preparing the build environment. The EasyBuildframework needed to be enhanced to allow for using theso-called external modules provided by the Cray PE, seeSection 3.3.1.3.2.2 Better awareness of session environmentSome minor enhancements needed to be made to the mecha-nism for generating module files and the way in which Easy-Build interacts with the modules tool, in order to properlydeal with the sensitive nature of the Cray-provided modulesbriefly discussed in Section 3.1.1.This consists of taking into account already loaded mod-ules and unloading/swapping conflicting modules if required,eliminating the use of module purge, and preventing CrayPE modules from being loaded again if they are alreadyloaded.These issues are discussed in Sections 3.3.2 through 3.3.4.3.2.3 Version pinning of toolchain componentsTraditional toolchains employed by EasyBuild have explicitversions specified for each of the toolchain components, ex-actly like dependencies of software packages. This standsin stark contrast with the closely related PrgEnv modulefiles provided by the Cray PE, where the version of mod-ules corresponding to what would be toolchain components(compiler, numerical libraries) are deliberately not specified(see also Section 3.1.2).To adhere to the EasyBuild goal of enabling reproducibleinstallations, this needed to be dealt with; this is covered inSection 3.3.4.3.3 Enhancements to EasyBuildIn this section, we present the changes that were made toEasyBuild to deal with the concerns outlined in Section 3.2.The development of these features started early 2015, lead-ing to the experimental Cray support included in EasyBuildversion 2.1.0 (Apr15). Although progress was rather slowdue to limited initial interest from Cray sites, the supportfrom the CSC team in Finland, the feedback and contribu-tions by CSCS staff members, and the testing of EasyBuildon Blue Waters at NCSA helped significantly in working to-wards stable support for using EasyBuild on top of the CrayPE.The features discussed here are all part of the most recentEasyBuild release to date, version 2.7.0 (Mar16). All to-gether about 500 extra lines of code (LoC) were required toimplemented these changes, on top of the existing codebaseof about 50, 000 LoC. Hence, in hindsight the required effortin terms of implementation was very limited, illustrating thematurity of the EasyBuild framework.3.3.1 Support for external modulesAs mentioned earlier, EasyBuild relies on the module files itgenerated during subsequent installations, see Section 2.3.3.When resolving dependencies, EasyBuild will check whetherthe corresponding module files are already available, anddependencies = [('pandas', '0.17.1'),('cray-hdf5/1.8.13', EXTERNAL_MODULE),]Listing 1: Example list of dependencies illustrating the useof external modules.load them before initiating the installation; if not, it willsearch for easyconfig files that can be used to install the miss-ing dependency. An important detail here is that the mod-ules generated by EasyBuild contain information about theinstalled software they correspond to, that may be used dur-ing subsequent installations. In particular, the installationprefix and software version are provided through EasyBuild-specific environment variables, respectively named $EBROOT*and $EBVERSION*.In order to let EasyBuild leverage modules that were pro-vided some other way, for example as part of the Cray PE,the notion of external modules was introduced. Basically,this comes down to supporting that a dependency can bespecified using only a module name and a marker indicatingthat it should be resolved directly via the specified mod-ule, as opposed to a standard dependency specification thatincludes the software name and software version, and op-tionally an additional label (a.k.a. version suffix) and the(sub)toolchain it should be installed with. Marking a de-pendency as an external module instructs EasyBuild thatit should not attempt to search for an easyconfig file thatcan be used to install that dependency; if a module with thespecified name is not available, it should simply report anerror stating that the dependency could not be resolved.The example list of dependencies extracted from a (fictional)easyconfig file shown in Listing 1 illustrates the difference be-tween a standard dependency specification and one markedto be resolved by an external module. While the depen-dency on version 0.17.1 of the pandas Python package willbe resolved the standard way, i.e., by letting EasyBuild de-termine the module name, checking whether the module isavailable and falling back to searching for an easyconfig file ifneeded, only the specified version of the cray-hdf5 modulewill be considered to resolve the dependency on the HDF5library.Additionally, support was added to supply metadata for ex-ternal modules, so that EasyBuild can be made aware ofwhat they provide: the software name(s), version(s) andinstallation prefix corresponding to external modules of in-terest can be specified in one or more configuration files.An example of metadata for external modules is shown inListing 2. It specifies that: the cray-hdf5/1.8.13 and cray-hdf5-parallel/1.8.13modules both provide HDF5 version 1.8.13 1, and thatthe installation prefix can be obtained via the $HDF5_DIRenvironment variable that is defined by these modules; the cray-netcdf/4.3.2 module provides both netCDF4.3.2 and netCDF-Fortran version 4.3.2, and their com-mon installation prefix can be determine via $NETCDF_DIR;1EasyBuild makes the distinction between serial and parallelbuilds of HDF5 differently, i.e., through compiler toolchain thatwas used and whether or not it includes MPI support.[cray-hdf5/1.8.13]name = HDF5version = 1.8.13prefix = HDF5_DIR[cray-hdf5-parallel/1.8.13]name = HDF5version = 1.8.13prefix = HDF5_DIR[cray-netcdf/4.3.2]name = netCDF, netCDF-Fortranversion = 4.3.2, 4.3.2prefix = NETCDF_DIR[fftw/3.3.4.3]name = FFTWversion = 3.3.4.3prefix = FFTW_INC/..Listing 2: Example metadata for external modules. the fftw/3.3.4.3 module provides FFTW version 3.3.4.3,and the top-level installation directory is one level upfrom the path specified in $FFTW_INC, i.e. $FFTW_INCshould be combined with the relative path ..The installation prefix can also be hardcoded with an abso-lute path, in case it can not be derived from an environmentvariable set by the external module.Note that EasyBuild will not attempt to interpret the namesof external modules in any way; it solely relies on the pro-vided metadata. Also, EasyBuild will only interact withexternal modules via the modules tool, to load them andcheck for their availability; it will not try to parse or inter-pret their contents.Since EasyBuild version 2.7.0, a file containing metadata forselected modules provided by the Cray PE is included. Thisis sensible only because the Cray-provided module files areknown to be identical across Cray sites, if they are present.Moreover, it enables Cray sites to quickly get started withEasyBuild.It is worth noting that the support in EasyBuild for usingexternal modules is in no way specific to Cray systems. Thefeature was deliberately implemented in a generic way, in or-der to also make it useful on other platforms where the needfor integrating with existing module files arises. For moredetails on using external modules, we refer to the EasyBuilddocumentation 2.3.3.2 Generating more sophisticated module filesMotivated by the sometimes unpredictable behaviour of themodules files provided by the Cray PE, induced by the com-plex logic that is encoded in them, the mechanism in theEasyBuild framework for generating module files needed tobe extended slightly.Support for including module unload and module swapstatements, optionally guarded by a condition that checkswhether a particular module is loaded or not, had to beimplemented. We explain the need for this in more detail inSection 3.3.4.2http://easybuild.readthedocs.org/en/latest/Using_external_modules.htmlhttp://easybuild.readthedocs.org/en/latest/Using_external_modules.htmlhttp://easybuild.readthedocs.org/en/latest/Using_external_modules.html3.3.3 Eliminating the use of module purgeEasyBuild controls the environment in which installationsare being performed, in order to increase the likelihood ofbeing able to reproduce it later. This was done by recordingthe environment in which EasyBuild is running, frequentlypurging the modules that were loaded by the EasyBuildtoolchain mechanism (for dependencies, toolchain, etc.), andrestoring the initial environment in the sanity check step andbefore initiating a new installation.Since running module purge in an environment defined us-ing Cray PE modules can not be done reliably, the purgingof modules was replaced by simply restoring the initial en-vironment instead, which is a better approach anyway ofcontrolling the environment.3.3.4 Cray-specific compiler toolchainsDefining and supporting Cray-specific compiler toolchainswas another major enhancement that had to be made. Thisinvolved changes to the toolchain mechanism of the Easy-Build framework, and a custom easyblock to consume easy-config files that specify particular instances of Cray-specifictoolchains.Overview of Cray-specific toolchains. In the current im-plementation, three different Cray-specific toolchains havebeen defined. Each of them corresponds directly to one ofthe PrgEnv modules we discussed in Section 3.1.2: CrayCCE for PrgEnv-cray CrayGNU for PrgEnv-gnu CrayIntel for PrgEnv-intelThe equivalent toolchain for the PrgEnv-pgi module wasmissing at the time of writing because the PGI compiler isnot supported yet as a toolchain component in EasyBuildversion 2.7.0.In each of these, the toolchain consists of the respective com-piler suite (CCE, GCC, Intel) together with the Cray MPTlibrary (provided by the cray-mpich module) and Cray Lib-Sci libraries (cray-libsci).Framework support for Cray toolchains. The toolchainmechanism in the EasyBuild framework needed to be madeaware of the different components that are part of the Cray*toolchains, in order to support the provided toolchain in-trospection functionality that is used in various easyblocks,and to properly define the build environment based on theselected toolchain.For the compiler component of the toolchains, we leveragethe compiler wrappers provided by the Cray PE. The com-piler commands to are used by EasyBuild are identical re-gardless of which specific Cray* toolchain is used, i.e. cc (C),CC (C++) and ftn (Fortran). The readily available supportfor the GCC and Intel compilers was reused where relevantfor the CrayGNU and CrayIntel toolchains, e.g., for the com-piler flags to control floating-point precision. A limited setof toolchain options was defined to steer the compiler wrap-pers where needed, for example to disable dynamic linkingwhich is enabled by default and is controlled by definingthe $CRAYPE_LINK_TYPE environment variable, or to produceverbose output (which is useful for debugging).For the Cray MPT library, the existing support for theMPICH MPI library was leveraged to inform the toolchainmechanism how to deal with the MPI component of theCray* toolchains. Again, we leave the heavy lifting up tothe CrayPE compiler wrappers.Finally, for the Cray LibSci library a stub implementationof a library providing BLAS/LAPACK support was added.Just like for the compiler and MPI toolchain components,the EasyBuild framework relies mostly on the CrayPE com-piler wrappers, other than making sure that the installationprefix of LibSci is obtained via the $CRAY_LIBSCI_PREFIX_DIRenvironment variable, so it can be used if required.Custom easyblock for Cray toolchains. EasyBuild re-quires a module file to be available for every compiler toolchainto be used. For traditional toolchains, these typically onlyinclude straightforward module load statements for eachtoolchain component; the logic for actually using the toolchainis implemented in the toolchain support of the EasyBuildframework.To obtain a module file for a particular version of a Cray-specific toolchain, a custom CrayToolchain easyblock wasimplemented that redefines part of the module generationstep of the install procedure. This is required since the en-vironment in which these toolchains will be employed is sig-nificantly more tedious to operate in than usual. Care mustbe taken that already loaded conflicting modules are un-loaded, or that they are swapped with an equivalent modulethat is part of the toolchain being used. This comes back tothe issues discussed in Section 3.1.1.More specifically, the module file for a Cray* toolchain must: make sure that the correct PrgEnv module is loaded,after making sure that any other PrgEnv module is un-loaded; it is worth noting that the seemingly straight-forward option of swapping whatever PrgEnv moduleis loaded with the one that should be loaded was foundto be unreliable load the specific module version for each toolchain com-ponent that is specified in the easyconfig file for thetoolchain (see below), or swap whichever version isloaded with the required oneNote that this module is not only used by EasyBuild itselfduring the installation to prepare the build environment,but also whenever a module corresponding to a softwareinstallation is loaded by users, since these include a moduleload statement for the toolchain module that was used toinstall the software.Definition of Cray toolchains. The actual definition of aparticular version of a Cray-specific toolchain is expressedin an easyconfig file, just like for traditional toolchains.An example is shown in Listing 3. The toolchain name andversion are specified to be CrayGNU and 2015.11, respec-tively. EasyBuild is instructed to process this easyconfig fileusing the CrayToolchain easyblock (see previous section),and some basic information is provided through the home-page and description parameters; this will be included inthe generated module file to populate the output of modulehelp.While specifying a toolchain in each easyconfig file is strictlyenforced by the EasyBuild framework, it is irrelevant in thecase of installing a module file for a compiler toolchain, sinceonly the step of the install procedure where the module isbeing generated is performed by the CrayToolchain easy-block; all other steps are skipped. Thus, a so-called dummytoolchain is specified here.The remainder of the easyconfig file defines the actual toolchaincomposition. First, the PrgEnv-gnu module is listed as thebase of the compiler toolchain; the EasyBuild frameworkwill verify whether this aligns correctly with the toolchainname. The specified module name is deliberately left ver-sionless here, as was recommended to us by Cray support.The PrgEnv modules are intended to be mostly backwardscompatible in terms of functionality, and the latest availableversion should be loaded at all times.The subsequent entries of the dependencies list specify aparticular version of a toolchain component. These versionscollectively define this particular version of the CrayGNU com-piler toolchain.Note that the date-like format of the toolchain version, 2015.11in this example, is strongly connected to the CrayPE releaseit was created for, indicated by the year and month. Theactual versions of the toolchain components that define aparticular toolchain version are selected based on the rec-ommendations made by Cray for the corresponding CrayPErelease. Our evaluation has shown that there is no need todiscriminate between different types of Cray systems (e.g.XC vs XE/XK).The intention of version pinning the toolchain componentsis to allow for reproducible software installations on top ofthe Cray PE. If the toolchain definitions can be decidedupon in mutual consensus between different Cray sites, itprovides a solid base for collaboration. This stands in starkcontrast with todays lack of efficiently sharing expertise ina structured way on getting scientific software installed onCray systems.If desired, sites can define local variants of a toolchain byslightly modifying the versions of the toolchain componentsaccording to the availability of Cray PE modules. Such acustom toolchain can be tagged with an additional label (viathe versionsuffix easyconfig parameter) to discriminiate itfrom the standard toolchain definiition.Details of the Cray toolchain module file. To summa-rize, we now take a detailed look at the generated modulefile for a Cray-specific toolchain, and explain how it is ableto reliably modify the complex environment defined by theCray PE it gets loaded in.Listing 4 shows a part of the CrayGNU module file that isgenerated by EasyBuild for the easyconfig file shown in List-ing 3; other parts of the module file not relevant to the dis-cussion here have been omitted.First, the PrgEnv-gnu module that forms the base of theCrayGNU toolchain is dealt with. This is done through i) un-load statements for all other possible PrgEnv modules thateasyblock = 'CrayToolchain'name = 'CrayGNU'version = '2015.11'homepage = 'http://docs.cray.com/books/S-9407-1511'description = """Toolchain for Cray compiler wrapper,using PrgEnv-gnu see: PE release November 2015)."""toolchain = {'name': 'dummy', 'version': 'dummy'}dependencies = [('PrgEnv-gnu', EXTERNAL_MODULE),('gcc/4.9.3', EXTERNAL_MODULE),('cray-libsci/13.2.0', EXTERNAL_MODULE),('cray-mpich/7.2.6', EXTERNAL_MODULE),]moduleclass = 'toolchain'Listing 3: Easyconfig file for CrayGNU version 2015.11may be loaded (note that unloading is always safe to do, evenif the specified module is not loaded); ii) a guarded (ver-sionless) load statement for the PrgEnv-gnu module. Thecondition on the negation of is-loaded PrgEnv-gnu is re-quired to ensure that the PrgEnv-gnu does not get loadedagain, since that may result in unwanted side effects.Note that when this first section of the module file has beenprocessed when this toolchain module is being loaded, a par-ticular version of some of the toolchain components will al-ready be loaded since the PrgEnv module loads the defaultmodule for the compiler and cray-libsci modules (see alsoSection 3.1.2).The remainder of Listing 4 shows the logic used to load thespecified version of each of the toolchain components. De-pending on whether a module for that particular toolchaincomponent is already loaded or not, a swap or load opera-tion will be performed to ensure that the specified versiongets loaded.3.3.5 Cross-compilation and target architectureIt is quite common on Cray systems that the Cray develop-ment and login (CDL) nodes have a different processor ar-chitecture than the actual compute nodes of the system, andeven that different partitions exist in the system that differin processor architecture. If so, software compilation shouldbe done via cross-compilation for a different target architec-ture, to achieve optimal performance. The Cray PE providesa set of craype- modules for this purpose, which con-figure the compiler wrappers provided via the PrgEnv mod-ule to generate binary code for a particular architecture.This is strongly related to the optarch configuration optionthat is available in EasyBuild, which allows for specifyingwhich compiler flags should be used that control the targetarchitecture.For Cray toolchains specifically, we redefined the meaningof optarch to indicate which craype- module mustbe loaded in the build environment. That is, if optarch isdefined to be haswell, then EasyBuild will make sure thecraype-haswell module is loaded; if it is not (potentiallybecause another craype- module is loaded), it willexit with an error. A possible enhancement to this may beto automatically swap to the correct craype- module#%Modulemodule unload PrgEnv-craymodule unload PrgEnv-intelmodule unload PrgEnv-pgiif { ![ is-loaded PrgEnv-gnu ] } {module load PrgEnv-gnu}if { [ is-loaded gcc ] } {module swap gcc gcc/4.9.3} else {module load gcc/4.9.3}if { [ is-loaded cray-libsci ] } {module swap cray-libsci cray-libsci/13.2.0} else {module load cray-libsci/13.2.0}if { [ is-loaded cray-mpich ] } {module swap cray-mpich cray-mpich/7.2.6} else {module load cray-mpich/7.2.6}Listing 4: Partial generated module file for version 2015.11of the CrayGNU toolchaininstead.Also, in order to use a Cray toolchain optarch must bespecified, while with other toolchains it is an optional set-ting that can be used to override of the (usually sensible)default of targeting the host architecture. If it is not speci-fied, EasyBuild will currently exit with a clear error messageexplaining that it should be defined and correspond to anavailable craype- module.3.4 Testing & evaluationThe support in EasyBuild version 2.7.0 for integrating withthe Cray PE has been tested and evaluated on various Craysystems, using multiple established scientific applications.This section provides a short overview.3.4.1 SystemsPiz Daint & co CSCS, Switzerland. See Section 5 fora detailed discussion on evaluating EasyBuild on Cray (andnon-Cray) systems at CSCS.Sisu CSC, Finland. Sisu is a 1,688 compute node CrayXC40 series system hosted at the IT Center for Science(CSC) in Finland; more information is available at https://research.csc.fi/csc-s-servers.Swan Cray. Swan is a Cray XC40 series system hosted atCray. Swan is an early access test bed system for Cray tech-nologies. As such it is comprised of compute nodes using theIntel Broadwell, Intel Haswell and Intel Ivybridge processorsnext to hybrid nodes equipped with Nvidia K20X GPUs attime of writing.Blue Waters NCSA, US. Blue Waters is the largest CrayXE6/XK7 class system ever built with 26,864 compute nodes.4,228 of the nodes are XK7 nodes which have half the mem-ory and CPUs in exchange for an NVIDIA K20X GPU ac-celerator. Blue Waters serves a very broad range of high-endcomputational science as awarded by the National ScienceFoundation in the U.S, and primarily uses AMD Interlagosprocessors on both the compute and external service nodes,along with older AMD processors on the service nodes andtest system login node. More information about Blue Watersis available at https://bluewaters.ncsa.illinois.edu/.Titan ORNL, US. Titan is a Cray XK7 system oper-ated by the Oak Ridge Leadership Computing Facility forthe United State Department of Energy. Titan has 18,688compute nodes each containing an AMD Interlagos proces-sor and an NVIDIA K20X GPU accelerator; see https://www.olcf.ornl.gov/titan/ for more information.3.4.2 Software applicationsHPL. The HPL LINPACK benchmark version 2.1 servedas the initial test case for the Cray support in EasyBuild,and was installed using two different versions of the cur-rently supported Cray-specific toolchains, i.e. the 2015.06and 2015.11 versions of the CrayCCE, CrayGNU and CrayIn-tel toolchains.CP2K. The quantum chemistry and solid state physics soft-ware package CP2K was installed using the CrayGNU toolchain.The multi-threaded (popt) variant of CP2K version 2.6.0was tested, as well as the distributed (psmp) variant of CP2Kversion 3.0. For the FFTW dependency the Cray-providedfftw module was used, while the other dependencies (Libint,libxc) were resolved via EasyBuild.GROMACS. Installing the distributed variant of version4.6.7 of the popular molecular dynamics software GROMACSwas tested using the CrayGNU and CrayIntel toolchains.WRF. Version 3.6.1 of the Weather Research and Forecast-ing (WRF) Model was installed using both the CrayGNU andCrayIntel toolchains, on top of the cray-netcdf and cray-hdf5-parallel modules provided by the Cray PE. Eventhough the installation procedure of WRF is highly custom,no changes were required to the WRF-specific easyblock thatwas already avaialble to install it on top of the Cray PE.Python. Various recent versions of Python together withpopular Python packages were installed using the differentversions of the CrayGNU toolchain; we discuss the Python usecase in detail in Section 4.DCA++ dependencies. DCA++ is Dynamical Cluster Ap-proximation software implemented in C++ to solve the 2DHubbard model for high-temperature superconductivity. Thedevelopers used EasyBuild to install the dependencies re-quired for DCA++ that are not provided by the Cray PE,on different Cray systems and using the CrayGNU toolchain.More specifically, easyconfig files were composed for the NFFT,spglib and MAGMA libraries; the generic ConfigureMakehttps://research.csc.fi/csc-s-servershttps://research.csc.fi/csc-s-servershttps://bluewaters.ncsa.illinois.edu/https://www.olcf.ornl.gov/titan/https://www.olcf.ornl.gov/titan/easyblock turned out to be sufficient for each of them. Whereapplicable, Cray-provided modules are leveraged, i.e., cud-atoolkit for MAGMA, fftw for NFFT and cray-hdf5 toresolve the HDF5 requirement of DCA++ itself.It is worth noting that the easyconfig files were composedand tested on Piz Daint (CSCS), and then used to triviallyreproduce the installations on both Titan (ORNL) and Sisu(CSC), clearly highlighting the potential value of employingEasyBuild on Cray systems.4. USE CASE: PYTHON ON CRAYIn this section, we discuss the use case of providing an opti-mized installation of the standard Python distribution andadditional Python packages on Cray systems, on top of theCray-provided software stack.4.1 MotivationThe Python programming language and platform is becom-ing increasingly more popular in traditional scientific do-mains. As a result of this, the expectation of todays sci-entific researchers is that a fully featured and heavily opti-mized Python installation is made available on modern HPCsystems.However, at the time of writing Cray does not yet providesuch a high performance batteries included Python dis-tribution as a part of the provided software stack, despitethe increasing demand for it. The only Python environmentthat is readily available on the recent versions of Cray LinuxEnvironment (CLE) is the one provided through the SuSELinux Enterprise Server (SLES) distribution. Although itis mostly functional, it does not leverage the heavily tunedsoftware stack provided by Cray PE, nor was it tuned to theunderlying Cray system architecture.As a consequence, Cray sites are required to take care ofan optimized Python installation themselves. Next to in-stalling the standard Python distribution itself in a fully fea-tured way, additional Python packages for scientific comput-ing need to be included. This typically includes the heavilytuned SciPy stack consisting of popular libraries like numpy,scipy, matplotlib, pandas, and sympy, as well as Pythonpackages leveraging these base libraries, like scikit-learnfor machine learning and the Python Imaging Library (PIL).In addition, Python interfaces to established native librarieslike MPI (mpi4py), HDF5 (h5py) and netCDF (netcdf4-python), domain-specific packages like BioPython and pow-erful interactive tools like IPython are frequently requested.4.2 Python support in EasyBuildEasyBuild includes extensive support for installing Pythonand Python packages out-of-the-box.The (non-trivial) installation procedure of the standard Pythondistribution is encoded in a custom software-specific easy-block. It takes care of correctly configuring the build pro-cess, mostly related to picking up on optional dependencieslike libreadline & ncurses, Tcl & Tk, OpenSSL, etc., andcarefully checks whether the obtained Python installation isfully featured as intended.For installing Python packages, a generic PythonPackageeasyblock is available that implements the standard setup.pybased installation procedure, and performs a trivial sanitycheck by testing whether the installed Python package canbe imported. This generic easyblock can be leveraged toboth install Python packages as a part of a Python instal-lation, yielding a so called batteries included installation,and to install stand-alone modules for (bundles of) Pythonpackages; see also Section 4.4.In addition, a number of software-specific easyblocks forindividual Python packages are also available. Particularexamples are the easyblocks for NumPy and SciPy, whichimplement the tedious install procedure for these respectivePython packages, on top of the heavily tuned BLAS and LA-PACK libraries provided by the compiler toolchain. Theyalso include various checks to verify that the installationwas indeed performed correctly and optimally, like runningthe provided test suites and performing a simple speed testfor the numpy.dot routine by computing the dot product oftwo random 1000x1000 matrices, which is usually sufficientto detect misconfigurations since the performance differencebetween a basic and an optimized NumPy installation differsby over an order of magnitude.Several Python-related easyconfig files are available: for bareinstallations containing only the standard Python distribu-tion, installations featuring a list of Python packages thatshould be installed as extensions, and for individual Pythonpackages or bundles thereof.4.3 Installing Python on top of the Cray PETo install Python in an optimized way on top of the CrayPE, the existing EasyBuild support for both Python itselfand Python packages could be easily combined with theCray-specific compiler toolchains presented in Section 3.3.4.The relevant easyblocks were already sufficiently generic,such that no changes needed to be made to make the in-stallations work.It was sufficient to create the necessary easyconfig files thatspecify to EasyBuild the exact versions of Python itself, itsdependencies and those of the Python packages to be in-cluded, and which Cray-specific toolchain should be used toinstall them. Except for specific Python packages like h5pywhere the HDF5 installation provided by the cray-hdf5-parallel module can be leveraged, all dependencies otherthan the toolchain components are resolved through Easy-Build.Figure 4 shows the dependency graph for h5py 2.5.0 andPython 2.7.11 on top of version 2015.11 of the CrayGNUtoolchain (versions of the dependencies not relevant to thediscussion were omitted). Installations performed using Easy-Build are depicted using ovals, dependencies resolved via theexternal modules provided by the Cray PE are shown asrectangles and marked with EXT; additional Python pack-ages that are included as extensions in the Python installa-tion are indicated using dashed lines.The entire software stack for h5py and Python shown inFigure 4, including required dependencies and specified ex-tension Python packages, can be installed on top of the CrayPE with a single EasyBuild command, assuming that Easy-Build was installed and configured properly and that thenecessary easyconfig files are available:$ eb h5py-2.5.0-CrayGNU-2015.11-Python-2.7.11.eb -rIt suffices to specify the name of the easyconfig file to theeb command, and enable the dependency resolution mecha-nism using the -r command line option. After verifying thatcray-mpich/7.2.6 (EXT)bzip2cray-libsci/13.2.0 (EXT) PrgEnv-gnu (EXT)Python 2.7.11zlibfreetypelibreadlinencursescray-hdf5-parallel/1.8.14 (EXT)gcc/4.9.3 (EXT)libpngCrayGNU 2015.11h5py 2.5.0numpy 1.10.4scipy 0.16.1mpi4py 1.3.1pandas 0.17.1matplotlib 1.5.1Figure 4: Complete dependency graph for Python with included extensions, and h5py on top of it.the easyconfig files required to construct the entire depen-dency graph are available, EasyBuild will install any missingdependencies and generate modules for them, and subse-quently install Python and the specified extensions; after-wards, h5py itself will be installed on top of the obtainedPython installation.This is done fully autonomously and leveraging the relevantparts of the Cray PE, resulting in the intended batteriesincluded Python installation that is optimized for the ar-chitecture of the target Cray system.4.4 Managing Python packagesA number of different options are supported by EasyBuildto manage Python packages.As already mentioned, the Python installation can be donewith batteries included, meaning that additional Pythonpackages can be installed in the same installation prefix asPython. If updates for particular Python packages becomeavailable later, additional stand-alone modules on top of thePython module can be installed that overrule their outdatedequivalent that is part of the Python installation.These stand-alone module can be either for individual Pythonpackages, or bundles of Python packages that are somehowrelated to each other.Additionally, users can manage their own Python packageson top of the centrally provided Python installation, if theyso desire. For this, the standard Python installation toolslike pip can be employed.5. EASYBUILD AT CSCSFounded in 1991, the Swiss National Supercomputing Centre(CSCS) develops and provides large scale supercomputinginfrastructure to the international research community. TheCSCS infrastructure consists of both standard Linux HPCclusters and different generations of Cray systems, includingthe recent Cray XC30 system Piz Daint, which is currentlyranked 7th on the Top500 list (Nov15).Until recently, installing scientific software on the diversepool of HPC systems hosted at CSCS was done with little orno automation. Instead, the support team mostly resortedto manual installations, and the expertise on installing com-plex applications was scattered across various individuals,personal scripts, support tickets, and non-centralized notes.By consequence, updating software installations was tediousand error-prone and a huge burden on the support team, andthe software stack provided to end users was wildly inconsis-tent across the different systems and even within the samesystem.Motivated by the need to improve the consistency of theuser experience when switching between various systems andreducing the efforts involved in providing scientific applica-tions, EasyBuild was evaluated as a potential platform forfulfilling these aspirations.CSCS began evaluating EasyBuild with release of version 2.1(Apr15), when the first experimental support for Cray PEintegration was included, by providing an optimized fullyfeatured installation of Python on various systems hostedat CSCS. The experience was very positive, as Python anda set of additional Python packages were provided to endusers in a matter of days, across both Cray and non-Craysystems in a consistent way and with limited effort. This,and the welcoming and supportive EasyBuild community,convinced the CSCS support team to integrate it into theirdaily workflow, see below.5.1 CSCS systems using EasyBuildToday, EasyBuild is deployed on a center-wide central sharedfile system (GPFS) on most of the production and test sys-tems at CSCS, including: Piz Daint: Cray XC30 with 5, 272 compute nodes (In-tel Sandy Bridge + NVIDIA Tesla K20X) Piz Dora: Cray XC40 with 1, 256 compute nodes (IntelHaswell); extension to Piz Daint Santis & Brisi: Test and Development Systems (TDS)for Piz Daint and Piz Dora Pilatus: Linux HPC cluster with 44 compute nodes(Intel Sandy Bridge-EP) Monch: Linux HPC cluster with 440 compute nodes(Intel Ivy Bridge) Kesch & Escha: two Cray CS-Storm cabinets operatedfor MeteoSwiss (Intel Haswell + NVIDIA Tesla K80);see also Section 5.3For more details, we refer to http://www.cscs.ch/computers.5.2 Consistency for users and support teamEasyBuild has helped significantly in providing more con-sistency for both users and the CSCS support team, acrossvarious HPC systems hosted at CSCS.It serves as a uniform interface for maintaining the softwarestack on both Cray and non-Cray systems, and allows main-taining a central repository where installation recipes (easy-config files and customized easyblocks) are hosted. Thisapproach could also be used to enable sharing of expertisew.r.t. installing scientific software across Cray sites.On most Cray systems the Cray-specific toolchains discussedin Section 3.3.4 are used (one exception being the CS-Stormsystem, see Section 5.3); on the Linux HPC clusters, regu-lar compiler toolchains included in EasyBuild like foss andintel are employed.Providing software installations through EasyBuild and reusingcompiler toolchains where applicable (e.g. CrayGNU on PizDaint and Piz Dora) results in a consistent software stackbeing provided across the different systems, which is moreappealing to end users.Although the CSCS support team only provides full sup-port for a select set of applications, EasyBuild has enabledto provide additional limited support to users for installingother software as well, by providing easyconfig files and let-ting users maintain their own software stack on top of whatis centrally provided through EasyBuild.5.3 Use case: EasyBuild on Cray CS-StormMeteoSwiss, the Swiss national weather forecasting service,hosts their dedicated production systems at CSCS. For theirmost recent platform Cray CS-Storm systems were deployed,i.e. Kescha & Escha [11], with CSCS being responsible forgetting the system production-ready and for provisioning ofthe software stack.The software environment on the CS-Storm system is differ-ent to the XC and XE/XK line of systems, in the sense thatfor the particular deployment only PrgEnv-cray based partof the Cray PE release is supported. Therefore, CSCS optedto install their own software stack from scratch. However,when building the software for MeteoSwiss [25] on this sys-tem, it became clear that the provided GCC-based compilerstack was unable to assemble optimized (AVX2) instructionsfor systems Intel Haswell processors.While this issue was being discussed with Cray support, itbecame clear that most of the software required by Me-teoSwiss was already supported in EasyBuild. Moreover,it turned out that the problem with the Haswell support ofthe standard compiler stack was already solved in the recentcompiler toolchains defined by the EasyBuild community,by including a more recent version of GNU binutils, the setof tools that includes the GNU assembler as employed byGCC.As a result, the support ticket opened with Cray was closed,and the software stack required by MeteoSwiss was providedvia EasyBuild within a matter of days. This not only re-sulted in a more efficient workflow and a solution to com-piling the required software stack optimized for the targetarchitecture, it also further aligned the workflow for softwareinstallations with other CSCS systems where EasyBuild wasbeing used.5.4 Testing via continuous integrationDuring the life cycle of an HPC system, updates to the soft-ware environment are inevitable. Cray PE updates appearas frequently as every month. The accumulated set of mod-ules that were installed as a part of various Cray PE releasesmay quickly result in about 1000 modules.Qualifying software and hardware updates on dedicated testand development systems is considered good practice andshould happen as frequently as possible, and ideally as ancontinuous process. In order to accomplish this, all involvedsteps of the process should be designed to run fully auto-mated. A recent comparison of existing continuous integra-tion and deployment tools that were evaluated within thecontext of performance monitoring is available in [23].CSCS decided on using the Jenkins project as the platformof choice for implementing continuous validation techniqueswithin the scope of testing applications delivered to endusers through EasyBuild.Jenkins is oriented around projects that define a set of in-structions that are automatically executed upon user de-fined triggers like commits to central source code reposito-ries. The project defined by CSCS for continuously test-ing installations takes a list of easyconfigs and performs theinstallations on selected systems. The Jenkins dashboardallows every stakeholder to quickly understand the capac-ity of the entire software stack to be reinstalled on everysystem the center operates. In addition to being the centralsource of information about the current state of the deployedsoftware stack, provenance and history on the informationabout previous failures or success for builds can easily bestored and accessed upon interest.Another benefit of employing continuous and immediate ver-ification of every piece of software supported by the teamis the feedback loop introduced into the change processes,and the increased capability to pinpoint sources of problemsmore quickly and with higher confidence.The Jenkins project was designed to mimic the user experi-ence for installing applications on production systems, pro-viding an additional bonus of being able to understand theexperience perceived by the end user.The approach outlined in this section enables a completelynew approach to testing and deployment of requested appli-cations: the user support team prepares the easyconfig filesand provides them to Jenkins, which verifies the installationsbefore they are deployed.http://www.cscs.ch/computers6. RELATED WORKOver the recent years, various tools similar to EasyBuildhave been made available. We briefly discuss the most preva-lent ones here, highlighting the similarities and key differ-ences with EasyBuild.6.1 SpackThe most relevant tool in the context of this discussion isdefinitely Spack [16, 27], a package management tool cre-ated at Lawrence Livermore National Laboratory (LLNL).In terms of design and functionality, it is quite similar toEasyBuild: it is implemented in Python, has the notion ofpackages which are basically equivalent to a combination ofeasyblocks and easyconfig files, and was created to deal withthe daily struggle of dealing with software installations onHPC systems. Like EasyBuild, a community has emergedaround it that is actively contributing to the project.There are a couple of key differences however. The moststriking difference is the flexible support for specifying thesoftware dependency graph. Spack consumes a so-called ab-stract spec that is a partial specification of what should beinstalled, which is then transformed in a concrete spec (basi-cally through constraint solving) before the actual installa-tion is performed. This allows for easily navigating the com-binatorial space of dependency versions and build variants,and appeals to the main target audience of Spack: softwaredevelopers of large scientific applications that need to jugglewith lots of dependencies. As a result of this, Spack doesnot have a concept like compiler toolchains. In comparison,EasyBuild is more focused on making software installationseasy to reproduce across HPC sites, and mainly targets HPCuser support teams that need to provide a consistent soft-ware stack for scientists.The Spack development team was actively working on inte-gration with the Cray PE at the time of writing, in collab-oration with the National Energy Research Scientific Com-puting Center (NERSC).6.2 Other installation tools for HPC systemsSWTools [22, 26] and its successor Smithy [13] were devel-oped by Oak Ridge National Laboratoy (ORNL). Smithy isimplemented in Ruby, follows the Homebrew style of formu-las defining software installation procedures, and providessupport to leverage the Cray PE. From a design point ofview, Smithy is less of a framework compared to EasyBuildor Spack; there is little code reuse across different formulas,which also include a hardcoded template for module files.To the best of our knowledge, Smithy is only used by theHPC support team at ORNL.HeLmod (Harvard Extensions for Lmod deployment) [28] isa software management system developed by FAS ResearchComputing (FASRC) department at Harvard University. Itis built on top of Lmod, a modern alternative to the tradi-tional environment modules tool, and consists of a collectionof RPM spec files that implement software installation pro-cedures and include the accompanying module file. As such,it is prone to lots of code duplication, as opposed to Easy-Build that takes a more centralized approach where codereuse is actively pursued. To the best of our knowledge, itdoes not provide any integration with the Cray PE.Maali [8] (previously iVEC Build System or iBS) is a soft-ware build system developed by the Pawsey SupercomputingCentre. It is implemented in the Bash scripting language,and provides basic support for automatically installing soft-ware packages. Although it includes support for installingsoftware on Cray systems, the functionality it provides isfairly basic for example, there is not support (yet) for re-solving dependencies, and it provides a limited flexibility a lot of the configuration aspects are hardcoded in it.6.3 More generic software installation toolsNext to software installation framework and tools targetedtowards HPC systems, there are also several other projectsworth mentioning in this context.Two recent similar projects are Nix [5, 12] and GNU Guix[3,9], both so-called functional package managers. They takea radically different approach to building and installing soft-ware, and have reproducible builds as their main focus, tothe point where they strive for making builds bit-wise re-producible. Some noteworthy features include performingbuilds in an isolated environment, transactional upgrades,per-user profiles and portability of packages across differentoperating systems. Recent work [10] promotes the use ofthese tools on HPC systems; however, there is a lack of inte-gration with established practices like environment modulesand stringent requirements to ensure true isolation (due toreliance on chroot and requiring a running daemon thatperforms the actual build), leading to limited adoption inthe HPC community thus far.The Hashdist environment management system [4] is a soft-ware build and installation tool to manage the installationof large software projects on Linux, OS X and Windows;it used by the FEniCS project, for example. It providessupport for defining software profiles in YAML syntax.Portage [7] is the official package management and distribu-tion system for the Gentoo Linux distribution. pkgsrc [6]is a framework for building third-party software on NetBSDand other UNIX-like systems. Both of these tools distinguishthemselves from other Linux distribution package managersby their focus on building from source.Conda [1] is a package/environment management systemfor Linux, OS X and Windows. It was originally createdfor Python, but now supports other software as well. TheAnaconda project leverages conda and provides an extensivePython distribution that includes over 150 scientific Pythonpackages. Although it has some support for leveraging opti-mized libraries like Intel MKL, its primary focus is easy ofuse rather than highly optimized software installations.To the best of our knowledge, none of these projects provideintegration with the Cray PE.7. CONCLUSIONSWe motivated the need for automating the process of in-stalling (scientific) software on HPC systems, and on Craysystems in particular, with the requirement of being able toeasily reproduce software installations later or on differentsystems in mind.EasyBuild was presented as a software installation frame-work that has been successfully deployed on HPC sites aroundthe world for this purpose. After introducing EasyBuild,we discussed how it was successfully integrated with limitedeffort with the Cray Programming Environment, throughthe support for external modules and defining Cray-specificcompiler toolchains.The use case of providing an optimized batteries includedPython installation on Cray systems using EasyBuild wasdiscussed in detail, and we outlined how EasyBuild has beendeployed at CSCS to efficiently deal with providing softwareinstallations in a consistent way across the large amount andvariety of (both Cray and non-Cray) systems.We hope this work is the start of a revolution in how scien-tific software is installed on top of the Cray-provided soft-ware stack, and that it forms the base for a collaborationacross Cray sites to benefit from each others work and ex-pertise in providing users with the scientific software stackthey require.8. ACKNOWLEDGEMENTSWe would like to thank Cray, the IT Center for Science(CSC) in Finland in particular Olli-Pekka Lehto, and thethe Swiss National Supercomputing Centre (CSCS) for pro-viding us with exploratory access to their Cray systems, andfor their valuable feedback on our work.The development of EasyBuild is supported by Ghent Uni-versity, the Flemish Supercomputer Centre (VSC), the Flem-ish Research Foundation (FWO) and the Flemish Govern-ment, department EWI.This research is part of the Blue Waters sustained-petascalecomputing project, which is supported by the National Sci-ence Foundation (awards OCI-0725070 and ACI-1238993)and the state of Illinois. Blue Waters is a joint effort of theUniversity of Illinois at Urbana-Champaign and its NationalCenter for Supercomputing Applications.9. REFERENCES[1] Conda. http://conda.pydata.org/docs/.[2] Easybuild documentation.http://easybuild.readthedocs.org.[3] Gnu guix package manager.https://www.gnu.org/software/guix/.[4] Hashdist build it once.https://hashdist.github.io/.[5] Nix the purely functional package manager.https://nixos.org/nix/.[6] pkgsrc portable package build system.https://www.pkgsrc.org/.[7] Portage. https://wiki.gentoo.org/wiki/Portage.[8] P. S. Centre. Maali pawsey supercomputing centrebuild system. https://github.com/Pawseyops/maali.[9] L. Courtes. Functional package management withguix. CoRR, abs/1305.4584, 2013.[10] L. Courtes and R. Wurmus. Reproducible anduser-controlled software environments in hpc withguix. In Euro-Par 2015: Parallel ProcessingWorkshops, pages 579591. Springer, 2015.[11] CSCS. Piz kesch cs-storm system description. http://www.cscs.ch/computers/kesch_escha/index.html.[12] A. Devresse, F. Delalondre, and F. Schurmann. Nixbased fully automated workflows and ecosystem toguarantee scientific result reproducibility acrosssoftware environments and systems. In Proceedings ofthe 3rd International Workshop on SoftwareEngineering for High Performance Computing inComputational Science and Engineering, pages 2531.ACM, 2015.[13] A. DiGirolamo. The smithy software installation tool,2012.http://anthonydigirolamo.github.io/smithy/.[14] J. L. Furlani. Providing a Flexible User Environment.In Proceeding of the Fifth Large Installation SystemAdministration (LISA V, pages 141152, 1991.[15] J. L. Furlani and P. W. Osel. Abstract yourself withModules. In Proceeding of the Tenth Large InstallationSystem Administration (LISA 96, pages 193204,1996.[16] T. Gamblin, M. LeGendre, M. R. Collette, G. L. Lee,A. Moody, B. R. de Supinski, and S. Futral. The spackpackage manager: Bringing order to HPC softwarechaos. In Proceedings of the International Conferencefor High Performance Computing, Networking,Storage and Analysis, ser. SC, volume 15, 2015.[17] M. Geimer, K. Hoste, and R. McLay. Modern scientificsoftware management using easybuild and lmod. InProceedings of the First International Workshop onHPC User Support Tools, pages 4151. IEEE Press,2014.[18] K. Hoste, J. Timmerman, A. Georges, andS. De Weirdt. Easybuild: Building software with ease.In High Performance Computing, Networking, Storageand Analysis (SCC), 2012 SC Companion:, pages572582. IEEE, 2012.[19] HPC-UGent. Easybuild building software with ease.http://hpcugent.github.io/easybuild/.[20] C. Inc. Cray programming environment users guide.http://docs.cray.com/books/S-2529-116//S-2529-116.pdf.[21] Environment Modules Project.http://modules.sourceforge.net.[22] N. Jones and M. R. Fahey. Design, Implementation,and Experiences of Third-Party SoftwareAdministration at the ORNL NCCS. In Proceedings ofthe 50th Cray User Group (CUG08), 2008.[23] V. G. V. Larrea, W. Joubert, and C. Fuson. Use ofcontinuous integration tools for applicationperformance monitoring. In Proceedings of the 57thCray User Group (CUG15), 2015.[24] J. Layton. Environment Modules A Great Tool forClusters. Admin HPC. http://www.admin-magazine.com/HPC/Articles/Environment-Module.[25] MeteoSwiss. The new weather forecasting model forthe alpine region.http://www.meteoswiss.admin.ch/home/latest-news/news.subpage.html/en/data/news/2016/3/the-new-weather-forecasting-model-for-the-alpine-region.html.[26] SWTools, 2011. https://www.olcf.ornl.gov/center-projects/swtools.[27] Todd Gamblin. Spack.http://scalability-llnl.github.io/spack/.[28] F. R. C. H. University. Harvard extensions for lmoddeployment (helmod), 2013.https://github.com/fasrc/helmod.http://easybuild.readthedocs.orghttps://www.gnu.org/software/guix/https://hashdist.github.io/https://nixos.org/nix/https://www.pkgsrc.org/https://wiki.gentoo.org/wiki/Portagehttps://github.com/Pawseyops/maalihttp://www.cscs.ch/computers/kesch_escha/index.htmlhttp://www.cscs.ch/computers/kesch_escha/index.htmlhttp://anthonydigirolamo.github.io/smithy/http://hpcugent.github.io/easybuild/http://docs.cray.com/books/S-2529-116//S-2529-116.pdfhttp://docs.cray.com/books/S-2529-116//S-2529-116.pdfhttp://modules.sourceforge.nethttp://www.admin-magazine.com/HPC/Articles/Environment-Modulehttp://www.admin-magazine.com/HPC/Articles/Environment-Modulehttp://www.meteoswiss.admin.ch/home/latest-news/news.subpage.html/en/data/news/2016/3/the-new-weather-forecasting-model-for-the-alpine-region.htmlhttp://www.meteoswiss.admin.ch/home/latest-news/news.subpage.html/en/data/news/2016/3/the-new-weather-forecasting-model-for-the-alpine-region.htmlhttp://www.meteoswiss.admin.ch/home/latest-news/news.subpage.html/en/data/news/2016/3/the-new-weather-forecasting-model-for-the-alpine-region.htmlhttp://www.meteoswiss.admin.ch/home/latest-news/news.subpage.html/en/data/news/2016/3/the-new-weather-forecasting-model-for-the-alpine-region.htmlhttps://www.olcf.ornl.gov/center-projects/swtoolshttps://www.olcf.ornl.gov/center-projects/swtoolshttp://scalability-llnl.github.io/spack/https://github.com/fasrc/helmodIntroductionEasyBuildGoalsTerminologyEasyBuild frameworkEasyblocksEasyconfig filesToolchainsFeaturesStep-wise installation procedureGenerating module filesDependency resolutionLoggingReproducibilityFlexibilityTransparencyEasyBuild CommunityFeatures for and by the communityEasyBuild on Cray systemsCray Programming EnvironmentModules interfaceThe programming environment module PrgEnvAdditional tools and librariesIntegrating EasyBuild with the Cray PELeveraging the Cray PE modulesBetter awareness of session environmentVersion pinning of toolchain componentsEnhancements to EasyBuildSupport for external modulesGenerating more sophisticated module filesEliminating the use of module purgeCray-specific compiler toolchainsCross-compilation and target architectureTesting & evaluationSystemsSoftware applicationsUse case: Python on CrayMotivationPython support in EasyBuildInstalling Python on top of the Cray PEManaging Python packagesEasyBuild at CSCSCSCS systems using EasyBuildConsistency for users and support teamUse case: EasyBuild on Cray CS-StormTesting via continuous integrationRelated workSpackOther installation tools for HPC systemsMore generic software installation toolsConclusionsAcknowledgementsReferences

Recommended

View more >