Building and Modifying
Condor
Nick LeRoy
Computer Sciences Department
University of Wisconsin-Madison
[email protected]
http://www.cs.wisc.edu/condor
CondorProject.org
Before I start …
› If you have any questions, stop me along the way
› There should hopefully be time for discussions after the talk
› Feel free to talk to me, or any of the Condor developers, any time during the conference
› Todd will give the last part of the talk
• Windows specifics
Space Requirements
› 5 GB is probably enough
• The actual amount depends on which features are built
• Bare minimum: 2 GB
› Temporary space is required for building externals; it is cleaned up automatically
UNIX Requirements
› Most tools are standard on Linux development systems
› In other cases, they can be downloaded as binaries
› Or, downloaded as source and built by hand
UNIX Requirements List
› GNU tools:
• GNU make
• GNU autoconf and autoheader (2.59 or greater)
• GNU tar (1.13 or higher)
• GNU Compiler Collection (gcc >= 2.95.3)
• gzip
› Other tools:
• perl (5.005_03 or greater)
• patch (must support unified diffs, GNU patch is preferred)
• strip (can be either GNU or the vendor's version)
• lex
• yacc (or GNU bison)
› Some other typically found utilities (for example, cut and awk)
Getting it
› Download from the same place that you download the rest of Condor
› In the form of a gzip-ed “tarball”
› Unpack the tarball
• If you don’t know how to do this, try:
rm condor_src-7.1.0-all-all.tar.gz
First Glance
› BUILD-ID
• NMI build ID, you can ignore this
› config and imake
• Yes, we still use imake
• The rest of the world wisely abandoned it years ago …
• You can probably ignore these
• Adds requirement: GNU cpp <= 4.1.3
› LICENSE-2.0.txt
• Copy of the Apache License, Version 2.0
• The license under which we’ve released Condor
Interesting Pieces
› README.building
• Document describing building Condor
› NTconfig
• Files required for building under Windows
› externals
• Externally maintained packages
• Some are “hard” requirements, others “soft”
› src
• The Condor source code
Simple Build
› The basic Condor build is simple:
$ cd src
$ ./build_init
$ ./configure
$ make
Didn’t work?
› Most common problem is that you’re trying to build on a system that we haven’t ported the Standard Universe to
› Solution: Disable the standard universe and try again
$ ./configure --disable-full-port \
--disable-gcc-version-check
$ make
Externals
› Always have your bags packed
• Bags are getting pretty big these days
• Globus, ClassAds, PCRE, zlib, Kerberos
› Externals and versions are selected by configure
• To use system packages:
$ ./configure --enable-proper
• “All or nothing”
• Some features (in particular Condor-G) will be disabled
• We’re working on making this selective
• Externals tree selected by:
$ ./configure --with-externals=/path/tree
First look at src
› CODING_GUIDELINES
› condor_*
• Directories with most of the source code
• In the future, we’ll rename them and get rid of the condor_ prefix
• Also: h
• We’ll look at more of these later
Configuring the build
› Uses GNU configure
• Some options, like --prefix, don’t work
• Make sure that the cpp you use isn’t >= 4.2
$ export CXXCPP=/usr/bin/cpp-4.1
$ ./configure
• Default:
$ ./configure
Minimal configuration
› To save disk space & time, use --without-xxx or --disable-xxx to turn off features you don’t care about
• Use ./configure --help to get a list of them
• Packages listed as “hard requirement” can’t be turned off
• There are some interdependencies
$ ./configure --without-globus --without-nordugridgahp --without-unicoregahp --without-gt4gahp --without-srb --without-oci --without-gcb --without-gsoap --without-drmaa --without-gahp --without-blahp --disable-full-port
Some Problems & Solutions: Unknown GCC version
configure: error: Condor will not compile with gcc version 4.2.1
› Try:
$ ./configure --disable-gcc-version-check
• The build itself may fail due to compiler incompatibilities
Some Problems & Solutions: Unknown glibc version
checking glibc... ERROR
configure: error: Condor does NOT know what glibc external to use with glibc-2.6.1
› Edit (yeah, with vi or emacs) configure.ac
• Around line 2500, add a block for your glibc version (cut & paste from nearby):
"2.6.1" ) # OpenSUSE 10.3 uses glibc 2.6.1
    including_glibc_ext=NO
    ;;
• Rerun ./build_init for this to take effect
Build it
› From the src directory:
$ make
› Will build the externals as required
› Go get a beverage – this could take quite a while
Build Problems & Solutions: Error in ClassAds external
classads-1.0rc5: FAILED! (see /home/condor-7.1.0/externals/build/log.classads-1.0rc5)
› Disable ClassAds in configure:
$ ./configure --without-classads
• condor_q -better-analyze will be broken
Build Problems & Solutions: Error building other externals
xxxx-1.2.3: FAILED! (see /home/condor-7.1.0/externals/build/log.xxxx-1.2.3)
› Disable xxxx in configure:
$ ./configure --without-xxxx
› If this is a “hard requirement” or you rely on this feature:
• Look in the above log and correct the problem
Build Problems & Solutions: Standard Universe
/tmp/IIf.0twp5X:114:6: error: #error Checkpoint library not compatible with compiler!
../../imake/imake: Exit code 1.
Stop.
› Standard Universe features haven’t been ported to this compiler / platform yet.
$ ./configure --disable-full-port
It built!
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/home/build/condor-7.1.0/src/condor_examples'
$ make release
…
Build targets
› Testing release
• $ make release
• Suitable for testing
• Creates release_dir
› Public release
• What we actually release to the public
• $ make public
• Packaged tarballs wind up in ../public
Test It
› We’ll create a test installation of our Condor build
• We built Condor in /home/condor-7.1.0
• We’ll make our test directory a subdirectory of that
• /home/condor-7.1.0/install
› Do a basic install of the Condor in release_dir, just like you would any other Condor install
› Or …
Test Installation (Step by step)
$ CONDOR=/home/condor-7.1.0/install
$ mkdir $CONDOR
$ cd $CONDOR
$ mkdir checkpoints cred_dir execute spool log test
$ ln -s ../release_dir/* .
$ cp etc/examples/condor_config.generic etc/condor_config
$ export CONDOR_CONFIG=$CONDOR/etc/condor_config
$ vi $CONDOR_CONFIG
$ export PATH=$CONDOR/bin:$CONDOR/sbin:$PATH
$ rehash
$ condor_master
Simple checks
› Run ‘ps’, verify that the Condor processes are running
› Run condor_status -any
› Run condor_status to verify that the Startd’s machine is correct
• Make sure that you wait a bit for the Startd to publish its ad(s)
› Look through the logs
› Submit a simple “hello world” test job, verify that it runs as expected
More tests
› We have a whole suite of tests
$ cd condor_tests
$ make
$ ./batch_test.pl -b
IsThisNightly passed <…/src/condor_tests>
Workspace testing
…
submitting . tests
lib_chirpio_van.run succeeded
lib_procapi_pidtracking-snapshot.run succeeded
…
› Wait patiently (very patiently)
Use the source, Luke
› Libraries
› Daemon Core
› Client (command line) Tools
› Daemons
› Standard Universe
› Other
Source Directories
› Most of the directory names are pretty clear
› We’re in the process of cleaning up, moving things around, and renaming, so be prepared for changes over time
• GIT is finally giving us this freedom
› Quite a few have version numbers in the name that make little or no sense to the outside world (condor_startd.V6, …)
• This will get cleaned up, too
Layering
[Diagram of the source layers, labeled:]
C++ Utilities, C Utilities
ClassAds, I/O, Daemon Client, Daemon Core, ProcAPI, SysAPI
Master, Quill, Startd, Shadow, Starter, Collector
Submit, Q, tools, etc.
“h”, includes
Condor Libraries
› The layering is not perfect; there are interdependencies
› General purpose:
• condor_util_lib
• condor_c++_util
› I/O & Networking:
• condor_io
• condor_daemon_client
› Process Tracking:
• condor_procapi
› System Information:
• condor_sysapi
› ClassAds:
• condor_classad
› Daemon Core
• condor_daemon_core.V6
C / C++ Utilities
› In general, there’s a utility for everything
• POSIX and stdio library wrappers
• C++ Standard library replacements
• Condor templates (CTL)
• We don’t use STL for hysterical reasons
› Designed to be portable
› Look here before reinventing the wheel
C: dprintf()
› Works like printf()
• Conditionally writes to the log
dprintf(D_ALWAYS, "Two + two is %d\n", 2+2);
› OR levels together to log at several of them, e.g. dprintf(D_COMMAND|D_SECURITY, <…>);
› Useful debug levels
• D_ALWAYS
• D_FULLDEBUG
• Everything else is probably too esoteric (see condor_debug.h)
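The level-masking idea can be sketched in standalone C++. This is not Condor's dprintf(): the flag values, `enabled_levels`, and `sketch_dprintf()` are hypothetical stand-ins invented for illustration (the real constants live in condor_debug.h), and the sketch returns the formatted text instead of writing to a log so that it is easy to check.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical flag values; Condor's real ones are in condor_debug.h.
enum : unsigned {
    SK_ALWAYS    = 1u << 0,
    SK_FULLDEBUG = 1u << 1,
    SK_COMMAND   = 1u << 2,
    SK_SECURITY  = 1u << 3,
};

// Levels currently enabled for this (imaginary) daemon's log.
static unsigned enabled_levels = SK_ALWAYS | SK_COMMAND;

// Formats like printf() and returns the text that would be logged,
// or an empty string when none of the requested levels is enabled.
std::string sketch_dprintf(unsigned levels, const char *fmt, ...) {
    if ((levels & enabled_levels) == 0) {
        return std::string();
    }
    char buf[512];
    va_list args;
    va_start(args, fmt);
    std::vsnprintf(buf, sizeof buf, fmt, args);
    va_end(args);
    return std::string(buf);
}
```

With the mask above, a message tagged `SK_COMMAND | SK_SECURITY` is produced because one of its two levels is enabled, while a message tagged only `SK_FULLDEBUG` is dropped.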
C++: MyString.h
› Similar to STL’s string
› Prefer MyString buffer to char buffer[1024]
• automatically allocates and resizes memory
› Notable methods / operators:
• sprintf() and sprintf_cat()
• Value() and GetCStr() – read-only access
• += is overloaded to append a lot of types to the string
• perl-like chomp() and trim() to get rid of whitespace
• readLine() that can slurp in data from a FILE* and ostreams
• replacement for strtok()
› Other tricks
• search for substrings
• escape characters
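For readers who have not met perl's chomp and trim, here is a rough sketch of what such helpers do, written against std::string. The function names `sketch_chomp()` and `sketch_trim()` are made up, and the exact semantics of MyString's own methods are an assumption here; check MyString.h for the real behavior.

```cpp
#include <string>

// Removes one trailing newline ("\n" or "\r\n"), perl-style.
// Returns true if something was removed.
bool sketch_chomp(std::string &s) {
    if (s.empty() || s.back() != '\n') {
        return false;
    }
    s.pop_back();
    if (!s.empty() && s.back() == '\r') {
        s.pop_back();
    }
    return true;
}

// Strips leading and trailing whitespace in place.
void sketch_trim(std::string &s) {
    const char *ws = " \t\r\n";
    std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) {
        s.clear();  // the string was all whitespace
        return;
    }
    std::string::size_type last = s.find_last_not_of(ws);
    s = s.substr(first, last - first + 1);
}
```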
C++: Configuration
› Look up values from the configuration
› NOT a ClassAd!
› Basic: param(const char *name)
• Returns a char * that you must decode manually
• You MUST free() this buffer!
› Others: param_<type>(<name>)
• Decodes to the specified type, and free()’s the buffer
• Does NOT handle expressions!
• Integer: param_integer(<name>)
• Double: param_double(<name>)
• Boolean: param_boolean(<name>)
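The ownership rule above (the basic lookup hands you a buffer to free(), the typed helpers free it for you) can be illustrated with a small standalone mock. Nothing here is Condor code: `sketch_config`, `sketch_param()`, and `sketch_param_integer()` are hypothetical, and the `default_value` argument is an assumption added so the sketch has something to return when a name is unset.

```cpp
#include <cstdlib>
#include <cstring>
#include <map>
#include <string>

// Stand-in for the configuration table; names and values are made up.
static std::map<std::string, std::string> sketch_config = {
    {"MAX_JOBS", "25"},
    {"LOG", "/tmp/log"},
};

// Basic lookup: returns a malloc()ed copy of the raw value, or a null
// pointer if the name is not set. The caller must free() the result.
char *sketch_param(const char *name) {
    auto it = sketch_config.find(name);
    if (it == sketch_config.end()) {
        return nullptr;
    }
    char *copy = static_cast<char *>(std::malloc(it->second.size() + 1));
    if (copy != nullptr) {
        std::strcpy(copy, it->second.c_str());
    }
    return copy;
}

// Typed helper: decodes the value and frees the buffer internally,
// falling back to default_value when the name is unset or not a number.
int sketch_param_integer(const char *name, int default_value) {
    char *raw = sketch_param(name);
    if (raw == nullptr) {
        return default_value;
    }
    char *end = nullptr;
    long value = std::strtol(raw, &end, 10);
    bool ok = (end != raw && *end == '\0');
    std::free(raw);
    return ok ? static_cast<int>(value) : default_value;
}
```

The typed helper is the safer default because the free() cannot be forgotten by the caller.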
C++: Boolean Configuration Expressions
› Boolean Expression: param_boolean_expr(<name>)
• This one DOES handle expressions
• Configuration:
WIZBANG = ( FUBAR > 10 || SUPERCALIFRAGILISIC )
• Source Code:
bool wizbang = param_boolean_expr( "WIZBANG" );
More C & C++
› Wrappers and similar:
• safe_open_wrapper(), my_popen()
› “CTL”
• ExtArray, string_list, Queue, tree, stringSpace, counted_ptr
› A lot of other classes & functions
• File / Directory access classes: Directory, StatInfo
• exponential_backoff
• my_hostname(), my_username()
Condor I/O & Networking
› All Condor daemons have a “Command Socket”
› Data is encoded with CEDAR
• Condor External DAta Representation
› CEDAR is all-singing, all-dancing
• Data representation
• socket abstraction
• Security
• bandwidth limiting
• port ranges
Stream, Sock, et al.
› The layering of the Condor socket objects is not obvious
› Stream (base class, in stream.{h,C} )
• CEDAR streaming
• Integers, chars, strings, etc.
› Sock (derived from Stream, in sock.{h,C} )
• Adds connection / session management
› ReliSock (derived from Sock, in reli_sock.{h,C} )
• TCP-specific “Sock”
› SafeSock (derived from Sock, in safe_sock.{h,C} )
• UDP-specific “Sock”
Daemon Client
› Series of classes with knowledge of how to communicate with specific daemons
• Master, Collector, Startd, etc.
› All derived from a common base
ClassAds
› C++ API to access the ClassAds that Condor uses internally
› “Old” ClassAds
› Subclassed from AttrList, so look there
› Lookup() versus Eval()
• Lookup() will return “7 + 2”
• Eval() will return 9
› ClassAds are parsed to ExprTree(s)
• Can generally avoid this and use Eval<Type>
› Insert() and Assign() to update the ad
› sPrint(), fPrint(), and dPrint() to serialize
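The Lookup() versus Eval() distinction can be shown with a toy ad that stores expressions as text. This is a minimal sketch, not Condor's ClassAd or AttrList class: `SketchAd` and its methods are hypothetical, and the evaluator only understands integer literals joined by '+'.

```cpp
#include <map>
#include <sstream>
#include <string>

// Toy "ad": attribute name -> unevaluated expression text.
class SketchAd {
public:
    void insert(const std::string &attr, const std::string &expr) {
        exprs_[attr] = expr;
    }

    // Lookup-style access: hands back the expression as written.
    bool lookup(const std::string &attr, std::string &expr_out) const {
        auto it = exprs_.find(attr);
        if (it == exprs_.end()) {
            return false;
        }
        expr_out = it->second;
        return true;
    }

    // Eval-style access: computes a value from the stored expression.
    // Only sums of integer literals are supported in this toy version.
    bool eval_integer(const std::string &attr, int &value_out) const {
        std::string expr;
        if (!lookup(attr, expr)) {
            return false;
        }
        std::istringstream in(expr);
        int total = 0;
        int term = 0;
        char op = '+';
        if (!(in >> total)) {
            return false;
        }
        while (in >> op) {
            if (op != '+' || !(in >> term)) {
                return false;
            }
            total += term;
        }
        value_out = total;
        return true;
    }

private:
    std::map<std::string, std::string> exprs_;
};
```

Storing "7 + 2" under an attribute and calling `lookup()` yields the text "7 + 2", while `eval_integer()` yields 9, mirroring the two bullets above.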
Condor Daemons
› The code for most Condor daemons is in directories named after the daemon:
• Startd is in condor_startd.V6 …
› Note: 2 sets of starters / shadows
• condor_starter.V5 and condor_shadow.V6: Standard Universe
• condor_{starter,shadow}.V6.1: all others
Daemon Core
› Heart and body of a Condor daemon
› Usually a singleton object
› Event-driven loop around select()• Single threaded!
› Your code registers events for select() and callbacks
• Timers, Pipes, Signals, Reaper, Socket, CEDAR “Commands”
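The "event-driven loop around select()" pattern looks roughly like the sketch below. It is a toy, POSIX-only dispatcher written for this explanation, not Daemon Core itself: `SketchLoop`, `register_socket()`, and `run_once()` are invented names, and only read-readiness on file descriptors is handled (no timers, signals, or reapers).

```cpp
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>  // read(), for use inside handlers
#include <functional>
#include <map>
#include <utility>

// Toy single-threaded dispatcher: callbacks keyed by file descriptor.
class SketchLoop {
public:
    using Handler = std::function<void(int fd)>;

    // Register a callback to run when fd becomes readable.
    void register_socket(int fd, Handler handler) {
        handlers_[fd] = std::move(handler);
    }

    // One pass of the loop: wait in select() for up to timeout_sec
    // seconds, then call the handler of every readable descriptor.
    // Returns the number of handlers invoked (0 on timeout, -1 on error).
    int run_once(int timeout_sec) {
        fd_set readable;
        FD_ZERO(&readable);
        int max_fd = -1;
        for (const auto &entry : handlers_) {
            FD_SET(entry.first, &readable);
            if (entry.first > max_fd) {
                max_fd = entry.first;
            }
        }
        timeval timeout;
        timeout.tv_sec = timeout_sec;
        timeout.tv_usec = 0;
        int ready = select(max_fd + 1, &readable, nullptr, nullptr, &timeout);
        if (ready <= 0) {
            return ready;
        }
        int called = 0;
        for (const auto &entry : handlers_) {
            if (FD_ISSET(entry.first, &readable)) {
                entry.second(entry.first);
                ++called;
            }
        }
        return called;
    }

private:
    std::map<int, Handler> handlers_;
};
```

Because everything runs on one thread, a handler that blocks stalls every other event, which is why handlers in this style are expected to return quickly. This sketch also assumes handlers do not register new descriptors while `run_once()` is iterating.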
Registering a Callback
› Use Daemon Core’s Register_Command() method:
daemonCore->Register_Command(128, "SAY_HELLO",
    (CommandHandler)&say_hello, "say_hello", NULL, READ, D_FULLDEBUG );
› Parameters:
• The command number (usually defined in condor_commands.h and condor_commands.C)
• Text description of the command
• "CommandHandler", which is really a function pointer
• Text description of the handler
• The service class to use; since this is a C handler, we don't need one
• The permission level required to call this function (e.g. HOSTALLOW_READ, HOSTALLOW_ADMINISTRATOR, etc.)
• What dprintf() level to use
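To make the parameter list more concrete, here is a toy command table with the same flavor: a command number maps to a description, a C-style function pointer, and a required permission. All names (`SketchCommandTable`, `SketchPerm`, `sketch_say_hello`) are hypothetical, and treating permissions as a simple ordered scale is an assumption of this sketch rather than a description of Condor's security model.

```cpp
#include <map>
#include <string>

// Hypothetical permission scale; higher values imply more privilege.
enum SketchPerm { PERM_READ = 1, PERM_WRITE = 2, PERM_ADMINISTRATOR = 3 };

// Handlers are plain function pointers, as with a C handler.
typedef int (*SketchHandler)(int command);

struct SketchEntry {
    std::string name;       // text description of the command
    SketchHandler handler;  // function to call when the command arrives
    SketchPerm required;    // minimum permission level of the caller
};

class SketchCommandTable {
public:
    // Returns false if the command number is already taken.
    bool register_command(int number, const std::string &name,
                          SketchHandler handler, SketchPerm required) {
        if (table_.count(number) != 0) {
            return false;
        }
        table_.emplace(number, SketchEntry{name, handler, required});
        return true;
    }

    // Runs the handler if the command exists and the caller's permission
    // is sufficient; returns -1 otherwise.
    int dispatch(int number, SketchPerm caller) const {
        auto it = table_.find(number);
        if (it == table_.end() || caller < it->second.required) {
            return -1;
        }
        return it->second.handler(number);
    }

private:
    std::map<int, SketchEntry> table_;
};

// Example handler, loosely mirroring the SAY_HELLO registration above.
int sketch_say_hello(int command) {
    return command;  // a real handler would read from the socket here
}
```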
Some guidelines
› You must not
• Throw an exception
• Call printf() or exit() or assert()
› You can:
• call ASSERT()
• call dprintf()
Dependency Hell
› Dependency tracking works on Windows
› Our build system has no knowledge of dependencies
› If you modify an include file, make sure that everything that depends on it gets rebuilt
• $ make clean && make
More on Dependencies
› Objects from some directories need to get “repackaged” with the C++ library
• condor_classad
• condor_daemon_client
› Thus, to rebuild these:
• $ make && make -C ../condor_c++_util
(Even) More on Dependencies
› If you’re working on a daemon and make a library change
• Example daemon: Startd in the condor_startd.V6 directory
• Example library: condor_daemon_client
$ make -C ../condor_daemon_client && make -C ../condor_c++_util && make release
› If you modified dc_startd.h and want to be paranoid:
$ (cd ../condor_daemon_client && make clean && make)
$ (cd ../condor_c++_util && make clean && make)
$ make clean && make release
Adding a Source File
› Add the file to the appropriate section of the Imakefile
• No, I’m not going to explain our Imakefile syntax here
$ ../condor_imake
$ make
Testing & Debugging
› OK, you’ve built a modified Startd; how do you test / debug it?
• Remove STARTD from DAEMON_LIST
• Start the master
• Run the startd by hand
$ ./condor_startd -t -f
• -t to log to stdout
• -f to run it in the foreground
• CTRL-C to kill it
More debugging
› Segfaults can sometimes be caused by object version mismatches
• You added a field to a class in C++ Util, but didn’t rebuild the Startd that uses the class
› With the -t and -f flags, you can debug like any other program
• Adding dprintf()’s
• With gdb
• Using strace
Test it again
› After debugging, you should thoroughly test your modified code
› Maybe re-run the test suite (as described previously)
Latest Source
› Where can I get access to the latest source code?
• Currently, all that’s available to the general public is the source tarballs
• We hope to make “GIT balls” available in the near future
• You’d have access to log information, etc.
• We won’t provide access to our live GIT repository
• We’re worried about exposing security problems before fixed binaries are publicly available
Before Contributing to Condor
› Before we can accept any code from you, you must sign our Contributor License Agreement (CLA)
• Based on the Apache CLA
• You give us the right to do anything with your code, but you keep the rights to do whatever you want with it, too.
› Talk to us before large contributions
• We may be already working on the same thing you are
• We may have a “bigger” plan for solving the problem
› We recommend that you subscribe to the condor-devel mailing list
• Discuss your idea / patch with the other Condor developers, get feedback, etc.
Contributing to Condor
› Where do I send my patch / code?
• Send patch to condor-devel list
• Send it to condor-admin
Questions?
› Talk to any of the Condor developers
› Email to [email protected]
› Email to [email protected]
› Thanks!