
The design of vision solutions

Introduction

Purpose This document describes a plan for providing software support in the development of vision solutions. It aims at consolidating the products of the IOP-project. Without an explicit software consolidation step as described here, it is fair to conclude that the code developed in the IOP-programme would largely be lost.

Overview Software support should be based not on what technology can deliver, but on the real needs of the designer of a vision solution. To that end, we start by analyzing the state of the art in vision. Then we consider the development of a solution as a business process and as a software design process. From there we derive requirements on vision libraries tailor-made for the business and software design process. Finally we specify a product and a plan to develop the product.

Application areas The analysis considers vision as used in manufacturing, inspection, steering, observing, guiding a vehicle, or any similar industrial activity. With minor adaptations, the analysis is applicable to robotics, agriculture and multimedia analysis. Development in medical image analysis may be partially different because of its large amounts of detailed a priori knowledge on the shape and appearance of human organs, its high demands on interactivity and visualization, and the legal consequences of errors in the processing.

No easy solutions One of the most important assets of application centres for computer vision is their capability to demonstrate or to falsify the feasibility of a solution. Problem statements for computerized vision from industry, agriculture or medicine are never formulated as an off-the-shelf solution when they first reach an application centre. It is rarely the case that tuning some parameters alone will provide a solution. The process from problem articulation to feasibility and on to practical solutions is communication, labour, experience and knowledge intensive.

Seeing is believing At the current state of the art it is safe to conclude that industrial, agricultural and medical vision are undecided. That is, when confronted with a problem, in the majority of cases one cannot predict whether the problem is solvable. At each stage of the design process, some solution has to be implemented before it can be concluded that the solution is really viable. In short, for most vision problems the existence of a solution cannot be assured until the solution is actually shown. In this respect, vision departs drastically from other information systems, where it is usually known whether a problem can be solved without knowing precisely what the solution is.

Stages in design In the design of vision systems, the design process is often described as a staged process. First quickly work out a slow solution on a very limited dataset while visualising all intermediate processing results as a proof of concept. Then proceed to more solid, less visualized code while shifting the emphasis of evaluation towards robustness on large datasets expressed in a figure of merit. Finally, redesign the code to optimize it towards the run-time platform and perform run-time tests.

Development as a process


Design process To support the design process in an application centre with software, we take a more detailed look at the design cycle of industrial, medical or agricultural vision applications. The design process still has many aspects of a craft. For the purpose of analysis, we break the design cycle down into a number of boxes.

[Figure: flowchart of the design cycle with five boxes (communication, diagnostics, research, design, development). Activity labels: Problem articulation; First data; Real-life data; Real-life & simulated data; Algorithmic design; Algorithmic performance; On-line trials; Robustness tests; Prepare production. Milestones: Problem statement; Feasible; Proof of first possibility; Proof of principle; Zero product; Ready for product. Exits: Not likely feasible; Not feasible; Unsolvable; Dropped.]

1. Articulation First of all, the problem needs to be articulated as a commercial opportunity that is economically founded and feasible from a manufacturing or production point of view. Within the first box, the emphasis is on communication with the client. The problem statement will likely cover:

classification by problem type, data in all its variability, availability of test data, sensor, light and scene, the desired outcome, the desired accuracy, operational circumstances, the type of hardware, manufacturing issues, critical conditions: time, parameter dependency, fault tolerance, the requested adaptability, the embedding system, software, middleware, interactivity, maintenance and support, heritage, and possibilities for multipliers into other problems.


We will not deal with these aspects here. We recognize, however, that the articulation of the problem is an important step for success, if not the most important one. The required expertise in this box is communication skills, intuition for computer vision solutions, the ability and experience to see the white spots and black holes, and broad operational access to a large variety of solutions. This phase needs to be completed to have a full risk-reward assessment available. The software tools required here are demonstration tools and design tools.

2. Diagnostics In the second box, which we call problem diagnostics, it is assessed whether the problem is understood correctly. How can the problem be solved by analyzing similar problems? Can trial solutions be designed on an example image? Compared to the current practice this stage needs to be separated from the next step. Too many projects skip this stage and stumble into the next stage, where they solve the wrong problem. In order to shorten the time to market, this phase needs to be shortened most. The typical target time is 1 week, achieved mostly by restricting what is done here to parameter tweaking. The skills required for this box are mastery of the state of the art and short development cycles, and hence programming by picking from function libraries, parameter setting and visualisation. The software tools required here are prototyping tools, large libraries of solutions, and visualisation tools.

3. Research In the third box, which we call research, it is assessed whether the problem can be solved at all. Compared to the current practice this stage needs to be sped up. This is the most costly stage in terms of time, risk and money spent. The better and broader the tools available in this stage, the better the chance that the research is finished in time. The skills required for this box are in-depth expertise in vision, knowledge of the literature, design creativity, and an abstract and future-safe programming style. The tools required here are a large collection of amendable software libraries of basic tools with visualisation capabilities, embedded in an experimentation environment. The libraries should be available at the level of source code to make the effort minimal. In fact, new code is best designed starting from abstract patterns for which the basics of pixel addressing have already been solved.

4. Design In the design box the solution in principle is designed further to match all operational conditions. Compared to the current practice this stage needs to be matched better with the preceding stage of the development. A seamless transition between the research solution and the design stage helps to shorten the overall design cycle. The skills required for this box are an eye for practicality and practical problem hot spots, design creativity, and a problem-specific programming style. The software tools required here are amendable source code of the solution in a design environment. In the development stage, the software environment requires menu tools so that non-vision-experts can efficiently address the functions in the library.

5. Development Once a proven, robust design is there, in the development box it is developed further into a commercially viable product. Compared to the current practice this stage needs to be matched better with the preceding stage of the development. A seamless transition between the design result and the development stage helps to shorten the overall design cycle. The skills required for this box are a focus on processing speed and a platform-specific programming style. The software tools are centered on vision platforms and on-line experimentation.

Types of tools The software support of the design process may be achieved by demonstrators, which show how a similar problem has been solved; by prototyping, which clarifies the feasibility aspects of a potential solution; and by computerized instruction from textbooks, which offers theoretical or practical instructions when considering a particular solution.


Demonstrators Demonstrators are particularly useful when the idea has been insufficiently articulated. Seeing a vision solution applied to one problem may generate the spark to specify a solution to another problem as well. The software support here consists of demonstrators; keeping them available is a matter of experience and proper version management. What once has been generated for another solution may serve in the articulation of the next problem, provided it is still operational and parts of that solution can be rapidly employed in the next solution.

Prototypes Prototyping aims at generating a trial solution in a very short time, with the focus on testing the feasibility of a solution. Can the problem be solved at all? At this stage time-critical performance is not the prime issue; what matters is whether a solution exists and under what circumstances it can be reached. Prototyping is indispensable for quick product articulation. From a software point of view, prototyping is highly demanding. It requires that the loop between method specification and program specification is short. Prototyping relies on the availability of the broadest possible image processing library, with a good coverage including recent scientific advancements. The scientific advancements should be addressable and tunable at various levels of understanding: both the vision expert and the domain expert should be capable of employing the operation. As a consequence, the interactivity and the visual rendering of the rationale behind the solution should be high. In the ideal case, no further step is needed to get from the prototype to the real product code. Because the prototype phase demands breadth and state-of-the-art methods at all times, whereas the product phase demands narrow, speed-optimized solutions, the barrier between the prototyping and the production-oriented phase may in practice still be considerable. In that case the barrier should be as small as possible to shorten the overall design cycle.

Instructors Instructions are helpful in the articulation phase as they provide clear bounds on what can be done and cannot be done provided that such knowledge is available from textbooks. At the current state of the art very few methods are characterized by clear instructions when to use this algorithm and when to use another one.

Data management In the expression of the problem, data need to be known in sufficient variety and detail, specified by their visual, geometrical, physical and stochastic properties. When the problem is error-critical, all outliers of the data need to be known as well. Acquisition requires data management tools to synthesize or gather data, to keep large quantities of data, to annotate them, to delineate the ideal segmentation, to analyze the results and to visualize them.

Development of an algorithm

Algorithm specification The development of an algorithm in vision can be seen as a walk through four levels of specification as shown in the figure: the computational concept, the method, the program specification and the actual code.

[Figure: Computational Concept -> Computational Method -> Program Specification -> Program Code]

For each computational concept several computational methods may exist. In turn, for each method several specifications may exist, and for each specification several codings may exist.


a. Concept As an example of this chain, consider the computational concept of an edge. It may be computed by the method of Prewitt, Sobel or Canny, each with its own preferred properties and drawbacks. The concept is usually specified at the level of continuous mathematics, as if the image data field were still dense.

b. Computational method The computational concept of Canny edge detection requires a Gaussian filter, for which several computational methods are known: the anonymous parallel implementation with separation by dimension, the recursive implementation due to Deriche, and the improved version due to Young. The computational method is usually specified in discrete mathematics, corresponding to the situation where the image data have been sampled from the real dense field.

c. Program For the computational method of the Gaussian filter in parallel implementation several program specifications exist: a parallel implementation computing the pixels in the image independently, a sequential implementation computing the pixels in the image in a fixed scan-order. The program specification can be done in data-flow diagrams using symbolic representations for building blocks or other program specification means.

d. Code For the program specification of the Gaussian filter in parallel implementation several codings exist: a specific implementation for the Gaussian filter, an instantiation of the abstract linear filter, an instantiation of the symmetric and separable linear filter, or an instantiation of the abstract parallel pattern. The code specification is done in a programming language such as C.
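One of the codings listed above, the Gaussian filter as an instantiation of a separable linear filter, can be sketched as follows. This is a minimal illustration in C++ (the language chosen for Horus later in this document), not code from SCIL-Image or Horus. The names Image2D, clampIndex, gaussianKernel, separableFilter and gaussianFilter are hypothetical, and the truncation of the kernel at 3 sigma and the clamping at the image border are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical minimal 2D image with row-major storage (not a Horus type).
template <typename T>
struct Image2D {
    std::size_t width;
    std::size_t height;
    std::vector<T> data;

    Image2D(std::size_t w, std::size_t h, T init = T())
        : width(w), height(h), data(w * h, init) {}

    T& at(std::size_t x, std::size_t y) { return data[y * width + x]; }
    const T& at(std::size_t x, std::size_t y) const { return data[y * width + x]; }
};

// Border handling by clamping the coordinate to [0, size - 1] (an assumption).
inline long clampIndex(long v, long size) {
    if (v < 0) return 0;
    if (v >= size) return size - 1;
    return v;
}

// Sampled 1D Gaussian, truncated at 3 sigma and normalised to sum 1.
inline std::vector<double> gaussianKernel(double sigma) {
    assert(sigma > 0.0);
    const int radius = static_cast<int>(std::ceil(3.0 * sigma));
    std::vector<double> kernel(2 * radius + 1);
    double sum = 0.0;
    for (int i = -radius; i <= radius; ++i) {
        const double value = std::exp(-(i * i) / (2.0 * sigma * sigma));
        kernel[i + radius] = value;
        sum += value;
    }
    for (std::size_t k = 0; k < kernel.size(); ++k) kernel[k] /= sum;
    return kernel;
}

// Abstract separable linear filter: the same 1D kernel along x, then along y.
// Results are converted back to T at the end (truncating for integer T).
template <typename T>
Image2D<T> separableFilter(const Image2D<T>& in, const std::vector<double>& kernel) {
    assert(kernel.size() % 2 == 1);
    const long radius = static_cast<long>(kernel.size() / 2);
    const long w = static_cast<long>(in.width);
    const long h = static_cast<long>(in.height);

    Image2D<double> tmp(in.width, in.height);  // horizontal pass
    for (long y = 0; y < h; ++y)
        for (long x = 0; x < w; ++x) {
            double acc = 0.0;
            for (long i = -radius; i <= radius; ++i)
                acc += kernel[i + radius] * in.at(clampIndex(x + i, w), y);
            tmp.at(x, y) = acc;
        }

    Image2D<T> out(in.width, in.height);  // vertical pass
    for (long y = 0; y < h; ++y)
        for (long x = 0; x < w; ++x) {
            double acc = 0.0;
            for (long i = -radius; i <= radius; ++i)
                acc += kernel[i + radius] * tmp.at(x, clampIndex(y + i, h));
            out.at(x, y) = static_cast<T>(acc);
        }
    return out;
}

// The Gaussian filter as an instantiation of the separable filter.
template <typename T>
Image2D<T> gaussianFilter(const Image2D<T>& in, double sigma) {
    return separableFilter(in, gaussianKernel(sigma));
}
```

Any other symmetric, separable kernel can reuse separableFilter unchanged; only the kernel generator differs, which is the point of coding a filter as an instantiation rather than as a specific implementation.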

Conditions of failure In the expression of the problem, the algorithm needs to be verified at all steps of development. The computational method needs to represent the methodological concept under all circumstances it will encounter in practice, and the conditions of failure should be accurately listed. When the concept of an edge is needed in the solution and the Prewitt edge detector fails to detect an edge at sharp corners in the contour, this may be problematic, depending on the presence and the relevance of sharp corners to the solution. The program specification also needs to represent the computational model under all circumstances it will encounter in practice, and, again, the conditions of failure should be accurately recorded. When a recursive linear filter is specified as a very quick but approximate solution for computing a filter and the operational characteristics do not permit the approximation, the solution may fail. Finally, the code needs to accurately represent the specification, in addition to being logically flawless and sufficiently fast for all data variations it might encounter. Code has rarely been tested to the degree that one can guarantee it has seen everything. When all this is fulfilled, the development of the solution loops back to questioning the assumptions underlying the computational concept. By testing the solution on the data one tests whether the concept needs adaptation to accurately fit the current problem.

Experience with software support for vision

Vision in NL In the absorption of vision in the Netherlands, a few scientific centres that have concentrated on vision have played a leading role. The vision library plus development interface SCIL-Image has played an important role in the development of the discipline in the Netherlands, next to the CBP-course. SCIL-Image not only served as the major platform where scientific centres shared their toolkits during most of the 1990s, it also served as a medium for the export of knowledge and people to industrial research and development.


SCIL-Image The unique element of SCIL-Image is the interactive C-interpreter, which enables a very short development cycle from an interactively specified sequence of methods to a program coded in C. Newly developed code could be added to the library of vision processing functions with little additional effort. As a consequence, the library grew over the years to reach its largest extent of about 500 functions callable by the developer. Therefore, SCIL-Image is a good vehicle for the evaluation of new concepts and the subsequent exchange of new implementations among developers in the Netherlands. SCIL-Image has sold over a thousand copies in spite of a limited commercial effort and a price of klf 10 per item. Philips Medical Systems built the product into their common vision module to coordinate the world-wide development of vision solutions. Over the years the revenues have balanced the maintenance costs.

Existing vision libraries

Off the shelf One point of view is that the choice of a software toolset should be limited to commercial packages, as they are the only ones ready for long-term support. Commercial software, when it is capable of solving the problem at hand, is to be preferred, as the cost of coding is very high. However, the likelihood that such an off-the-shelf solution exists for a commercially interesting application in vision is low, since in that case the product would most likely already exist. For a review see: E. Malamas, E. Petrakis, M. Zervakis, L. Petit and J-D. Legat "Industrial vision: state of the art, problems and prospects". They mention dozens of packages. The most important competitors are Khoros, Visilog, and specialized libraries as available under Matlab and Labview.

Matlab Matlab was originally designed for matrix manipulations. It is in widespread use as a mathematical and numerical toolbox for engineers. Its visualization tools are also well known. The open architecture makes it easy to use Matlab. Content-wise, Matlab is particularly good on its home ground: math. Relatively recently it has been extended with image processing capabilities, but these are considered rather limited in their coverage of modern image processing techniques. An exception is the mathematical morphology library, which is good.

Khoros Khoros is seen as a widely used system. It is an extensive library with many different sub-libraries called toolboxes. The modules are relatively independent behind one interface based on visual programming, Cantata. The interface makes Khoros suited for prototyping as long as the application is small. Large numbers of processing functions make Cantata inefficient to use, as it no longer fits on the screen. The use of Khoros requires extensive study, as so many different people have been engaged in its development that a common denominator is difficult to find. It is available for UNIX and X11. Khoros is a public domain product.

Visilog Visilog came onto the market at about the same time as SCIL-Image did. The system provides a large set of functions for Windows or UNIX based platforms. To a certain extent, the product can be expanded and adapted to new environments. Visilog is still based on C and a macro language derived from that language. It has assembled a number of specialized application fields per discipline. Visilog has a steep learning curve and does not have the highest reputation for reliability.

Review The main criticism of the above packages is that, from a software point of view, they belong to the coding style of a decade ago. They do not use an abstract coding style and hence run into maintenance problems, while they cannot keep up with a sufficient coverage of the rapidly developing theory of vision processing. None of the above packages has generally accessible source code; all have fixed proprietary data types and many platform-specific dependencies. Also, the breadth of coverage of the packages is mostly poor, while some maintain many different vision processing techniques whose competitive advantage remains unclear. In general, scientific laboratories do not use the packages very frequently.


Conclusion Apart from the obvious requirements, a state-of-the-art vision library should include:
1. good amendability up to the source code while preserving maintainability;
2. good extensibility;
3. abstract code availability to reduce the amount of coding to a bare minimum;
4. good coverage of the modern parts of vision theory.
These requirements are hard to find in current vision libraries.

Considerations and requirements on vision libraries

We arrive at the following considerations for a vision library.

Modularity Designing the software from scratch is best when the vision problem at hand departs from anything seen in the past and requires unconventional solutions, or when the operational circumstances are different from anything seen in the past. In general, when the problem is very difficult one can imagine that everything has to be in line to reach a solution at all. It is safe to conclude that this is no longer good engineering practice. The code will be dedicated to that data, that problem, that machine and those recording circumstances. Even for moon landings a modular approach using separately tested modules, each with reliable code, is better, reserving code optimizations for the very frequently performed operations. Such a modular approach may come at the expense of some computational overhead. As long as the hardware tolerates such solutions, the gain in testing and maintenance is great.

Per discipline A library-based approach to vision development has emerged in disciplines such as industrial vision, microscopy and agriculture. In the 1990s, medical imaging made less use of software libraries, as each new application there tends to be developed from scratch to include as much a priori knowledge about the sensor, the intended use and the organ in the solution as one can. New solutions tend to be improvements of solutions obtained for that organ. After the widespread acceptance of advanced deformable models, 3D interaction and 3D geometrical models, the reuse of functions from libraries is anticipated in medical vision as well.

Experienced code If for each application the processing modules were to be designed and implemented anew, the cost of development would surpass any reasonable budget. In addition, reusing software is the only feasible way to obtain reliable methods: code that has seen many different installations may be deemed reliable. The demand for the precise circumstances under which a method or a program will succeed or fail indicates that it is advantageous to have vision libraries with a proven record and a proven performance in many different situations. In other words, the need is for long-lasting code libraries with experience. The Image-library from SCIL-Image fulfills this demand. Some of the code in the library is 20 years old and most of it is 10 years old, which is very long compared to the relative age of computer vision. The proper strategy would be to use existing front-end systems and libraries where one can, provided that new code covering new developments in vision processing can be added quickly.

We derive the following requirements for a vision library.

Broad coverage Development efficiency prescribes a broad oversight of computational methods. To avoid the costly development of completely new methods, there should be a good coverage of methods known from literature, and hence to the scientific labs of the Netherlands. The library of SCIL-Image has functioned as the most effective means to exchange knowledge and problem statements. Within the library, this is most effectively supported by the availability of a large set of on-line instructors with prescriptions on when to use which methods under what circumstances. Software support is best provided by a large and diverse library of vision functions assembled over the years.

Abstraction The extent of the code requires abstract mechanisms to keep the code maintainable. SCIL-Image in the end supported 12 different data types (2D integer, 2D binary, 3D float, 2D complex, etc.) and had more than 500 different user-callable functions, resulting in more than 12 x 500 = 6000 functions. This posed too much of a burden on software maintenance. In a new library, the number of data types will go up with respect to SCIL-Image. Time sequences and colour spaces will generate new data types. As a consequence, the use of abstract data types and function instantiations for each data type is mandatory to reduce the number of actually programmed functions.

Efficient code There is a trade-off between specific and abstract libraries: processing speed may be higher for dedicated code, whereas reusability, trustworthiness and proven robustness may be better for abstract libraries. In the phase of problem articulation, speed should not be the limit, as the feasibility of the solution is best tested using as broad a library as one has available. At run time this may differ. The barrier between a general library in the articulation phase and a machine-specific library in the later stages should be small. This may be enforced by using multiple libraries behind the same interface.
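The idea of multiple libraries behind the same interface can be illustrated with a small sketch. It is an illustration only, not the Horus design: the interface MovingSum, the two implementations, the Phase enumeration, the factory makeMovingSum and the helper checkEquivalent are all hypothetical names, and a 1D moving sum stands in for a real vision operation. The caller programs against the interface, so swapping the general implementation for the faster one does not change the calling code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical common interface: sum over a window of 2*radius+1 samples,
// truncated at the borders of the signal.
class MovingSum {
public:
    virtual ~MovingSum() {}
    virtual std::vector<long> apply(const std::vector<long>& in,
                                    std::size_t radius) const = 0;
};

// General implementation: direct summation, easy to verify but slow.
class DirectMovingSum : public MovingSum {
public:
    std::vector<long> apply(const std::vector<long>& in,
                            std::size_t radius) const override {
        std::vector<long> out(in.size(), 0);
        for (std::size_t i = 0; i < in.size(); ++i) {
            const std::size_t last = in.size() - 1;
            const std::size_t lo = i >= radius ? i - radius : 0;
            const std::size_t hi = (last - i > radius) ? i + radius : last;
            for (std::size_t j = lo; j <= hi; ++j) out[i] += in[j];
        }
        return out;
    }
};

// Speed-oriented implementation: prefix sums, constant work per sample.
class PrefixMovingSum : public MovingSum {
public:
    std::vector<long> apply(const std::vector<long>& in,
                            std::size_t radius) const override {
        std::vector<long> prefix(in.size() + 1, 0);
        for (std::size_t i = 0; i < in.size(); ++i) prefix[i + 1] = prefix[i] + in[i];
        std::vector<long> out(in.size(), 0);
        for (std::size_t i = 0; i < in.size(); ++i) {
            const std::size_t last = in.size() - 1;
            const std::size_t lo = i >= radius ? i - radius : 0;
            const std::size_t hi = (last - i > radius) ? i + radius : last;
            out[i] = prefix[hi + 1] - prefix[lo];
        }
        return out;
    }
};

enum class Phase { Prototype, Production };

// Factory: the design phase selects the library, the interface stays the same.
inline std::unique_ptr<MovingSum> makeMovingSum(Phase phase) {
    if (phase == Phase::Production)
        return std::unique_ptr<MovingSum>(new PrefixMovingSum());
    return std::unique_ptr<MovingSum>(new DirectMovingSum());
}

// Cross-check of one implementation against another on given data.
inline void checkEquivalent(const MovingSum& a, const MovingSum& b,
                            const std::vector<long>& in, std::size_t radius) {
    assert(a.apply(in, radius) == b.apply(in, radius));
}
```

Because both implementations share one interface, the optimized version can be cross-checked against the general one on the same data before it replaces it.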

Environment Software verification of the algorithm requires implementation of the concept and specification into code, followed by a test run in an integrated environment that supports the development of the concept, the specification of the program and the coding. The software library should be embedded in an environment where interactivity of development, rendition of intermediate results and visualization of the data are readily available, preferably via commonly available front-ends such as IDL or Matlab. At an advanced level of support, a C++ interpreter is undesirable. Instead a pattern-based command-line interpreter would be helpful, particularly when the interpreter would automatically turn the commands into instantiated C++ code as a new function call.

Multiplier A proper strategy would be to code in such a way that most of the code multiplies into a sequel problem. The instant coding in vision could profit considerably from the use of abstract code libraries building on the essential computational patterns which recur in vision. In that case, code generation boils down to instantiation of existing abstract code. This helps best in bringing down the code generation time.

Data environment At the end of development, the newly generated code needs to be run on sufficient data by experimenting in a systematic way. Experimenting may be done on synthetic data, on real data off-line (provided the data set is realistic), and on real data on-line. Experimentation in trial runs should determine all critical conditions of the algorithm. Data management tools are needed to synthesize or gather data, to keep large quantities of data in a database, to annotate them where needed with the a priori class, to delineate the ideal segmentation in the data as a golden reference, to analyze the results and to visualize their distributions. Most of these tools should be incorporated from separate packages such as database systems, visualisation packages, and statistical libraries. This requires explicit attention to software package coupling, preferably via middleware such as CORBA.
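As an illustration of analyzing results against a golden reference, the sketch below compares a computed binary segmentation with the delineated ideal segmentation and summarizes a set of trial runs in a single number. The choice of measure is an assumption made for this example: the text does not prescribe one, and the overlap ratio used here (intersection over union) and the acceptance threshold are only one possible figure of merit. The function names overlapScore and acceptanceRate are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Overlap between a computed binary segmentation and the golden reference:
// (pixels set in both) / (pixels set in either). Non-zero means "object".
// Returns 1.0 when both segmentations are empty.
inline double overlapScore(const std::vector<unsigned char>& result,
                           const std::vector<unsigned char>& golden) {
    assert(result.size() == golden.size());
    std::size_t both = 0;
    std::size_t either = 0;
    for (std::size_t i = 0; i < result.size(); ++i) {
        const bool r = result[i] != 0;
        const bool g = golden[i] != 0;
        if (r && g) ++both;
        if (r || g) ++either;
    }
    if (either == 0) return 1.0;
    return static_cast<double>(both) / static_cast<double>(either);
}

// Fraction of test images whose score reaches the threshold: one simple
// way to express robustness on a data set as a single number.
inline double acceptanceRate(const std::vector<double>& scores, double threshold) {
    assert(!scores.empty());
    std::size_t accepted = 0;
    for (std::size_t i = 0; i < scores.size(); ++i)
        if (scores[i] >= threshold) ++accepted;
    return static_cast<double>(accepted) / static_cast<double>(scores.size());
}
```

In a systematic experiment, overlapScore would be computed once per annotated image and the resulting scores passed to acceptanceRate, or handed to a statistical package to visualize their distribution.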

Documentation Documentation is the key element of a vision library. With the use of abstract, pattern-based code the documentation is also best generated on-line along the same patterns. At the same time, documentation of the function is not enough. Just-in-time learning with ample links to (web-based) textbooks is very helpful to learn when to use a function and when not. Documentation (and experience with the data) serves as the vehicle for the exchange of knowledge, much more than anything else. An inspirational example is the support provided by Matlab, which includes concepts specified by references, methods specified by equations, algorithms, and case studies.

The new product: Horus

The end of SCIL Fifteen years after its conception, SCIL-Image has reached the end of its life cycle due to the growth of the libraries and the diversity of the data structures it supports. The Image-library grows in the number of image types, functions, and versions. In addition, SCIL has grown in the number of supported platforms. As a consequence the library can no longer be maintained properly and the implementation of new developments has halted. Redesign of the entire code based on more abstract principles is mandatory to reduce the complexity of the code and to enhance the speed of development.

Design choices A natural choice for the new code is C++, as it relies on the principles of abstract data types, inheritance and patterns. In this way, the new code can be designed such that the number of functions is reduced drastically while the number of different data types is increased. In Horus several layers of code are distinguished. The bottom layer consists of 12 patterns of addressing the image, see the appendix. The patterns underlie the majority of vision functions. Few functions and implementations known in the breadth of the low-level image processing literature are not covered by one of the patterns; see the appendix for an overview of all functions.

The plan

Goal The main goal is to achieve, within two years after 1 July 2001, a library suited for vision developers.

Layered structure The library has a layered structure. The bottom layer is accessible for architects. It will contain the patterns as indicated in the appendix. The layer will be documented for experienced C++-programmers. The second layer is the layer of instantiations, suited for programmers with experience in C++. It provides a complete coverage of vision functionality for a large variety of data types, examples to cover even more data types, and examples to cover variations of the algorithmic implementations. The layer will be documented for C++-programmers on the basis of a description of the function performance. The third layer is the higher-level layer, composed of combinations of instantiations. Where the second layer is focused on image processing, this layer concentrates on image understanding tools. For the list of functions to cover, see the appendix. The layer will be documented for C++-programmers at the same level of expertise as in layer two. The top layer of Horus is the communication layer. In this layer connections are made with C visualisation tools, Matlab, Java and other industry proprietary front-ends, such as LabView and EasyVision. This layer also provides a coupling via the CORBA middleware to packages that support it. We expect CORBA support for databases, statistical libraries, and visualisation libraries. The top layer also includes a set of sample applications taken from IOP-projects.

For the planning, see the appendix.


Appendix: Horus

Data types in Horus

Current At this moment, Horus supports two- and three-dimensional images. The pixel values of an image are scalar values, or a vector of scalar values. A scalar value is represented by an integer value, one of {byte, short, int} or a floating point value, one of {float, double}. The vector values contain 2 or 3 scalar values. So, we have:

Scalar integer byte 2D, this is for large field images
Scalar integer short 2D, this is the common grey-valued image
Scalar integer int 2D, this is for intermediate results requiring more precision
Scalar real float 2D, this is the image representation for large contrast operations
Scalar real double 2D, this is preferred to float as it has less arithmetic problems
Scalar integer byte 3D
Scalar integer short 3D, this is the common grey-valued image in 3D
Scalar integer int 3D
Scalar real float 3D, this is the representation for many computations in 3D
Scalar real double 3D
Vector 2 integer byte 2D
Vector 2 integer short 2D, this is a common representation for vector fields
Vector 2 integer int 2D
Vector 2 real float 2D, this is common for motion field computations
Vector 2 real double 2D
Vector 2 integer byte 3D
Vector 2 integer short 3D
Vector 2 integer int 3D
Vector 2 real float 3D
Vector 2 real double 3D
Vector 3 integer byte 2D, this is a useful representation for large color fields
Vector 3 integer short 2D, this is a useful representation for some color models
Vector 3 integer int 2D
Vector 3 real float 2D, this is a useful representation for other color models
Vector 3 real double 2D
Vector 3 integer byte 3D, this is interesting for 3D color images
Vector 3 integer short 3D, this is useful for vector computation in 3D
Vector 3 integer int 3D
Vector 3 real float 3D, this is useful for vector computation in 3D
Vector 3 real double 3D

Data types and pattern For some of the data types we have indicated a specific preferred application. The number of data types is too large to maintain individually. We have opted for data type independent computational patterns to implement algorithms, see the next section. This makes the number of data types to support independent of the number of functions to support. Therefore, it is of limited relevance to Horus whether the number of data types is 30 or 40. Specialized data types may be introduced with ease, and all functions are available for them at once as long as those functions have been based on patterns.

Missing Missing data types are one-dimensional images and complex-valued pixels. Their implementation requires very little work and will be conducted when the need arises.


Binary pattern Within Horus binary-valued images are considered a special kind of the byte data type, as the basic addressable unit is the byte. On the bytes, the complete binary arithmetic is provided. For reasons of computational efficiency many libraries provide a special bit-packed implementation for binary-valued images. In Horus, this would require new algorithmic patterns dedicated to just one data type. Therefore, implementation of binary-valued data types as a separate entity is not foreseen in the near future, as long as computational efficiency in Horus is less important than broad coverage.

Patterns in Horus

In Horus the following patterns form the basis of the abstract data typed image processing operations:

1. Unary pixel A function with one argument is applied to all pixels in the image in arbitrary order by the unary parallel pattern, where the computation on one pixel does not depend on the result obtained for other pixels. Examples are computing the absolute value, the norm, an inversion, the sine.

2. Binary pixel A function with two arguments is applied to corresponding pixels in two images independent of the order of computation by the binary parallel pattern. Examples are: the addition, the infimum, the less than operation.

3. Multi pixel Similarly, a function with more than two arguments is applied to corresponding pixels in the input images by the multi-valued pixel parallel pattern. An example is the 3-way comparison.

4. In & Out The pixels of an image are imported from or are exported to an external source by the IO-pattern.

5. Reduce The pixel values in the image are reduced to a single value by the reduce pattern. The pattern is parallel by nature in the sense that all pixels are addressed and the result does not depend on the order of addressing the pixels. Examples are the sum over an image, taking the minimum and other ways of summarizing the information in the image.

6. Generalized convolution The pixel values in the neighbourhood of a pixel in the input image are combined with the pixel values of a kernel image and written to the corresponding central pixel in the output image by the generalized convolution parallel pattern. Examples are found in the convolution operation, the Gaussian filter, an erosion, and a dilation.

7. Neighbourhood The pixel values in the neighbourhood of a pixel in the input image are combined into a single value and written to the corresponding pixel in the output image by the neighbourhood parallel pattern. This is a more general case than the generalized convolution pattern. An example is the Kuwahara filter, which is a combination of linear, quadratic and extreme operations.

8. Recursive neighbourhood The recursively applied neighbourhood pattern is a non-parallel pattern in the sense that the result computed thus far influences the result for forthcoming pixels. Examples of operations which can be implemented with this pattern are the distance transform and recursive implementations of the Gaussian filter.

9. Geometric An arbitrary affine transformation is applied to the position of pixels in the image by the geometric pattern. Examples are a rotation and a reflection.
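To make the idea of a pattern concrete, the unary pixel, binary pixel and reduce patterns from the list above can be sketched as three small generic routines that take the actual operation as an argument. This is a minimal Python illustration on nested lists, not the Horus C++ implementation; the names unary_pixel, binary_pixel and reduce_image are hypothetical.

```python
from functools import reduce
import math


def unary_pixel(image, func):
    """Unary pixel pattern: apply func to every pixel independently."""
    return [[func(p) for p in row] for row in image]


def binary_pixel(image_a, image_b, func):
    """Binary pixel pattern: combine corresponding pixels of two images."""
    same_rows = len(image_a) == len(image_b)
    if not same_rows or any(len(a) != len(b) for a, b in zip(image_a, image_b)):
        raise ValueError("images must have the same size")
    return [
        [func(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


def reduce_image(image, func, initial):
    """Reduce pattern: fold all pixel values into a single value.

    The result is independent of the visiting order only when func is
    associative and commutative, as it is for sum, min and max.
    """
    return reduce(func, (p for row in image for p in row), initial)


im = [[1, -2, 3], [-4, 5, -6]]
other = [[10, 10, 10], [0, 0, 0]]

absolute = unary_pixel(im, abs)                       # unary: absolute value
added = binary_pixel(im, other, lambda a, b: a + b)   # binary: addition
infimum = binary_pixel(im, other, min)                # binary: infimum
total = reduce_image(im, lambda acc, p: acc + p, 0)   # reduce: sum
smallest = reduce_image(im, min, math.inf)            # reduce: minimum
```

Each new operation is one more small function passed to an existing routine, while the pixel traversal is written only once per pattern.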


t-invariance The current patterns are mostly translation invariant, i.e. the operation applied to a pixel or a neighbourhood has no knowledge of the position in the image. However, translation variant versions are useful in some image processing operations and should be considered for implementation.

Instantiation example We give an example of the variety of functions which can be implemented on one of the patterns (in this case the binary parallel pattern) for binary pixel operations. Binary pixel operations include addition, subtraction, multiplication, division, minimum, maximum, infimum, supremum, power, modulo, and, or, xor, left shift, right shift, dot product, cross product, equal, not equal, less than, greater than, less equal, and greater equal. That is, the list of 23 operations is implemented using a single generic C++ function for all image types. In SCIL-Image the implementation takes 23 operations x 12 image types = 276 different functions.
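The count in this paragraph can be illustrated with a small table-driven sketch. The operation names are those of the binary pixel pattern in the function list of this appendix; the mapping of a few of them to Python callables and the routine apply_binary are assumptions made for this illustration only, not Horus code.

```python
import operator

# The 23 binary pixel operations as named in the function list of this appendix.
BINARY_OPERATIONS = [
    "add", "sub", "mul", "div", "min", "max", "inf", "sup", "pow", "mod",
    "and", "or", "xor", "leftShift", "rightShift", "dot", "cross",
    "lessThan", "greaterThan", "lessEqual", "greaterEqual", "equal", "notEqual",
]

# Number of image types in SCIL-Image as stated in the text.
SCIL_IMAGE_TYPES = 12

# A few scalar instantiations; this mapping is hypothetical.
SCALAR_FUNCTORS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "min": min,
    "max": max,
    "xor": operator.xor,
    "lessThan": operator.lt,
}


def apply_binary(name, image_a, image_b):
    """One generic routine serves every operation in the functor table."""
    functor = SCALAR_FUNCTORS[name]
    return [
        [functor(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


# One function per operation and per image type, as in the SCIL-Image approach.
handwritten_functions = len(BINARY_OPERATIONS) * SCIL_IMAGE_TYPES

difference = apply_binary("sub", [[5, 7]], [[2, 9]])
comparison = apply_binary("lessThan", [[1, 5]], [[3, 2]])
```

In the pattern-based approach the table grows by one entry per operation, whereas the hand-written approach grows by one function per operation and per image type.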

Example use Another good example is found in the generalized convolution used to implement the bulk of filter operations from linear theory as well as the bulk of basic operations from mathematical morphology. For example, the following instructions will compute a very simple derivative:

# Construct a 2D kernel with 32-bit integer valued scalar pixels
# initialized with the values -1, 0, and 1
kernel = fromIntData(1, 2, Size(3, 1, 1), (-1, 0, 1));

# and apply the operation to im
im.generalizedConvolution(kernel, "mul", "addAssign");

The following instructions will do an erosion with a flat structuring element of size 3x5 with value 0:

# Construct the structuring element
kernel = createValueImage(im.signature(), Size(3, 5, 1), 0);

# and apply the operation to im
im.generalizedConvolution(kernel, "add", "minAssign");

Both examples use the same generic C++ function that can apply the operation to all image types. The generalized convolution operation is the basis for at least 30-40 functions from SCIL-Image. Again, given the 12 image types in SCIL-Image, this requires implementation of some 400 C-functions. In Horus just the one is required.
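How one routine can serve both a linear filter and a morphological operation may be easier to see in a small sketch. The Python function below mimics the two string arguments of the examples above on a nested-list image. It is an illustration under stated assumptions, not the Horus implementation: the kernel is not flipped, neighbours outside the image are skipped, and the tables COMBINE and ACCUMULATE are hypothetical.

```python
import math
import operator

# Operation names follow the strings used in the examples above.
# The lookup tables themselves are an assumption of this sketch.
COMBINE = {"mul": operator.mul, "add": operator.add}
ACCUMULATE = {
    "addAssign": (operator.add, 0),       # accumulator and its neutral element
    "minAssign": (min, math.inf),
}


def generalized_convolution(image, kernel, combine, accumulate):
    """Combine each neighbourhood with the kernel and accumulate the results.

    Assumptions: 2D nested lists, odd kernel sizes, no kernel flip, and
    neighbours that fall outside the image are skipped.
    """
    comb = COMBINE[combine]
    acc, neutral = ACCUMULATE[accumulate]
    height, width = len(image), len(image[0])
    k_height, k_width = len(kernel), len(kernel[0])
    half_y, half_x = k_height // 2, k_width // 2
    output = []
    for y in range(height):
        row = []
        for x in range(width):
            value = neutral
            for ky in range(k_height):
                for kx in range(k_width):
                    ny, nx = y + ky - half_y, x + kx - half_x
                    if 0 <= ny < height and 0 <= nx < width:
                        value = acc(value, comb(image[ny][nx], kernel[ky][kx]))
            row.append(value)
        output.append(row)
    return output


im = [[0, 1, 4, 9, 16]]

# Simple derivative: multiply with the kernel values, accumulate by addition.
derivative = generalized_convolution(im, [[-1, 0, 1]], "mul", "addAssign")

# Flat erosion: add the (zero) structuring element, accumulate by minimum.
eroded = generalized_convolution(im, [[0, 0, 0]], "add", "minAssign")
```

For the interior pixels the first call yields the central differences 4, 8 and 12; the two border values are affected by the skipped neighbours, so a real implementation would need an explicit border policy.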

Under construction in Horus are the following patterns:

10. Sample The sample operation is the pattern underlying the computation of arbitrary functions from geometrical entities as specified to the pattern. To illustrate this, let B indicate the boundary of an object. Then, the pattern will compute an arbitrary function from the pixel values of B. This is the principle underlying the snake-based methods. If alternatively B is a 2D-blob, then the pattern represents the underlying pixel addressing of partial differential equations.

11. Q-based In the Q-based pattern the order of processing is data driven. That is, interesting pixels are put on a priority queue and processed in that order. Q-based processing is common in many operations from mathematical morphology, for example, the watershed segmentation.


12. Sequences Currently a special pattern is under development to cover all time sequences of an arbitrary Horus image data type. The pattern is implemented in the ARMA framework, commonly used in electrical engineering to describe states of time-based behaviour.
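The data-driven processing order of the Q-based pattern can be sketched with a priority queue. The Python example below floods labels outward from seed pixels, always continuing from the queued pixel with the lowest grey value. It is a simplified marker-driven flooding in the spirit of watershed segmentation, not a complete watershed algorithm and not Horus code; the function name flood_from_markers and the 4-connected neighbourhood are assumptions of this sketch.

```python
import heapq


def flood_from_markers(image, markers):
    """Label pixels by flooding from markers in order of grey value.

    markers holds 0 for unlabelled pixels and a positive label for seeds.
    """
    height, width = len(image), len(image[0])
    labels = [row[:] for row in markers]
    queue = []
    order = 0  # tie-breaker: equal grey values are processed first-in, first-out
    for y in range(height):
        for x in range(width):
            if labels[y][x] > 0:
                heapq.heappush(queue, (image[y][x], order, y, x))
                order += 1
    while queue:
        _, _, y, x = heapq.heappop(queue)
        # 4-connected neighbours of the pixel taken from the queue
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < height and 0 <= nx < width and labels[ny][nx] == 0:
                labels[ny][nx] = labels[y][x]
                heapq.heappush(queue, (image[ny][nx], order, ny, nx))
                order += 1
    return labels


grey = [[1, 2, 3, 4, 9, 1]]
seeds = [[1, 0, 0, 0, 0, 2]]
regions = flood_from_markers(grey, seeds)
```

The pixels are visited in an order determined by the image content rather than by their position, which is what distinguishes this pattern from the parallel and recursive neighbourhood patterns.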

No patterns Not all processing functions at the low level can be expressed in the current patterns, as some operations require a very specific addressing sequence of the pixels. Currently Horus will not deliver pattern support for some operations. They can be implemented by C-like coding.

FFT The most prominent counterexample is the pattern underlying the FFT algorithm. As the pattern in this case is very specific to the FFT algorithm, there is no need to develop a pattern, nor is there a need to cover the FFT for all data types, as the complex type is mandatory. Therefore, FFT would be incorporated from an existing library of compute functions such as Numerical Recipes.

Pyramids Also, another pattern missing is the pyramid. The abstract pyramid pattern can be quickly composed from the single image patterns available within Horus. The design philosophy of Horus would require this set be combined into one data type with the complete set of supporting functors including in this case multi-scale traces. As is common to all existing packages, this is available in Horus up to the point where a pyramid is composed of a set of images.

Graphs The third and last missing pattern is the graph representation of images. Graphs are different from data arrays in that they are flexible in their geometrical connections. There is a growing interest in graph-based computations but software in that area requires a complete revision of all code. For the moment, we prefer to implement the indicator representation of images, but we plan to monitor the acceptance of graph-based representations of images at the pixel level. A graph-based representation at the object level is different and can be supported by Horus higher level tools.

Horus vision functions

The following list of functions has been implemented.

0. Patternless Investigate parameters of the data structures
isNull, ident, dimensionality, dimensionSize, sizes, numberOfPixels, pixelDimensionality, pixelType, pixelPrecision, signature

1. Unary pixel pattern Assign, recalculate, transform pixels and pixel values
asByte, asShort, asInt, asFloat, asDouble, asVec2Byte, asVec2Short, asVec2Int, asVec2Float, asVec2Double, asVec3Byte, asVec3Short, asVec3Int, asVec3Float, asVec3Double, negate, complement, abs, ceil, floor, round, unarySum, unaryProduct, unaryMin, unaryMax, norm1, norm2, normInf, sqrt, exp, log, log10, projectRange, inverseProjectRange, sin, cos, tan, asin, acos, atan, atan2, sinh, cosh, tanh, affinepixel, contrastStretch, abstractaccess
Most of these functions are really trivial to implement; spelling them out one by one may easily overemphasize the image-by-image computational functionality of Horus.
abstractaccess: this function can be instantiated further.

2. Binary pixel pattern Operations on two images or an image and a constant
add, sub, mul, div, min, max, inf, sup, pow, mod, and, or, xor, leftShift, rightShift, dot, cross, lessThan, greaterThan, lessEqual, greaterEqual, equal, notEqual, threshold, abstractaccess
These functions are important to image processing and frequently used.
abstractaccess: this function can be instantiated further.

4. In and Out pattern The pattern fulfills all incoming and outgoing transport.
readFile, writeFile, FFT, Hartley
FFT and Hartley are listed here as they fall outside the standard patterns of Horus, see the paragraph on patterns. Functions like these are computed with the aid of wrapped C-code or external libraries.

5. Reduce pattern Summarizing an image in a variable by this operator
sum, product, min, max, inf, sup, abstractaccess
sum: this function also delivers image statistics: mean, standard deviation et cetera.
abstractaccess: this function can be instantiated further.

6. Generalized convolution
convolution, convolution1d, convolutionGaussian, convolutionGaussian1d, convolutionSeparated, abstractaccess, uniform, simpleDerivative, erosion, Canny edge detector (to be extended)
convolutionGaussian: a subfunction of these functions is the important class of Gaussian derivatives up to fourth order.
abstractaccess: this function can be instantiated further.

7. Neighbourhood pattern
percentile, normCorrelation, nonMaxGradDir, recursiveGaussian, abstractaccess (to be extended)
abstractaccess: this function can be instantiated further.

9. Geometric pattern
rotate, scale, reflect, restrict, extend, projectDomain, inverseProjectDomain, abstractaccess
abstractaccess: this function can be instantiated further.

10. Sample pattern
sampleIdentMask, sampleWeightMask, setAt, getAt, abstractaccess (to be extended)
abstractaccess: this function can be instantiated further.

Expansions The following list of functions is missing in the current version of Horus. Given the many patterns and the amount of coding effort they take away, it is relatively easy to implement most of them as a consolidation of code derived from existing IOP-projects. Without the software consolidation, it is fair to conclude that the code developed in the IOP-programme would largely be lost.

Most urgently missing are operations on single objects, including propagation and labelling. Also needed is the expansion of the mathematical morphology library with arbitrary erosions, parabolic erosions and grey value morphology, skeletons and watersheds, and the constrained distance transform. These operations are spread out over several patterns. Solutions for snakes and other deformable models exist, but not yet in the Horus sample pattern where they belong. Generalized convolution extensions are planned for the Gabor filter, parabolic erosions and dilations, and small convolutions of size 3 and 5. For the neighbourhood pattern, the Kuwahara filter is an instantiation to be made. Other operations, such as the uniform filter and the eigen filters of color images, are quick to implement as instantiations of this pattern.

CORBA middleware and Horus

IDL The core functionality is defined in IDL (CORBA's Interface Definition Language). It will provide access via:

- Java GUI: menu and dialogue based execution + visualization of results.
- CorbaScript: interactive access to functionality + definition of high level functions.
- C++/C/Java: use standard language bindings of IDL.

C++ support When (really) needed IDL statements could be translated automatically into C++/C/Java. In this way, interactively defined programs developed via the menu and dialog interface or CorbaScript could be transformed into C++/C/Java stand-alone applications automatically.

Externals CORBA also opens the way to integration with other software libraries, for example:
- A (public domain) library for FFT (not supported in Horus)
- Delft library for pattern recognition (their expertise, not ours)
- Geometry library from the University of Utrecht
- Brazilian library for mathematical morphology (when not done in Horus)
- Database for storage of large collections of images and features, but also characterization of Horus operations.

User interfacing in Horus

Specification The essential components of the user interface are:
- menu and dialogue to have easy access to the library,
- on-line instruction to deliver support for the tuning of vision processing functions,
- visualization of the algorithmic results,
- interactive command-line access to the library,
- display of image and other data structures.

CORBA The basic component of interfacing the user is via CORBA. Hence interface tools can be derived from outside sources as long as they provide a connection via CORBA.

JAVA and Matlab In addition, we provide specialized bindings to JAVA and Matlab as our brand of support for the user front-end. JAVA itself is developing in the direction of specialized viewing components suited to support the Horus data structures. This offers a choice. One option is to adopt an external interface library, if and only if it is suited for image processing development, and to port Horus to that system. The alternative is to design a handsome set of JAVA tools that enables the user to develop interfaces in JAVA quickly. As Matlab intends to offer direct access to JAVA, we anticipate a situation where we no longer have to support a specialized connection for Matlab. In that case, the aim is to have the full Matlab interactive capabilities available, running the full breadth of Horus and its programmed extensions as a slave.

Partnerships Along the same line of implementation, for proprietary partnerships with industry we may provide user interfacing based on such domain specific interface systems as LabView and EasySCIL. No details have been worked out yet, but it is the aim to make the Horus system openly accessible for scientific industrial laboratories.

Documentation in Horus

Image There exists a long tradition of documentation in SCIL-Image which can be transplanted to Horus at the function level.

Just in time In addition, it is worthwhile to study new standards for documentation, in which the documentation is generated on-line by merging the documentation of patterns and of instantiations only when needed.

Web-based Finally, at the level of instruction, web-based instruction integrated with the documentation is currently under investigation. The reader is referred to http://www.dai.ed.ac.uk/HIPR2/featops.htm for an example.


Appendix: Horus project schedule

Phase 1: specification

A-a Specification of Horus image processing functionality

The specification results in a list of ready to use functions at the C++ level. The functions are those common in modern image processing literature; an example of such a list is given in the appendix. The scope of the functionality will be checked against the functionality required by IOP projects and application centre partnerships to ensure its proper composition.

It is anticipated that the application centres on multimedia, on industrial vision in its exploration phase, and on agriculture have similar needs for a software library with the ability to select, expand and couple the software to other systems. For the application centre on medical imaging the software needs may be different due to the dominance of a priori knowledge in the solution, the 3D-data sets, the need for visualisation and interaction and the close relations with proprietary systems. It is still open for discussion how a medical imaging software library and an industrial library will be connected.

A-b Inventory of external libraries to be connected to Horus

For the image processing functionality as well as for the graphical user interface there is the choice between direct implementation in Horus or indirect interfacing with existing software libraries. Candidates to consider:

the pattern recognition library from Delft, other image processing libraries, the numerical recipes library, the computational geometry library from Utrecht, a rendering library, a GUI-tool, and databases.

Our experience with coupling points to formal middleware interfacing, particularly CORBA. In this manner the coupling is well-defined and hence easier to maintain, and the system architecture is better suited for extension with libraries in other programming languages. Candidate libraries will be selected on the basis of urgency, requested level of integration, suitability for external coupling, estimated maintainability when coupled, and performance issues due to data-type conversions in the coupling.

deliverable A
- The list of Horus functions and some functional specification
- The list of supported library couplings and the level of integration

involved A 5/12 fte total
- One contact person per application centre: PM fte
- Consultant from scientific labs: PM fte
- Horus project leader (responsible): 5/12 fte

elapsed time A Months 1 - 9 Year 1

B-a Specification of documentation and instruction levels.


Guidelines will be formulated to indicate the desired format for documentation of Horus image processing functions to ensure uniformity on the basis of the past experience with SCIL-Image as well as on the basis of the current online documentations.

For a good source of inspiration, see http://www.dai.ed.ac.uk/HIPR2/featops.htm. It should be noted that such web-based instructional documentation is out of reach in this project, nor is it actually needed, as the level of expertise in our community is higher and communication lines are shorter.

The documentation should include a description for the programmer of the function and its parameters, several small sample applications to demonstrate the use of the function, and a comparison with similar functions to indicate its performance characteristics both in terms of execution speed and image processing capabilities.

B-b Architecture for documentation and instruction.

As the number of functions is likely to grow big, a software architecture is needed to maintain the documentation. Currently, documentation at the function level is automatically extracted from the source code and put into the reference manual. The reference manual is available online as well as on paper.

The architecture and scope for the (web-based) instruction system and the on-line help system are yet to be defined in their definitive form. There is a compromise between abstract solutions based on a text-fragment database and full-text; there is a compromise between an integrated instruction and a reference to standard textbook entries.

deliverable B
- Guidelines and example document
- Specification of the documentation architecture

involved B 8/12 fte total
- One contact person per application centre: PM fte
- Consultant for documentation management: 3/12 fte 1 person
- Horus project leader (responsible): 5/12 fte

elapsed time B Months 3 - 12 Year 1

C Specification of the graphical user interface.

Most image processing applications require a combination of images and graphics in their graphical user interface, which is left unsupported by existing graphical user interface tools.

Past experience indicates that the graphical user interface software is labour intensive as well as vulnerable to externally induced changes in the environment. At the same time it cannot be missed in the acceptance of the library as a whole.

An inventory will be drawn up indicating which components are required in a standard image processing development environment. The list will be checked against the graphical user interface requirements of current IOP projects to ensure validity.

deliverable C
- List of components to be implemented
- Specification of programmed examples

involved C 2/12 fte total
- One contact person per application centre: PM fte
- Horus project leader (responsible): 2/12 fte

elapsed time C Months 6 - 9 Year 1

For all activity in phase 1, time available in phase 2 is the key limiting factor in the choice of options.


Phase 2: Implementation

D Completion of Horus at the architectural level.

Whereas the great majority of functions to be expected in Horus is covered by one of the existing patterns, the image processing functionality indicates a few missing patterns in the internal Horus architecture needed to cover some of the remaining standard processing functions. Missing patterns include the sample pattern and the Q-pattern, and others may appear. The missing patterns need to be implemented by the Horus development team. The Horus team will implement missing graphical user interface components as well as the necessary infrastructure for connections to other libraries.

deliverable D
- Horus programs: patterns
- Couplings with other libraries
- GUI-component library

involved D 3 fte total
- Consultant for pattern design from scientific labs: 2/12 fte
- Editor for documentation: 5/12 fte 1 person
- Scientific system designer: patterns 6/12 fte 1 person
- Scientific system designer: 3 libraries 6/12 fte 1 person
- Scientific system designer: GUI 12/12 fte 1 person
- Horus project leader (responsible): 5/12 fte

elapsed time D
- Patterns: Month 3 Year 1 - Month 12 Year 1
- Libraries: Month 1 Year 2 - Month 12 Year 2
- GUI: Month 1 Year 2 - Month 6 Year 3

E Completion of image processing functionality

A team of programmers will implement the list of functions. The implementation and testing of each function already contributes half of the sample application and instruction part of the documentation. Each programmer will fully document a function.

deliverable E Horus programs: functions plus function reference documentation

involved E 5.5 fte total
- Consultant for functions from science labs: 6/12 fte
- Editor for documentation: 7/12 fte 1 person
- Programmers: 48/12 fte 2 persons
- Scientific system designer: 3/12 fte 1 person
- Horus project leader (responsible): 2/12 fte

elapsed time E
- Functions: Month 1 Year 2 - Month 12 Year 3
- Documentation: Month 1 Year 2 - Month 12 Year 3

F Implementation of documentation software

A team will set up the documentation architecture. It is responsible for implementing the software that maintains the documentation produced by the programmers and reproduces the information in the formats specified above. The technical writer will produce a manual to get users started with Horus.

deliverable F Horus documentation: user manual and instruction

involved F 2.5 fte total
- Consultant for the instruction from science labs: 6/12 fte
- Editor for documentation: 12/12 fte 1 person
- Programmer: 10/12 fte 1 person
- Horus project leader (responsible): 2/12 fte

elapsed time F
- Documentation architecture: Month 1 Year 2 - Month 12 Year 3
- Documentation: Month 1 Year 2 - Month 12 Year 3

Phase 3: Install

The overlap between phase 2 and phase 3 is used to spread the Horus version of July 2003 among potential users to evaluate its usability and to incorporate feedback. The distribution includes: source codes, reference manuals and documentation.

Beyond 2003 the training program and maintenance will be conducted by the application centres. The strategy would be to have a non-transportable but open-code available to customers with maintenance support from the Horus team on that basis.

Overview

Total fte: 12.2 fte, contribution from UvA equivalent to 5.2 fte.
Requested from IOP: 6.0 fte of Kfl 120 per fte = Kfl 720, deficit 1.0 fte.
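The overview figures can be cross-checked against the totals stated per activity in the schedule. The Python sketch below merely restates the "involved" totals of activities A to F and the amounts given above; the variable names are chosen for this illustration. Summed exactly, the activities come to 12.25 fte, which the overview presumably reports rounded as 12.2 fte.

```python
from fractions import Fraction

# Totals per activity as stated in the "involved" entries of the schedule (fte).
ACTIVITY_FTE = {
    "A": Fraction(5, 12),
    "B": Fraction(8, 12),
    "C": Fraction(2, 12),
    "D": Fraction(3),
    "E": Fraction(11, 2),
    "F": Fraction(5, 2),
}

# Exact sum over all activities: 49/4 = 12.25 fte.
total_fte = sum(ACTIVITY_FTE.values())

# Requested budget: 6.0 fte at Kfl 120 per fte.
requested_kfl = 6 * 120

# Deficit as stated in the overview: total minus UvA share minus IOP request.
deficit_fte = Fraction("12.2") - Fraction("5.2") - Fraction("6.0")

print(float(total_fte), requested_kfl, float(deficit_fte))
```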

A. Smeulders
D. Koelma

Amsterdam, January, 2001
