Search Based Software Engineering: Techniques, Taxonomy, Tutorial

Author Topic: Search Based Software Engineering: Techniques, Taxonomy, Tutorial  (Read 1611 times)

Offline khalid

  • Jr. Member
  • **
  • Posts: 84
  • Test
    • View Profile
Abstract. The aim of Search Based Software Engineering (SBSE) research is to move software engineering problems from human-based search to machine-based search, using a variety
of techniques from the metaheuristic search, operations research and evolutionary computation paradigms. The idea is to exploit humans’ creativity and machines’ tenacity and reliability, rather than requiring humans to perform the more tedious, error prone and thereby costly
aspects of the engineering process. SBSE can also provide insights and decision support. This
tutorial will present the reader with a step-by-step guide to the application of SBSE techniques to Software Engineering. It assumes neither previous knowledge nor experience with
Search Based Optimisation. The intention is that the tutorial will cover sufficient material to
allow the reader to become productive in successfully applying search based optimisation to
a chosen Software Engineering problem of interest.
1 Introduction
Search Based Software Engineering (SBSE) is the name given to a body of work in which Search
Based Optimisation is applied to Software Engineering. This approach to Software Engineering
has proved to be very successful and generic. It has been a subfield of software engineering for
ten years [45], the past five of which have been characterised by an explosion of interest and
activity [48]. New application areas within Software Engineering continue to emerge and a body of
empirical evidence has now accrued that demonstrates that the search based approach is definitely
here to stay.
SBSE seeks to reformulate Software Engineering problems as ‘search problems’ [45, 48]. This
is not to be confused with textual or hypertextual searching. Rather, for Search Based Software
Engineering, a search problem is one in which optimal or near optimal solutions are sought in a
search space of candidate solutions, guided by a fitness function that distinguishes between better
and worse solutions. The term SBSE was coined by Harman and Jones [45] in 2001, which was the
first paper to advocate Search Based Optimisation as a general approach to Software Engineering,
though there were other authors who had previously applied search based optimisation to aspects
of Software Engineering.
SBSE has been applied to many fields within the general area of Software Engineering, some of
which are already sufficiently mature to warrant their own surveys. For example, there are surveys
and overviews, covering SBSE for requirements [111], design [78] and testing [3, 4, 65], as well as
general surveys of the whole field of SBSE [21, 36, 48].
This paper does not seek to duplicate these surveys, though some material is repeated from
them (with permission), where it is relevant and appropriate. Rather, this paper aims to provide
those unfamiliar with SBSE with a tutorial and practical guide. The aim is that, having read this
paper, the reader will be able to begin to develop SBSE solutions to a chosen software engineering
problem and will be able to collect and analyse the results of the application of SBSE algorithms.
By the end of the paper, the reader (who is not assumed to have any prior knowledge of SBSE)
should be in a position to prepare their own paper on SBSE. The tutorial concludes with a simple
step-by-step guide to developing the necessary formulation, implementation, experimentation and
results required for the first SBSE paper. The paper is primarily aimed at those who have yet to
tackle this first step in publishing results on SBSE. For those who have already published on SBSE,
many sections can easily be skipped, though it is hoped that the sections on advanced topics, case
studies and the SBSE taxonomy (Sections 7, 8 and 9) will prove useful, even for seasoned Search
Based Software Engineers.
The paper contains extensive pointers to the literature and aims to be sufficiently comprehensive,
complete and self-contained that the reader should be able to move from a position of no prior
knowledge of SBSE to one in which he or she is able to start to get practical results with SBSE
and to consider preparing a paper for publication on these results.
The field of SBSE continues to grow rapidly. Many exciting new results and challenges regularly
appear. It is hoped that this tutorial will allow many more Software Engineering researchers to
explore and experiment with SBSE. We hope to see this work submitted to (and to appear in) the
growing number of conferences, workshops and special issue on SBSE as well as the general software
engineering literature.
The rest of the paper is organised as follows. Section 2 briefly motivates the paper by setting
out some of the characteristics of SBSE that have made it well-suited to a great many Software
Engineering problems, making it very widely studied. Sections 3 and 4 describe the most commonly
used algorithms in SBSE and the two key ingredients of representation and fitness function. Section 5
presents a simple worked example of the application of SBSE principles in Software Engineering,
using Regression Testing as an exemplar. Section 6 presents an overview of techniques commonly
used to understand, analyse and interpret results from SBSE. Section 7 describes some of the more
advanced techniques that can be used in SBSE to go beyond the simple world of single objectives
for which we seek only to find an optimal result. Section 8 presents four case studies of previous
work in SBSE, giving examples of the kinds of results obtained. These cover a variety of topics and
involve very different software engineering activities, illustrating how generic and widely applicable
SBSE is to a wide range of software engineering problem domains. Section 9 presents a taxonomy of
problems so far investigated in SBSE research, mapping these onto the optimisation problems that
have been formulated to address these problems. Section 10 describes the next steps a researcher
should consider in order to conduct (and submit for publication) their first work on SBSE. Finally,
Section 11 presents potential limitations of SBSE techniques and ways to overcome them.
2 Why SBSE?
As pointed out by Harman, Mansouri and Zhang [48] Software Engineering questions are often
phrased in a language that simply cries out for an optimisation-based solution. For example, a
Software Engineer may well find themselves asking questions like these [48]:
1. What is the smallest set of test cases that cover all branches in this program?
2. What is the best way to structure the architecture of this system?
3. What is the set of requirements that balances software development cost and customer satisfaction?
4. What is the best allocation of resources to this software development project?
5. What is the best sequence of refactoring steps to apply to this system?
All of these questions and many more like them, can (and have been) addressed by work on
SBSE [48]. In this section we briefly review some of the motivations for SBSE to give a feeling for
why it is that this approach to Software Engineering has generated so much interest and activity.
1. Generality
As the many SBSE surveys reveal, SBSE is very widely applicable. As explained in Section 3,
we can make progress with an instance of SBSE with only two definitions: a representation of
the problem and a fitness function that captures the objective or objectives to be optimised. Of
course, there are few Software Engineering problems for which there will be no representation,
and the readily available representations are often ready to use ‘out of the box’ for SBSE.
Think of a Software Engineering problem. If you have no way to represent it then you cannot
get started with any approach, so problem representation is a common starting point for any
solution approach, not merely for SBSE. It is also likely that there is a suitable fitness function
with which one could start experimentation since many software engineering metrics are readily
exploitable as fitness functions [42].
2. Robustness.
SBSE’s optimisation algorithms are robust. Often the solutions required need only to lie within
some specified tolerance. Those starting out with SBSE can easily become immersed in ‘parameter tuning’ to get the most performance from their SBSE approach. However, one observation
that almost all those who experiment will find, is that the results obtained are often robust
to the choice of these parameters. That is, while it is true that a great deal of progress and
improvement can be made through tuning, one may well find that all reasonable parameter
choices comfortably outperform a purely random search. Therefore, if one is the first to use
a search based approach, almost any reasonable (non extreme) choice of parameters may well
support progress from the current ‘state of the art’.
3. Scalability Through Parallelism.
Search based optimisation techniques are often referred to as being ‘embarrassingly parallel’
because of their potential for scalability through parallel execution of fitness computations.
Several SBSE authors have demonstrated that this parallelism can be exploited in SBSE work
to obtain scalability through distributed computation [12, 62, 69]. Recent work has also shown
how General Purpose Graphical Processing devices (GPGPUs) can be used to achieve scale up
factors of up to 20 compared to single CPU-based computation [110].
4. Re-unification.
SBSE can also create linkages and relationships between areas in Software Engineering that
would otherwise appear to be completely unrelated. For instance, the problems of Requirements
Engineering and Regression Testing would appear to be entirely unrelated topics; they have their
own conferences and journals and researchers in one field seldom exchange ideas with those from
the other.
However, using SBSE, a clear relationship can be seen between these two problem domains [48].
That is, as optimisation problems they are remarkably similar as Figure 1 illustrates: Both
involve selection and prioritisation problems that share a similar structure as search problems.



for more:http://www0.cs.ucl.ac.uk/staff/mharman/laser.pdf