the automation of refactoring. The previous work on distribution, maintenance, and
enhancement is discussed in more detail in the following two subsections, which
separately consider work on modularization and refactoring.
Other work on SBSE application in distribution, maintenance, and enhancement
that does not fall into these two categories has considered the evolution of programming
languages [Van Belle and Ackley 2002], real-time task allocation (see footnote 2) [Bate and Emberson
2006; Emberson and Bate 2007], quality prediction based on the classification of metrics
by a GA [Vivanco and Pizzi 2004], and legacy systems migration [Sahraoui et al. 2002].
SBSE has also been applied to the concept assignment problem. Gold et al. [2006]
applied GAs and HC to find overlapping concept assignments. Traditional techniques
(which do not use SBSE) cannot handle overlapping concept boundaries, because the
space of possible assignments grows too rapidly. The formulation of this problem as an
SBSE problem allows this large space to be tamed.
7.1. Modularization
Mancoridis et al. [1998] were the first to address the problem of software modularization
using SBSE. Their initial work on HC for clustering modules
to maximize cohesion and minimize coupling was developed over the period from 1998
to 2008 [Doval et al. 1999; Mancoridis et al. 1999; Mitchell and Mancoridis 2002, 2003,
2008; Mitchell et al. 2002, 2004]. The pioneering work of Mancoridis et al. led to the
development of a tool called Bunch [Mancoridis et al. 1999] that implements software
module clustering.
The problem of module clustering is similar to the problem of finding near cliques in
a graph, the nodes of which denote modules and the edges of which denote dependence
between modules. Mancoridis et al. [1999] called this graph a module dependency
graph. The Bunch tool produces a hierarchical clustering of the graph, allowing the
user to select the granularity of cluster size that best suits their application.
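The clustering formulation described above can be illustrated with a small sketch. The graph, the module names, and the scoring function below are all assumptions made for illustration: the score rewards intra-cluster edges and penalizes inter-cluster edges, in the spirit of a cohesion and coupling trade-off, but it is not claimed to be the fitness function that Bunch uses. The hill climber produces a single flat partition rather than a hierarchy.

```python
import random

# Hypothetical module dependency graph: an edge (a, b) means "a depends on b".
EDGES = [
    ("parser", "lexer"), ("parser", "ast"), ("lexer", "ast"),
    ("codegen", "emitter"), ("codegen", "regalloc"), ("emitter", "regalloc"),
    ("parser", "codegen"),
]
MODULES = sorted({m for edge in EDGES for m in edge})


def score(assignment):
    """Sum over clusters of intra / (intra + inter / 2).

    Illustrative cohesion/coupling trade-off (an assumption, not Bunch's MQ).
    Clusters without any internal edge contribute nothing.
    """
    intra, inter = {}, {}
    for a, b in EDGES:
        ca, cb = assignment[a], assignment[b]
        if ca == cb:
            intra[ca] = intra.get(ca, 0) + 1
        else:
            # An edge crossing a boundary counts against both clusters.
            inter[ca] = inter.get(ca, 0) + 1
            inter[cb] = inter.get(cb, 0) + 1
    return sum(n / (n + inter.get(c, 0) / 2) for c, n in intra.items())


def hill_climb(seed=0):
    """Move one module at a time to a better cluster until no move improves."""
    rng = random.Random(seed)
    k = len(MODULES)  # allow up to one cluster per module
    assignment = {m: rng.randrange(k) for m in MODULES}
    best = score(assignment)
    improved = True
    while improved:
        improved = False
        for m in MODULES:
            keep = assignment[m]  # best cluster found so far for m
            for c in range(k):
                assignment[m] = c
                s = score(assignment)
                if s > best:
                    best, keep, improved = s, c, True
            assignment[m] = keep
    return assignment, best


if __name__ == "__main__":
    clusters, value = hill_climb(seed=1)
    print(clusters, value)
```

The result is a local optimum only, and different seeds may yield different partitions, which is why restarts are commonly paired with this kind of search.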
Following Mancoridis et al., other authors also developed the idea of module clustering
as a problem within the domain of SBSE. Harman et al. [2002] studied the effect of
assigning a particular modularization granularity as part of the fitness function, while
Mahdavi et al. [Mahdavi et al. 2003b; Mahdavi 2005] showed that combining the
results from multiple hill climbs can improve on the results for simple HC and GAs.
Harman et al. [2005] also explored the robustness of the Modularization
Quality (MQ) fitness function in comparison with an alternative measure of cohesion
and coupling, EValuation Metric (EVM), used in work on clustering gene expression
data.
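One plausible way to combine the results of several hill climbs is sketched below. It is an assumption about the general idea, not a reproduction of the cited method: modules that end up in the same cluster in every run are grouped into a block, and such blocks could then seed a further search. The function name `building_blocks` and the sample partitions are hypothetical.

```python
def building_blocks(partitions):
    """Group modules that share a cluster in every given partition.

    Each partition maps module name -> cluster label. Labels need not be
    consistent between partitions; only co-membership within a run matters.
    """
    groups = {}
    for module in sorted(partitions[0]):
        # Two modules are co-located in every run exactly when their
        # per-run label tuples are identical.
        signature = tuple(p[module] for p in partitions)
        groups.setdefault(signature, []).append(module)
    return sorted(groups.values())


# Hypothetical results of three independent hill climbs over five modules.
runs = [
    {"a": 0, "b": 0, "c": 1, "d": 1, "e": 1},
    {"a": 2, "b": 2, "c": 2, "d": 0, "e": 0},
    {"a": 1, "b": 1, "c": 0, "d": 0, "e": 0},
]

if __name__ == "__main__":
    print(building_blocks(runs))  # [['a', 'b'], ['c'], ['d', 'e']]
```

Here module `c` sits alone because the runs disagree about its neighbours, while `a`, `b` and `d`, `e` are stable across all three runs.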
Other authors have also considered search-based clustering problems. Bodhuin et al.
[2007b] applied GAs to group together class clusters in order to reduce packaging size
and the average downloading times. Huynh and Cai [2007] applied GAs to cluster
Design Structure Matrices and check the consistency between design and source code
structures.
Despite several attempts to improve on the basic HC approach [Harman et al. 2002;
Mahdavi et al. 2003b; Mitchell and Mancoridis 2002], this simple search technique
has been found to be very effective for this problem. However, Praditwong et al. [2010]
recently demonstrated that multiobjective optimization can significantly outperform
HC in terms of modularization quality. Mitchell and Mancoridis recently published a
survey of the Bunch project and related work [Mitchell and Mancoridis 2006].
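The multiobjective view mentioned above differs from single-objective HC in that candidate clusterings are compared by Pareto dominance rather than by one scalar fitness. The sketch below shows only that comparison. The two objectives (a cohesion value to maximize and a negated coupling count, also maximized) and the candidate values are assumptions chosen for illustration, not the objectives of the cited work.

```python
def dominates(p, q):
    """True if p is no worse than q on every objective and better on at least one.

    All objectives are treated as quantities to maximize.
    """
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates as (cohesion, -coupling) pairs.
candidates = [(0.9, -5), (0.7, -2), (0.6, -4), (0.4, -1), (0.9, -6)]

if __name__ == "__main__":
    print(pareto_front(candidates))  # [(0.9, -5), (0.7, -2), (0.4, -1)]
```

A multiobjective search returns the whole front and leaves the choice among the trade-offs to the engineer, whereas a scalar fitness commits to one weighting in advance.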
Footnote 2: This work could equally well be categorized as “real-time SBSE”, a topic area that is sure to develop in the
future, given the highly constrained nature of the real-time environment and the many competing objectives
that have to be optimized.
