Testing Object-Oriented Systems: A Status Report
Robert V. Binder
(Note: This article first appeared in the April 1994 American Programmer and was reprinted in April 1995 CrossTalk with minor revisions. A paragraph on FREE has been added here. While practice and research have made progress since this article was first published, the general conclusions are still correct. RVB.)
I. Introduction
We will need to test software until we can be certain of several things: that no developer will ever mis-code a statement, that the implications of every requirement are fully understood, that the behavior of a system can be extrapolated from its parts with certainty, and that languages, databases, user interfaces, and operating systems are absolutely trustworthy.
Object-oriented technology does not in any way obviate the basic motivation for software testing. In fact, it poses some new challenges. Although there are similarities to testing conventional systems, object-oriented testing has significant differences. Some unique problems include:
II. The State of the Art
It is probably impossible to give an accurate summary of a technology as large, fluid, and widespread as object-oriented development. However, the published body of knowledge about object-oriented testing is small for the moment (about 30 sources). The following summary is based on a detailed analysis of these sources. There are several common issues.
Basic Unit for Testing
There is nearly universal agreement that the class is the natural unit for test case design. Methods are meaningless apart from their class. Other units for testing are aggregations of classes: class clusters and small subsystems. The intended use of a class implies different test requirements; e.g., application-specific versus general-purpose classes, abstract classes, and parameterized (template) classes. Testing a class instance (an object) can verify a class in isolation. However, when individually verified classes are used to create objects in an application system, the entire system must be tested as a whole before it can be considered to be verified.
Implications of Inheritance
Contrary to some hopes, retesting of inherited methods will be the rule rather than the exception. There is general agreement that inherited features require retesting in most circumstances. Retesting is required because a new context of usage results when features are inherited [Perry 90], [Harrold 92]. Multiple inheritance increases the number of contexts to test. Class flattening facilitates automation and understandability -- that is, to devise a class test plan, we need to consider all the features it inherits.
Inheritance can be used to implement specialization relationships or as a programming convenience. Implementation specialization should correspond to problem domain specialization. Reusability of superclass test cases is predicated on this kind of correspondence. It is not likely that convenience subclasses will reflect a true specialization relationship. As such, it is unlikely they can be excused from testing by testing their superclass, even when a lexical excuse can be found.
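The effect of a new context of usage can be illustrated with a minimal Python sketch. The classes `Account` and `OverdraftAccount` and the helper `check_withdraw_contract` are hypothetical names invented for this illustration; they do not come from any of the cited sources. The method `withdraw` is inherited unchanged, but it calls `can_withdraw`, which the subclass overrides, so a superclass test case that passes for `Account` gives a different result when rerun against the subclass.

```python
class Account:
    """Hypothetical superclass."""

    def __init__(self, balance):
        self.balance = balance

    def can_withdraw(self, amount):
        return 0 < amount <= self.balance

    def withdraw(self, amount):
        # Inherited unchanged by subclasses, but depends on can_withdraw().
        if not self.can_withdraw(amount):
            raise ValueError("withdrawal refused")
        self.balance -= amount
        return self.balance


class OverdraftAccount(Account):
    """Hypothetical subclass: overrides only can_withdraw()."""

    def __init__(self, balance, limit):
        super().__init__(balance)
        self.limit = limit

    def can_withdraw(self, amount):
        return 0 < amount <= self.balance + self.limit


def check_withdraw_contract(account):
    """Superclass test case: withdraw() must never leave a negative balance.

    Returns True if the property holds for this instance.
    """
    try:
        account.withdraw(account.balance + 1)
    except ValueError:
        pass
    return account.balance >= 0


superclass_ok = check_withdraw_contract(Account(100))
subclass_ok = check_withdraw_contract(OverdraftAccount(100, limit=50))
print(superclass_ok, subclass_ok)
```

Whether the differing result points to a fault in the subclass or to a test expectation that no longer applies depends on whether the subclass is meant to be a true specialization. Either way, it is found only by rerunning the inherited test in the subclass context.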
Encapsulation
Encapsulation is not a source of errors but may be an obstacle to testing. Testing requires reporting on the concrete and abstract state of an object. The generally beneficial encapsulation features of object-oriented languages may make it hard to provide such reporting. Reliable verification of the reporting methods themselves is a persistent problem. Several approaches are possible. One can provide built-in or inherited state reporting methods. Low level probes or debug tools can be used to manually inspect an object but this is generally undesirable.
Proof-of-correctness techniques may help. A proved method could be excused from testing to bootstrap testing of other methods. Formal proof of correctness is equivalent to exhaustive testing, but can be difficult and time-consuming. Since state reporting methods tend to be small and simple, they should be relatively easy to prove. This provides a feasible way to overcome the reliable reporter problem.
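A built-in state reporting method might look like the following Python sketch. The class `BoundedStack` and its reporter names are hypothetical, chosen only to illustrate the idea. One reporter returns the concrete state (a copy of the representation) and the other returns the abstract state that a specification would name. Both are kept small, in the spirit of the argument that such reporters should be simple enough to verify by inspection or proof.

```python
class BoundedStack:
    """Hypothetical class under test with encapsulated state."""

    def __init__(self, capacity):
        self._items = []
        self._capacity = capacity

    def push(self, item):
        if len(self._items) >= self._capacity:
            raise OverflowError("stack is full")
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("stack is empty")
        return self._items.pop()

    def report_concrete_state(self):
        # Built-in reporter: returns a copy of the representation, so a
        # tester cannot modify the object through the report.
        return {"items": tuple(self._items), "capacity": self._capacity}

    def report_abstract_state(self):
        # Built-in reporter: the abstract state a specification would name.
        if not self._items:
            return "empty"
        if len(self._items) == self._capacity:
            return "full"
        return "loaded"


stack = BoundedStack(capacity=2)
observed = [stack.report_abstract_state()]
stack.push("a")
observed.append(stack.report_abstract_state())
stack.push("b")
observed.append(stack.report_abstract_state())
print(observed)
```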
Polymorphism
Each possible binding of a polymorphic component requires a separate test. It may be hard to find all such bindings, increasing the chance of errors and posing an obstacle to reaching coverage goals. This may also complicate integration planning, in that many server classes may need to be integrated before a client class can be tested.
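As a rough illustration of one test per binding, the Python sketch below uses a hypothetical `Shape` hierarchy and a hypothetical client function `total_area`. Each concrete class that can be bound to the call `shape.area()` gets its own test case, and the class hierarchy is inspected to flag bindings that have no test. Note that `__subclasses__()` returns direct subclasses only, so a deeper hierarchy would need a recursive search.

```python
class Shape:
    """Hypothetical abstract server class."""

    def area(self):
        raise NotImplementedError


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


def total_area(shapes):
    """Client code: the call shape.area() is bound at run time."""
    return sum(shape.area() for shape in shapes)


# One test case per binding of the polymorphic call in total_area().
binding_cases = [
    (Square(3), 9),
    (Rectangle(2, 5), 10),
]

results = {}
for instance, expected in binding_cases:
    results[type(instance).__name__] = total_area([instance]) == expected

# Bindings found by inspecting the hierarchy (direct subclasses only);
# any subclass without a test case shows up as an untested binding.
all_bindings = {cls.__name__ for cls in Shape.__subclasses__()}
untested = all_bindings - set(results)
print(results, untested)
```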
White-box Testing
White-box or clear-box testing usually means that source code is examined to develop test cases. There are two variants of this: interface-based testing and method-based testing. Interface-based testing is sometimes incorrectly referred to as black-box testing, since it relies on developing test plans from visible class features: implementation messages and exceptions.
Conventional white-box approaches are based on analysis of either control or data flow. There is general agreement these techniques can be adapted to method testing, but are not sufficient for class testing. Applicability of conventional flow-graph approaches is questioned by several sources. They argue that either (a) flow-graph approaches are inconsistent with the object-oriented paradigm or (b) method-level control faults are not likely. The latter critique is more convincing. At least one approach does not consider specifications at all, proposing to derive test cases entirely from the implementation. The problems with white-box-only test cases are numerous, primarily resulting from the essentially tautological nature of the approach and the informational insufficiency of source code for an adequate test suite.
Black-box Testing
Test case development that uses requirements or specifications (as opposed to source code) is referred to as black-box testing. There is general agreement that conventional black-box methods will be useful for object-oriented systems. Jacobson presents a basic, general strategy for black-box testing [Jacobson 92] which draws on established testing techniques.
Some novel proposals have been made. The gauge approach proposes a specification integrated with the implementation which would also provide a test suite [Cox 90]. C++ assertions or Eiffel pre/post-conditions can offer similar self-checking. [Siegel 92] suggests several error-guessing strategies for deriving test cases. Several research systems have been developed to explore how automatic verification of abstract data types (ADTs) can be adapted to object-oriented systems. The ADT approach requires test drivers separate from the application. Some ADT approaches require specification information, others are strictly code based. The FREE test design methodology produces an integrated unit, subsystem, and system test suite from any OOA/D model [Binder 96].
Even well-developed specifications do not (and should not) express every implementation detail so some examination of source code is necessary. For example, retesting inherited features requires examination of class structure. The distance between object-oriented specification and implementation is typically small compared to conventional systems. The gap (and therefore usefulness) of the white-box/black-box distinction is decreasing.
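The self-checking offered by assertions and pre/post-conditions, mentioned above for C++ and Eiffel, can be approximated in Python with `assert` statements. The function `integer_sqrt` and the helper `violates_precondition` are hypothetical examples written for this sketch, not taken from any cited source. Note that Python skips `assert` statements when run with the `-O` option, so this is a development-time check only.

```python
def integer_sqrt(n):
    """Return the largest integer r with r * r <= n."""
    # Precondition: the caller must supply a non-negative integer.
    assert isinstance(n, int) and n >= 0, "precondition violated"

    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1

    # Postcondition: r is the integer square root of n.
    assert r * r <= n < (r + 1) * (r + 1), "postcondition violated"
    return r


def violates_precondition(value):
    """Return True if calling integer_sqrt(value) trips an assertion."""
    try:
        integer_sqrt(value)
    except AssertionError:
        return True
    return False


print(integer_sqrt(17), violates_precondition(-1))
```

Because the conditions travel with the code, every test that exercises the routine also checks the specification, which is the sense in which the specification "provides" part of the test suite.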
State-based Testing
State-based testing derives test cases by modeling a class as a state machine. Methods result in state transitions. The state model defines allowable transition sequences. For example, this means an instance must be created before it can be updated or deleted. Test cases are devised to exercise each transition. Several techniques have been proposed to identify state-based test sequences.
Jacobson's OOSE produces a class state model as part of design and advocates that all transitions be covered during testing. The transitions are tied to user-selectable activation sequences, called use-cases. Use-cases provide a framework for system testing. Alternatively, a white-box view of state can be defined as instance variable equivalence classes. This is a very low level view and requires either intrusive or built-in reporting on all instance variables. The ADT research systems generate method activation sequences, in effect testing the state transitions of a class.
The FREE approach makes extensive use of state models. For classes whose method sequence matters, FREE derives a test suite from the class dynamic (state) model. When sequence doesn't matter, interface dataflow is modeled by a regular expression to generate test sequences. FREE includes techniques for a systematic cover of state space. The size of the test suite grows in direct proportion (not exponentially) to the number of state variables. FREE can be used to synthesize a hierarchic state model of a system of objects (a "mode machine"). Tests are generated from the mode machine.
State-based testing is not without difficulties. The white-box state model may be intractable for even moderately complex classes. A lengthy sequence of operations may be required to place an object in a desired state to avoid violating encapsulation. If a class is designed to accept any possible sequence of method activation, state-based testing may not be a productive strategy.
The locus of state control may be distributed over an entire object-oriented application. Cooperative control means that it may be difficult to verify a class in isolation. To test an application system, a global state model is needed to show how classes interact.
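For a single class, the idea of covering every transition in a state model can be sketched as follows. This is a minimal illustration, not the FREE or OOSE technique: the state model `MODEL`, the class `Record`, and the driver `run_sequence` are hypothetical names. The model captures the create-before-update-or-delete rule mentioned earlier, and the class is written independently of the model so that the comparison is not tautological.

```python
# Hypothetical state model: (state, method) -> next state.
# Any pair not listed is an illegal sequence.
MODEL = {
    ("absent", "create"): "present",
    ("present", "update"): "present",
    ("present", "delete"): "absent",
}


class Record:
    """Hypothetical class under test, written independently of MODEL."""

    def __init__(self):
        self._exists = False
        self._revision = 0

    def create(self):
        if self._exists:
            raise RuntimeError("already created")
        self._exists = True

    def update(self):
        if not self._exists:
            raise RuntimeError("nothing to update")
        self._revision += 1

    def delete(self):
        if not self._exists:
            raise RuntimeError("nothing to delete")
        self._exists = False

    def state(self):
        # Built-in state reporter used by the test driver.
        return "present" if self._exists else "absent"


def run_sequence(methods):
    """Drive a fresh Record through `methods`.

    Returns the set of (state, method) transitions exercised and whether
    every resulting state matched the model.
    """
    record = Record()
    exercised = set()
    conforms = True
    for name in methods:
        before = record.state()
        getattr(record, name)()
        exercised.add((before, name))
        conforms = conforms and record.state() == MODEL.get((before, name))
    return exercised, conforms


# One sequence that covers every transition in the model.
exercised, conforms = run_sequence(["create", "update", "delete"])
uncovered = set(MODEL) - exercised

# Negative test: the model forbids update before create.
try:
    run_sequence(["update"])
    illegal_rejected = False
except RuntimeError:
    illegal_rejected = True

print(uncovered, conforms, illegal_rejected)
```

The sketch tests one class in isolation. When state control is distributed over several cooperating classes, a model of this kind would have to describe their combined behavior.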
Adequacy and Coverage
The scope of retesting under inheritance established in [Perry 90] is widely, if occasionally reluctantly, accepted. This defines a general scope of testing, but does not advocate a specific coverage based on a probable fault hypothesis. Most approaches call for "thorough" testing. Except for coverage goals borrowed from conventional testing, this is not given an operational definition. Some coverages advocated include statement, multiple condition, and decision.
Integration Strategies
Integration of classes to create an application system must be closely tied to the overall development approach. There are two basic strategies: thread-based and uses-based. A thread consists of all the classes needed to respond to a single external input. Each class is unit tested, and then the thread set is exercised. Uses-based integration begins by testing classes that use few or no server classes. Next, classes that use the first group of classes are tested, followed by classes that use the second group, and so on. In general, the use of method stubs and drivers is to be avoided, but may be necessary.
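The uses-based ordering described above can be sketched as a level-by-level grouping of a uses graph. The function `uses_based_order` and the class names in the example are hypothetical. The sketch makes two simplifying assumptions: the first group contains only classes that use no other class (rather than "few or no"), and every server class also appears as a key in the input mapping.

```python
def uses_based_order(uses):
    """Group classes into integration levels.

    `uses` maps each class name to the set of server classes it uses.
    Level 0 holds classes that use no other class; each later level holds
    classes whose servers all appear in earlier levels.
    """
    remaining = {name: set(servers) for name, servers in uses.items()}
    integrated = set()
    levels = []
    while remaining:
        ready = sorted(
            name for name, servers in remaining.items()
            if servers <= integrated
        )
        if not ready:
            # Mutual uses (or an unknown server): no stub-free order exists.
            raise ValueError(f"cannot order without stubs: {sorted(remaining)}")
        levels.append(ready)
        integrated.update(ready)
        for name in ready:
            del remaining[name]
    return levels


uses = {
    "Date": set(),
    "Money": set(),
    "Account": {"Money", "Date"},
    "Statement": {"Account", "Date"},
    "Teller": {"Account", "Statement"},
}
print(uses_based_order(uses))
```

The error case corresponds to the situation noted above in which stubs or drivers may be necessary: when classes use each other cyclically, no class in the cycle can be tested with all of its servers already integrated.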
Test Process Strategy
Object-oriented development tends toward shorter incremental cycles. With testing added, object-oriented development can be characterized as design a little, code a little, test a little. Process issues include when to do test case design, integration strategy, and extent of retesting for inherited features. Integration is constrained by the structure of inter-class uses in an application. Integration planning based on client-server dependencies within a design-code-test cycle is necessary.
[Jacobson 92] presents a complete and coherent approach consistent with generally accepted software engineering practice and IEEE testing standards. OOSE calls for complete requirements and detail design, and uses these representations to produce test plans. [Berard 93] presents a detailed schema for an object-oriented test suite. At the other extreme, [Siegel 92] advocates continuous testing in small increments by pairs of developers. [Taylor 92] advocates mandatory class testing before adding a class to the library. [Graham 94] argues that prototyping and usability assessment should be included in testing.
III. The State of the Practice -- An Impression
A subjective assessment of current practice follows. This is based on several sources: (1) published reports, (2) traffic on public networks, (3) available test tools, and (4) conversations with many developers and testers.
Published Reports
[Fiedler 89] describes a unit test strategy applied to C++ programs in a life-critical health care application. The McCabe basis path technique was used to identify test cases for member functions, followed by domain analysis to assign specific values. Classes derived from "well covered" base classes received less testing than those with an uncovered base. Tests of the "signals or exceptions that are raised (not propagated) by each function" were also devised. Test cases were written to "ensure complete path coverage of each member function." This technique revealed 5.1 defects per KDSI. Fiedler concludes "The main difference we have found so far is that each object must be treated as a unit, which means that unit testing in an object-oriented environment must begin earlier in the life cycle."
[Murphy 92] describes an automated testing approach used for telecommunication applications developed in Eiffel and C++. "Each test case is described by providing a trace (i.e. sequence of messages to objects) and associated with it some aspect of the expected behaviors of the class in response to that trace." A test script contains declarations and initialization information accessible to all test cases. Tests for clusters (a collection of classes with client-server interfaces) were specification based. Individual class testing was specification and program based. Detail design, programming, unit test, and integration test on cluster components were done in parallel. The process sequence for application-specific clusters was (1) perform specification-based cluster testing, (2) class test each problem-domain class in the cluster, (3) class test each sub-system interface class (e.g., file i/o), (4) class test each class doing asynchronous message sends, and (5) class test each abstract and generic class. Each class in a reusable cluster was tested individually using specification-based testing. System testing was initiated when all cluster tests were completed.
A test process for a distributed, multi-user CASE system used a parallel test class structure. This system supported structured analysis on a Sun-3 and was implemented in Objective-C. Unit test was done by developing a test class for each application class, with a test method that corresponded to each application method. "The testing arrangement was therefore quite complex but provided a very thorough testing procedure." In another project, C++ classes for a GUI environment were designed to be independent. A "test harness" was developed and used to test each class separately [Wilke 93].
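A parallel test class structure of the kind described for the CASE system (one test class per application class, one test method per application method) might be sketched in Python as follows. The original system was written in Objective-C; the classes `Queue` and `QueueTest`, the `test_<name>` naming convention, and the helper `missing_test_methods` are assumptions made for this illustration only.

```python
import inspect


class Queue:
    """Hypothetical application class."""

    def __init__(self):
        self._items = []

    def enqueue(self, item):
        self._items.append(item)

    def dequeue(self):
        return self._items.pop(0)

    def size(self):
        return len(self._items)


class QueueTest:
    """Parallel test class: one test_<name> method per Queue method."""

    def test_enqueue(self):
        queue = Queue()
        queue.enqueue("x")
        return queue.size() == 1

    def test_dequeue(self):
        queue = Queue()
        queue.enqueue("x")
        queue.enqueue("y")
        return queue.dequeue() == "x"

    # test_size is deliberately missing to show the structural check.


def missing_test_methods(app_class, test_class):
    """List public methods of app_class with no test_<name> counterpart."""
    public = [
        name
        for name, _ in inspect.getmembers(app_class, inspect.isfunction)
        if not name.startswith("_")
    ]
    return [name for name in public if not hasattr(test_class, "test_" + name)]


tester = QueueTest()
outcomes = {
    name: getattr(tester, name)()
    for name in dir(tester)
    if name.startswith("test_")
}
print(outcomes, missing_test_methods(Queue, QueueTest))
```

A structural check like `missing_test_methods` is one way to keep the two hierarchies in step as application classes change.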
Network Traffic
Public electronic forums offer an indicator of interest and current concerns among software practitioners. I have routinely posted messages asking about object-oriented testing in several Compuserve software forums, including those dealing with software engineering, CASE, programming tools and techniques, C++ programming, and Smalltalk programming. The few responses I've gotten express mild curiosity, but little else. The Usenet (available to Internet users) has had several threads about object-oriented testing. A FAQ (frequently asked questions) file about testing object-oriented programs is available (try Usenet groups comp.object and comp.software.testing). The threads have identified publications and tools, and offered some debate of the technical issues.
Test Automation
It is a practical impossibility to do effective software testing without automation. Several object-oriented test automation products have recently become available. An object-oriented test tool parses an object-oriented program or specification to generate and track test cases. There are now three such products. Two are code-based tools for C++. Both offer automatic computation of the MIT metrics [Chidamber 91], automatic white-box test case generation, support for test process management, and many other capabilities. The third is a specification-based tool. It can generate test cases from at least one object-oriented specification technique and provides other test management capabilities. There are many general purpose test tools (e.g., GUI drivers) which will work equally well for object-oriented and conventional implementations. They are not object-oriented per se.
Assessment
While only a small minority of developers practice adequate testing [Hetzel 91], few conventional developers would strongly contend testing is irrelevant. However, there are indications that some object-oriented developers view their artifacts as so fundamentally different as to obviate the need or applicability for generally accepted testing practice. [Siegel 92] notes that "Many people are suggesting that testing needs and costs will be lower for OOD systems. In reality, this probably will not be true." [Perry 90] argues "... we have uncovered a flaw in the general wisdom about object-oriented languages -- that 'proven' (that is well-understood, well-tested, and well-used) classes can be reused as superclasses without retesting the inherited code." After recommending "... each class must be thoroughly tested", [Taylor 92] notes "To many object programmers, this degree of analysis and testing may seem like a major intrusion on what should be a quick, intuitive process of class definition."
Available information suggests that testing is not a pressing concern for many object-oriented developers. Organizations that develop applications where high reliability is necessary have evolved their own approach. In one such situation, developers used an ad hoc approach to identify test cases. However, after control path instrumentation, it turned out that only 37 percent of the control paths were exercised by the ad hoc test suite. Overall, professional practice for testing object-oriented systems is in a formative, early stage. The new offerings of object-oriented testing tools should stimulate greater effort in this area, as well as interest in testing methods.
IV. Conclusions
While object-oriented programming languages may reduce some kinds of errors, they increase the chance of others. Methods tend to be small with low algorithmic complexity, so path faults are less likely. Encapsulation prevents many of the problems that result from uncontrolled data scoping. However, there is no compelling reason (let alone evidence) to suppose that developers of object-oriented programs are more immune to errors than those writing in conventional languages. Coding errors (misspell, misname, wrong syntax) are probably as likely as ever. Ironically, some of the essential features of object-oriented languages pose new fault hazards.
It is clear that class testing must be closely involved with class programming. The necessity of reuse via inheritance means that parallel, reusable test suites are imperative. Object-oriented unit code/test proceeds in a shorter cycle than corresponding activities in conventional development. The development process must enable short code/test cycles. But not all test activities follow this pattern. Sub-system integration and system test cannot be done until some or all components are in hand. These test activities will continue to be some of the final steps in object-oriented development. Both technical and process issues must be addressed for effective object-oriented testing. Effective testing must be automated and integrated with development, as well as systematically related to most-probable errors. Overall, several observations can be made.
I think testing of object-oriented software will be more important for producing high quality systems, compared to conventional implementations. In conventional systems, static verification (walkthroughs or inspections) is much more effective and efficient in removing faults than testing. However, object-oriented source code can be considerably harder to read than well-structured conventional source for at least three reasons: (1) the yo-yo problem -- the fragmentation of functionality that can result from inheritance, (2) dynamic binding, and (3) cooperative control strategies where control flow and state control are distributed over several classes. With these obstacles to comprehension, static techniques will be less effective, requiring more testing to achieve the same level of quality.
References
[Berard 93] Edward Berard, Essays on Object-Oriented Software Engineering. Prentice-Hall, 1993.
[Binder 96] Robert V. Binder, Overview of the FREE Approach to Object-Oriented Testing. http://www.rbsc.com/pages/FREE.html, 1996.
[Chidamber 91] Shyam R. Chidamber and Chris F. Kemerer, "Toward a Metrics Suite for Object-Oriented Design," ACM Conference on Object-Oriented Programming Systems, Languages and Applications (OOPSLA), October 1991, pp. 197-211.
[Cox 90] Brad J. Cox, "Planning the Software Industrial Revolution," IEEE Software, Nov. 1990, pp. 25-33.
[Fiedler 89] Steven P. Fiedler, "Object-Oriented Unit Testing," Hewlett-Packard Journal, April 1989, v. 40, n. 2, pp. 69-75.
[Graham 94] Ian Graham, Object-Oriented Methods. 2nd ed. Addison-Wesley, 1994.
[Harrold 92] Mary Jean Harrold, John D. McGregor, and Kevin J. Fitzpatrick, "Incremental Testing of Object-Oriented Class Structures," Proceedings, 14th International Conference on Software Engineering, May 1992, pp. 68-80.
[Hetzel 91] Bill Hetzel and Dave Gelperin, "Software Testing: Some Troubling Issues," American Programmer, v. 4, n. 4, April 1991.
[Jacobson 92] Ivar Jacobson, Magnus Christerson, Patrik Jonsson, and Gunnar Overgaard, Object-Oriented Software Engineering. Addison-Wesley, 1992.
[Murphy 92] Gail Murphy and Pok Wong, "Towards a Testing Methodology for Object-Oriented Systems," Poster Paper, OOPSLA 92.
[Perry 90] Dewayne E. Perry and Gail E. Kaiser, "Adequate Testing and Object-oriented Programming," Journal of Object-Oriented Programming, v. 2, n. 5, Jan/Feb 1990, pp. 13-19.
[Siegel 92] Shel M. Siegel, "Strategies for Testing Object-oriented Software," Compuserve CASE Forum Library, September 1992.
[Wilke 93] George Wilke, Object-Oriented Software Engineering: The Professional Developer's Guide. Addison-Wesley, 1993.
First Release: 1 December 1995. Last Rev: 15 October 2001