http://www.spinellis.gr/pubs/jrnl/2003-IEEESW-umlgraph/html/article.html
This is an HTML rendering of a working paper draft that led to a publication. The publication should always be cited in preference to this draft using the following reference:


This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.



© 2003 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

On the Declarative Specification of Models

Diomidis Spinellis

Colleagues in my research group, as well as in collaborating institutions, typically model software designs using graphical tools like Rational Rose, Together, and Visio. I often witness them toiling to adjust a graph's appearance with the mouse or laboriously visiting each class to change the type of a single field. This need not be so. I propose that design models should be composed textually, and graphs should be automatically generated. You may find it perverse to employ two different representations (textual and graphical) for the same underlying model. To substantiate my view I will therefore outline the advantages of graphical models, describe the benefits gained from directly manipulating a textual representation, and illustrate my point with a prototype implementation.

Graph-based Models

There is no rule specifying that models should appear in a graphical form. A model is a simplification of reality, so a model for a software artifact could really be an outline of that artifact; think of a class definition without code in the method bodies. However, we usually prefer to examine many of our models in a graphical representation: UML employs nine different diagrams for visualizing different perspectives of a system.

Using a diagram to represent a model has a number of advantages. When we examine the graphical representation of a model we utilize our visual cognitive apparatus, which has some millions of years of evolutionary advantage over our text-reading abilities. The two-dimensional representation of a diagram is a lot more expressive than text, which is typically scanned from left to right and from top to bottom. Diagrams can be viewed following different directions to gain distinct insights, while the use of a larger symbol set makes them more expressive. In addition, one can obtain different levels of detail from the same diagram: a bird's-eye view will easily convey a system's structure, while examining a class in detail can reveal its collaborators. Finally, a diagram can allow us to identify patterns; again, two-dimensional pattern matching is an activity we humans are particularly good at.

The Drawing Editor Approach

Designers typically create their model diagrams using a drawing editor. The semantic distance between the editor's graphical model representation and the underlying software artifact can vary enormously: some tools like Visio are pure drawing aids, others like Rational Rose can offer round-trip engineering, while tools like ArgoUML can provide domain-specific advice during the design phase. However, all drawing editors require the tedious placing and manipulation of drawing shapes on the canvas. The effort and the motor coordination skills required for this activity are mostly irrelevant to the end result: unlike architectural or mechanical engineering models, the appearance of a software system's model diagram is only marginally related to the quality of the represented software design. The drawing activity is, however, a creative task providing immediate feedback; software engineers thus often focus on delivering a nice picture rather than an effective design. Furthermore, the internal representation of the model is typically opaque, or under the control of the drawing editor tool, and thus at odds with vertical software process activities like configuration and revision control. Finally, the semantic distance between the model and the artifact is large enough to burden activities that are naturally performed on software code, such as refactoring, automatic code generation, and metric extraction.

Declarative Modeling

Computer power and automatic graph drawing algorithms [1] have now advanced sufficiently to allow the automatic placement of graph nodes on the canvas and the near-optimal routing of the respective edges. We can therefore design models using a declarative textual representation and subsequently view, publish, and share them in graphical form. Building architects employ a similar technique when they create realistic ray-traced pictures of a building out of so-called 2½-dimensional ground plans. In our case, a model expressed in a Java-like notation such as the following:

class Asset {}
class InterestBearingItem {}
class InsurableItem {}
/** 
 * @extends InsurableItem 
 * @extends InterestBearingItem 
 */
class BankAccount extends Asset {}
/** @extends InsurableItem */
class RealEstate extends Asset {}
class Security extends Asset {}
class Stock extends Security {}
class Bond extends Security {}
class CheckingAccount extends BankAccount {}
class SavingsAccount extends BankAccount {}
can be used to automatically create the diagram that appears below:

You can read more about the tools I use to generate such diagrams at http://www.spinellis.gr/sw/umlgraph.
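
As a rough illustration of the text-to-diagram step, the following Python sketch turns class declarations like those above into input for Graphviz dot. It is not how the tools mentioned above work internally; the function name `model_to_dot`, the regular expressions, and the chosen dot attributes are assumptions made for this example, and the parser handles only the small subset of the Java-like notation used in the model.

```python
import re

# A class declaration, optionally preceded by a /** ... */ comment.
# Only the tiny notation subset shown in the article is handled.
DECL = re.compile(
    r"(?:/\*\*(?P<doc>(?:(?!\*/).)*)\*/\s*)?"
    r"class\s+(?P<name>\w+)(?:\s+extends\s+(?P<parent>\w+))?\s*\{",
    re.DOTALL,
)
# Additional parents are declared with @extends tags inside the comment.
TAG = re.compile(r"@extends\s+(\w+)")

MODEL = """\
class Asset {}
class InterestBearingItem {}
class InsurableItem {}
/**
 * @extends InsurableItem
 * @extends InterestBearingItem
 */
class BankAccount extends Asset {}
/** @extends InsurableItem */
class RealEstate extends Asset {}
class Security extends Asset {}
class Stock extends Security {}
class Bond extends Security {}
class CheckingAccount extends BankAccount {}
class SavingsAccount extends BankAccount {}
"""


def model_to_dot(source: str) -> str:
    """Translate class declarations into a dot graph (hypothetical helper)."""
    nodes = ["digraph model {", "  rankdir=BT;", "  node [shape=box];"]
    edges = []
    for m in DECL.finditer(source):
        name = m.group("name")
        nodes.append(f"  {name};")
        parents = [m.group("parent")] if m.group("parent") else []
        parents += TAG.findall(m.group("doc") or "")
        for parent in parents:
            # A hollow arrowhead pointing at the parent denotes generalization.
            edges.append(f"  {name} -> {parent} [arrowhead=empty];")
    return "\n".join(nodes + edges + ["}"])


if __name__ == "__main__":
    print(model_to_dot(MODEL))
```

Saving the output in a file such as `model.dot` and running `dot -Tpng model.dot -o model.png` should then produce a picture of the hierarchy; node placement and edge routing are left entirely to dot.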

Creating models in a declarative, textual notation offers a number of advantages. First of all, the model composition mechanism matches well both a programmer's high-level skills (the textual abstract formalization of concrete concepts) and the associated low-level skills (the manipulation of text using an editor and other text-based tools). The declarative notation, by being closer to the program's representation (the notation I experimented with is based on the Java syntax and semantics), forces the designer to distinguish between the model and the respective implementation, between the essential system characteristics and the trivial adornments. It is more difficult for designers to get away, as they often do now, with passing off a nice picture of the implementation they have in mind as a model. The declarative representation is also highly malleable: the existing visual structure does not hinder drastic changes, nor does the effort already spent on the tidy arrangement of graph nodes become a psychological barrier against massive design refactoring. Declarative models are also highly automatable: they can be easily generated from even higher-level descriptions by trivial scripts and tools operating on design process inputs such as database schemas, existing code, or structured requirements documents [2]. Text macro processors can be used for configuration management, while revision control and team integration activities can utilize the same proven tools and processes that are currently used for managing source code. Thus with a tool like RCS one can keep track of design revisions, create and merge branches, and monitor model changes, while a system like CVS can allow work to be split among teams. Finally, the declarative approach can readily utilize existing text processing tools for tasks that a drawing editor system may not provide.
Consider how your favorite model editor handles the following tasks and how you could handle them using a simple Perl script or a text-processing pipeline applied to the declarative model specification: identify all classes containing a given field (as a prelude to an aspect-oriented cross-cut); count the total number of private fields in a given design; order methods appearing in multiple classes by their degree of commonality; identify differences between two designs.
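
Three of these tasks can be sketched in a few lines of Python operating on a model written in the same Java-like notation. The sample classes and fields, as well as the helper names `parse_fields`, `classes_with_field`, `count_private_fields`, and `design_diff`, are invented for illustration; the regular expressions assume one simple field declaration per line and class bodies without nested braces.

```python
import difflib
import re

# A class header followed by a brace-delimited body (no nested braces assumed).
CLASS = re.compile(r"class\s+(\w+)[^{]*\{(.*?)\}", re.DOTALL)
# A field declaration: optional visibility, a type, a name, and a semicolon.
FIELD = re.compile(
    r"^\s*(public|protected|private)?\s*([\w<>\[\]]+)\s+(\w+)\s*;",
    re.MULTILINE,
)

# Two revisions of a made-up design, differing in the type of one field.
DESIGN_V1 = """\
class Account {
    private String owner;
    private double balance;
}
class Vehicle {
    private String owner;
    int wheels;
}
"""
DESIGN_V2 = DESIGN_V1.replace("private double balance;", "private long balance;")


def parse_fields(source):
    """Return {class name: [(visibility, type, field name), ...]}."""
    return {
        name: [(vis or "package", typ, fld) for vis, typ, fld in FIELD.findall(body)]
        for name, body in CLASS.findall(source)
    }


def classes_with_field(model, field):
    """Identify all classes containing a field with the given name."""
    return sorted(c for c, fields in model.items() if any(f[2] == field for f in fields))


def count_private_fields(model):
    """Count the private fields across the whole design."""
    return sum(1 for fields in model.values() for f in fields if f[0] == "private")


def design_diff(old, new):
    """List the textual differences between two designs as a unified diff."""
    return list(
        difflib.unified_diff(old.splitlines(), new.splitlines(), "old", "new", lineterm="")
    )


if __name__ == "__main__":
    model = parse_fields(DESIGN_V1)
    print(classes_with_field(model, "owner"))
    print(count_private_fields(model))
    print("\n".join(design_diff(DESIGN_V1, DESIGN_V2)))
```

Because the model is plain text, the comparison of the two designs needs nothing beyond a generic line-based diff; the same result could be obtained with the diff facilities of a revision control system.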

The declarative specification of software models is clearly not a panacea. Our current UML diagram design prototype stresses dot, the underlying graph layout generator, resulting, for example, in association multiplicity and visibility adornments that overlap with the respective edges. Furthermore, learning the declarative notation may be more difficult than experimenting with the toolbars of a GUI-based diagram editor competing for the designer's attention. However, since the maturity of a profession is also judged by the tools used by its practitioners, I believe that building and adopting a sharp declarative modeling toolset will enrich and advance the software engineering discipline.

Acknowledgements

The prototype I describe could not exist without the Graphviz graph visualization system; John Ellson and Stephen C. North graciously incorporated my changes for UML arrow styles into the tool's source distribution. Spyros Oikonomopoulos provided feedback during the development of this work.

References

  1. Emden R. Gansner, Eleftherios Koutsofios, Stephen C. North, and Kiem-Phong Vo. A technique for drawing directed graphs. IEEE Transactions on Software Engineering, 19(3):214-230, March 1993.
  2. Diomidis Spinellis and V. Guruprasad. Lightweight languages as software engineering tools. In J. Christopher Ramming, editor, USENIX Conference on Domain-Specific Languages, pages 67-76, Santa Monica, CA, USA, October 1997. USENIX Association.