2009-06-14

Using UML Component diagrams to embed e-Framework Service Usage Models

Given the background of what embedding SUMs in other SUMs can mean, I'm going to model what that embedding can look like from a systems POV, using UML component diagrams. The tool is somewhat awkward to the task, but I was rather taken with the ball-and-socket representation of interfaces in UML 2.0—even if I have to abandon that notation where it counts. I'm also using this as an opportunity to explore specifying the data sources for embedded SUMs—which may not be the same as the data sources for the embedding SUM.

The task I set myself here is to model, using embedded SUMs, functionality for searching for entries in a collection, annotating those entries, and syndicating the annotations (but not the collection entries themselves).

We can represent what needs to happen in an Activity diagram, which captures the fact that entries and annotations involve two different systems. (We'll model them as distinct data stores):

We can go from that Activity diagram to a simple SUM diagram, capturing the use of four services and two data sources:

But as indicated in the previous post, we want to capitalise on the existence of SUMs describing aspects of collection functionality, and modularise out service descriptions already given in those SUMs (along with the context those SUMs set). So:

where a "Searchable Collection" is a service usage model on searching and reading elements in a collection, and "Shareable Collection" is a service usage model on syndicating and harvesting elements in a collection—and all those services modelled may be part of the same system. We are making an important distinction here: the embedded searchable and shareable collection SUMs are generic, and can be used to expose any number of data sources. We nominate two distinct data sources, and align a different data source to each embedded SUM. So we are making the entries data source searchable, but the annotations data source shareable; and we are not relying on the embedded SUMs to tell us what data sources they talk to, when we do this orchestration.

Which is all very well, but what does embedding a SUM actually look like from a running application? I'm going to try to answer that through ball-and-socket. The collection SUM models a software component, which exposes several services for other systems and users to invoke. That software component may be a standalone application, or it may be integrated with other components to build something greater; that flexibility is of course the point of Service Oriented Architecture (and Approaches). The software component exposes a number of services, which can be treated as ports into the component:

And an external component can interface through one or more of those exposed services, giving software integration:

Each service defines its own interface, and the interface to a port is modelled in UML as a realisation of a component (hollow arrowhead): it's the face of the component that the outside world sees:

And outside components that use a port depend on that interface (dashed arrow): the integration cannot happen without that dependency being resolved, so the component using our services depends on our interface:

Exposed services have their interfaces documented in the SUM: that is part of the point of a SUM. But a SUM may not document the interface of just one exposed service, but of several. By default, it documents all exposed services. But if we allow a SUM to model only part of a system's functionality, then we can have different SUMs capturing only subsets of the exposed functionality of a system. By setting up simple, searchable and shareable collections, we're doing just that.

Now a SUM is much more than just an interface definition. But if a single SUM includes the interface definitions for all of Add Read Replace Remove and Search, then we can conflate the interfaces for all those services into a single reference to the searchable collection SUM—where all the interfaces are detailed. We can also have both the simple and the searchable collection SUMs as alternate interfaces into our collection: one gives you search, the other doesn't. (Moreover, we could have two distinct protocols into the collection, so that the distinction may not just be theoretical.)

This is not a well-formed UML diagram, on purpose: the dependency arrows are left hanging, as a reminder that each interface (a SUM) defines several service endpoints into the component. The reason that's not quite right is that the UML interface is specific to a port—each port has its own inteface instance; so a more correct notation would have been to preserve the distinct interface boxes, and use meta-notation to bundle them together into SUMs. Still, the very act of embedding SUMs glosses over the details of which services are being consumed from the embed. So independently of the multiple incoming (and one outgoing) arrows per interface, this diagram is telling us the story we need to tell: a SUM defines bundles of interfaces into a system, and a system may have its interfaces bundled in more than one way.

Let's return to our initial task; we want to search for entries in a collection, annotate those entries, and syndicate the annotations. We can model this with component diagrams, ignoring for now the specifics of the interfaces: we want the functionality identified in the first SUM diagram, of search, read, annotate, and syndicate. In a component diagram, what we want looks like this:

The Entries component exposes search and read services; the Annotations component (however it ends up realised) consumes them. The Annotations component exposes an annotate service to end users, and a syndicate service to other components (wherever they may be).

That's the functionality needed; but we already know that SUMs exist to describe that functionality, and we can use those SUMs to define the needed interfaces:

The Entries collection exposes search and read services through a Searchable Collections SUM, which targets the Entries data source. The Annotations collection exposes syndicate services through a Shareable Collections SUM, which targets the Annotations data source.

Now, in the original component diagram, Annotate was something you did on the metal, directly interfacing with the Entries component:

Expanding it out as we have, we're now saying that realising that Annotate port involves orchestration with a distinct Annotation data source, and consumes search and read services. So we map a port to a systems component realising the port:

Slotting the Annotate port onto a collection is equivalent to slotting that collection into the Search and Read service dependency of the the Annotate system:

So we have modelled the dependency between the Entries and Annotate components. But with interfaces, the services they expose, and data sources as proxies for components, we have enough to map this component diagram back to a SUM, with the interface-bundling SUMs embedded:

The embedded SUMs bundle and modularise away functionality. Notice that they do not necessarily define functionality as being external, and so they do not only describe "other systems". The shareable SUM exposes the annotations, and the searchable SUM exposes the entries: their functionality could easily reside on the same repository, and we can't think of both the Entries and the Annotations as "external" data—if we did, we'd have no internal data left. The embedded SUMs are simply building blocks for system functionality—again, independently of where the functionality is provided from.

What does anchor the embedded SUMs and the services alike are the data sources they interact with. An Annotations data sources can talk to a single Annotate service in the SUM, as readily as it can to a Syndicate service modularised into Shareable Collection. Because an embedded SUM can be anchored to one of "our" data sources, just like a standalone service can. That means that, if a SUM will be embedded within another SUM, it's important to know whether the embedded SUM's data sources are cordonned off, or are shared with the invoking context.

An authentication SUM will have its own data sources for users and credentials, and no other service should know about them except through the appropriate authorisation and authentication services. But a Shareable Collections SUM needs to know what data source it's syndicating—in this case, the same data source we're putting our annotations into. So the SUM diagram needs to identify the embedded SUM data source with its own Annotations data source. If data sources in a SUM can be accessed through external services, then embedding that SUM means working out the mapping between the embedding and embedded data sources—as the dashed "Entries" box shows, two diagrams up.

SUM diagrams are very useful for sketching out a range of functionality, and modularisation helps keep things tractable, but eventually you will want to insert slot A into tab B; if you're using embedded SUMs, you will need to say where the tabs are.

No comments: