You are currently browsing the Core Data Integration Project weblog archives for October, 2012.

Explaining the importance of a good data integration architecture to senior management

Monday, October 22nd, 2012

In the summer I ran a workshop with the Portfolio Executive, which is our senior decision-making body at the University regarding the distribution of funds to internal projects. In one half of the workshop I led a discussion about the need to regard master data as an asset and about the problems with our current data integration architecture. I claimed that too high a percentage of our current system-to-system integrations are point-to-point, a position we have arrived at naturally after many years of allowing technical developers to create bespoke integrations for new systems.

I discussed how this architecture is risky for the institution: the combination of a lack of documentation, the lack of a standards-based approach to data integration and too many system-to-system joins (a “spaghetti” landscape of systems) means that data model changes often propagate through our network of systems in an unmanaged way. This can result in the sudden, unplanned loss of functionality in ‘consumer’ IT systems because the implications of changes made in a source IT system are not easy to appraise. We have had a case, for example, where the organisational hierarchy was changed and unexpectedly broke a system that happened to depend on it.

We also suffer in some cases from a lack of agility in responding quickly and efficiently to new imperatives (such as Government requirements for Key Information Sets or changes in REF requirements). When we start to replace a system (or multiple systems) with a new one, we usually have to go back to the drawing board to deduce how many interfaces there are to the current system and which interfaces will have to be put in for the new one. This can be overly time-consuming. All too often we create interfaces that essentially do the same task as other interfaces, e.g. moving student data objects between systems, or transferring data about organisational units, research publications and so on.

We began to tackle this duplication problem some years ago when we attempted to replicate key master data in a core system called the Datahub, but we have not fully met requirements with this approach. For example, the Datahub is not realtime (some data in it can be up to 24 hours old), which causes many new system implementations to avoid the Datahub and instead connect directly to source data systems. The consequence is that we simply perpetuate the problem of having many point-to-point integrations.
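To make the "implications of changes are not easy to appraise" point concrete, here is a minimal sketch of the kind of impact check that documented interfaces would allow. It assumes each interface is recorded as a (source, consumer, entity) row; the rows, the `HR` system name and the `impacted_systems` function are all hypothetical and purely illustrative, not a description of our actual systems.

```python
from collections import defaultdict, deque

# Hypothetical interface records: (source system, consumer system, data entity).
INTERFACES = [
    ("HR", "Datahub", "org_unit"),
    ("Datahub", "Timetabling", "org_unit"),
    ("HR", "Research Information System", "org_unit"),
    ("Student Information System", "Timetabling", "course"),
    ("Research Information System", "Content Management System", "publication"),
]


def impacted_systems(interfaces, source, entity):
    """Return every system that receives `entity` from `source`, directly or indirectly."""
    # Build a map of source -> direct consumers, for this entity only.
    downstream = defaultdict(set)
    for src, consumer, ent in interfaces:
        if ent == entity:
            downstream[src].add(consumer)

    # Breadth-first walk so that onward feeds (e.g. via the Datahub) are included.
    seen = set()
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for consumer in downstream[current]:
            if consumer not in seen and consumer != source:
                seen.add(consumer)
                queue.append(consumer)
    return sorted(seen)


print(impacted_systems(INTERFACES, "HR", "org_unit"))
# ['Datahub', 'Research Information System', 'Timetabling']
```

With undocumented point-to-point joins there is no such list to query, which is why a change to something like the organisational hierarchy can break a consumer without warning.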

Now, this is all rather technical. IT Services would like to invest in developing a new, more sophisticated architecture, in which we have abstracted out key functionality (such as system requests of the type ‘give me all the students from department X’), developed a service oriented architecture where appropriate and deployed ESB technology wherever realtime data needs are most pressing. We see the benefits this can bring, such as more reliable management of the knock-on effects of system changes (a reduction in system downtime), quicker project start-up times due to a more agile integration architecture, and a more standardised and therefore more sustainable system integration architecture in the longer term. However, convincing senior management to invest in the rather intangible-sounding concept of a more mature data integration architecture is difficult when constrained to non-technical language! This is a brief summary of how I attempted to describe the concepts to the Portfolio Executive this summer:

Using a jigsaw analogy suggested to me by the Head of IT services I explained that we constantly try to fit new systems into our existing systems architecture seamlessly as though they are pieces from the same jigsaw puzzle – quite a challenge:

[Figure: Jigsaw Architecture problem]

The more we buy third-party products, the more this is a real puzzle. The yellow boxes, by the way, are to do with student/researcher lifecycles – a concept that the Portfolio Executive were already familiar with and which I have blogged about elsewhere (see my enterprise architecture blog).

Next I discussed how we can think of our IT systems roughly as supporting the three areas of research, education and support services (such as finance and administration). Sticking with the jigsaw analogy, I described how we try to connect systems wherever we need to reuse data from one system in another. For example, we might want to copy lists of courses stored in our Student Information System over to our timetabling system, or present publications stored in our research information system on the Web via our content management system. The ‘joins’, therefore, are created wherever we need a connection through which we can pass data automatically. This enables us to keep data in sync across systems and avoids any manual re-entry of data. It’s the data reuse advantage. I used the diagram below to help discuss this concept:

[Figure: Illustration of how we currently join IT systems together]

I described how, as our requirements around information become more sophisticated, the pressure on our data integration architecture increases. For example, we need integrated views of both research and teaching data to help inform discussions about individual academic performance, and we need cross-linked information about research and teaching on our website. If our data integration architecture is not fit for purpose (i.e. if the overall approach to system ‘joins’ is not planned, standardised, organised, documented and well governed) then we will struggle to deliver important benefits.

I used the following diagram to discuss what the future vision for the data architecture would look like:

[Figure: The To Be vision of Joining Systems]

This diagram is deliberately simplistic and purely illustrative. The blue ring is totally conceptual, but it allowed me to talk about the need to decouple the systems that consume data from the master data systems, rather than having them connect directly (i.e. to get away from such a proliferation of point-to-point system integrations). Naturally I’ve only shown a small subset of our systems in these diagrams, but this was enough to explain the concepts I needed to convey. I described how the blue ring could be made up of a combination of technologies, but that we would need to standardise carefully on these, organisation-wide, in order to increase our chances of sustaining the integration architecture over time. I didn’t mention service oriented architecture, but some of the blue ring could be composed of services that abstract out key, commonly used functionality using SOA technology. We didn’t discuss ESB, but some of the blue ring could use ESB technology. We have a data warehouse solution and this could be used to replicate some or even all of our master data if we wish to go that route for data reuse too.
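For technically minded readers, the decoupling idea behind the blue ring can be sketched in a few lines. This is only an illustration under assumptions: the class names (`IntegrationLayer`, `InMemoryStudentSystem`), the field names and the sample records are all invented for the example, and a real blue ring could equally be built from web services, an ESB or a data warehouse.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass(frozen=True)
class Student:
    """The agreed shape of a student record as seen by consuming systems."""
    student_id: str
    name: str
    department: str


class StudentSource(Protocol):
    """Anything able to answer 'give me all the students from department X'."""
    def students_in(self, department: str) -> List[Student]:
        ...


class InMemoryStudentSystem:
    """Stand-in for a master data system, with its own internal field names."""
    def __init__(self, rows: List[Dict[str, str]]) -> None:
        self._rows = rows

    def students_in(self, department: str) -> List[Student]:
        # Translate the internal fields into the agreed Student shape.
        return [
            Student(row["id"], row["full_name"], row["dept"])
            for row in self._rows
            if row["dept"] == department
        ]


class IntegrationLayer:
    """The 'blue ring': consumers call this and never the master system itself."""
    def __init__(self, source: StudentSource) -> None:
        self._source = source

    def students_in_department(self, department: str) -> List[Student]:
        return self._source.students_in(department)


master = InMemoryStudentSystem([
    {"id": "s1", "full_name": "Ada", "dept": "Physics"},
    {"id": "s2", "full_name": "Grace", "dept": "Mathematics"},
])
ring = IntegrationLayer(master)
print([s.name for s in ring.students_in_department("Physics")])  # ['Ada']
```

Because consumers only ever see `Student` and `IntegrationLayer`, the master system could rename its internal fields, or be replaced altogether, and only the translation inside the source class would need to change.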

Determining the exact composition of the blue ring (i.e. the exact future data integration architecture for our institution) is not possible for us to do yet because we are still gathering information about the “As Is” (see my blog on the Interface Catalog). When we have fuller data about our current architecture we will be able to review it and decide by what percentage we wish to reduce groups of point-to-point integrations (replacing them with web service APIs, say), how we might want to replace the Datahub with data warehouse technology, and so on.
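As a rough illustration of the sort of review that fuller catalogue data would make possible, the sketch below groups interfaces by source system and entity and reports what share of them sit in a group that might be served by one shared API. The catalogue rows, the `Library` and `Virtual Learning Environment` consumers, the function names and the threshold of two are all assumptions made for the example, not figures from our real catalogue.

```python
from collections import Counter

# Hypothetical interface catalogue entries.
CATALOGUE = [
    {"source": "Student Information System", "consumer": "Timetabling", "entity": "student"},
    {"source": "Student Information System", "consumer": "Library", "entity": "student"},
    {"source": "Student Information System", "consumer": "Virtual Learning Environment", "entity": "student"},
    {"source": "Research Information System", "consumer": "Content Management System", "entity": "publication"},
]


def consolidation_candidates(catalogue, threshold=2):
    """Count interfaces per (source, entity) and keep groups with at least `threshold` members."""
    counts = Counter((row["source"], row["entity"]) for row in catalogue)
    return {key: n for key, n in counts.items() if n >= threshold}


def replaceable_share(catalogue, threshold=2):
    """Percentage of catalogued interfaces that belong to a candidate group."""
    if not catalogue:
        return 0.0
    candidates = consolidation_candidates(catalogue, threshold)
    return 100.0 * sum(candidates.values()) / len(catalogue)


print(consolidation_candidates(CATALOGUE))
# {('Student Information System', 'student'): 3}
print(replaceable_share(CATALOGUE))
# 75.0
```

A figure like this would only be a starting point for discussion; whether a given group is actually worth replacing depends on factors the catalogue alone does not capture, such as realtime requirements.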

In order to throw more resource at developing the information needed for this analysis, we will be delivering a business case to the portfolio executive, requesting funds to help us. I hope to indicate how we’ve described the business model in that document in a future blog. Meanwhile, it is possible to continue to mature the integration architecture on a project by project basis.

The JISC are hosting an Enterprise Architecture workshop next month at which SOA is on the agenda. I hope to have useful conversations at that event about how far other universities have got in maturing their data integration architectures. One thing I can report is that the Portfolio Executive felt that they did understand the concept of the ‘blue ring’ and its importance, and are prepared to accept our forthcoming business case. This feels like an important step forward for our organisation.



Core Data – an important asset in terms of efficiency, agility and strategy

Sunday, October 21st, 2012

Earlier in this project I read the JISC infoNet “10 Steps to Improving Organisational Efficiency” with interest, and considered where this project and other activities we are doing at the University of Bristol fit in with the ten steps summarised.

Step 1 contains the instruction “Review a copy of your college or university’s strategic plan”. This is indeed one of the first jobs I undertook when I started as Enterprise Architect at Bristol in 2011. I have been doing some benefits maps work since to help understand our overarching strategy better – see my blog post about this at

Step 2 recommends: “Create a basic view of the functions carried out by the institution”. There is a recommended JISC infoNet Business Classification Scheme to help with this. I am not sure we have done this step explicitly so far, although one of my early pieces of work on Lifecycles looks at IT systems from a stakeholder perspective, and more recently I have been working with the Director of Planning, who is using the balanced scorecard approach. I think it would be useful for me to review how clearly we meet this objective.

Step 3: “Create a basic view of your IT Architecture showing the applications and services, the interfaces between them and the data transferred”. Well, it takes so few words to describe this step and yet this is a massive step for IT Services in my organisation. We have documented applications and services and are applying ITIL to good effect in terms of using a service catalogue as a central record for all IT systems offered as services within the organisation. However, the interfaces between our systems are absolutely key for us and the focus of this project. I have documented what we’re doing with our Interface Catalog in my other blog: and we’ve now started a data dictionary (a summary overview of which is offered via an Entity Relationship diagram for the data model) in relation to the data entities exchanged/manipulated at the interface level. I will blog soon on how we are agreeing on the data dictionary template as a standard and what data entities need to be documented in the data dictionary and why. This work is one of the main thrusts behind this JISC project.

Step 4: “Identify any ‘bloated’ or redundant applications that consume resource far in excess of their actual value to the organisation and plan to phase them out. In time you will look at the business processes that drive this.” This step is about looking at return on investment. Although I wouldn’t say we are managing this step through a formal methodology at the moment, we are looking at benefit in terms of benefits maps, and we are also developing extremely useful documentation about the costs of our services (especially the developer FTEs required to manage the systems that constitute those services, or the cost of outsourced support, in addition to any software licensing costs after upfront investment).

Step 5: “Use the IT architecture conclusions as a starting point for discussions involving management, teaching and administrative colleagues about architecture at enterprise level. … to build a roadmap of integrated process and ICT change.” This is a valuable step and at the moment this is happening through benefits work taking place with the Education, Research and Support areas of our organisation. When the three sets of benefits maps come together I think we will have an incredibly useful basis for dialogue about the priorities for IT support of strategic requirements, considered holistically across the University.

Step 6: “Identify the likely lifespan and replacement cycle for existing applications.” This is happening as part of our application of the ITIL standard across IT Services.

Step 7: “Consider how a service-oriented approach (SOA) to your data layer could streamline the architecture and reduce the need for interfaces/data retyping. Plan to turn the ‘spaghetti’ into a ‘lasagne’.” Although there are three steps after this one, I will stop at step 7 in this blog post because this is very much the focus of this JISC project. The documentation in our interface catalogue is showing us where, say, ten different interfaces are exchanging the same student entities, in which case we could abstract out the common functionality and offer it via a single API, possibly via a Web Service. That way, should we change the structure of a student entity in our master data system, we can manage the propagation of that change through the systems that consume that data far more safely, easily and efficiently than at present. We can do this in turn for every piece of functionality that we see is highly used. We can also consider the need for realtime data updates across our systems and where we might start deploying ESB solutions in earnest. For more information about our interface catalogue please see: