You know that old song about you saying ‘pot-at-o’ and me saying ‘pot-ay-to’ when we’re talking about the same thing? We often need to have similar conversations about data when we are designing interfaces.
Data mapping often looks deceptively easy, but start the work without the necessary business preparation and you might find that your integration code, no matter how carefully crafted, tested and documented, is completely flawed and worse than useless.
What is the "thing"?
The first trap that some analysts fall into is thinking that a data entity means the same thing across an organisation. Let’s take an extreme example: pretend that you are doing some data integration work in a hardware shop. The people who own the invoicing system have asked you to pass specific information about the bulbs the shop sells into it so that they can do more detailed cost mapping.
So you start with the source stock management system and happily go to the electrical department, getting details for the entity such as wattage, type of illumination, size, etc. Then you set up the invoicing system with the relevant fields and write the integration code. It performs brilliantly. Well done.
Except the bulbs that the invoicing people were interested in were flower bulbs, not light bulbs, and you actually needed to speak to the gardening department. Now your lovely code is corrupting the invoicing database and everyone’s unhappy with you. No-one had flagged that the concept of a ‘bulb’ was different depending on departmental context, and no-one had challenged which ‘bulb’ was being talked about before the work started.
I acknowledge that this is a very extreme example, but I can cite many cases where assumptions about what a ‘thing’ was were completely different in separate parts of an organisation. This has led to many despairing hours of integration rework and a rake through the list of assumptions to ensure that this type of problem didn’t happen again.
What are the attributes of the "thing"?
The second problem is that of the attributes of the ‘thing’ itself. You may be talking about the same thing, but do you agree about its behaviours and characteristics? Do the restricted lists of values for its attributes match, for example? When the target system has a restricted colour list of light blue and dark blue and you pass it ‘turquoise’, what is the appropriate choice? Is the source system right? Is the target system right? It sometimes happens that the business understanding of two entities and their attributes is so out of kilter that the thing you thought you were talking about was actually something else entirely!
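The ‘turquoise’ question can be made concrete with a small sketch. It assumes the source-to-target colour mapping has been agreed with the business owner; the mapping entries, `TargetColour`, `UnmappedValueError` and `map_colour` are all hypothetical names made up for illustration. The point is that an unmapped value gets rejected and escalated rather than quietly guessed.

```python
from enum import Enum


class TargetColour(Enum):
    """The restricted colour list of the (hypothetical) target system."""
    LIGHT_BLUE = "light blue"
    DARK_BLUE = "dark blue"


# Hypothetical mapping: the entries are illustrative and would be agreed
# with the data's business owner, not decided by the developer.
SOURCE_TO_TARGET_COLOUR = {
    "light blue": TargetColour.LIGHT_BLUE,
    "sky blue": TargetColour.LIGHT_BLUE,
    "dark blue": TargetColour.DARK_BLUE,
    "navy": TargetColour.DARK_BLUE,
}


class UnmappedValueError(ValueError):
    """Raised when a source value has no agreed target equivalent."""


def map_colour(source_value: str) -> TargetColour:
    """Translate a source colour to the target list, or fail loudly."""
    key = source_value.strip().lower()
    try:
        return SOURCE_TO_TARGET_COLOUR[key]
    except KeyError:
        # Do not guess: 'turquoise' could be argued either way, so the
        # decision goes back to the business rather than into the code.
        raise UnmappedValueError(
            f"No agreed mapping for colour {source_value!r}"
        ) from None
```

Failing on ‘turquoise’ is a design choice, not the only option. A default value or a quarantine queue would also work, provided the business owner has signed off on it.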
What are the "thing's" lifecycle and dependencies?
Integration can get even more complex when you have to manage the context of the entities between multiple systems, and this is often the reason why complex integrations fail. The third problem arises when you neglect the implications of the entity’s lifecycle and its dependencies. Let’s take the example of an organisation that is no longer trading. Some systems may not need to do anything with this information, some may delete the record, others may merely flag a change of status. None of these behaviours is necessarily incorrect, but you do need to understand how each local data implementation complements that of the source system. This rarely results in a ‘one size fits all’ integration pattern.
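One way to keep those per-system behaviours explicit is to write the policy down as data rather than bury it in each integration. The sketch below is illustrative only: the system names (‘reporting’, ‘marketing’, ‘invoicing’), the `LocalStore` class and the handler functions are assumptions, not part of any real product.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class LocalStore:
    """Stand-in for one system's local copy of organisation records."""
    records: Dict[str, dict] = field(default_factory=dict)


def ignore(store: LocalStore, org_id: str) -> None:
    """This system does not care that the organisation stopped trading."""


def delete_record(store: LocalStore, org_id: str) -> None:
    """This system removes the organisation entirely."""
    store.records.pop(org_id, None)


def flag_inactive(store: LocalStore, org_id: str) -> None:
    """This system keeps the record but changes its status."""
    if org_id in store.records:
        store.records[org_id]["status"] = "inactive"


# Hypothetical policy table: which behaviour each system applies when the
# source system reports that an organisation is no longer trading.
CEASED_TRADING_POLICY: Dict[str, Callable[[LocalStore, str], None]] = {
    "reporting": ignore,
    "marketing": delete_record,
    "invoicing": flag_inactive,
}


def apply_ceased_trading(system: str, store: LocalStore, org_id: str) -> None:
    """Apply the agreed behaviour, refusing systems with no agreed policy."""
    try:
        handler = CEASED_TRADING_POLICY[system]
    except KeyError:
        raise LookupError(f"No lifecycle policy agreed for {system!r}") from None
    handler(store, org_id)
```

A system missing from the table raises an error instead of falling back to a default, which forces the lifecycle conversation to happen before the integration goes live.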
Define data and build a corporate schema
There are a few lessons here to keep your integration and data sanity in check. The first is that you need to identify the business owner for each of your entities. These are the people who define what an entity looks like and what its attributes are across the organisation.
People often make the mistake of implementing a Master Data Management (MDM) system, hoping the technology will manage this for them, without putting the necessary governance in place. MDM can make this easier, but ultimately it’s the data business owner who defines what data looks like, how it is logged, how long it’s retained, etc.
The second thing is to build yourself a corporate schema for your data entities and ensure that all your entities conform to this model when exchanging data between business systems. These schemas should always represent the business view of what an entity is and not necessarily be defined by the underlying system.
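As a rough illustration of a corporate schema, the sketch below defines one business-level `Organisation` model and translates system-specific records into and out of it. Everything here is assumed for the example: the field names, the allowed statuses, and the ‘CRM’ and ‘invoicing’ record layouts are invented, not taken from any real system.

```python
from dataclasses import dataclass

# Hypothetical business-defined status list for the corporate schema.
ALLOWED_STATUSES = {"trading", "ceased"}


@dataclass(frozen=True)
class Organisation:
    """Corporate (business) view of an organisation, independent of any system."""
    organisation_id: str
    legal_name: str
    status: str

    def __post_init__(self) -> None:
        # Enforce the business definitions at the boundary of every exchange.
        if not self.organisation_id:
            raise ValueError("organisation_id is required")
        if self.status not in ALLOWED_STATUSES:
            raise ValueError(f"Unknown status {self.status!r}")


def from_crm(row: dict) -> Organisation:
    """Translate a made-up CRM record layout into the corporate schema."""
    return Organisation(
        organisation_id=str(row["AccountNo"]),
        legal_name=row["AccountName"].strip(),
        status="trading" if row["Active"] else "ceased",
    )


def to_invoicing(org: Organisation) -> dict:
    """Translate the corporate schema into a made-up invoicing layout."""
    return {
        "cust_ref": org.organisation_id,
        "cust_name": org.legal_name,
        "is_closed": org.status == "ceased",
    }
```

With this shape, each system needs only one adapter to and from the corporate model rather than one mapping per pair of systems, and the validation rules live in one place that the business owner can review.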
Chris’ rules to integrate by:
- Never assume that one person’s view of what a thing is is the same as someone else’s
- Never assume that one person’s view of what a thing has or does is the same as someone else’s
- Make sure someone ‘owns’ your data entity. If no one does, ensure someone is nominated and put the necessary governance measures in place
- Design and get buy-in on an organisational view of what a ‘thing’ is. Ensure it’s documented, that definitions are attached to its attributes, and that people use it.
- If you throw technology at a poorly managed data estate you will end up with a more expensive poorly managed data estate.