
https://sfadigital.blog.gov.uk/2015/02/09/the-data-modelling-trilogy-semantic-mayhem/

The Data Modelling Trilogy: Semantic Mayhem

Categories: Agile, Data

This is the first post in a series of three looking at data modelling in an Agile development framework.

What’s in a name?

Whether ‘Waterfall’ or ‘Agile’, there are many data modelling approaches and several types of data models. Variety is turned into chaos by referring to these data models in arbitrary ways, and hey presto we have a suite of terminology that means a lot to one member of a team but little to the rest. Not a good start to collaboration.

The Universal Data Model

‘Domain’, ‘Enterprise’, ‘Corporate’, ‘Conceptual’, ‘Logical’, ‘Physical’, ‘Class’, ‘Canonical’, ‘Application’, ‘Abstract’, and ‘Business’ are some of the names used for types of data model. Surely a project doesn’t need all these? Well, the good news is it doesn’t, not least because these models aren’t unique and overlap considerably. Still, there is money to be made from chaos, so peddling data modelling frameworks that claim to have a unique selling point whilst confusing the hell out of the consumer has its plus points for the vendor! Some may also be minded to point out that this semantic mayhem adds value and mysticism to the role of the data architect. Oh, surely not...

The mistakes data modelling aims to help us avoid

Why are there so many (apparently) different types of data models? The various models readily spring up on the journey from a blank piece of paper to the DDL (data definition language) that creates the database. The models are used to increase the chances of pinging the perfect solution into production, and to help ensure informed design decisions are made. The word ‘informed’ in this context really means the avoidance of major mistakes. When it comes to data, the biggest mistakes are:

  • missing data – bit of a disaster, goes live with some required data absent
  • inconsistent data – the solution is capable of giving different answers to the same question depending on how the question is asked
  • designs that don’t support the processing – usually resolved by throwing extra processing resource at the problem or bending the design to shoehorn the data in. Neither solution works long term
  • tunnel vision designs – application-centric design with no wider view either to the enterprise or the future
  • short-sighted designs – rigid designs that don’t flex well when new requirements come along

There are loads of others, but these tend to have the biggest financial impact. A particularly nasty aspect of four of the five is their propensity to rear their ugly heads only some time after the project goes live.
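To make the ‘inconsistent data’ mistake a bit more concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The tables (customer_order, order_line) and their columns are made up purely for illustration; they don’t come from any real system. The design stores an order total on the order header as well as the individual line amounts, so the same question (“what is the order worth?”) can be answered in two ways, and the answers drift apart as soon as one is updated without the other.

```python
import sqlite3


def build_inconsistent_db():
    """Create a hypothetical schema that stores the same fact in two places."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer_order (
            order_id    INTEGER PRIMARY KEY,
            order_total INTEGER NOT NULL  -- redundant copy of the line sum, in pence
        );
        CREATE TABLE order_line (
            order_id INTEGER NOT NULL REFERENCES customer_order(order_id),
            amount   INTEGER NOT NULL     -- line amount, in pence
        );
        INSERT INTO customer_order VALUES (1, 3000);
        INSERT INTO order_line VALUES (1, 1000);
        INSERT INTO order_line VALUES (1, 2000);
        -- A later change adds a line, but nothing updates the stored total.
        INSERT INTO order_line VALUES (1, 500);
        """
    )
    return conn


def total_from_header(conn, order_id):
    """Answer the question by reading the stored total."""
    row = conn.execute(
        "SELECT order_total FROM customer_order WHERE order_id = ?", (order_id,)
    ).fetchone()
    return row[0]


def total_from_lines(conn, order_id):
    """Answer the same question by summing the order lines."""
    row = conn.execute(
        "SELECT SUM(amount) FROM order_line WHERE order_id = ?", (order_id,)
    ).fetchone()
    return row[0]


conn = build_inconsistent_db()
print(total_from_header(conn, 1))  # 3000
print(total_from_lines(conn, 1))   # 3500
```

Nothing in this DDL stops the two figures disagreeing, which is exactly the sort of thing a data model should flush out before go-live rather than after.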

Data is no different to consumer products: a failure (error) found once the product is with the consumer (live) will cost 100 times more to resolve than one found in the design phase.

So, what models does one really need, especially in an Agile development? Does Agile mean the death of the data model? I’ll look at this in part two of this series.
