Lean data

Categories: Data, Solution Design and Innovation

Lean data showing the numbers 0 and 1 repeated

“So we have these really lean business processes and we are going to build a set of services to support them. We will have data supporting Service Based Design but we won’t de facto have lean data. ‘Lean data’ … sounds like a subject deserving of a separate blog.”

The above might seem like an odd paragraph with which to start a blog, not least because it is. It’s the final paragraph of my previous blog on Service Based Design, but it sort of sets out the reason this blog is being written.

‘Lean’ is a fashionable and much-abused word. Prefix another word with ‘lean’, brew up a methodology, write a book and make millions. So, is the term ‘lean data’ just another buzzword (well, two words)?

Given its roots in the manufacture of motor vehicles, it’s not surprising that lean focuses on very process-orientated types of waste. It works for manufacturing, but could or does it work for services as well (yes) and, if so, what about data?

Does data have a value stream?

The first question has to be about value stream. Does data have a value stream? Sounds like asking whether the oil that’s put in an engine during manufacture has a value stream. It’s a good analogy to draw. A car’s value stream mapping starts with the customer and goes back down the processes, including the manufacturing processes, until at some point we get to the oil - not so much the quality of the oil, but any waste associated with it.

How can you have waste associated with oil? A leading manufacturer thought the same until some bright spark mapped the value stream of a process and suggested that putting a lower, but still legal, level of oil in the vehicle could save bucket loads of dosh. It did. So if oil has a value stream, it’s starting to look promising for data. After all, data really is the oil that runs the SFA. Funding training, one of the Agency’s most important objectives, doesn’t involve money physically changing hands; rather, the money is abstracted into data (just like your savings are – data held in electronic form).

What waste is associated with data?

The second question has to be about what waste is associated with data. Let’s look at the original seven types of waste that Taiichi Ohno (Toyota Chief Engineer and really bright cookie) drew up back in the 1950s:

  • Over-processing
  • Transportation
  • Inventory
  • Over-production
  • Waiting
  • Motion
  • Defects


Over-processing

In lean manufacturing, we avoid the waste of over-processing by using machines that are of an appropriate capacity and capable of achieving the required quality. For lean data, we avoid over-processing by using processing and storage devices that are of an appropriate capacity, don’t corrupt data and don’t fall over all the time.


Transportation

In lean manufacturing, we avoid transportation waste by minimising the movement of materials and having processes in close proximity to each other. For data, we pass data around in the most efficient way, something that Service Orientated Architecture aims to achieve.


Inventory

Inventory (levels of stock) relates to the volume of data held. There are three types of waste here: the waste of keeping data that is no longer of any benefit, the waste of storing data one doesn’t need to store, and the waste of untapped potential.

Keeping data that is no longer of any use might not seem important, especially as the cost of storage is falling. However, the cost of managing data is increasing as more data legislation is implemented.
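To make the idea of data that has outlived its usefulness a little more concrete, here is a minimal Python sketch that flags records not used within a retention period. The six-year period, the record layout and the `past_retention` function are all assumptions made up for illustration; real retention rules depend on whatever legislation and policy apply.

```python
from datetime import date, timedelta

# Hypothetical retention period, purely for illustration.
RETENTION = timedelta(days=365 * 6)


def past_retention(records, today):
    """Return the records whose last use is older than the retention period."""
    return [r for r in records if today - r["last_used"] > RETENTION]


# Made-up records: one untouched for years, one used recently.
records = [
    {"id": "A1", "last_used": date(2005, 3, 1)},
    {"id": "B2", "last_used": date(2013, 9, 15)},
]

stale = past_retention(records, today=date(2014, 1, 1))
stale_ids = [r["id"] for r in stale]
print(stale_ids)
```

A list like this is only a starting point for a review; whether a flagged record can actually be deleted is a policy decision, not something the code can settle.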

The rise of Service Orientation has reduced the need to move data around and store duplicate copies – where possible, simply make a call to a service for the data you need and let someone else worry about managing the data.

Untapped potential is about maximising the benefit of the data - a field that has been enhanced by advances in software (from the first steps into OLAP, through to sophisticated technologies to extract vital information from social media content).


Over-production

Inventory is similar to over-production – the generation and acquisition of redundant data. We should aim to create, acquire and use the minimum amount of data required.

Waiting and motion

Waiting and motion are about flow - getting the right data, in the right format, to the right consumer, at the right time.

Does data really flow if the consumer is spending hours trying to manually bolt together data from different sources? Does it really flow if so little attention has been paid to integration that there are numerous ways of identifying the same data and very few ways of unifying it (rather like the Apollo 13 command ship's square CO2 filters that would not fit into the rescue ship's round filter barrel)?
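As a rough illustration of the "numerous ways of identifying the same data" problem, the Python sketch below unifies rows from two source systems under one canonical identifier. The source names, the identifiers, the `CANONICAL_ID` lookup and the `unify` function are all hypothetical; in practice the mapping itself is the hard part and has to be designed and maintained.

```python
# Hypothetical mapping from each source system's own identifier
# to a single canonical key.
CANONICAL_ID = {
    ("finance", "F-0017"): "provider-17",
    ("contracts", "17"): "provider-17",
    ("finance", "F-0042"): "provider-42",
}


def unify(rows):
    """Group rows from different sources under their canonical identifier."""
    merged = {}
    for source, local_id, payload in rows:
        key = CANONICAL_ID.get((source, local_id))
        if key is None:
            # No known mapping: keep the row visible rather than guess a match.
            key = f"unmatched:{source}:{local_id}"
        merged.setdefault(key, {}).update(payload)
    return merged


# Made-up rows: two describe the same provider under different identifiers.
rows = [
    ("finance", "F-0017", {"paid": 1200}),
    ("contracts", "17", {"contract": "C-9"}),
    ("contracts", "99", {"contract": "C-3"}),
]

unified = unify(rows)
print(unified)
```

Rows with no mapping are kept under an "unmatched" key so that the gaps in integration stay visible instead of being silently dropped.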


Defects

Whether in manufacturing or IT, defects can and do kill. Data defects can be costly in terms of litigation, compliance and governance. Defects are not just restricted to the operational use of data. Get the design of data wrong and one can be paying for it (more expensive enhancements, lack of integration) for years to come.

Some might call it ‘lean data’, some would call it common sense! Whatever it’s called, I’m not about to brew up a methodology, write a book and make millions… but it still represents a good basis for well-managed data.

