Industrial Big data or Smart data ?

Is Big Data applicable to industrial environments?

  • Do we really need a “big data set” with strong analytics to extract knowledge, or do we only need a small set of data?
  • Do we need to create a strong structure (a data model), or can we work with unstructured data (a “data lake”)?

A friend of mine shared his experience from a recent project. They decided to apply Big Data to an industrial manufacturing process.

  • They installed a big server with a lot of processing power and a huge amount of memory.
  • They installed a best-in-class analytical engine.

They selected a coating process. The process is similar to this:



In this process, the product is heated by a current of hot air. When the setpoint is reached, a spray “paints” the product. When the coating reaches the appropriate thickness, the spray stops and the air cools the product down.

The first analysis was intended to establish the robustness of the process and the influence of the average temperature on the coating thickness.

The outcome was that the average temperature was 50°C, well below the specs.

Of course, the user immediately rejected the validity of the result.

They reviewed the calculations and brought a statistician on board, who concluded that the data followed neither a normal nor a chi-square distribution.

After all this, they decided to bring a process expert on board. He said the data had to be structured in three independent phases, matching the heating, spraying, and cooling steps of the process:


Now, with this “Smart Data”, basic statistics could easily be applied to each of the phases, using very simple statistical tools and obtaining results very quickly.
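To make the point concrete, here is a minimal sketch in Python of why the whole-batch average misleads while per-phase statistics do not. Everything in it is an assumption for illustration: the phase labels, the temperature values, and the helper name `stats_by_phase` are invented, and none of it comes from my friend's project.

```python
from statistics import mean, stdev

# Hypothetical batch record: (phase label, air temperature in °C).
# All values are invented for illustration only.
samples = (
    [("heating", t) for t in (20, 35, 50, 65, 80)]
    + [("spraying", t) for t in (80, 81, 79, 80, 80)]
    + [("cooling", t) for t in (70, 55, 40, 30, 20)]
)


def stats_by_phase(rows):
    """Group temperatures by phase label and return {phase: (mean, stdev)}."""
    groups = {}
    for phase, temp in rows:
        groups.setdefault(phase, []).append(temp)
    return {phase: (mean(temps), stdev(temps)) for phase, temps in groups.items()}


# Average over the whole batch, ignoring the phase structure.
overall_mean = mean(temp for _, temp in samples)

# Average and spread computed separately for each phase.
per_phase = stats_by_phase(samples)

print(f"whole batch: mean = {overall_mean:.1f} °C")
for phase, (m, s) in per_phase.items():
    print(f"{phase:>8}: mean = {m:.1f} °C, stdev = {s:.1f} °C")
```

With these made-up numbers, the whole-batch mean is dragged down by the heating ramp and the cooling ramp, so it lands well below the spraying temperature. Only the spraying-phase mean is comparable to a setpoint, and its spread is small enough for basic statistics to be meaningful.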

The conclusions of my friend were:

  • The amount of data needed to extract meaningful conclusions in industrial processes is normally very low.
  • The data has to be structured, because the process itself is heavily structured and follows physical laws.
  • And most important of all: if you do not know your process, don’t ask a big data engine, ask your process experts!


What do you think? What are your experiences in this field? Please leave a comment.

6 thoughts on “Industrial Big data or Smart data ?”

  1. Pingback: Industrial Big data or Smart data ? – thefactoryofthefuture

  2. Hi Antonio,

    Nice blog! Love the wall paper picture too!

    Based on general feedback I hear, you are not the only one with this type of experience.
    One very large chemical company said: “There is not one single data science project that has succeeded WITHOUT process knowledge”. They always put the data scientists together with the process experts, and they are making organizational changes to help the two collaborate more closely. This week I heard another analyst firm making a similar comment. I think we can conclude that we need process knowledge and understanding to interpret process data.

    Now, are you interested in Small or in Big Data? I mean, do you want to analyse how one particular recipe is executed on one particular piece of equipment (small data), or do you want to compare one recipe, or maybe several similar recipes, across many installations worldwide (big data)?

    I have heard about companies structuring their data according to asset structure, but not yet according to recipe phase, or process phase.

    However, we know about one European startup that does data mining on process data and is capable of detecting patterns in the data. Comparing different batches, they would detect that heating phases are abnormally slow or fast, or that the spray temperature is abnormally low or high.

    So the question is: do you need to put the effort in to associate recipe-phase with historical data slices, or is clever, process-oriented data-mining enough? If you are interested in big data analytics, the slicing could be of interest, in particular if you could assign the links automatically.

    I would also be interested in others’ views and feedback


  3. It’s going to be hard not making this a marketing pitch … in fact, it’s probably going to end up sounding like that anyway. So my apologies for that upfront.

    This remark is spot-on. My own background was in datacenter technologies, and storage in particular. I’ve seen big data technologies successfully implemented over petabytes of data in many different verticals. When I first heard about the ‘big data’ challenges on the floor in the process industry, I had no idea why this was just not being used on a daily basis. And this with datasets (the historian) of merely a couple of terabytes. I see process engineers exporting data to Excel sheets and then forwarding the question to the ‘data scientist’, as you mentioned above. The problem is that the person responsible for creating the question is not the one who needs to create the answer. And they don’t speak the same language.

    (here comes the marketing pitch)
    At TrendMiner we feel that this gap needs to be closed by the technology, not by human procedures. Therefore we have chosen to base our discovery, diagnostic and predictive analytics on pattern recognition on the signal shapes. This way we can bring the big data technology straight into the hands of the process experts themselves. Why would we need a librarian to tell us how we can search Google? 😉

    Coming back to your original question, I think I disagree that we need to create a new type of data structure for the process industry. The problem with that is that we are going to layer on more rules and regulations just because we cannot close those knowledge gaps. No, instead I feel we need more companies who rethink big data technologies so that they are applied to the process industry, not generic.


  4. Two thoughts:
    1. I’m not sure this is really big data in any of the velocity, volume or variety dimensions.
    2. It’s always been a bad idea to buy technology before you know what problem you’re trying to solve, and it always will be.


  5. Dear Antonio,

    In our company we deal only with process data (most of the time timestamped values), and our experience has taught us that:

    1. You need to know what you are looking for (increasing yield, reducing quality issues, etc.) and you need to understand the physical meaning of the data; it is impossible to crunch data without P&IDs and flowsheets as support. Our consultants are process engineers, not statisticians.

    2. Big Data is a subjective term (big for a process industry does not have the same meaning as big for Google).

    3. Ban black-box models if you can’t show the results with simple 1D/2D plots.

    4. Involve people and operators in the process as soon as possible. They will guide you to more reliable and faster results. Even better, they will be part of the solution… so they will accept the results of the analytics much more easily.

    5. At the end, linear regression models will do the job 🙂

    6. Don’t forget about good old continuous improvement programs (like six-sigma); this is a perfect framework for analytics.

    7. Successful project = 10% Process understanding, 10% data analytics, 30% data collection/cleaning, 50% mindset/people management (workshop, training, communication).

    8. I don’t believe that structure will really help. Depending on the problem you are looking at, a different structure for the data might be required. What you need is to be able to develop an agile structure, so that analytics can be done on a subset of data that matches the problem you need to solve.

    Hope this helps 🙂



  6. Antonio.
    I spent 2 years with Rockwell Automation, mainly with OEMs. We’d talk about the ‘Internet of Things’ and the ‘Connected Enterprise’. Interestingly, most OEMs don’t see the benefit to THEM. The irony is that in most cases the OEM (Machine Builder) needs to be involved at the start of the End User’s journey to Big Data / the Connected Enterprise. After all, if the sensor is not in the right place, or the data is locked in by a disparate network, the data is either not relevant or not available in real time. Crap In – Crap Out.

    Forward thinking Machine Builders will recognise the Service Support benefits of releasing contextualised, role specific data.

    Packages such as Rockwell’s Historian & Vantage Point are off the shelf and remove the need to reinvent the wheel.

    If you would like to speak to some of my ex-colleagues, let me know.


