The field of chemometrics has been around for quite some time and has played its role in both research and industrial environments. While the multivariate research toolbox is well established and ever increasing, its industrial counterpart has only begun to see widespread use in the last decade. This could be due to many reasons, such as the advancement of industrial data management solutions, the semi-continuous financial crises that drive industry towards smarter manufacturing, and the use of data to reduce costs and waste. Whatever the reasons, it is clear that chemometrics and multivariate analysis have a big role to play in connecting the dots between research and the new paradigm of advanced and more robust sensors, agile manufacturing processes and product quality control. Welcome to the multivariate world!
We believe that all processes or systems are multivariate in nature until proven otherwise, and that they must therefore be analysed, modelled and understood as such! There are two important aspects in any field that requires experimentation: the ability to understand the outputs of the experiment, and the ability to put this newfound knowledge to use in future situations. Whether the experiment involves spectral measurements, sensory data, manufacturing process data or psychometric variables, the two questions everyone is asking are: can I understand the process or experiment, and how can I put my findings to good use? All you need are the right multivariate tools to understand the data and to generate valid and robust models. Then you are ready to apply these models in real-world situations.
While multivariate methods lend themselves well to empirical analysis of data sampled from science, technology and nature (i.e. any system with multiple underlying structures), nothing prevents the use of first-principles models in combination with actual observations.
CAMO’s philosophy is not to describe all methods in the world or to include them in our software, but to provide methods that are versatile and suited for any kind of data, regardless of their size and properties. CAMO believes that the focus should be on graphical presentation of results rather than tables with p-values. This is related to the distinction between significance and relevance. With a high number of objects, almost any test for a difference between two groups or a correlation between two variables will come out statistically significant, even when the effect is too small to matter. Thus, a table of p-values does not show whether the model is suitable for predicting selected properties, such as product quality, at the individual level.
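The distinction between significance and relevance can be illustrated with a small simulation. The sketch below is only an illustration under invented assumptions: two simulated groups whose means differ by 0.05 standard deviations, 100,000 objects per group, a normal-approximation test, and a hypothetical helper named `two_group_summary`. It is not a method taken from any particular software package.

```python
import math
import random


def two_group_summary(n_per_group=100_000, shift=0.05, seed=0):
    """Simulate two groups separated by a small mean shift (in SD units)."""
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    b = [rng.gauss(shift, 1.0) for _ in range(n_per_group)]

    mean_a = sum(a) / n_per_group
    mean_b = sum(b) / n_per_group
    var_a = sum((x - mean_a) ** 2 for x in a) / (n_per_group - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (n_per_group - 1)

    # Significance: two-sided p-value for the difference in means,
    # using the normal approximation (reasonable for large groups).
    se = math.sqrt(var_a / n_per_group + var_b / n_per_group)
    z = (mean_b - mean_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))

    # Relevance: assign each object to the group whose mean is closest
    # and count how often that assignment is right.
    cut = (mean_a + mean_b) / 2.0
    correct = sum(x < cut for x in a) + sum(x >= cut for x in b)
    accuracy = correct / (2 * n_per_group)
    return p_value, accuracy


p_value, accuracy = two_group_summary()
print(f"p-value: {p_value:.2e}, individual-level accuracy: {accuracy:.3f}")
```

In this setup the p-value is vanishingly small, while assigning individual objects to their group is barely better than tossing a coin. The p-value reflects the number of objects as much as the size of the effect.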
That said, we realise that the important findings from a project or study are often summarised efficiently with bullet points or univariate statistics. Our message is that multivariate methods provide the fastest insight into complex data, helping you arrive at the correct conclusions and avoid “searching for correlations”.
Even after 40 years of multivariate methods, and of multivariate calibration in particular, most people still do not know that selectivity is not needed to predict the quality of a product or to classify or identify samples such as raw materials.
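A toy calculation may help make the point about selectivity concrete. The sketch below assumes two analytes with invented, fully overlapping responses at three wavelengths (`K_A` and `K_B` are hypothetical numbers) and noise-free mixtures. It uses a classical least-squares fit as one simple form of multivariate calibration; other methods would serve equally well for the illustration.

```python
# Hypothetical pure-component responses at three wavelengths.
# Both analytes respond at every wavelength, so no channel is selective.
K_A = [1.0, 0.8, 0.3]
K_B = [0.4, 0.9, 1.0]


def mixture_spectrum(c_a, c_b):
    """Noise-free spectrum of a mixture with concentrations c_a and c_b."""
    return [c_a * ka + c_b * kb for ka, kb in zip(K_A, K_B)]


def univariate_estimate(x):
    """Estimate c_a from wavelength 0 alone, as if it were selective for A."""
    return x[0] / K_A[0]


def multivariate_estimate(x):
    """Estimate (c_a, c_b) by least squares over all wavelengths."""
    # Entries of the 2x2 normal equations (K^T K) c = K^T x.
    aa = sum(ka * ka for ka in K_A)
    bb = sum(kb * kb for kb in K_B)
    ab = sum(ka * kb for ka, kb in zip(K_A, K_B))
    ax = sum(ka * xi for ka, xi in zip(K_A, x))
    bx = sum(kb * xi for kb, xi in zip(K_B, x))
    det = aa * bb - ab * ab
    c_a = (bb * ax - ab * bx) / det
    c_b = (aa * bx - ab * ax) / det
    return c_a, c_b


x = mixture_spectrum(c_a=2.0, c_b=3.0)
uni_c_a = univariate_estimate(x)
multi_c_a, multi_c_b = multivariate_estimate(x)
print(f"true c_a: 2.0, univariate: {uni_c_a:.2f}, multivariate: {multi_c_a:.2f}")
```

The single-wavelength estimate is thrown off by the second analyte, whereas using all wavelengths together recovers both concentrations even though none of the channels is selective.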
Finally, being a data analyst is about practising the methods and software on your own data.
I wish you all the best, and may your models be with you!
Dr Frank Westad, Chief Scientific Officer, CAMO Software