For obvious reasons, when considering calibration accuracy there is a big focus on Delta-E values, with low dE values for grey scale and primary colours being seen as a good sign of accuracy. Many calibration system suppliers go out of their way to provide all sorts of reporting capabilities to prove their calibration is accurate using Delta-E values for grey scale and RGB primary colours.
But, is this correct?
The reality of relying on Delta-E
The simple answer is no, it is not at all correct, and often has little real relevance to the overall accuracy of calibration.
The simplest explanation is that real-world images do not contain grey scales, or even much in the way of pure grey, red, green or blue as actual colours. Only technically generated images have such perfect colours. This points to an underlying issue with the way many systems approach calibration and verification, as it is these unnatural colours that are the focus of most calibration systems.
The image here shows the standard Delta-E values reported as an example of calibration accuracy, based on Grey Scale, and RGB Primary ramps. All the black space is unverified for accurate calibration, and can easily be wildly inaccurate.
Because of the limited number of points that such Delta-E verification focuses on, it is very possible for the actual underlying calibration to be wildly inaccurate when real-world images are viewed, even though the Delta-E values report accurate calibration.
A far better way to verify calibration is to perform a second profile post-calibration, using a full volumetric patch set, and assess the 3D graphs within ColourSpace, including the dE values for all volumetric verification measurements.
Delta-E (dE) is a single number that represents a difference between two colours, with the basis that a dE of 2.3 is the Just Noticeable Difference (JND), or smallest colour difference the human eye can see.
So, theoretically any dE less than 2.3 is imperceptible, while any dE greater than 2.3 is noticeable. However, some colour differences greater than 2.3 can be imperceptible, while some colour differences below 2.3 can be very visible, depending on the colour being measured.
Additionally, and more importantly, when Delta-E is used to represent calibration accuracy it is normal to only report a limited number of colour points. Usually only the grey scale and RGB primary colours are represented, as shown above, or a small selection of colours based on something like the Macbeth Colour Checker. Neither is good enough in reality, as far too few points are being used to try to verify the total volumetric colour space.
Note: Although a dE of 2.3 is regarded as the technical JND value, many refer to a value of 1.0 as being a more realistic threshold for imperceptible difference.
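As a rough illustration of what a single dE number represents, the following sketch computes the difference between a target and a measured colour. It assumes the CIE76 formula (Euclidean distance in CIELAB), which is the formula the 2.3 JND figure is usually associated with; the text above does not say which Delta-E formula a given calibration system reports, and the Lab values used here are made up purely for illustration.

```python
import math

JND = 2.3  # Just Noticeable Difference threshold quoted in the text


def delta_e_76(lab_target, lab_measured):
    """CIE76 Delta-E: Euclidean distance between two CIELAB colours.

    Assumption: CIE76 is used here for simplicity; real calibration
    reports may use other formulas (for example dE94 or dE2000).
    """
    return math.sqrt(sum((t - m) ** 2 for t, m in zip(lab_target, lab_measured)))


# Hypothetical target and measured L*, a*, b* values (illustrative only)
target = (50.0, 10.0, 10.0)
measured = (51.0, 12.0, 9.0)

de = delta_e_76(target, measured)
print(f"dE76 = {de:.2f}, below JND of {JND}: {de < JND}")
```

Note that the result describes one colour point only. It says nothing about any colour that was not measured.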
Problems with Delta-E
To gain a mental image of the problems being outlined here think of skin tones. The average Caucasian skin tone resides well away from any grey scale, or primary colours, and as such is ignored by most calibration systems when performing a post-calibration verification. More importantly colours such as skin tones, grass, sky, etc, are memory colours, which means the human eye has a good idea as to what they should look like as they are seen almost daily. And equally importantly there are many different variations of hues, saturation and brightness associated with each memory colour or tone. Without accurate display verification that includes these variations the calibration results can never be considered as accurate.
The cube image above shows a standard grey scale and primary RGB verification, with skin tone added to show its approximate location for reference. All the black space (including the skin tone patches) is effectively unverified in most calibration systems.
It should be understood that if displays were perfectly linear in colour reproduction - any change in input signal would produce an exactly equal change in the displayed colour - it would be possible to perform a grey scale and primary colour calibration only, and extrapolate/interpolate the calibration of the remaining colours. Unfortunately very few displays are in any way linear. More annoyingly, those displays that are close to linear are the highly expensive professional monitors which are routinely calibrated with professional 3D LUT profiling systems, whether the display needs it or not. It is lower-cost displays, such as home TVs, that are almost always of poor linearity, and therefore can only be accurately calibrated, and verified, via professional level full 3D cube based profiling and calibration.
Accurate calibration of displays with poor linearity (which is most displays, as stated above) therefore requires the use of 3D LUTs generated from full 3D cube based profiles.
But, not all 3D LUT based calibration is made equal - because of the overriding desire of many calibration systems to focus, incorrectly, on Delta-E, grey scale and primary colours as the definition for accurate calibration.
This is not to say that Delta-E reports are useless and should be ignored, or that the values they report are untrustworthy (ignoring the fact that the actual values reported can be deceptive), but that good Delta-E values alone are no guarantee of accurate calibration. All the colours that Delta-E values do not report on are equally important, and must be equally accurate for good final calibration.
Every Colour Point MUST Be Considered Equal
From the above description of calibration issues it can be seen that every colour point has to be given equal importance during profiling, calibration and verification, not just grey scale and primary colours.
The above statement is so important for accurate calibration that it is worth stating again!
Every volumetric colour point has to be given equal importance during profiling and calibration, not just grey scale and primary colours!
And the only way to do that is to verify multiple volumetric points, using as many points as possible for any critical calibration verification, so that the entire colour space is covered, with a good level of granularity.
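To put a rough number on the coverage gap, the sketch below builds a uniform 10 x 10 x 10 grid of RGB patches (1000 points) and counts how many of them a grey scale plus RGB primary ramp verification would actually touch. The uniform grid and the function names are assumptions made for illustration; they are not a description of how ColourSpace builds its patch sets.

```python
from itertools import product


def volumetric_patches(steps):
    """Uniform steps x steps x steps grid of RGB triplets in the range 0.0-1.0."""
    levels = [i / (steps - 1) for i in range(steps)]
    return list(product(levels, repeat=3))


def on_grey_or_primary_ramp(rgb):
    """True if the patch lies on the grey axis or on a pure R, G or B ramp."""
    r, g, b = rgb
    if r == g == b:
        return True  # grey scale, including black and white
    # Primary ramp: exactly one channel is non-zero
    return sum(1 for channel in rgb if channel > 0) == 1


patches = volumetric_patches(10)
covered = sum(1 for p in patches if on_grey_or_primary_ramp(p))

print(f"{covered} of {len(patches)} grid points lie on the grey or primary ramps")
# prints "37 of 1000 grid points lie on the grey or primary ramps"
```

On this grid, fewer than 4% of the points sit on the grey scale or primary ramps. The remainder is the unverified volume described above.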
The following graphically illustrates this point.
The first verification graph shows a normal Grey & Primary Ramp, plus Memory Colours verification, and as can be seen there is a huge amount of unverified volumetric space.
The second graph uses a 1000 patch volumetric verification, and the third graph a 3000 patch verification. The difference is obvious: the volumetric verifications define the final calibration accuracy far better.