Using count() in a rule on a list of empty strings returns zero

Saw an interesting one today and while we are investigating what the actual intention is, it would be interesting to hear feedback from users.

In this article my colleague @hlee showed how to use a relative path to match empty cells in a column and the implication is that you can then “count()” them to test if there are too many (or too few) empty data items.

Turns out that the count() function ignores empty strings - probably treating them as NULL. The simplest form to show this is:

set $(x) count(0, 0, 0, 0)
set $(y) count("", "", "", "")

If you then look at the values of $(x) and $(y) you see 4 and 0 respectively.

The workaround, if you want to count empty cells is simple; use the severity property instead of value like this: count(wpath "alias" severity) but is this enough and/or “right” ?

Is treating empty values as NULL for counting purposes useful?

Hi Peter,

Whilst the public release notes only cover versions 4 and 5 because the product has moved on, in version 3.0.10, the count function was indeed changed to ignore blanks. I have dug up the release note:

Blank and white-space cells are now ignored for statistical and historical calculations, where previously they were taken as 0. For historical calculations, a blank value is taken as a lack of value for that duration.

List of affected functions are:

maximum
minimum
average
total
count
standardDeviation
durationWeightedAverage
durationWeightedTotal
durationWeightedStandardDeviation

The rate function is not affected by this change.

There are many reasons why having blank cells (or blank strings in this case) included will negatively affect the functions above including, for example, getting the maximum of a range of negative numbers where there could be a blank to avoid it being treated as 0, or counting non blank items in a set of cells, which otherwise could be an issue depending on the source data.

If you are counting a range of cells using a wpath, and you do want blank values included then I personally do exactly as you - use the severity - as it is one of the other properties that all cells are guaranteed to have, and can never be blank.