CR4® - The Engineer's Place for News and Discussion®

 Previous in Forum: Computer CPU Meter Next in Forum: Calculation of Relative Dielectric Constant of Piezoelectric?
Guru

Join Date: Mar 2011
Location: Lubbock, Texas
Posts: 14345

# Statistics on an Excel Data Table

11/07/2017 12:06 PM

Suppose you have data table with about ten different parameters, and you wish to calculate the average of each column in the table along with the standard deviation (for the entire population of the existing table).

Unfortunately, not all the parameters are measured on each date in the table and there are blank gaps in the data.

I used AVERAGEIF(Table1[parameter],crit) where crit is a defined name:>0

Next I used STDEV.S(Table1[parameter]) for the standard deviation.

Is this the correct way to ignore the blank records?

__________________
If it ain't broke, don't fix it. Just build a better one.
Interested in this topic? By joining CR4 you can "subscribe" to
Guru

Join Date: Apr 2010
Posts: 5994
#1

### Re: Statistics on an Excel Data Table

11/07/2017 1:28 PM
Guru

Join Date: Mar 2011
Location: Lubbock, Texas
Posts: 14345
#2

### Re: Statistics on an Excel Data Table

11/07/2017 1:37 PM

interesting website.

I guess the second part of my question: is the standard deviation being inflated by adding in a bunch of terms that tricks Excel into thinking blank cells are zero?

I am sure ISNUMBER (0) = true. I am pretty sure ISNUMBER(_) = false.

The trick would be incorporating that into a calculation of standard deviation where the number of terms is over 100.

I, of course, have the option of rendering the data a different way, into a contiguous column pure in number values, with no blanks, zeroes, or text.

__________________
If it ain't broke, don't fix it. Just build a better one.
Guru

Join Date: Mar 2011
Location: Lubbock, Texas
Posts: 14345
#3

### Re: Statistics on an Excel Data Table

11/07/2017 1:51 PM
 criter >0 DATA: data/blanks data/zeroes Filtered: 23 23 23 22 22 22 24 24 24 0 25 25 25 23 23 23 24 0 24 24 0 average 23.5 15.67 23.5 AverageIF 23.5 23.5 23.5 STDEV.S 1.04881 11.77922 1.04881

Note that the standard deviation calculations for data with gaps and with no gaps are identical, but the data that is zero-filled presents a huge problem.

Thanks, Rixter for making me focus better.

__________________
If it ain't broke, don't fix it. Just build a better one.
Guru

Join Date: Apr 2010
Posts: 5994
#4

### Re: Statistics on an Excel Data Table

11/07/2017 4:42 PM
Guru

Join Date: Mar 2011
Location: Lubbock, Texas
Posts: 14345
#5

### Re: Statistics on an Excel Data Table

11/07/2017 4:55 PM

Ok, I see that If(,,) can be used within stdev.p function.

also saw the one with isnumber (range).

__________________
If it ain't broke, don't fix it. Just build a better one.
Member

Join Date: Mar 2017
Posts: 7
#6

### Re: Statistics on an Excel Data Table

11/08/2017 8:12 AM

Stuffing blank entries would skew the data. If there isn't any real data value then nothing should be substituted into that spot. Zero stuffing is used in DSP to change the sample rate of a signal which changes the energy content. The STNDEV in any data set that is zero stuffed would be similar.

Write a short macro to remove blanks.

Member

Join Date: Mar 2017
Posts: 7
#8

### Re: Statistics on an Excel Data Table

11/08/2017 11:46 AM

Should have clarified my last statement if "you" read the whole comment more clearly.

The STNDEV in any data set that is stuffed with erroneous data will be modified to fit that new data set. As I said, the energy content (STD) would be wrong.

Guru

Join Date: May 2016
Posts: 1029
#7

### Re: Statistics on an Excel Data Table

11/08/2017 11:07 AM

Not really. That method will artificially lower the standard deviation by biasing zero values to the average.

I would suggest making a value copy of the data table, breaking it into individual two column tables that include the dates/times and the individual data elements and then sort the two column tables based on the data value. This will collapse the data with the zeroes sorted to the end of the column. Then you can run average and standard deviations on the columns of non-zero data. If you wanted to automate it a bit further, you could add two columns alongside the sorted date and value column and do an if/then formula to calculate the average from the top of the data column to the present level in the third column and the standard deviation in the fourth column.

The last average and standard deviation in the column would be the summary for the data in the column.

Guru

Join Date: Mar 2011
Location: ''but, don't we get PAID to ask questions?...''
Posts: 989
#9

### Re: Statistics on an Excel Data Table

11/10/2017 12:18 PM

Seems like you should also take additional specific care to qualify that each (possible column entry) is an actual number...

(i.e.: something like: ( if \$@#is > 1, then count-value is 1 ) along with ( if count ="1" then sum(\$@) = (c+1), else, not) and then Stat functions are each a function of a totally numeric (C), rather than a possibly mixed (n), to start with, in order to eliminate such things as dates being treated as numbers, etc.)...

hopefully, helpfully, good luck, in any case...

__________________
''illigitimi non carborundum...'' (i.e.: don't let the fatherless ones grind you down...)
Guru

Join Date: Mar 2011
Location: Lubbock, Texas
Posts: 14345
#10

### Re: Statistics on an Excel Data Table

11/10/2017 1:10 PM

It does seem I used the correct option. This works as my time is somewhat limited that I would eschew calculation of the appropriate sums of data in a more manual way.

__________________
If it ain't broke, don't fix it. Just build a better one.