DataSmoothing

# Data Smoothing for the HP-41

Overview

1°)  Least-Squares Parabola
2°)  Smoothing & Differentiation
3°)  Moving Window Average

-We assume that the set of n data points

 x1 x2 ........... xn y1 y2 ........... yn

verifies   h = xk+1 - xk = Constant  ( equally-spaced abscissas )

1°)  Least-Squares Parabola

-Given a set of n data points  y1 , y2 , ............. , yn  we want to smooth noisy data.
-The following program uses a 5-point least-squares-parabola and replaces each  yk  by the value  zk  of this quadratic polynomial at this point.

Formulae:

zk = (1/35) ( -3 yk-2 + 12 yk-1 + 17 yk + 12 yk+1 - 3 yk+2 )

z1 = (1/35) ( 31 y1 + 9 y2 - 3 y3 - 5 y4 + 3 y5 )                     and 2 similar expressions
z2 = (1/35) ( 9 y1 + 13 y2 + 12 y3 + 6 y4 - 5 y5 )                   for zn-1  and  zn

Data Registers:           •  R00 = n > 4                                ( Registers R00 thru Rnn are to be initialized before executing "SMOOTH" )

•  R01 = y1  •  R02 = y2  .......................  •  Rnn = yn                 Rnn+1 & Rnn+2:  temp

>>>>  when the program stops:     R01 = z1   R02 = z2  .......................  Rnn = zn

Flags: /
Subroutines: /

 01  LBL "SMOOTH"   02  RCL 00   03  STO M   04  31   05  RCL IND Y   06  *   07  9   08  LASTX   09  *   10  DSE M   11  RCL IND M   12  13   13  *   14  +   15  X<>Y   16  RCL IND M   17  9   18  *   19  +   20  DSE M   21  RCL IND M   22  3   23  *   24  -   25  X<>Y   26  RCL IND M   27  12   28  *   29  + 30  DSE M   31  RCL IND M   32  6   33  *   34  +   35  X<>Y   36  RCL IND M         37  5   38  *   39  -   40  DSE M   41  RCL IND M   42  3   43  *   44  +   45  X<>Y   46  RCL IND M   47  5   48  ST+ M   49  *   50  -   51  35   52  ST/ Z   53  /   54  STO IND M   55  ISG M   56  CLX   57  X<>Y   58  STO IND M 59  RCL 00   60  STO M   61  LBL 01   62  RCL IND X          63  3   64  *   65  DSE Y   66  RCL IND Y   67  12   68  *   69  X<>Y   70  -   71  DSE Y   72  RCL IND Y   73  17   74  *   75  +   76  DSE Y   77  RCL IND Y   78  12   79  *   80  +   81  DSE Y   82  RCL IND Y   83  3   84  *   85  -   86  35   87  / 88  X<> IND M          89  STO N   90  DSE M   91  4   92  RCL M   93  X>Y?   94  GTO 01   95  RCL 01   96  9   97  *   98  RCL 02   99  13 100  * 101  + 102  RCL 03 103  12 104  * 105  + 106  RCL 04 107  6 108  * 109  + 110  RCL N 111  5 112  * 113  - 114  RCL 01 115  31 116  * 117  RCL 02              118  9 119  * 120  + 121  RCL 04 122  5 123  * 124  - 125  RCL N 126  RCL 03 127  - 128  3 129  * 130  + 131  35 132  ST/ Z 133  / 134  STO 03 135  X<>Y 136  STO 04 137  3.001 138  RCL 00 139   E6 140  / 141  + 142  REGMOVE 143  CLA 144  END

( 210 bytes / SIZE nnn+3 )

 STACK INPUTS OUTPUTS X / /

Example:    Smooth the following data

 x 1 2 3 4 5 6 7 8 9 10 y 0.071 0.613 1.112 1.395 1.511 1.795 1.944 2.141 2.153 2.334

-Store the ten yk  into registers R01 thru R10
-Store the number of data points ( i-e 10 ) into R00 and XEQ "SMOOTH"  The results - in R01 thru R10 - are:

 z 0.055 0.65 1.093 1.37 1.566 1.753 1.98 2.091 2.211 2.314

-The first array was obtained by adding random "errors" between -0.1 and +0.1 to  Y = Ln x

 Y 0 0.693 1.099 1.386 1.609 1.792 1.946 2.079 2.197 2.303

-So, we can compare the root-mean-square errors:          RMS error of yk = 0.053      ,      RMS error of zk = 0.032

2°)  Smoothing & Differentiation

-The formulae used in "Numerical Differentiation for the HP-41" produce accurate results when the data are exact.
-But if there are roundoff or observational errors, data smoothing may give better estimations of dy/dx

-The following routine also uses a 5-point least-squares-quadratic polynomial to calculate the derivatives.

Formulae:

y'k ~ (1/10.h) ( -2 yk-2 - yk-1 + yk+1 + 2 yk+2 )                                    where  h = xk+1 - xk

y'1 ~ (1/70.h) ( -54 y1 + 13 y2 + 40 y3 + 27 y4 - 26 y5 )                     and 2 similar expressions
y'2 ~ (1/70.h) ( -34 y1 + 3 y2 + 20 y3 - 17 y4 - 6 y5 )                          for y'n-1  and  y'n

Data Registers:           •  R00 = n > 4                                        ( Registers R00 thru Rnn are to be initialized before executing "SMDV" )

•  R01 = y1  •  R02 = y2  .......................  •  Rnn = yn                 Rnn+1 & Rnn+2:  temp

>>>>  when the program stops:     R01 = y'1   R02 = y'2  .......................  Rnn = y'n

Flags: /
Subroutines: /

 01  LBL "SMDV"   02  10   03  *   04  STO O   05  RCL 00   06  STO M   07  34   08  RCL IND Y   09  *   10  54   11  LASTX   12  *   13  DSE M   14  RCL IND M   15  13   16  *   17  -   18  X<>Y   19  RCL IND M   20  3   21  *   22  -   23  DSE M   24  RCL IND M   25  20   26  *   27  ST- Z   28  ST- Z 29  -   30  DSE M   31  RCL IND M    32  17   33  *   34  -   35  X<>Y   36  RCL IND M   37  27   38  *   39  -   40  DSE M   41  RCL IND M   42  26   43  *   44  +   45  X<>Y   46  RCL IND M   47  6   48  ST+ M   49  *   50  +   51  7   52  RCL O   53  *   54  ST/ Z   55  /   56  X<>Y 57  STO IND M    58  DSE M   59  X<>Y   60  STO IND M   61  RCL 00   62  STO M   63  LBL 01   64  RCL IND X   65  ST+ X   66  DSE Y   67  RCL IND Y   68  +   69  DSE Y   70  DSE Y   71  RCL IND Y   72  -   73  DSE Y   74  RCL IND Y   75  ST+ X   76  -   77  RCL O   78  /   79  X<> IND M   80  STO N   81  DSE M   82  4   83  RCL M   84  X>Y? 85  GTO 01   86  RCL 02            87  3   88  *   89  RCL 01   90  34   91  *   92  -   93  RCL 03   94  20   95  *   96  +   97  RCL 04   98  17   99  * 100  + 101  RCL N 102  6 103  * 104  - 105  RCL 02 106  13 107  * 108  RCL 01 109  54 110  * 111  - 112  RCL 03 113  40 114  * 115  + 116  RCL 04          117  27 118  * 119  + 120  RCL N 121  26 122  * 123  - 124  7 125  RCL O 126  * 127  ST/ Z 128  / 129  STO 03 130  X<>Y 131  STO 04 132  3.001 133  RCL 00 134   E6 135  / 136  + 137  REGMOVE 138  CLA 139  END

( 213 bytes / SIZE nnn+3 )

 STACK INPUTS OUTPUTS X h /

where  h = xk+1 - xk

Example:    With the same data, evaluate y'(1) , y'(2) , .............. , y'(10)

 x 1 2 3 4 5 6 7 8 9 10 y 0.071 0.613 1.112 1.395 1.511 1.795 1.944 2.141 2.153 2.334

-Store the ten yk  into registers R01 thru R10
-Store the number of data points ( i-e 10 ) into R00

h = 1  XEQ "SMDV"  The results - in R01 thru R10 - are:

 y'(x) 0.671 0.519 0.366 0.276 0.206 0.193 0.163 0.129 0.111 0.094

-The exact values should be 1/x  i-e:

 y'(x) 1 0.5 0.333 0.25 0.2 0.167 0.143 0.125 0.111 0.1

-The accuracy is relatively good - except  y'(1) !
-If we use a collocation polynomial, for instance the 5-point formula:     y'k ~ (1/12.h) (  yk-2 - 8 yk-1 + 8 yk+1 - yk+2 )

it yields:   y'(3) ~ 0.401   y'(4) ~ 0.168   y'(5) ~ 0.197   y'(6) ~ 0.227  y'(7) ~ 0.177   y'(8) ~ 0.094

-All these values are much less accurate except y'(5)

3°)  Moving Window Average

-Given n values   y1 , y2 , ............. , yn  and  k coefficients   c1 , c2 , ....... , ck ,
"MWAV" computes   z1 , z2 , ............. , zm  where the z's  are defined by

z1 = ( c1 y1 + ............ + ck yk ) / c
z2 = ( c1 y2 + ............ + ck yk+1 ) / c

.................................................................                    with  c =  c1 + ............ + ck   ( we assume that c # 0 )

zm = ( c1 yn-k+1 + ............ + ck yn ) / c

Data Registers:              R00 = c                     ( Registers Rbb thru Ree & Rbb' thru Ree' are to be initialized before executing "MWAV" )

R01 = bbb.eee
R02 = bbb.eee'           R04-R05-R06: temp
R03 = bbb.eee"

•  Rbb = y1  •  Rbb+1 = y2  .......................  •  Ree = yn
•  Rbb' = c1 •  Rbb'+1 = c2  .........  •  Ree' = ck

>>>>  when the program stops:     Rbb" = z1   Rbb"+1 = z2  .......................  Ree" = zm

Flags:  /
Subroutines:  /

-If you want that this routine also works when  c = 0  ( for numerical differentiation ) ,  add   X=0?   SIGN   after line 47
-Lines 50-51 may be replaced by   CLX   SIGN   ST+ 05   ( slightly faster )

 01  LBL "MWAV" 02  STO 03 03  RDN 04  STO 02 05  INT 06  X<>Y 07  STO 01 08  INT 09  STO 05 10  - 11  RCL 03 12  + 13   E3 14  / 15  RCL 01           16  FRC 17  RCL 02 18  FRC 19  - 20  + 21  RCL 03 22  + 23  STO 03 24  STO 04 25  RCL 02 26  0 27  LBL 00 28  RCL IND Y 29  + 30  ISG Y 31  GTO 00 32  STO 00           33  LBL 01 34  RCL 02 35  RCL 05 36  STO 06           37  CLX 38  LBL 02 39  RCL IND Y 40  RCL IND 06 41  * 42  + 43  ISG 06 44  CLX 45  ISG Y 46  GTO 02 47  RCL 00           48  / 49  STO IND 04 50  ISG 05 51  CLX 52  ISG 04 53  GTO 01 54  RCL 03 55  END

( 77 bytes )

 STACK INPUTS OUTPUTS Z bbb.eee / Y bbb.eee' zm X bbb" bbb.eee"

bbb.eee  is the control number of the y's
bbb.eee' is the control number of the c's                                           All the  bbb's > 006
bbb"  is the first register of the block that will contain the z's

Example:

{ y } = {  1  2  5  9  14  16  13  9  4  1  0 }    that we store in registers  R12  R13 ..................  R22
{ c } = {  1  3  4  1  1 }                                  that we store in registers  R07  R08 ..................  R11

-If we choose  bbb" = 31

12.022   ENTER^
7.011    ENTER^
31       XEQ "MWAV"   >>>>   31.037   --- in 15 seconds ---

-And we have in registers R31 to R37:    { z } = { 5  8.3  11.7  13.7  12.7  9.6  5.7 }

Notes:

-These smoothing filters are defined by  zi = wi-L yi-L + wi-L+1 yi-L+1 + ....... + wi yi + ........... + wi+R-1 yi+R-1 + wi+R yi+R    ( wj = cj/c )
L is the number of points used to the left of yi
R is the number of points used to the right of yi

-So, you can interpret the results according to your choice:    { 5  8.3  11.7  13.7  12.7  9.6  5.7 }

may represent the smoothed values of  { 14  16  13  9  4  1  0  }  if  R = 0  ( causal filters )
or  the smoothed values of  {  1  2  5  9  14  16  13 }  if  L = 0
or  the smoothed values of  {  5  9  14  16  13  9  4 }  if  L = R

and 2 other possibilities in this example.

-Usually, the "moving window averaging" assumes that all cj's = 1, but this program is more general.

-With n = 100 & k = 9 , execution time = 5mn13s
-You can choose bbb" = bbb ,  the y-values will be gradually overwritten but the z-values will be correctly calculated.
-So, a set of 300 data points may be smoothed in about 16 minutes.

-The variant below uses synthetic registers to increase the number of data registers available:

Data Registers:              R00 = c                     ( Registers Rbb thru Ree & Rbb' thru Ree' are to be initialized before executing "MWAV" )

•  Rbb = y1  •  Rbb+1 = y2  .......................  •  Ree = yn
•  Rbb' = c1 •  Rbb'+1 = c2  .........  •  Ree' = ck

>>>>  when the program stops:     Rbb" = z1   Rbb"+1 = z2  .......................  Ree" = zm

Flags: /
Subroutines: /

-If you want that this routine also works when  c = 0  ( for numerical differentiation ) ,  add   X=0?   SIGN   after line 45
-Lines 48-49 may be replaced by   CLX   SIGN   ST+ M   ( slightly faster )

 01  LBL "MWAV" 02  RDN 03  STO N 04  INT 05  X<>Y 06  STO Z 07  INT 08  STO M 09  - 10  R^ 11  + 12   E3 13  / 14  X<>Y 15  FRC 16  + 17  RCL N            18  STO T 19  FRC 20  - 21  + 22  STO O 23  STO P 24  CLX 25  LBL 00 26  RCL IND Y    27  + 28  ISG Y 29  GTO 00 30  STO 00 31  LBL 01 32  RCL N 33  RCL M 34  STO Q 35  CLX 36  LBL 02 37  RCL IND Y 38  RCL IND Q    39  * 40  + 41  ISG Q 42  CLX 43  ISG Y 44  GTO 02 45  RCL 00 46  / 47  STO IND P    48  ISG M 49  CLX 50  ISG P 51  GTO 01 52  RCL O 53  CLA 54  END

( 87 bytes )

 STACK INPUTS OUTPUTS Z bbb.eee / Y bbb.eee' zm X bbb" bbb.eee"

bbb.eee  is the control number of the y's
bbb.eee' is the control number of the c's                                           All the  bbb's > 000
bbb"  is the first register of the block that will contain the z's

-Of course, the same example gives the same results, but we can choose  bbb = 1

References:

[1]  F. Scheid -  "Numerical Analysis" - Mc Graw Hill   ISBN 07-055197-9
[2]  Press , Vetterling , Teukolsky , Flannery - "Numerical Recipes in C++" - Cambridge University Press - ISBN  0-521-75033-4