hp41programs

The Discrete Hartley Transform for the HP-41

The Discrete Hartley Transform for the HP-41


Overview
 

-The Discrete Hartley Transform is closely related to the Discrete Fourier Transform,
  but unlike the DFT, the DHT has the advantage of producing real numbers.
-Furthermore, it is (quasi-)symmetrical.
 

 1°)  One-Dimensional Data ( Vector )
 2°)  Two-Dimensional Data ( Matrix )
 3°)  Three-Dimensional Data ( Hypermatrix )
 4°)  The Fast Hartley Transform ( One-Dimensional only , X-Module Required )
 5°)  Continuous Hartley Transform & Filon's Integration Formula
 6°)  Symmetric Hartley Transform ( 1- & 2-Dimensional Problems )

    a)  Focal Program
    b)  M-Code Routine


Latest Update:    Paragraph 6°)a)

 

1°) One-dimensional Data
 

-The DHT of a vector  [ a1 , a2 , ........ , an ]  is a vector   [ b1 , b2 , ........ , bn ]

  where   bi =  SUMq=1,2,...,n   aq. cas  360°(q-1)(i-1)/n    with   cas µ = sin µ + cos µ
 

Data Registers:           •  R00 = n    •  R01 = a1 ;  •  R02 = a2 ; .......... ;  •  Rnn = an     ( these registers are to be initialized before executing "DHT" )
Flags:  /
Subroutines: /
 
 

01  LBL "DHT" 
02  DEG
03  STO N         
04  360
05  ST* Y
06  -
07  RCL 00
08  STO M         
09  /
10  0
11  LBL 01
12  RCL M
13  ENTER^
14  SIGN         
15  -
16  R^
17  *
18  RCL IND M
19  P-R
20  +
21  +
22  DSE M         
23  GTO 01 
24  RCL N 
25  SIGN         
26  X<>Y 
27  CLA
28  END

 
   ( 46 bytes / SIZE n+1 )
 

 
      STACK        INPUTS      OUTPUTS
           X             i            bi
           L             /             i

    ( 1 <= i <= n )

Example:    Find the DHT of  [ 2  4  7  6 ]

  n = 4   therefore  4  STO 00   and   2  STO 01   4 STO 02   7  STO 03   6  STO 04

  1  XEQ "DHT"  >>>   19
  2      R/S           >>>    -7
  3      R/S           >>>    -1
  4      R/S           >>>    -3     whence   DHT [ 2  4  7  6 ]  =  [ 19  -7  -1  -3 ]

Notes:

-If you call  a0 ; a1 ; ...... ; an-1  the components of these vectors  ( all subscripts decremented by 1 ) , replace lines 05-06 by the single line *
-Execution time = 83 seconds per component if  n = 100.
-The DFT may be obtained from the DHT as follows: keep the first element unchanged ( 19 ) and reverse the order of the others  ( -3 -1 -7 )

 then      E =  (   [ 19  -7  -1  -3  ]  +  [ 19  -3  -1  -7 ]  ) / 2  =  [ 19 -5  -1  -5 ]  = the even part of the DHT
 and       O = (   [ 19  -7  -1  -3  ]  -  [ 19  -3  -1  -7 ]  ) / 2   =    [ 0 -2  0  2 ]     = the odd part of the DHT

-The DFT = E - i.O =  [ 19  -5+2.i  -1  -5-2.i ]

-We have the property   DHToDHT = n.Id
-Thus, applying the DHT twice yields the original vector multiplied by n.
 

2°) Two-dimensional Data
 

-The DHT may be generalized to  nxm matrices  [ ai,j ]  ( 1 <= i <= n ; 1 <= j <= m ). The result is also a  nxm matrix  [ bi,j ]

  where  bi,j =  SUMq=1,2,...,n  ; r=1,2,....,m   aq,r. cas 360°[(q-1)(i-1)/n+(r-1)(j-1)/m]            ( cas = sin + cos )
 

Data Registers:     •  R00 = n.mmm  =  n + m/1000        ( these  nm+1 registers are to be initialized before executing "DHT2" )

                                •  R01 = a1,1      •  Rn+1 = a1,2   .......................................   •  Rnm-n+1 = a1,m
                                •  R02 = a2,1      •  Rn+2 = a2,2   .......................................   •  Rnm-n+2 = a2,m
                                       ..........................................................................................................
                                •  Rnn = an,1      •  R2n  =  an,2   ........................................   •  Rnm = an,m
Flag: /
Subroutine:  /
 
 

01  LBL "DHT2" 
02  DEG
03  360
04  ST* Y
05  ST* Z
06  ST- Z
07  -
08  RCL 00        
09  INT
10  STO M 
11  /
12  STO N 
13  CLX
14  RCL 00        
15  FRC
16   E3
17  *
18  ST* M
19  /
20  STO O 
21  CLX
22  LBL 01
23  RCL N        
24  RCL M
25  ENTER^
26  SIGN
27  -
28  ST* Y
29  RCL 00
30  INT
31  /
32  INT
33  RCL O        
34  *
35  +
36  RCL IND M
37  P-R
38  +
39  +
40  DSE M
41  GTO 01       
42  CLA
43  END

 
   ( 69 bytes / SIZE nm+1 )
 

 
      STACK        INPUTS      OUTPUTS
           Y             j             /
           X             i           bi,j

   ( 1 <= i <= n ; 1 <= j <= m )

 Example:

               [ [  1  3  4  10  ]
 Let  A =   [  4  5  7  14  ]       Calculate   b2,3
                 [  2  9  6  11  ] ]

 We have 3 rows and 4 columns , therefore  3.004  STO 00  and we store the 12 elements in registers  R01 thru R12  in column order:
  1  STO 01    4  STO 02   2  STO 03  ..........   14  STO 11   11 STO 12

-Then    3  ENTER^
             2  XEQ "DHT2"  >>>  b2,3 =  5.464101608  ( in 14 seconds )

-Likewise, we find

                       [ [    76             -28         -28          8         ]
     DHT (A) =   [  -9.2679    5.9282    5.4641   -3.1962  ]        ( rounded to 4 decimals )
                         [ -12.7321  -7.9282   -1.4641    7.1962  ] ]

Notes:

-If all subscripts are decremented by 1 ( a0,0 to an-1,m-1) , replace lines 06-07 by the single line *  and delete line 04.
-Execution time = 42 seconds per component if  n = 5 and m = 7.

-We have the property   DHToDHT = nm.Id
-Applying the DHT twice yields the original matrix multiplied by nm.
 

3°) Three-dimensional Data
 

-In this case, the DHT of a  nxmxp  hypermatrix [ ai,j,k ]  ( 1 <= i <= n ; 1 <= j <= m ; 1 <= k <= p ) is a  nxmxp hypermatrix  [ bi,j,k ]

  where  bi,j,k =  SUMq=1,2,...,n  ; r=1,2,....,m ; s=1,2,....,p   aq,r,s. cas 360°[(q-1)(i-1)/n+(r-1)(j-1)/m+(s-1)(k-1)/p]         ( cas = sin + cos )

Data Registers:     •  R00 = n.mmmppp  =  n + m/1,000 + p/1,000,000  ( these  nmp+1 registers are to be initialized before executing "DHT3" )

                                •  R01 = a1,1,1      •  Rn+1 = a1,2,1   .......................................   •  Rnm-n+1 = a1,m,1
                                •  R02 = a2,1,1      •  Rn+2 = a2,2,1   .......................................   •  Rnm-n+2 = a2,m,1
                                       ..................................................................................................................
                                •  Rn = an,1,1        •  R2n  =  an,2,1   ........................................  •  Rnm = an,m,1

                                          •  Rnm+1 = a1,1,2      •  Rnm+n+1 = a1,2,2   .......................................   •  R2nm-n+1 = a1,m,2
                                          •  Rnm+2 = a2,1,2      •  Rnm+n+2 = a2,2,2   .......................................   •  R2nm-n+2 = a2,m,2
                                              ..................................................................................................................................
                                          •  Rnm+n = an,1,2      •  Rnm+2n  =  an,2,2   ........................................  •  R2nm = an,m,2

                                                ..........................................................................................................
                                                    ..........................................................................................................
                                                        ..........................................................................................................

                                                    •  Rnm(p-1)+1 = a1,1,p      •  Rnm(p-1)+n+1 = a1,2,p   .......................................   •  Rnmp-n+1 = a1,m,p
                                                    •  Rnm(p-1)+2 = a2,1,p      •  Rnm(p-1)+n+2 = a2,2,p   .......................................   •  Rnmp-n+2 = a2,m,p
                                                       ....................................................................................................................................................
                                                    •  Rnm(p-1)+n = an,1,p      •  Rnm(p-1)+2n  =  an,2,p   ........................................  •  Rnmp = an,m,p
Flags: /
Subroutines:  /
 
 

01  LBL "DHT3" 
02  DEG
03  360
04  ST* Y
05  ST* Z
06  ST* T
07  ST- T
08  ST- Z
09  -
10  RCL 00        
11  INT
12  STO M
13  STO a
14  /
15  STO N 
16  CLX
17  RCL 00 
18  FRC
19   E3
20  *
21  INT
22  ST* M
23  ST* a
24  /
25  STO O        
26  CLX
27  RCL 00
28   E6
29  ST* Y
30  SQRT
31  MOD
32  ST* M 
33  /
34  STO P
35  CLX
36  LBL 01        
37  RCL M 
38  ENTER^
39  SIGN
40  -
41  STO Z
42  RCL a
43  /
44  INT
45  RCL P
46  *
47  RCL N        
48  R^
49  *
50  ST+ Y
51  X<> L
52  RCL 00 
53  INT
54  /
55  INT
56  RCL O
57  *
58  +
59  RCL IND M
60  P-R
61  +
62  +
63  DSE M
64  GTO 01 
65  CLA
66  END

 
    ( 104 bytes / SIZE nmp+1 )
 

 
      STACK        INPUTS      OUTPUTS
           Z             k             /
           Y             j             /
           X             i           bi,j,k

   ( 1 <= i <= n ; 1 <= j <= m ; 1 <= k <= p )

Example:

    Let A =  [ [ [   1  25  40   5    2  ]                        To store these 60 coefficients into the proper registers, you can key in    60   XEQ 10
                       [   4  36  18  32  37 ]                         after loading this short routine:
                       [   9   8   39  20  33 ]
                       [ 16  23  21  10  31 ] ]                       LBL 10   ENTER^   X^2   41   MOD   STO IND Y   X<>Y   DSE X   GTO 10
                             [ [  31  10  21  23  16 ]
                               [  33  20  39   8    9  ]
                               [  37  32  18  36   4  ]
                               [   2    5   40  25   1  ] ]
                                       [ [   0  16  23  21  10 ]
                                         [   1  25  40   5    2  ]
                                         [   4  36  18  32  37 ]
                                         [   9   8   39  20  33 ] ] ]

-Evaluate   b2,4,3
-First,   we have a   4x5x3  hypermatrix  therefore:   4.005003   STO 00
-Then,  3  ENTER^
            4  ENTER^
            2  XEQ "DHT3"  >>>    b2,4,3 =  27.04299874

Notes:

-If all subscripts are decremented by 1 (  a0,0,0 to an-1,m-1,p-1 ) , replace lines 07-08-09 by the single line *  and delete line 04.
-Execution time = 79 seconds per component if  n = 4 ; m = 5 ; p = 3.

-We have the property   DHToDHT = nmp.Id
-Applying the DHT twice yields the original hypermatrix multiplied by nmp.
 

4°) The Fast Hartley Transform ( n must be an integer power of 2 , n > 1 )
 

 -If  n = 2N is an integer power of 2 , the DHT of  [ a1 , a2 , ........ , an ]  =  [ b1 , b2 , ........ , bn ]  may be perform more quickly by the following method:

 a)  First, the  ai  are placed into the jth position where j is calculated as follows ( lines 01 to 37 ):         i-1 is written in binary ,
      and the digits are reversed which yields  j-1   ( for instance, if  i = 4 , 4-1 = 3 = (011)2 >>> (110)2 = 6 = 7-1  thus, a4 is placed into the 7th position )

 b)  Then, for L = 1 to N,   registers  Rii  are replaced by  Rj1 + Rj2 .cos 360°(i-1)/2L+ Rj3 .sin 360°(i-1)/2L    ( i = 1 ,2 , ...... , n )   with

        j1 = i - (i-1) mod 2L + (i-1) mod 2L-1  ;
        j2 = j1 + 2L-1  ;
        j3 =  j2           if   ( i-1) mod 2L-1 = 0  and   j3 =  2L + i - (i-1) mod 2L - (i-1) mod 2L-1  otherwise.
 

Data Registers:       •  R00 = n    •  R01 = a1 ;  •  R02 = a2 ; .......... ;  •  Rnn = an     ( these n+1 registers are to be initialized before executing "FHT" )

           and when the program stops,   R01 = b1 ;     R02 = b2 ; .......... ;     Rnn = bn
                                                       ( Rn+1 thru R2n are used for temporary data storage )
Flag: /
Subroutine:  /
 
 

  01  LBL "FHT" 
  02  DEG
  03  RCL 00        
  04  STO M
  05  1
  06  +
  07  LOG
  08  2
  09  LOG
  10  /
  11  INT
  12  STO N
  13  LBL 01
  14  RCL N
  15  STO O
  16  0
  17  RCL M
  18  DSE X
  19  LBL 02
  20  RCL X
  21  2
  22  ST* T
  23  MOD
  24  ST+ Z
  25  X<> L
  26  /
  27  INT
  28  DSE O
  29  GTO 02        
  30  SIGN
  31  +
  32  RCL 00
  33  +
  34  RCL IND M
  35  STO IND Y
  36  DSE M
  37  GTO 01
  38  360
  39  STO O
  40  SIGN
  41  STO M
  42  GTO 04
  43  LBL 03
  44  RCL 00
  45  .1
  46  STO Z
  47  %
  48  ISG X
  49  +
  50  %
  51  1
  52  +
  53  REGMOVE 
  54  LBL 04        
  55  2
  56  ST/ O
  57  RCL 00
  58  STO P 
  59  LBL 05
  60  RCL 00
  61  RCL P
  62  RCL X
  63  RCL O
  64  STO Q
  65  SIGN
  66  -
  67  ST* Q
  68  RCL M
  69  ST+ X
  70  ST+ T
  71  MOD
  72  -
  73  ST+ Y
  74  RCL 00        
  75  +
  76  RCL P
  77  ENTER^
  78  SIGN
  79  -
  80  RCL M 
  81  MOD
  82  ST- Z
  83  +
  84  RCL X
  85  RCL M 
  86  ST+ Z
  87  X<> L
  88  X=0?
  89  ENTER^
  90  X=0?
  91  +
  92  CLX
  93  RCL IND T
  94  RCL Q        
  95  SIN
  96  ST* Y
  97  X<> L
  98  COS
  99  RCL IND T
100  *
101  +
102  RCL IND Y
103  +
104  STO IND P
105  DSE P
106  GTO 05
107  RCL M
108  ST+ M
109  DSE N
110  GTO 03
111  CLA
112  END

 
   ( 176 bytes / SIZE 2n+1 )
 

Example:    Find   DHT [ 2  4  7  6 ]

n = 4   therefore  4  STO 00   and   2  STO 01   4 STO 02   7  STO 03   6  STO 04   XEQ "FHT"

-21 seconds later, we have  R01 = 19   R02 = -7   R03 = -1   R04 = -3  whence   DHT [ 2  4  7  6 ]  =  [ 19  -7  -1  -3 ]

Notes:

-This program doesn't work if  n = 1    ( but DHT [ a ] = [ a ] )
- I've measured the following execution times:
 

 
       n      2      4      8     16     32     64   128
  nxDHT     4s    14s    53s  3m32s 14m10s 56m40s  3h47m
    FHT     6s    21s    60s  2m38s  6m36s 15m59s 37m34s

-FHT execution time is approximately proportional to  n.Log n  ( instead of n2 for n.DHT )
-Obviously, this "fast" algorithm remains relatively slow with an HP-41 and it's worthwhile only if  n > 8
-However, the advantage is substantial if  n = 128  ( about 6 times as fast )
 and good emulators can reduce execution times to even more "reasonable" values...
-Furthermore, calculations are less complicated and roundoff errors are smaller.
 

5°) Continuous Hartley Transform & Filon's Integration Formula
 

-Actually, Hartley has defined the real transform  H(x) =  (2.pi) -1/2  §-infinity+infinity  f(t) ( cos x.t + sin x.t ) dt      ( § = integral )
-This transform is strictly reciprocal.

-In the following program, we assume  f(t)  is defined over an interval  [ a ; b ]    ( f = 0 elsewhere )  where n = b - a  is an even integer
 and we omit the factor (2.pi) -1/2      ( Add  PI  ST+ X  SQRT  /   after line 112 if you want to take it into account )

-The integral is estimated by the Filon's formula.
 

Data Registers:           •  R00 = f(a)    •  R01 = f(a+1)   •  R02 = f(a+2)  , .......... ,  •  Rnn = f(b) = f(a+n)

                                    ( These n+1 registers are to be initialized before executing "HRTL" )
Flags: /
Subroutines: /

-This program uses synthetic register a
-So, it must not be called as more than a first level subroutine.
 
 

  01  LBL "HRTL"
  02  RAD
  03  RDN
  04  X<>Y
  05  STO a
  06  -
  07  STO M        
  08  1.5
  09  1/X
  10  STO O
  11  R^
  12  STO N
  13  ENTER^
  14  ENTER^
  15  X=0?
  16  GTO 01
  17  COS
  18  X^2
  19  1
  20  +
  21  X<>Y
  22  ST+ X
  23  SIN
  24  R^
  25  /
  26  STO O 
  27  -
  28  ST+ X
  29  X<>Y
  30  X^2
  31  /
  32  X<> O        
  33  2
  34  /
  35  X<>Y
  36  SIN
  37  LASTX 
  38  /
  39  X^2
  40  ST+ X
  41  -
  42  1
  43  +
  44  X<>Y
  45  /
  46  LBL 01
  47  RCL M
  48  RCL a
  49  +
  50  R^
  51  *
  52  RCL IND M
  53  P-R
  54  STO P 
  55  X<>Y
  56  ST+ P
  57  X<>Y
  58  -
  59  RCL a
  60  R^
  61  *
  62  RCL 00
  63  P-R
  64  ST- P
  65  X<>Y
  66  ST- P
  67  -
  68  +
  69  *
  70  RCL O
  71  RCL P
  72  *
  73  2
  74  /
  75  -
  76  RCL N 
  77  X=0?
  78  GTO 00       
  79  SIN
  80  LASTX
  81  ST/ Y
  82  COS
  83  -
  84  4
  85  *
  86  RCL N 
  87  X^2
  88  /
  89  GTO 02
  90  LBL 00
  91  CLX
  92  RCL O
  93  ST+ X
  94  LBL 02
  95  STO P 
  96  X<>Y
  97  LBL 03
  98  RCL M 
  99  RCL a
100  +
101  RCL N
102  *
103  RCL IND M
104  P-R
105  +
106  RCL P
107  X<> O
108  STO P 
109  *
110  +
111  DSE M
112  GTO 03
113  RCL N
114  SIGN
115  RDN
116  CLA
117  DEG
118  END

 
   ( 167 bytes / SIZE nnn+1 )
 
 

      STACK        INPUTS      OUTPUTS
           Z             a             /
           Y             b             /
           X             x          H(x)
           L             /            x

  where  b-a =  n  must be  even  and  x  is expressed in  radians

Example1:      f  is only known by the following samples, and  f = 0  if  t < 3  or  t > 7
 
 

     t     3     4     5     6     7
   f(t)     1     2     1    -2    -7

-We store the f-values into registers  R00 thru R04:   1  STO 00   2  STO 01   1  STO 02   -2  STO 03   -7  STO 04
 and then, for instance:

  3  ENTER^
  7  ENTER^
  2  XEQ "HRTL"   >>>>   H(2) = -3.8768  and similarly:
 
 

      x      -3      -2      -1       0       1       2       3
   H(x)   0.7675  -3.5134  -2.8271  -1.3333  -9.6763  -3.8768  -3.7485

-Actually, we had   f(t) = -14 +8.t - t2   and these results are exact because Filon's formulae produce perfect accuracy if  f is a polynomial of degree 2

-If you compute more H(x)-values , you'll see the following graph ( approximately ):

                                                 H(x)
                                 *                |                           *
                               *   *             |        *               *   *
                    *       *      *        *  |     *   *           *       *        *
      --------*--*--*-----*---*--*|--*----*----- *-------*--*---*------------------------- x
                          *          *  *       |*          *      *               *
                                        *         |              *  *
                                                   |                *
 

Example2:     We take  f(t) =  e -t/2  for  t >= 0  and  f(t) = 0  if  t < 0

- H(x) may be expressed analytically and  H(x) = 2(1+2x)/(1+4x2)

-If we use f(0) , f(1) , .............. , f(16) to represent the function  ( these 17 numbers are to be stored in registers R00 thru R16 ),
 we obtain the following results:
 
 

      x       -1     -0.5    -0.25       0   0.2071       1       2
  HRTL  -0.3971   0.0034   0.8015   2.0000   2.4154   1.2025   0.5930
   exact    -0.4       0      0.8       2   2.4142      1.2   0.5882

-The accuracy is quite satisfactory.
-The graph of the Hartley transform looks like this:

                                                              H(x)
                                                                 |
                                                                 |  *                               H(x) is maximum for x = 0.2071
                                                                 |*   *
                                                                 |          *
                                                                 |                       *
                                                                 |                                                          *
  ---------------------------------------* -|----------------------------------------------- x
            *                                                   |
                                     *                 *       |
                                                                 |
 

6°)  Symmetric Hartley Transform ( 1- & 2-Dimensional Problems )

 
       a) Focal Program


-This program works for vectors and matrices.
-The transformation is strictly symmetrical: all the coefficients are divided by sqrt(n.m)
-So  DHT ( DHT (A) )  =  A , with small roundoff-errors as usual.


Data Registers:     •  R00 = n.mmm  =  n + m/1000    or  n  if  m = 1                  ( These  n.m+1 registers are to be initialized before executing "DHT" )

                                •  R01 = a1,1      •  Rn+1 = a1,2   .......................................   •  Rnm-n+1 = a1,m
                                •  R02 = a2,1      •  Rn+2 = a2,2   .......................................   •  Rnm-n+2 = a2,m
                                       ..........................................................................................................
                                •  Rnn = an,1      •  R2n  =  an,2   ........................................   •  Rnm = an,m
Flag: /
Subroutine:  /
  

 01 LBL "DHT"
 02 RAD
 03 RCL 00          
 04 INT
 05 LASTX
 06 FRC
 07  E3
 08 *
 09 X=0?
 10 SIGN
 11 *
 12 STO P
 13 STO a
 14 LBL 10
 15 RCL a
 16 ENTER
 17 SIGN
 18 -
 19 STO Y
 20 RCL 00          
 21 INT
 22 MOD
 23 X<>Y
 24 LASTX
 25 /
 26 INT
 27 PI
 28 ST+ X
 29 ST* Z
 30 *
 31 RCL 00
 32 FRC
 33 PI
 34 INT
 35 10^X
 36 *
 37 X=0?
 38 SIGN
 39 STO M
 40 /
 41 STO O
 42 CLX
 43 RCL 00          
 44 INT
 45 ST* M
 46 /
 47 STO N
 48 CLX
 49 LBL 01
 50 RCL N
 51 RCL M
 52 ENTER
 53 SIGN
 54 -
 55 ST* Y
 56 RCL 00          
 57 INT
 58 /
 59 INT
 60 RCL O
 61 *
 62 +
 63 RCL IND M
 64 P-R
 65 +
 66 +
 67 DSE M
 68 GTO 01
 69 RCL P
 70 SQRT
 71 /
 72 RCL a          
 73 RCL P
 74 +
 75 X<>Y
 76 STO IND Y
 77 DSE a
 78 GTO 10
 79 X<> L
 80 ST+ X
 81  E3
 82 /
 83 +
 84 CLA
 85 DEG
 86 END
 

       ( 121 bytes / SIZE 2n.m+1 )

 

         STACK          INPUT        OUTPUT
             X               /         bbb.eee

    Where  bbb.eee = control number of the results

Example1:           A = [ 1 2 4 7 ]     here, n = 4 & m = 1

   Store  1  2  4  7  into  R01  R02  R03  R04   and   n = 4   STO 00

   XEQ "DHT"  returns  X = 5.008  and registers  [ R05  R06  R07  R08 ] = [ 7  -4  -2   1 ]                ---Execution time = 17s---


Example2:        Let  M  the 2x3 matrix defined by

    1  2  4      =      R01   R03   R05            n.mmm = 2.003  STO 00
    3  5  6      =      R02   R04   R06

   XEQ "DHT"  >>>>   7.012                                                                                                             ---Execution time = 39s---

   R07   R09   R11    =    8.5732   -2.8978   -0.7765       ( rounded to 4D )
   R08   R10   R12    =   -2.8577  -0.1494    0.5577

Notes:

-With n = 5 & m = 7, execution time = 21mn10s

-Trigonometric functions are slightly less slow in RAD mode.
-If you prefer DEG mode, delete line 85, add  R-D  after line 27 and replace line 02 by DEG
 ( there will be no roundoff error in example 1 )

-Synthetic register P is used, so don't stop the program.
-Synthetic register a is also used, so this program cannot be called as more than a first level subroutine.
-But these registers may be replaced by unused data registers, like R98 & R99... provided  2nxm < 98.

 
       b) M-Code Routine


-Of course, execution time will be smaller with M-Code functions.

-The existence of the required registers is tested but there is no check for alpha data.
-There is 1 absolute jump ( the last 2 words ), that must be changed to work properly in your own ROM.
 

094  "T"                                 address  E34A in my ROM
008  "H"
004  "D"
244  CLRF 9
0F8  READ 3(X)
2EE  ?C#0 ALL
013   JNC+02
248   SETF 9
378   READ 13(c)
03C  RCR 3
106   A=C S&X
130   LDI S&X
200   200h                           200 is the correct value for an HP-41CV , CX or C with a quad memory module.
306   ?A<C S&X
381   ?NCGO                     you'll get  NONEXISTENT if R00 does not exist
00A   02E0
0AE   A<>C ALL
270   RAMSLCT
038   READATA
05E   C=0 MS
10E   A=C ALL
046   C=0 S&X
270   RAMSLCT
0AE  A<>C ALL
268   WRIT 9(Q)
2A0  SETDEC
084   CLRF 5                  C
0ED  ?NCXQ                 =
064   193B                   frc(C)
226   C=C+1 S&X
226   C=C+1 S&X
226   C=C+1 S&X
088   SETF 5                  C
0ED  ?NCXQ                 =
064   193B                   int(C)
2EE   ?C#0 ALL
01F   JC+03
35C   PT=12                   C
050   LD@PT-1             =1
168   WRIT 5(M)
278   READ 9(Q)
088   SETF 5                  C
0ED  ?NCXQ                 =
064   193B                   int(C)
1A8   WRIT 6(N)
178   READ 5(M)
13D   ?NCXQ               C=
060   184F                  AB*C
1E8   WRIT 7(O)
028   WRIT 0(T)
260   SETHEX
38D   ?NCXQ
008    02E3
106   A=C S&X
378   READ 13(c)
03C  RCR 3
206   C=A+C S&X
0E6   B<>C S&X
0C6  C=B S&X
1BC  RCR 11
0C6  C=B S&X
1BC  RCR 11
0C6  C=B S&X
17C  RCR 6
206  C=A+C S&X
13C  RCR 8
228   WRIT 8(P)                the control number of the registers is stored in synthetic register P
24C  ?FSET 9
017   JC+02
17C  RCR 6
106   A=C S&X
130   LDI S&X
200   200h
306   ?A<C S&X
381   ?NCGO                     you'll get  NONEXISTENT if Rnm does not exist when X # 0
00A   02E0                         you'll get  NONEXISTENT if R2nm does not exist when X = 0
2A0   SETDEC
24C   ?FSET 9
08B   JNC+11h=+17d
0F8   READ 3(X)
10E   A=C ALL
04E   C=0 ALL                   C
2BE   C=-C-1 MS              =
35C   PT=12
050   LD@PT- 1                -1
070   N=C ALL
01D  ?NCXQ                    C=
060   1807                        A+C
128   WRIT 4(L)
0B8   READ 2(Y)
10E   A=C ALL
0B0   C=N ALL
01D  ?NCXQ                    C=
060   1807                        A+C
0F3   JNC+1Eh=+30d
2A0   SETDEC                  LOOP       at the address   E3A9  in my ROM
046   C=0 S&X                 C
270   RAMSLCT               =
038   READATA               T
10E   A=C ALL
046   C=0 S&X
0EE  C<>B ALL
009  ?NCXQ                    C=
060   1802                       AB-1
2FE  ?C#0 MS
345   ?CGO                     goto
043   10D1                      CLA
028   WRIT 0(T)
268   WRIT 9(Q)
10E   A=C ALL
1B8   READ 6(N)
044   CLRF 4                  C
070   N=C ALL               =
171   ?NCXQ
064   195C                  AmodC
128   WRIT 4(L)
278   READ 9(Q)
10E   A=C ALL
1B8   READ 6(N)
261   ?NCXQ                C=
060   1898                    A/C
088   SETF 5                  C
0ED  ?NCXQ                 =
064   193B                   int(C)
04E   C=0 ALL
0E8   WRIT 3(X)
35C   PT=12                 C
0D0   LD@PT- 3
190   LD@PT- 6           =
226   C=C+1 S&X
226   C=C+1 S&X      360
070   N=C ALL
13D  ?NCXQ               C=
060   184F                  AB*C
178   READ 5(M)
269   ?NCXQ                C=
060   189A                   AB/C
0A8   WRIT 2(Y)
138   READ 4(L)
10E   A=C ALL
0B0   C=N ALL
135   ?NCXQ               C=
060   184D                  A*C
1B8   READ 6(N)
269   ?NCXQ                C=
060   189A                  AB/C
128   WRIT 4(L)
1F8   READ 7(O)
10E   A=C ALL                                      loop2
046   C=0 S&X
0EE  C<>B ALL
009  ?NCXQ                    C=
060   1802                       AB-1
068   WRIT 1(Z)
138   READ 4(L)
13D   ?NCXQ               C=
060   184F                 AB*C
070   N=C ALL
078   READ 1(Z)
10E   A=C ALL
1B8   READ 6(N)
261   ?NCXQ                C=
060   1898                    A/C
088   SETF 5                  C
0ED  ?NCXQ                 =
064   193B                   int(C)
0B8   READ 2(Y)
13D   ?NCXQ               C=
060   184F                  AB*C
0B0   C=N ALL
025  ?NCXQ                    C=
060   1809                      AB+C
04E   C=0 ALL                  C
35C   PT=12
110   LD@PT- 4                =
150   LD@PT- 5
226   C=C+1 S&X            45              Calculating Sin(x+45°) is simpler than using P-R in M-Code.
025  ?NCXQ                    C=
060   1809                      AB+C
070   N=C ALL                 C
3C4  ST=0                         =
229   ?NCXQ
048   128A                      sin(C°)         calculates the sine of an angle in degrees without disturbing the display mode
10E   A=C ALL
238   READ 8(P)
270   RAMSLCT
038   READATA
135   ?NCXQ               C=
060   184D                  A*C
046   C=0 S&X
270   RAMSLCT
0F8   READ 3(X)
025  ?NCXQ                    C=
060   1809                      AB+C
0E8   WRIT 3(X)
260   SETHEX
238   READ 8(P)
266   C=C-1 S&X
228   WRIT 8(P)
2A0   SETDEC
078   READ 1(Z)
2EE  ?C#0 ALL
257   JC-36h=-54d      goto loop2
04E   C=0 ALL                C
35C   PT=12                    =
090   LD@PT- 2              2
10E   A=C ALL
1F8   READ 7(O)
261   ?NCXQ                C=
060   1898                    A/C
305   ?NCXQ               C=
060    18C1                sqrt(AB)
0F8   READ 3(X)
13D  ?NCXQ                C=
060   184F                  AB*C
0E8   WRIT 3(X)
24C   ?FSET 9
345   ?CGO                     goto
043   10D1                      CLA
10E   A=C ALL
238   READ 8(P)
17C  RCR 6
260   SETHEX
270   RAMSLCT
0AE  A<>C ALL
2F0   WRITDATA
046   C=0 S&X
270   RAMSLCT
238   READ 8(P)
03C  RCR 3
27A  C=C-1 M
106   A=C S&X
1BC  RCR 11
0A6  A<>C S&X
228   WRIT 8(P)
2A5  ?NCGO                 goto
38E   E3A9                   LOOP            address  E436  in my ROM

 ( 237 words )
 

Data Registers:     •  R00 = n.mmm  =  n + m/1000    or  n  if  m = 1        ( These  nm+1 registers are to be initialized before executing "DHT" )

                                •  R01 = a1,1      •  Rn+1 = a1,2   .......................................   •  Rnm-n+1 = a1,m
                                •  R02 = a2,1      •  Rn+2 = a2,2   .......................................   •  Rnm-n+2 = a2,m
                                       ..........................................................................................................
                                •  Rnn = an,1      •  R2n  =  an,2   ........................................   •  Rnm = an,m
Flag: /
Subroutine:  /
 
 

      STACK        INPUT1      OUTPUT1       INPUT2       OUTPUT2
           Y             j             /            /             /
           X             i            bi,j            0            b1,1

  If  m = 1 ,  j  may take any real value.

  If  X = 0 , the whole DHT is computed and sored into register  Rmn+1 thru R2mn

Example1:           A = [ 1 2 4 7 ]     here, n = 4 & m = 1

   Store  1  2  4  7  into  R01  R02  R03  R04   and   n = 4   STO 00

   •  0  XEQ "DHT"  returns  b1 = 7  and registers  [ R05  R06  R07  R08 ] = [ 7  -4  -2   1 ]    ---Execution time = 12s---

   •  3  XEQ "DHT"  gives  b3 = -2         ---Execution time = 3s---

Example2:        Let  M  the 2x3 matrix defined by

    1  2  4      =      R01   R03   R05          Store  n.mmm = 2.003  STO 00
    3  5  6      =      R02   R04   R06

   •  0  XEQ "DHT"  returns  b1 = 8.573214097                    ---Execution time = 27s---

   R07   R09   R11    =    8.5732   -2.8978   -0.7765       ( rounded to 4D )
   R08   R10   R12    =   -2.8577  -0.1494    0.5577

 If you copy R07 ....  R12 to R01 .... R06  and press  0  XEQ "DHT"  again,
 you'll get the elements of the original matrix M with a mean error of about 3 E-9

   •  3  ENTER^
       2  XEQ "DHT"  gives  b2,3 = 0.557677535                    ---Execution time = 4.5s---

Notes:

-For n = 5 & m = 7 ( and X = 0 )  execution time = 15mn10s
-Even in Micro-Code, the trigonometric functions remain relatively slow...

 

Reference:

[1]  Ronald N. Bracewell - "The Fourier Transform and its Applications" - McGraw-Hill -  ISBN 0-07-116043-4