One possible solution example
Warning! The given solution may not be the best one, but it serves well as an example…
Coefficients: c1 & c9 = 0.125; c2 & c8 = 0.25; c3 & c7 = -0.75; c4 & c6 = 1.25; and c5 = 1.0.
Equation transformation (di = data_in; di-k denotes the input sample delayed by k steps):
data_out := (1/8) * (di-0+di-8) + (1/4) * (di-1+di-7) - (3/4) * (di-2+di-6) + (5/4) * (di-3+di-5) + di-4
Equation after substitution (division replaced by shifting):
data_out := ( (di-0+di-8) >> 3 ) + ( (di-1+di-7) >> 2 ) - ( (di-2+di-6) - ( (di-2+di-6) >> 2 ) ) + (di-3+di-5) + ( (di-3+di-5) >> 2 ) + di-4
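As a quick sanity check of the substitution, the following Python sketch compares the fractional-coefficient form with the shift-and-add form. The function names `fir_exact` and `fir_shift` and the sample values are made up for illustration; samples are assumed to be plain integers, with `d[k]` standing for di-k.

```python
def fir_exact(d):
    """Form with fractional coefficients; d[k] stands for di-k (k = 0..8)."""
    return (0.125 * (d[0] + d[8]) + 0.25 * (d[1] + d[7])
            - 0.75 * (d[2] + d[6]) + 1.25 * (d[3] + d[5]) + d[4])


def fir_shift(d):
    """Shift-and-add form; '>>' on Python ints is an arithmetic right shift."""
    a = d[0] + d[8]
    b = d[1] + d[7]
    c = d[2] + d[6]
    e = d[3] + d[5]
    return (a >> 3) + (b >> 2) - (c - (c >> 2)) + e + (e >> 2) + d[4]


# Hypothetical samples, all multiples of 8, so no bits are lost by shifting.
samples = [8, 16, 24, 32, 40, 48, 56, 64, 72]
print(fir_exact(samples), fir_shift(samples))
```

Because a right shift discards the low-order bits, the two forms agree exactly only when the shifted sums are divisible by the corresponding power of two; for other inputs the shifted version can deviate slightly from the exact value.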
In total there are 8 additions and 2 subtractions, which are scheduled over 4 steps. The result of each functional unit is always saved in the same register (reg1, reg2, …, reg4). The multiplexers at the adder/subtractor inputs are not optimized…
Step | add #1 | add #2 | add #3 | sub #4
1 | reg1 := di-1 + di-7 | reg2 := di-2 + di-6 | reg3 := di-3 + di-5 | -
2 | reg1 := di-0 + di-8 | reg2 := di-4 + (reg1>>2) | reg3 := reg3 + (reg3>>2) | reg4 := reg2 - (reg2>>2)
3 | reg1 := (reg1>>3) + reg2 | - | - | reg4 := reg3 - reg4
4 | reg1 := reg1 + reg4 | - | - | -
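The schedule can be checked with a small behavioural model. The Python sketch below is only an illustration, not the RTL code: it assumes that all registers written in one step are updated simultaneously, so every right-hand side reads the values from the previous step (modelled here with tuple assignment). The names `scheduled_fir` and `shift_add_reference` are hypothetical, and `d[k]` stands for di-k.

```python
def scheduled_fir(d):
    """Run the four-step schedule; d[k] stands for di-k (k = 0..8)."""
    # Step 1: three additions in parallel.
    r1, r2, r3 = d[1] + d[7], d[2] + d[6], d[3] + d[5]
    # Step 2: all right-hand sides read the register values from step 1.
    r1, r2, r3, r4 = (d[0] + d[8],
                      d[4] + (r1 >> 2),
                      r3 + (r3 >> 2),
                      r2 - (r2 >> 2))
    # Step 3: one addition and one subtraction.
    r1, r4 = (r1 >> 3) + r2, r3 - r4
    # Step 4: final addition; the result ends up in reg1.
    r1 = r1 + r4
    return r1


def shift_add_reference(d):
    """The shift-and-add equation evaluated in one expression."""
    a, b = d[0] + d[8], d[1] + d[7]
    c, e = d[2] + d[6], d[3] + d[5]
    return (a >> 3) + (b >> 2) - (c - (c >> 2)) + e + (e >> 2) + d[4]


# Hypothetical input samples.
samples = [8, 16, 24, 32, 40, 48, 56, 64, 72]
print(scheduled_fir(samples), shift_add_reference(samples))
```

In this model step 2 overwrites reg1 with di-0 + di-8 while add #2 still reads the old reg1 (di-1 + di-7), which is why the simultaneous-update assumption matters.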
The RTL code can be found here. Synthesis results: area 3248 (1181+2067), delay 19.99.
To restrict the area during synthesis, use the command:
set_max_area [value]