Read the paper on reconfigurable multiplication.
P1. Consider an 8 x 8 multiplier implemented using two sets of sixteen 4-LUTs and a 12-bit adder. We need to include negative numbers among the possible operands. You can assume that one of the operands is known before multiplication begins.
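As a starting point for thinking about signed operands, here is a minimal behavioral sketch in Python. It is not taken from the paper. It assumes the known operand k is baked into two 16-entry tables, that the variable operand x is an 8-bit two's-complement value, and that the low nibble of x is treated as unsigned (0..15) while the high nibble carries the sign (-8..7). The names build_tables and lut_multiply are hypothetical, and Python integers stand in for the fixed-width LUT outputs and the adder.

```python
def build_tables(k):
    """Precompute two 16-entry tables for a known signed 8-bit operand k (assumed design)."""
    # Low nibble of x is unsigned, so it selects k * 0 .. k * 15.
    low = [k * i for i in range(16)]
    # High nibble of x is two's complement, so index i means i - 16 when i >= 8.
    high = [k * (i - 16 if i >= 8 else i) for i in range(16)]
    return low, high


def lut_multiply(tables, x):
    """Multiply signed 8-bit x by the constant stored in the tables."""
    low, high = tables
    bits = x & 0xFF                 # 8-bit two's-complement encoding of x
    lo_nibble = bits & 0xF          # address for the low table
    hi_nibble = bits >> 4           # address for the high table
    # x = 16 * signed(hi_nibble) + lo_nibble, so the high entry is weighted by 16.
    return low[lo_nibble] + (high[hi_nibble] << 4)


tables = build_tables(-37)
print(lut_multiply(tables, -100))   # same value as -37 * -100
```

Under these assumptions every table entry lies between -2048 and 2047, so it fits in a 12-bit two's-complement word. The bottom four bits of the low-table entry pass straight to the result, and only the remaining bits need to go through the adder.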
P2. In a conventional 32 x 32 bit multiplier, 32 partial products (PPs) are generated and added together using several carry-save adders (CSAs). Recall that each CSA takes three PPs as input and produces two PPs as output. Thus, using FLOOR(n/3) CSAs in parallel, n PPs can be reduced to CEILING(2n/3) PPs in one level. In the end, a carry-propagate adder is used to compute the final sum.
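The reduction rule can be checked with a short count. The sketch below repeatedly applies FLOOR(n/3) CSAs per level until two PPs remain and reports the PP count after each level along with the total number of CSAs. The function name csa_reduction is hypothetical.

```python
def csa_reduction(n):
    """Return (PP counts per level, total CSAs) when reducing n PPs to 2."""
    counts = [n]
    total_csas = 0
    while n > 2:
        csas = n // 3        # FLOOR(n/3) CSAs operate in parallel at this level
        n -= csas            # each CSA turns 3 PPs into 2, removing one PP
        total_csas += csas
        counts.append(n)
    return counts, total_csas


counts, total = csa_reduction(32)
print(counts)               # [32, 22, 15, 10, 7, 5, 4, 3, 2]
print(len(counts) - 1)      # 8 CSA levels
print(total)                # 30 CSAs
```

For 32 PPs this gives eight CSA levels and 30 CSAs in total, before the final carry-propagate adder.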
The above solution is expensive. An alternative is to use a pipeline that takes eight PPs at a time and reduces them to two PPs using four CSA stages. PPs are pipelined through this hardware. In the first and second stages, there are two CSAs in parallel. The third and fourth stages consist of one CSA each. Two more stages (fifth and sixth) of one CSA each can be added to accumulate the two PPs with the partial sums obtained earlier (as 2 PPs). After four iterations, the accumulated two PPs are fed to a carry-propagate adder.
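The six-stage datapath can be modeled functionally as follows. This is a sketch under stated assumptions: operands are unsigned 32-bit values, the datapath is 64 bits wide, and the four iterations run one after another rather than overlapped, so it checks the arithmetic but says nothing about timing. The names csa, reduce_eight, and multiply are hypothetical.

```python
MASK = (1 << 64) - 1  # assumed 64-bit datapath, wide enough for a 32 x 32 product


def csa(a, b, c):
    """Carry-save adder: returns (sum, carry) with a + b + c == sum + carry (mod 2**64)."""
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s & MASK, carry & MASK


def reduce_eight(p):
    """Stages 1-4: reduce eight PPs to two."""
    # Stage 1: two CSAs in parallel, 8 PPs -> 6 PPs (two pass through).
    p = [*csa(p[0], p[1], p[2]), *csa(p[3], p[4], p[5]), p[6], p[7]]
    # Stage 2: two CSAs in parallel, 6 PPs -> 4 PPs.
    p = [*csa(p[0], p[1], p[2]), *csa(p[3], p[4], p[5])]
    # Stage 3: one CSA, 4 PPs -> 3 PPs.
    p = [*csa(p[0], p[1], p[2]), p[3]]
    # Stage 4: one CSA, 3 PPs -> 2 PPs.
    return csa(p[0], p[1], p[2])


def multiply(x, y):
    """Unsigned 32 x 32 multiply using four passes through the six stages."""
    pps = [(x << i) & MASK if (y >> i) & 1 else 0 for i in range(32)]
    acc_s, acc_c = 0, 0                           # running partial sum kept as 2 PPs
    for k in range(4):                            # four groups of eight PPs
        s, c = reduce_eight(pps[8 * k: 8 * k + 8])
        t_s, t_c = csa(s, c, acc_s)               # Stage 5: 4 PPs -> 3 PPs
        acc_s, acc_c = csa(t_s, t_c, acc_c)       # Stage 6: 3 PPs -> 2 PPs
    return (acc_s + acc_c) & MASK                 # final carry-propagate adder


print(multiply(12345, 6789) == 12345 * 6789)      # True
```

The model uses eight CSAs in total (two, two, one, one, one, one), compared with the 30 CSAs counted for the fully parallel tree.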