If A Number is An Exact Power of 2?

This algorithm is very complicated but has very high performance. I have developed this algorithms based on other counting algorithms, mainly hardware version of Parallel Count algorithm. This algorithm is explained in brief here. Please post your question if you need more information. Basic operation in the algorithm is that number of bits in any two bit number can be counted by using simple half adder.

This algorithm performs following steps

  • Group number in 2 bit number
  • Determine number of ones in each of the pair using half adder. Since number of 1s can be 0, 1 or 2 in a two bit number, 2 bit storage is required to hold result.
  • Result should have either 0 or 1 but not 2. If Result is 2 then this number  has more than 1 bit ON & no further computation is required. Check if MSB of result is zero or not.
  • If MSBs of each result pair are 0, discard result MSBs & create new number of half the original width & Repeat.
  • Iteration will start with 32 bit input & output will be reduced to 16 bit. Above operations should be repeated till only 1 bit is left @ output.

 

 Innovative Algorithm


     assign sum15_top = (sum15_hi == 0);
     assign sum8_top = (sum8_hi == 0);
     assign sum4_top = (sum4_hi == 0);
     assign sum2_top = (sum2_hi == 0);
     assign sum1_top = (sum1_hi == 0);

     always @(*) begin
       // 16 Half Adders
       // Pair inb[31] with inb[15]
       //      inb[30] with inb[14]
       //   …
       //      inb[16] with inb[0]
       sum15_hi = inb[15:0] & inb[31:16];
       sum15_lo = inb[15:0] ^ inb[31:16];

       // 8 Half adders
       sum8_hi = sum15_lo[7:0] & sum15_lo[15:8];
       sum8_lo = sum15_lo[7:0] ^ sum15_lo[15:8];

       // 4 Half Adders
       sum4_hi = sum8_lo[3:0] & sum8_lo[7:4];
       sum4_lo = sum8_lo[3:0] ^ sum8_lo[7:4];

       // 2 Half Adders
       sum2_hi = sum4_lo[1:0] & sum4_lo[3:2];
       sum2_lo = sum4_lo[1:0] ^ sum4_lo[3:2];

       // 1 Half Adder
       sum1_hi = sum2_lo[0] & sum2_lo[1];
       sum1_lo = sum2_lo[0] ^ sum2_lo[1];
     end

     always @(posedge clk or negedge rst_n) begin
       if (~rst_n)
          val <= 1'b0;
       else
          // Verify if all MSBs are 0 & lowest output is 1
          val <= sum15_top & sum8_top & sum4_top & sum2_top & sum1_top
         & sum1_lo;
     end


 

 

If implemented in ANSI-C, 5 distinct steps are performed, each step having a loop. Each of these loops are run for 32, 16, 8, 4, and 2 time. Based on this, Number of cycles taken arek*62 (Where k is arbitrary constant). HW performance is

  • Code Size – 40 (Less the better) 
  • Complexity – 5/10 on simplicity scale. (More the better – Little Subjective)
  • Basic Gate Count & Timing –  Area is 762 micron2 (Less the better), Critical Path Delay 1.26ns(Less the better)
  • Optimizable – Area is 1992 micron2(Less the better), Critical Path Delay 1.0ns (Less the better)

Next: Conclusion

Leave a Reply

Your email address will not be published. Required fields are marked *