Bitwidth cognizant architecture synthesis of custom hardware accelerators
Mahlke, S.; Ravindran, R.; Schlansker, M.; Schreiber, R.; Sherwood, T.
Hewlett-Packard Labs., Palo Alto, CA

This paper appears in: Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on
On page(s): 1355-1371
Volume: 20, Issue: 11, Nov 2001
ISSN: 0278-0070
References Cited: 31
CODEN: ITCSDI
INSPEC Accession Number: 7093950

Abstract: Program-in chip-out (PICO) is a system for automatically synthesizing embedded hardware accelerators from loop nests specified in the C programming language. A key issue confronted when designing such accelerators is the optimization of hardware by exploiting information that is known about the varying number of bits required to represent and process operands. In this paper, we describe the handling and exploitation of integer bitwidth in PICO. A bitwidth analysis procedure is used to determine bitwidth requirements for all integer variables and operations in a C application. Given known bitwidths for all variables, complex problems arise when determining a program schedule that specifies on which function unit (FU) and at what time each operation executes. If operations are assigned to FUs with no knowledge of bitwidth, bitwidth-related cost benefit is lost when each unit is built to accommodate the widest operation assigned. By carefully placing operations of similar width on the same unit, hardware costs are decreased. This problem is addressed using a preliminary clustering of operations that is based jointly on width and implementation cost. These clusters are then honored during resource allocation and operation scheduling to create an efficient width-conscious design. Experimental results show that exploiting integer bitwidth substantially reduces the gate count of PICO-synthesized hardware accelerators across a range of applications.
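
Illustrative sketch (not taken from the paper): the abstract names two ideas, deriving a bitwidth for each integer value and keeping operations of similar width on the same function unit. The Python below is a toy model of both under loudly stated assumptions. The names bits_for_range, add_range, width_blind, width_aware and fu_cost are hypothetical, the cost of a unit is assumed to equal the width of the widest operation it hosts, all operations are assumed to be of one kind (for example, adds), and the sort-and-chunk grouping merely stands in for PICO's clustering, which the abstract says is based jointly on width and implementation cost.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive (lo, hi) value range of an integer


def bits_for_range(lo: int, hi: int) -> int:
    """Smallest width holding every value in [lo, hi].

    Unsigned if lo >= 0, two's complement otherwise.
    """
    if lo > hi:
        raise ValueError("empty range")
    if lo >= 0:
        return max(1, hi.bit_length())
    # Signed: need n with -2**(n-1) <= lo and hi <= 2**(n-1) - 1.
    return max((-lo - 1).bit_length(), hi.bit_length()) + 1


def add_range(a: Range, b: Range) -> Range:
    """Value range of a + b, given the ranges of a and b."""
    return (a[0] + b[0], a[1] + b[1])


def fu_cost(groups: List[List[int]]) -> int:
    """Assumed cost model: each unit is as wide as its widest operation."""
    return sum(max(g) for g in groups if g)


def width_blind(widths: List[int], num_fus: int) -> List[List[int]]:
    """Assign operations round-robin in program order, ignoring width."""
    groups: List[List[int]] = [[] for _ in range(num_fus)]
    for i, w in enumerate(widths):
        groups[i % num_fus].append(w)
    return groups


def width_aware(widths: List[int], num_fus: int) -> List[List[int]]:
    """Sort by width, then give each unit a run of similar-width operations."""
    if not widths:
        return []
    slots = -(-len(widths) // num_fus)  # ceiling division: ops per unit
    ordered = sorted(widths, reverse=True)
    return [ordered[i:i + slots] for i in range(0, len(ordered), slots)]


if __name__ == "__main__":
    # Sum of two 8-bit unsigned values needs 9 bits, not a full 32-bit int.
    print(bits_for_range(*add_range((0, 255), (0, 255))))  # 9

    op_widths = [32, 4, 8, 30, 6, 28]  # made-up widths of six add operations
    print(fu_cost(width_blind(op_widths, 2)))  # 62
    print(fu_cost(width_aware(op_widths, 2)))  # 40
```

In this toy case both assignments use two units with three operations each, so the schedule length is unchanged, yet grouping by width lowers the assumed cost from 62 to 40 because only one unit has to be built 32 bits wide.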