dcterms.abstract | Power consumption has emerged as a key design objective for almost any application. Low swing/voltage clock distribution was proposed in earlier work as a method to reduce power consumption since clock networks typically consume a significant portion of the overall dynamic power in synchronous integrated circuits (ICs). Existing works on low voltage clocking, however, suffer from multiple issues, making these approaches impractical for industrial circuits. For example, most existing studies, such as those on clock networks developed for near-threshold computing, sacrifice performance when lowering the supply voltage of a clock network. The primary objective of this dissertation is to develop a low voltage clocking methodology without degrading circuit performance (operating frequency) or clock network characteristics (such as skew and slew). This objective is achieved through several circuit and algorithmic innovations. A novel D flip-flop (DFF) cell that can reliably operate with a low voltage clock signal and a nominal voltage data signal is proposed. Contrary to existing approaches where the last stage of the clock network operates at nominal voltage, the proposed cell enables low voltage clock operation throughout the entire clock network, thereby maximizing power savings. Furthermore, a similar clock-to-Q delay is maintained to satisfy the same timing constraints. Simulation results demonstrate that when the clock voltage is scaled to 70% of the nominal supply voltage, the proposed DFF cell achieves up to 53% power savings at the expense of an approximately 50% increase in cell-level physical area. At the chip level, the increase in area is approximately 15%. At low supply voltages, satisfying the slew constraint becomes highly challenging due to the reduced drive ability of the clock buffers. A slew-driven clock tree synthesis (CTS) methodology, referred to as SLECTS, is proposed to satisfy tight slew constraints at scaled supply voltages. 
Contrary to existing CTS methods, which are primarily delay/skew based and consider slew only during post-CTS optimization, the proposed approach integrates the slew constraint into the critical steps of the synthesis process (such as merging clock tree nodes, defining routing points, and handling long interconnects). For an industrial 4-core application processor with approximately 1 million gates, implemented in 28 nm fully depleted silicon-on-insulator (FD-SOI) CMOS technology, the proposed slew-driven CTS methodology achieves up to 15% reduction in clock tree power while producing satisfactory skew and slew characteristics. Furthermore, contrary to the vendor tool that exhibits slew violations, the proposed approach satisfies tight slew constraints. When the proposed DFF cell is combined with the proposed CTS methodology, up to 48% reduction in overall clocking power is achieved under similar performance constraints at the expense of a 15% increase in area. In clock trees with highly aggressive design constraints, selective low voltage clocking is considered to satisfy the tight constraints. A novel level-up shifter with dual supply voltage is proposed to enable such operation. Simulation results demonstrate that the proposed level shifter achieves 43% and 36% reduction in transient power and leakage power, respectively, as compared to a conventional cross-coupled level shifter, while consuming 9.5% less physical area. Clock gating is an effective and common technique to reduce the switching power of clock networks. Clock signals arrive at clock gating cells earlier than at sinks, which reduces the timing slack of Enable paths. A useful skew methodology for gated low voltage clock trees is proposed to relax the timing constraints of Enable paths. The methodology is evaluated using the largest ISCAS'89 benchmark circuits. The results demonstrate an average 47% increase in the timing slack of the Enable path. 
The design methodologies proposed in this dissertation facilitate low voltage clocking for high performance industrial circuits. A significant reduction in clock power is achieved without degrading clock frequency or primary clock constraints such as skew and slew. The proposed methodologies were integrated into a conventional design flow and demonstrated using large scale industrial circuits. | |