# On-Interposer Decoupling Capacitors Placement for Interposer-based 3DIC

Bo-Yang Chen, Chang-Yun Liu, Hung-Ming Chen

**Bo-Tsang Huang** 

Institute of Electronics National Yang Ming Chiao Tung University hatehess.eecs04@nctu.edu.tw jimmyliu1229@nycu.edu.tw hmchen@nycu.edu.tw International College of Semiconductor Technology National Yang Ming Chiao Tung University bughuang@nctu.edu.tw

Abstract— With the demand for high performance and density, the silicon interposer-based three-dimensional integrated circuit (3DIC) has become a promising solution. However, simultaneous switching noise (SSN) causes voltage fluctuation and hence performance degradation and logic failure. Our work proposes an efficient Simulated Annealing (SA) based algorithm to perform decap placement automatically on the interposer. With our solution, the target impedance can be achieved within a specified frequency range. Results show that both the number of decaps and the impedance of the power distribution network (PDN) are reduced while the requirement is met.

# I. INTRODUCTION

As semiconductor manufacturing technology advances and designers use a tremendous number of transistors to meet more demanding specifications, conventional two-dimensional processes face many performance, power, and area problems. Compared with traditional two-dimensional integrated circuits, three-dimensional integrated circuits (3DICs) have many benefits, one of which is a shorter average wire length. Longer wires have higher resistance and capacitance, resulting in a significant increase in signal propagation (RC) delay [1]. As interconnect scaling continues, RC delay is increasingly becoming the dominant factor determining the performance of advanced ICs [2].

Nevertheless, many challenges remain unsolved, and power integrity is one of the most significant for 3DIC. The noise margin of the chip is much lower than before because of the relatively low supply voltage, so a small voltage ripple might cause devices to malfunction. Therefore, ensuring steady power distribution to components in different layers becomes a crucial concern. When optimizing the power distribution network (PDN) design, a typical technique for minimizing the PDN impedance is to use decoupling capacitors (decaps). A decap acts as a temporary current pool and provides a low-noise return path for signals [3]. However, only a limited number of decaps can be inserted on the interposer because of manufacturing costs. Hence, in this work, we focus on the effectiveness and efficiency of placing decaps under area constraints while solving the power integrity problem.

# II. RELATED WORKS

For 2D package and board optimization, [4] proposed a genetic algorithm to determine the locations of decaps. Similarly, [5] proposed a genetic algorithm for optimizing and selecting the appropriate number and types of decoupling capacitors for a pre-defined power/ground noise specification. The authors of [6] investigated interactions among the on-chip power supplies, decoupling capacitors, and load circuitry and then proposed a simultaneous placement of power supplies and decaps. [7] proposed a Simulated Annealing based method to place decoupling capacitors, minimizing core supply noise efficiently.

For 3DIC power distribution networks, one of the crucial technologies is the through silicon via (TSV) technology, which provides massive interconnections between stacked chips, shortening the power and signal paths [8,9]. [10–13] discussed the power integrity issues in 3D integration due to the current consumed by multiple ICs, such as IR drops, small-voltage margins, and large noises.

Many previous studies on 3D PDN modeling and analysis proposed different on-chip PDN modeling approaches [14–22]. However, off-chip PDNs such as the package and PCB were not covered. The authors of [23] verified that PDN impedance directly distinguishes the contributions from the off-chip PDN and the on-chip PDN, and that the SSN can be decomposed into distinct frequency regions corresponding to different PDN components. [24] reported on modeling of power delivery into 3D chip stacks on a silicon interposer/packaging substrate using a novel hybrid approach.

For decoupling capacitor placement on the interposer of a 3DIC, [25] proposed a nature-inspired algorithm of the genetic class to decide the location, the number, and the value of decaps. The authors of [26] proposed a deep reinforcement learning (RL)-based optimal decap design method for silicon interposer-based 2.5-D/3-D ICs.

However, some of the previous works mentioned above did not consider multi-chip systems. Some studies relied on complicated power integrity equations to find optimal locations for decaps. Others choose among different decap values from a decap library during the optimization procedure. In this work, we demonstrate a flow that handles a grid-based power network with the latest deep trench capacitor, considering both decap locations and multi-chip simulation.

# III. PROPOSED APPROACH

#### A. Methodology

Figure 1 shows the entire flow of our method. We first preprocess the current profiles and then determine an initial solution for the number and the locations of decaps by calculating key input parameters. Next, to reduce the runtime of HSPICE simulation, we run the Simulated Annealing algorithm with only an off-chip model and the interposer layer, which is directly connected to a merged current profile through micro-bumps and TSVs. In every iteration, we decide probabilistically whether to accept the new state and check whether the solution fulfills our requirements. Finally, after obtaining an optimized placement of on-interposer decaps, we add chip1 and chip2 and conduct a comprehensive simulation of the 3DIC. In this section, we describe the details of each step.

#### B. Current Profile

In this work, current profiles with AC values are applied to the power networks to express the switching activities of the chips at their physical locations. The greater the value, the more frequent the switching activity at the corresponding location of the chip. There are two chips with different current profiles in our 3DIC structure. Since the electrical impedance of the vertical connections between the two chips is much smaller than the impedance of the power network of a single layer, we can merge the two current profiles into a single one by adding the current values at the same points on chip1 and chip2. We also perform replication padding at the borders of the chip1 and chip2 networks to avoid inaccuracy in the merged profile; an example is shown in Figure 2. Lastly, we connect the merged and normalized current profile to the interposer's power network through a model of the micro-bumps and TSVs. Figure 3 shows an example of current profile preprocessing.
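The preprocessing described here can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `replicate_pad` and `merge_and_normalize` are hypothetical, and normalizing by the peak value of the merged profile is an assumption, since the text does not state how the profile is normalized. The padding example reproduces the values of Fig. 2.

```python
from typing import List

Grid = List[List[float]]


def replicate_pad(grid: Grid, top: int = 0, bottom: int = 0,
                  left: int = 0, right: int = 0) -> Grid:
    """Extend a grid by repeating its border values (replication padding)."""
    # Widen every row by repeating its first and last entries.
    rows = [[row[0]] * left + list(row) + [row[-1]] * right for row in grid]
    # Then repeat the first and last (already widened) rows.
    return ([list(rows[0]) for _ in range(top)] + rows
            + [list(rows[-1]) for _ in range(bottom)])


def merge_and_normalize(chip1: Grid, chip2: Grid) -> Grid:
    """Add two equally sized profiles point by point, then scale the peak to 1.

    Peak normalization is an assumption made for this sketch.
    """
    if len(chip1) != len(chip2) or any(len(a) != len(b)
                                       for a, b in zip(chip1, chip2)):
        raise ValueError("profiles must have the same shape")
    merged = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(chip1, chip2)]
    peak = max(max(row) for row in merged)
    if peak == 0:
        return merged
    return [[value / peak for value in row] for row in merged]


# The 3x3 grid of Fig. 2, padded by one row on top and one column on the right.
no_padding = [[1, 3, 5], [6, 2, 4], [4, 2, 5]]
padded = replicate_pad(no_padding, top=1, right=1)
```

Padding both chip profiles to a common shape before calling `merge_and_normalize` keeps the point-by-point addition well defined.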

#### C. Initial Decap Placement

To start our Simulated Annealing algorithm from a reasonable initial solution instead of a random placement, we propose an intuitive way to determine the initial number and locations of decaps. First, we introduce the quantities used in Eqs. (1) and (2) to estimate the initial solution.

$$Current \ Density = \frac{N_{severe\_points}}{N_{power\_source}} \tag{1}$$

$$N_{initial\_decaps} = \left\lfloor \frac{Current \ Density \times 1250}{target \ impedance} \right\rfloor \tag{2}$$

$N_{severe\_points}$ is the total number of points with a normalized current value $\geq 0.7$. $N_{power\_source}$ is the total number of power sources connected to the interposer. $N_{initial\_decaps}$ is the total number of decaps on the interposer for the initial solution.
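As a worked illustration of Eqs. (1) and (2), the sketch below counts the severe points of a normalized profile and evaluates the initial decap count. The function names are hypothetical, and the example profile with 60 severe points is made up purely for illustration; only the 0.7 threshold and the constant 1250 come from the equations.

```python
import math
from typing import List


def count_severe_points(profile: List[List[float]], threshold: float = 0.7) -> int:
    """Count grid points whose normalized current is at or above the threshold."""
    return sum(1 for row in profile for value in row if value >= threshold)


def initial_decap_count(n_severe_points: int, n_power_source: int,
                        target_impedance: float, scale: float = 1250.0) -> int:
    """Evaluate Eq. (1) and Eq. (2) for the initial number of decaps."""
    current_density = n_severe_points / n_power_source               # Eq. (1)
    return math.floor(current_density * scale / target_impedance)    # Eq. (2)


# Hypothetical 21 x 21 profile (441 points) containing 60 severe points.
example_profile = [[1.0] * 20 + [0.0]] * 3 + [[0.0] * 21] * 18
n_severe = count_severe_points(example_profile)
n_initial = initial_decap_count(n_severe, n_power_source=441, target_impedance=10.0)
```

For this made-up profile the current density is 60/441, so Eq. (2) evaluates to the floor of about 17.007, which is 17 decaps.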

There are two operations for generating a neighboring state in our work. The first adds one decap at a random location that has no decap, called "decap insertion." The second randomly picks an existing decap and moves it to another random location without a decap, called "repositioning." In this work, we employ the TSMC deep trench capacitor [27] as our decap, which has a fixed value.
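The two neighbor operations can be sketched as follows. This assumes a placement is represented as a set of `(row, column)` grid locations, which is a choice made here for illustration; the function names are likewise hypothetical.

```python
import random
from typing import FrozenSet, List, Tuple

Location = Tuple[int, int]
Placement = FrozenSet[Location]


def free_locations(decaps: Placement, rows: int, cols: int) -> List[Location]:
    """List every grid location that currently holds no decap."""
    return [(r, c) for r in range(rows) for c in range(cols)
            if (r, c) not in decaps]


def insert_decap(decaps: Placement, rows: int, cols: int,
                 rng: random.Random) -> Placement:
    """Decap insertion: add one decap at a random free location."""
    free = free_locations(decaps, rows, cols)
    if not free:  # grid is full, nothing to insert
        return decaps
    return decaps | {rng.choice(free)}


def reposition_decap(decaps: Placement, rows: int, cols: int,
                     rng: random.Random) -> Placement:
    """Repositioning: move one random existing decap to a random free location."""
    free = free_locations(decaps, rows, cols)
    if not decaps or not free:  # nothing to move, or nowhere to move it
        return decaps
    moved = rng.choice(sorted(decaps))
    return (decaps - {moved}) | {rng.choice(free)}


rng = random.Random(0)
start: Placement = frozenset({(0, 0), (1, 1)})
after_insert = insert_decap(start, rows=3, cols=3, rng=rng)
after_move = reposition_decap(start, rows=3, cols=3, rng=rng)
```

Insertion grows the placement by one decap, while repositioning keeps the count fixed and changes exactly one location.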

The acceptance probability function and the energy of a state are defined as follows:

$$P(e, e_{new}, T) = \begin{cases} \exp\left(\frac{-(e_{new}-e)}{T}\right), & \text{if } e_{new} \ge e\\ 1, & \text{otherwise} \end{cases} \tag{3}$$

$$e = E(state) = Imp_{max} + \left(\frac{N_{violated}}{N_{severe\_points}}\right) \times 100 \tag{4}$$



Fig. 1.: Flow chart of the overall algorithm.

Eq. (3) gives the probability of making the transition from the current state to a new candidate neighboring state. $e$ and $e_{new}$ denote the energy of the current state and of the new candidate state, respectively. $T$ is the time-varying temperature, which is reduced in every iteration. The equation indicates that a neighboring state with lower energy is always accepted. For a neighboring state with higher energy, the acceptance probability is close to 1 at the beginning of the method, when the temperature $T$ is relatively high, and it decreases as $T$ falls. In addition, among higher-energy neighbors, those with a smaller energy increase are more likely to be accepted. This acceptance rule helps prevent the algorithm from being stuck at local minima.
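A minimal sketch of the acceptance rule in Eq. (3) is given below; the function names and the numeric energies in the example are illustrative assumptions, not values from the experiments.

```python
import math
import random


def acceptance_probability(e: float, e_new: float, temperature: float) -> float:
    """Eq. (3): accept improvements outright, otherwise use exp(-(e_new - e) / T)."""
    if e_new < e:
        return 1.0
    return math.exp(-(e_new - e) / temperature)


def accept(e: float, e_new: float, temperature: float, rng: random.Random) -> bool:
    """Draw a uniform number in [0, 1) and compare it with the acceptance probability."""
    return rng.random() < acceptance_probability(e, e_new, temperature)


# The same uphill move (energy 10 -> 15) at a high and at a low temperature.
p_hot = acceptance_probability(10.0, 15.0, temperature=100.0)   # exp(-0.05)
p_cold = acceptance_probability(10.0, 15.0, temperature=1.0)    # exp(-5)
```

At a temperature of 100 the uphill move is accepted about 95% of the time, whereas at a temperature of 1 it is accepted less than 1% of the time, which is the behavior the paragraph above describes.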

Eq. (4) shows how the energy of a particular state is obtained from a designed cost function. $N_{severe\_points}$ denotes the total number of severe points, i.e., the heavily loaded points, and $N_{violated}$ is the total number of severe points that violate the target impedance. $Imp_{max}$ represents the maximum impedance among all the severe points.
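The cost function of Eq. (4) can be written compactly as below. This is a sketch under two assumptions: the impedances at the severe points are already available as a list (in the paper they come from HSPICE simulation), and a point counts as violating when its impedance is strictly above the target. The function name and sample values are hypothetical.

```python
from typing import Sequence


def energy(impedance_at_severe_points: Sequence[float],
           target_impedance: float) -> float:
    """Eq. (4): maximum impedance plus 100 times the fraction of violating points."""
    n_severe_points = len(impedance_at_severe_points)
    imp_max = max(impedance_at_severe_points)
    # Assumption: a severe point violates when it exceeds the target impedance.
    n_violated = sum(1 for z in impedance_at_severe_points if z > target_impedance)
    return imp_max + (n_violated / n_severe_points) * 100


# Two of four made-up severe points exceed a 10-ohm target: 11.0 + 0.5 * 100.
e_infeasible = energy([9.5, 10.2, 8.0, 11.0], target_impedance=10.0)
# No violations: the energy reduces to the maximum impedance.
e_feasible = energy([9.5, 9.9], target_impedance=10.0)
```

Because the violation term is scaled by 100, any state with violations has a much higher energy than a feasible state whose impedance is near the target, so the search is pushed toward feasibility first.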

# IV. ALGORITHM OF DECAPITATOR INTEGRATION

The advantage of Simulated Annealing is that we can avoid being trapped in local minima by using a probability parameter to perform random perturbation to a neighborhood solution [28]. Compared with deterministic optimization techniques, Simulated Annealing is more efficient in terms of overall run-time. We further reduce the run-time of each iteration by conducting the HSPICE simulation on only the off-chip model and the interposer layer. After the solution becomes stable, we rerun the HSPICE simulation with the full model and output the result.

| No padding |   |   |  | Replication padding |   |   |   |
|------------|---|---|--|---------------------|---|---|---|
|            |   |   |  | 1                   | 3 | 5 | 5 |
| 1          | 3 | 5 |  | 1                   | 3 | 5 | 5 |
| 6          | 2 | 4 |  | 6                   | 2 | 4 | 4 |
| 4          | 2 | 5 |  | 4                   | 2 | 5 | 5 |

Fig. 2.: An example of replication padding.

Fig. 3.: An example of current profile preprocessing for the preparation of design simulation.

In our work, some features help accelerate the algorithm and provide a more reliable solution with fewer decaps. To reduce the algorithm's runtime, we provide a "boosting mode," in which the algorithm terminates immediately once the solution meets the target impedance at all severe points. In contrast, with boosting mode off, the search continues even after a feasible solution is found and finishes when the temperature is low enough, as shown in the overall flow.

In addition, to achieve the target impedance with fewer decaps, we adjust the rate at which the "repositioning" operation is chosen. We use rates of 70%, 98%, and 99%, so at the higher rates there is only a slight chance of performing "decap insertion." With this adjustment, we can obtain a more robust placement with the current number of decaps before adding a new one.
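To make the effect of the repositioning rate concrete, the sketch below selects an operation by comparing a uniform random number with the rate and computes how many insertions to expect over a run. The function names are hypothetical, and the iteration budget of 1374 is the maximum stated for this flow; with boosting mode on, a run may stop earlier.

```python
import random


def choose_operation(reposition_rate: float, rng: random.Random) -> str:
    """Pick 'repositioning' with the given rate, otherwise 'decap insertion'."""
    return "repositioning" if rng.random() < reposition_rate else "decap insertion"


def expected_insertions(reposition_rate: float, iterations: int) -> float:
    """Expected number of decap insertions if every iteration is executed."""
    return (1.0 - reposition_rate) * iterations


# Expected insertion attempts for the three rates over 1374 iterations.
expected = {rate: expected_insertions(rate, 1374) for rate in (0.70, 0.98, 0.99)}
```

Under these assumptions a 70% rate allows roughly 412 insertion attempts, while 98% and 99% allow only about 27 and 14, so the higher rates spend most iterations improving the existing placement.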

The temperature starts at 100 and is multiplied by 0.99 in every iteration. The algorithm ends when the temperature falls below 0.0001; in other words, the maximum number of iterations is 1374. Each iteration takes a few seconds, and the time per iteration increases each time a new decap is placed. However, because the maximum number of iterations is fixed, the total runtime does not exceed 30 minutes. Algorithm 1 summarizes our method.
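The loop of Algorithm 1, including the boosting mode, can be sketched generically as below. This is an illustrative skeleton rather than the authors' code: the `neighbor`, `energy`, and `is_feasible` callables stand in for the decap operations, the HSPICE-based cost, and the target impedance check, and the toy integer problem at the end is purely hypothetical. With the stated schedule, $\ln(10^{-6})/\ln(0.99) \approx 1374.6$, which agrees with the quoted maximum of about 1374 iterations; a loop that tests the temperature before each iteration, as below, executes 1375 times when it is never cut short.

```python
import math
import random
from typing import Callable, Tuple, TypeVar

S = TypeVar("S")


def simulated_annealing(initial: S,
                        neighbor: Callable[[S, random.Random], S],
                        energy: Callable[[S], float],
                        is_feasible: Callable[[S], bool],
                        rng: random.Random,
                        t0: float = 100.0,
                        t_end: float = 1e-4,
                        alpha: float = 0.99,
                        boosting: bool = True) -> Tuple[S, int]:
    """Return the final state and the number of iterations executed."""
    state, e = initial, energy(initial)
    temperature, iterations = t0, 0
    while temperature >= t_end:
        iterations += 1
        candidate = neighbor(state, rng)
        # Boosting mode: stop as soon as a candidate meets the requirement.
        if boosting and is_feasible(candidate):
            return candidate, iterations
        e_new = energy(candidate)
        # Accept improvements outright; accept worse states with exp(-delta / T).
        if e_new < e or rng.random() < math.exp(-(e_new - e) / temperature):
            state, e = candidate, e_new
        temperature *= alpha
    return state, iterations


# Toy stand-in problem: walk along the integers toward the value 7.
def toy_neighbor(s: int, rng: random.Random) -> int:
    return s + rng.choice((-1, 1))


def toy_energy(s: int) -> float:
    return float(abs(s - 7))


def toy_feasible(s: int) -> bool:
    return s == 7


final_state, n_iter = simulated_annealing(
    0, toy_neighbor, toy_energy, toy_feasible, random.Random(0), boosting=False)
```

Switching `boosting` to `True` returns the first feasible candidate instead of running the full cooling schedule, trading solution refinement for runtime.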

# V. EXPERIMENTAL RESULT

We present the experimental results of the proposed algorithm. The supply voltage, which is distributed through an off-chip model to the interposer, is 0.9 V in this work. To demonstrate the effectiveness of our method, we compare our results with an intuitive decap solution that uses the same number of decaps. We also present the intuitive solution with the maximum allowed number of decaps for reference.

Figure 4 presents the current profiles of the two chips and the merged and normalized profile. The size of the interposer and of the two chips is $4 \times 4 \ mm^2$. The number of power sources is 441, and the specification is a target impedance of 10 $\Omega$ between 1 GHz and 8 GHz with only the interposer power network. The resulting decap placements are shown in Figure 5. We first show the self-impedance of the PDN with every decap placement solution from 1 Hz to 10 GHz in Figure 6. We then focus on the high-frequency region, where on-interposer decaps mainly contribute. The self-impedance of the PDN with every decap placement solution in this region is shown in Figure 7, and the statistics are listed in Table I.

| Algorithm 1: Simulated Annealing |
|----------------------------------|
| **Input:** |
| &emsp; Merged current profile |
| &emsp; $T$: the initial temperature |
| &emsp; $T_{terminate}$: the termination temperature |
| &emsp; $\alpha$: temperature reduction factor |
| &emsp; $f_{interest}$: frequency range of interest |
| &emsp; $Z_{target}$: target impedance |
| **Output:** |
| &emsp; A set of decoupling capacitors given by the placement results |
| 1 Obtain initial solution state $S$ by calculation from the merged current profile |
| 2 **while** $T \ge T_{terminate}$ **do** |
| 3 &emsp; Generate state $S'$, the neighboring state of $S$ |
| 4 &emsp; **if** for state $S'$, $Z_{PDN} \leq Z_{target}$ in $f_{interest}$ **then** |
| 5 &emsp;&emsp; End the algorithm |
| 6 &emsp; **end** |
| 7 &emsp; **if** $E(S') < E(S)$ **then** |
| 8 &emsp;&emsp; $S \leftarrow S'$ |
| 9 &emsp; **else** |
| 10 &emsp;&emsp; $\Delta = E(S') - E(S)$ |
| 11 &emsp;&emsp; $r = random(0.0, 1.0)$ |
| 12 &emsp;&emsp; **if** $r < exp(-\Delta/T)$ **then** |
| 13 &emsp;&emsp;&emsp; $S \leftarrow S'$ |
| 14 &emsp;&emsp; **end** |
| 15 &emsp; **end** |
| 16 &emsp; $T = \alpha \times T$ |
| 17 **end** |

TABLE I: Decap placement results with different repositioning rates.

| Repositioning             | Decaps   | Max Impedance $(\Omega)$ | Max Impedance with chips $(\Omega)$ |
|---------------------------|----------|--------------------------|-------------------------------------|
| 70%                       | 21       | 9.6656 (1_5)             | 33.6054 (chip2 0_10)                |
| 98%                       | 19       | 9.9931 (0_8)             | 33.8686 (chip2 12_0)                |
| 99%                       | 19       | 9.77 (8_5)               | 32.3141 (chip2 0_40)                |
| Intuitive Placement       | 19       | 10.0143 (0_8)            | 33.4368 (chip2 18_14)               |
| Intuitive Placement (Max) | 25 (Max) | 7.5857 (20_9)            | 26.2197 (chip2 18_14)               |

Fig. 4.: Current Profile (a) Chip1 (b) Chip2 (c) Merged and Normalized

We can observe that the max impedance with chips is highly correlated with the max impedance with only the interposer, which indicates that our method of first running Simulated Annealing with the interposer PDN and then adding the chips is effective and efficient. Under the same constraint, we can infer that 19 is the minimum number of decaps needed to meet the target impedance for our case. Furthermore, with the same 19 decaps, the placement generated by our method with a 99% repositioning rate achieves a lower maximum impedance with chips than the intuitive placement.

Fig. 5.: Placement results. (a) Repositioning 70% (b) Repositioning 98% (c) Repositioning 99% (d) Intuitive Placement with 19 decaps (e) Intuitive Placement with 25 decaps

The results show that PDNs with different sizes can be optimized by placing decoupling capacitors with our methodology. Compared with an intuitive placement by designers, we not only obtain a more reliable solution with lower PDN impedance but also use fewer decaps, which means fewer resources and less occupied area, so that the target impedance is achieved without overdesigning the PDN.
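The comparison can be checked directly against the interposer-only values reported in Table I. The short sketch below copies those numbers and tests each placement against the 10 $\Omega$ target; the variable names are illustrative, and the check covers only the five placements listed in the table.

```python
TARGET_IMPEDANCE = 10.0  # ohms, interposer-only specification

# (placement, number of decaps, max impedance in ohms), copied from Table I.
table_i = [
    ("Repositioning 70%", 21, 9.6656),
    ("Repositioning 98%", 19, 9.9931),
    ("Repositioning 99%", 19, 9.77),
    ("Intuitive placement", 19, 10.0143),
    ("Intuitive placement (max)", 25, 7.5857),
]

# Placements whose maximum impedance stays at or below the target.
feasible = [row for row in table_i if row[2] <= TARGET_IMPEDANCE]
feasible_names = [name for name, _, _ in feasible]

# Smallest decap count among the feasible placements.
min_feasible_decaps = min(decaps for _, decaps, _ in feasible)

# Lowest maximum impedance among the placements that use 19 decaps.
best_with_19 = min((row for row in table_i if row[1] == 19), key=lambda row: row[2])
```

Among these entries, the intuitive placement with 19 decaps is the only one that misses the target, while both Simulated Annealing placements with 19 decaps meet it.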

# VI. SUMMARY AND CONCLUSIONS

This work demonstrates a flow to deal with a grid-based power network with the latest deep trench capacitor, iCap, proposed by TSMC. Considering the runtime issue, we propose a flow that performs preprocessing first and conducts HSPICE simulations with a single layer. We propose a Simulated Annealing based algorithm to optimize the power integrity by efficiently placing decoupling capacitors on the interposer of 3DIC power networks. The PDN with our placement result meets the target impedance while using fewer decaps and less area. With the same number of decoupling capacitors, our methodology can reduce the PDN impedance by up to 9.3% compared with intuitive placement based on human experience. In conclusion, our method can reach fast and accurate results by using novel decoupling capacitors on grid-based three-dimensional power networks.

Fig. 6.: Self-impedance of the PDN with several decap placement solutions (full view).

Fig. 7.: Self-impedance of the PDN with several decap placement solutions (magnified view). Different repositioning rates yield a noticeable improvement in impedance.

#### References

- [1] K. Banerjee, S. Souri, P. Kapur, and K. Saraswat, "3-D ICs: a novel chip design for improving deep-submicrometer interconnect performance and systems-on-chip integration," *Proceedings of the IEEE*, vol. 89, no. 5, pp. 602–633, 2001.
- [2] W. M. Arden, "The international technology roadmap for semiconductors—perspectives and challenges for the next 15 years," *Current Opinion in Solid State and Materials Science*, vol. 6, no. 5, pp. 371–377, 2002.

- [3] Y.-E. Chen, T.-H. Tsai, S.-H. Chen, and H.-M. Chen, "Cost-effective decap selection for beyond die power integrity," in 2014 Design, Automation Test in Europe Conference Exhibition (DATE), pp. 1–4, 2014.
- [4] K. Bharath, E. Engin, and M. Swaminathan, "Automatic package and board decoupling capacitor placement using genetic algorithms and M-FDM," in 2008 45th ACM/IEEE Design Automation Conference, pp. 560–565, 2008.
- [5] D. Soldo and S. G. Pytel, "Automated decoupling capacitor analysis for analog/digital printed circuit boards," in 2011 8th Workshop on Electromagnetic Compatibility of Integrated Circuits, pp. 111–114, 2011.
- [6] S. Köse and E. G. Friedman, "Distributed power network co-design with on-chip power supplies and decoupling capacitors," in *International Workshop on System Level Interconnect Prediction*, pp. 1–5, 2011.
- [7] J. N. Tripathi, P. Damle, and R. Malik, "Minimizing core supply noise in a power delivery network by optimization of decoupling capacitors using simulated annealing," in 2017 IEEE 21st Workshop on Signal and Power Integrity (SPI), pp. 1–3, 2017.
- [8] J.-Q. Lu, "3-D Hyperintegration and Packaging Technologies for Micro-Nano Systems," *Proceedings of the IEEE*, vol. 97, no. 1, pp. 18–30, 2009.
- [9] J. Knickerbocker, C. Patel, P. Andry, C. Tsang, L. Buchwalter, E. Sprogis, H. Gan, R. Horton, R. Polastre, S. Wright, and J. Cotte, "3-D Silicon Integration and Silicon Packaging Technology Using Silicon Through-Vias," *IEEE Journal of Solid-State Circuits*, vol. 41, no. 8, pp. 1718–1725, 2006.
- [10] J. Lu, A. Jindal, P. Persans, T. Cale, and R. Gutmann, "Wafer-level assembly of heterogeneous technologies," in *Proceedings of 2003 international conference on compound semiconductor manufacturing technology, GaAs MANTECH*, pp. 91–94, 2003.
- [11] Q. K. Zhu, Power distribution network design for VLSI. John Wiley & Sons, 2004.
- [12] M. S. Bakir and J. D. Meindl, Integrated interconnect technologies for 3D nanoelectronic systems. Artech House, 2008.
- [13] Z. Xu, Q. Wu, H. He, and J. J.-Q. Lu, "Electromagnetic-Simulation Program With Integrated Circuit Emphasis Modeling, Analysis, and Design of 3-D Power Delivery," *IEEE Transactions on Components, Packaging and Manufacturing Technology*, vol. 3, no. 4, pp. 641–652, 2013.
- [14] G. Huang, D. C. Sekar, A. Naeemi, K. Shakeri, and J. D. Meindl, "Compact Physical Models for Power Supply Noise and Chip/Package Co-Design of Gigascale Integration," in 2007 Proceedings 57th Electronic Components and Technology Conference (ECTC), pp. 1659–1666, 2007.
- [15] M. B. Healy and S. K. Lim, "Power delivery system architecture for many-tier 3D systems," in 2010 Proceedings 60th Electronic Components and Technology Conference (ECTC), pp. 1682–1688, 2010.
- [16] K. Kim, W. Lee, J. Kim, T. Song, J. Kim, J. S. Pak, J. Kim, H. Lee, Y. Kwon, and K. Park, "Analysis of power distribution network in TSV-based 3D-IC," in 19th Topical Meeting on Electrical Performance of Electronic Packaging and Systems, pp. 177–180, 2010.

- [17] J. S. Pak, J. Kim, J. Cho, K. Kim, T. Song, S. Ahn, J. Lee, H. Lee, K. Park, and J. Kim, "PDN Impedance Modeling and Analysis of 3D TSV IC by Using Proposed P/G TSV Array Model Based on Separated P/G TSV and Chip-PDN Models," *IEEE Transactions on Components, Packaging and Manufacturing Technology*, vol. 1, no. 2, pp. 208–219, 2011.
- [18] N. H. Khan, S. M. Alam, and S. Hassoun, "Power Delivery Design for 3-D ICs Using Different Through-Silicon Via (TSV) Technologies," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 19, no. 4, pp. 647–658, 2011.
- [19] K. Kim, C. Hwang, K. Koo, J. Cho, H. Kim, J. Kim, J. Lee, H.-D. Lee, K.-W. Park, and J. S. Pak, "Modeling and Analysis of a Power Distribution Network in TSV-Based 3-D Memory IC Including P/G TSVs, On-Chip Decoupling Capacitors, and Silicon Substrate Effects," *IEEE Transactions on Components, Packaging and Manufacturing Technology*, vol. 2, no. 12, pp. 2057–2070, 2012.
- [20] H. He, J. J.-Q. Lu, Z. Xu, and X. Gu, "TSV density impact on 3D power delivery with high aspect ratio TSVs," in ASMC 2013 SEMI Advanced Semiconductor Manufacturing Conference, pp. 70–74, 2013.
- [21] H. He, Z. Xu, X. Gu, and J.-Q. Lu, "Power delivery modeling for 3D systems with non-uniform TSV distribution," in 2013 IEEE 63rd Electronic Components and Technology Conference (ECTC), pp. 1115–1121, 2013.
- [22] D. C. Zhang, M. Swaminathan, D. Keezer, and S. Telikepalli, "Characterization of alternate power distribution methods for 3D integration," in 2014 IEEE 64th Electronic Components and Technology Conference (ECTC), pp. 2260–2265, 2014.
- [23] H. He and J. J.-Q. Lu, "Modeling and Analysis of PDN Impedance and Switching Noise in TSV-Based 3-D Integration," *IEEE Transactions on Electron Devices*, vol. 62, no. 4, pp. 1241–1247, 2015.
- [24] Z. Xu, X. Gu, M. Scheuermann, K. Rose, B. C. Webb, J. U. Knickerbocker, and J.-Q. Lu, "Modeling of power delivery into 3D chips on silicon interposer," in 2012 IEEE 62nd Electronic Components and Technology Conference (ECTC), pp. 683–689, 2012.
- [25] S. Piersanti, F. de Paulis, C. Olivieri, and A. Orlandi, "Decoupling Capacitors Placement for a Multichip PDN by a Nature-Inspired Algorithm," *IEEE Transactions on Electromagnetic Compatibility*, vol. 60, no. 6, pp. 1678–1685, 2018.
- [26] H. Park, J. Park, S. Kim, K. Cho, D. Lho, S. Jeong, S. Park, G. Park, B. Sim, S. Kim, Y. Kim, and J. Kim, "Deep Reinforcement Learning-Based Optimal Decoupling Capacitor Design Method for Silicon Interposer-Based 2.5-D/3-D ICs," *IEEE Transactions on Components, Packaging and Manufacturing Technology*, vol. 10, no. 3, pp. 467–478, 2020.
- [27] S. Hou, H. Hsia, C. Tsai, K. Ting, T. Yu, Y. Lee, F. Chen, W. Chiou, C. Wang, C. Wu, and D. Yu, "Integrated deep trench capacitor in Si interposer for CoWoS heterogeneous integration," in 2019 IEEE International Electron Devices Meeting (IEDM), pp. 19.5.1–19.5.4, 2019.
- [28] X.-S. Yang, Engineering optimization: an introduction with metaheuristic applications. John Wiley & Sons, 2010.