# CoWoS Architecture Evolution for Next Generation HPC on 2.5D System in Package

Yu-Chen Hu Taiwan Semiconductor Manufacturing Company, Ltd Hsinchu, Taiwan YCHUQ@tsmc.com

Chih-Ta Shen Taiwan Semiconductor Manufacturing Company, Ltd Hsinchu, Taiwan CTSHENA@tsmc.com

Yu-Min Liang Taiwan Semiconductor Manufacturing Company, Ltd Hsinchu, Taiwan YMLIANGA@tsmc.com

Chien-Hsun Lee Taiwan Semiconductor Manufacturing Company, Ltd Hsinchu, Taiwan CHLEEC@tsmc.com

Hsieh-Pin Hu Taiwan Semiconductor Manufacturing Company, Ltd Hsinchu, Taiwan HPHU@tsmc.com

S. Y. Hou Taiwan Semiconductor Manufacturing Company, Ltd Hsinchu, Taiwan SYHOU@tsmc.com

Chia-Yen Tan Taiwan Semiconductor Manufacturing Company, Ltd Hsinchu, Taiwan CYTANI@tsmc.com

*Abstract*— Chip-on-wafer-on-substrate (CoWoS®) is an advanced packaging technology for building high performance computing (HPC) and artificial intelligence (AI) components. As a high-end system-in-package (SiP) solution, it enables multi-chip integration in a side-by-side manner within a more compact floor plan than a traditional multi-chip module (MCM). Scaling up the interposer area is one of the key ways to accommodate more active circuits and transistors in the package and boost SiP system performance. CoWoS-S, based on a Si interposer, has been developed up to an interposer area of 2500 mm<sup>2</sup> by four-mask stitching. However, the unprecedented interposer area poses major yield and manufacturing challenges. Ways to overcome the Si interposer size limitation have become highly desirable.

In this paper, we introduce CoWoS-L, a new architecture in the CoWoS family, to address the defect-driven yield loss of large Si interposers. The interposer of CoWoS-L combines multiple local Si interconnect (LSI) chiplets and global redistribution layers (RDL) into a reconstituted interposer (RI) that replaces the monolithic silicon interposer of CoWoS-S. The LSI chiplet inherits all the attractive features of the Si interposer by retaining sub-micron Cu interconnects, through silicon vias (TSV), and embedded deep trench capacitors (eDTC) to ensure good system performance, while avoiding the issues associated with one large Si interposer, such as yield loss. Furthermore, a through insulator via (TIV) is introduced in the RI as a vertical interconnect that provides a lower insertion loss path than a TSV. CoWoS-L with a 3x reticle size (~2500 mm<sup>2</sup>) interposer carrying multiple SoC/chiplet dies and 8 HBMs has been successfully demonstrated. The electrical characteristics and component level reliability are reported. The stable reliability results and excellent electrical performance indicate that the CoWoS-L architecture will continue the scaling momentum of CoWoS-S to meet the demand of future 2.5D SiP systems for HPC and AI deep learning.

*Keywords*— CoWoS, HPC, SiP, AI, MCM, TSV, TIV, eDTC

# I. INTRODUCTION

Artificial intelligence (AI) has advanced at an unprecedented pace in recent years. Applications built on deep learning and big data analysis keep growing and demand ever more bandwidth from HPC systems. Not only must the data processing speed of the SoC itself increase, but the data transmission bandwidth between SoC and memory also plays a key role in HPC systems, which process big data in a parallel processing architecture. The pursuit of high-bandwidth, low-latency interconnect is therefore becoming increasingly critical in high-density heterogeneous integration. Among the advanced packaging and 3DIC technologies developed in recent years [1-4], the 2.5D CoWoS platform has been widely adopted by HPC and AI systems for its unique capabilities: large integration area, high bandwidth memory (HBM) compatibility, and rich options for passives and interconnects.



Fig. 1. CoWoS development progress

In a typical CoWoS process, top dies of known-good logic SoC and HBM are integrated side by side on a Si interposer wafer through $\mu$-bumps at a pitch of around 30 to 60 µm. Before this chip-on-wafer (CoW) process, the Si interposer is pre-formed with multi-layer interconnect, TSV and eDTC in a wafer fab environment. The CoW wafer is then diced into individual CoW modules based on the interposer size, which are assembled onto a package substrate to form the SiP. The Si interposer between the top dies and the substrate provides a finer interconnect pitch and a shorter horizontal path, ensuring better signal integrity (SI) and power integrity (PI). Two- and four-mask photolithography stitching were developed in previous CoWoS generations to scale the Si interposer up to an area equivalent to three full reticles (3x, or ~2500 mm<sup>2</sup>). Note that, in this paper, one reticle size is defined as ~830 mm<sup>2</sup>, which derives from 25.52 mm x 32.52 mm, the maximum accessible field area for a lithography scanner. CoWoS-S, the Si interposer based CoWoS technology, has been qualified for a record-high number of 3 SoC/chiplet dies and 8 HBMs [1, 5]. While a continued increase of the Si interposer size is still an option for scaling next generation CoWoS up to 4x (~3300 mm<sup>2</sup>), challenges emerge in productivity and reliability. The complexity of the lithography process beyond 4-mask stitching brings a large throughput penalty for interposer fabrication. Control of stitching error at the field boundaries of different masks is also challenging [6]. In addition, a monolithic Si interposer of such a large size raises yield concerns, especially as the gross die count per wafer decreases dramatically beyond 3x. Hence, CoWoS-S scaling towards four times the reticle size (~3320 mm<sup>2</sup>) or beyond is extremely challenging in terms of production and reliability.
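The reticle arithmetic and the shrinking die count can be checked with a short sketch. The field dimensions are taken from the text; the 300-mm wafer diameter and the textbook gross-die-per-wafer estimate (which ignores scribe lines, edge exclusion and die aspect ratio) are assumptions for illustration, not the authors' method.

```python
import math

# Field dimensions quoted in the text (mm); one "reticle" is their product.
FIELD_W_MM = 25.52
FIELD_H_MM = 32.52
WAFER_DIAMETER_MM = 300.0  # assumption: 300-mm wafers


def reticle_area_mm2() -> float:
    """Area of one full lithography field as defined in the paper."""
    return FIELD_W_MM * FIELD_H_MM


def interposer_area_mm2(multiple: float) -> float:
    """Interposer area for a given reticle multiple (3 for 3x, 4 for 4x)."""
    return multiple * reticle_area_mm2()


def gross_die_per_wafer(die_area_mm2: float,
                        wafer_diameter_mm: float = WAFER_DIAMETER_MM) -> int:
    """Textbook estimate: wafer area / die area minus an edge-loss term.

    This is a rough approximation, not a layout-accurate count.
    """
    d = wafer_diameter_mm
    estimate = (math.pi * d ** 2 / (4.0 * die_area_mm2)
                - math.pi * d / math.sqrt(2.0 * die_area_mm2))
    return max(0, math.floor(estimate))


if __name__ == "__main__":
    for multiple in (1, 3, 4):
        area = interposer_area_mm2(multiple)
        print(f"{multiple}x: {area:.0f} mm^2, "
              f"~{gross_die_per_wafer(area)} gross dies per wafer")
```

Under these assumptions the estimated count falls much faster than the area grows, which is the trend the paragraph describes for interposers beyond 3x.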

In this paper, the CoWoS-L architecture is demonstrated as a feasible platform to tackle the productivity issues that accompany CoWoS package scaling. Several Si-based LSI chips are reconstituted in a molding-compound based interposer to replace the single Si interposer. This innovative RI structure brings many advantages for CoWoS-L, such as eliminating mask stitching and improving yield. As the technology roadmap in Fig. 1 shows, the CoWoS-L rollout keeps the momentum of CoWoS scaling and opens up more applications for the vibrant HPC industry.
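The yield argument can be made concrete with a simple Poisson defect model. This is a common textbook approximation and not a model stated in this paper. The symbols are assumptions introduced for illustration: $A$ is the total silicon area, $D_0$ the defect density, and $n$ the number of equal-area LSI chiplets that together cover the same area.

$$
Y_{\text{mono}} = e^{-A D_0}, \qquad Y_{\text{LSI}} = e^{-(A/n)\,D_0} = Y_{\text{mono}}^{1/n}
$$

Under these assumptions each small chiplet yields better than the monolithic interposer, and because chiplets can be screened as known-good dies before assembly, a defect costs one chiplet and not the whole interposer. In practice the LSI chiplets cover only part of the interposer area, so the equal-area split is a deliberate simplification.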

# II. CoWoS-L

## A. CoWoS-L Architecture

A CoWoS-L package consists of three parts: top dies, a reconstituted interposer and a substrate. Fig. 2 illustrates the CoWoS-L package scheme. Top dies are bonded side by side on the interposer through fine-pitch $\mu$-bumps. The interposer carries all the top dies to form the chip-on-wafer (CoW) module, and the LSI dies handle most of the die-to-die communication. Both the top and bottom sides of the interposer contain an RDL layer, for $\mu$-bump and C4 bump routing respectively. TIVs surrounded by molding compound provide a direct vertical path from the substrate to the top dies with low insertion loss. Finally, the CoW die is bonded to a substrate to complete the CoWoS package.



Fig. 2. Schematic diagram of CoWoS-L structure

Fig. 3 shows the CoWoS-L test vehicle package. The package and interposer measure 70 mm x 76 mm and 43 mm x 58 mm, respectively. The test vehicle includes 3 SoC/chiplet dies and 8 HBMs for structure verification. More than 10 LSI dies are embedded in the RI.
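As a quick consistency check, the test vehicle dimensions can be captured in a small record and compared with the reticle definition from the Introduction. The class and property names below are hypothetical and chosen only for illustration; the numbers come from the text.

```python
from dataclasses import dataclass
from typing import Tuple

# One reticle as defined in the Introduction (25.52 mm x 32.52 mm).
RETICLE_AREA_MM2 = 25.52 * 32.52


@dataclass(frozen=True)
class CowosLTestVehicle:
    """Hypothetical container for the test vehicle parameters in the text."""
    package_mm: Tuple[float, float]
    interposer_mm: Tuple[float, float]
    soc_dies: int
    hbm_stacks: int

    @property
    def interposer_area_mm2(self) -> float:
        w, h = self.interposer_mm
        return w * h

    @property
    def package_area_mm2(self) -> float:
        w, h = self.package_mm
        return w * h

    @property
    def reticle_multiple(self) -> float:
        """Interposer area expressed in units of one reticle."""
        return self.interposer_area_mm2 / RETICLE_AREA_MM2

    @property
    def interposer_to_package_ratio(self) -> float:
        return self.interposer_area_mm2 / self.package_area_mm2


vehicle = CowosLTestVehicle(
    package_mm=(70.0, 76.0),
    interposer_mm=(43.0, 58.0),
    soc_dies=3,
    hbm_stacks=8,
)

if __name__ == "__main__":
    print(f"interposer area: {vehicle.interposer_area_mm2:.0f} mm^2")
    print(f"reticle multiple: {vehicle.reticle_multiple:.2f}x")
    print(f"interposer/package: {vehicle.interposer_to_package_ratio:.2f}")
```

The 43 mm x 58 mm interposer works out to about 2494 mm<sup>2</sup>, which matches the ~2500 mm<sup>2</sup> (3x reticle) figure quoted elsewhere in the paper.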



Fig. 3. Images of CoWoS-L test vehicle, (a) ring type (b) lid type package

## B. Fabrication of CoWoS-L Package

CoWoS-L is a "chip last" assembly, in which the interposer is fabricated before top die stacking. Fig. 4(a) illustrates the process flow of the RI. First, TIVs are fabricated on the carrier wafer. Second, the LSI known-good dies (KGDs) are placed on the carrier wafer. Molding compound is then filled into the gaps between the LSI dies and TIVs, followed by a CMP process for surface planarization.



Fig. 4. Process flow of CoWoS-L

One RDL layer is fabricated on the interposer frontside to connect the $\mu$-bumps to the TIVs and LSI dies. Fig. 4(b)-(d) shows the process flow of CoW. Top dies with $\mu$-bumps are bonded to the interposer, then filled and encapsulated with underfill and molding compound. Another RDL layer is fabricated on the interposer backside, followed by C4 formation, as shown in Fig. 4(d). Fig. 4(e)-(f) depicts the on-substrate (oS) process flow. CoWoS-L is offered in ring-type and lid-type packages. For the lid-type package, a novel film-type thermal interface material (TIM) is inserted between the lid and the CoW die for better thermal dissipation than a traditional gel-type TIM.
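The flow described above can be summarized as an ordered list of steps. The sketch below only restates the text; the stage labels and the grouping of the frontside RDL under the RI stage are assumptions, since the text does not say which panel of Fig. 4 that step belongs to.

```python
# (stage, step) pairs in the order the text describes them.
# Assumption: frontside RDL is grouped with the reconstituted interposer (RI).
PROCESS_FLOW = [
    ("RI", "fabricate TIVs on the carrier wafer"),
    ("RI", "place LSI known-good dies on the carrier wafer"),
    ("RI", "fill molding compound between LSI dies and TIVs"),
    ("RI", "planarize the surface by CMP"),
    ("RI", "fabricate frontside RDL"),
    ("CoW", "bond top dies to the interposer with micro-bumps"),
    ("CoW", "apply underfill and molding compound"),
    ("CoW", "fabricate backside RDL"),
    ("CoW", "form C4 bumps"),
    ("oS", "assemble the CoW die onto the substrate"),
    ("oS", "attach ring, or lid with film-type TIM"),
]


def steps_by_stage(flow):
    """Group steps by stage while preserving their original order."""
    grouped = {}
    for stage, step in flow:
        grouped.setdefault(stage, []).append(step)
    return grouped


def is_chip_last(flow):
    """True if every interposer (RI) step precedes the first CoW step."""
    stages = [stage for stage, _ in flow]
    last_ri = max(i for i, s in enumerate(stages) if s == "RI")
    first_cow = min(i for i, s in enumerate(stages) if s == "CoW")
    return last_ri < first_cow


if __name__ == "__main__":
    for stage, steps in steps_by_stage(PROCESS_FLOW).items():
        print(stage)
        for step in steps:
            print("  -", step)
    print("chip last:", is_chip_last(PROCESS_FLOW))
```

The `is_chip_last` check encodes the defining property of the flow: the interposer is complete before any top die is attached.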

## C. Fabrication of Local Si Interconnect (LSI) Die

The LSI die inherits all the key features of the Si interposer, including TSVs and Cu interconnect for signal and power transmission. Two types of LSI die are available: LSI-1, which uses dual damascene Cu interconnect, and LSI-2, which uses Cu RDL. The difference in Cu interconnect determines the electrical performance and minimum line width capability of each LSI type.



Fig. 5. Process flow of LSI die (a) LSI-1 process flow (b) LSI-2 process flow

Fig. 5 illustrates the process flows of LSI-1 and LSI-2. For LSI-1, TSVs and a layer of single damascene Cu metal (M1) are fabricated first on a 300-mm silicon wafer. Dual damascene Cu with undoped silicate glass (USG) as the dielectric layer then forms the interconnect structure. The dual damascene Cu process provides a minimum metal width/space of 0.8/0.8 µm and a thickness of 2 µm in the LSI-1 metal scheme.

LSI-2 has the same TSV and M1 metal structure. After the M1 layer is fabricated, Cu RDL with polyimide (PI)-based material as the dielectric layer forms the interconnect structure by a semi-additive process (SAP). The SAP Cu RDL provides a minimum width/space of 2/2 µm and a thickness of 2.3 µm.

Finally, a Cu via is fabricated on the top metal of the LSI to serve as the connection to the frontside RDL of the RI.
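To see how the two metal schemes trade line width against resistance, the sketch below estimates the DC resistance per micrometer of a minimum-width line from the dimensions quoted above. It assumes a rectangular cross-section and the bulk resistivity of copper near room temperature; real lines have barrier layers and size effects, so measured values would be higher. The function name is hypothetical.

```python
RHO_CU_BULK_OHM_M = 1.68e-8  # assumption: bulk Cu resistivity at about 20 degC


def resistance_per_um(width_um: float, thickness_um: float,
                      rho_ohm_m: float = RHO_CU_BULK_OHM_M) -> float:
    """Idealized DC line resistance in ohm per micrometer of length.

    R / L = rho / (width * thickness), with a rectangular cross-section.
    """
    area_m2 = (width_um * 1e-6) * (thickness_um * 1e-6)
    ohm_per_m = rho_ohm_m / area_m2
    return ohm_per_m * 1e-6  # convert ohm/m to ohm/um


# Minimum-width line dimensions from the text.
lsi1 = resistance_per_um(width_um=0.8, thickness_um=2.0)
lsi2 = resistance_per_um(width_um=2.0, thickness_um=2.3)

if __name__ == "__main__":
    print(f"LSI-1 (0.8 um x 2.0 um): {lsi1 * 1e3:.2f} mOhm/um")
    print(f"LSI-2 (2.0 um x 2.3 um): {lsi2 * 1e3:.2f} mOhm/um")
    print(f"ratio LSI-1 / LSI-2: {lsi1 / lsi2:.3f}")
```

Under these idealized assumptions the minimum-width LSI-1 line is about 2.9 times more resistive per unit length than the minimum-width LSI-2 line, simply because its cross-section is smaller; in exchange, LSI-1 supports a finer routing pitch.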

## D. The New Generation eDTC

The 1<sup>st</sup> generation embedded deep trench capacitor (eDTC) was first introduced in the CoWoS platform to boost electrical performance [7]. In early development, CoWoS with the 1<sup>st</sup> generation eDTC lowered the system power delivery network (PDN) impedance by 93% and the 1<sup>st</sup> voltage droop by 72% relative to the case without eDTC [8]. Moreover, the simultaneous switching noise (SSN) of VDDQ in HBM can be mitigated to 38% of the level without eDTC at 3.2 GHz [8]. Signal integrity also improves because SSN is reduced. A CoWoS platform with eDTC therefore benefits both power integrity and signal integrity. The new generation of eDTC provides a capacitance density of 1100 nF/mm<sup>2</sup>. This high capacitance density is a large advantage for power efficiency in high-speed computing. CoWoS-L can provide higher capacitance than CoWoS-S. Due to yield concerns, the maximum eDTC area on a single silicon chip has an upper limit of around 300 mm<sup>2</sup>. CoWoS-L with multiple LSI chips can dramatically increase the total eDTC capacitance on the RI by connecting the capacitance of all LSI chips. Fig. 6 compares the maximum eDTC capacitance of CoWoS-S and CoWoS-L.



Fig. 6. The maximum eDTC capacitance of CoWoS-S and CoWoS-L
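The scaling argument behind Fig. 6 can be sketched with the two numbers given in the text: the capacitance density and the per-chip eDTC area limit. The per-chip eDTC areas used for CoWoS-L below are hypothetical placeholders, since the paper does not list them, and the function names are illustrative only.

```python
DENSITY_NF_PER_MM2 = 1100.0          # new generation eDTC, from the text
MAX_EDTC_AREA_PER_CHIP_MM2 = 300.0   # approximate per-chip limit, from the text


def edtc_capacitance_uf(edtc_area_mm2: float) -> float:
    """Capacitance of one chip in microfarads, capped at the per-chip area limit."""
    usable_area = min(edtc_area_mm2, MAX_EDTC_AREA_PER_CHIP_MM2)
    return DENSITY_NF_PER_MM2 * usable_area / 1000.0  # nF -> uF


def total_capacitance_uf(edtc_areas_mm2) -> float:
    """Total capacitance when the eDTCs of several chips are connected together."""
    return sum(edtc_capacitance_uf(area) for area in edtc_areas_mm2)


# Single monolithic interposer: limited to one chip's worth of eDTC area.
cowos_s_max = total_capacitance_uf([MAX_EDTC_AREA_PER_CHIP_MM2])

# Hypothetical CoWoS-L case: ten LSI chips with 100 mm^2 of eDTC each.
cowos_l_example = total_capacitance_uf([100.0] * 10)

if __name__ == "__main__":
    print(f"CoWoS-S upper bound: {cowos_s_max:.0f} uF")
    print(f"CoWoS-L example (hypothetical areas): {cowos_l_example:.0f} uF")
```

Because the area cap applies per chip, the total scales with the number of LSI chips, which is the mechanism the text gives for the higher capacitance of CoWoS-L.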

# III. ELECTRICAL PERFORMANCE AND RELIABILITY TEST

## A. Electrical Properties of New Generation eDTC



Fig. 7. Cumulative distribution of new generation eDTC capacitance density

Fig. 7 shows the capacitance density of the new generation eDTC. The capacitance density is around 1100 nF/mm<sup>2</sup> after voltage stressing. This is 3 times higher than the 1<sup>st</sup> generation eDTC and more than 50 times higher than a MIM capacitor [9]. The excellent electrical characteristics of the eDTC integrated in the LSI die give CoWoS-L strong SI and PI performance.

## B. Electrical Characterization of LSI-1 and LSI-2

CoWoS-L provides two kinds of LSI die, LSI-1 and LSI-2; the main difference between them is the interconnect metal scheme. A Kelvin structure was designed to investigate the basic electrical properties of the two metal schemes. Fig. 8 shows the resistance at the minimum line width of each metal scheme.



Fig. 8. Resistance of minimum line width per micrometer (a) LSI-1 metal scheme, minimum line width 0.8 µm (b) LSI-2 metal scheme, minimum line width 2 µm

The LSI die carries the die-to-die communication between HBM and SoC. The signal integrity of the LSI interconnect is critical to prevent data distortion during high-speed transmission. The insertion loss of the LSI-1 and LSI-2 metal schemes was investigated and is characterized in Fig. 9. A single-ended ground-signal-ground (GSG) pattern was used to evaluate the metal scheme properties. As shown in Fig. 9, the LSI-1 metal scheme has lower S21 than the LSI-2 metal scheme at high frequency.



Fig. 9. Insertion loss of LSI-1 and LSI-2 interconnect
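For readers less familiar with S-parameter plots, the sketch below shows the standard conversion between the linear magnitude of S21, its value in decibels, and insertion loss. The magnitudes in the example are arbitrary illustrations and are not measurements from this paper.

```python
import math


def s21_db(s21_magnitude: float) -> float:
    """Convert a linear |S21| (0 < |S21| <= 1 for a passive line) to dB."""
    if s21_magnitude <= 0.0:
        raise ValueError("|S21| must be positive")
    return 20.0 * math.log10(s21_magnitude)


def insertion_loss_db(s21_magnitude: float) -> float:
    """Insertion loss is the negative of S21 expressed in dB."""
    return -s21_db(s21_magnitude)


if __name__ == "__main__":
    # Arbitrary example magnitudes, not data from the paper.
    for mag in (1.0, 0.9, 0.5):
        print(f"|S21| = {mag:.2f} -> S21 = {s21_db(mag):6.2f} dB, "
              f"insertion loss = {insertion_loss_db(mag):5.2f} dB")
```

If "lower S21" is read as a more negative S21 in dB, it corresponds to a higher insertion loss, which would be consistent with the narrower minimum line width of the LSI-1 metal scheme.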

## C. Reliability of CoWoS-L Package

To verify CoWoS-L reliability, four daisy chain types ($\mu$-bump, TSV, TIV and C4) were designed to investigate structural integrity, as shown in Fig. 10. The $\mu$-bump daisy chain connected up to 100 $\mu$-bumps. The TSV daisy chain connected hundreds of TSVs and was designed to analyze the LSI interconnection. The TIV daisy chain connected from the C4 bumps to the interposer front-side RDL. The C4 daisy chain was located at the die corner, where stress is higher during reliability testing, to evaluate C4 joint quality. The electrical measurement results of the CoWoS-L package are shown in Fig. 11. The deviation between packages is small, which indicates excellent electrical properties and a robust integration scheme.



Fig. 10. Daisy chain design introduction (a) $\mu$-bump chain (b) TSV chain (c) TIV chain (d) C4 chain

The component level reliability test of CoWoS-L follows JEDEC standards. Moisture sensitivity level 4 (MSL4) preconditioning was applied first, followed by a thermal cycle test (TCG) from -40 °C to 125 °C for 1500 cycles, unbiased highly accelerated stress testing (u-HAST) at 110 °C and 85% RH for 264 hours, and high-temperature storage (HTS) at 150 °C for 1500 hours. No significant change in resistance was observed after the reliability tests, as shown in Fig. 12. Despite the large interposer size (~2500 mm<sup>2</sup>), the CoWoS-L structure passed all JEDEC qualification items. The innovative interposer structure, consisting of LSI dies and molding compound, acts as a stress buffer that mitigates the stress from the CTE mismatch between the substrate and the Si top dies.
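A daisy chain is typically judged by how much its resistance shifts between the initial (T0) reading and the post-stress reading. The sketch below records the stress conditions from the text and computes that shift. The 10% pass limit and the example readings are hypothetical, since the paper reports only that no significant change was observed.

```python
# Stress conditions as listed in the text.
STRESS_CONDITIONS = {
    "TCG": {"temp_c": (-40, 125), "cycles": 1500},
    "u-HAST": {"temp_c": 110, "rh_pct": 85, "hours": 264},
    "HTS": {"temp_c": 150, "hours": 1500},
}


def resistance_shift_pct(r_before_ohm: float, r_after_ohm: float) -> float:
    """Relative change of daisy chain resistance after stress, in percent."""
    if r_before_ohm <= 0.0:
        raise ValueError("initial resistance must be positive")
    return 100.0 * (r_after_ohm - r_before_ohm) / r_before_ohm


def chain_passes(r_before_ohm: float, r_after_ohm: float,
                 limit_pct: float = 10.0) -> bool:
    """Pass if the absolute shift stays within limit_pct.

    The default limit is a hypothetical placeholder, not a value from the paper.
    """
    return abs(resistance_shift_pct(r_before_ohm, r_after_ohm)) <= limit_pct


if __name__ == "__main__":
    # Hypothetical readings for illustration only.
    t0_ohm, post_ohm = 100.0, 102.0
    print(f"shift: {resistance_shift_pct(t0_ohm, post_ohm):.1f} %")
    print("pass:", chain_passes(t0_ohm, post_ohm))
```

An open joint anywhere in the chain shows up as a very large resistance increase, which is why a series chain of many joints is a sensitive monitor for each interconnect type.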



Fig. 11. T0 electrical result (a) $\mu$-bump chain (b) TSV chain (c) TIV chain (d) C4 chain



Fig. 12. Post-reliability electrical result (a) $\mu$-bump chain (b) TSV chain (c) TIV chain (d) C4 chain

# IV. CONCLUSION

In this paper, CoWoS-L with a reconstituted interposer area of up to 2500 mm<sup>2</sup> has been successfully developed and demonstrated. CoWoS-L, as a member of the CoWoS family, provides a novel structure to meet the continued scaling requirements of high-end products. The integration of LSI-1 and LSI-2 provides the design flexibility for superior SoC-to-SoC and SoC-to-HBM interconnect in one package. The TIV also benefits signal and power integrity for ultra-high speed data transmission by avoiding the insertion loss of a path through TSVs. eDTC utilization becomes more efficient thanks to its "small die" advantage under the same defect density of the wafer-fab manufacturing process. The reliability test results indicate that CoWoS-L is robust and manufacturable. In conclusion, CoWoS-L exhibits strong heterogeneous integration capability to meet the growing demands of the HPC and AI fields.

## ACKNOWLEDGMENT

The authors would like to acknowledge the great collaboration and support from our R&D colleagues at TSMC, especially from the 3DIC Module-2 division and the 3DIC Integration division.

## REFERENCES

- [1] S. Y. Hou et al., "Wafer-Level Integration of an Advanced Logic-Memory System Through the Second-Generation CoWoS Technology," *IEEE Transactions on Electron Devices*, Vol. 64, No. 10, pp. 4071-4077, Oct. 2017
- [2] M. F. Chen et al., "System on Integrated Chips (SoIC<sup>TM</sup>) for 3D Heterogeneous Integration," in Proc. IEEE ECTC, 2019, pp. 594-599
- [3] C. F. Tseng et al., "InFO (Wafer Level Integrated Fan-out) Technology," in Proc. ECTC, 2014, pp. 1-6
- [4] Kevin Lepak et al., "The next generation AMD Enterprise Server Product Architecture," IEEE HOT Chips Symposium, 2017
- [5] P. K. Huang et al., "Wafer Level System Integration of the Fifth Generation CoWoS-S with Performance Si Interposer at 2500mm<sup>2</sup>," in Proc. ECTC, 2021, pp. 101-104
- [6] M. Guan et al., "Enhanced Stitching for the Fabrication of Photonic Structures by Electron Beam Lithography," J. Vac. Sci. Technol. B 25, 2007, pp. 2034-2037
- [7] S. Y. Hou et al., "Integrated Deep Trench Capacitor in Si Interposer for CoWoS Heterogeneous Integration," in IEDM Tech. Dig., Dec. 2019, pp. 462-465
- [8] W. T. Chen et al., "Design and analysis of logic-HBM2E power delivery system on CoWoS platform with deep trench capacitor," in Proc. ECTC, 2020, pp. 380-385
- [9] W. S. Liao et al., "A manufacturable Interposer MIM Decoupling Capacitor with Robust Thin High-k Dielectric for Heterogeneous 3D IC CoWoS Wafer Level System Integration," in 2014 IEEE International Electron Devices Meeting (IEDM), San Francisco, CA, 2014, pp. 27.3.1-27.3.4