Demand for semiconductors has exploded since 2010, and despite a downturn in 2023, the semiconductor industry is trending toward recovery and long-term growth to more than $1 trillion in revenue by 2030. As in previous market cycles, some segments of the semiconductor industry have grown while others have contracted. In this market cycle, demand for chips to power smartphones and personal computers has dropped, and demand for chips used for AI and in autonomous and electric vehicles has surged. Meanwhile, semiconductor fabs are handling a wider and more complex variety of chips and designs on their production lines.
Rapidly changing market dynamics and product demand create a whiplash effect for the industry, forcing it to oscillate between the paradigms of cost reduction and throughput maximization. This oscillation complicates the ability of semiconductor fab leadership to plan strategic goals (such as wafer shipments) and tactical daily targets (such as equipment utilization). Independent of the paradigm and pertinent to most goals, our experience has been that performance improvements with existing tooling and manufacturing footprints are often faster, more cost-effective, and more sustainable than commissioning and decommissioning tools and manufacturing expansions. These performance improvements and goal-setting exercises are heavily dependent on a single source of truth that reflects the reality of the fab floor, which can be achieved by implementing in-house, transparent, and top-down analytics, thereby optimizing the potential value of the fab.
This article will discuss the three differentiators of semiconductor fabrication and describe three select analytical approaches—variance curves, saturation curves, and empirical bottleneck identification—that can provide fab leaders significant insight into their operations, enabling them to optimize their manufacturing processes regardless of where they sit on the spectrum between cost optimization and throughput improvement.
Differentiators in semiconductor manufacturing
Semiconductor fabrication is one of the most complex and sophisticated processes in manufacturing. Semiconductor manufacturing stands out for its unique demands that require precision down to the nanometer, atomic ordering, and high chemical purity. Accomplishing this technical feat on the scale of thousands of wafers daily adds to its impressive and daunting nature.
Three differentiators make manufacturing semiconductors at scale exceptionally challenging.
Semiconductor manufacturing is iterative. As the wafer moves along each stage of the manufacturing process, it revisits the same piece of equipment multiple times. This iterative process means that any disruption at any one piece of equipment will result in consequences at multiple steps of the manufacturing process.
Semiconductor fabs are large and complex operations. They contain hundreds of linear steps and thousands of iteratively used equipment chambers, each with their own controllers and data streams. Managing each piece of equipment requires efficient and data-driven teams.
Semiconductor fabs can have a combination of high-volume and high-mix products. High-volume product demand is becoming more apparent as the number of semiconductor-enabled electronic devices and vehicles increases. High-mix product demand is becoming more pronounced as the number of technologies and devices that require semiconductors—such as those related to the Internet of Things (IoT), the energy transition, AI, cloud computing, electric vehicles, and wearables—increases. Having a high mix of different products requires multiple parameters and processes to be programmed and performed on each tool. It also requires intricate coordination between planning, engineering, operations, and equipment teams. If this high mix is not coordinated and executed properly, a high volume of backlog behind each of the hundreds of tools and steps can occur.
The following three analytical frameworks are core to understanding holistic performance, are often overlooked, and can be implemented rapidly. Additionally, these frameworks can be leveraged at multiple levels (for instance, at the fab, subsegment, or equipment level) to clearly identify improvement opportunities, set goals, and implement change in the chain of operations. These frameworks help simplify the complexity of operations and provide guidance for diving deeper into the fab, ultimately implementing solutions and improving fab performance either for cost or throughput. We present these frameworks in a hypothetical scenario of a fab with ongoing production issues, framed by the key questions fab management may be asking at any point along the fab’s improvement journey.
Variance curves to deliver rapid insights on overall fab performance, equipment health, and line variance
Fab leaders frequently grapple with assessing the performance of a fab over time and against other fabs, whether portfolio peers or industry benchmarks. Variance curves (also known as alpha or frontier curves) allow a direct comparison of current performance against historical benchmarks and industry standards by charting capacity utilization versus normalized cycle time (in other words, how long an operation takes compared with the theoretical minimum). This approach generates precise insights on operations variance, which can be used to identify the specific points where the fab began to deviate from peak performance, which tool groups and areas of the fab are driving the variance compared with the ideal (or previously demonstrated) state, and whether the trade-offs between equipment utilization and product cycle time are justifiable. Minimizing variance is always the goal, whether the fab is in a cost or a throughput regime, so employing and quantifying variance curves can improve the interpretation of more-traditional performance metrics (for example, shipments, work in progress [WIP], and cycle times).
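To make the two axes concrete, the following minimal Python sketch computes normalized cycle time and a single variance coefficient for each period. It assumes the commonly used operating-curve form X = 1 + alpha * u / (1 - u), where X is normalized cycle time, u is capacity utilization, and alpha summarizes variance. The curve form, the function names, and the sample figures are illustrative assumptions, not values from the exhibits.

```python
def normalized_cycle_time(actual_ct_days: float, theoretical_ct_days: float) -> float:
    """Normalized cycle time: actual cycle time divided by the theoretical minimum."""
    if theoretical_ct_days <= 0:
        raise ValueError("theoretical cycle time must be positive")
    return actual_ct_days / theoretical_ct_days


def variance_coefficient(x_factor: float, utilization: float) -> float:
    """Solve X = 1 + alpha * u / (1 - u) for alpha (assumed curve form)."""
    if not 0 < utilization < 1:
        raise ValueError("utilization must be strictly between 0 and 1")
    return (x_factor - 1.0) * (1.0 - utilization) / utilization


# Hypothetical snapshots: (utilization, actual CT in days, theoretical CT in days)
snapshots = {
    "year 1": (0.80, 30.0, 10.0),
    "year 3": (0.75, 45.0, 10.0),
}

alphas = {}
for year, (u, actual, theoretical) in snapshots.items():
    x = normalized_cycle_time(actual, theoretical)
    alphas[year] = variance_coefficient(x, u)
    print(f"{year}: X = {x:.2f}, utilization = {u:.0%}, alpha = {alphas[year]:.2f}")
```

In this hypothetical data, the coefficient rises from about 0.5 to about 1.17 even though utilization fell, which would suggest the fab has moved onto a worse curve rather than along the same one.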
Exhibit 1 shows the telltale signs of a fab losing operational control and increasing variance over a five-year period. Stable operations in the first two years rapidly give way to increased mix and demand changes in year three. Management attempts to compensate by starting additional WIP to drive additional moves and outs. This increased mix and WIP leads to increased downtime from conversions and clogs in the line, which result in a rapid escalation of cycle time, utilization reductions, and increased variance. Ultimately, this change in demand creates significantly worse outcomes for the fab: for example, product takes longer to reach customers, and capacity becomes harder to plan and predict. Although improving tool performance helps solve aspects of this line balance problem, the key to solving it is counterintuitive. The fab reduces starts and overall WIP to the levels of years one and two, batch produces wafers to combat the conversion complications, and builds inventory buffer stock at the back end of line (to reduce shipment variance). With these changes, the fab is able to return to much more stable operation over the course of year five.
By analytically comparing historical performance or industry and portfolio peers, fab leaders can quickly identify decisions to solve manufacturing problems before they balloon. Fabs that have addressed these priority levers and implemented variance control methods significantly improve their utilization, cycle time, and on-time delivery. After setting and adhering to line balance and variance targets guided by variance curves, some fabs have been able to increase on-time delivery and decrease shipment variance by more than 70 percent.
Saturation curves to optimize WIP and throughput levels
As previously discussed, we used variance curve analytics to examine an instance in which a fab’s performance deteriorated due to a large increase in starts, WIP, and product mix. We saw that one key solution to reducing variance and improving performance was to decrease starts and WIP to an optimal level. Although this optimal WIP level can be calculated and modeled using precise values of tool availability, running times, and changeover frequency and duration, these values may not always be consistent with the past, present, or future conditions on the fab floor. Here, we explain how saturation curves can empirically define WIP targets.
When seeking to increase production overall or on a single tool, the tendency is often to increase WIP levels in the hope of raising throughput. This relationship holds until a bottleneck occurs or a saturation point is reached, after which additional WIP at best merely increases cycle time and at worst decreases output as the line becomes clogged with excess WIP. In many fabs we have examined, we see a tendency to increase starts to make up for fewer outs, an approach that often does not translate into higher output.
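The cycle-time penalty of excess WIP follows from Little's Law, which states that for a line in steady state, average cycle time equals average WIP divided by average throughput. The sketch below uses hypothetical numbers to show that once throughput is capped, adding WIP only lengthens cycle time.

```python
def cycle_time_days(wip_wafers: float, throughput_wafers_per_day: float) -> float:
    """Little's Law for a line in steady state: cycle time = WIP / throughput."""
    if throughput_wafers_per_day <= 0:
        raise ValueError("throughput must be positive")
    return wip_wafers / throughput_wafers_per_day


# Hypothetical line whose bottleneck caps output at 1,000 wafers per day.
CAPACITY_WAFERS_PER_DAY = 1000.0

ct_at_saturation = cycle_time_days(30_000, CAPACITY_WAFERS_PER_DAY)
ct_overloaded = cycle_time_days(45_000, CAPACITY_WAFERS_PER_DAY)

print(ct_at_saturation)  # 30.0
print(ct_overloaded)     # 45.0
```

With output fixed at the assumed cap, a 50 percent increase in WIP produces a 50 percent longer cycle time and no additional wafers out.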
With standardized data inputs, saturation curves that compare WIP to throughput help quickly identify the ideal levels of inventory WIP to optimize throughput and ways to reduce output variance in established processes. By examining historical performance data, fab leaders can visualize the throughput saturation point and determine the level of control for the fab process.
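One way to read a saturation point from historical data is sketched below: group daily observations into WIP bins, average the throughput in each bin, and report the lowest WIP bin that reaches a chosen share of the best observed throughput. The bin width, the 95 percent threshold, and the sample history are illustrative assumptions, and real data would need the standardized inputs mentioned above.

```python
from collections import defaultdict
from statistics import mean


def saturation_wip(observations, bin_width=1000, threshold=0.95):
    """Return (bin_edge, mean_by_bin), where bin_edge is the lower edge of the
    lowest WIP bin whose mean throughput is at least `threshold` times the
    best bin's mean throughput.

    observations: iterable of (wip_wafers, throughput_wafers_per_day) pairs.
    """
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be in (0, 1]")
    bins = defaultdict(list)
    for wip, throughput in observations:
        bins[int(wip // bin_width) * bin_width].append(throughput)
    if not bins:
        raise ValueError("no observations")

    mean_by_bin = {edge: mean(values) for edge, values in bins.items()}
    best = max(mean_by_bin.values())
    # The best bin always qualifies, so this loop always returns.
    for edge in sorted(mean_by_bin):
        if mean_by_bin[edge] >= threshold * best:
            return edge, mean_by_bin


# Hypothetical daily history: throughput rises with WIP, plateaus, then chokes.
history = [
    (1200, 310), (1800, 330),
    (2300, 520), (2700, 540),
    (3200, 690), (3600, 700),
    (4100, 705), (4800, 700),
    (5300, 660), (5900, 640),
]

edge, curve = saturation_wip(history)
print(edge)  # 3000
```

In this made-up history, throughput is already within 5 percent of its best level in the 3,000-to-4,000 wafer bin, so WIP beyond that range adds little output and, above 5,000 wafers, starts to reduce it.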
The saturation curve has helped some semiconductor fabs navigate quantifying ideal WIP quantities at all levels of their fabs. Exhibit 2 highlights the regimes that become quickly apparent for how a fab (or equipment group or tool) is operating, including the following:
- stable at or near the optimal production point
- highly glutted with WIP, affecting cycle times and potentially choking throughput
- oversupplied with WIP and with underperforming tools
- undersaturated with WIP; provides an opportunity to increase WIP levels to drive additional throughput or to idle or reduce tool count (and further reduce fixed costs) to maintain the same output with fewer tools
- expanded to a new performance frontier due to the addition of incremental tooling, increased performance, or capacity
This same set of events can also be triggered by increasing the mix and complexity of the devices made in the fab. As the complexity increases and more changeovers or conversions and WIP management become necessary, the position on the curve of output compared with WIP can change dramatically. Thus, improvement toward the optimal point could be due to a nonexhaustive combination of factors: rebalancing the wafer volume along the manufacturing process; optimizing the product start mix and batch sequencing (that is, increasing train size) to run larger batches, thereby minimizing setup and changeover losses based on the fab’s optimized start plan; and increasing equipment availability. Alterations in train size would likely necessitate buffer stock builds initially (or delivery date adjustments if approved by the customer) to ensure there are no lapses in shipment, but they could significantly improve output performance in high-mix fabs.
By setting target WIP levels, continuously adjusting inventory, maintaining or improving tool performance, and managing train size with capacity-backed optimized start-up sequencing, it is possible to steadily improve both inventory and cycle time while maintaining target shipment values. Some fabs were able to decrease WIP levels by 25 percent while maintaining stable monthly shipments over a 12-month transformation period because of their data-driven goals, enabled by saturation curves. Based on this success, fab leadership could consider options to decrease WIP levels further, reducing cycle time and returning the fab to optimal levels.
Empirical equipment analytics to identify true bottleneck tools and direct actions to improve fab performance
In the prior section, we used WIP saturation analytics to set target WIP levels, which helped rebalance equipment inventories and manufacturing lines. In conjunction with adjusting WIP levels, the quantity and performance of tools must also be considered. For example, if WIP levels remain high and saturated but throughput is low at a particular tool group, then this tool group needs some combination of increased availability, utilization, or tool quantity to improve line balance. In contrast, if WIP levels remain low and the onset of saturation is never observed, then it may be possible to take a tool offline to save on operational expenditures without negatively affecting line balance. Here, we explain how empirical equipment analytics can be used to identify true bottleneck tools and improve fab performance in both cost and growth regimes.
Frequently, in long-standing operations, fab leaders rely on institutional knowledge, whether that is historical capacity models, design philosophies (for example, photo is always the bottleneck), or their experience to create lists of fab production bottlenecks. However, as processes change and mature over time or as product mix changes, this institutional knowledge can quickly become stale and outdated. Similarly, for relatively new fabs, relying only on design philosophies and capacity models may not fully capture production realities such as complex queuing schemes and the human element. Both scenarios present a risk that operations personnel focus on maintaining or maximizing performance of tools that are not truly bottlenecks.
Alternatively, fab leaders could use real-time data to identify areas or processes within the fab that act as bottlenecks and restrict the overall fab capacity to meet customer demand. This empirical approach is both data-driven and easily analyzed over multiple time periods (for example, weeks or years, to determine if a tool is a transient or structural bottleneck). This approach can even pinpoint which equipment is causing delays in on-time deliveries (for example, by accumulating WIP, cycle time, and variance) and enable strategic allocation of resources to enhance output based on whether the tool is a bottleneck.
Exhibit 3 shows an empirical determination of bottlenecks, plotting weighted cycle time against WIP level variation. Tools located beyond the mean WIP weighted cycle time indicate some form of bottleneck, either transient (appearing at times due to maintenance events or mix changes) or structural (ever-present in gating WIP movement throughout the fab). Once identified, these bottlenecks can be appropriately categorized and managed. In growth regimes, fabs can focus on improving the availability and utilization of structural bottleneck tools and develop action plans for transient bottlenecks based on their occurrence criteria, with minimal or no capital investment to drive potentially significant increases in capacity and throughput.
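A simplified sketch of this classification follows. It flags a tool group in a given period when its WIP-weighted cycle time exceeds the mean across tool groups, then labels the group structural if it is flagged in most periods and transient if it is flagged only occasionally. The sketch uses a single axis rather than the two-dimensional view in Exhibit 3, and the 75 percent cutoff, tool group names, and data are illustrative assumptions.

```python
from statistics import mean


def classify_bottlenecks(weighted_ct_by_period, structural_share=0.75):
    """Label each tool group as 'structural', 'transient', or 'none'.

    weighted_ct_by_period: list of dicts, one per period, mapping
    tool group -> WIP-weighted cycle time for that period.
    Assumes every tool group appears in every period.
    """
    if not weighted_ct_by_period:
        raise ValueError("need at least one period")

    flagged_counts = {}
    for period in weighted_ct_by_period:
        period_mean = mean(period.values())
        for tool_group, weighted_ct in period.items():
            flagged_counts.setdefault(tool_group, 0)
            if weighted_ct > period_mean:
                flagged_counts[tool_group] += 1

    n_periods = len(weighted_ct_by_period)
    labels = {}
    for tool_group, count in flagged_counts.items():
        share = count / n_periods
        if share >= structural_share:
            labels[tool_group] = "structural"
        elif share > 0:
            labels[tool_group] = "transient"
        else:
            labels[tool_group] = "none"
    return labels


# Hypothetical weekly WIP-weighted cycle times (hours) per tool group.
weeks = [
    {"photo": 9.0, "etch": 4.0, "implant": 3.0, "cmp": 4.0},
    {"photo": 8.0, "etch": 7.0, "implant": 3.0, "cmp": 2.0},
    {"photo": 9.5, "etch": 3.5, "implant": 3.0, "cmp": 4.0},
    {"photo": 8.5, "etch": 4.5, "implant": 3.0, "cmp": 4.0},
]

labels = classify_bottlenecks(weeks)
print(labels)
```

In this made-up data, the photo group exceeds the mean in all four weeks and is labeled structural, while etch exceeds it only in week two and is labeled transient. Running the same check over longer windows is one way to separate the two cases.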
Conversely, in a cost-constrained regime, fab leaders could consider performing a similar set of actions, but instead of maximizing throughput, they could consolidate to fewer tools and reduce the fixed costs associated with the bottleneck tool groups. Furthermore, a cost-constrained environment offers the additional opportunity to consolidate or reduce operations of more-complex tool groups outside of the bottleneck categories to improve overall margins without affecting overall fab capacity. For example, a fab may have a tool group consisting of ten tools, each processing 500 wafers per day. By increasing the production capacity of each tool to 556 wafers per day (for example, by reducing downtime and idle events), the fab can achieve the same aggregate output using just nine tools. This allows the fab to turn off one tool, consolidate the workflow to the remaining nine tools, and reduce costs while maintaining the same overall production.
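The arithmetic behind this example can be checked with a short sketch. The function name is hypothetical, and it assumes output is spread evenly across the remaining tools.

```python
import math


def required_rate_per_tool(tools_before: int, rate_per_tool: int, tools_after: int) -> int:
    """Wafers per day each remaining tool must process to hold total output."""
    if tools_after <= 0:
        raise ValueError("tools_after must be positive")
    total_output = tools_before * rate_per_tool
    return math.ceil(total_output / tools_after)


total = 10 * 500                              # 5,000 wafers per day
needed = required_rate_per_tool(10, 500, 9)   # 556 wafers per day per tool
uplift = needed / 500 - 1                     # about 11 percent per tool

print(total, needed, f"{uplift:.1%}")  # 5000 556 11.2%
```

Each of the nine remaining tools therefore needs roughly 11 percent more daily output to cover the tenth tool's share.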
By properly identifying these bottleneck tools, fab leaders can focus on the tools that are truly limiting their fab capacity and identify which tools or tool groups may be further optimized for margins without a significant effect on fab capacity. Many semiconductor fab leaders have initiated root-cause-analysis workshops and standardized downtime action plans to ensure bottleneck equipment remains up and available more frequently, mitigating the accumulation of WIP and cycle time for any given tool group. Fabs that have employed these analytic approaches and solutions have seen up to a 30 percent increase in structural bottleneck tool group availability and a roughly 60 percent decrease in WIP sustained for extended periods of time. In these specific examples, fabs were seeking to increase throughput rather than optimize cost performance. In the same scenario, under a different economic or demand regime, a roughly 30 percent increase in bottleneck tool availability could translate to a corresponding reduction in total tools in operation for the same tool group, enabling a fixed cost reduction for this area of the fab.
Given the rapid paradigm shifts in the semiconductor industry, fab leaders are increasingly on the front line to drive value for their companies and are embracing tools that increase the visibility and efficiency of their operations to maximize value. The above analytical frameworks, cascading KPIs, and streamlined or transparent analytics are not novel—but they are critical for improving fab performance. Fab leaders can use advanced analytical frameworks as the scaffolding for continuous improvements, no matter the economic and demand conditions surrounding the fab and company at large.