Commonality Analysis

What is Commonality Analysis?

Commonality Analysis (CA) refers to a set of techniques used to identify SYSTEMATIC causes of yield loss. Numerous techniques exist, but the most common include:

  • Association Rules
  • Analysis of Variance (ANOVA)
  • Contingency Tables
  • Other

These “data mining” techniques can be prone to FALSE POSITIVES (identifying a variable that is not a cause) and FALSE NEGATIVES (exonerating a variable that is a cause). Despite this weakness, they are often used to narrow down the possibilities, giving engineers a starting point for AD-HOC ANALYSIS and design of experiments (DOE) to identify and correct a problem.
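As an illustration of the contingency-table approach, the minimal sketch below (Python with scipy assumed; the tester names and counts are purely hypothetical) screens pass/fail counts from two testers with a chi-square test:

    # Minimal contingency-table screen: do two hypothetical testers differ
    # systematically in their pass/fail counts? (Counts are illustrative.)
    from scipy.stats import chi2_contingency

    observed = [
        [4600, 400],   # ATE_A: [pass, fail]
        [4400, 600],   # ATE_B: [pass, fail]
    ]

    chi2, p, dof, expected = chi2_contingency(observed)
    print(f"chi2 = {chi2:.1f}, p = {p:.2e}, dof = {dof}")
    # A small p-value only FLAGS the tester association for follow-up;
    # as noted above, such flags can be false positives.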

“Equipment Commonality” is used extensively in wafer fabs to identify equipment, or periods of time, when equipment is systematically performing below expectation. ANOVA analytics for different tools and layers are run automatically at regular time or lot intervals in modern wafer fabs as part of Statistical Process Control (SPC) monitoring. For fabless IC design or Outsourced Assembly and Test (OSAT) firms, commonality techniques can be used to identify photo masks (reticles), test programs, ATE, probe cards, time periods, and other test and assembly variables that are systematically performing below expectation.

Data Requirements for Commonality Analysis

Commonality Analysis raises issues of data availability, quality (cleansing), and organization. Accurate records of how products traverse the manufacturing and test environment must be available. This data is referred to as “Genealogy” (how “Lots” or “batches” of material are split and combined) and Lot “History” (the tool associations for a lot, wafer, die, or device). Any aggregate test data must use consistent genealogy and history data when making calculations and drawing conclusions.

This data is usually stored in multiple systems, such as the Manufacturing Execution System (MES), engineering Test Programs, and even human-recorded data (e.g. spreadsheets).

Genealogy and History data are used to identify “Associations” common to the analyzed material. Examples of Associations used in CA for Wafer Fab, Test, and Assembly are listed below, by process area.
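First, a minimal sketch (hypothetical column names and pandas assumed; real MES and test-program exports will differ) of joining die-level test records to lot history so each die carries its Association attributes:

    import pandas as pd

    # Hypothetical die-level test results and lot-history records.
    tests = pd.DataFrame({
        "lot_id": ["L1", "L1", "L2", "L2"],
        "die_id": [1, 2, 1, 2],
        "passed": [True, False, True, True],
    })
    history = pd.DataFrame({
        "lot_id":    ["L1", "L2"],
        "equipment": ["ETCH_03", "ETCH_07"],
        "recipe":    ["R12", "R12"],
    })

    # Attach genealogy/history so each die record carries its Associations.
    associated = tests.merge(history, on="lot_id", how="left")
    print(associated)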

Wafer Manufacturing Process

  • Facility
  • Equipment (id, recipe, operation, time period)
  • Process (technology, feature size, layers)
  • Product (mask / Reticle)
  • Lot & Sub-Lot (genealogy, equipment sequence)
  • Wafer (WAT / eTest)
[Image: Photo Courtesy of TSMC]

Test Process

  • Facility
  • Equipment (ATE, probe card, handler, time period)
  • Process (test program version)
  • Product (mask / Reticle)
  • Lot & Sub-Lot (genealogy, equipment sequence)
  • Wafer
  • Die (circuit test)
[Image: Photo Courtesy of StatsChipPAC]

Assembly Process

  • Facility
  • Equipment (encapsulation, bump, bonder, time period)
  • Process (metallurgy, materials)
  • Product (die, package)
  • Lot & Sub-Lot (genealogy, equipment sequence)
  • Packaged Device or KGD

[Image: Photo Courtesy of Rohm]

Association Rules

Association rules are a good starting point to find discretely valued problems (e.g. GOOD or BAD parts) and can be used to analyze large amounts of data.

To demonstrate how association rules are developed, we’ll analyze which reticle in a 4-reticle mask set is a probable cause of yield loss. The problem would be set up the same way if looking for wafers within a wafer lot, time intervals on equipment, or zones of a wafer.

First some definitions:

r = reticle location being analyzed
Nb = total number of bad die
Nbr = number of bad die from reticle location r
Nr = total number of die from reticle location r
Sr = Nbr / Nb = conditional Support of reticle r
Cr = Nbr / Nr = Confidence of reticle r
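These two ratios translate directly into code; a minimal sketch:

    def support_confidence(n_bad_r, n_bad, n_r):
        """Return (Sr, Cr) for one reticle location r.

        Sr = Nbr / Nb: share of ALL bad die that came from reticle r.
        Cr = Nbr / Nr: fraction of reticle r's own die that are bad.
        """
        return n_bad_r / n_bad, n_bad_r / n_r

    # Example: 700 bad die from reticle 4, 2000 bad die overall, 2500 die
    # per reticle (the numbers used for r4 in the table below).
    s4, c4 = support_confidence(700, 2000, 2500)
    print(s4, c4)  # 0.35 0.28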

The table below calculates Sr and Cr for 10,000 die with a 20% yield loss. It assumes a 4-reticle mask set and that each reticle accounts for ¼ of the 10,000-die population.

r       #good   #bad   Total    S      C      S*C
1       2070    430    2500     0.22   0.17   0.037
2       2200    300    2500     0.15   0.12   0.018
3       1930    570    2500     0.29   0.23   0.065
4       1800    700    2500     0.35   0.28   0.098
Total   8000    2000   10000    1.00

The purpose of calculating Support and Confidence is to RANK and PARETO the possibilities. It is common to rank by S * C. Here we see that 35% of the yield loss comes from r4, but with only a 28% confidence that r4 is bad.
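A minimal sketch of that ranking, reproducing the table above (values match after rounding):

    # Bad-die counts per reticle from the table above; Nb = 2000, Nr = 2500.
    bad = {1: 430, 2: 300, 3: 570, 4: 700}
    n_bad, n_r = sum(bad.values()), 2500

    scores = []
    for r, n_bad_r in bad.items():
        s, c = n_bad_r / n_bad, n_bad_r / n_r
        scores.append((s * c, s, c, r))

    # Pareto: rank by S*C, largest first -- r4 comes out on top.
    for sc, s, c, r in sorted(scores, reverse=True):
        print(f"r{r}: S={s:.3f} C={c:.3f} S*C={sc:.3f}")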

Since C4 = .28 << .5, there is not a strong likelihood of a reticle problem. An engineer would compare this to other Commonality Analyses such as lot, time period, probe card, wafer zone, ATE, etc. The comparison can be done by creating a “Pareto Frontier”: plot C = F(S) and look for deviations from the line.

[Figure: Pareto Frontier plot of C = F(S)]
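A minimal sketch of producing such a plot (matplotlib assumed), using the (S, C) pairs from the table above:

    import matplotlib.pyplot as plt

    # (S, C) pairs per reticle from the 20%-loss table above.
    s = [0.22, 0.15, 0.29, 0.35]
    c = [0.17, 0.12, 0.23, 0.28]

    plt.scatter(s, c)
    for i, (x, y) in enumerate(zip(s, c), start=1):
        plt.annotate(f"r{i}", (x, y))
    plt.xlabel("Support S")
    plt.ylabel("Confidence C")
    plt.title("C = F(S): points far off the common trend are suspects")
    plt.show()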

Now look at what happens when the yield loss increases to 40% with the same share of bad die coming from r4 (S4 unchanged at 0.35).

r       #good   #bad   Total    S      C      S*C
1       1675    825    2500     0.21   0.33   0.068
2       1550    950    2500     0.24   0.38   0.090
3       1675    825    2500     0.21   0.33   0.068
4       1100    1400   2500     0.35   0.56   0.196
Total   6000    4000   10000    1.00

Here C4 = .56, which is > .5; if a bad reticle is the ONLY possible cause of the yield loss, then r4 is the most likely suspect. (If all r4 die were bad, then S4 = 2500/4000 = .625 and C4 = 2500/2500 = 1; if all reticles were identical, S1-4 = 1000/4000 = .25 and C1-4 = 1000/2500 = .4.)
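A quick arithmetic check of those three cases (a minimal sketch):

    # 40% loss case: Nb = 4000 bad die, Nr = 2500 die per reticle.
    n_bad, n_r = 4000, 2500
    print(1400 / n_bad, 1400 / n_r)   # observed r4:        S4 = 0.35, C4 = 0.56
    print(2500 / n_bad, 2500 / n_r)   # all r4 die bad:     S4 = 0.625, C4 = 1.0
    print(1000 / n_bad, 1000 / n_r)   # identical reticles: S = 0.25, C = 0.4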

In summary, Association Rules are very effective for identifying possible causes of discretely valued problems. CA of parametric variations is better handled with ANOVA statistics to identify possible sources.

ANOVA

If the yield loss is being caused by a parameter falling outside of specification limits, then ANOVA is the most frequently used technique for CA. It decomposes the VARIANCE of the parameter into a linear combination of contributions from different sources.

In semiconductors, the parameter’s MEAN and VARIANCE (MEANVAR) are decomposed into Genealogy and History subgroup MEANVARs. Many good tutorials and resources on ANOVA are available online.


ANOVA should be used for CA when (see the sketch after this list):

  • A parametric (vs. discrete) variation is causing loss
  • Measurement data of the parameter is available and can be aggregated
    1. Overall (e.g. all wafers in a lot)
    2. For each possible problem source (e.g. each wafer in the lot, or each machine that tested a portion of the lot etc.)
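Putting those criteria into practice, here is a minimal one-way ANOVA sketch (scipy assumed; the parameter values and tester groupings are purely illustrative) for a parametric measurement grouped by the hypothetical testers that each measured part of a lot:

    from scipy.stats import f_oneway

    # Illustrative measurements of one parameter, grouped by tester.
    ate_a = [1.02, 0.98, 1.01, 0.99, 1.03]
    ate_b = [1.00, 1.01, 0.97, 1.02, 0.98]
    ate_c = [1.10, 1.12, 1.08, 1.11, 1.09]   # shifted group: suspect tester

    f_stat, p_value = f_oneway(ate_a, ate_b, ate_c)
    print(f"F = {f_stat:.1f}, p = {p_value:.2e}")
    # A small p-value means between-tester variance dominates within-tester
    # variance, flagging a systematic source worth investigating.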

In summary, Commonality Analysis is used to isolate the cause of unacceptable variations in output. These variations can be discrete (e.g. good/bad yield) or parametric (e.g. maximum frequency). Association Rules are commonly used for discrete outputs, and ANOVA is commonly used for parametric outputs.

For further reading on ANOVA: http://en.wikipedia.org/wiki/Analysis_of_variance

Copyright 2023 yieldWerx. All Rights Reserved.