Ultimate Guide to Outlier Detection Using Part Average Testing
What is Part Average Testing (PAT)?
Hidden reliability issues in ICs often remain undetected during production, only to appear once the chip is in use, potentially leading to costly system disruptions.
What if you could harness the massive data from semiconductor testers to uncover insights that help you catch issues early—preventing costly retests and failed dies down the line?
The semiconductor industry addresses this challenge through Part Average Testing (PAT), a powerful outlier detection method that screens each die and sets limits across multiple test parameters. Any die outside these statistically determined limits is flagged as an “outlier,” allowing manufacturers to catch defects early and maintain higher yield quality.
Each IC design and its unique processing flow create distinct test result statistical distributions which become the foundation for setting PAT limits. Automotive engineers have applied PAT to electrical test measurements for nearly three decades, making it a mainstay in achieving rigorous standards.
In this blog, you’ll learn the foundational aspects of PAT, examine its different types, and explore the advanced capabilities of the yieldWerx PAT module. We base this discussion on the Automotive Electronics Council’s (AEC) guidelines, the globally accepted standard for Part Average Testing. The principles described in these guidelines apply to both packaged and unpackaged die.
What are the benefits of PAT?
The demand for quality, reliability, and transparency in the semiconductor industry is rising sharply, with zero tolerance for defects becoming the standard—especially in sectors like automotive.
- By providing early feedback on outlier components, PAT helps prevent quality incidents before they escalate, significantly reducing expenses related to customer support and failure analysis.
- This proactive approach leads to a notable decrease in Return Material Authorizations (RMAs) and failure analysis requests from the field.
- PAT contributes to packaging cost savings at the wafer level, as only parts that meet quality standards proceed through more expensive stages of production.
- Automatic bin adjustments within PAT allow for real-time quality indications, providing ongoing assurance of product reliability.
- Through outlier limit optimization, PAT reinforces high-quality standards, ensuring that customers only receive reliable devices.
Investing in Outlier Detection and Part Average Testing (PAT) ensures high quality and reliability, reinforcing your brand’s reputation with customers and their clients.
The outlier detection in the die-level PAT screening can be both static and dynamic.
Ready to optimize your yield? Discover how yieldWerx’s automotive solutions can transform your quality control strategy—get in touch today!
What is Static PAT?
Static PAT (SPAT) sets clear boundaries to catch outliers, using a pre-defined set of tests and population data from multiple batches. These boundaries sit inside the lower and upper specification limits (LSL and USL) and act as a guardrail, screening out dice that pass the specification but fall outside the statistically expected range. SPAT limits are updated every six months or after eight wafer lots (whichever comes first): the mean (µ) and standard deviation (σ) are recalculated and the limits are set at µ ± 6σ so that only quality parts move forward.
Static PAT Limits = Robust Mean ± 6 Robust Sigma
What is Dynamic PAT?
Because SPAT limits are drawn from many lots, they can be much wider than the distribution of any single batch. Dynamic PAT (DPAT) instead takes a real-time approach, setting limits individually for each lot or wafer under test by calculating the mean and standard deviation from that material alone. These limits are based on the actual material in front of you, so lot-to-lot variation no longer inflates them.
Here’s the key: any die that falls outside the DPAT limits but remains within broader specification limits (LSL and USL) is flagged as an outlier. Since wafer distributions are often tighter than those of entire lots, DPAT lets you fine-tune limits to each wafer’s unique profile, quickly identifying any deviations from the norm.
Dynamic PAT Limits = Robust Mean ± 6 Robust Sigma
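As a minimal sketch of the idea above, the following Python computes dynamic limits for one wafer and flags dies that pass the specification limits but fall outside the DPAT limits. The function name `dpat_outliers`, the sample readings, and the spec limits are hypothetical. The quartile convention and the choice to compute statistics only from spec-passing dies are assumptions, not a statement of how any particular tool works.

```python
from statistics import median, quantiles

def dpat_outliers(readings, lsl, usl, k=6.0):
    """Return (low, high, outlier_indices) for one wafer and one test."""
    # Assumption: only dies that pass the spec limits feed the statistics.
    passing = [v for v in readings if lsl <= v <= usl]
    # Assumption: "inclusive" quartiles; other conventions shift Q1/Q3 slightly.
    q1, _, q3 = quantiles(passing, n=4, method="inclusive")
    robust_mean = median(passing)
    robust_sigma = (q3 - q1) / 1.35
    low = robust_mean - k * robust_sigma
    high = robust_mean + k * robust_sigma
    # A PAT outlier passes the spec but sits outside the dynamic limits.
    outliers = [i for i, v in enumerate(readings)
                if lsl <= v <= usl and not (low <= v <= high)]
    return low, high, outliers

# Hypothetical readings: index 10 is a PAT outlier, index 11 is a plain spec failure.
wafer = [1.00, 1.02, 0.98, 1.01, 0.99, 1.03, 0.97, 1.00, 1.01, 0.99, 1.60, 2.50]
low, high, outliers = dpat_outliers(wafer, lsl=0.5, usl=2.0)
print(f"DPAT limits: {low:.3f} .. {high:.3f}, outlier indices: {outliers}")
```

In this made-up data the 2.50 reading already fails the specification, so it is handled by normal binning, while the 1.60 reading passes the specification and is caught only by the dynamic limits.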
According to a Keysight case study, replacing per-lot Static PAT limits with pseudo-real-time Dynamic PAT limits reduced total false rejects by more than 2%, cut unnecessary scrap and device handling, and improved equipment throughput thanks to a lower retest rate.
Setting PAT Limits: Tailoring for Normal vs. Non-Normal Data
PAT for Normal Distribution
Calculating Part Average Testing (PAT) limits differs for normal and non-normal distributions because of the unique characteristics of each. A normal distribution is symmetrical, where data clusters around a central mean, forming a bell-shaped curve with most data points falling within three standard deviations. For such distributions, PAT limits are set using the robust mean and standard deviation method, effectively identifying outliers based on their distance from the mean.
Robust Mean and Standard Deviation Method
Robust Mean = Q2 [the median]
Note: Q2 (Quartile 2) is the middle data point if the sample size is odd. If the sample size is an even number, Q2 is the average of the two middle data points.
Robust Sigma = (Q3 – Q1) / 1.35
Note: The 1.35 number is inexact for sample sizes less than 20. Q1 is the point 1/4 of the way through the ranked data and Q3 is the point 3/4 of the way through the ranked data.
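The definitions above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a reference implementation: the function names and readings are hypothetical, and the standard library's "inclusive" quartile method is just one of several conventions for locating Q1 and Q3.

```python
from statistics import median, quantiles, stdev

def robust_stats(values):
    """Return (robust mean, robust sigma) as defined above."""
    # Assumption: "inclusive" quartiles; other conventions shift Q1/Q3 slightly.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), (q3 - q1) / 1.35

def pat_limits(values, k=6.0):
    """Robust Mean +/- k * Robust Sigma (k = 6 for the PAT limits)."""
    centre, sigma = robust_stats(values)
    return centre - k * sigma, centre + k * sigma

# Hypothetical readings for one test; the last value is a gross outlier.
readings = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.3, 9.7, 10.1, 9.9, 14.0]
centre, sigma = robust_stats(readings)
low, high = pat_limits(readings)
print(f"robust mean={centre:.3f}, robust sigma={sigma:.3f}, "
      f"classical sigma={stdev(readings):.3f}")
print(f"PAT limits: {low:.3f} .. {high:.3f}")
```

The divisor 1.35 comes from the fact that the interquartile range of a normal distribution is roughly 1.35 standard deviations. Because the quartiles barely move when one reading is extreme, the robust sigma stays close to the spread of the healthy population, whereas the classical standard deviation is inflated by the outlier it is supposed to help detect.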
yieldWerx Advanced PAT uniquely identifies and applies the best-fit PAT limits across five different distribution types, unlike conventional methods that assume all parameters follow a Gaussian distribution. By accurately determining the distribution type, it enhances precision in failure detection, striking a balance between avoiding unnecessary yield loss and ensuring high reliability with zero defects.
PAT for Non-Gaussian Distribution
Non-normal distributions are asymmetrical, with unevenly spread data where the mean, median, and mode differ. In these cases, using the mean and standard deviation can misrepresent the data, as outliers may heavily skew these metrics.
Instead, engineers turn to other special measures to set realistic PAT limits. This approach captures outliers accurately without assuming symmetry, and it accommodates the natural skew often present in non-normal data. However, AEC recommends employing such methods with caution and detailed justification.
Below we share the basics of some of these outlier detection techniques that are equally usable in several other industries like fraud prevention, health sciences, and environment protection. In real-life scenarios, PAT calculations are much more complex and use proprietary algorithms and a combination of advanced statistical techniques outside the scope of our discussion for now.
Median Absolute Deviation (MAD)
The Median Absolute Deviation (MAD) is the median of the absolute deviations from the dataset’s median. You can calculate it by taking the absolute difference between each observation and the median, then taking the median of those differences:
MAD = median( |Observation − Median of All Observations| )
MAD is a robust statistical method often used for outlier detection. It measures how much individual data points deviate from the median of a dataset, making it more resistant to the influence of outliers compared to other simple methods like standard deviation.
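A short Python sketch of MAD-based screening, using the median absolute deviation, follows. The function name, the sample data, and the threshold `k=3.0` are hypothetical choices for illustration. The factor 1.4826 is the usual constant that scales the MAD so it is comparable to a standard deviation for normally distributed data.

```python
from statistics import median

def mad_outliers(values, k=3.0):
    """Return (median, MAD, indices of values more than k scaled MADs away)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    scaled = 1.4826 * mad          # comparable to sigma for normal data
    if scaled == 0:
        # More than half the readings are identical; the rule is undefined.
        return med, mad, []
    flagged = [i for i, v in enumerate(values) if abs(v - med) > k * scaled]
    return med, mad, flagged

# Hypothetical readings; the last value is far from the rest.
readings = [4.9, 5.0, 5.1, 5.0, 4.8, 5.2, 5.0, 9.0]
med, mad, flagged = mad_outliers(readings)
print(med, mad, flagged)
```

Both the center and the spread are medians here, so the single extreme reading has almost no influence on the threshold used to judge it.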
Interquartile Range (IQR)
The interquartile range is a measure of spread. It tells you how widely the middle 50% of the data is dispersed around a central point (the median).
Steps to find outliers using IQRs:
- Find the first quartile (Q1), which is the 25th percentile, and the third quartile (Q3), which is the 75th percentile.
- IQR = Q3−Q1
- Define thresholds for outliers using Q1 – k * IQR for the lower bound and Q3 + k * IQR for the upper bound, where k is typically set to 1.5 or 2 based on desired sensitivity.
For example, if k = 1.5, then:
Lower Bound = Q1 − 1.5 × IQR
Upper Bound = Q3 + 1.5 × IQR
Any data point that falls below the lower boundary or above the upper boundary is flagged as an outlier. Because this method can also flag legitimate data points, review flagged values before removing them.
Another related and widely used method is the semi-interquartile range (SIQR) in which IQR is divided by 2.
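The IQR steps above can be sketched as follows. The function name and data are hypothetical, and the "inclusive" quartile method from Python's standard library is an assumption, since different quartile conventions give slightly different bounds.

```python
from statistics import quantiles

def iqr_bounds(values, k=1.5):
    """Return (lower, upper) bounds: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical measurements with one suspicious value.
data = [12, 14, 15, 15, 16, 17, 18, 19, 21, 40]
lower, upper = iqr_bounds(data)
flagged = [v for v in data if v < lower or v > upper]
print(lower, upper, flagged)
```

For this data Q1 is 15 and Q3 is 18.75, so the IQR is 3.75 and the bounds are 9.375 and 24.375, which leaves 40 as the only flagged value.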
Adjusted Boxplot
The adjusted boxplot method is known for its greater robustness in removing outliers compared to SIQR boxplot, especially in scenarios where the distribution is highly skewed in large datasets.
The adjusted box plot modifies the traditional outlier boundaries by incorporating the medcouple to scale the boundaries based on the skewness:
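For MC ≥ 0:
Upper Bound: Q3 + (1.5 × e ^ (3 × MC) × IQR)
Lower Bound: Q1 − (1.5 × e ^ (−4 × MC) × IQR)
For MC < 0, the exponents become 4 × MC for the upper bound and −3 × MC for the lower bound.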
MedCouple (MC) is a robust, non-parametric measure of skewness specifically designed for use with skewed data. Unlike traditional skewness measures, the MedCouple is resistant to outliers, making it ideal for distributions with heavy tails or irregularities.
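To make the adjusted boxplot concrete, here is a hedged Python sketch. It uses the fences as published by Hubert and Vandervieren, where for MC ≥ 0 the lower fence is scaled by e^(−4 × MC) and the upper fence by e^(3 × MC), with the exponents swapped to e^(−3 × MC) and e^(4 × MC) for MC < 0. The medcouple below is a naive O(n²) version suitable only for small samples. The function names and data are hypothetical, and production code would more likely rely on a tested library routine.

```python
from math import exp
from statistics import median, quantiles

def medcouple(values):
    """Naive O(n^2) medcouple: median of a kernel over pairs around the median."""
    x = sorted(values)
    m = median(x)
    lower = [v - m for v in x if v <= m]      # deviations <= 0
    upper = [v - m for v in x if v >= m]      # deviations >= 0
    ties = sum(1 for z in upper if z == 0)    # readings equal to the median
    first_tie = len(lower) - ties             # ties sit at the end of `lower`
    kernel = []
    for i, zi in enumerate(upper):
        for j, zj in enumerate(lower):
            if zi == 0 and zj == 0:
                a, b = i, j - first_tie
                kernel.append((a > b) - (a < b))   # sign kernel for tied pairs
            else:
                kernel.append((zi + zj) / (zi - zj))
    return median(kernel)

def adjusted_boxplot_bounds(values, k=1.5):
    """Skewness-adjusted fences (Hubert-Vandervieren form)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    mc = medcouple(values)
    if mc >= 0:
        return q1 - k * exp(-4 * mc) * iqr, q3 + k * exp(3 * mc) * iqr
    return q1 - k * exp(-3 * mc) * iqr, q3 + k * exp(4 * mc) * iqr

symmetric = [1, 2, 3, 4, 5]
skewed = [1, 2, 3, 5, 9, 20, 50]          # hypothetical right-skewed data
print(medcouple(symmetric))               # zero for perfectly symmetric data
print(medcouple(skewed))                  # positive for right-skewed data
print(adjusted_boxplot_bounds(skewed))
```

With a medcouple of zero the fences reduce to the ordinary Q1 − 1.5 × IQR and Q3 + 1.5 × IQR. With positive skew the upper fence moves outward and the lower fence moves inward, so a naturally long right tail is less likely to be flagged.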
Tukey’s Method
Tukey’s Fences is a method for detecting outliers in data by defining bounds around the central 50% of the data (the interquartile range, or IQR). This method, developed by statistician John Tukey, is robust and effective for identifying outliers in both normal and non-normal distributions because it focuses on the spread around the median and IQR, making it less sensitive to extreme values.
Lower Fence: Q1 − k × IQR
Upper Fence: Q3 + k × IQR
Here, k is a multiplier that determines how far from the quartiles an observation needs to be to be considered an outlier. The most common values for k are:
- 1.5 (standard Tukey’s fences): Flags mild outliers.
- 3.0: Flags extreme outliers.
However, Tukey’s Method is suitable for symmetric or moderately skewed data. In contrast, the Adjusted Box Plot is specifically designed for skewed distributions; it modifies the outlier boundaries based on the MedCouple (MC), a robust measure of skewness.
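As a small illustration of the two multipliers, the sketch below labels each value as "ok", "mild" (beyond the 1.5 × IQR fences), or "extreme" (beyond the 3.0 × IQR fences). The function name, labels, and data are hypothetical, and the quartile convention is an assumption.

```python
from statistics import quantiles

def tukey_classify(values):
    """Label each value relative to Tukey's inner (1.5) and outer (3.0) fences."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    labels = []
    for v in values:
        if v < q1 - 3.0 * iqr or v > q3 + 3.0 * iqr:
            labels.append("extreme")
        elif v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr:
            labels.append("mild")
        else:
            labels.append("ok")
    return labels

# Hypothetical measurements, sorted for readability.
data = [12, 14, 15, 15, 16, 17, 18, 19, 21, 30, 40]
labels = tukey_classify(data)
print(list(zip(data, labels)))
```

Here Q1 is 15 and Q3 is 20, so the inner fences are 7.5 and 27.5 and the outer fences are 0 and 35. The value 30 lands between the upper fences and is labelled mild, while 40 is labelled extreme.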
Univariate vs. Multivariate PAT Analysis
As complexity in semiconductor manufacturing increases, so does the need for more advanced analysis techniques. Some types of failure categories are challenging to identify using only univariate (single variable) methods.
For example, small changes spread across several variables may emerge from gradual degradation, or failures may stem from interdependence among variables, which conceals the root cause when each variable is analyzed on its own.
In these cases, multivariate PAT analysis becomes essential, allowing engineers to analyze multiple parameters simultaneously. This uncovers subtle defect patterns that single-parameter methods might miss. However, selecting the right parameter combinations can be challenging, given the vast number of possibilities.
To support engineers in identifying meaningful combinations, yield management systems like yieldWerx enable simulations of multivariate relationships, often starting with two-dimensional analyses as a foundation.
Here, we discuss one commonly used multivariate outlier detection method for Automotive Part Average Testing and many other industries.
Mahalanobis Distance
The formula to compute Mahalanobis distance is:
D² = (x − m)ᵀ · C⁻¹ · (x − m)
where
D² = square of the Mahalanobis distance,
x = vector of the observation (row in a dataset),
m = vector of mean values of independent variables (mean of each column), and
C = covariance matrix of independent variables (C⁻¹ is its inverse)
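For two test parameters the formula can be written out without any matrix library, because the inverse of a 2×2 covariance matrix has a closed form. The sketch below is illustrative only: the function name, the synthetic data, and the D² > 9 threshold (a distance of 3) are assumptions. It builds two strongly correlated parameters and appends one die that looks normal on each parameter separately but breaks the correlation.

```python
from statistics import mean, stdev

def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each (a, b) pair from the sample mean."""
    n = len(points)
    ma = mean(a for a, _ in points)
    mb = mean(b for _, b in points)
    # Sample covariance matrix C = [[saa, sab], [sab, sbb]]
    saa = sum((a - ma) ** 2 for a, _ in points) / (n - 1)
    sbb = sum((b - mb) ** 2 for _, b in points) / (n - 1)
    sab = sum((a - ma) * (b - mb) for a, b in points) / (n - 1)
    det = saa * sbb - sab ** 2
    # (x - m)^T C^-1 (x - m), using the closed-form inverse of a 2x2 matrix
    return [
        (sbb * (a - ma) ** 2
         - 2 * sab * (a - ma) * (b - mb)
         + saa * (b - mb) ** 2) / det
        for a, b in points
    ]

# Hypothetical data: parameter b tracks parameter a closely for 21 dies.
dies = [(0.1 * i, 0.1 * i + (0.05 if i % 2 == 0 else -0.05))
        for i in range(-10, 11)]
dies.append((0.6, -0.6))          # unremarkable on a and on b, but off-trend

d2 = mahalanobis_sq_2d(dies)
flagged = [i for i, v in enumerate(d2) if v > 9.0]

# Univariate z-scores of the last die, for comparison.
a_vals = [a for a, _ in dies]
b_vals = [b for _, b in dies]
z_a = (dies[-1][0] - mean(a_vals)) / stdev(a_vals)
z_b = (dies[-1][1] - mean(b_vals)) / stdev(b_vals)
print(flagged, round(z_a, 2), round(z_b, 2))
```

The last die sits within one standard deviation on each parameter taken alone, so a univariate ±3σ screen would pass it, yet its Mahalanobis distance is well beyond 3 because it contradicts the relationship between the two parameters.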
Research Study Comparing Univariate and Multivariate Analysis
Researchers at UC Santa Barbara investigated customer returns. These are parts that passed all standard tests and ultimately failed at customer sites. In the automotive industry, zero returns are a key goal, making advanced analysis essential. Moving beyond univariate outlier analysis, they explored multivariate models to capture subtle discrepancies.
Source: Multivariate Outlier Modeling for Capturing Customer Returns
The team compared space coverage between a multivariate model and a univariate approach, with each test’s limits set at ±3σ to define univariate outliers. In univariate analysis, a “3σ bounding box” is drawn and parts outside the box are treated as outliers. The covariance-based multivariate model, using an equivalent 3σ Mahalanobis distance, instead defines an oval-shaped boundary. Assuming a correlation of 0.8 between the two tests, shaded areas appear between the bounding box and the oval where the space coverage differs.
Die in these shaded areas are flagged as outliers by the multivariate model—anomalies that the univariate analysis would overlook. This refined approach enhances the detection of potential customer-return issues, aligning with the industry’s high standards.
yieldWerx Zonal PAT
Normal Part Average Testing (PAT) detects outliers by analyzing the overall wafer population. However, dies that exhibit variations within specific zones—often due to fab process uniformity issues—may go undetected if they fall within the broader distribution range.
With yieldWerx Zonal PAT, outliers are identified within defined radial zones, enabling the detection of anomalies specific to each zone’s distribution pattern. This approach allows for more precise outlier detection by focusing on variations within each defined zone.
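The general idea of zone-based screening can be sketched as below. This is not the yieldWerx implementation, whose details are not described here. It is a simplified illustration that assigns each die to a radial zone by its distance from the wafer center and applies Robust Mean ± 6 Robust Sigma limits within each zone. The function name, zone radii, coordinates, and readings are all hypothetical.

```python
from math import hypot
from statistics import median, quantiles

def zonal_outliers(dies, zone_edges, k=6.0):
    """dies: (x, y, value) tuples with (0, 0) at the wafer center.
    zone_edges: ascending outer radii; the last must cover every die."""
    zones = {}
    for idx, (x, y, value) in enumerate(dies):
        r = hypot(x, y)
        zone = next(z for z, edge in enumerate(zone_edges) if r <= edge)
        zones.setdefault(zone, []).append((idx, value))
    flagged = []
    for members in zones.values():
        vals = [v for _, v in members]
        q1, _, q3 = quantiles(vals, n=4, method="inclusive")
        centre = median(vals)
        sigma = (q3 - q1) / 1.35
        flagged += [i for i, v in members if abs(v - centre) > k * sigma]
    return sorted(flagged)

# Hypothetical wafer: center dies read about 1.0, edge dies about 1.3.
inner_xy = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)]
inner_vals = [1.00, 1.01, 0.99, 1.02, 0.98, 1.00, 1.01, 0.99, 1.30]
outer_xy = [(3, 0), (0, 3), (-3, 0), (0, -3), (3, 3),
            (-3, 3), (3, -3), (-3, -3), (4, 0), (0, 4)]
outer_vals = [1.28, 1.30, 1.32, 1.29, 1.31, 1.30, 1.33, 1.27, 1.30, 1.31]
dies = [(x, y, v) for (x, y), v in zip(inner_xy + outer_xy,
                                       inner_vals + outer_vals)]

print(zonal_outliers(dies, zone_edges=[5.0]))        # one zone: whole wafer
print(zonal_outliers(dies, zone_edges=[2.0, 5.0]))   # center zone + edge zone
```

Treated as a single population, the wafer's wide center-to-edge spread hides the center die reading 1.30. Split into two zones, that die stands out against its neighbors in the center zone, while the same reading in the edge zone is perfectly normal.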
Key Features of the yieldWerx PAT Module for Outlier Detection
The yieldWerx PAT module meets all the AEC standards and comes with the following powerful features, so you don’t have to worry about setting up Part Average Testing in-house.
- You get instantaneous notifications and insights when potential outliers or anomalies are identified during the testing phase.
- Easily merge the PAT module with yieldWerx’s core application, MES/Shop-floor control systems, or automated test equipment systems like Advantest, Teradyne, and more.
- Empower your team with the ability to create and apply custom rules in PAT, offering adaptability to diverse testing and production scenarios.
- Examine past test data, spatial defect patterns, reticle failures, and more, to compare against current batches, ensuring continuous quality improvement.
- Remove Known Good Die whose maximum number of touchdowns has been exceeded, in line with AEC guidelines.
- You can generate sophisticated reports at the touch of a button and can share reports with the team irrespective of their location.
- Utilize sophisticated techniques like GDBN and Zonal PAT to ensure reliable device shipment and minimal post-manufacture issues.
- Regular software quality checks and feature updates under the leadership of top semiconductor industry experts.
Key Takeaway
For many automotive companies, outsourcing PAT to specialized providers offers distinct advantages over in-house implementation. By entrusting PAT to experts, companies gain access to state-of-the-art technology, consistent updates to testing protocols, and a team with a dedicated focus on yield optimization.
The yieldWerx PAT module, with its integration options and advanced features, enables semiconductor manufacturers to implement robust quality control without the need for extensive internal resources. This comprehensive approach to outlier detection bolsters a brand’s reputation by delivering defect-free devices.
Want to see PAT in action? Schedule a Demo to learn more about your outsourcing options and our customized testing solutions.
FAQs
What is a latent defect?
A latent defect is a flaw in a chip that is not immediately detectable through standard inspection or testing methods. These defects often emerge in the later part of the product lifecycle due to prolonged use or certain environmental conditions.
What is the difference between an outlier and an anomaly?
Anomalies are unexplainable values deviating too much from the base distribution. For example, in a dataset containing the age of students in a college, the value of -10 is an anomaly, as age cannot be a negative number.
Outliers are unlikely events or data points significantly different from points in a dataset. For example, in the same college dataset, the student’s age could be 97, i.e., an older person.
In our case, an outlier is a chip that has passed the original manufacturing tests but differs from the lot while showing abnormal characteristics and is more likely to fail in the field.
How does PAT differ from traditional binning?
Traditional binning categorizes parts as pass/fail based on test limits. PAT uses statistical methods to detect anomalous parts based on average performance rather than fixed limits, making it more sensitive to deviations within a production batch.
How is the PAT test performed?
During the Electrical Wafer Sort (EWS) phase, unpackaged ICs undergo testing. This involves a mechanical probe making contact with each IC’s pads on the wafer, where a probe card connects to the test equipment through cables. An automated system then performs sequential testing on each die across the wafer.
Throughout this testing, a datalog file records measurement results to compute the mean (μ) and standard deviation (σ). With these values established, outlier parts that passed the test but fall outside the acceptable range can be accurately identified.
At what stage is PAT performed?
You can implement it at the wafer test stage, the final test stage, or both. However, by identifying anomalies at an early stage, engineers can take corrective action before defective components move on to packaging, only to be scrapped later.
What is the difference between PAT and GDBN?
PAT separates chips that behave differently from the rest of the normally produced population. GDBN (Good Die in a Bad Neighborhood) identifies a working die surrounded by failing dies and removes it from the lot as a precaution.
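A toy Python sketch of the GDBN idea follows. The wafer-map encoding, the function name, and the `min_bad_neighbors` threshold are hypothetical; real GDBN rules differ between tools and are typically configurable.

```python
def gdbn(wafer_map, min_bad_neighbors=4):
    """wafer_map: equal-length strings, 'P' = pass, 'F' = fail, '.' = no die.
    Returns (row, col) of passing dies with many failing neighbors."""
    rows, cols = len(wafer_map), len(wafer_map[0])
    flagged = []
    for r in range(rows):
        for c in range(cols):
            if wafer_map[r][c] != "P":
                continue
            bad = 0
            # Count failing dies among the up-to-8 neighbors in a 3x3 window.
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                        bad += wafer_map[rr][cc] == "F"
            if bad >= min_bad_neighbors:
                flagged.append((r, c))
    return flagged

# Hypothetical wafer map: one passing die inside a ring of failures.
wafer = [
    "PPPPP",
    "PFFFP",
    "PFPFP",
    "PFFFP",
    "PPPPP",
]
print(gdbn(wafer))
```

Only the die at the center of the failing ring is flagged. The passing dies around the outside each touch at most three failing neighbors, which is below the assumed threshold of four.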
How does PAT help improve yield?
PAT enhances overall yield quality and reliability by identifying and removing outliers that may still technically pass test limits but show abnormal behavior.
How does PAT relate to Gage R&R?
PAT data, when combined with Gage R&R (repeatability and reproducibility), ensures that testing variability doesn’t falsely trigger PAT failures, adding confidence in quality assessments.
What is Unit Level Predictive Yield (ULPY)?
ULPY is similar to GDBN in that it grades the center die (the die of interest) based on the surrounding dies. ULPY uses a fixed 3×3 filter set, whereas the yieldWerx GDBN solution lets the user define the filter set size and the weight of each neighbouring die’s influence on the die of interest.
What are some typical PAT Electrical Tests recommended by AEC?
The AEC-Q001 Rev-D guidelines list some of these tests:
- Pin Leakage Test
- Standby Power Supply Current (IDD or ICC)
- IDDQ testing
- Over-Voltage Stress Test
- Output Response Time
- Output breakdown voltage
- Output leakage
- Output current drive
- Output voltage levels
- Low/High temperature
Are there any limitations of the traditional PAT test?
Legacy solutions for part average testing rely heavily on electrical testing only. Advances in wafer scan technology promise additional capabilities.