Archive for September, 2017

Reliability for the Real (New) World

Thursday, September 21st, 2017

By Dina Medhat

There’s nothing more annoying than a device that doesn’t perform as expected. Nearly everyone has experienced the ultimate frustration of the “intermittent failure” problem with their laptops, or a cellphone that suddenly and inexplicably stops working. Now imagine that failure occurring in a two-ton vehicle traveling at highway speeds, or in a pacemaker implanted in someone you love. With electronics moving into virtually every facet of our lives, designers are facing unique challenges as they create (or re-engineer) designs for new high-reliability, environmentally-demanding applications like automotive and medical.

Significantly increased longevity requirements, coupled with new stresses, new circuits and topologies, increased analog content, higher voltages, and higher frequencies, make the task of ensuring performance and reliability harder than ever. The corollary to these new constraints and requirements is the need for verification technology and techniques that enable designers to find and eliminate potential electrical failure points and weaknesses.

Electrical overstress (EOS) is one of the leading causes of integrated circuit (IC) failures, regardless of where the chip is manufactured or the process used. EOS events can result in a wide spectrum of outcomes, covering varying degrees of performance degradation all the way up to catastrophic damage, where the IC is permanently non-functional. Identifying and removing EOS susceptibility from IC designs is essential to ensuring successful performance and reliability when the products reach the market.

When we discuss EOS, however, it’s important to understand that EOS is technically the result of a wide range of root cause events and conditions. In its broadest definition, EOS includes electrostatic discharge (ESD) events, electromagnetic interference (EMI), latch-up (LUP) conditions, and other causes. In practice, however, ESD, EMI, and LUP are generally treated as categories distinct from the remaining EOS causes, as shown in Figure 1.

Figure 1. Typical root causes of EOS events. See Reference 1.

Any device will fail when subjected to stresses beyond its designed capacity, due either to device weakness or improper use. The absolute maximum rating (AMR) provides the criterion that both users and manufacturers need:

  • Each user of an electronic device must have a criterion for the safe handling and application of the device
  • Each manufacturer of an electronic device must have a criterion to determine if a device failure was caused:
    • By device weakness (manufacturer fault)
    • By improper usage (user fault)

Device robustness is represented by the typical failure threshold (FT) of a device. Because FTs are subject to the natural distribution of the manufacturing process, a product AMR is set to provide the necessary safety margin against this distribution (to avoid failures in properly constructed devices). The safe operating area (SOA) of a device consists of parametric conditions (usually current and voltage) over which a device is expected to operate without damage or failure (Figure 2). Stress beyond the SOA tends to cause characteristic damage. For example:

  • Over-voltage tends to damage breakdown sites
  • Over-current tends to fuse interconnects
  • Over-power tends to melt larger areas
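
The relationship between the FT distribution and the AMR can be sketched in a few lines of Python. This is only an illustration: the article does not say how the margin is chosen, so the "mean minus k standard deviations" rule, the function name `amr_from_failure_thresholds`, and the sample values are all assumptions.

```python
import statistics

def amr_from_failure_thresholds(thresholds_v, k_sigma):
    """Hypothetical rule of thumb: AMR = mean(FT) - k_sigma * stdev(FT)."""
    mean_ft = statistics.mean(thresholds_v)
    sigma_ft = statistics.stdev(thresholds_v)  # sample standard deviation
    return mean_ft - k_sigma * sigma_ft

# Illustrative (made-up) measured failure thresholds, in volts.
measured_ft = [5.0, 5.2, 4.8, 5.1, 4.9]
amr = amr_from_failure_thresholds(measured_ft, k_sigma=3.0)
print(f"Assumed AMR: {amr:.2f} V (lowest measured FT: {min(measured_ft)} V)")
```

A larger `k_sigma` moves the AMR further below the weakest expected device, trading usable operating range for safety margin.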

Figure 2. Graphical interpretation of an AMR. The yellow line represents the number of components experiencing immediate catastrophic EOS damage. See Reference 2.

Electrically-induced physical damage (EIPD) is the term used to describe the thermal damage that may occur when an electronic device is subjected to a current or voltage that is beyond the specification limits of the device. This thermal damage results from the excessive heat generated during the EOS event by resistive heating in the connections within the device. The high currents experienced during an EOS event can generate very localized high temperatures, even in normally low-resistance paths. These high temperatures cause destructive damage to the materials used in the device’s construction [2].

As shown in Figure 3, EOS damage can be external (visible to the naked eye or with a low-power microscope), or internal (visible with a high-power microscope after decapsulation). External damage can include visible bulges in the mold compound, physical holes in the mold compound, burnt/discolored mold compound, or a cracked package. Internal damage manifests itself in melted or burnt metal, carbonized mold compound, signs of heat damage to metal lines, and melted or vaporized bond wires.

Figure 3. External and internal EOS damage. See Reference 3.

So, if preventing EOS conditions in your design is a good idea, just how do you do that? In the past, designers used a variety of methods to check for over-voltage conditions, relying mainly on the expertise and experience of their design team. Manual inspection is probably the most tedious and time-consuming approach, and hardly practical for today’s large, complex designs. Another conventional approach is the use of design rule checking (DRC) in combination with manually-applied marker layers. Manual marker layers are inherently susceptible to human mistakes and forgetfulness, and this approach also requires additional DRC runs, extending verification time. Lastly, there is simulation, which can take a long time to run, and is dependent on the quality of the extracted SPICE netlist, SPICE models, stress models, and input stimuli.

Voltage Propagation

Voltage propagation is an automated flow that propagates realistic voltage values to all points in the layout, eliminating the more fallible manual processes. An automated voltage propagation flow (Figure 4) generates the voltage information automatically, without requiring any changes to sign-off decks, or any manually added physical layout markers.

Figure 4. Automated voltage propagation flow.
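
To make the idea of propagation concrete, here is a minimal Python sketch. It is not how Calibre PERC works internally. It assumes a toy netlist in which supply nets are seeded with known voltages, and every listed connection (for example, a resistor or a conducting pass device) carries a voltage unchanged from one net to the next. The function name and net names are hypothetical.

```python
from collections import deque

def propagate_voltages(seeds, connections):
    """Spread seeded supply voltages across connected nets.

    seeds: {net_name: voltage} for supply nets (assumed known).
    connections: (net_a, net_b) pairs assumed to pass voltage unchanged.
    Returns {net_name: set of voltages that can reach the net}.
    """
    adjacency = {}
    for a, b in connections:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    voltages = {net: {v} for net, v in seeds.items()}
    queue = deque(seeds.items())
    while queue:
        net, v = queue.popleft()
        for neighbor in adjacency.get(net, ()):
            if neighbor in seeds:
                continue  # supply nets keep their assigned voltage
            reached = voltages.setdefault(neighbor, set())
            if v not in reached:
                reached.add(v)
                queue.append((neighbor, v))
    return voltages

# Hypothetical netlist: a 3.3V rail reaches a gate through one internal net.
seeds = {"VDD33": 3.3, "VSS": 0.0}
connections = [("VDD33", "n1"), ("n1", "gate_m1"), ("VSS", "src_m1")]
result = propagate_voltages(seeds, connections)
print(result["gate_m1"], result["src_m1"])
```

Each net ends up with the set of voltages that can reach it, which is the information a later over-voltage check needs.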

Example

Let’s debug a typical over-voltage (EOS) condition. We’re using the Calibre® PERC™ tool for the voltage propagation, and the Calibre RVE™ results debugging environment for viewing and debugging the results. The debugging steps are illustrated in Figure 5.

(1)   The Calibre PERC run identifies a device with a 3.3V difference between the voltages propagated to its gate pin and source pin, which is greater than the allowed breakdown limit of 1.8V for this device type. To debug this violation, we first highlight the violating device in a schematic viewer.

(2)   Next, we must understand how the gate can receive a propagated voltage of 3.3V. To do that, we initiate a trace of the gate pin using the Calibre RVE interface.

(3)   The trace results provide the details of the voltage propagation paths in the voltage trace window (where “start” is the gate pin and “break” is the 3.3V net).

(4)   We can then click on specific devices/nets from the voltage trace window to highlight them in our design data in the schematic viewer.

(5)   Step 4 provides us with the information we need to analyze and resolve the voltage overload condition.

Figure 5. Calibre PERC voltage propagation interactive debugging.
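
The logic behind steps 1 through 3 can be approximated with a short Python sketch. It is a simplified stand-in, not the Calibre PERC or Calibre RVE implementation: the netlist, the net names, and both function names are hypothetical. The check compares the gate-to-source difference against the limit, and the trace is a breadth-first search from the gate net ("start") back to the net that supplies 3.3V ("break").

```python
from collections import deque

def gate_source_overstress(v_gate, v_source, limit):
    """True when the gate-to-source voltage difference exceeds the limit."""
    return abs(v_gate - v_source) > limit

def trace_to_source(start_net, target_voltage, seeds, connections):
    """Return the path from start_net to the nearest supply net seeded with
    target_voltage, or None if no such net is reachable."""
    adjacency = {}
    for a, b in connections:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    parents = {start_net: None}
    queue = deque([start_net])
    while queue:
        net = queue.popleft()
        if net in seeds:
            if seeds[net] == target_voltage:
                path = []
                while net is not None:
                    path.append(net)
                    net = parents[net]
                return path[::-1]
            continue  # do not trace through a different supply net
        for neighbor in adjacency.get(net, ()):
            if neighbor not in parents:
                parents[neighbor] = net
                queue.append(neighbor)
    return None

# Hypothetical netlist matching the example: gate at 3.3V, source at 0V.
seeds = {"VDD33": 3.3, "VSS": 0.0}
connections = [("VDD33", "n1"), ("n1", "gate_m1"), ("VSS", "src_m1")]

violation = gate_source_overstress(3.3, 0.0, limit=1.8)
path = trace_to_source("gate_m1", 3.3, seeds, connections)
print(violation, path)
```

The returned path lists the nets to inspect, in order from the violating gate pin to the offending supply.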

Summary

Designers at both advanced and legacy nodes are facing new and expanded reliability requirements. New solutions are emerging to ensure continuing manufacturability, performance, and reliability. Automated voltage propagation supports the fast, accurate identification of reliability conditions in a design, enabling designers to analyze and correct the design early in the verification flow. Finding and eliminating often-subtle EOS susceptibilities before tapeout helps ensure that designs will satisfy the performance and reliability expectations of the market.

References

[1]         K. T. Kaschani and R. Gärtner, “The impact of electrical overstress on the design, handling and application of integrated circuits,” EOS/ESD Symposium Proceedings, Anaheim, CA, 2011, pp. 1-10. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6045593&isnumber=6045562

[2]         Industry Council on ESD Target Levels, “White Paper 4: Understanding Electrical Overstress – EOS,” August 2016. https://www.esda.org/assets/Uploads/documents/White-Paper-4-Understanding-Electrical-Overstress.pdf

[3]         “Electrical Overstress EOS,” Cypress Semiconductor Corp. http://www.cypress.com/file/97816/download

Author:

Dina Medhat is a Technical Lead for Calibre Design Solutions at Mentor Graphics. Prior to assuming her current responsibilities, she held a variety of product and technical marketing roles at Mentor Graphics. Dina holds a BS and an MS from Ain Shams University, Cairo, Egypt. She may be contacted at dina_medhat@mentor.com.

Faster Signoff and Lower Risk with Chip Polishing

Sunday, September 17th, 2017

By Bill Graupp, Mentor, a Siemens Business

Designing integrated circuits (ICs) today is a complex and high-risk endeavor; design teams are large and often scattered around the world, tool flows are complex, and time-to-market pressures are omnipresent. It’s no surprise that product releases are often delayed because design teams can’t get to signoff on schedule. Schedules certainly account for the time required for full verification, as well as design optimizations like DFM fill and via enhancements, but all the delays along the way accumulate. Engineers are then pressured to compensate for those delays to stay on schedule. Typically, the final days of signoff are the worst: the deadline is looming, and each iteration between finding and fixing layout issues increases the risk of being late.

Engineers are all about increasing efficiency and reducing risk, and there are many ways to get to signoff faster. You could hire more designers, but that makes coordination harder. You could increase design margins, but that reduces your product’s value. You can make sure to plan plenty of time for final verification and signoff, yet delays earlier in the flow can still impinge on that allotted block of time.

The counterintuitive solution? Add another step to the process flow: more verification, performed at many levels throughout the design flow, to catch and fix problems earlier. The phenomenon of putting in more thought and effort to get “less” isn’t unique to IC design. A remark often attributed to Mark Twain captures the idea: “I didn’t have time to write a short letter, so I wrote a long one instead.”

IC designers already do this to find design rule checking (DRC) violations, starting in early implementation, but how about the non-DRC layout issues, like nano-jogs, space ends, mushrooms, dog bone ends, and offset vias? None of these items is necessarily a design rule error, but all of them are likely to affect manufacturability and lower yield. Fixing these issues is referred to as chip polishing, and is one of the keys to improving a product’s manufacturability. Figure 1 illustrates some typical chip polishing activities.

Figure 1. Automated chip polishing modifies the layout to improve robustness of the design and yield. Modifications are inserted back into the design database.

Software tools that automate these chip polishing tasks are available, and they can be easy to adopt and customize for any flow to reduce the risks associated with reaching signoff. A key to the usability of chip polishing software is the ability for engineers to combine a focused set of commands into macros that can be used throughout a customized flow. Typical uses include engineering change order (ECO) filling, passive device insertion, custom fill to increase densities, jog removal, via enhancements, and programmable edge modification (PEM) commands to eliminate fragmented edges. If, for example, your power structures or capacitor placement rules cause system-level final verification issues, a solution can be implemented quickly and systematically across all blocks and top cells.

Categorizing issues by groups, based on the methodology needed to fix the issue, improves the efficiency of design closure. Correction of some issues requires the insertion of passive devices, while others require polygon shifts and edge movements. Some require the insertion of additional shapes for manufacturability. Each of these categories can best be handled by a custom electronic design automation (EDA) process designed to resolve that category of issue. When one process is used for each category, then all the processes can be combined into one final sign-off flow that can be customized for each design methodology, using a common programming language and database.

Many of the failures of today’s post-route sign-off flows can be solved by creating the conditions for an effective and timely solution to late-stage DRC errors and enabling engineers to insert and modify any shapes needed to achieve the final signoff. A well-designed automated sign-off flow can improve your product’s manufacturability, allow you to get to market faster, and enable you to create market differentiation.

For example, many issues that require or benefit from chip polishing arise from hierarchy conflicts, such as two lines from two cells being connected at the parent cell without the knowledge of the entire line shape or width. Other typical problematic layout features include:

  • Space Ends – Metal lines formed into a “J” when the router passes a short line on an adjacent track and comes back around its far end. The connecting bottom of the “J” can pinch if the loopback is too narrow.
  • T-Line Ends – Metal lines with a narrow cross “T” at the end can cause necking.
  • Mushrooms – A long metal line connected to the center of a short metal line on an adjacent track typically causes necking of the connection metal.
  • Nano-jogs – Connecting two metal lines of slightly different widths end to end creates breaks in long edges that cause unnecessary runtime in verification and mask generation.
  • Offset Vias – Manually-placed vias at an adjacent metal overlap that are not centered in the overlapping region create potential via coverage issues that can cause higher electrical resistance.
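
As an illustration of feature-based edge identification, the following Python sketch flags nano-jogs in a rectilinear polygon. It is not taken from any chip polishing tool. It assumes a polygon given as an ordered list of integer vertices, and treats a jog as a short edge whose two neighboring edges run in the same direction. The function name and the `max_jog` threshold are hypothetical.

```python
def find_nano_jogs(vertices, max_jog):
    """Return short edges that interrupt an otherwise straight long edge.

    vertices: ordered (x, y) corners of a closed rectilinear polygon.
    max_jog: largest edge length (in layout units) still counted as a jog.
    """
    n = len(vertices)

    def edge(i):
        (x0, y0), (x1, y1) = vertices[i % n], vertices[(i + 1) % n]
        return (x1 - x0, y1 - y0)

    jogs = []
    for i in range(n):
        dx, dy = edge(i)
        length = abs(dx) + abs(dy)  # rectilinear: one component is zero
        if length == 0 or length > max_jog:
            continue
        pdx, pdy = edge(i - 1)
        ndx, ndy = edge(i + 1)
        # Neighbors pointing the same way mean a step in a long edge (a jog),
        # as opposed to a notch or line end, where they point opposite ways.
        if pdx * ndx + pdy * ndy > 0:
            jogs.append((vertices[i], vertices[(i + 1) % n]))
    return jogs

# A wire 10 units wide on the left half and 11 units wide on the right half.
wire = [(0, 0), (200, 0), (200, 11), (100, 11), (100, 10), (0, 10)]
rectangle = [(0, 0), (200, 0), (200, 10), (0, 10)]
print(find_nano_jogs(wire, max_jog=2))
print(find_nano_jogs(rectangle, max_jog=2))
```

In this example the one-unit step in the top edge of the wire is reported, while the plain rectangle yields no jogs. Removing such a step would merge two edge fragments back into a single long edge.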

Chip polishing software can execute programmable edge modification (PEM) commands to correct for these issues, including polygon shifting, polygon sizing, edge-based polygon creation, feature-based edge identification (jogs, space ends, etc.), and polygon growth with spacing considerations.
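
Polygon shifting for an offset via can be sketched the same way. The Python below computes the shift that would center a via in the overlap of two metal shapes. It is a simplified illustration with hypothetical function names and coordinates, and it ignores the enclosure and spacing rules that a real polishing flow would have to respect.

```python
def overlap(rect_a, rect_b):
    """Intersection of two (x0, y0, x1, y1) rectangles, or None if disjoint."""
    x0, y0 = max(rect_a[0], rect_b[0]), max(rect_a[1], rect_b[1])
    x1, y1 = min(rect_a[2], rect_b[2]), min(rect_a[3], rect_b[3])
    if x0 >= x1 or y0 >= y1:
        return None
    return (x0, y0, x1, y1)

def via_centering_shift(via, metal_lower, metal_upper):
    """Return (dx, dy) that moves the via center to the center of the
    metal overlap region, or None if the metals do not overlap."""
    region = overlap(metal_lower, metal_upper)
    if region is None:
        return None
    region_cx = (region[0] + region[2]) / 2
    region_cy = (region[1] + region[3]) / 2
    via_cx = (via[0] + via[2]) / 2
    via_cy = (via[1] + via[3]) / 2
    return (region_cx - via_cx, region_cy - via_cy)

# Hypothetical shapes: a horizontal lower metal, a vertical upper metal,
# and a via placed one unit off-center in both directions.
metal_lower = (0, 0, 100, 20)
metal_upper = (80, 0, 100, 100)
via = (84, 4, 94, 14)
shift = via_centering_shift(via, metal_lower, metal_upper)
print(shift)
```

A shift of zero in both directions means the via is already centered, so the same function can serve as a check before any edit is made.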

By reducing the number of edges in the design through chip polishing, many chip release tasks can be improved or eliminated. It’s only logical that mask generation can optimize long edges more quickly when they do not contain jogs or notches, so it’s no surprise that final verification runtimes for large blocks and chip layout can be reduced by eliminating any edges broken into fragments due to accidental jogging. Mask generation is also faster with optimized line ends, because there are fewer edges that will require optical proximity correction (OPC). By having a faster mask flow with fewer issues to manage, the manufacturing process can be optimized for the consistency of the manufacturing models used to control the process. A more robust design will also create a more reliable product, as well as reduce yield variability over the life cycle of the product.

Getting to signoff faster and with less risk, while generating a highly manufacturable layout, can be accomplished using automation tools that offer the types of analysis and fixing capabilities described here. PEM commands can improve a layout by automatically analyzing a design, then smartly removing or altering the offending edges. A well-designed automated PEM flow can improve your product’s manufacturability, allow you to get to market faster, and enable you to create market differentiation.

Author

Bill Graupp is a DFM Application Technologist for Calibre in the Design to Silicon division of Mentor, a Siemens Business. He is responsible for product marketing and customer support for the DFM product line, focused on layout enhancement and fill. Bill received his BSEE from Drexel University, and an MBA from Portland State University. After hours, he currently serves as the mayor of Aurora, Oregon, and as a director on his local school board.