Part of the  

Solid State Technology

  and   

The Confab

  Network

About  |  Contact

Posts Tagged ‘Nvidia’

Deep Learning Could Boost Yields, Increase Revenues

Thursday, March 23rd, 2017

thumbnail

By Dave Lammers, Contributing Editor

While it is still early days for deep-learning techniques, the semiconductor industry may benefit from the advances in neural networks, according to analysts and industry executives.

First, the design and manufacturing of advanced ICs can become more efficient by deploying neural networks trained to analyze data, though labelling and classifying that data remains a major challenge. Also, demand will be spurred by the inference engines used in smartphones, autos, drones, robots and other systems, while the processors needed to train neural networks will re-energize demand for high-performance systems.

Abel Brown, senior systems architect at Nvidia, said until the 2010-2012 time frame, neural networks “didn’t have enough data.” Then, a “big bang” occurred when computing power multiplied and very large labelled data sets grew at Amazon, Google, and elsewhere. The trifecta was complete with advances in neural network techniques for image, video, and real-time voice recognition, among others.

During the training process, Brown noted, neural networks “figure out the important parts of the data” and then “converge to a set of significant features and parameters.”

Chris Rowen, who recently started Cognite Ventures to advise deep-learning startups, said he is “becoming aware of a lot more interest from the EDA industry” in deep learning techniques, adding that “problems in manufacturing also are very suitable” to the approach.

Chris Rowen, Cognite Ventures

For the semiconductor industry, Rowen said, deep-learning techniques are akin to “a shiny new hammer” that companies are still trying to figure out how to put to good use. But since yield questions are so important, and the causes of defects are often so hard to pinpoint, deep learning is an attractive approach to semiconductor companies.

“When you have masses of data, and you know what the outcome is but have no clear idea of what the causality is, (deep learning) can bring a complex model of causality that is very hard to do with manual methods,” said Rowen, an IEEE fellow who earlier was the CEO of Tensilica Inc.

The magic of deep learning, Rowen said, is that the learning process is highly automated and “doesn’t require a fab expert to look at the particular defect patterns.”

“It really is a rather brute force, naïve method. You don’t really know what the constituent patterns are that lead to these particular failures. But if you have enough examples that relate inputs to outputs, to defects or to failures, then you can use deep learning.”

Juan Rey, senior director of engineering at Mentor Graphics, said Mentor engineers have started investigating deep-learning techniques which could improve models of the lithography process steps, a complex issue that Rey said “is an area where deep neural networks and machine learning seem to be able to help.”

Juan Rey, Mentor Graphics

In the lithography process “we need to create an approximate model of what needs to be analyzed. For example, for photolithography specifically, there is the transition between dark and clear areas, where the slope of intensity for that transition zone plays a very clear role in the physics of the problem being solved. The problem tends to be that the design, the exact formulation, cannot be used in every space, and we are limited by the computational resources. We need to rely on a few discrete measurements, perhaps a few tens of thousands, maybe more, but it still is a discrete data set, and we don’t know if that is enough to cover all the cases when we model the full chip,” he said.

“Where we see an opportunity for deep learning is to try to do an interpretation for that problem, given that an exhaustive analysis is impossible. Using these new types of algorithms, we may be able to move from a problem that is continuous to a problem with a discrete data set.”

Mentor seeks to cooperate with academia and with research consortia such as IMEC. “We want to find the right research projects to sponsor between our research teams and academic teams. We hope that we can get better results with these new types of algorithms, and in the longer term with the new hardware that is being developed,” Rey said.

Many companies are developing specialized processors to run machine-learning algorithms, including non-Von Neumann, asynchronous architectures, which could offer several orders of magnitude less power consumption. “We are paying a lot of attention to the research, and would like to use some of these chips to solve some of the problems that the industry has, problems that are not very well served right now,” Rey said.

While power savings can still be gained with synchronous architectures, Rey said brain-inspired projects such as Qualcomm’s Zeroth processor, or the use of memristors being developed at H-P Labs, may be able to deliver significant power savings. “These are all worth paying attention to. It is my feeling that different architectures may be needed to deal with unstructured data. Otherwise, total power consumption is going through the roof. For unstructured data, these types of problem can be dealt with much better with neuromorphic computers.”

The use of deep learning techniques is moving beyond the biggest players, such as Google, Amazon, and the like. Just as various system integrators package the open source modules of the Hadoop data base technology into a more-secure offering, several system integrators are offering workstations packaged with the appropriate deep-learning tools.

Deep learning has evolved to play a role in speech recognition used in Amazon’s Echo. Source: Amazon

Robert Stober, director of systems engineering at Bright Computing, bundles AI software and tools with hardware based on Nvidia or Intel processors. “Our mission statement is to deploy deep learning packages, infrastructure, and clusters, so there is no more digging around for weeks and weeks by your expensive data scientists,” Stober said.

Deep learning is driving new the need for new types of processors as well as high-speed interconnects. Tim Miller, senior vice president at One Stop Systems, said that training the neural networks used in deep learning is an ideal task for GPUs because they can perform parallel calculations, sharply reducing the training time. However, GPUs often are large and require cooling, which most systems are not equipped to handle.

David Kanter, principal consultant at Real World Technologies, said “as I look at what’s driving the industry, it’s about convolutional neural networks, and using general-purpose hardware to do this is not the most efficient thing.”

However, research efforts focused on using new materials or futuristic architectures may over-complicate the situation for data scientists outside of the research arena. At the International Electron Devices Meeting (IEDM 2017), several research managers discussed using spin torque magnetic (STT-MRAM) technology, or resistive RAMs (ReRAM), to create dense, power-efficient networks of artificial neurons.

While those efforts are worthwhile from a research standpoint, Kanter said “when proving a new technology, you want to minimize the situation, and if you change the software architecture of neural networks, that is asking a lot of programmers, to adopt a different programming method.”

While Nvidia, Intel, and others battle it out at the high end for the processors used in training the neural network, the inference engines which use the results of that training must be less expensive and consume far less power.

Kanter said “today, most inference processing is done on general-purpose CPUs. It does not require a GPU. Most people I know at Google do not use a GPU. Since the (inference processing) workload load looks like the processing of DSP algorithms, it can be done with special-purpose cores from Tensilica (now part of Cadence) or ARC (now part of Synopsys). That is way better than any GPU,” Kanter said.

Rowen was asked if the end-node inference engine will blossom into large volumes. “I would emphatically say, yes, powerful inference engines will be widely deployed” in markets such as imaging, voice processing, language recognition, and modeling.

“There will be some opportunity for stand-alone inference engines, but most IEs will be part of a larger system. Inference doesn’t necessarily need hundreds of square millimeters of silicon. But it will be a major sub-system, widely deployed in a range of SoC platforms,” Rowen said.

Kanter noted that Nvidia has a powerful inference engine processor that has gained traction in the early self-driving cars, and Google has developed an ASIC to process its Tensor deep learning software language.

In many other markets, what is needed are very low power consumption IEs that can be used in security cameras, voice processors, drones, and many other markets. Nvidia CEO Jen Hsung Huang, in a blog post early this year, said that deep learning will spur demand for billions of devices deployed in drones, portable instruments, intelligent cameras, and autonomous vehicles.

“Someday, billions of intelligent devices will take advantage of deep learning to perform seemingly intelligent tasks,” Huang wrote. He envisions a future in which drones will autonomously find an item in a warehouse, for example, while portable medical instruments will use artificial intelligence to diagnose blood samples on-site.

In the long run, that “billions” vision may be correct, Kanter said, adding that the Nvidia CEO, an adept promoter as well as an astute company leader, may be wearing his salesman hat a bit.

“Ten years from now, inference processing will be widespread, and many SoCs will have an inference accelerator on board,” Kanter said.

Mentor Graphics U2U Meeting April 26 in Santa Clara

Monday, April 11th, 2016

thumbnail

Mentor Graphics’ User2User meeting will be held in Santa Clara on April 26, 2016. The meeting is a highly interactive, in-depth technical conference focused on real world experiences using Mentor tools to design leading-edge products.

Admission and parking for User2User is free and includes all technical sessions, lunch and a networking reception at the end of the day. Interested parties can register on-line in advance.

Wally Rhines, Chairman and CEO of Mentor Graphics, will kick things off at 9:00am with a keynote talk on “Merger Mania.“ Wally notes that in 2015, the transaction value of semiconductor mergers was at an all-time historic high.  What is much more remarkable is that the average size of the merging companies is five times as large as in the past five years, he said. This major change in the structure of the semiconductor industry suggests that there will be changes that affect everything from how we define and design products to how efficiently we develop and manufacture them. Dr. Rhines will examine the data and provide conclusions and predictions.

He will be followed by another keynote talk at 10:00 by Zach Shelby, VP of Marketing for the Internet of Things at ARM. Zach was co-founder of Sensinode, where he was CEO, CTO and Chief Nerd for the ground-breaking company before its acquisition by ARM. Before starting Sensinode, Zach led wireless networking research at the Centre for Wireless Communications and at the Technical Research Center of Finland.

After user sessions and lunch, a panel will convene at 1:00pm to address the topic “Ripple or Tidal Wave: What’s driving the next wave of innovation and semiconductor growth?” Technology innovation was once fueled by the personal computer, communications, and mobile devices. Large capital investment and startup funding was rewarded with market growth and increased silicon shipments. Things are certainly consolidating, perhaps slowing down in the semiconductor market, so what’s going to drive the next wave of growth?  What types of designs will be staffed and funded? Is it IoT?  Wearables?  Automotive?  Experts will address these and other questions and examine what is driving growth and what innovation is yet to come.

Attendees can pick from nine technical tracks focused on AMS Verification, Calibre I and II, Emulation, Functional Verification, High Speed, IC Digital Implementation, PCB Flow, and Silicon Test & Yield Solutions. You’ll hear cases studies directly from users and also updates from Mentor Graphics experts.

These user sessions will be held at 11:10-12:00am, 2:00-2:50pm and 3:10-5:00pm.

A few of the highlights:

  • Oracle’s use of advanced fill techniques for improving manufacturing yield
  • How Xilinx built a custom ESD verification methodology on the Calibre platform
  • Qualcomm used emulation for better RTL design exploration for power, leading to more accurate power analysis and sign-off at the gate level
  • Micron’s experience with emulation, a full environment for debug of SSD controller designs, plus future plans for emulation
  • Microsoft use of portable stimulus to increase productivity, automate the creation of high-quality stimulus, and increase design quality
  • Formal verification at MicroSemi to create a rigorous, pre-code check-in review process that prevents bugs from infecting the master RTL
  • A methodology for modeling, simulation of highly integrated multi-die package designs at SanDisk
  • How Samsung and nVidia use new Automatic RTL Floorplanning capabilities on their advanced SoC designs
  • Structure test at AMD: traditional ATPG and Cell-Aware ATPG flows, as well as verification flows and enhancements

Other users presenting include experts from Towerjazz, Broadcom, GLOBALFOUNDRIES, Silicon Creations, MaxLinear, Silicon Labs, Marvell, HiSilicon, Qualcomm, Soft Machines, Agilent, Samtec, Honewell, ST Microelectronics, SHLC, ViaSat, Optimum, NXP, ON Semiconductor and MCD.

The day winds up with a closing session and networking reception from 5:00-6:00pm.

Registration is from 8:00-9:00am in the morning.

3DIC Technology Drivers and Roadmaps

Monday, June 22nd, 2015

thumbnail

By Ed Korczynski, Sr. Technical Editor

After 15 years of targeted R&D, through-silicon via (TSV) formation technology has been established for various applications. Figure 1 shows that there are now detailed roadmaps for different types of 3-dimensional (3D) ICs well established in industry—first-order segmentation based on the wiring-level/partitioning—with all of the unit-processes and integration needed for reliable functionality shown. Using block-to-block integration with 5 micron lines at leading international IC foundries such as GlobalFoundries, systems stacking logic and memory such as the Hybrid Memory Cube (HMC) are now in production.

Fig. 1: Today’s 3D technology landscape segmented by wiring-level, showing cross-sections of typical 2-tier circuit stacks, and indicating planned reductions in contact pitches. (Source: imec)

“There are interposers for high-end complex SOC design with good yield,” informed Eric Beyne, Scientific Director Advanced Packaging & Interconnect for imec in an exclusive interview with Solid State Technology. ““For a systems company, once you’ve made the decision to go 3D there’s no way back,” said Beyne. “If you need high-bandwidth memory, for example, then you’re committed to some sort of 3D. The process is happening today.” Beyne is scheduled to talk about 3D technology driven by 3D application requirements in the imec Technology Forum to be held July 13 in San Francisco.

Adaptation of TSV for stacking of components into a complete functional system is key to high-volume demand. Phil Garrou, packaging technologist and SemiMD blogger, reported from the recent ConFab that Hynix is readying a second generation of high-bandwidth memory (HBM 2) for use in high performance computing (HPC) such as graphics, with products already announced like Pascal from Nvidia and Greenland from AMD.

For a normalized 1 cm2 of silicon area, wide-IO memory needs 1600 signal pins (not counting additional power and ground pins) so several thousand TSV are needed for high-performance stacked DRAM today, while in more advanced memory architectures it could go up by another factor of 10. For wide-IO HVM-2 (or Wide-IO2) the silicon consumed by IO circuitry is maybe 6 cm2 today, such that a 3D stack with shorter vertical connections would eliminate many of the drivers on the chip and would allow scaling of the micro-bumps to perhaps save a total of 4 cm2 in silicon area. 3D stacks provide such trade-offs between design and performance, so the best results are predicted for 3DICs where the partitioning can be re-done at the gate or transistor level. For example, a modern 8-core microprocessor could have over 50% of the silicon area consumed by L3-cache-memory and IO circuitry, and moving from 2D to 3D would reduce total wire-lengths and interconnect power consumptions by >50%.

There are inherent thresholds based on the High:Width ratio (H:W) that determine costs and challenges in process integration of TSV:

-    10:1 ratio is the limit for the use of relatively inexpensive physical vapor deposition (PVD) for the Cu barrier/seed (B/S),

-    20:1 ratio is the limit for the use of atomic-layer deposition (ALD) for B/S and electroless deposition (ELD) for Cu fill with 1.5 x 30 micron vias on the roadmap for the far future,

-    30:1 ratio and greater is unproven as manufacturable, though novel deposition technologies continue to be explored.

TSV Processing Results

The researchers at imec have evaluated different ways of connecting TSV to underlying silicon, and have determined that direct connections to micro-bumps are inherently superior to use of any re-distribution layer (RDL) metal. Consequently, there is renewed effort on scaling of micro-bump pitches to be able to match up with TSV. The standard minimum micro-bump pitch today of 40 micron has been shrunk to 20, and imec is now working on 10 micron with plans to go to 5 micron. While it may not help with TSV connections, an RDL layer may still be needed in the final stack and the Cu metal over-burden from TSV filling has been shown by imec to be sufficiently reproducible to be used as the RDL metal. The silicon surface area covered by TSV today is a few percents not 10s of percents, since the wiring level is global or semi-global.

Regarding the trade-offs between die-to-wafer (D2W) and wafer-to-wafer (W2W) stacking, D2W seems advantageous for most near-term solutions because of easier design and superior yield. D2W design is easier because the top die can be arbitrarily smaller silicon, instead of the identically sized chips needed in W2W stacks. Assuming the same defectivity levels in stacking, D2W yield will almost always be superior to W2W because of the ability to use strictly known-good-die. Still, there are high-density integration concepts out on the horizon that call for W2W stacking. Monolithic 3D (M3D) integration using re-grown active silicon instead of TSV may still be used in the future, but design and yield issues will be at least comparable to those of W2W stacking.

Beyne mentioned that during the recent ECTC 2015, EV Group showed impressive 250nm overlay accuracy on 450mm wafers, proving that W2W alignment at the next wafer size will be sufficient for 3D stacking. Beyne is also excited by the fact the at this year’s ECTC there was, “strong interest in thermo-compression bonding, with 18 papers from leading companies. It’s something that we’ve been working on for many years for die-to-wafer stacking, while people had mistakenly thought that it might be too slow or too expensive.”

Thermal issues for high-performance circuitry remain a potential issue for 3D stacking, particularly when working with finFETs. In 2D transistors the excellent thermal conductivity of the underlying silicon crystal acts like a built-in heat-sink to diffuse heat away from active regions. However, when 3D finFETs protrude from the silicon surface the main path for thermal dissipation is through the metal lines of the local interconnect stack, and so finFETs in general and stacks of finFETs in particular tend to induce more electro-migration (EM) failures in copper interconnects compared to 2D devices built on bulk silicon.

3D Designs and Cost Modeling

At a recent North California Chapter of the American Vacuum Society (NCCAVS) PAG-CMPUG-TFUG Joint Users Group Meeting discussing 3D chip technology held at Semi Global Headquarters in San Jose, Jun-Ho Choy of Mentor Graphics Corp. presented on “Electromigration Simulation Flow For Chip-Scale Parametric Failure Analysis.” Figure 2 shows the results from use of a physics-based model for temperature- and residual-stress-aware void nucleation and growth. Mentor has identified new failure mechanisms in TSV that are based on coefficient of thermal expansion (CTE) mismatch stresses. Large stresses can develop in lines near TSV during subsequent thermal processing, and the stress levels are layout dependent. In the worst cases the combined total stress can exceed the critical level required for void nucleation before any electrical stressing is applied. During electrical stress, EM voids were observed to initially nucleate under the TSV centers at the landing-pad interfaces even though these are the locations of minimal current-crowding, which requires proper modeling of CTE-mismatch induced stresses to explain.

Fig. 2: Calibration of an Electronic Design Automation (EDA) tool allows for accurate prediction of transistor performance depending on distance from a TSV. (Source: Mentor Graphics)

Planned for July 16, 2015 at SEMICON West in San Francisco, a presentation on “3DIC Technology Past, Present and Future” will be part of one of the side Semiconductor Technology Sessions (STS). Ramakanth Alapati, Director of Packaging Strategy and Marketing, GLOBALFOUNDRIES, will discuss the underlying economic, supply chain and technology factors that will drive productization of 3DIC technology as we know it today. Key to understanding the dynamic of technology adaptation is using performance/$ as a metric.