Part of the  

Solid State Technology


The Confab


About  |  Contact

Posts Tagged ‘HP’

Deep Learning Could Boost Yields, Increase Revenues

Thursday, March 23rd, 2017


By Dave Lammers, Contributing Editor

While it is still early days for deep-learning techniques, the semiconductor industry may benefit from the advances in neural networks, according to analysts and industry executives.

First, the design and manufacturing of advanced ICs can become more efficient by deploying neural networks trained to analyze data, though labelling and classifying that data remains a major challenge. Also, demand will be spurred by the inference engines used in smartphones, autos, drones, robots and other systems, while the processors needed to train neural networks will re-energize demand for high-performance systems.

Abel Brown, senior systems architect at Nvidia, said until the 2010-2012 time frame, neural networks “didn’t have enough data.” Then, a “big bang” occurred when computing power multiplied and very large labelled data sets grew at Amazon, Google, and elsewhere. The trifecta was complete with advances in neural network techniques for image, video, and real-time voice recognition, among others.

During the training process, Brown noted, neural networks “figure out the important parts of the data” and then “converge to a set of significant features and parameters.”

Chris Rowen, who recently started Cognite Ventures to advise deep-learning startups, said he is “becoming aware of a lot more interest from the EDA industry” in deep learning techniques, adding that “problems in manufacturing also are very suitable” to the approach.

Chris Rowen, Cognite Ventures

For the semiconductor industry, Rowen said, deep-learning techniques are akin to “a shiny new hammer” that companies are still trying to figure out how to put to good use. But since yield questions are so important, and the causes of defects are often so hard to pinpoint, deep learning is an attractive approach to semiconductor companies.

“When you have masses of data, and you know what the outcome is but have no clear idea of what the causality is, (deep learning) can bring a complex model of causality that is very hard to do with manual methods,” said Rowen, an IEEE fellow who earlier was the CEO of Tensilica Inc.

The magic of deep learning, Rowen said, is that the learning process is highly automated and “doesn’t require a fab expert to look at the particular defect patterns.”

“It really is a rather brute force, naïve method. You don’t really know what the constituent patterns are that lead to these particular failures. But if you have enough examples that relate inputs to outputs, to defects or to failures, then you can use deep learning.”

Juan Rey, senior director of engineering at Mentor Graphics, said Mentor engineers have started investigating deep-learning techniques which could improve models of the lithography process steps, a complex issue that Rey said “is an area where deep neural networks and machine learning seem to be able to help.”

Juan Rey, Mentor Graphics

In the lithography process “we need to create an approximate model of what needs to be analyzed. For example, for photolithography specifically, there is the transition between dark and clear areas, where the slope of intensity for that transition zone plays a very clear role in the physics of the problem being solved. The problem tends to be that the design, the exact formulation, cannot be used in every space, and we are limited by the computational resources. We need to rely on a few discrete measurements, perhaps a few tens of thousands, maybe more, but it still is a discrete data set, and we don’t know if that is enough to cover all the cases when we model the full chip,” he said.

“Where we see an opportunity for deep learning is to try to do an interpretation for that problem, given that an exhaustive analysis is impossible. Using these new types of algorithms, we may be able to move from a problem that is continuous to a problem with a discrete data set.”

Mentor seeks to cooperate with academia and with research consortia such as IMEC. “We want to find the right research projects to sponsor between our research teams and academic teams. We hope that we can get better results with these new types of algorithms, and in the longer term with the new hardware that is being developed,” Rey said.

Many companies are developing specialized processors to run machine-learning algorithms, including non-Von Neumann, asynchronous architectures, which could offer several orders of magnitude less power consumption. “We are paying a lot of attention to the research, and would like to use some of these chips to solve some of the problems that the industry has, problems that are not very well served right now,” Rey said.

While power savings can still be gained with synchronous architectures, Rey said brain-inspired projects such as Qualcomm’s Zeroth processor, or the use of memristors being developed at H-P Labs, may be able to deliver significant power savings. “These are all worth paying attention to. It is my feeling that different architectures may be needed to deal with unstructured data. Otherwise, total power consumption is going through the roof. For unstructured data, these types of problem can be dealt with much better with neuromorphic computers.”

The use of deep learning techniques is moving beyond the biggest players, such as Google, Amazon, and the like. Just as various system integrators package the open source modules of the Hadoop data base technology into a more-secure offering, several system integrators are offering workstations packaged with the appropriate deep-learning tools.

Deep learning has evolved to play a role in speech recognition used in Amazon’s Echo. Source: Amazon

Robert Stober, director of systems engineering at Bright Computing, bundles AI software and tools with hardware based on Nvidia or Intel processors. “Our mission statement is to deploy deep learning packages, infrastructure, and clusters, so there is no more digging around for weeks and weeks by your expensive data scientists,” Stober said.

Deep learning is driving new the need for new types of processors as well as high-speed interconnects. Tim Miller, senior vice president at One Stop Systems, said that training the neural networks used in deep learning is an ideal task for GPUs because they can perform parallel calculations, sharply reducing the training time. However, GPUs often are large and require cooling, which most systems are not equipped to handle.

David Kanter, principal consultant at Real World Technologies, said “as I look at what’s driving the industry, it’s about convolutional neural networks, and using general-purpose hardware to do this is not the most efficient thing.”

However, research efforts focused on using new materials or futuristic architectures may over-complicate the situation for data scientists outside of the research arena. At the International Electron Devices Meeting (IEDM 2017), several research managers discussed using spin torque magnetic (STT-MRAM) technology, or resistive RAMs (ReRAM), to create dense, power-efficient networks of artificial neurons.

While those efforts are worthwhile from a research standpoint, Kanter said “when proving a new technology, you want to minimize the situation, and if you change the software architecture of neural networks, that is asking a lot of programmers, to adopt a different programming method.”

While Nvidia, Intel, and others battle it out at the high end for the processors used in training the neural network, the inference engines which use the results of that training must be less expensive and consume far less power.

Kanter said “today, most inference processing is done on general-purpose CPUs. It does not require a GPU. Most people I know at Google do not use a GPU. Since the (inference processing) workload load looks like the processing of DSP algorithms, it can be done with special-purpose cores from Tensilica (now part of Cadence) or ARC (now part of Synopsys). That is way better than any GPU,” Kanter said.

Rowen was asked if the end-node inference engine will blossom into large volumes. “I would emphatically say, yes, powerful inference engines will be widely deployed” in markets such as imaging, voice processing, language recognition, and modeling.

“There will be some opportunity for stand-alone inference engines, but most IEs will be part of a larger system. Inference doesn’t necessarily need hundreds of square millimeters of silicon. But it will be a major sub-system, widely deployed in a range of SoC platforms,” Rowen said.

Kanter noted that Nvidia has a powerful inference engine processor that has gained traction in the early self-driving cars, and Google has developed an ASIC to process its Tensor deep learning software language.

In many other markets, what is needed are very low power consumption IEs that can be used in security cameras, voice processors, drones, and many other markets. Nvidia CEO Jen Hsung Huang, in a blog post early this year, said that deep learning will spur demand for billions of devices deployed in drones, portable instruments, intelligent cameras, and autonomous vehicles.

“Someday, billions of intelligent devices will take advantage of deep learning to perform seemingly intelligent tasks,” Huang wrote. He envisions a future in which drones will autonomously find an item in a warehouse, for example, while portable medical instruments will use artificial intelligence to diagnose blood samples on-site.

In the long run, that “billions” vision may be correct, Kanter said, adding that the Nvidia CEO, an adept promoter as well as an astute company leader, may be wearing his salesman hat a bit.

“Ten years from now, inference processing will be widespread, and many SoCs will have an inference accelerator on board,” Kanter said.

D2S Releases 4th-Gen IC Computational Design Platform

Friday, September 30th, 2016


By Ed Korczynski, Sr. Technical Editor

D2S ( recently released the fourth generation of its computational design platform (CDP), which enables extremely fast (400 Teraflops) and precise simulations for semiconductor design and manufacturing. The new CDP is based on NVIDIA Tesla K80 GPUs and Intel Haswell CPUs, and is architected for 24×7 cleanroom production environments. To date, 14 CDPs across four platform generations are in use by customers around the globe, including six of the latest fourth generation. In an exclusive interview with SemiMD, D2S CEO Aki Fujimura stated, “Now that GPUs and CPUs are fast-enough, they can replace other hardware and thereby free up engineering resources to focus on adding value elsewhere.”

Mask data preparation (MDP) and other aspects of IC design and manufacturing require ever-increasing levels of speed and reliability as the data sets upon which they must operate grow larger and more complex with each device generation. The Figure shows a mask needed to print arrays of sub-wavelength features includes complex curvilinear shapes which must be precisely formed even though they do not print on the wafer. Such sub-resolution assist features (SRAF) increase in complexity and density as the half-pitch decreases, so the complexity of mask data increases far more than the density of printed features.

Sub-wavelength lithography using 193nm wavelength requires ever-more complex masks to repeatably print ever smaller half-pitch (HP) features, as shown by (LEFT) a typical mask composed of complex nested curves and dots which do not print (RIGHT) in the array of 32nm HP contacts/vias represented by the small red circles. (Source: D2S)

GPUs, which were first developed as processing engines for the complex graphical content of computer games, have since emerged as an attractive option for compute-intensive scientific applications due in part to their ability to run many more computing threads (up to 500x) compared to similar-generation CPUs. “Being able to process arbitrary shapes is something that mask shops will have to do,” explained Fujimura. “The world could go 193nm or EUV at any particular node, but either way there will be more features and higher complexity within the features, and all of that points to GPU acceleration.”

The D2S CDP is engineered for high reliability inside a cleanroom manufacturing environment. A few of the fab applications where CDPs are currently being used include:

  • model-based MDP for leading-edge designs that require increasingly complex mask shapes,
  • wafer plane analysis of SEM mask images to identify mask errors that print, and
  • inline thermal-effect correction of eBeam mask writers to lower write times.

“The amount of design data required to produce photomasks for leading-edge chip designs is increasing at an exponential rate, which puts more pressure on mask writing systems to maintain reasonable write times for these advanced masks. At the same time, writing these masks requires higher exposure doses and shot counts, which can cause resist proximity heating effects that lead to mask CD errors,” stated Noriaki Nakayamada, group manager at NuFlare Technology. “D2S GPU acceleration technology significantly reduces the calculation time required to correct these resist heating effects. By employing a resist heating correction that includes the use of the D2S CDP as an OEM option on our mask writers, NuFlare estimates that it can reduce CD errors by more than 60 percent, and reduce write times by more than 20 percent.”

In the E-beam Initiative 2015 survey, the most advanced reported mask-set contained >100 masks of which ~20% could be considered ‘critical’. The just released 2016 survey disclosed that the most complex single-layer mask design written last year required 16 TB of data, however platforms like D2S’ CDP have been used to accelerate writing such that the average reported write times have decreased to a weighted average of 4 hours. Meanwhile, the longest reported mask write time decreased from 72 to 48 hours.