
Solid State Technology


The Confab



Deep Learning Joins Process Control Arsenal

Friday, December 8th, 2017


By David Lammers

At the 2017 Advanced Process Control (APC 2017) conference, several companies presented implementations of deep learning to find transistor defects, align lithography steps, and apply predictive maintenance.

The application of neural networks to semiconductor manufacturing was a much-discussed trend at the 2017 APC meeting in Austin, starting out with a keynote speech by Howard Witham, Texas operations manager for Qorvo Inc. Witham said artificial intelligence has brought human beings to “a point in history, for our industry and the world in general, that is more revolutionary than a small, evolutionary step.”

People in the semiconductor industry “need to take what’s out there and figure out how to apply it to your own problems, to figure out where does the machine win, and where does the brain still win?” Witham said.

At Seagate Technology, a small team of engineers stitched together mostly packaged and open-source software running on a conventional CPU to create a convolutional neural network (CNN)-based tool to find low-level device defects.

In an APC paper entitled “Automated Wafer Image Review using Deep Learning,” Sharath Kumar Dhamodaran, an engineer/data scientist based at Seagate’s Bloomington, Minn. facility, said wafers go through several conventional visual inspection steps to identify and classify defects coming from the core manufacturing process. The low-level defects can be identified by the human eye but are prone to misclassification due to the manual nature of the inspections.

Each node in a convolutional layer takes a linear combination of the inputs from nodes in the previous layer, and then applies a nonlinearity to generate an output and pass it to nodes in the next layer. Source: Seagate
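The per-node computation described in the caption can be sketched in a few lines of stand-alone Python; this is a hypothetical illustration (the function names are invented), not Seagate's code:

```python
def node_output(inputs, weights, bias):
    """One node: a linear combination of the previous layer's outputs,
    followed by a nonlinearity (here ReLU) before passing the result on."""
    linear = sum(w * x for w, x in zip(weights, inputs)) + bias
    return max(0.0, linear)   # ReLU: negative sums produce no activation

def layer_output(inputs, weight_rows, biases):
    """A full layer: every node sees the same inputs with its own weights."""
    return [node_output(inputs, w, b) for w, b in zip(weight_rows, biases)]
```

During training, it is the weights and biases in each layer that are adjusted until the network's outputs match the labeled examples.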

“Some special types of low-level defects can occur anywhere in the device. While we do have visual inspections, mis-classifications are very common. By using deep learning, we can solve this issue, achieve higher levels of confidence, and lower mis-classification rates,” he said.

CNNs, Hadoop, and Apache

The deep learning system worked well but required a fairly extensive training cycle, based on a continuously evolving set of training images. The images were replicated from an image server into an Apache HBase table on a Hadoop cluster. The HBase table was updated every time images were added to the image server.

To improve the neural network training steps, the team artificially created zoomed-in copies of the same image to enlarge the size of the training set. This image augmentation, which came as part of a software package, was used so that the model did not see the same image twice, he said.
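A minimal sketch of that zoom-style augmentation, assuming nothing about the actual package Seagate used (the `random_zoom` helper is invented for illustration, written with only the Python standard library):

```python
import random

def random_zoom(img, rng, max_zoom=0.2):
    """Return a zoomed-in copy of a 2-D image (list of rows): crop a random
    sub-window, then resize it back to the original size by nearest-neighbour
    sampling. Each call yields a slightly different training image."""
    h, w = len(img), len(img[0])
    scale = 1.0 - rng.uniform(0.0, max_zoom)   # keep e.g. 80-100% of the frame
    ch, cw = max(1, int(h * scale)), max(1, int(w * scale))
    top, left = rng.randrange(h - ch + 1), rng.randrange(w - cw + 1)
    return [[img[top + r * ch // h][left + c * cw // w] for c in range(w)]
            for r in range(h)]

rng = random.Random(0)
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
augmented = [random_zoom(image, rng) for _ in range(5)]
```

Generating these variants on the fly means each training pass sees slightly different pixels, which is why the model rarely encounters an identical image twice.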

“Our goal was to demonstrate the power of our models, so we did no feature engineering and only minimal pre-processing,” Dhamodaran said.

A Convolutional Neural Network (CNN)

Deep neural networks are built from many processing layers, which is where the term deep learning comes from. The CNN’s processing layers are sensitive to different features, such as edges, color, and other attributes. This heterogeneity “is exploited to construct sophisticated architectures in which the neurons and layers are connected, and plays a primary role in determining the network’s ability to produce meaningful results,” he said.
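The edge sensitivity of a convolutional layer can be shown with a single hand-written filter; this toy example (pure Python, with an invented `conv2d_relu` helper) slides a vertical-edge kernel over an image and applies a ReLU nonlinearity:

```python
def conv2d_relu(img, kernel):
    """Slide a small kernel over a 2-D image (list of rows); at each position,
    take the weighted sum of the covered pixels, then apply ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(img), len(img[0])
    out = []
    for i in range(h - kh + 1):
        row = []
        for j in range(w - kw + 1):
            s = sum(img[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))   # ReLU nonlinearity
        out.append(row)
    return out

# A Sobel-style kernel that responds to dark-to-light vertical edges.
edge_kernel = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
img = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]   # step edge at column 4
response = conv2d_relu(img, edge_kernel)
```

In a trained CNN the kernel values are learned rather than hand-picked; the early layers typically converge to edge- and color-sensitive filters much like this one.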

The model was trained initially with about 7,000 images over slightly less than six hours on a conventional CPU. “If training the model had been done on a high-performance GPU, it would have taken less than a minute for several thousand images,” Dhamodaran said.

The team used commercially available software, writing code in Python and using Ubuntu, Tensorflow, Keras and other data science packages.

After the deep learning system was put into use, it produced very few false negatives on incoming images. Dhamodaran said the defect classification process was much better than the manual system: 95 percent of defects were correctly classified, with the remaining five percent mis-classified. With the manual system, images were correctly classified only 60 percent of the time.

“None of the conventional machine learning models could do what deep learning could do. But deep learning has its own limitations. Since it is a neural network it is a black box. Process engineers in a manufacturing setting would like to know ‘How does this classification happen?’ That is quite challenging.”

The team created a dashboard so that when an unseen defect occurs, the system can capture feedback from the operator; that feedback can be incorporated in the next training cycle, or used to create the training set for different processes.

The project involved fewer than six people, and took about six months to put all the pieces together. The team deployed the system on a workstation in the fab, achieving better-than-acceptable decision latency during production.

Dhamodaran said future implementations of deep learning can be developed in a shorter time, building on what the team learned in the first implementation. He declined to detail the number of features that the initial system dealt with.

Seagate engineer Tri Nguyen, a co-author, said future work involves deploying the deep learning system to more inspection steps. “This system doesn’t do anything but image processing, and the classification is good or bad. But even with blurry images, the system can achieve a high level of confidence. It frees up time and allows operators to do some root cause analysis,” Nguyen said.

Seagate engineers Tri Nguyen and Sharath Kumar Dhamodaran developed a deep learning tool for wafer inspection that sharply reduced mis-classifications.

Python, Keras, TensorFlow

Jim Redman, president of consultancy Ergo Tech (Santa Fe, N.M.), presented deep learning work done with a semiconductor manufacturer to automate lithography alignment. Redman was unabashedly positive about the potential of neural networks for chip manufacturing applications. The movement toward deep learning, he said, “really started” on November 9, 2015, when the TensorFlow software, developed within the Google Brain group, was released under an Apache 2.0 open source license.

Other tools have further eased the development of deep learning applications, Redman added, including Keras, a high-level neural network API, written in Python and capable of running on top of TensorFlow, for enabling fast experimentation.

In just the last year or so, the application of neural networks in the chip industry has made “huge advances,” Redman said, arguing that deep learning is “a wave that is coming. It is a transformative technology that will have a transformative effect on the semiconductor industry.”

In image processing and analysis, what is difficult to do with conventional techniques often can be handled more easily by neural networks. “The beauty of neural networks is that you can take training sets and teach the model something by feeding in known data. You train the model with data points, and then you feed in unknown data.”
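Redman's train-then-predict workflow can be sketched with the simplest possible trainable model, a least-squares line fit. This stand-alone toy (deliberately not a neural network, and not his stepper model) shows the same split between known training data and unknown inputs:

```python
import random

# Toy training set: a known linear "distortion" (slope 0.3, offset 0.05)
# plus small measurement noise -- the "known data" used to teach the model.
random.seed(1)
train_x = [i / 10 for i in range(50)]
train_y = [0.3 * x + 0.05 + random.gauss(0, 0.01) for x in train_x]

# Fit y = w*x + b by ordinary least squares.
n = len(train_x)
mean_x = sum(train_x) / n
mean_y = sum(train_y) / n
w = (sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
     / sum((x - mean_x) ** 2 for x in train_x))
b = mean_y - w * mean_x

# "Feed in unknown data": predict the offset at a position never seen in training.
prediction = w * 7.3 + b
```

A neural network replaces the straight line with a far more flexible function, but the workflow is the same: fit parameters to known input/output pairs, then apply the fitted model to new data.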

While Redman’s work involved lithography alignment, he said “there is no reason the same learning shouldn’t apply to etch tools or electroplaters. It is basically the same model.”

Less Code, Lower Costs

Complex fault detection and classification (FDC) modeling can involve Ph.D.s with domain expertise, while deep learning can involve models with “30-40 lines of Python code,” he said, noting that the “minimal number of lines of code translates to lower costs.”

Humans, including engineers, are not adapted to look for small details in hundreds or thousands of metrology images or SPC charts. “Humans don’t do that well. Engineers still see what they want to see. We should let computers do that. When it comes to wafer analysis and log files, it is getting too complex (for human analysis). The question now is: Can we leverage these advances in machine learning to solve our problems?”

After training a model to detect distortions for a particular stepper, based on just 35 lines of Python code, Redman said the model provided “an extremely good match between the predicted values and the actual values. We have a model that lines up exactly. It is so good it is almost obscene.”

Redman said similar models could be applied to make sure etchers or electroplating machines were performing to expectations. And he said models can be continuously trained, using incoming flows of data to improve the model itself, rather than thinking of training as distinct from the application of the system.

“Most people talk about the training phase, but in fact we can train continuously. We run data through a model, and we can feed that back into the model, using the new data to continuously train,” he said.
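Continuous training of the kind Redman describes can be sketched as an incremental update loop. This toy example uses stochastic gradient descent on a linear model (an assumption made for illustration, not his production setup):

```python
def sgd_update(w, b, x, y, lr=0.1):
    """One incremental training step on a single new observation:
    nudge the model parameters toward the latest data point."""
    err = (w * x + b) - y
    return w - lr * err * x, b - lr * err

# Simulate a live data stream from a process following y = 2x + 1;
# the model is never "done" training -- every new point refines it.
w, b = 0.0, 0.0
for step in range(2000):
    x = (step % 9) / 4.0      # stage positions cycling over [0, 2]
    y = 2.0 * x + 1.0         # observed response
    w, b = sgd_update(w, b, x, y)
```

Because each update consumes one incoming observation, the same loop that serves predictions can keep refining the model as production data flows in.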

Machine Learning for Predictive Maintenance

Benjamin Menz, of Bosch Rexroth (Lohr am Main, Germany), addressed the challenge of how to apply machine learning to predictive maintenance.

To monitor a machine’s vibration, temperature, power signal, and other signals, companies have developed model-based rules to answer the question: Will it break in the next couple of days? Menz said.

“Machine learning can do this in a very automatic way. You don’t need tons of data to train the network, perhaps fifty measurements. A very nice example is a turning machine. The network learned very quickly that the tool is broken, even though the human cannot see it. The new approach is clearly able to see a drop in the health index, and stop production,” he said.
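A deliberately simplified sketch of a health index of the kind Menz describes: a baseline is learned from roughly fifty healthy measurements, and new windows are scored by how far they drift from it. The `health_index` formula here is invented for illustration and is not Bosch Rexroth's method:

```python
import statistics

def health_index(baseline, sample):
    """Hypothetical health score: 1.0 means 'looks like the healthy baseline';
    values near 0 mean the new measurements have drifted far from it."""
    mu = statistics.mean(baseline)
    sigma = statistics.pstdev(baseline) or 1.0
    drift = abs(statistics.mean(sample) - mu) / sigma
    return 1.0 / (1.0 + drift)

# "Train" on ~50 vibration measurements from a healthy machine ...
baseline = [1.0 + 0.01 * ((i * 7) % 5 - 2) for i in range(50)]

# ... then score new measurement windows; a drop below a chosen
# threshold would trigger maintenance or stop production.
healthy_now = health_index(baseline, [1.0, 1.01, 0.99])
worn_tool = health_index(baseline, [1.5, 1.6, 1.55])
```

A learned model replaces this single hand-written statistic with features extracted from the raw signals, but the decision logic is the same: act when the health index drops.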

Deep Learning Could Boost Yields, Increase Revenues

Thursday, March 23rd, 2017


By Dave Lammers, Contributing Editor

While it is still early days for deep-learning techniques, the semiconductor industry may benefit from the advances in neural networks, according to analysts and industry executives.

First, the design and manufacturing of advanced ICs can become more efficient by deploying neural networks trained to analyze data, though labelling and classifying that data remains a major challenge. Also, demand will be spurred by the inference engines used in smartphones, autos, drones, robots and other systems, while the processors needed to train neural networks will re-energize demand for high-performance systems.

Abel Brown, senior systems architect at Nvidia, said until the 2010-2012 time frame, neural networks “didn’t have enough data.” Then, a “big bang” occurred when computing power multiplied and very large labelled data sets grew at Amazon, Google, and elsewhere. The trifecta was complete with advances in neural network techniques for image, video, and real-time voice recognition, among others.

During the training process, Brown noted, neural networks “figure out the important parts of the data” and then “converge to a set of significant features and parameters.”

Chris Rowen, who recently started Cognite Ventures to advise deep-learning startups, said he is “becoming aware of a lot more interest from the EDA industry” in deep learning techniques, adding that “problems in manufacturing also are very suitable” to the approach.

Chris Rowen, Cognite Ventures

For the semiconductor industry, Rowen said, deep-learning techniques are akin to “a shiny new hammer” that companies are still trying to figure out how to put to good use. But since yield questions are so important, and the causes of defects are often so hard to pinpoint, deep learning is an attractive approach to semiconductor companies.

“When you have masses of data, and you know what the outcome is but have no clear idea of what the causality is, (deep learning) can bring a complex model of causality that is very hard to do with manual methods,” said Rowen, an IEEE fellow who earlier was the CEO of Tensilica Inc.

The magic of deep learning, Rowen said, is that the learning process is highly automated and “doesn’t require a fab expert to look at the particular defect patterns.”

“It really is a rather brute force, naïve method. You don’t really know what the constituent patterns are that lead to these particular failures. But if you have enough examples that relate inputs to outputs, to defects or to failures, then you can use deep learning.”

Juan Rey, senior director of engineering at Mentor Graphics, said Mentor engineers have started investigating deep-learning techniques which could improve models of the lithography process steps, a complex issue that Rey said “is an area where deep neural networks and machine learning seem to be able to help.”

Juan Rey, Mentor Graphics

In the lithography process “we need to create an approximate model of what needs to be analyzed. For example, for photolithography specifically, there is the transition between dark and clear areas, where the slope of intensity for that transition zone plays a very clear role in the physics of the problem being solved. The problem tends to be that the design, the exact formulation, cannot be used in every space, and we are limited by the computational resources. We need to rely on a few discrete measurements, perhaps a few tens of thousands, maybe more, but it still is a discrete data set, and we don’t know if that is enough to cover all the cases when we model the full chip,” he said.

“Where we see an opportunity for deep learning is to try to do an interpretation for that problem, given that an exhaustive analysis is impossible. Using these new types of algorithms, we may be able to move from a problem that is continuous to a problem with a discrete data set.”
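A one-dimensional analogue of the problem Rey describes: estimate a continuous intensity profile from a discrete set of measurements. The sketch below uses plain linear interpolation as a stand-in for illustration; a learned model generalizes this idea across many dimensions at once:

```python
from bisect import bisect_right

def interp(xs, ys, x):
    """Piecewise-linear estimate of a continuous curve from discrete
    samples (xs must be sorted ascending); clamps outside the range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])

# Discrete intensity measurements across a dark-to-clear transition ...
positions = [0.0, 1.0, 2.0, 3.0, 4.0]
intensity = [0.0, 0.1, 0.5, 0.9, 1.0]

# ... queried at a position that was never measured.
estimate = interp(positions, intensity, 2.5)
```

The open question Rey raises is whether a discrete data set of tens of thousands of such measurements is dense enough for a trained model to cover all the cases that arise on a full chip.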

Mentor seeks to cooperate with academia and with research consortia such as IMEC. “We want to find the right research projects to sponsor between our research teams and academic teams. We hope that we can get better results with these new types of algorithms, and in the longer term with the new hardware that is being developed,” Rey said.

Many companies are developing specialized processors to run machine-learning algorithms, including non-Von Neumann, asynchronous architectures, which could offer several orders of magnitude less power consumption. “We are paying a lot of attention to the research, and would like to use some of these chips to solve some of the problems that the industry has, problems that are not very well served right now,” Rey said.

While power savings can still be gained with synchronous architectures, Rey said brain-inspired projects such as Qualcomm’s Zeroth processor, or the use of memristors being developed at HP Labs, may be able to deliver significant power savings. “These are all worth paying attention to. It is my feeling that different architectures may be needed to deal with unstructured data. Otherwise, total power consumption is going through the roof. For unstructured data, these types of problems can be dealt with much better with neuromorphic computers.”

The use of deep learning techniques is moving beyond the biggest players, such as Google, Amazon, and the like. Just as system integrators package the open source modules of the Hadoop database technology into more-secure offerings, several integrators now offer workstations packaged with the appropriate deep-learning tools.

Deep learning has evolved to play a role in speech recognition used in Amazon’s Echo. Source: Amazon

Robert Stober, director of systems engineering at Bright Computing, bundles AI software and tools with hardware based on Nvidia or Intel processors. “Our mission statement is to deploy deep learning packages, infrastructure, and clusters, so there is no more digging around for weeks and weeks by your expensive data scientists,” Stober said.

Deep learning is driving the need for new types of processors as well as high-speed interconnects. Tim Miller, senior vice president at One Stop Systems, said that training the neural networks used in deep learning is an ideal task for GPUs because they can perform parallel calculations, sharply reducing the training time. However, GPUs often are large and require cooling, which most systems are not equipped to handle.

David Kanter, principal consultant at Real World Technologies, said “as I look at what’s driving the industry, it’s about convolutional neural networks, and using general-purpose hardware to do this is not the most efficient thing.”

However, research efforts focused on using new materials or futuristic architectures may over-complicate the situation for data scientists outside of the research arena. At the International Electron Devices Meeting (IEDM 2017), several research managers discussed using spin-transfer torque magnetic RAM (STT-MRAM) technology, or resistive RAMs (ReRAM), to create dense, power-efficient networks of artificial neurons.

While those efforts are worthwhile from a research standpoint, Kanter said “when proving a new technology, you want to minimize the situation, and if you change the software architecture of neural networks, that is asking a lot of programmers, to adopt a different programming method.”

While Nvidia, Intel, and others battle it out at the high end for the processors used in training the neural network, the inference engines which use the results of that training must be less expensive and consume far less power.

“Today, most inference processing is done on general-purpose CPUs. It does not require a GPU. Most people I know at Google do not use a GPU. Since the (inference processing) workload looks like the processing of DSP algorithms, it can be done with special-purpose cores from Tensilica (now part of Cadence) or ARC (now part of Synopsys). That is way better than any GPU,” Kanter said.

Rowen was asked if the end-node inference engine will blossom into large volumes. “I would emphatically say, yes, powerful inference engines will be widely deployed” in markets such as imaging, voice processing, language recognition, and modeling.

“There will be some opportunity for stand-alone inference engines, but most IEs will be part of a larger system. Inference doesn’t necessarily need hundreds of square millimeters of silicon. But it will be a major sub-system, widely deployed in a range of SoC platforms,” Rowen said.

Kanter noted that Nvidia has a powerful inference engine processor that has gained traction in the early self-driving cars, and Google has developed an ASIC, the Tensor Processing Unit, to run its TensorFlow deep learning software.

In many other markets, what is needed are very low power inference engines that can be used in security cameras, voice processors, drones, and other devices. Nvidia CEO Jen-Hsun Huang, in a blog post early this year, said that deep learning will spur demand for billions of devices deployed in drones, portable instruments, intelligent cameras, and autonomous vehicles.

“Someday, billions of intelligent devices will take advantage of deep learning to perform seemingly intelligent tasks,” Huang wrote. He envisions a future in which drones will autonomously find an item in a warehouse, for example, while portable medical instruments will use artificial intelligence to diagnose blood samples on-site.

In the long run, that “billions” vision may be correct, Kanter said, adding that the Nvidia CEO, an adept promoter as well as an astute company leader, may be wearing his salesman hat a bit.

“Ten years from now, inference processing will be widespread, and many SoCs will have an inference accelerator on board,” Kanter said.