Part of the  

Solid State Technology


The Confab


About  |  Contact

Posts Tagged ‘STT-MRAM’

Deep Learning Could Boost Yields, Increase Revenues

Thursday, March 23rd, 2017


By Dave Lammers, Contributing Editor

While it is still early days for deep-learning techniques, the semiconductor industry may benefit from the advances in neural networks, according to analysts and industry executives.

First, the design and manufacturing of advanced ICs can become more efficient by deploying neural networks trained to analyze data, though labelling and classifying that data remains a major challenge. Also, demand will be spurred by the inference engines used in smartphones, autos, drones, robots and other systems, while the processors needed to train neural networks will re-energize demand for high-performance systems.

Abel Brown, senior systems architect at Nvidia, said until the 2010-2012 time frame, neural networks “didn’t have enough data.” Then, a “big bang” occurred when computing power multiplied and very large labelled data sets grew at Amazon, Google, and elsewhere. The trifecta was complete with advances in neural network techniques for image, video, and real-time voice recognition, among others.

During the training process, Brown noted, neural networks “figure out the important parts of the data” and then “converge to a set of significant features and parameters.”

Chris Rowen, who recently started Cognite Ventures to advise deep-learning startups, said he is “becoming aware of a lot more interest from the EDA industry” in deep learning techniques, adding that “problems in manufacturing also are very suitable” to the approach.

Chris Rowen, Cognite Ventures

For the semiconductor industry, Rowen said, deep-learning techniques are akin to “a shiny new hammer” that companies are still trying to figure out how to put to good use. But since yield questions are so important, and the causes of defects are often so hard to pinpoint, deep learning is an attractive approach to semiconductor companies.

“When you have masses of data, and you know what the outcome is but have no clear idea of what the causality is, (deep learning) can bring a complex model of causality that is very hard to do with manual methods,” said Rowen, an IEEE fellow who earlier was the CEO of Tensilica Inc.

The magic of deep learning, Rowen said, is that the learning process is highly automated and “doesn’t require a fab expert to look at the particular defect patterns.”

“It really is a rather brute force, naïve method. You don’t really know what the constituent patterns are that lead to these particular failures. But if you have enough examples that relate inputs to outputs, to defects or to failures, then you can use deep learning.”

Juan Rey, senior director of engineering at Mentor Graphics, said Mentor engineers have started investigating deep-learning techniques which could improve models of the lithography process steps, a complex issue that Rey said “is an area where deep neural networks and machine learning seem to be able to help.”

Juan Rey, Mentor Graphics

In the lithography process “we need to create an approximate model of what needs to be analyzed. For example, for photolithography specifically, there is the transition between dark and clear areas, where the slope of intensity for that transition zone plays a very clear role in the physics of the problem being solved. The problem tends to be that the design, the exact formulation, cannot be used in every space, and we are limited by the computational resources. We need to rely on a few discrete measurements, perhaps a few tens of thousands, maybe more, but it still is a discrete data set, and we don’t know if that is enough to cover all the cases when we model the full chip,” he said.

“Where we see an opportunity for deep learning is to try to do an interpretation for that problem, given that an exhaustive analysis is impossible. Using these new types of algorithms, we may be able to move from a problem that is continuous to a problem with a discrete data set.”

Mentor seeks to cooperate with academia and with research consortia such as IMEC. “We want to find the right research projects to sponsor between our research teams and academic teams. We hope that we can get better results with these new types of algorithms, and in the longer term with the new hardware that is being developed,” Rey said.

Many companies are developing specialized processors to run machine-learning algorithms, including non-Von Neumann, asynchronous architectures, which could offer several orders of magnitude less power consumption. “We are paying a lot of attention to the research, and would like to use some of these chips to solve some of the problems that the industry has, problems that are not very well served right now,” Rey said.

While power savings can still be gained with synchronous architectures, Rey said brain-inspired projects such as Qualcomm’s Zeroth processor, or the use of memristors being developed at H-P Labs, may be able to deliver significant power savings. “These are all worth paying attention to. It is my feeling that different architectures may be needed to deal with unstructured data. Otherwise, total power consumption is going through the roof. For unstructured data, these types of problem can be dealt with much better with neuromorphic computers.”

The use of deep learning techniques is moving beyond the biggest players, such as Google, Amazon, and the like. Just as various system integrators package the open source modules of the Hadoop data base technology into a more-secure offering, several system integrators are offering workstations packaged with the appropriate deep-learning tools.

Deep learning has evolved to play a role in speech recognition used in Amazon’s Echo. Source: Amazon

Robert Stober, director of systems engineering at Bright Computing, bundles AI software and tools with hardware based on Nvidia or Intel processors. “Our mission statement is to deploy deep learning packages, infrastructure, and clusters, so there is no more digging around for weeks and weeks by your expensive data scientists,” Stober said.

Deep learning is driving new the need for new types of processors as well as high-speed interconnects. Tim Miller, senior vice president at One Stop Systems, said that training the neural networks used in deep learning is an ideal task for GPUs because they can perform parallel calculations, sharply reducing the training time. However, GPUs often are large and require cooling, which most systems are not equipped to handle.

David Kanter, principal consultant at Real World Technologies, said “as I look at what’s driving the industry, it’s about convolutional neural networks, and using general-purpose hardware to do this is not the most efficient thing.”

However, research efforts focused on using new materials or futuristic architectures may over-complicate the situation for data scientists outside of the research arena. At the International Electron Devices Meeting (IEDM 2017), several research managers discussed using spin torque magnetic (STT-MRAM) technology, or resistive RAMs (ReRAM), to create dense, power-efficient networks of artificial neurons.

While those efforts are worthwhile from a research standpoint, Kanter said “when proving a new technology, you want to minimize the situation, and if you change the software architecture of neural networks, that is asking a lot of programmers, to adopt a different programming method.”

While Nvidia, Intel, and others battle it out at the high end for the processors used in training the neural network, the inference engines which use the results of that training must be less expensive and consume far less power.

Kanter said “today, most inference processing is done on general-purpose CPUs. It does not require a GPU. Most people I know at Google do not use a GPU. Since the (inference processing) workload load looks like the processing of DSP algorithms, it can be done with special-purpose cores from Tensilica (now part of Cadence) or ARC (now part of Synopsys). That is way better than any GPU,” Kanter said.

Rowen was asked if the end-node inference engine will blossom into large volumes. “I would emphatically say, yes, powerful inference engines will be widely deployed” in markets such as imaging, voice processing, language recognition, and modeling.

“There will be some opportunity for stand-alone inference engines, but most IEs will be part of a larger system. Inference doesn’t necessarily need hundreds of square millimeters of silicon. But it will be a major sub-system, widely deployed in a range of SoC platforms,” Rowen said.

Kanter noted that Nvidia has a powerful inference engine processor that has gained traction in the early self-driving cars, and Google has developed an ASIC to process its Tensor deep learning software language.

In many other markets, what is needed are very low power consumption IEs that can be used in security cameras, voice processors, drones, and many other markets. Nvidia CEO Jen Hsung Huang, in a blog post early this year, said that deep learning will spur demand for billions of devices deployed in drones, portable instruments, intelligent cameras, and autonomous vehicles.

“Someday, billions of intelligent devices will take advantage of deep learning to perform seemingly intelligent tasks,” Huang wrote. He envisions a future in which drones will autonomously find an item in a warehouse, for example, while portable medical instruments will use artificial intelligence to diagnose blood samples on-site.

In the long run, that “billions” vision may be correct, Kanter said, adding that the Nvidia CEO, an adept promoter as well as an astute company leader, may be wearing his salesman hat a bit.

“Ten years from now, inference processing will be widespread, and many SoCs will have an inference accelerator on board,” Kanter said.

MRAM Takes Center Stage at IEDM 2016

Monday, December 12th, 2016


By Dave Lammers, Contributing Editor

The IEDM 2016 conference, held in early December in San Francisco, was somewhat of a coming-out party for magneto-resistive memory (MRAM). The MRAM presentations at IEDM were complemented by a special MRAM-focused poster session – organized by the IEEE Magnetics Society in cooperation with the IEEE Electron Devices Society (EDS) – with 33 posters and a lively crowd.

And in the opening keynote speech of the 62nd International Electron Devices Meeting, Seok-hee Lee, executive vice president at SK Hynix (Seoul), set the stage by saying that the race is on between DRAM and emerging memories such as MRAM. “Originally, people thought that DRAM scaling would stop. Then engineers in the DRAM and NAND worlds worked hard and pushed out the end further in the future,” he said.

While cautioning that MRAM bit cells are larger than in DRAM and thus more more costly, Lee said MRAM has “very strong potential in embedded memory.”

SK Hynix is not the only company with a full-blown MRAM development effort underway. Samsung, which earlier bought MRAM startup Grandis and which has a materials-related research relationship with IBM, attracted a standing-room-only crowd to its MRAM paper at IEDM. TSMC is working with TDK on its program, and Sony is using 300mm wafers to build high-performance MRAMs for startup Avalanche Technology.

And one knowledgeable source said “the biggest processor company also has purchased a lot of equipment” for its MRAM development effort.

Dave Eggleston, vice president of emerging memory at GlobalFoundries, said he believes GlobalFoundries is the furthest along on the MRAM optimization curve, partly due to its technology and manufacturing partnership with Everspin Technologies (Chandler, Ariz.). Everspin has been working on MRAM for more than 20 years, and has shipped nearly 60 million discrete MRAMs, largely to the cache buffering and industrial markets.

GlobalFoundries has announced plans to use embedded STT-MRAM in its 22FDX platform, which uses fully-depleted SOI technology, as early as 2018.

Future versions of MRAM– such as spin orbit torque (SOT) MRAM and Voltage Controlled MRAM — could compete with SRAM and DRAM. Analysts said today’s spin-transfer torque STT-MRAM – referring to the torque that arises from the transfer of electron spins to the free magnetic layer — is vying for commercial adoption as ever-faster processors need higher performance memory subsystems.

STT-MRAM is fast enough to fit in as a new memory layer below the processor and the SRAM-based L1/L2 cache layers, and above DRAM and storage-level NAND flash layers, said Gary Bronner, vice president of research at Rambus Inc.

With good data retention and speed, and medium density, MRAM “may have advantages in the lower-level caches” of systems which have large amounts of on-chip SRAM, Bronner said, due in part to MRAM’s smaller cell size than six-transistor SRAM. While DRAM in the sub-20nm nodes faces cost issues as its moves to more complex capacitor structures, Bronner said that “thus far STT-MRAM) is not cheaper than DRAM.”

IBM researchers, which pioneered the spin-transfer torque approach to MRAM, are working on a high-performance MRAM technology which could be used in servers.

As of now, MRAM density is limited largely by the size of the transistors required to drive sufficient current to the magnetic tunnel junction (MTJ) to flip its magnetic orientation. Dan Edelstein, an IBM fellow working on MRAM development at IBM Research, said “it is a tall order for MRAM to replace DRAM. But MRAM could be used in system-level memory architectures and as an embedded memory technology.”

PVD and etch challenges

Edelstein, who was a key figure in developing copper interconnects at IBM some twenty years ago, said MRAM only requires a few extra mask layers to be integrated into the BEOL in logic. But there remain major challenges in improving the throughput of the PVD deposition steps required to deposit the complex material stack and to control the interfacial layers.

The PVD steps must deposit approximately 30 layers and control them to Angstrom-level precision. Deposition must occur under very low base pressure, and in oxygen- and water-vapor free environments. While tool vendors are working on productization of 300mm MRAM deposition tools, Edelstein said keeping particles under control and minimizing the maintenance and chamber cleaning are all challenging.

Etching the complex materials stack is even harder. Chemical RIE is not practical for MRAMs at this point, and using ion beam etching (IBE) presents challenges in terms of avoiding re-deposition of material sputtered off during the IBE etch steps for the high-aspect-ratio MTJs.

For the tool vendors, MRAMs present challenges as companies go from R&D to high-volume manufacturing, Edelstein said.

A Samsung MRAM researcher, Y.J. Song, briefly described IBE challenges during an IEDM presentation describing an embedded STT-MRAM with a respectable 8-Mbit density and a cell size of .0364 sq. micron. “We worked to optimize the contact etching,” using IBE etch during the patterning steps, he said. The short fail rate was reduced, while keeping the processing temperature at less than 350°C, Song said.

Samsung embedded an STT-MRAM module in the copper back end of the line (BEOL) of a 28nm logic process. (Source: Samsung presentation at IEDM 2016).

Many of the presentations at IEDM described improvements in key parameters, such as the tunnel magnetic resistance (TMR), cell size, data retention, and read error rates at high temperatures or low operating voltages.

An SK Hynix presentation described a 4-Gbit STT-MRAM optimized as a stand-alone, high-density memory. “There still are reliability issues for high-density MRAM memory,” said SK Hynix’s S.-W. Chung. The industry needs to boost the TMR “as high as possible” and work on improving the “not sufficiently long” retention times.

At high temperatures, error rates tend to rise, a concern in certain applications. And since devices are subjected to brief periods of high temperatures during reflow soldering, that issue must be dealt with as well, detailed by a Bosch presentation at IEDM.

Cleans and encapsulation important

Gouri Sankar Kar, who is coordinating the MRAM research program at the Imec consortium (Leuven, Belgium), said one challenge is to reduce the cell size and pitch without damaging the magnetic properties of the magnetic tunnel junction. For the 28nm logic node, embedded MRAM would be in the range of a 200nm pitch and 45nm critical dimensions (CDs). At the IEDM poster session, Imec presented an 8nm cell size STT-MRAM that could intersect the 10nm logic node, with the MRAM pitch in the 100nm range. GlobalFoundries, Micron, Qualcomm, Sony and TSMC are among the participants in the Imec MRAM effort.

Kar said in addition to the etch challenges, post-patterning treatment and the encapsulation liner can have a major impact on MTJ materials selection. “Some metals can be cleaned immediately, and some not. For the materials stack, patterning (litho and etch) and clean optimization are crucial.”

“Chemical etch (RIE) is not really possible at this stage. All the tool vendors are working on physical sputter etch (IBE) where they can limit damage. But I would say all the major tool vendors at this point have good tools,” Kar said.

To reach volume manufacturing, tool vendors need to improve the tool up-time and reduce the maintenance cycles. There is a “tail bits” relationship between the rate of bit failures and the health of the chambers that still needs improvement. “The cleanup steps after etching are very, very critical” to the overall effort to improving the cost effectiveness of MRAM, Kar said, adding that he is “very positive” about the future of MRAM technology.

A complete flow at AMAT

Applied Materials is among the equipment companies participating in the Imec program, with TEL and Canon-Anelva also heavily involved. Beyond that, Applied has developed a complete MRAM manufacturing flow at the company’s Dan Maydan Center in Santa Clara, and presented its cooperative work with Qualcomm on MRAM development at IEDM.

In an interview, Er-Xuan Ping, the Applied Materials managing director in charge of memory and materials technologies, said about 20 different layers, including about ten different materials, must be deposited to create the magnetic tunnel junctions. As recently as a few years ago, throughput of this materials stack was “extremely slow,” he said. But now Applied’s multi-cathode PVD tool, specially developed for MRAM deposition, can deposit 5 Angstrom films in just a few seconds. Throughput is approaching 20 wafers per hour.

Applied Materials “basically created a brand-new PVD chamber” for STT-MRAM, and he said the tool has a new e-chuck, optimized chamber walls and a multi-cathode design.

The MRAM-optimized PVD tool does not have an official name yet, and Ping said he refers to it as multi-cathode PVD. With MRAM requiring deposition of so many different metals and other materials, the Applied tool does not require the wafer to be moved in and out, increasing efficiency. The shape and structure of the chamber wall, Ping said, allow absorption of downstream plasma material so that it doesn’t come back as particles.

For etch, Applied has worked to create etching processes that result in very low bit failure rates, but at relatively relaxed pitches in the 130-200nm range. “We have developed new etch technologies so we don’t think etch will be a limiting factor. But etch is still challenging, especially for cells with 50nm and smaller cell sizes. We are still in unknown territory there,” said Ping.

Jürgen Langer, R&D manager at Singulus Technology (Frankfurt, Germany), said Singulus has developed a production-optimized PVD tool which can deposit “30 material layers in the Angstrom range. We can get 20 wafers per hour throughputs, so I would say this is not a beta tool, it is for production.”

Jürgen Langer, R&D manager, presented a poster on MRAM deposition from Singulus Technology (Frankfurt, Germany).

Where does it fit?

Once the production challenges of making MRAM are ironed out, the question remains: Where will MRAM fit in the systems of tomorrow?

Tom Coughlin, a data storage consultant based in Atascadero, Calif., said embedded MRAM “could have a very important effect for industrial and consumer devices. MRAM could be part of the memory cache layers, providing power advantages over other non-volatile devices.” And with its ability to power on and power off without expending energy, MRAM could reduce overall power consumption in smart phones, cutting in to the SRAM and NOR sectors.

“MRAM definitely has a niche, replacing some DRAM and SRAM. It may replace NOR. Flash will continue for mass storage, and then there is the 3D Crosspoint from Intel. I do believe MRAM has a solid basis for being part of that menagerie. We are almost in a Cambrian explosion in memory these days,” Coughlin said.