Solutions for Industrial Computing Blog

Solutions for Industrial Computing

The Solutions for Industrial Computing Blog is the place for conversation and discussion about industrial computers, systems and controllers; communications and connectivity; software and control; and power strategies. Here you'll find everything from application ideas to news and industry trends to hot topics and cutting-edge innovations.

Previous in Blog: Oh, The Humanity: You Can Now Instantly Kill A Computer For $50   Next in Blog: Programmable Logic Controllers In Practical Application
7 comments

Building Brain-Inspired Computing Systems

Posted August 26, 2018 12:00 AM by bipin.r
Pathfinder Tags: computing neural networks

Did you know that the supercomputer IBM Watson required 85,000 watts to challenge and ultimately vanquish two Jeopardy! champions? Yet Watson was itself beaten in a later exhibition match by Congressman Rush Holt, who relied on a far more efficient machine – the human brain – which runs on a mere 20 watts.

My research goal is to build computing systems inspired by the brain that can learn and adapt in the real world. Machine learning algorithms can now perform complex cognitive tasks such as controlling self-driving cars and language interpretation, but their use in mobile devices and sensors embedded in the real world requires new technologies with substantially lower energy and higher efficiency.

Bipin Rajendran, associate professor of electrical and computer engineering at NJIT (right), along with S. R. Nandakumar, a graduate student in electrical engineering.

At the heart of these algorithms are artificial neural networks – mathematical models of the neurons and synapses of the brain – that are fed huge amounts of data so that the parameters of the network are autonomously adjusted to learn the hidden relationships that underlie different parts of the data.

However, the implementation of these brain-inspired algorithms on conventional computers is highly inefficient, consuming huge amounts of power and time. The reason is that in current configurations, the data storage unit (memory) and the data processing unit (processor) are physically separated, and data continually shuttles back and forth. Furthermore, while the brain encodes and processes information in the time domain using electrical spike signals, popular machine learning algorithms use memory-less models of neurons for computing.

Ph.D. students Anakha V. Babu, Shruti Kulkarni, PI Dr. Bipin Rajendran and NJIT undergraduate student John Alexiades at the NJIT research showcase event.

Hence, we can improve the efficiency of computation by designing systems that seamlessly integrate storage and data processing functions and naturally capture timing-based correlations. Memristive devices, whose conductivity depends on prior signaling activity, are ideally suited for building such "in-memory computing" architectures. Our challenge is to optimize algorithms, system architectures and device technologies to build these systems based on nanoscale devices that overcome current reliability hurdles.
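To make the time-domain point concrete, here is a minimal sketch of a leaky integrate-and-fire neuron, the kind of stateful, spike-emitting model the post contrasts with memory-less machine learning units. The Python code and its parameter values are illustrative assumptions, not something from the article: the membrane potential carries a history of past inputs, and information leaves the neuron as spike times.

    def lif_neuron(input_current, dt=1e-3, tau=20e-3, v_thresh=1.0, v_reset=0.0):
        """Leaky integrate-and-fire: the membrane potential integrates input
        over time and a spike is emitted whenever it crosses threshold."""
        v = 0.0
        spike_times = []
        for step, i_in in enumerate(input_current):
            v += dt * (-v / tau + i_in)        # leaky integration: state depends on history
            if v >= v_thresh:
                spike_times.append(step * dt)  # information carried in spike timing
                v = v_reset
        return spike_times

    print(lif_neuron([60.0] * 100))            # constant drive produces a regular spike train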


Editor's note: This is a sponsored blog post from the New Jersey Institute of Technology.

#1 - thewildotter (Guru) - 08/27/2018 2:01 PM
Re: Building Brain-Inspired Computing Systems. Xmission Line Memories for Massively Parallel Processors

Toss the Bits into the Air (or Copper)

In the early sixties there was a conference with Wang, Amdahl, and other notables in computing. One paper I recall proposed a scheme for memory via transmission. Basically, you throw a bunch of bits into the air (transmit them using radio waves) and the space through which they travel becomes a frugal, power-efficient memory. Considering von Neumann's abstract description of computing, if one could perform arithmetic/logical operations on the bits as they pass, one could create a generic computing machine with vacuum distance as memory. One early memory technology was sound waves on a sonic delay line constructed from a simple wire suspended so as not to dramatically attenuate the energy in the wire. Computing was achieved as the bits passed from one delay line into another. The point is that shift-register (as opposed to random-access) memory-based computing has historical roots. Charge-coupled device (CCD) video sensors are a more recent example, one that integrates memory and input devices in a distributed fashion.

Our current dominant computing model uses CMOS devices daisy-chained together. Despite over half a century of scaling to stay on the Moore's Law curve with excellent results, that technique has finally run out of gas. We cannot scale the insulated gate layer much farther since we are dangerously close to atomic scale now. Tunneling leakage means that we have hit a wall with scaling. Daisy-chaining CMOS transistors has the architectural problem of squandering ½CV² buckets of energy at each link of the chain. We are now at multiple-gigahertz frequencies, which means that bit widths have shrunk so much that multiple bits could fit on an on-chip wire. There have been experimental ("adiabatic") circuits which abandon daisy chaining and use techniques which modify bit values as they pass, but these circuits have not been adequately tamed. My opinion regarding the obstacles with adiabatic circuits is that we find it psychologically difficult to unlearn the lesson of decreasing the C in ½CV² that was so important in daisy-chained CMOS.
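For scale, here is a rough worked example of that ½CV² term; the capacitance, supply voltage, and toggle rate below are assumed, typical-order values rather than figures from this thread.

    c = 1e-15   # ~1 fF of switched capacitance per gate/wire segment (assumed)
    v = 0.8     # ~0.8 V supply (assumed)
    f = 3e9     # ~3 GHz toggle rate (assumed)

    e_per_switch = 0.5 * c * v**2     # energy dumped each time the node charges/discharges
    p_per_node = e_per_switch * f     # power if the node toggles every cycle
    print(e_per_switch, p_per_node)   # ~3.2e-16 J and ~1e-6 W; billions of nodes add up fast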

If, instead, we were to choose to make a uniformly distributed C on a single closed loop, we might have a lot more success. Since the velocity of propagation on a transmission line needs to be as uniform as possible, I feel that we would have the best success if we populated outputs as uniformly (and as finely grained) as possible all the way along long, smoothly curved chip nets. Energy could be coupled into these long nets with transmission line transformers paralleling these nets to customize the traveling pulses to the currently desired logic states. This achieves on-chip, highly energy-efficient memory which can be modified as it passes ALU units bridging between loops of memory. The capacitance of the ALU inputs is all seen by the memory as distributed transmission line capacitance, which does not cause CV² losses, since the inductance of the transmission line acts as the magnetic energy storage complement of the electric energy stored in the input gates of the ALUs. The memory loops and the ALU inputs could operate at higher voltages than we are accustomed to, since the transmission line losses for a given wiring resistance are lower when the voltage on the transmission line is higher. Miller capacitance could be used to drive pulses to accurate levels without needing a whole additional transmission line transformer circuit. Indeed, the transmission line transformers might be used simply to keep a pulse train energized, and not to achieve logic levels, in some designs. Please note that there is a huge process advantage to not connecting any CMOS outputs to any transmission lines directly, and that connecting CMOS outputs to adiabatic nets may well be why so many have failed at creating workable adiabatic technologies.

This approach to adiabatic circuit design is a radical conceptual change from current processor design. However, it might be testable without radical semiconductor process changes until it matures. The shift-register cache memory is like dynamic MOS memory in that it has a power requirement to retain its contents, but solid-state disk technology might be used to mirror the cache in the case of power loss. While running, the number of ports available to each cache might be higher than the customary one (or two for video), and each port might have its own RISC ALU that it services. At any one time, all of the bits in the cache might be available to one ALU or another, enabling hugely parallel processing and thus more closely resembling mucous-ware (brain computing). As this architecture matures, a great deal of genius-level effort would need to be applied to compiler technologies, but that is new job security for academia and for industrial research and development innovators.
_____________
thewildotter

#2 - Rixter (Guru) - 08/27/2018 8:34 PM
Re: Building Brain-Inspired Computing Systems

A little background...

An artificial neural network is a network of "neurons" arranged in layers. In the first layer, the inputs are the raw input data, often from a sensor. The second layer receives its inputs from the outputs of neurons in the first layer, and so on. The final layer's outputs are the answer.

Each neuron's inputs are weighted and combined through a non-linear function to generate its output. These weights are the magic that makes the neural network work. A learning algorithm changes the values of the weights so that the network converges on the "correct answer", given many examples of the data it is supposed to learn.

The interesting thing is that the network learns from examples and adjusts the connecting weights to store that knowledge. It is truly "Artificial Intelligence" because it learns from examples and doesn't just do what it is programmed to do.

https://www.digitaltrends.com/cool-tech/what-is-an-artificial-neural-network/
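A minimal sketch of that layered arrangement, in Python/NumPy; the layer sizes, the tanh non-linearity, and the random placeholder weights are illustrative assumptions, since a learning algorithm would normally set the weights.

    import numpy as np

    def layer(x, w, b):
        return np.tanh(w @ x + b)               # weighted combination passed through a non-linearity

    rng = np.random.default_rng(0)
    x = rng.normal(size=4)                      # "sensor" data feeding the first layer
    w1, b1 = rng.normal(size=(8, 4)), rng.normal(size=8)   # first layer: 4 inputs -> 8 neurons
    w2, b2 = rng.normal(size=(2, 8)), rng.normal(size=2)   # final layer: 8 -> 2 outputs
    print(layer(layer(x, w1, b1), w2, b2))      # the final layer's outputs are "the answer"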

An artificial neural network (ANN) can be simulated in a digital computer, and this works just fine, except that it requires a lot of computations to solve a simple problem.

A device called a memristor is a resistor whose resistance changes when voltage pulses are applied. If a memristor is used to encode the weight between neurons, it can be programmed with these pulses, making it an ideal element for this purpose. The big advantage is that this hardware solution is much faster than a digital computer simulation.

https://www.sciencedaily.com/releases/2016/06/160614142300.htm
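A software analogue of why the hardware route is fast: in a memristor crossbar, the stored conductances act as the weights, and each column current is a complete weighted sum produced in one physical step (Ohm's law plus Kirchhoff's current law), rather than one fetch-multiply-accumulate at a time. The array size and conductance values below are illustrative assumptions.

    import numpy as np

    G = np.array([[0.8, 0.1],          # programmed conductances = stored weights
                  [0.3, 0.9],          # (arbitrary illustrative values)
                  [0.5, 0.4]])
    V = np.array([1.0, 0.2, 0.7])      # input voltages applied to the rows

    I = V @ G                          # Kirchhoff summation of I = G*V along each column
    print(I)                           # column currents = the weighted sums, computed "in memory"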

The memristor chip that powers the new reservoir computing system. Photo: Wei Lu.

https://news.engin.umich.edu/2017/12/new-quick-learning-neural-network-powered-by-memristors/

#3 - thewildotter (Guru) - 08/28/2018 10:38 AM
Re: Building Brain-Inspired Computing Systems. Innovation via Process vs Chip Architecture

On Your Marks, Get Set, FAB

Yes, the OP explicitly mentioned memristors as an example of a technology which integrates memory and logic. It is indeed true that memristors are much like neurons in their function, since both accumulate weighted histories of the pulses they experience. I do think there could be lower-power, more brain-like processing of information in a memristor-based system than in a segregated-memory CMOS system. A major downside is that memristor performance is an analog, process-tuned quantity, with yields and process variations that will impact how it operates. How long its memory lasts and how much weighting each input pulse on each input port gets relative to other input ports are values that must be determined and refined on the process line. All these problems are likely solvable, and I do not disagree with investing in those attempts since the generic (technology non-specific) goals are on point.

The memristor, on the other hand, is a new technology relative to CMOS digital technology. CMOS fabs are cranking out the main volume of processors today. The issue with CMOS is that it traditionally segregated memory (except for latches embedded in combinatorial logic) into RAM and ROM, which placed an architectural bottleneck on simultaneous access to lots of data at the same time. Once memory was isolated, processing became bottlenecked by memory access. Thus we see the many levels of memory (multiple cache levels, RAM, non-volatile) springing up to provide a mix of access-speed/memory-size ratios to reduce the bottleneck problem. What I point out is that this tradition is blinding potential AI users to possible uses of CMOS which can provide continuous and massively parallel access to memory (the bits on embedded transmission lines). Just as delay lines in oscilloscopes allowed their designers to trade off position and time in the delay-line space/time continuum to trigger at specific points (Tektronix oscilloscope manual, 1953), delay lines on CMOS chips (given more delay via lots of distributed capacitance for more data density) can provide parallel memory access to lots of logic islands with valuable apparent time skews in the bit stream. The pulses traveling on the transmission lines might be encoded in classical absolute voltage levels, or other encodings such as pulse width or pulse height/width ratio might prove more appropriate for mimicking brain-like behavior.

My assertion is that starting on a CMOS base leverages the mature process we already understand and build in production volume. Designers experienced with CMOS could make some evolutionary changes to gain experience in a brave new world of integrated memory/logic without jumping off a cliff. A straw-man roadmap might be (a toy software model of the first two steps follows the list):

1. Get transmission line shift register memory working.

2. Extract time shifted bits from multiple shift registers and put a combinatorial logic result into a target shift register.

3. Do number 2 in parallel in a very large number of islands of combinatorial logic.

4. Experiment with alternative logic state encodings on transmission lines and develop circuit libraries to deal with those logic state representations.
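As a toy software model of steps 1 and 2 above (the loop lengths, tap positions, and XOR operation are arbitrary assumptions, not part of the proposal): bits circulate on closed delay-line loops, and an ALU taps time-shifted positions each cycle, dropping its result onto a target loop as it passes.

    from collections import deque

    loop_a = deque([1, 0, 1, 1, 0, 0, 1, 0])   # bits circulating on one closed line
    loop_b = deque([0, 1, 1, 0, 1, 0, 0, 1])   # a second circulating line
    target = deque([0] * 8)                    # result loop

    for _ in range(8):                         # one full circulation
        bit = loop_a[0] ^ loop_b[3]            # combinatorial op on two time-shifted taps
        target.rotate(1)
        target[0] = bit                        # result written onto the target loop as it passes
        loop_a.rotate(1)                       # pulses advance along the lines
        loop_b.rotate(1)

    print(list(target))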

Another advantage of the CMOS roadmap is that compiler technologies can mature more smoothly than with a cliff leap. Ultimately, there might be hybrid memristor/xmissionLine systems, or we might leap over the analog phase of maturation straight to the DSP level of operation. Embedding the primary coefficients of operation into chip architecture instead of process is usually a better move. That is, varying the sizes and operational parameters of multiple transmission line loops is a wider, faster, and easier lever than tuning a memristor process parameter. You still might gain compounding improvement over extended time frames with process variations for transmission lines. I propose a race. Do both, and see if there are advantages each has that the other does not.
_______________
thewildotter

#4 - Rixter (Guru) - 08/28/2018 10:38 PM
In reply to #3
Re: Building Brain-Inspired Computing Systems. Innovation via Process vs Chip Architecture

Quoting #3: "A major downside is that memristor performance is an analog, process-tuned quantity, with yields and process variations that will impact how it operates. How long its memory lasts and how much weighting each input pulse on each input port gets relative to other input ports are values that must be determined and refined on the process line."

Actually, you don't care if your memristors are all identical or how sensitive they are to programming pulses. You don't even care or know what value they end up with. The memristor weights are adjusted by a closed-loop learning algorithm until the network produces the "right answer" to input stimuli. (If a given memristor is less responsive, it simply gets tweaked more.) And, as far as I know, there is no change in memristor value without programming pulses, so stability is not a problem.

We haven't talked about the "learning algorithm". The one I am familiar with is called "Back Propagation". Basically, the way it works is that the error in the output neuron is "blamed on" the inputs to that neuron, proportional to their weights. These weights are adjusted in the proper direction to remove a small portion of that error. This same process is done on the neurons in the previous layer and so on back to the initial layer. By slowly adjusting the weights for each training sample, they will eventually converge on the optimum values. As you can see, it is a slow process.
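A minimal sketch of that loop in Python/NumPy; the two-layer network, sigmoid units, learning rate, and toy XOR task are assumptions for illustration, not anything from the thread. Each weight takes a small corrective step against its share of the blame for the output error, and the error shrinks slowly over many passes through the training data.

    import numpy as np

    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    rng = np.random.default_rng(1)

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # toy inputs (XOR)
    y = np.array([[0], [1], [1], [0]], dtype=float)               # desired outputs

    W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input -> hidden weights
    W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden -> output weights

    for _ in range(5000):
        h = sig(X @ W1 + b1)              # forward pass, hidden layer
        out = sig(h @ W2 + b2)            # forward pass, output layer
        err = out - y                     # output error
        d2 = err * out * (1 - out)        # blame at the output neurons
        d1 = (d2 @ W2.T) * h * (1 - h)    # blame propagated back one layer
        W2 -= 0.5 * h.T @ d2              # small corrective steps on each weight
        b2 -= 0.5 * d2.sum(axis=0)
        W1 -= 0.5 * X.T @ d1
        b1 -= 0.5 * d1.sum(axis=0)

    print(float(np.abs(err).mean()))      # mean error shrinks as the weights converge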

#5 - thewildotter (Guru) - 08/29/2018 5:52 PM
In reply to #4
Re: Building Brain-Inspired Computing Systems. Innovation via Process vs Chip Architecture

A View from the Petri Dish: 3D Print those Neurons

Rixter,

You are talking about whether memristors are effective computing machines. I already believe that they are in an academic sense. I suspect that they can eventually converge on a solution which is likely to be better than random, and that with enough hardware and time you get some modicum of brain-like behavior.

I think all of that is good and might eventually be practical. I am talking about something else. To begin with, I would like to step back and ask what the most relevant goal put forward by the OP was. I judge that it is to achieve brain-like function with far less power and time than current semiconductor technologies have been demonstrated to need. I completely and profoundly agree that this efficiency of functionality already exists in human brains, and that the current dominant paradigm of memory-segregated processing is unlikely to be a viable path to producing such function in chips in my lifetime. The OP proposes a generic path to alternative architectures which have a better hope of near-term success, namely closer integration of memory into processing power. I totally and profoundly agree on this point also. I will go farther and say that such a technique has a better near-term probability of success than all of the glitzier, buzzword-oriented technologies such as quantum entanglement, Josephson junctions, holographic computing, doo-dah, doo-dah,...

Now I am not disparaging ultimate (or even surprise) success using any of these other approaches, beyond saying that memristors behave like neurons to an obvious extent and that they do indeed represent an apparently viable way to mimic certain neural mechanisms. I also recognize that memristors are clearly an integration of memory and decision making. I concur with your assertion that the feedback inherent in neural nets can get one to an answer regardless of the process-sensitive parameters of memristors. I think we have very common ground to this point.

My desire is to see sharply better parallel computing architectures before I die, because I believe I can apply some generic and non-obvious simulation concepts to them to take a great leap in functionality. You might think of it abstractly as a biological twist on operating system architecture. Enough of that, since I am not yet ready to disclose it. I spent my vocational life in massive data computation (EDA, software to build chips), so I am familiar with semiconductor process issues and the time it takes to build viable technologies with new processes. Test chips and startups are not proof of a technology's viability. More poignantly, many excellent concepts do not catch on until they have died multiple times.

While I see the brain mimicry of neural nets and realize the magic that feedback sometimes can exhibit (see op-amp versatility and seemingly magical behavior), I have a nagging doubt that neural-net magic is the only, or even the most magical, brain trick. Basing a new technology on neural nets is a good thing, but my hunch is that it will give us too much confidence that we have unlocked the brain's "singular magic trick." If we base our whole distributed-memory computational technology on neural nets, we may be getting stuck in what the simulated annealing people call a local minimum.

OK. So how do I envision avoiding falling into this ditch? I see neural nets as a conceptual monoculture. I would like to decisively decouple a generic, highly distributed memory concept from a simplistic brain model which may run out of gas or, worse still, be a grand success. If memristors and the neural-net paradigm catch on and that makes a Siamese twin of highly distributed memory and memristors, we may win a battle and lose the AI war. The Siamese twin of segregated memory and CMOS is a prime example of why this is a bad idea.

Innovators now have a hard time believing that the generic concept of random access memory, the serialization of memory access, the baggage and timing limitations of decode, the longer latency of interaction between logic and memory, and the albatross of pre-planned addressing schemes are now all bad things, and system architects either love them all (if they watched their good run of success) or are resigned to most of them (Kondratiev saw it clearly). How does one reduce the ill effects of running a concept out of gas? The best way is to diversify.

Have at least two ways of implementing neural nets, neither of which is a joke. Simulation of neural nets on memory-segregated architectures is, per the OP (paraphrasing, and I concur), a joke due to performance. So, if you want fast success on your way to AI, should you put all your eggs in one process basket (memristors), or should you take measured steps toward improving a mature technology (CMOS) stuck in a local minimum (segregated memory)? Should you invest enough to enable implementation of neural nets in a few new circuits in the CMOS circuit library (initially with no process changes), or should you demonstrate faith and commitment and get behind the shiny new memristor? I'll cheat and give you the answer. It is YES.

You should do both if you can possibly afford to. If you cannot afford to, I've got news: nursing the new process is more expensive by a huge amount, and you will get way more mulligans for the money with transmission lines on CMOS. You may say that patching old tools eventually runs out of gas, and that is generically true. CMOS has been patched for a very long time, which means it has been tuned away from distributed memory for reasons of inertia (see LSSD), and that fact presents a "low-hanging fruit" opportunity. Distributed memory may be the enabler for more than one brain magic trick, and you will win big if generic transmission-line distributed memory can play both roles. Neural nets are more specific to their named behavior alone, and while neuronal operation might be reused for other brain magic, generic distributed memory might enable tricks the mucous-ware did not use. Since generic semiconductor distributed memory can implement neural nets, it is very likely to be able to implement other brain magic, even if that other magic is based on neuronal behavior. We have enjoyed a great deal of success with computers in mimicking the brain's attributes of high memory capacity and fast serial calculation. Now it is time to repeat that success, but with a distributed memory architecture for use in the matrix math, fuzzy logic, and parallel processing realms. It would be a shame to commit too much to using the brain's exact mechanisms when implementing a new (brain-inspired) data processing concept. Otherwise, we should just 3D print neurons in petri dishes and be done with it.
____________________
thewildotter

#6 - SolarEagle (Guru) - 08/30/2018 1:55 PM
In reply to #5
Re: Building Brain-Inspired Computing Systems. Innovation via Process vs Chip Architecture

https://www.meddeviceonline.com/doc/new-bioink-can-d-print-tissue-at-room-temperature-0001

Why not just print a whole brain? Then you could link them all together and connect them to the internet....

https://edgylabs.com/how-close-3d-printing-bodies

Then we could give them all encapsulated ambulatory ability....

Have I gone too far?

__________________
Break a sweat everyday doing something you enjoy

#7 - Rixter (Guru) - 08/30/2018 8:00 PM
In reply to #5
Re: Building Brain-Inspired Computing Systems. Innovation via Process vs Chip Architecture

I agree with you that you should not put all the eggs in one basket. But we have lots of baskets, and if memristors work out well, then they will be utilized. If not, they will join other great ideas that didn't make it. (I recall bubble memory, circular no-loss waveguides, etc.)

I personally think they have a lot of advantages: simplicity, non-volatility, and the variable, programmable resistance that is exactly the functionality needed for learning in neural networks. But only time will tell.


Users who posted comments:

Rixter (3); SolarEagle (1); thewildotter (3)
