Hopfield neural network example with implementation in Matlab and C

Modern neural networks is just playing with matrices. So in a few words, Hopfield recurrent artificial neural network shown in Fig 1 is not an exception and is a customizable matrix of weights which is used to find the local minimum (recognize a pattern). The Hopfield model accounts for associative memory through the incorporation of memory vectors and is commonly used for pattern classification.

hopfield-network

Contents

Quick reference

A Hopfield network is a form of recurrent artificial neural network popularized by John Hopfield in 1982 but described earlier by Little in 1974. Hopfield nets serve as content-addressable memory systems with binary threshold nodes. They are guaranteed to converge to a local minimum, but convergence to a false pattern (wrong local minimum) rather than the stored pattern (expected local minimum) can occur. Hopfield networks also provide a model for understanding human memory. More details – https://en.wikipedia.org/wiki/Hopfield_network

Theory

A Hopfield neural network is a particular case of a Little neural network. So it will be interesting to learn a Little neural network after.

A Hopfield neural network is a recurrent neural network what means the output of one full direct operation is the input of the following network operations, as shown in Fig 1.

Associative memory

It has been proved that Hopfield network is resistant. In general, it can be more than one fixed point. What fixed point will network converge to, depends on the starting point chosen for the initial iteration.

The fixed points called attractors. The set of points (vectors) that are attracted to a particular attractor in the network of iterations, called “attraction area” of the attractor. The set of fixed points of the Hopfield network – is its memory. In this case, the network can act as an associative memory. Those input vectors that fall within the sphere of attraction of a separate attractor, are related (associated) with them.

For example, the attractor may be some desired pattern. Attraction area may consist of noisy or incomplete versions of this pattern. It’s hoped that the pattern that vaguely resembles the desired pattern will be recalled and associated properly by a network.

Binary model

Fig 1 shows a binary Hopfield network, binary means +1 or -1. Any black and white picture could be represented as sequance of black (+1) and white (-1) pixels which constitute the input vector. The input and output vectors consist of “-1” and “+1” (instead of “-1” can be used “0”) has a symmetric weight matrix composed of integers with zero diagonal symmetric-matrix. Zero diagonal is a recommended condition for the convergence, but not the required one.

Weight matrix

The weight matrix differentiates the behavior of a one Hopfield network from another, so the question arises: “How to determine the weight matrix?“.

The answer – it’s necessary to specify a certain weight vectors, which are called instances. It is hoped that these instances are fixed points of the resulting network Hopfield. Although this is not always the case. In order to instances were attractors, it’s necessary to set the weight matrix as follows:

weight-matrix (1)

where N – the number of specified instances, and Xk– k-th instance.

If instances of the vectors form a set of orthogonal vectors, it is possible to ensure that if the weight matrix is chosen as indicated above, each copy of the vector is a fixed point. However, in general, in order to instances lead to fixed points, orthogonality is not required.

Energy

Hopfield nets have a scalar value associated with each state of the network referred to as the “energy”, E, of the network, where:

E (2)

This value is called the “energy” because the definition ensures that when points are randomly chosen to update, the energy E will either lower in value or stay the same. Furthermore, under repeated updating, the network will eventually converge to a state which is a local minimum in the energy function.

Asynchronous correction

  • Asynchronous: Only one unit is updated at a time. This unit can be picked at random, or a pre-defined order can be imposed from the very beginning.
  • Synchronous: All units are updated at the same time. This requires a central clock to the system in order to maintain synchronization. This method is less realistic since biological or physical systems lack a global clock that keeps track of time.

The input vector X is multiplied by the weight matrix using the normal matrix-vector multiplication. However, only one component of the output vector is used at each iteration. This procedure is known as “asynchronous correction“. This component, which may be randomly selected is applied to the threshold element whose output -1 or 1. The corresponding component of the input vector is replaced by the value, and thus forms the input vector for the next iteration. The process continues as long as the input and output vectors do not become the same (i.e., until a fixed point is reached)

Synchronous correction – means that the whole output vector is used at each iteration. Note that asynchronous correction is much more precise then synchronous correction, but it requires more computing power. Nowadays only asynchronous correction is commonly used.

Asynchronous correction and zeros on the diagonal of weights matrix W ensure that the energy function (2) will decrease with each iteration. Asynchronous correction – it’s particularly important to ensure convergence to the fixed point. If we would work with synchronous correction and assume that the whole vector is adjusted at each iteration, the network can be with periodic cycles like terminal states of attractors, and not with the fixed points.

Algorithm

  • Calculate the weight matrix W using the formula (1).
  • Calculate the output vector components, j = 1,2, .., n, using the formula below:
    Yj (3)

    where T(x)

  • Perform the asynchronous correction:
    • Start with the input vector Xn.
    • Find Yj-single according to formula (3).
    • Replace the Xn on the Yn and feed Y on the input X.
    • Repeat the process to find y2y3, etc. (Yj-single can be selected randomly)
  • Repeat steps 2-3 for as long as the vector Y is no longer changed. Each step reduces the amount of energy ties:
    E (2)
    so that a convergence to the fixed point (attractor) is ensured.

Matlab code

The Matlab has a newhop() function which can do the job for us, but we would like to develop the code for ourselves:

%{ Author: Alex Bod
% Website: https://www.alexbod.com
% Details: https://www.alexbod.com/hopfield-network/
% License: The GNU General Public License, version 2
% hopfield.m: Hopfield network in Matlab
%}
% Input data for learning
x = [' OO '; ...
' OO '; ...
' OOOO '; ...
' O O '; ...
' OO OO '; ...
' O O '; ...
' OOOOOOOO '; ...
' OOOOOOOO '; ...
'OO OO'; ...
'OO OO'];
x(:, :, 2) = ...
['OOOOOO '; ...
'OOOOOOO '; ...
'OO OO '; ...
'OOOOOOO '; ...
'OOOOOOO '; ...
'OO OOO '; ...
'OO OO '; ...
'OO OOO '; ...
'OOOOOOO '; ...
'OOOOOO '];
x(:, :, 3) = ...
['OOOOOOOOOO'; ...
'OOOOOOOOOO'; ...
'OO OO'; ...
'OO '; ...
'OO '; ...
'OO '; ...
'OO '; ...
'OO OO'; ...
'OOOOOOOOOO'; ...
'OOOOOOOOOO'];
x(:, :, 4) = ...
['OO OO'; ...
'OO OO'; ...
'OO OO'; ...
'OO OO'; ...
'OOOOOOOOOO'; ...
'OOOOOOOOOO'; ...
'OO OO'; ...
'OO OO'; ...
'OO OO'; ...
'OO OO'];
% Input data for recognition
y = [' OO '; ...
' OO '; ...
' OOOO '; ...
' O OO '; ...
' OO OOO '; ...
' O OO '; ...
' OOOOOOOO '; ...
' OOOOOOOO '; ...
'OO OO'; ...
'OO OO'];
y(:, :, 2) = ...
['OOOOOOO '; ...
'OOOOOOOOO '; ...
'OO OOOO '; ...
'OOOOOOOOO '; ...
'OOOOOOO '; ...
'OO OOO '; ...
'OO OO '; ...
'OO OOO '; ...
'OOOOOOO '; ...
'OOOOOOOO '];
y(:, :, 3) = ...
['OOOOOOOOOO'; ...
'OOOOOOOOOO'; ...
'OO OO'; ...
'OO '; ...
'OOOOOO '; ...
'OO OOO '; ...
'OO '; ...
'OO OO'; ...
'OOOOOOOOOO'; ...
'OOOOOOOOOO'];
y(:, :, 4) = ...
['OO OO'; ...
'OO OO'; ...
'OO OOOO OO'; ...
'OO OO'; ...
'OOOOOOOOOO'; ...
'OOOOOOOOOO'; ...
'OO OO'; ...
'OO OOOO OO'; ...
'OO OO'; ...
'OO OO'];
% Input data for learning
input = zeros(size(x,3),100);
% Input data for recognition
notcorrect = zeros(size(y,3),100);
% Make x input data binary
for n = 1:1:size(x,3)
for i = 1:1:10
for j = 1:1:10
if x(i,j,n) == 'O'
input(n,(i-1)*10+j) = 1;
else
input(n,(i-1)*10+j) = -1;
end
end
end
end
% Make y input data binary
for n = 1:1:size(y,3)
for i = 1:1:10
for j = 1:1:10
if y(i,j,n) == 'O'
notcorrect(n,(i-1)*10+j) = 1;
else
notcorrect(n,(i-1)*10+j) = -1;
end
end
end
end
% Initialize weight matrix
W = zeros(size(x,1),size(x,2));
% Calculate weight matrix = learning
for i = 1:1:size(x,1)*size(x,2)
for j = 1:1:100
weight = 0;
if (i ~= j)
for n = 1:1:size(x,3)
weight = input(n,i) .* input(n,j) + weight;
end
end
W(i,j) = weight;
end
end
% Recognition
for n = 1:1:size(y,3)
iteration = 0;
iterationOfLastChange = 0;
propagateUnit = 0;
flag = true;
while flag
% Counter
iteration = iteration + 1;
% Generate random element for the asynchronous correction
i = randi([1 size(x,1)*size(x,2)],1,1);
sum = 0;
for j = 1:1:size(x,1)*size(x,2)
sum = sum + W(i, j) * notcorrect(n,j);
end
% Therehold
out = 0;
changed = 0;
if (sum ~= 0)
if (sum < 0)
out = -1;
end
if (sum > 0)
out = +1;
end
if (out ~= notcorrect(n, i))
changed = 1;
notcorrect(n,i) = out;
end
end
% Main condition
if (changed == 1)
iterationOfLastChange = iteration;
end
% Break condition
if (iteration - iterationOfLastChange > 1000)
flag = false;
end
end
% Show the initial not correct input
for i = 1:1:size(y,1)
for j = 1:1:size(y,2)
fprintf('%s', y(i,j,n));
end
fprintf('\n');
end
fprintf('\n\n');
% Show the recognized pattern
for i = 1:1:size(x,1)*size(x,2)
if (rem(i-1, 10) == 0) && (i ~= 1)
fprintf('\n');
end
if notcorrect(n,i) == 1
fprintf('O');
elseif notcorrect(n,i) == -1
fprintf(' ');
else
fprintf('!');
end
end
fprintf('\n\n----------\n\n');
end

Clone it from the Github

C code

/* Author: Alex Bod
* Website: https://www.alexbod.com
* Details: https://www.alexbod.com/hopfield-network/
* License: The GNU General Public License, version 2
* hopfield.c: Hopfield network in C
*/
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
/* Convert points */
#define ZERO_OR_ONE(x) ((x)==-1 ? 0 : 1)
#define BINARY(x) ((x)== 0 ? -1 : 1)
#define NUMBER_OF_VECTORS 4
#define X 10
#define Y 10
#define AREA (X * Y)
/* Network struct */
typedef struct {
int points;
int* output;
int* threshold;
int** weight;
} net;
/* Input data for learning */
char x[NUMBER_OF_VECTORS][Y][X] =
{{ " OO ",
" OO ",
" OOOO ",
" O O ",
" OO OO ",
" O O ",
" OOOOOOOO ",
" OOOOOOOO ",
"OO OO",
"OO OO"},
{"OOOOOO ",
"OOOOOOO ",
"OO OO ",
"OOOOOOO ",
"OOOOOOO ",
"OO OOO ",
"OO OO ",
"OO OOO ",
"OOOOOOO ",
"OOOOOO "},
{"OOOOOOOOOO",
"OOOOOOOOOO",
"OO OO",
"OO ",
"OO ",
"OO ",
"OO ",
"OO OO",
"OOOOOOOOOO",
"OOOOOOOOOO"},
{"OO OO",
"OO OO",
"OO OO",
"OO OO",
"OOOOOOOOOO",
"OOOOOOOOOO",
"OO OO",
"OO OO",
"OO OO",
"OO OO"}};
/* Input data for recognition */
char y[NUMBER_OF_VECTORS][Y][X] =
{{" OO ",
" OO ",
" OOOO ",
" O OO ",
" OO OOO ",
" O OO ",
" OOOOOOOO ",
" OOOOOOOO ",
"OO OO",
"OO OO"},
{"OOOOOOO ",
"OOOOOOOOO ",
"OO OOOO ",
"OOOOOOOOO ",
"OOOOOOO ",
"OO OOO ",
"OO OO ",
"OO OOO ",
"OOOOOOO ",
"OOOOOOOO "},
{"OOOOOOOOOO",
"OOOOOOOOOO",
"OO OO",
"OO ",
"OOOOOO ",
"OO OOO ",
"OO ",
"OO OO",
"OOOOOOOOOO",
"OOOOOOOOOO"},
{"OO OO",
"OO OO",
"OO OOOO OO",
"OO OO",
"OOOOOOOOOO",
"OOOOOOOOOO",
"OO OO",
"OO OOOO OO",
"OO OO",
"OO OO"}};
/* Input data for learning */
int input[NUMBER_OF_VECTORS][AREA];
/* Input data for recognition */
int notcorrect[NUMBER_OF_VECTORS][AREA];
/* Print the result */
void printNet(net* network)
{
int i,j;
for (i=0; i<Y; i++) {
for (j=0; j<X; j++) {
printf("%c", ZERO_OR_ONE(network->output[i*X+j]) ? 'O' : ' ');
}
printf("\n");
}
printf("\n");
}
/* Create the net */
void createNet(net* network)
{
int i;
network->points = AREA; /* Number of points = net area */
network->output = (int*)calloc(AREA, sizeof(int));
network->threshold = (int*)calloc(AREA, sizeof(int));
network->weight = (int**)calloc(AREA, sizeof(int*));
/* Fill thresholds with zeros and allocating memory for weight matrix */
for (i=0; i<AREA; i++) {
network->threshold[i] = 0;
network->weight[i] = (int*)calloc(AREA, sizeof(int));
}
}
/* Convert points of 'O' to the binary -1 or +1 */
void pointstoBinary(net* network)
{
int n,i,j;
for (n=0; n<NUMBER_OF_VECTORS; n++) {
for (i=0; i<Y; i++) {
for (j=0; j<X; j++) {
/* Make points binary and convert 3d matrix to 2d */
input[n][i*X+j] = BINARY(x[n][i][j] == 'O');
notcorrect[n][i*X+j] = BINARY(y[n][i][j] == 'O');
}
}
}
}
/* Calculate the weight matrix = learning */
void calculateWeights(net* network)
{
int i,j,n;
int weight;
for (i=0; i<network->points; i++) {
for (j=0; j<network->points; j++) {
weight = 0;
if (i!=j) {
for (n=0; n<NUMBER_OF_VECTORS; n++) {
/* Main formula for calculating weight matrix */
weight += input[n][i] * input[n][j];
}
}
/* Fill the weight matrix */
network->weight[i][j] = weight;
}
}
}
/* Set the input vector to the Net->output */
void setInput(net* network, int* input)
{
int i;
for (i=0; i<network->points; i++) {
network->output[i] = input[i];
}
printNet(network);
}
/* Set the Net->output to the output vector */
void getOutput(net* network, int* output)
{
int i;
for (i=0; i<network->points; i++) {
output[i] = network->output[i];
}
printNet(network);
printf("----------\n\na");
}
/* Next iteration to find the local minimum = recognized pattern */
int nextIteration(net* network, int i)
{
int j;
int sum, out;
int changed;
changed = 0;
sum = 0;
for (j=0; j<network->points; j++) {
sum += network->weight[i][j] * network->output[j];
}
if (sum != network->threshold[i]) {
if (sum < network->threshold[i]) out = -1;
if (sum > network->threshold[i]) out = 1;
if (out != network->output[i]) {
changed = 1;
network->output[i] = out;
}
}
return changed;
}
/* Asynchronous correction */
void asynCor(net* network)
{
int iteration;
int iterationofLastChange;
iteration = 0;
iterationofLastChange = 0;
do {
iteration++;
/* Every time take random element for the correction */
if (nextIteration(network, rand() % (network->points)))
iterationofLastChange = iteration;
} while (iteration-iterationofLastChange < 10*network->points);
}
/* Find the local minimum = recognizing the pattern */
void findLocalMinimum(net* network, int* input)
{
int output[AREA];
/* Print not correct input for recognizing */
setInput(network, input);
/* Asynchronous correction */
asynCor(network);
/* Print recognized output */
getOutput(network, output);
}
void main()
{
net network;
int n;
/* Allocate memory and create the net */
createNet(&network);
/* Make the points matrix binary */
pointstoBinary(&network);
/* Calculate the weight matrix */
calculateWeights(&network);
/* Find the local minimum = recognizing the pattern */
for (n=0; n<NUMBER_OF_VECTORS; n++) {
findLocalMinimum(&network, notcorrect[n]);
}
}

Clone it from the Github

Application examples

Initial patterns

OO
OO
OOOO
OO
OOOO
OO
OOOOOOOO
OOOOOOOO
OOOO
OOOO

OOOOOO
OOOOOOO
OOOO
OOOOOOO
OOOOOOO
OOOOO
OOOO
OOOOO
OOOOOOO
OOOOOO

OOOOOOOOOO
OOOOOOOOOO
OOOO
OO
OO
OO
OO
OOOO
OOOOOOOOOO
OOOOOOOOOO

OOOO
OOOO
OOOO
OOOO
OOOOOOOOOO
OOOOOOOOOO
OOOO
OOOO
OOOO
OOOO

Noise test

The test pattern Noise percternentage Type of a distorted pattern The result of recognition

OO
OO
OOOO
OO
OOOO
OO
OOOOOOOO
OOOOOOOO
OOOO
OOOO

10%

OO
OOO
OOOOO
OOO
OOO
OO
OOOOOO
OOOOOO
OOOO
OOOO

OO
OO
OOOO
OO
OOOO
OO
OOOOOOOO
OOOOOOOO
OOOO
OOOO

20%

OOOO
OOO
OOOOO
OOOO
OOOO
OO
OOOOOOO
OOOOOOO
OOOO
OOOOOO

OO
OO
OOOO
OO
OOOO
OO
OOOOOOOO
OOOOOOOO
OOOO
OOOO

30%

OOOO
OOOO
OOOOOO
OOOOO
OOOO
OOO
OOOOOOOO
OOOOOOO
OOO
OOOOO

OO
OO
OOOO
OO
OOOO
OO
OOOOOOOO
OOOOOOOO
OOOO
OOOO

40%

OOOOOOO
OOOOOO
OOOOOO
OOOOO
OOOO
OOO
OOOOOOOO
OOOOOOO
OOOOOO
OOOOOO

OO
OO
OOOO
OO
OOOO
OO
OOOOOOOO
OOOOOOOO
OOOO
OOOO

50%

OOOOOOO
OOOOOO
OOOOOOO
OOOOOO
OOOOOO
OOOOOOO
OOOOOOOO
OOOOOOO
OOOOOO
OOOOOOOO

OO
OO
OOOO
OO
OOOO
OO
OOOOOOOO
OOOOOOOO
OOOO
OOOO

60%

OOOOOOO
OOOOOOOOOO
OOOOOOOOO
OOOOOOOO
OOOOO
OOOOOOOOO
OOOOOOOOO
OOOOOOOOO
OOOOOOO
OOOOOOO

OOOOOOOOOO
OOOOOOOOOO
OOOO
OOOOO
OOOO
OOOO
OO
OOO
OOOOOOOO
OOOOOOOO

70%

OOOOOOOOOO
OOOOOOOOOO
OOOOOOOOO
OOOOOOOO
OOOOO
OOOOOOO
OOOOOOOOO
OOOOOOOOOO
OOOOOOOO
OOOOOOOO

OOOOOOOO
OOOOOOOO
OOOOOO
OOOOOOOO
OOOOOO
OOOOOOOO
OO
OO
OOOOOO
OOOOOO

80%

OOOOOOOO
OOOOOOOOOO
OOOOOOOOO
OOOOOOOO
OOOOO
OOOOOOO
OOOOOOO
OOOOOO
OOOOOOOO
OOOOOO

OOOOOOOO
OOOOOOOO
OOOOOO
OOOOOOOO
OOOOOO
OOOOOOOO
OO
OO
OOOOOO
OOOOOO

90%

OOOOOOOO
OOOOOOOO
OOOOOO
OOOOOOOO
OOOOO
OOOOOOO
OOOO
OOOOOO
OOOOOO
OOOOOO

OOOOOOOO
OOOOOOOO
OOOOOO
OOOOOOOO
OOOOOO
OOOOOOOO
OO
OO
OOOOOO
OOOOOO

100%

OOOOOOOO
OOOOOOOO
OOOOOO
OOOOOOOO
OOOOOO
OOOOOOOO
OO
OO
OOOOOO
OOOOOO

OOOOOOOO
OOOOOOOO
OOOOOO
OOOOOOOO
OOOOOO
OOOOOOOO
OO
OO
OOOOOO
OOOOOO

The test above gave very accurate recognition result even if the noise level is greater than 50%, and even a man is hard to recognize.

Rotate test

The test pattern Rotate angle Type of a distorted pattern The result of recognition

OOOO
OOOO
OOOO
OOOO
OOOOOOOOOO
OOOOOOOOOO
OOOO
OOOO
OOOO
OOOO

30°

OO
OO
OOO
OOOO
OOOOOOOOO
OOOOOOOOO
OOOO
OOO
OO
OO

OOOO
OOOO
OOOO
OOOO
OOOOOOOOOO
OOOOOOOOOO
OOOO
OOOO
OOOO
OOOO

40°

O
OO
OO
OOOO
OO
O
OO
OO
OO



OOOOOO
OOOOOOOO
OOOOOOOO
OOOOOOOO
OOOOOOOO
OOOOOO

90°

OOOOOOOOOO
OOOOOOOOOO
OO
OO
OO
OO
OO
OO
OOOOOOOOOO
OOOOOOOOOO

OOOOOOOOOO
OOOOOOOOOO
OOOO
OOOOO
OOOO
OOOO
OO
OOO
OOOOOOOO
OOOOOOOO

The test above shown inability to recognize a pattern when it’s rotated, especially when the rotational angle is 90°. The first pattern is recognized because it looks like the initial pattern with the noise.

Cross associations

Let’s complicate the task and train the network to recognize one more pattern:

OOOOOOOOOO
OOOOOOOOOO
OOOO
OO
OO
OOOOOO
OOOOOO
OOOO
OOOOOOOOOO
OOOOOOOOOO

The letter “G” is very similar to already existing in the network memory letter “C”. Now the network can not recognize any of these letters, even in the undistorted state. Instead of correctly recognized letters it produces something in between (for the distortion of the pattern from 0 to 50%):

The test pattern The result of recognition

OOOOOOOOOO
OOOOOOOOOO
OOOO
OO
OO
OOOOOO
OOOOOO
OOOO
OOOOOOOOOO
OOOOOOOOOO

OOOOOOOOOO
OOOOOOOOOO
OOOO
OO
OO
OOOOOO
OOO
OOOO
OOOOOOOOOO
OOOOOOOOOO

OOOOOOOOOO
OOOOOOOOOO
OOOO
OO
OO
OO
OO
OOOO
OOOOOOOOOO
OOOOOOOOOO

OOOOOOOOOO
OOOOOOOOOO
OOOO
OO
OO
OOOOOO
OOO
OOOO
OOOOOOOOOO
OOOOOOOOOO

It looks a little bit like an every letter “G”, “C”, and it’s not a correct interpretation of any of them.

Furthermore this disturbance affected other patterns with different recognizing parameters.

The test pattern The result of recognition

OOOOOOO
OOOOOOOOO
OOOOOO
OOOOOOOOO
OOOOOOO
OOOOO
OOOO
OOOOO
OOOOOOO
OOOOOOOO

OOOOOOOOOO
OOOOOOOOOO
OOOO
OO
OO
OOOOOO
OOO
OOOO
OOOOOOOOOO
OOOOOOOOOO

But letter “A” without distortions is recognized correctly.

The test pattern The result of recognition

OO
OO
OOOO
OO
OOOO
OO
OOOOOOOO
OOOOOOOO
OOOO
OOOO

OO
OO
OOOO
OO
OOOO
OO
OOOOOOOO
OOOOOOOO
OOOO
OOOO

The described behavior of the neural network is known as the effect of “Cross associations”.

Pros and cons

Pros

  • Accurate recognition even if the noise level is greater than 50%, and even a man is hard to recognize.

Cons

  • Presence of the cross associations when multiple patterns is similar to each other (such as in the experimentation with the letters “G” and “C”).
  • Capacity limits of the the number of stored memory attractor is just (0.3/0.4)*n, where n – the dimension of the weight matrix W.
  • Inability to recognize a pattern when it’s rotated.

These cons substantially limits the practical use of the Hopfield network but I believe that with a little revision the situation can be fixed.

P.S. If you want to learn neural networks, learn mathematics, especially matrices and their operations.