Give Your .NET App Brains and Brawn with the Intelligence of Neural Networks
Christopher M. Frenz
This article uses the following technologies: .NET, Visual Basic
Pattern recognition is an
increasingly complex field. Every day technologies such as handwriting
recognition software, spam filters, and search engines are required to
identify ever more complicated patterns. The difficulty that arises when
these tasks are attempted through traditional programming is that they
involve a multitude of variables and, more often than not, the
relationships between these variables cannot be explicitly defined. For
example, the differences between spam and legitimate e-mail are often
fuzzy, so hardcoding a set of criteria to differentiate between the two
can be difficult. To deal with these and similar issues, programmers are
beginning to move away from such approaches and are adopting nonlinear
programming techniques such as neural networks.
Neural Networks
Artificial
neural networks are designed much like biological neural networks. Both
comprise a series of simple information processing units that operate
in parallel. In both artificial and biological networks, these simple
units are called neurons. Signals can be passed between neurons through a
series of weighted connections. The pattern of these connections
defines the architecture of the neural network and influences the
functionality for which the neural net is best suited (pattern
recognition, classification, and so on). Neural networks are able to
"learn" by adjusting the strengths of these connections until they can
approximate a function that computes the proper output for a given input
pattern.
In this article I'll
examine one of the most common types of neural networks, the
feed-forward neural network, which is often used for pattern recognition
and predictive purposes. I'll provide a small example program from
medical informatics. The premise is that you have been hired by a group
of doctors who are trying to predict their patients' risk for developing
heart disease. Over the years they have monitored changes in potential
risk factors of past patients, such as blood pressure, weight, and so
on, and recorded whether these patients developed heart disease. The
neural network under development will be trained with this information
so that the doctors can predict the heart disease risk for their
patients and take appropriate preventative action.
The
typical architecture of a feed-forward neural network contains three
layers: an input layer, a hidden layer, and an output layer (see Figure 1).
The input layer transfers the array of input values into the neural
network. The input layer data is then multiplied by a weight matrix (wij)
and passed into the hidden layer neurons. Every possible
interconnection possesses its own weight. Therefore the weight matrix
has n × m dimensions, where n is the number of input layer neurons and m
is the number of hidden layer neurons. Note that not every
interconnection must actually exist, a case that can be modeled using a
weight of zero for that interconnection.
Figure 1 Feed-Forward Neural Net
The
hidden layer neurons allow the network to represent how the elements of
a complex pattern work together to produce a given output. The hidden
layer increases the number of weighted interconnections. This means that
the neural network can approximate more complex functions. In fact, the
most basic neural network architectures, such as single-layer
perceptrons, lack the ability to approximate even the XOR function. A
multilayer neural network like the one shown in Figure 1
can readily approximate the XOR function (such that specifying binary
input values at the input layer will yield the correct XOR value of
these input values at the output layer). For solving highly complex
patterns, some neural networks will even employ some additional hidden
layers.
A similar weight matrix (wjk)
connects the hidden layer neurons to the output neurons. A bias is also
added to each neuron in the hidden and output layers; it adjusts the
neuron's input value before that value passes through the neuron's
transfer function. The transfer function, sometimes known as the
activation function, takes the sum of all the neuron's weighted inputs
and uses that value to calculate the neuron's output.
Programming Neural Networks
Now that you have an idea of what neural networks are and how they operate, let's start a new Windows® Application project in Visual Studio®.
First you need to specify the number of neurons in each layer. To
determine the number of input neurons, look at the data set provided by
the physicians. The doctors provided three variables of interest: change
in cholesterol, change in weight, and a family history of the disease.
This means that you'll need a total of three input neurons. The number
of hidden layer neurons is generally determined during the training
process, which I'll discuss later, but for now pick an initial value of
3. Furthermore, since the physicians only want to know whether or not
the individual is at risk for the disease, you'll only need a single
output neuron, yielding a 3-3-1 architecture for the network.
Once
you've established the number of neurons necessary for your network,
you need to add the biases and weighted connections between these
neurons. Since there's no way of knowing the appropriate weights and
biases prior to training, randomly initialize each neuron with a weight
between –0.5 and 0.5 (see Figure 2). For larger neural
networks it may be advisable to employ a more sophisticated
initialization procedure, such as Nguyen-Widrow (a simple modification
to the common random weight initialization algorithm that can provide
for faster training times).
Figure 2 Randomizing Neuron Weights
Private Sub Init(ByVal n As Integer, ByVal m As Integer)
    Dim I, J As Integer
    Randomize()
    ' Initialize the input-to-hidden weights to random values in [-0.5, 0.5]
    For I = 0 To n - 1
        For J = 0 To m - 1
            hweight(I, J) = (Rnd() - 0.5)
            hweight2(I, J) = hweight(I, J)
        Next J
    Next I
    ' Initialize the hidden biases and hidden-to-output weights the same way
    For J = 0 To m - 1
        hbias(J) = (Rnd() - 0.5)
        hbias2(J) = hbias(J)
        oweight(J) = (Rnd() - 0.5)
        oweight2(J) = oweight(J)
    Next J
    obias = (Rnd() - 0.5)
    obias2 = obias
End Sub
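For comparison, a Nguyen-Widrow version of this initialization might look like the sketch below. This is an illustration of the general technique rather than code from the article: it draws the same random weights and then rescales each hidden neuron's weight vector to a length of beta = 0.7 * m^(1/n), with biases drawn from [-beta, beta].

Private Sub InitNguyenWidrow(ByVal n As Integer, ByVal m As Integer)
    Dim I, J As Integer
    Dim beta, norm As Double
    Randomize()
    beta = 0.7 * (m ^ (1.0 / n))   ' scale factor derived from the layer sizes
    For J = 0 To m - 1
        norm = 0
        For I = 0 To n - 1
            hweight(I, J) = (Rnd() - 0.5)
            norm = norm + hweight(I, J) ^ 2
        Next I
        norm = System.Math.Sqrt(norm)
        For I = 0 To n - 1
            ' Rescale this hidden neuron's weight vector to length beta
            hweight(I, J) = beta * hweight(I, J) / norm
            hweight2(I, J) = hweight(I, J)
        Next I
        hbias(J) = beta * 2 * (Rnd() - 0.5)   ' bias drawn from [-beta, beta]
        hbias2(J) = hbias(J)
    Next J
End Sub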
Within the code in Figure 2,
the variables prefixed with "h," such as hweight, refer to the
variables used by the hidden layer neurons, whereas those prefixed with
"o" refer to output neuron variables. The variables without the numeric
suffix are the actual weights and biases used by the network, while
those with the suffix 2 will play a role in the training process. The
training algorithm requires that you keep track of the current weights
and biases as well as the values from the previous training iteration.
At this point there have been no previous training iterations, so you
can simply initialize both sets of variables to the same values. For the
sake of simplicity, many variables in this code were not declared at
the procedural level, but rather as private form-level variables,
because they will be used by multiple procedures during program
execution. Thus, if a variable is used in a subroutine but not declared
within that subroutine, make sure to declare it in the declarations
section of your form's code, as in the sketch that follows.
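The article itself never lists those form-level declarations. The following is a minimal sketch of what they might look like; the exact array bounds are my assumption, sized for the 3-3-1 network and the eight-case training set used later:

Private InputNeuron(2) As Double                  ' input layer values
Private hweight(2, 2), hweight2(2, 2) As Double   ' hidden weights: current, previous
Private hbias(2), hbias2(2) As Double             ' hidden biases: current, previous
Private hin(2), hout(2) As Double                 ' hidden layer inputs and outputs
Private oweight(2), oweight2(2) As Double         ' output weights: current, previous
Private obias, obias2 As Double                   ' output bias: current, previous
Private oin, oout As Double                       ' output neuron input and output
Private odelta, hdelta As Double                  ' back-propagated error terms
Private doweight(2), dobias As Double             ' output layer adjustment terms
Private dhweight(2, 2), dhbias(2) As Double       ' hidden layer adjustment terms
Private alpha, mu As Double                       ' learning rate and momentum term
Private n, m As Integer                           ' input and hidden neuron counts
Private X1(7, 3) As Double                        ' training patterns plus targets
Private targval(7) As Double                      ' target output values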
At
this point the architecture of the neural network has been laid out and
the strength of the interconnections initialized, but you still lack a
means of transferring data between layers. The first step is to
establish a way to transfer the input data from the input neurons to the
hidden layer neurons. When coding, it's important to remember that each
hidden layer neuron is connected to every input neuron and that each
individual hidden layer neuron will receive the sum of the weighted
input neuron values. The bias of this hidden layer neuron is then added
to this summation. Thus, for each individual neuron the process
should proceed according to the formula
$$h_{in_j} = b_j + \sum_{i=1}^{n} x_i w_{ij}$$

where $x_i$ represents the input neuron values and $w_{ij}$ represents the weight connecting the input neuron to the hidden layer neuron. In other words, the input value to a node is the bias for that node added to the sum of each input interconnection, where the value of an interconnection is the input neuron's value multiplied by the weight of the connection. You can see this in the code that follows. The subroutine in this code snippet contains two nested loops. The inner loop of the HiddenInput subroutine carries out the summation process for each individual hidden layer neuron, as shown in the equation. The outer loop ensures that this process is repeated for each hidden layer neuron in the neural network:
Private Sub HiddenInput(ByVal n As Integer, ByVal m As Integer)
    Dim I, J As Integer
    Dim sum As Double
    For J = 0 To m - 1
        sum = 0
        For I = 0 To n - 1
            sum = sum + (InputNeuron(I) * hweight(I, J))
        Next I
        hin(J) = hbias(J) + sum
    Next J
End Sub
Once the hidden layer neuron
input values have been determined, it is time for each hidden layer
neuron to process its input by passing the value through its transfer
function. Hidden layer transfer functions are typically bipolar sigmoid
functions that control the excitation state, or value, of the neuron. A
bipolar sigmoid will generally yield an output that approaches 1 or –1,
although the sigmoid of the output neuron can be scaled to yield a range
of output values that is appropriate for the given application.
Typically, the equation for this type of sigmoid is as follows:
$$f(x) = \frac{2}{1 + e^{-x}} - 1$$

where $x$ is the neuron's scaled input. Here is how you accomplish this programmatically:
Private Sub HiddenTransfer(ByVal m As Integer)
    Dim J As Integer
    For J = 0 To m - 1
        hout(J) = Trans(hin(J))
    Next J
End Sub

Private Function Trans(ByVal Val As Double) As Double
    Dim f As Double
    f = (2 / (1 + (System.Math.Exp(-Val)))) - 1
    Trans = f
End Function
The HiddenTransfer
subroutine ensures that the scaled input of each hidden layer neuron is
processed, while the Trans function encodes the actual bipolar sigmoid.
The hout values yielded by this code now need to be passed over the next
layer of weighted connections to the output neuron.
This
process is almost identical to the process used to transfer values from
the input neurons to the hidden layer neurons, although these
connections have their own unique set of weights and the output neuron
has its own unique bias. You can accomplish this data transfer with the
following code:
Private Sub OutputInput(ByVal m As Integer)
    Dim J As Integer
    Dim sum As Double
    sum = 0
    For J = 0 To m - 1
        sum = sum + (hout(J) * oweight(J))
    Next J
    oin = obias + sum
End Sub

In this segment of code, the nested loop structure used before isn't necessary, since there is only a single output neuron.
The
output neuron also possesses a transfer function just like the hidden
layer neurons. When writing a transfer function for your output neuron,
it is important to consider the range of output values over which you
want your network to make predictions, and scale the bipolar sigmoid
accordingly. Some networks employ a more linear function as a transfer
function to encompass a wider range of possible output values. In this
case, the doctors just want to know whether their patient is at
increased or decreased risk, so stick with the same bipolar sigmoid
already used and just consider a value of –1 to represent decreased risk
and a value of 1 to represent increased risk. Since you are choosing to
use the same transfer function, just add the code that will pass the
output neuron's scaled input into this transfer function.
Private Sub OutputTransfer()
    oout = Trans(oin)
End Sub
Back-Propagation of Error with Momentum
Now
the architecture of the neural network is laid out, but it still lacks
the ability to learn. The next step is to develop a method of adjusting
the weights and biases so that over time the neural network will be able
to accurately predict disease based on the input variables that it is
presented. This is accomplished through a process known as
back-propagation of error, which utilizes a gradient descent algorithm
(a form of hill climbing) that seeks to minimize the error of the values
that are output from the neural network.
The
first step in this process is to take an output computed by the neural
network for a given pattern and compare it to a corresponding target
value. Target values are known outcomes for given patterns; they are
used as part of the training process so that the network can learn which
patterns can be associated with which output values. An error
information term, delta (δ), is then calculated by multiplying the
difference between these two terms by the derivative of the activation
function.
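In symbols, $\delta = (t - y)\,f'(y_{in})$, where $t$ is the target value, $y$ is the computed output, and $y_{in}$ is the neuron's scaled input. The update code in Figures 3 and 4 obtains this derivative from a dtrans function whose body never appears in the article. For the bipolar sigmoid used here, $f'(x) = \tfrac{1}{2}(1 + f(x))(1 - f(x))$, so a minimal sketch of dtrans, reconstructed from that identity, might look like this:

' A sketch of the dtrans function called by UpdateOut and UpdateHidden.
' The article never lists it; this version is an assumption based on the
' derivative of the bipolar sigmoid: f'(x) = 0.5 * (1 + f(x)) * (1 - f(x)).
Private Function dtrans(ByVal Val As Double) As Double
    Dim f As Double
    f = Trans(Val)
    dtrans = 0.5 * (1 + f) * (1 - f)
End Function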
This error term is then used to compute a weight adjustment term as well as a bias adjustment term. The computation of these terms, however, also requires two additional terms to be taken into account. One term is the learning rate (alpha), which limits the size of a weight/bias adjustment step in a single training iteration. The smaller the value of alpha, the longer a network will take to train. If alpha is too large, however, the network may never reach a reasonable solution to the problem; the large step size will result in the algorithm making the network step over the set of weights and biases where the error is minimized.
The
simplest way to determine the proper value for a learning rate is trial
and error during the training process. Additionally, it may be
advantageous to have a value for alpha that is not constant, but rather
that adapts as training progresses (for example, large at first to
improve speed and then smaller later to improve accuracy). Thus one
possible improvement to the method being presented here would be to make
alpha an adaptive value by employing the delta-bar-delta rule in which
past error values can be used to make educated guesses about future
calculated error values. With these rough estimates, the system can make
more informed choices when adjusting weights.
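As a rough illustration only (nothing like this appears in the article's code), a per-weight delta-bar-delta update might look like the sketch below: each weight carries its own learning rate, which grows additively while the current gradient agrees in sign with a running average of past gradients, and shrinks multiplicatively when the signs conflict. The constants kappa, phi, and theta are hypothetical tuning values.

' Hypothetical delta-bar-delta helper: rate is this weight's current learning
' rate, grad the current gradient, and avgGrad a running average of past
' gradients; all three names are illustrative assumptions.
Private Function DeltaBarDelta(ByVal rate As Double, ByVal grad As Double, _
        ByRef avgGrad As Double) As Double
    Const kappa As Double = 0.01   ' additive increase when signs agree
    Const phi As Double = 0.5      ' multiplicative decrease when signs conflict
    Const theta As Double = 0.7    ' weighting of past gradients in the average
    If grad * avgGrad > 0 Then
        rate = rate + kappa
    ElseIf grad * avgGrad < 0 Then
        rate = rate * phi
    End If
    avgGrad = (1 - theta) * grad + theta * avgGrad
    Return rate
End Function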
The
second value is mu (µ), the momentum term. Momentum is an addition to
the weight adjustment equation. This enables the weight to change in
response to the current gradient step and also to the previous one. It
allows the network to find a reasonable solution in fewer training
iterations. When both the current step and previous step are in
agreement, it allows for a larger step size. This also reduces the
effects of anomalous data, since the momentum-dictated change will
oppose the learning rate-dictated change. The weight and bias update
equations including the momentum terms are as follows:
$$w_{jk}(t+1) = w_{jk}(t) + \alpha\,\delta_k\,h_{out_j} + \mu\left[w_{jk}(t) - w_{jk}(t-1)\right]$$

$$b_k(t+1) = b_k(t) + \alpha\,\delta_k + \mu\left[b_k(t) - b_k(t-1)\right]$$

where t represents the current set of weights/biases, t-1 the previous set, and t+1 the new set being calculated. The corresponding code is found in Figure 3.
Figure 3 UpdateOut Subroutine
Private Sub UpdateOut(ByVal I As Integer, ByVal m As Integer)
    Dim J As Integer
    ' Error term: derivative of the transfer function times (target - output)
    odelta = dtrans(oin) * (targval(I) - oout)
    For J = 0 To m - 1
        doweight(J) = (alpha * odelta * hout(J)) + _
            (mu * (oweight(J) - oweight2(J)))
        oweight2(J) = oweight(J)   ' remember the previous weight for momentum
        oweight(J) = oweight(J) + doweight(J)
    Next J
    dobias = (alpha * odelta) + (mu * (obias - obias2))
    obias2 = obias
    obias = obias + dobias
End Sub
The process here is to first
determine the value of odelta, and then enter a loop structure, which
will update all of the weighted interconnections between the output
neuron and the hidden layer neurons. Before this update occurs, the
previous oweight values are shifted to the oweight2 array. This allows
you to keep track of past weights and effectively utilize momentum. A
similar update procedure is then carried out for the bias value.
The
UpdateOut subroutine back-propagates the error to the interconnections
present between the output and hidden layer neurons, but a procedure is
still needed to do the same for the interconnections between the hidden
layer and the input neurons. The procedure for updating the weights is
similar to that used for the weights of the connections between the
output and hidden layers, with the difference being that there are no
target values that can be used to calculate the error of each neuron.
Instead, you can calculate the error term of each hidden layer neuron
using the value of odelta multiplied by the weight of the connection
between the current hidden layer neuron and the output neuron. This
allows for the distribution of the error of the output unit back to all
units within the hidden layer. From this point onward, the procedure is
the same as the previous UpdateOut method (see Figure 4).
Figure 4 UpdateHidden Subroutine
Private Sub UpdateHidden(ByVal n As Integer, ByVal m As Integer)
    Dim I, J As Integer
    For J = 0 To m - 1
        ' Distribute the output error back to each hidden layer neuron
        hdelta = (odelta * oweight(J)) * dtrans(hin(J))
        For I = 0 To n - 1
            dhweight(I, J) = (alpha * hdelta * InputNeuron(I)) + _
                (mu * (hweight(I, J) - hweight2(I, J)))
            hweight2(I, J) = hweight(I, J)
            hweight(I, J) = hweight(I, J) + dhweight(I, J)
        Next I
        dhbias(J) = (alpha * hdelta) + (mu * (hbias(J) - hbias2(J)))
        hbias2(J) = hbias(J)
        hbias(J) = hbias(J) + dhbias(J)
    Next J
End Sub
The only notable difference
between the UpdateHidden subroutine and the UpdateOut subroutine is that
since there can be multiple hidden layer neurons as well as multiple
input neurons, you need to utilize some additional loops to ensure that
you update every weight and bias appropriately.
Putting the Network to Work
Now
that all the functional components of the neural network are laid out,
you need to put these components to use. It's time to train
the neural network using a set of training data. A proper training set
will contain a set of input patterns with a corresponding set of target
output values.
I use a data set to
hold values for a patient's change in cholesterol, change in weight, and
family history of the disease. For the two change-based values, the
neural network will be able to accept floating point or integer values; a
positive value indicates an increase in weight or cholesterol and a
negative value indicates a decrease. Since family history of the disease
is a yes or no situation, the values are –1 if there is no family
history or 1 if there is a family history of the disease.
Although
there are only three variables in this network, neural networks can
successfully process many more variables. This would make the pattern
more complex and would likely require a longer training time and/or a
larger training set. Long training times, however, are especially
problematic with the back-propagation algorithm used here. The code in
this article is just a sample to demonstrate neural networks; further
reading on more sophisticated techniques should be conducted before
production-scale neural networks are considered. A list of useful
references can be found in the Suggested References box. Also, you
should note that Analysis Services 2005 provides a neural network
implementation that you can take advantage of. For more information, see
Jamie MacLennan's article in the September 2004 issue of MSDN® Magazine, "SQL Server 2005: Unearth the New Data Mining Features of Analysis Services 2005."
Sufficient
training data is also critical to the success of neural networks, since
the network must be exposed to a diversity of possible patterns before
it can generalize to novel ones. The more complex the pattern, the larger
the training set required. The three input values are then followed by a
–1 or 1 value that indicates whether the patient is at increased or
decreased risk for developing heart disease. Within the training set
there is data for eight such patients.
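As a purely hypothetical illustration of the format the training code expects (one value per line, read with the StreamReader's ReadLine method), the records for two such patients might look like this; the first shows rises in cholesterol and weight plus a family history, with a target of 1 (increased risk), and the second shows the reverse:

12.5
8.0
1
1
-5.0
-3.5
-1
-1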
To incorporate training into the application, add a button control to the form and add the code found in Figure 5.
In the button's Click event handler, specify a learning rate and momentum
term as well as the number of neurons in the input and hidden layers. Next
instantiate a new StreamReader, and use the ReadLine method of the
StreamReader to read the training data into array X1 from a file
entitled SampleData.txt (for simplicity of example I've hardcoded both
the path to the file and the number of data elements in the file, but in
a real application you would obviously parameterize these input
values). Then call the previously coded Init procedure to lay out the
architecture of the neural network as well as randomly initialize the
weights and biases of the network.
Figure 5 Adding Training
Private Sub Button1_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles Button1.Click
    alpha = 0.3    ' learning rate
    mu = 0.8       ' momentum term
    n = 3          ' input layer neurons
    m = 3          ' hidden layer neurons
    Dim TrainError As Double
    Dim I, J, K As Integer
    Dim NumCases As Integer = 8
    Dim TrainSR As New IO.StreamReader("C:\SampleData.txt")
    ' Read each pattern: n input values followed by the target value
    For J = 0 To NumCases - 1
        For I = 0 To n
            X1(J, I) = TrainSR.ReadLine
        Next I
        targval(J) = X1(J, n)
    Next J
    TrainSR.Close()
    Init(n, m)
    J = 0
    Do Until J = 1000
        TrainError = 0
        For I = 0 To NumCases - 1
            For K = 0 To n - 1
                InputNeuron(K) = X1(I, K)
            Next
            HiddenInput(n, m)
            HiddenTransfer(m)
            OutputInput(m)
            OutputTransfer()
            UpdateOut(I, m)
            UpdateHidden(n, m)
            Debug.WriteLine(I & " " & oout)
            TrainError = TrainError + (targval(I) - oout) ^ 2
        Next I
        ' Root mean square error over the epoch
        TrainError = System.Math.Sqrt(TrainError / NumCases)
        If TrainError < 0.01 Then
            Exit Do
        End If
        J = J + 1
    Loop
End Sub
Next, you see a nested loop
structure in which the outer loop controls the maximum number of
training iterations. The "For I" loop controls the actual training
process. A training set input pattern is transferred from the X1 array
to the input neurons. The HiddenInput subroutine uses these input neuron
values to calculate the input into each hidden layer neuron. The
HiddenTransfer subroutine then calculates the outputs of the hidden layer
neurons, and the OutputInput subroutine uses these to determine the
value that will be sent into the output neuron. The OutputTransfer
subroutine then calculates the value output from the neural network. You
can use the inserted Debug statement to write the output values to the
Output window. Watch how they start off being fairly inaccurate and then
increase in accuracy.
Of course,
given random initialization of weights and biases, the initial output
value is most likely far from the actual target value. So you can call
the UpdateOut and UpdateHidden subroutines to update the weights and
biases between the output and hidden layer and between the hidden layer
and input layer, respectively. At this point the next training pattern
is transferred to the input neurons from X1 and the process repeated.
Once
all eight training patterns have been used, the first training
iteration (or epoch) is complete and the next epoch can begin. Training
will continue until the value of J reaches its specified cutoff or the
root mean square error stored in the variable TrainError drops below the
specified point.
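In symbols, the TrainError value computed at the end of each epoch in Figure 5 is the root mean square error over the training set:

$$E_{RMS} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(t_i - y_i\right)^2}$$

where $N$ is the number of training cases, $t_i$ is the target value for pattern $i$, and $y_i$ is the value the network actually output.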
When training a
neural network, especially for more complex patterns, it is possible
that the first attempt never reasonably approximates the output values
for the given input patterns. Then it's time to adjust some parameters.
The first things to consider adjusting are the learning rate and the
momentum term. Since these values control the degree of weight
adjustment with each training step, an inappropriate value could cause
the training algorithm to be either unable to converge upon a solution
in the number of iterations specified or unable to converge at all. This
is the trial-and-error approach.
The
second parameter you could consider adjusting is the number of training
epochs, since it is possible that the number of iterations allotted was
insufficient for the network to converge upon a solution. The remaining
parameter that you could consider adjusting is the number of hidden
layer neurons, since the more hidden layer neurons within the network,
the more sophisticated the internal representations of the network can
become. Avoid using more hidden layer neurons than needed, however,
since too many promote a condition known as over-training.
An
over-trained neural network will generally be able to output highly
accurate values for the training set input patterns, but it loses the
ability to predict novel patterns. In other words, the network is only
able to create accurate predictions for sets it is familiar with. Losing
the ability to deal with novel patterns greatly diminishes the
usefulness of neural networks.
Luckily,
over-training can be easily tested for using a set of validation data.
Validation data is similar to training data in that you are aware of
what the output should be for each input pattern in the set, but it
should not repeat patterns contained in the training set. A validation
set is basically a set of known unknowns, in that the patterns are novel
to the neural network, but you know what the answers should be, and as
such can accurately assess the performance of the network. The
validation set can then be input into the neural network and the
predicted results compared to the expected results. If the results match
to within a predetermined degree of accuracy (90 percent, for instance)
the neural network can be considered properly trained and used to make
predictions for true unknowns (patterns that neither you nor the neural
network have seen before). If the network fails to predict the
validation set to within the specified degree of accuracy, you can
assume that the network has been improperly trained and therefore
discard it. A new network can then be trained and validated, and the
process repeated until you obtain a network that successfully passes
validation.
If a given set of network
parameters continues to train successfully but continually fails to
validate, then it may be beneficial to try modifying the training
parameters somewhat, since the training routine is likely converging on a
local minimum rather than the global minimum.
Figure 7 Unsuccessful Validation
Let's
assume that you now have a successfully trained neural network. You can
begin to examine the validation process by adding a button control and
the code in Figure 6 to the application. You can see
from the code that the validation procedure has much in common with the
training aspect of the neural network. It reads in a set of data,
computes an output value for each pattern in the data set, and then
compares the output value to a target value. Through this comparison it
determines how many members of the validation set were correctly
predicted, and then determines if at least two out of three data set
members were predicted correctly. If at least two out of three members
were correctly predicted, the network is properly validated and can be
used for unknown evaluation. If fewer than two members were correctly
predicted (see Figure 7), then you'll need to discard this network, retrain another, and repeat the validation process once again.
Figure 6 Validation Procedure
Private Sub Button3_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles Button3.Click
    Dim NumCases As Integer
    NumCases = 3
    ReDim X1(NumCases, n + 1)
    ReDim targval(NumCases)
    Dim I, J, K As Integer
    Dim Correct As Integer
    Dim TrainSR As New IO.StreamReader("C:\NNValid.txt")
    For J = 0 To NumCases - 1
        For I = 0 To n
            X1(J, I) = TrainSR.ReadLine
        Next I
        targval(J) = X1(J, n)
    Next J
    TrainSR.Close()
    For I = 0 To NumCases - 1
        For K = 0 To n - 1
            InputNeuron(K) = X1(I, K)
        Next
        HiddenInput(n, m)
        HiddenTransfer(m)
        OutputInput(m)
        OutputTransfer()
        ' Round the network output to -1 or 1 and compare to the target
        If targval(I) = System.Math.Round(oout) Then
            Correct = Correct + 1
        End If
    Next I
    TextBox5.Text = Correct & " out of " & NumCases & " Match. "
    If Correct >= 2 Then
        TextBox5.Text = TextBox5.Text & "Validation Successful."
    Else
        TextBox5.Text = TextBox5.Text & "Validation Unsuccessful."
    End If
End Sub
In this case I chose to validate
by determining the overall number of predictions that are correct, which
is an acceptable method for binary outputs. For nonbinary outputs,
validation is usually performed by considering the numeric discrepancies
between the output values and the target values.
Evaluating Unknown Patterns
To
make the trained neural network accessible to the physicians, you'll
need to add some textboxes for inputting the variables for each patient,
as well as a third command button for launching the evaluation of the
input pattern. You'll also need to add a fourth textbox that will
indicate to the physicians whether the patient is at increased or
decreased risk. Add the code found in Figure 8 to the click event of the command button just created. The code in Figure 8
simply reads in the values entered into each of the textboxes and
sequentially calls the HiddenInput, HiddenTransfer, OutputInput, and
OutputTransfer subroutines to obtain the neural network's prediction for
the provided input pattern. A simple conditional then translates the
neural network's numeric output into a written statement about the
patient's risk (see Figure 9).
Figure 8 Evaluating Risk
Private Sub Button2_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles Button2.Click
    InputNeuron(0) = TextBox1.Text
    InputNeuron(1) = TextBox2.Text
    InputNeuron(2) = TextBox3.Text
    HiddenInput(n, m)
    HiddenTransfer(m)
    OutputInput(m)
    OutputTransfer()
    If oout < 0 Then
        TextBox5.Text = "The patient is at a reduced risk for disease"
    Else
        TextBox5.Text = "The patient is at an increased risk for disease"
    End If
End Sub
Figure 9 Sample Output
It
is important to note that in order to make a prediction using a trained
neural network, the weights and biases do not need to be modified. If
this application were being designed for the real world, it would be
beneficial to add code that could save the weights and biases of a
trained network to a file. An alternate initialization procedure should
also be provided, where rather than randomly initializing values,
previously saved weights and biases could be loaded to allow the neural
network to immediately make effective predictions without further
training.
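As a rough sketch of what that persistence code might look like (the procedure names and the one-value-per-line file format are my invention, not part of the article):

' Hypothetical SaveWeights routine: writes the trained weights and biases
' one value per line, mirroring the format used for the training data.
Private Sub SaveWeights(ByVal path As String)
    Dim I, J As Integer
    Dim SW As New IO.StreamWriter(path)
    For I = 0 To n - 1
        For J = 0 To m - 1
            SW.WriteLine(hweight(I, J))
        Next J
    Next I
    For J = 0 To m - 1
        SW.WriteLine(hbias(J))
        SW.WriteLine(oweight(J))
    Next J
    SW.WriteLine(obias)
    SW.Close()
End Sub

' Matching LoadWeights routine, called in place of Init so that a previously
' trained network can make predictions without retraining.
Private Sub LoadWeights(ByVal path As String)
    Dim I, J As Integer
    Dim SR As New IO.StreamReader(path)
    For I = 0 To n - 1
        For J = 0 To m - 1
            hweight(I, J) = SR.ReadLine
            hweight2(I, J) = hweight(I, J)
        Next J
    Next I
    For J = 0 To m - 1
        hbias(J) = SR.ReadLine
        hbias2(J) = hbias(J)
        oweight(J) = SR.ReadLine
        oweight2(J) = oweight(J)
    Next J
    obias = SR.ReadLine
    obias2 = obias
    SR.Close()
End Sub

Calling SaveWeights after a successful validation, and LoadWeights in place of Init on subsequent runs, would give the physicians a network that is ready to use immediately.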
Conclusion
I
have briefly examined the operations behind one of the most common
types of neural networks. Even this simple example demonstrates that
neural networks can provide a highly useful methodology for
pattern matching and predictive tasks. Neural networks are a diverse
field; in addition to the feed-forward network discussed here, numerous
other types of networks can be employed, depending on the task at hand.
There are even variants of the feed-forward network such as networks
with multiple hidden layers or networks that also provide direct
weighted interconnections between the input and output layers, in
addition to the typical hidden layer connections.
Many
advances have been made in training algorithms as well. While they all
still apply the same basic principles as the back-propagation variant
discussed, many of these newer algorithms are able to converge on a
solution in far fewer iterations, which can be highly advantageous for
patterns with a large number of values.
All
in all, the neural network coded in this article only demonstrates a
fraction of the power of a modern implementation, but it should have
provided you with a glimpse of the evolving and robust arena that is the
world of neural networks.
Suggested References
- Bishop, Christopher M. Neural Networks for Pattern Recognition (Oxford University Press, 1995)
- Fausett, Laurene V. Fundamentals of Neural Networks (Prentice Hall, 1994)
- Reed, Russell D. and Marks, Robert J. II. Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks (MIT Press, 1999)
Christopher M. Frenz
is a bioinformaticist and uses neural networks to model biological
systems. He is the author of Visual Basic and Visual Basic .NET for
Scientists and Engineers (Apress, 2002). He can be reached at cfrenz@gmail.com.