A simple guide for the clueless programmer who doesn’t know the fancy terminology
Why another tutorial?
I noticed a high level of intellectual snobbery in pretty much all tutorials about high-end technologies. Those writing them get lost in technical details and miss explaining the obvious parts, the parts that are essential to the beginner. In this tutorial I will focus on the big picture first, before diving into the technical aspects. Most of you probably already know the very obvious things I will start with, but it’s better to cover them anyway, from a different perspective. This guide should be regarded as a starting point for further research and learning.
Even if some sections might seem way below your knowledge level, you – hopefully – will still find them an interesting read. This is not a tutorial for dumb people; it is simply written in a more basic manner so it is easier to follow for any potential reader – from rookies to professionals – without assuming previous knowledge of the subject matter.
How complicated is it to create Neural Networks?
I am a programmer and I want to use neural networks in my projects. Do I have to know advanced math or interpret graphs to understand what neural nets do or how to use them? NO, absolutely not. I learned programming through examples and I perfected my craft through trial and error. I want to be able to do the same with fancier tools such as NNs, but even the simplest tutorials out there miss explaining the clear and simple things; instead, most of them are a sort of knowledge show-off by those writing them.
This guide is written for people who want to learn about neural networks but don’t want to learn about the boring and confusing mathematical properties of such networks, or to get lost in scientific explanations that lead almost nowhere in practice. It is for:
- Programmers who want to extend their skills and knowledge with machine learning features
- Non-technical people who want to understand how a machine can mimic a brain
- Beginners in machine learning who feel a big picture is missing from everything they have learned so far on this subject
- Bold people who want a CLEAR starting point for learning cutting-edge technologies
Designing a NN might have its difficult parts, of course, but that doesn’t mean it has to be complicated from the first step. There are lots of libraries and tools to make this task easy. What I found missing are functional example NN topologies (the structure of an “artificial brain”). To learn something efficiently, you need to start from something functional and make your own modifications based on an example. You don’t need to learn perfect theory and aim to create a mathematically flawless structure, especially when NNs are so sensitive to the type of data analyzed.
I will try to present you with just the part of theory that is “enough” and actually useful for starting to design and use neural networks in your projects. I will also present some of the actually useful tutorials I found so far, and some of the available libraries and resources on this topic.
What you can do with neural networks, and what you can’t (or shouldn’t) try to do
Ok, so you want to learn about neural networks – great! But are you sure you know what you can achieve with them? Do you have a plan, or do you just want to take over the world, or perhaps win the lottery? Step by step, I aim to help you not just find the answers you seek, but also ask the right questions.
Neural networks, among many other things, can be used for:
(NOTE: As Sci-Fi as it might sound, this list is extracted from actual NN uses I found during my research)
- Recognize objects in images, including your face!
- Read fingerprints even if your hands are sweaty
- Predict values based on patterns you don’t know initially
- Translate text, or even compose text in a style similar to how a human would
- Improve search features and target content better
- Do stuff in a similar way to whatever you train it to do
- Find patterns in any form of data (raw data, images, sound, text, etc)
- Recognize songs or music genres
- Determine your mood from a social media comment you post
- Predict someone’s behavior in a given situation
- Make your camera snap a picture when you smile
- Control your game console by using gestures and body poses
- Find cheaters and abnormal activities in a network of people
- Find anomalies in a system
- Anything that you could possibly do with a tool that can find patterns
- Classify any sort of data
- Estimate values over time when you don’t have new data to calculate them
- Create art. After all, any method of using data in a non-precise way that eventually leads to meaningful results could be considered interpretable art, to some extent
- Solve incredibly complicated equations without actually calculating them
- Count animal species just from satellite pictures
- Determine human feelings from their reactions
- Help a machine understand what is good and bad (but not what is right or wrong…or should that be the opposite?)
- Get rich by serving advertising in a smart way
- Create incredible tools for data mining
- Find solutions for tasks that would be otherwise impossible, or too difficult to solve with normal programming techniques
- Create tools to understand animal behavior
- Improve your fishing experience
- Find the best food for you on a particular day
- Create a virtual assistant
- Use voice commands to control your computer or gadgets
- Create systems that adapt to the user
- Create fancy image filters and visual effects
- Improve sound and image quality
- Create new chemicals engineered for a specific purpose
- Find cures to diseases
This list is really huge and keeps growing. Once you understand the general type of things you can do with neural nets, only your imagination limits the possible uses of NNs.
What you can’t (currently) do with NNs:
- Build Skynet
- Create a self-aware artificial consciousness
- Transfer your brain data into a computer
- Unleash a NN on the web to let it grow on its own
- Create a universal solution for everything
- Win the lottery
- Accurately predict the stock market
- Forecast sports results accurately (still possible to some extent)
- Make a computer evolve its internal components on its own
- Predict any truly random outcome (or random enough, since truly random doesn’t really exist)
- Create systems that have genuine inspiration (however it can be faked to a decent level)
- Attempt to cram into a single neural net functionality that should be divided over multiple neural nets (e.g., recognizing handwriting and determining music genre at the same time within the same network)
It is important to understand that even if NN uses can be extremely powerful and exotic, they do have their limitations. Also – very important – artificial neural networks are NOT always the most efficient solution. They should be used only when there is no better way to achieve similar results!
When should you use NNs?
As powerful as they might seem, NNs require a lot of processing power to become functional and to learn what they are supposed to do. In many situations, there are much better alternatives to achieve the same result. A NN cannot just work out of the box. It is not a magical piece of code that does everything. It is a system that acts as a learning tool so that your software can use whatever it has learned, not just recall it (like a normal static memory would).
A NN is rarely used alone. It is usually just one part of your software’s functionality. Its efficiency depends on two things: its structure (called its topology) and its training data. It is similar to biological neural networks, such as your brain. It’s not enough just to be smart; after all, if you learn information incorrectly, you will not achieve good results.
NNs are best used in situations where the amount of data is insanely large compared to the desired output. They are also very efficient when we don’t know what sort of pattern or information should be analyzed to reach the desired output. Take for example object recognition in images. There is a lot of data in a picture; each pixel contains data about its color and position. For a simple task such as finding your dog in a picture, you would need an unreasonable amount of processing to analyze all possibilities of pixel color and position, against an unknown pattern, to determine whether there is a dog in that picture or not… simply put, it is not practically possible. A neural net, however, would not do all these absurd calculations to find a dog in a picture. Instead, it will compare many, many pictures of dogs and find whatever patterns the pixels in such images share. After a long process of learning, it will be able to “see” a dog in a new image.
Understanding this through a picture example is easy because we process images in our visual cortex constantly, and finding a dog in our line of sight is a basic and simple task for any human. However, just imagine that the same amount of processing could analyze other sorts of patterns, and instead of dogs it could detect things far more complex and out of reach for us.
What exactly IS a neural network?
No, it is not a glowing mass growing in a test tube next to your computer…sadly.
A neural network is a software-generated simulation of a biological structure composed of “neurons” connected with each other, that processes information. From a programming point of view, a NN usually has a core, a structure and a memory. The core is the part of code you use to create the network, through class functions and so on. This core loads the topology (the structure) and creates virtual neurons arranged in layers, with various properties and behaviors. Once created, this virtual layout of neurons is flooded with the data we provide, and during the process called “learning” it adjusts its internal values so that it adapts to the data you present and gets closer to the results you specify as “correct”. These internal values of the network are called “weights”. All neurons have such weights. The memory of the network is nothing more than a list of all these weights. These weights define how much a NN has learned and how capable it is of doing the given task. Its structure, on the other hand, defines how complex the analyzed data can be, and how much and in what ways it is analyzed.
The memory of a NN does not contain actual training data. If, for example, you create a NN for facial recognition, it will not contain the faces from past training examples. Instead, it will just remember the values each neuron needs in order to recognize the data.
Once trained, a NN is used by pushing new data through the same structure, with the updated internal memory, to determine or generate the sort of result it is designed to produce. Training is done on thousands, even millions of samples, and it can take many minutes or even many days to process. Using a trained network, however, is much faster.
It becomes obvious that it’s unlikely we can train an artificial brain on a device with less computing power, such as an Arduino; however, it is possible to use a NN on such a device once it is trained on a more powerful machine.
There are dedicated processors for NNs that can simulate neurons at a hardware level, making live learning possible and much faster. Such devices are more expensive and less powerful than a GPU (video card) processor – you should keep an eye on such devices, as they will be the future of neural nets.
What is the difference between a NN and conventional programming?
Conventional programming is “procedural”: the program starts at its first line of code and goes line by line until the last, executing instructions in a linear fashion. A neural network is a “connectionist” computational system. Information inside a neural net is processed collectively, in parallel, through a network of nodes called neurons.
A neural network does not compute precise output values by following precise rules. Instead, it tries to estimate the right value based on previous examples it was trained on. This sort of freedom in input and output is essential for dealing with real-life data, where the general big picture or an estimated result is more important than a precisely calculated output. It is also the only reasonable way to deal with information that cannot be processed with exact tools. For example, no two fingerprint scans are ever identical, so comparing them in a traditional procedural way to check for an exact match would make absolutely no sense, while using a neural network to approximate the result is far more efficient.
What would be a basic structure for a useful NN?
There are many types of neural nets, but some are easier to understand than others, and quite efficient. In fact, you can obtain decent results with a small-to-medium-sized network; however, if you want to increase its accuracy and capabilities you will probably have to try far more complex architectures, or provide a lot more training data.
The simplest structure for a neural net is meaningless for this guide. I don’t want to fill your heads with theoretical models that have no real practical use, or to use fancy biologically inspired names for such models. This sort of information you can get by reading ANY beginner tutorial on neural nets, except this one.
The most usable SIMPLE NN architecture I used during my learning experiments was a network composed of one input layer, two hidden layers and an output layer, with the following proportions:
Consider the input layer to be 100%; then the first hidden layer would have 75% of the number of input neurons, and the second hidden layer 50%. The size of the output layer depends on the sort of results you aim for.
What are input neurons and how many do I need?
This is one of the trivial questions that is apparently clear to everybody but the total beginner, yet nobody takes the time to explain this very simple thing… how do you determine the number of input neurons? What are they, actually?
Input neurons are the finest granulation of the data obtained from whatever you want to process, and they form the first layer of the neural net. They are the gates that receive your data and feed it into the network for processing.
Pictures presenting 3 neurons on the input are so deceiving, for me at least. A network with 3 inputs is probably so useless that it would be better implemented through conventional computing, not neural nets. Stop using bad examples, people, just for the sake of keeping the theory pure.
So let’s get a little more real with the examples. Say you analyze an image: the input neurons will be exactly one per pixel, simply because pixels are the data the picture is made of. A 128 by 64 pixel image will have its first layer defined as 128×64 = 8,192 neurons.
If you analyze other types of data instead of images, you should pre-process your data to obtain the minimum efficient granularity, so that not too much data is lost but also no useless data is passed to the network. For example, if you want to analyze data from sensors, you would have one input neuron per sensor acquiring the data.
If you analyze raw data, such as analytics related information about website visitors, or more complex sets of data, then you should pre-process the data in such a way that it will yield just enough values to represent it.
Passing all the data to a NN is a waste of resources. You can find out more about how to pre-process data for feeding it to a neural net in the section dedicated to this subject.
What are output neurons and how many do I need?
Output neurons are nothing more than the expected types of results if you are making the network classify the given information, or a single neuron if you use the network to predict a value (called regression). For example, if you create a neural net to determine the genre of a song, the number of output neurons will be equal to the number of music genres the NN is trained to recognize. When new data is passed to the NN, all the output neurons except one will come out false, and the one positive output neuron indicates what type the analyzed data is.
When training the network with data (a mandatory step before using it), you will need to specify the output in the same form the network will later return it. Each training sample also tells the network what expected result it represents. Imagine this as on/off lights: the correct answer lights up while the others stay off or very dim. The lights representation is a good one because the output neurons do not hold just on/off bits; instead, they hold variable values indicating how strongly the analyzed sample belongs to one or another possible result of the classification.
Another interesting use of neural nets is to create images or data visualizations, not just to read images. In such a case, the output neurons will be equal to the number of pixels in the desired output image.
What is backpropagation?
The most common type of artificial neural network is the “backpropagation neural network” (BPNN). Backpropagation refers to the backwards propagation of the error within the network. During the learning process, each time new data is pushed forward through the network, it adjusts the weights of each neuron backwards throughout the entire network, slightly adapting it so that next time it gets closer to the correct answer. In other words, a BPNN makes random guesses about what the pattern presented to it might be, then makes adjustments to all the connections between its neurons depending on how far from the truth its guess was. Through repeated learning, the network adjusts the weights of its neurons and, as a result, it “learns”.
What is a bias neuron?
You will stumble upon this term in all the documentation. Some sources will not go into detail; others will explain it in too much detail. The very simple idea behind it is: if you put one additional neuron on each layer of a neural network, and you make sure this neuron always receives a fixed input (always equal to 1), the entire network will perform far better than without it. This neuron is called a bias neuron. Consider it the exception that confirms the rule, philosophically speaking. That’s all you need to know about the bias neuron; all decent neural network libraries use it by default.
What exactly happens with the data inside a neuron?
Now you know that a neural network is a sort of simulation where virtual/logical neurons are connected with one another in layers. Data comes in through the input neurons and ends up in the output neurons. Provided a large number of examples, each telling the network what the expected result should be for a given set of input data, the network adapts its internal “weights” (you can call this its memory), so that next time it sees similar data it gets closer to the same result. But how is data passed from one neuron to another? Each neuron filters the data through a function that changes it in one way or another. Take your time and look over dedicated documentation if you wish to know more about this.
Preparing your data is absolutely essential
Normalize your input values so that they fit within a range that is most effective for NN training.
Example method to bring a given value within a range:
normalizedVal = (originalVal - Min) / (Max - Min);
Remember: the opposite of 1 is not zero, but -1. When you want to indicate right and wrong within the training data output, mark correct answers with 1 and wrong ones with -1. 0 (zero) could also work, but it is less effective and harder for the network to learn.
Try to bring your input data within a range from -10 to 10. Why? There is a mathematical explanation, related to the transfer functions within the network, for why numbers outside this range become less “visible”… but the simple and effective takeaway from all this is: just normalize your data to fit between -10 and 10, and it works!
Images should also be preprocessed, otherwise your NN will analyze lots of useless information. If you want to analyze a normal 1000×1000px image, you will have 1,000,000 input neurons on the input layer, and many more accumulated in the hidden layers. Most of the information you are looking to analyze, such as patterns and shapes, is still there if you shrink the image a lot. A 100×100 pixel image will still hold the patterns found in the much larger original, yet it requires just 10,000 input neurons. You will be surprised to see that even shrunk to barely visible sizes, such as 32×32, the image will still hold many of the properties you are looking for – just enough to be used with a neural net.
Pre-processing your data can dramatically improve the speed and performance of your neural net!
A certain level of processing can also be done within the NN itself; for example, you can feed the hidden layers from only some of the input-layer neurons, or use just one color channel if the processed data is an image.
Enough with the theory, let’s learn something practical
You need to decide how to implement your neural network architecture and what programming language to write it in – or will you let a program do it for you?
There are many tools out there, most with impressive graphical user interfaces, but a truly custom solution that can adapt to your needs should be written in code, not in editors.
C++, for example, is an efficient language to write such a thing in. There are also multiple libraries, some with more advantages than others, I can’t deny that… but their lack of user-friendliness is depressing. Take a look at TensorFlow, for example.
Here is David Miller’s C++ tutorial, which I found absolutely amazing and very well explained. Take an hour of your time and watch it carefully, then try to replicate it and start experimenting with your first custom-made neural network: https://vimeo.com/19569529