Artificial intelligence (AI) refers to computer software that exhibits intelligent behavior. The term "intelligence" is difficult to define, and has been the subject of heated debate by philosophers, educators, and psychologists for ages. Nevertheless, it is possible to enumerate many important characteristics of intelligent behavior. Intelligence includes the capacity to learn, maintain a large storehouse of knowledge, utilize commonsense reasoning, apply analytical abilities, discern relationships between facts, communicate ideas to others and understand communications from others, and perceive and make sense of the world around us. Thus, artificial intelligence systems are computer programs that exhibit one or more of these behaviors.
AI systems can be divided into two broad categories: knowledge representation systems and machine learning systems. Knowledge representation systems, also known as expert systems, provide a structure for capturing and encoding the knowledge of a human expert in a particular domain. For example, the knowledge of medical doctors might be captured in a computerized model that can be used to help diagnose patient illnesses.
The second category of AI, machine learning systems, creates new knowledge by finding previously unknown patterns in data. In contrast to knowledge representation approaches, which model the problem-solving structure of human experts, machine learning systems derive solutions by "learning" patterns in data, with little or no intervention by an expert. There are three main machine learning techniques: neural networks, induction algorithms, and genetic algorithms.
Neural networks simulate the human nervous system. The concepts that guide neural network research and practice stem from studies of biological systems. These systems model the interaction between nerve cells. Components of a neural network include neurons (sometimes called "processing elements"), input lines to the neurons (called dendrites), and output lines from the neurons (called axons).
Neural networks are composed of richly connected sets of neurons arranged in layers. The architecture consists of an input layer, which feeds data into the network; an output layer, which produces the network's resulting guess; and one or more hidden layers, which propagate signals from the input layer to the output layer. This is illustrated in Figure 1.
During processing, each neuron performs a weighted sum of inputs from the neurons connecting to it; this is called activation. The neuron chooses to fire if the sum of inputs exceeds some previously set threshold value; this is called transfer.
Inputs with high weights tend to give greater activation to a neuron than inputs with low weights. The weight of an input is analogous to the strength of a synapse in a biological system. In biological systems, learning occurs by strengthening or weakening the synaptic connections between nerve cells. An artificial neural network simulates synaptic connection strength by increasing or decreasing the weight of input lines into neurons.
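The activation and transfer steps described above can be sketched for a single neuron as follows. This is a minimal illustration only; the function names (activation, transfer, neuron_output) and the sample weights and threshold are assumptions made for the example, not part of any particular neural network package.

```python
from typing import Sequence


def activation(inputs: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the inputs arriving at one neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * w for x, w in zip(inputs, weights))


def transfer(total: float, threshold: float) -> int:
    """The neuron fires (1) only if the weighted sum exceeds the threshold."""
    return 1 if total > threshold else 0


def neuron_output(inputs: Sequence[float], weights: Sequence[float],
                  threshold: float) -> int:
    """Activation followed by transfer for a single neuron."""
    return transfer(activation(inputs, weights), threshold)


# Hypothetical values: two input lines of equal weight, threshold of 1.0.
both_on = neuron_output([1, 1], [0.6, 0.6], 1.0)   # weighted sum 1.2, fires
one_on = neuron_output([1, 0], [0.6, 0.6], 1.0)    # weighted sum 0.6, does not fire
```

Raising the weight on an input line makes that input contribute more to the sum, which is how the sketch mirrors a stronger synaptic connection.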
Neural networks are trained with a series of data points. For each data point, the network guesses which response should be given, and the guess is compared against the correct answer. If errors occur, the weights into the neurons are adjusted and the process repeats. This learning approach is called backpropagation, and is similar to statistical regression.
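The guess, compare, and adjust cycle can be illustrated with a deliberately simplified sketch. Note the assumption: the code below trains a single threshold neuron with a simple error-correction rule (commonly called the perceptron rule), not full backpropagation, which additionally passes error information back through the hidden layers. The function names, the learning rate, and the tiny training set are all hypothetical choices for the example.

```python
def predict(inputs, weights, bias):
    """Guess 1 if the weighted sum plus bias is positive, otherwise 0."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0


def train_neuron(samples, epochs=50, rate=1.0):
    """Adjust weights whenever a guess disagrees with the correct answer.

    samples is a list of (inputs, target) pairs with target 0 or 1.
    The bias plays the role of a negative threshold.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for inputs, target in samples:
            error = target - predict(inputs, weights, bias)
            if error != 0:
                errors += 1
                # Strengthen or weaken each input line in proportion to its input.
                weights = [w + rate * error * x for w, x in zip(weights, inputs)]
                bias += rate * error
        if errors == 0:  # every data point answered correctly; stop early
            break
    return weights, bias


# Hypothetical training data: fire only when both inputs are on (logical AND).
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train_neuron(data)
```

After training, predict(inputs, weights, bias) reproduces the target for each of the four data points. A multilayer network would need a differentiable transfer function and the chain rule to distribute the error across layers, which this sketch leaves out.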
Neural networks are used in a wide variety of business problems, including optical character recognition, financial forecasting, market demographics trend assessment, and various robotics applications.
Induction algorithms form another approach to machine learning. In contrast to neural networks, which are highly mathematical in nature, induction approaches tend to involve symbolic data. As the name implies, these algorithms work by implementing inductive reasoning approaches. Induction is a reasoning method that can be characterized as "learning by example." Unlike rule-based deduction, induction begins with a set of observations and constructs rules to account for these observations. Inductive reasoning attempts to find general patterns that can fully explain the observations. The system is presented with a large set of data consisting of several input variables and one decision variable. The system constructs a decision tree by recursively partitioning data sets based on the variables that best distinguish between the data elements. That is, it attempts to partition the data so that each partition contains data with the same value for a decision variable. It does this by selecting the input variables that do the best job of dividing the data set into homogeneous partitions. For example, consider Figure 2, which contains the data set pertaining to decisions that were made on credit loan applications.
An induction algorithm would infer the rules in Figure 3 to explain this data.
As this example illustrates, an induction algorithm is able to induce rules that identify the general patterns in data. In doing so, these algorithms can prune out irrelevant or unnecessary attributes. In the example above, salary was irrelevant in terms of explaining the loan decision of the data set.
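The recursive partitioning described above can be sketched in a few lines. Several assumptions apply: the loan records below are invented for illustration and are not the data of Figure 2; the attribute names are hypothetical; and homogeneity is measured by simply counting records that disagree with their partition's majority decision, whereas well-known induction algorithms such as ID3 use information gain instead.

```python
from collections import Counter


def partition(rows, attr):
    """Group records by their value for one input variable."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row)
    return groups


def impurity(rows, target):
    """Number of records that differ from the partition's majority decision."""
    counts = Counter(row[target] for row in rows)
    return len(rows) - max(counts.values())


def best_attribute(rows, attributes, target):
    """Choose the input variable whose partitions are most homogeneous."""
    return min(
        attributes,
        key=lambda a: sum(impurity(g, target) for g in partition(rows, a).values()),
    )


def build_tree(rows, attributes, target):
    """Recursively partition until each partition holds a single decision."""
    decisions = Counter(row[target] for row in rows)
    if len(decisions) == 1 or not attributes:
        return decisions.most_common(1)[0][0]  # leaf: the (majority) decision
    attr = best_attribute(rows, attributes, target)
    remaining = [a for a in attributes if a != attr]
    return {attr: {value: build_tree(group, remaining, target)
                   for value, group in partition(rows, attr).items()}}


# HYPOTHETICAL loan records, invented for this sketch (not Figure 2).
loans = [
    {"salary": "high", "credit": "good", "debt": "low", "approve": "yes"},
    {"salary": "low", "credit": "good", "debt": "high", "approve": "yes"},
    {"salary": "high", "credit": "bad", "debt": "low", "approve": "yes"},
    {"salary": "low", "credit": "bad", "debt": "low", "approve": "yes"},
    {"salary": "high", "credit": "bad", "debt": "high", "approve": "no"},
    {"salary": "low", "credit": "bad", "debt": "high", "approve": "no"},
]
tree = build_tree(loans, ["salary", "credit", "debt"], "approve")
```

For these invented records the resulting tree splits first on debt and then on credit, and salary never appears in it, which is the pruning of an irrelevant attribute that the text describes.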
Induction algorithms are often used for data mining applications, such as marketing analyses that help companies decide on the best marketing strategies for new product lines. Data mining is a common service included in data warehouses, which are frequently used as decision support tools.
Genetic algorithms use an evolutionary approach to solve optimization problems. They are based on Darwin's theory of evolution, and in particular the notion of survival of the fittest. Concepts such as reproduction, natural selection, mutation, chromosome, and gene are all included in the genetic algorithm approach.
Genetic algorithms are useful in optimization problems that must select from a very large number of possible solutions. A classic example is the traveling salesperson problem. Consider a salesperson who must visit n cities. The problem is to find the shortest route that visits each of these n cities exactly once and then returns to the origin. For such a problem there are (n − 1)! possible solutions, where (n − 1)! denotes (n − 1) factorial. For six cities, this would mean 5 × 4 × 3 × 2 × 1 = 120 possible solutions. Suppose that the salesperson must travel to 100 cities. This would involve 99! possible solutions. This is such an astronomical number that if the world's most powerful computer had begun solving such a problem at the time the universe began and had worked on it continuously since, it would be less than one percent complete today!
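The growth of the solution count can be checked directly. The short sketch below uses Python's standard math.factorial; the function name tour_count is a hypothetical label for the (n − 1)! formula given above.

```python
import math


def tour_count(n_cities: int) -> int:
    """Number of possible routes, (n - 1)!, for a tour of n cities."""
    if n_cities < 1:
        raise ValueError("at least one city is required")
    return math.factorial(n_cities - 1)


six_cities = tour_count(6)                    # 5 x 4 x 3 x 2 x 1
hundred_cities_digits = len(str(tour_count(100)))  # number of digits in 99!
```

For six cities the count is 120, as in the text. For 100 cities the count, 99!, is a number 156 digits long.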
Obviously, for this type of problem a brute-force method of exhaustively comparing all possible solutions will not work. Such problems require heuristic methods, of which the genetic algorithm is a prime example. For the traveling salesperson problem, a chromosome would be one possible route through the cities, and a gene would be a city in a particular sequence on the chromosome. The genetic algorithm would start with an initial population of chromosomes (routes) and measure each according to a fitness function (the total distance traveled in the route, where a shorter distance means a fitter route). Those with the best fitness values would be selected and those with the worst would be discarded. Then random pairs of surviving chromosomes would mate, a process called crossover. This involves swapping city positions between the pair of chromosomes, resulting in a pair of child chromosomes. In addition, some random subset of the population would be mutated, such that some portion of the sequence of cities would be altered. The process of selection, crossover, and mutation results in a new population for the next generation. This procedure is repeated through as many generations as necessary to obtain a sufficiently good solution.
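The cycle of selection, crossover, and mutation can be sketched as follows. This is one possible arrangement under stated assumptions, not a definitive implementation: the population size, mutation rate, and number of generations are arbitrary; crossover uses an order-preserving variant so that every child remains a valid route visiting each city once; each mating produces one child rather than a pair; and mutation is applied only to some of the children. All function names are hypothetical.

```python
import math
import random


def route_length(route, coords):
    """Fitness: total distance of the tour, including the return to the origin."""
    return sum(
        math.dist(coords[route[i]], coords[route[(i + 1) % len(route)]])
        for i in range(len(route))
    )


def crossover(parent_a, parent_b, rng):
    """Keep a slice of parent_a in place; fill the rest in parent_b's order."""
    i, j = sorted(rng.sample(range(len(parent_a)), 2))
    kept = parent_a[i:j]
    kept_set = set(kept)
    rest = [city for city in parent_b if city not in kept_set]
    return rest[:i] + kept + rest[i:]


def mutate(route, rng):
    """Alter the sequence by swapping two randomly chosen cities in place."""
    i, j = rng.sample(range(len(route)), 2)
    route[i], route[j] = route[j], route[i]


def evolve(coords, pop_size=60, generations=200, mutation_rate=0.2, seed=0):
    """Return the shortest route found; not guaranteed to be the optimum."""
    rng = random.Random(seed)
    cities = list(range(len(coords)))
    population = [rng.sample(cities, len(cities)) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter (shorter) half, discard the rest.
        population.sort(key=lambda r: route_length(r, coords))
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            child = crossover(parent_a, parent_b, rng)
            if rng.random() < mutation_rate:
                mutate(child, rng)
            children.append(child)
        population = survivors + children
    return min(population, key=lambda r: route_length(r, coords))


# Hypothetical city coordinates: the four corners of a unit square.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
best = evolve(square, generations=20)
```

Because the surviving half of each generation is carried over unchanged, the best route found never gets worse from one generation to the next. The returned route is always a valid tour, although on larger problems it may fall short of the true optimum.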
Genetic algorithms are very effective at finding good solutions to optimization problems. Scheduling, configuration, and routing problems are good candidates for a genetic algorithm approach. Although genetic algorithms do not guarantee the absolute best solution, they do consistently arrive at very good solutions in a relatively short period of time.
Artificial intelligence systems provide a key component in many computer applications that serve the world of business. In fact, AI is so prevalent that many people encounter such applications on a daily basis without even being aware of it.
One of the most ubiquitous uses of AI can be found in network servers that route electronic mail. Expert systems are routinely utilized in the medical field, where they take the place of doctors in assessing the results of tests like mammograms or electrocardiograms. Neural networks are commonly used by credit card companies, banks, and insurance firms to help detect fraud. These AI systems can, for example, monitor consumer spending habits, detect patterns in the data, and alert the company when uncharacteristic patterns arise. Genetic algorithms serve logistics planning functions in airports, factories, and even military operations, where they are used to help solve incredibly complex resource-allocation problems. And perhaps most familiar, many companies employ AI systems to help monitor calls in their customer service call centers. These systems can analyze the emotional tones of callers' voices or listen for specific words, and route those calls to human supervisors for follow-up attention.
Although computer scientists have thus far failed to create machines that can function with the complex intelligence of human beings, they have succeeded in creating a wide range of AI applications that make people's lives simpler and more convenient.
SEE ALSO: Expert Systems
Revised by Rhoda L. Wilburn