REINFORCEMENT THEORY

Reinforcement theory is the process of shaping behavior by controlling the consequences of the behavior. In reinforcement theory a combination of rewards and/or punishments is used to reinforce desired behavior or extinguish unwanted behavior. Any behavior that elicits a consequence is called operant behavior, because the individual operates on his or her environment. Reinforcement theory concentrates on the relationship between the operant behavior and the associated consequences, and is sometimes referred to as operant conditioning.

BACKGROUND AND DEVELOPMENT OF REINFORCEMENT THEORY

Behavioral theories of learning and motivation focus on the effect that the consequences of past behavior have on future behavior. This is in contrast to classical conditioning, which focuses on responses that are triggered by stimuli in an almost automatic fashion. Reinforcement theory suggests that individuals can choose from several responses to a given stimulus, and that individuals will generally select the response that has been associated with positive outcomes in the past. E.L. Thorndike articulated this idea in 1911, in what has come to be known as the law of effect. The law of effect states that, all other things being equal, responses to stimuli that are followed by satisfaction will be strengthened, whereas responses that are followed by discomfort will be weakened.

B.F. Skinner was a key contributor to the development of modern ideas about reinforcement theory. Skinner argued that the internal needs and drives of individuals can be ignored because people learn to exhibit certain behaviors based on what happens to them as a result of their behavior. This school of thought has been termed the behaviorist, or radical behaviorist, school.

REINFORCEMENT, PUNISHMENT, AND EXTINCTION

The most important principle of reinforcement theory is reinforcement itself. Generally speaking, there are two types of reinforcement: positive and negative. Positive reinforcement results when the occurrence of a valued behavioral consequence increases the probability of the behavior being repeated. The specific behavioral consequence is called a reinforcer. An example of positive reinforcement might be a salesperson who exerts extra effort to meet a sales quota (behavior) and is then rewarded with a bonus (positive reinforcer). The administration of the positive reinforcer should make it more likely that the salesperson will continue to exert the necessary effort in the future.

Negative reinforcement results when an undesirable behavioral consequence is withheld, with the effect of increasing the probability of the behavior being repeated. Negative reinforcement is often confused with punishment, but they are not the same. Punishment attempts to decrease the probability of specific behaviors; negative reinforcement attempts to increase desired behavior. Thus, both positive and negative reinforcement have the effect of increasing the probability that a particular behavior will be learned and repeated. An example of negative reinforcement might be a salesperson who exerts effort to increase sales in his or her sales territory (behavior), which is followed by a decision not to reassign the salesperson to an undesirable sales route (negative reinforcer). Withholding the undesirable consequence should make it more likely that the salesperson will continue to exert the necessary effort in the future.

As mentioned above, punishment attempts to decrease the probability of specific behaviors being exhibited. Punishment is the administration of an undesirable behavioral consequence in order to reduce the occurrence of the unwanted behavior. Punishment is one of the more commonly used reinforcement-theory strategies, but many learning experts suggest that it should be used only if positive and negative reinforcement cannot be used or have previously failed, because of the potentially negative side effects of punishment. An example of punishment might be demoting an employee who does not meet performance goals or suspending an employee without pay for violating work rules.

Extinction is similar to punishment in that its purpose is to reduce unwanted behavior. The process of extinction begins when a valued behavioral consequence is withheld in order to decrease the probability that a learned behavior will continue. Over time, this is likely to result in the behavior ceasing. Extinction may also unintentionally reduce a wanted behavior, such as when a positive reinforcer is no longer offered after a desirable behavior occurs. For example, if an employee is continually praised for several months for the promptness with which he completes his work, but receives no praise for the same behavior in subsequent months, his desirable behavior may diminish. Thus, to avoid unwanted extinction, managers may have to continue to offer positive behavioral consequences.
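
The four strategies described in this section differ along two dimensions: whether the behavioral consequence is valued or undesirable, and whether it is administered or withheld. As a compact summary, the short Python sketch below encodes that classification as a lookup table. The names STRATEGIES and classify, and the string labels used as keys, are illustrative assumptions and do not come from the reinforcement-theory literature.

```python
# Illustrative summary of the four strategies described in this section.
# Key:   (kind of consequence, what is done with it)
# Value: (strategy name, expected effect on the behavior)
STRATEGIES = {
    ("valued", "administered"): ("positive reinforcement", "increases"),
    ("undesirable", "withheld"): ("negative reinforcement", "increases"),
    ("undesirable", "administered"): ("punishment", "decreases"),
    ("valued", "withheld"): ("extinction", "decreases"),
}


def classify(consequence, action):
    """Return (strategy, effect) for a consequence kind and an action."""
    return STRATEGIES[(consequence, action)]


# A bonus for meeting a sales quota: a valued consequence is administered.
print(classify("valued", "administered"))
# Praise is no longer offered: a valued consequence is withheld.
print(classify("valued", "withheld"))
```

Read this way, negative reinforcement and punishment involve the same kind of consequence but opposite actions, which may help explain why the two are so often confused.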

SCHEDULES OF REINFORCEMENT

The timing of the behavioral consequences that follow a given behavior is called the reinforcement schedule. There are two broad types of reinforcement schedules: continuous and intermittent. If a behavior is reinforced each time it occurs, the schedule is called continuous reinforcement. Research suggests that continuous reinforcement is the fastest way to establish new behaviors or to eliminate undesired behaviors. However, this type of reinforcement is generally not practical in an organizational setting. Therefore, intermittent schedules are usually employed. Intermittent reinforcement means that not every instance of a desired behavior is reinforced. There are at least four types of intermittent reinforcement schedules: fixed interval, fixed ratio, variable interval, and variable ratio.

Fixed interval schedules of reinforcement occur when desired behaviors are reinforced after set periods of time. The simplest example of a fixed interval schedule is a weekly paycheck. A fixed interval schedule of reinforcement does not appear to be a particularly strong way to elicit desired behavior, and behavior learned in this way may be subject to rapid extinction. The fixed ratio schedule of reinforcement applies the reinforcer after a set number of occurrences of the desired behavior. One organizational example of this schedule is a sales commission based on the number of units sold. Like the fixed interval schedule, the fixed ratio schedule may not produce consistent, long-lasting behavioral change.

Variable interval reinforcement schedules are employed when desired behaviors are reinforced after varying periods of time. Examples of variable interval schedules would be special recognition for successful performance and promotions to higher-level positions. This reinforcement schedule appears to elicit desired behavioral change that is resistant to extinction.

Finally, the variable ratio reinforcement schedule applies the reinforcer after a number of desired behaviors have occurred, with the number changing from situation to situation. The most common example of this reinforcement schedule is the slot machine in a casino, in which a different and unknown number of desired behaviors (i.e., feeding a quarter into the machine) is required before the reward (i.e., a jackpot) is realized. Organizational examples of variable ratio schedules are bonuses or special awards that are applied after varying numbers of desired behaviors occur. Variable ratio schedules appear to produce desired behavioral change that is consistent and very resistant to extinction.
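
The difference between the four intermittent schedules can be made concrete with a small simulation. The Python sketch below is an illustration only: the function name make_schedule, the codes "FI", "FR", "VI", and "VR", and the choice to draw variable thresholds uniformly around a mean are all assumptions made for the example, not part of reinforcement theory itself. Interval schedules count elapsed time since the last reinforcer; ratio schedules count occurrences of the behavior.

```python
import random


def make_schedule(kind, mean, rng=None):
    """Return respond(t) -> bool for a hypothetical reinforcement schedule.

    kind: "FI" (fixed interval), "FR" (fixed ratio),
          "VI" (variable interval), or "VR" (variable ratio).
    mean: the interval length or response count, on average.
    """
    rng = rng or random.Random(0)
    variable = kind in ("VI", "VR")
    by_time = kind in ("FI", "VI")

    def next_threshold():
        # Variable schedules draw a new requirement each time (average = mean);
        # fixed schedules always use the same requirement.
        return rng.randint(1, 2 * mean - 1) if variable else mean

    state = {"count": 0, "last": 0, "need": next_threshold()}

    def respond(t):
        """Record one desired behavior at time t; True if it is reinforced."""
        state["count"] += 1
        progress = (t - state["last"]) if by_time else state["count"]
        if progress >= state["need"]:
            state.update(count=0, last=t, need=next_threshold())
            return True
        return False

    return respond


# One desired behavior per time unit, t = 1..9, on a fixed ratio of 3.
fixed_ratio = make_schedule("FR", 3)
print([t for t in range(1, 10) if fixed_ratio(t)])  # [3, 6, 9]
```

Under the fixed schedules the reinforced moments are fully predictable, whereas under make_schedule("VR", 3) or make_schedule("VI", 3) the person performing the behavior cannot tell which response will be rewarded. That unpredictability is the feature the passage above associates with resistance to extinction.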