CONDITIONING AND LEARNING
TWO TYPES OF CONDITIONING
This module discusses two fundamental forms of learning: classical (Pavlovian) and instrumental (operant) conditioning. Through them, we learn to associate, respectively, (1) stimuli in the environment or (2) our own behaviors with significant events such as rewards and punishers.
CLASSICAL or PAVLOVIAN: First studied by Ivan Pavlov. The procedure in which an initially neutral stimulus (the conditioned stimulus, or CS) is paired with an unconditioned stimulus (or US). The result is that the conditioned stimulus begins to elicit a conditioned response (CR).
EXAMPLE: Pavlov rang a bell and then gave a dog some food. Once the bell and food had been paired a few times, the dog treated the bell as a signal for food: The dog began salivating at the sound of the bell.
- The food in Pavlov’s experiment is called the unconditioned stimulus (US) because it elicits salivation unconditionally, before the experiment begins. The bell is called the conditioned stimulus (CS) because its ability to elicit the response is conditional on (depends on) its pairings with food. In a corresponding way, the new (learned) response to the bell is called the conditioned response (CR), and the natural response to the food itself is the unconditioned response (UR).
INSTRUMENTAL or OPERANT: First studied by Edward Thorndike and later extended by B. F. Skinner. An important idea behind the study of operant conditioning is that it provides a method for studying how consequences influence “voluntary” behavior.
EXAMPLE: a rat learns to press a lever in a box in the laboratory (a “Skinner box”) when lever-pressing produces food pellets.
- The food pellet is called a reinforcer, because it strengthens the response it is made a consequence of. (A reinforcer is any event that does this.)
Useful Things to Know about Pavlovian Conditioning
Many Effects on Behavior: Classical conditioning is also involved in other aspects of eating. Flavors associated with a nutrient (such as sugar, starch, calories, or proteins) become preferred or liked.
- Here the flavor is the CS and the nutrient is the US. In a complementary way, flavors associated with stomach upset or illness become avoided and disliked.
EXAMPLE: a person who gets sick after drinking too much tequila may learn a profound dislike of the taste and odor of tequila—a phenomenon called taste aversion conditioning.
Taste aversion conditioning is also clinically relevant.
EXAMPLE: drugs used in chemotherapy often make cancer patients sick. As a consequence, patients often learn aversions to a food that was eaten recently, or to the chemotherapy clinic itself.
Pavlovian conditioning occurs with many other significant events.
EXAMPLE: if an experimenter sounds a tone just before applying a mild shock to a rat’s feet, the tone will elicit fear or anxiety after one or two pairings.
- Similar fear conditioning plays a role in creating many anxiety disorders in humans, such as phobias and panic disorders, where people associate cues (like closed spaces or a shopping mall) with panic or other emotional trauma. Here, the CS comes to trigger an emotion.
Whenever a drug is taken, it can be associated with the cues that are present at the same time (e.g., rooms, odors, drug paraphernalia). Drug cues have an interesting property: They elicit responses that often “compensate” for the upcoming effect of the drug.
EXAMPLE: morphine suppresses pain, but the response elicited by cues associated with morphine makes us more sensitive to pain. Such conditioned compensatory responses decrease the impact of the drug on the body.
This has many implications. A drug user will be most “tolerant” to the drug in the presence of cues that have been associated with the drug (because they elicit compensatory responses). Overdose is therefore more likely if the drug is taken in the absence of those cues. Conditioned compensatory responses (which include heightened pain sensitivity and decreased body temperature, among others) might also be uncomfortable and motivate the drug user to take the drug to reduce them. This is one of several ways in which Pavlovian conditioning might be involved in drug addiction and dependence.
A final effect of Pavlovian cues is that they motivate ongoing operant behavior. In the presence of drug-associated cues, a rat will work harder (lever-press more) for a drug reinforcer. In the presence of food-associated cues, a rat (or an overeater) will work harder for food. And in the presence of fear cues, a rat (or a human with an anxiety disorder) will work harder to avoid situations that might lead to trauma. Pavlovian CSs thus have many effects that can contribute to important behavioral phenomena.
The Learning Process: Somewhat counterintuitively, studies show that pairing a CS and a US together is not sufficient for an association to be learned between them. Consider an effect called blocking. In this effect, an animal first learns to associate one CS, call it stimulus A, with a US. Once the association is learned, in a second phase, a second stimulus B is presented along with A, and the two stimuli are paired with the US together. Surprisingly, tests of conditioned responding to B alone then show that the animal has learned very little about B. The earlier conditioning of A “blocks” conditioning of B when B is merely added to A. The reason? Stimulus A already predicts the US, so the US is not surprising when it occurs with Stimulus B. Learning depends on a discrepancy between what occurs on a conditioning trial and what is already predicted by cues that are present on the trial. To learn something in Pavlovian learning, there must first be some prediction error.
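The prediction-error idea can be made concrete with a small simulation. The sketch below uses the Rescorla-Wagner learning rule, a standard formal model of Pavlovian conditioning that is not named in this module. The learning rate, trial counts, US magnitude, and all function names are illustrative assumptions, not values from any experiment.

```python
# Sketch of blocking under the Rescorla-Wagner rule (an assumed model,
# chosen for illustration; parameters are arbitrary).

def rescorla_wagner_trial(strengths, present, us_magnitude, rate=0.3):
    """Update associative strengths of the cues present on one trial."""
    # What the animal already predicts from all cues present on this trial.
    prediction = sum(strengths[cue] for cue in present)
    # Prediction error: how surprising the US is given that prediction.
    error = us_magnitude - prediction
    # Every cue that is present changes in proportion to the shared error.
    for cue in present:
        strengths[cue] += rate * error
    return error


def run_blocking(phase1_trials=30, phase2_trials=30):
    """Phase 1: A alone is paired with the US. Phase 2: A and B together."""
    strengths = {"A": 0.0, "B": 0.0}
    for _ in range(phase1_trials):
        rescorla_wagner_trial(strengths, ["A"], us_magnitude=1.0)
    for _ in range(phase2_trials):
        rescorla_wagner_trial(strengths, ["A", "B"], us_magnitude=1.0)
    return strengths


# Blocking group: A is pretrained, so the US is fully predicted in phase 2.
blocked = run_blocking()
# Control group: no pretraining, so A and B are both new in phase 2.
control = run_blocking(phase1_trials=0)

print(f"Strength of B after pretraining A: {blocked['B']:.3f}")
print(f"Strength of B without pretraining: {control['B']:.3f}")
```

Under these assumptions, B ends with almost no associative strength when A was pretrained, because the prediction error is already close to zero when B is introduced. Without pretraining, A and B share the available strength about equally.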
Blocking and other related effects indicate that the learning process tends to discover the most valid predictors of significant events and ignore the less useful ones. This is common in the real world.
EXAMPLE: Americans often fail to learn the color of a Canadian $20 bill when they take a trip and handle money in Canada. In America, the most valid predictor of the value of the $20 bill is perhaps the number that is printed on it. In Canada, the number occurs together with a unique color. Because of blocking, Americans often don’t learn the color of the $20 bill. (It turns out that the Canadian $20 bill is green.) The number gives them all the information they need; there is no prediction error for the learning process to correct.
Classical conditioning is strongest if the CS and US are intense or salient. It is also best if the CS and US are relatively new and the organism hasn’t been exposed to them a lot before. It is also especially strong if the organism’s biology has prepared it to associate a particular CS and US. There are many factors that affect the strength of classical conditioning, and these have been the subject of much research and theory. Behavioral neuroscientists have also used classical conditioning to investigate many of the basic brain processes that are involved in learning.
Erasing Pavlovian Learning: After conditioning, the response to the CS can be eliminated if the CS is presented repeatedly without the US. This effect is called extinction, and the response is said to become “extinguished.” Extinction is important for many reasons. For one thing, it is the basis for many therapies that clinical psychologists use to eliminate maladaptive and unwanted behaviors.
EXAMPLE: a person who has a debilitating fear of spiders will be systematically exposed to spiders (without a traumatic US) to gradually extinguish the fear.
- Here, the spider is a CS, and repeated exposure to it without an aversive consequence causes extinction.
If time is allowed to pass after extinction has occurred, presentation of the CS can evoke some responding again. This is called spontaneous recovery. Another important phenomenon is the renewal effect: After extinction, if the CS is tested in a new context, such as a different room or location, responding can also return. These effects have been interpreted to suggest that extinction inhibits rather than erases the learned behavior, and this inhibition is mainly expressed in the context in which it is learned.
This does not mean that extinction is a bad treatment for behavior disorders. Instead, clinicians can make it effective by using basic research on learning to help defeat these relapse effects.
Useful Things to Know about Instrumental Conditioning
Instrumental Responses Come Under Stimulus Control: Lever-pressing can be reinforced only when a light in the Skinner box is turned on—when the light is off, lever-pressing is not reinforced. The rat soon learns to discriminate the light-on and light-off conditions, and responds only in the presence of the light (responses in light-off are extinguished). The operant is now said to be under stimulus control. In the real world, stimulus control is probably the rule.
EXAMPLE: different behaviors are reinforced while you are in a classroom, at the beach, or at a party, and your behavior adjusts accordingly.
The stimulus controlling the operant response is called a discriminative stimulus. It can be associated directly with the response or the reinforcer. However, it usually does not elicit the response the way a Pavlovian CS does. Instead, it is said to “set the occasion for” the operant response.
Stimulus-control methods can be used to study how categorization is learned.
Operant Conditioning Involves Choice: Choice has been studied in the Skinner box by making two levers available for the rat, each of which has its own reinforcement or payoff rate. A thorough study of choice in situations like this has led to a rule called the quantitative law of effect. The law acknowledges the fact that the effects of reinforcing one behavior depend crucially on how much reinforcement is earned for the behavior’s alternatives. In general, a given reinforcer will be less reinforcing if there are many alternative reinforcers in the environment.
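The quantitative law of effect is commonly written in the form given by Herrnstein: B = k * R / (R + R_O), where B is the rate of the behavior, R is the reinforcement earned by that behavior, R_O is the reinforcement earned by all alternatives, and k is the maximum possible response rate. The sketch below simply evaluates that formula. The symbol names, the function name, and the numbers are illustrative assumptions and do not come from this module.

```python
# Sketch of the quantitative law of effect in Herrnstein's form:
#   B = k * R / (R + R_O)
# All numeric values below are made up for illustration.

def response_rate(reinforcement_rate, alternative_rate, max_rate=100.0):
    """Predicted rate of a behavior given its own and alternative reinforcement."""
    total = reinforcement_rate + alternative_rate
    if total == 0:
        # With no reinforcement available at all, predict no responding.
        return 0.0
    return max_rate * reinforcement_rate / total


# Hold the reinforcement for lever-pressing fixed at 60 per hour and
# vary how much reinforcement is available from alternative behaviors.
for alternative in (0.0, 20.0, 60.0, 180.0):
    rate = response_rate(60.0, alternative)
    print(f"alternatives = {alternative:5.1f}  ->  predicted response rate = {rate:5.1f}")
```

With the reinforcer for the target behavior held constant, the predicted response rate falls as more reinforcement becomes available elsewhere, which is the pattern the paragraph above describes.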
Cognition in Instrumental Learning: Modern research also indicates that reinforcers do more than merely strengthen the behaviors they are a consequence of. Instead, animals learn about the specific consequences of each behavior, and will perform a behavior depending on how much they currently want—or “value”—its consequence. This idea is best illustrated by a phenomenon called the reinforcer devaluation effect.
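In a typical demonstration, an animal learns two responses, each earning a different reinforcer. One reinforcer is then made undesirable (for example, by pairing it with illness), and the animal subsequently performs less of the response that earned it. This means that the subject has learned and remembered the reinforcer associated with each response, and can combine that knowledge with the knowledge that the reinforcer is now “bad.” Reinforcers do not merely stamp in responses; the animal learns much more than that. The behavior is said to be “goal-directed” because it is influenced by the current value of its associated goal (the reinforcer).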
Eventually, an action that depends on the animal’s knowledge of the response-reinforcer association becomes automatic and routine. That is, the goal-directed action can become a habit.
Putting Pavlovian and Instrumental Conditioning Together
Most of the things that affect the strength of Pavlovian conditioning also affect the strength of instrumental learning, where we learn to associate our actions with their outcomes. As before, the bigger the reinforcer (or punisher), the stronger the learning. And if an instrumental behavior is no longer reinforced, it will also extinguish. Most of the rules of associative learning that apply to classical conditioning also apply to instrumental learning.
Pavlovian and operant conditioning are usually studied separately. But outside the laboratory, they are almost always occurring at the same time.
EXAMPLE: a person who is reinforced for drinking alcohol or eating excessively learns these behaviors in the presence of certain stimuli—a pub, a set of friends, a restaurant, or possibly the couch in front of the TV. These stimuli are also available for association with the reinforcer.
Observational Learning
EXAMPLE: Imagine a child walking up to a group of children playing a game on the playground. The game looks fun, but it is new and unfamiliar. Rather than joining the game immediately, the child opts to sit back and watch the other children play a round or two. He observes the children, taking note of the way in which they behave while playing the game. By watching the behavior of the other kids the child can figure out the rules of the game and even some strategies for doing well at the game. This is called observational learning.
- Observational learning is a component of Albert Bandura’s Social Learning Theory (Bandura, 1977), which posits that individuals can learn novel responses via observation of key others’ behaviors.
- Observational learning does not necessarily require reinforcement, but instead hinges on the presence of others, referred to as social models. Social models are typically of higher status or authority compared to the observer.
EXAMPLE: parents, teachers, and police officers.
By observing how the social models behave, an individual is able to learn how to act in a certain situation. Other examples of observational learning might include a child learning to place her napkin in her lap by watching the adults around her or an adult learning where to find ketchup and mustard after observing other customers at a hot dog stand.
Bandura theorizes that the observational learning process consists of four parts. The first is attention: one must pay attention to what one is observing in order to learn. The second is retention: to learn, one must be able to retain the observed behavior in memory. The third is initiation: the learner must be able to execute (or initiate) the learned behavior. The last is motivation: the observer must be motivated to engage in observational learning. In the playground example, the child must want to learn how to play the novel game.
Consequences do play a role within observational learning. A later adaptation of Bandura’s Bobo doll experiment demonstrated that children in the aggression group showed less aggressive behavior if they witnessed the adult model receive punishment for aggressing against the Bobo doll. Bandura referred to this process as vicarious reinforcement, as the children did not experience the reinforcement or punishment directly, yet it still influenced their behavior.