JUST REWARDS
Pat Miller
© 2000, Pat Miller/Peaceable Paws, LLC All Rights Reserved
The trim, middle-aged lady strode briskly down the rubber mat in the training center, her black Labrador Retriever bouncing happily at her side. She came to a smooth halt, and Skip sat promptly next to her, in perfect heel position. “Yes!” I thought to myself, and then winced as Carla reached down and enthusiastically patted Skip on the head. Skip jumped up and backed away from his human.
“Carla,” I said softly. “You just punished him for sitting straight.” Carla’s face fell. “Darn it!” she exclaimed. “Why can’t I remember that!”
Wait a minute...since when is patting a dog considered punishment? Ever since Skip let us know by ducking his head and backing away from Carla’s hand that he didn’t enjoy being petted. All the other Labs that Carla had owned and trained throughout her life had adored being touched as a reward. Carla petted her dog for being good without even thinking about it – it was a well-conditioned response. Unfortunately, since Skip didn’t like being touched, every time she did it to him, she was actually punishing him, decreasing the likelihood that he would perform that perfect sit again!
A dog’s
decisions in life, and his resulting behaviors, are based on whether a
particular behavior yields something he likes (a reward), or something he
doesn’t like (a punishment). Training is simply a matter of manipulating the
rewards and punishments in a thoughtful manner . . . But you have to know your
dog – be thoroughly aware of his likes and dislikes – and conscious of your own
behavior to make “training” work for you.
Rewards and punishments
In the 1950s, behavioral
scientist B.F. Skinner developed a number of principles that are applicable to
all living things with a central nervous system. He found that animals are
likely to repeat behaviors that are enjoyable/rewarding to them, and not likely
to repeat behaviors that result in something unpleasant (punishment). Neutral
stimuli – things that don’t matter to the animal – don’t have an impact on
behavior one way or the other.
Skinner
demonstrated that humans can use these simple principles to modify an animal’s
behavior. Rewards are the most reliable way to deliberately increase an
animal’s offered behaviors; conversely, punishment decreases those behaviors.
We use these behavioral principles in dog training with great success.
However, as
with Skip, the practical application of “rewards” and “punishments” varies from
dog to dog, even though the definition doesn’t. A reward is anything a
particular dog likes. A punishment is anything that dog doesn’t like.
We
frequently use food treats as our reward in training, because we can almost
always find some food that a dog will
value highly enough that it can serve as an irresistible reward, but food is
not the only reward available to us.
Remember, a reward is anything a dog likes. It could be a pat on the head (but
not for dogs like Skip, who don’t like to be touched), verbal praise, a game of
tug o’ war, a chase after a stick or tennis ball, a walk on leash, a car ride,
permission to jump up on the sofa, the cue to run an agility course, the
release from a “wait” to run out into the yard, permission to go jump in the
lake, or the signal to round up a herd of sheep.
When the
average inexperienced dog handler hears the word “punishment,” he generally
thinks of overt forms of physical punishment, such as smacking, pinching, or
kicking the dog, or jerking on the leash. I do not recommend or use physical
punishment, as it endangers the handler, damages the relationship with his dog,
and can also destroy the dog’s enthusiasm for training. However, physical
punishment is not the only way to eliminate an unwanted behavior.
Remember,
behaviorists define the word “punishment” as anything that causes an animal to decrease a certain behavior. So, in the
case of Skip, the Lab who didn’t like being touched, a pat on the head after he
performed a straight sit was enough to make him stop performing those straight sits.
“Positive
trainers” – people who have made a commitment to train without the use of pain,
fear, force, or intimidation – often use certain forms of “punishment” in the
behavioral sense to accomplish their training goals. For example, when a dog
who craves physical contact and attention jumps up and throws himself all over
the trainer, she will often keep turning her back on the dog, removing both her
attention (eye contact and interaction) and the possibility of physical contact
with the dog. These are the rewards that the dog is seeking by jumping up. When the dog’s jumping behavior keeps
resulting in the loss of something he wants badly, he will stop jumping –
especially when this “punishment” is paired with the “reward” of attention,
treats and petting for sitting quietly.
What
actually constitutes a punishment or reward to any given dog, then, is an
individual matter; in behavioral terms, context is everything.
Unintentional training
Training, therefore, is the
intentional use of rewards and punishments to purposefully manipulate a dog’s
behavior. What is sometimes difficult to remember is the fact that dogs are
learning all the time, whether or not we are paying attention. People are often
mystified as to why their dogs do some of the things they do, or fail to do
what the people want them to do.
It’s
actually pretty simple. Dogs do what works for them; they don’t do things
unless they get something out of it.
Dogs do
things that we consider “inappropriate behavior” because it’s fun, it
feels good, or it tastes good. From a dog’s perspective, behaviors that are
unacceptable to us, such as getting in the garbage, chasing cats, or sleeping
on the sofa, are just plain fun!
Frustrated
owners frequently say to their trainers, “He knows he’s not supposed to do that! I punish him when he does, but
he still does it. Why?” Sometimes, the enjoyment the dog gets from the behavior
outweighs the owner’s “punishment.” A dog who is highly aroused by the
experience of chasing a cat over the backyard fence may not care a bit about
getting yelled at for it.
In other cases, the “punishment” may actually be rewarding to the dog.
For example, a boisterous Labrador who gets yelled at, hit or even kicked for
jumping up on his owner may not have any clue that the yelling, hitting, and
kicking is supposed to be a punishment. To dogs who crave attention and love
physical contact with people, this rough treatment is simply an invitation to
play an enjoyable (rewarding) game.
Dog owners
also may fail to realize that they often unthinkingly punish a dog for doing
the right thing. If you do this frequently enough, you will inadvertently
“train” your dog to stop offering the
behaviors you want.
Consider
the woman whose dog is enjoying a good romp with some canine pals at the dog
park. It’s time to leave, so she calls her dog to her. He immediately leaves his play pals and
races to her. “Good dog!” she exclaims, and snaps his leash on, taking him from
the park. In her view, the verbal praise was ample reward, and leaving the park
has no connection to the recall. But here’s how the dog sees it: “Mom called, I
came, and the fun’s over. When I
come to Mom, a bad thing happens – the fun stops.” He is likely to think twice about coming the next time she
calls while he is playing with friends!
Many people
have lots of trouble training their dog to come reliably when called. Perhaps
they haven’t given enough consideration to what happens to the dog most of the time after he does
come. It doesn’t take a canine
Einstein to realize that coming when called is a bad idea if something “bad”
consistently happens to him immediately afterward – say, he gets stuffed into
the basement or locked away from all the guests in the kitchen, or tossed
outside in the cold rain.
Training
may also break down when the reward just isn’t valuable enough to motivate the
dog to bother trying to get it. You must program an automatic response to the
“come” cue with a high value reward in the absence of enticing distractions
before you try to apply it in the face of dashing squirrels. Few dogs will
choose to leave a squirrel hunt in order to come and earn a piece of dry
kibble! Many positive trainers advocate using a variety of enticing rewards and
mixing them up. Then the dog is never sure how big the “payoff” for his good
behavior will be; he just knows it will be good.
If you
doubt that mixing small rewards (such as verbal praise, a pat, or a piece of
dry kibble) with larger rewards (such as pieces of fresh meat, chasing a ball,
or being released to run free) is a powerful motivator, consider the slot
machine. As long as it pays out a mixture of no rewards, small rewards, and
only an occasional jackpot, human gamblers will continue to sit there and pull
the handle, long past the time that it makes sense to do so!
Random acts of reinforcement
Having a variety of rewards
in your training tool kit gives you greater flexibility and allows you to train
your dog without always having a huge supply of treats in your pocket. A good
training program moves toward variable reinforcement once the dog is reliably
performing a new behavior. Instead of clicking and giving the dog a treat every time he performs the behavior, you
occasionally skip a click and praise the dog instead, then ask for the behavior
again and click the next one. Gradually increase the variation and length of
the reinforcement schedule, remembering that randomness is important.
If you
simply keep making your dog work harder and harder for a click, he’s likely to
quit on you. If you vary the reinforcement schedule, like a Las Vegas slot
machine, he can’t predict when you will pay off. Will I get a click this time?
This time? This time? Click! Just as people will continue inserting quarters,
your dog will keep offering behaviors with enthusiasm, sure that the next one will hit the jackpot.
To maintain
his enthusiasm as you gradually lengthen the reinforcement schedule, use other
rewards to let him know he’s still on track. I frequently use “Good dog!” as
praise after I click and treat, so
that my dogs associate the same warm fuzzy feeling of getting a food reward
with the verbal praise. Then, when I use the verbal praise even without the
click and treat, they still have the same classically conditioned response from
the association of praise with food, and it makes them feel good. Thus, “Good
dog!” becomes a useful reward even without food.
Other
rewards may create more of an interruption in the training game. If you use a
toy as a reward, you have to stop and let your dog play with it for a while.
This can work really well to amp him up on the enthusiasm scale, especially for
a dog who is ball crazy or loves to tug. It doesn’t
work well when you want to do a lot of repetitions of a discrete behavior in a
row. If you toss the ball every time he responds to your “down” cue, it will
take you a long time to do a half-dozen repetitions. A toy does work well as a reward for an extended behavior, such as heel.
A ball-crazy dog can learn to heel with perfect attention for long stretches in
anticipation of the ball-chase that happens at the end.
Timing is key
It is vitally important to a
successful training program to understand what your dog likes and doesn’t like,
and to use those rewards and punishments effectively. In order to be effective,
consequences – good or bad – must be delivered in close proximity in time to
the behavior you are trying to influence.
Say your
dog tips over your kitchen garbage can while you are away at work. If you
reprimand him when you get home from work, hours after the garbage raid
occurred, it only teaches your dog that you are sometimes unpredictable and
dangerous when you come home. No matter how “guilty” he looks when you scold
him, he makes no connection between your behavior of yelling at him and his
behavior of getting in the garbage hours earlier. What you perceive as a
guilt-stricken conscience (his lowered head, lack of eye
contact, and slinking along the baseboards) is really
classic canine body language: an attempt to quell your wrath, whatever the cause.
Behaviorists
agree that a reward or punishment must be delivered within three seconds,
preferably one second or less, of the behavior you are trying to increase or
decrease. This is a pretty small window of time, and underscores the value of
using a clicker or other reward marker (or no-reward marker) to mark the
instant of desired (or inappropriate) behavior. If you say “Oops!” the instant
your dog jumps up and you turn away, you are teaching your dog a no-reward
marker, which you can use to communicate to your dog which behavior it was that
made the good thing go away (negative punishment). If you Click! or say “Yes!”
the instant your dog sits, he will come to understand that the sit earned the reward, even if it takes
several seconds for you to get the treat into his mouth, and even if he gets up
from the sit before you manage to deliver the treat.
Skipping Ahead
Carla and I
had a long discussion about how to continue with Skip’s training. We identified
two options. Using desensitization, we could teach Skip that having Carla pat
him on the head really was a reward, by consistently pairing her touch with an
off-the-charts treat reward, using gentle contact at first, then increasing in
intensity until he learned to associate vigorous patting with “really good
stuff.” Carla made a commitment to doing this for the long term, as she really
wanted Skip to enjoy her touch.
We also
initiated a short-term approach of modifying Carla’s behavior, agreeing to use positive reinforcement and
negative punishment with her. Every
time Skip sat and she didn’t reach down to pat him, Carla earned a reward,
such as a quarter, a piece of chocolate, or a dog toy. Every time she
forgot and reached down to pat him, I stepped out of the training room without
a word, for a period of time from 30 seconds to three minutes. It worked
beautifully, and in short order, Skip was sitting happily in perfect heel
position when Carla halted, without fear of being punished for his good
behavior.
DOGS TRAINING THEIR HUMANS
Is your dog training you? Don’t laugh; it’s pretty common. Scruffy, an Australian Cattle Dog, had his owner beautifully trained to open the door to let him in or out whenever he barked – every 20 minutes or so, all day. Lots of dogs have turned their humans into on-call petting machines by teaching them to reward an endearing nudge of the nose or tap of the paw with a placating scratch behind the ear. Not until the barking, nose-nudging or paw-tapping gets annoying does the owner try to make the dog stop, and by then it’s too late – the dog fully expects the owner to respond appropriately and gets upset when the human doesn’t perform the desired behavior.
Anytime you and your dog are together, one of you is training the other. Dog-human relationships are generally most successful when the human is the trainer and the dog the trainee – at least the majority of the time. You need to be aware when your dog is doing something to try to modify your behavior, and only cooperate if you are sure it’s a behavior you want to encourage. If not, you must ignore the dog’s behavior so he stops trying to get you to perform (behaviors that aren’t rewarded in some manner will extinguish) and be sure to reward an incompatible behavior in its place. Rather than petting your dog when he nudges you, make it a point to pet him when he is sitting or lying quietly at your feet.
This is easiest to do when your dog first offers a behavior. Behaviors that are well-established are harder to extinguish than embryonic ones. Scruffy had long ago discovered that if his owner had any thoughts of not letting him out, all he had to do was bark louder and longer and she would eventually give in. This creates a very long schedule of reinforcement, which means that if you are going to try to make an established behavior go away by ignoring it, you may have to ignore it for a very long time. It becomes a test of stamina – both yours and the dog’s!
You may even have to suffer through an “extinction burst.” Sometimes, shortly before the dog finally gives up, he will try really, really hard to get you to perform – barking louder, poking harder with his nose, or digging unmercifully at you with his paw. It’s almost like a temper tantrum, and if you could read your dog’s mind, you might hear him saying, with much irritation, “Darn you, you stupid human! You know how to do this behavior, we’ve worked on it for years! Why are you being so stubborn?”
Caution: If you give
in during an extinction burst you are creating a very long reinforcement
schedule and rewarding a very strong manifestation of the behavior, which will
make it even harder to stop the next time you try!
It is much easier in the long run to make it a point to always be aware of what your dog is trying to teach you, and to be sure you are the trainer more often than you are the trainee.
THE FOUR PRINCIPLES OF OPERANT CONDITIONING
Behavioral scientist B.F. Skinner developed the following behavior principles in the 1950s, and asserted that they were applicable to all living things with a central nervous system. Bob and Marian Bailey were largely responsible for applying the principles in the real world, building a successful business using Skinner’s work training animals for a wide spectrum of purposes from pigeons and dolphins carrying military messages, dogs and other animals acting in television commercials, and marine mammals performing at places such as Sea World, to chickens playing tic-tac-toe at boardwalks and fairgrounds. In recent years, the Baileys turned their talents toward teaching dog trainers, joining with luminaries such as Karen Pryor to educate trainers and move the dog training industry toward a more scientifically based, positive profession. By applying these principles creatively and judiciously, you can teach your dog to do just about anything he is mentally and physically capable of doing:
- Positive reinforcement. The dog’s behavior makes something good happen. “Positive,” in behavioral terms, means something is added. “Reinforcement” (i.e., “reward”) means the behavior increases. So, for example, when your dog sits, you feed him a treat. His behavior – sitting – made something good happen; something was added – the treat. As a result, your dog is more likely to offer to sit again, so the behavior increases. Positive trainers use positive reinforcement a lot. Dogs who are positively reinforced learn to think about how to offer behaviors that we like in order to get rewarded.
- Positive punishment. The dog’s behavior makes something bad happen. (Positive means something is added, punishment means the behavior decreases.) For example, when your dog jumps on you with muddy paws you knee him in the chest, hard. He gets off. His behavior – jumping up – made something bad happen; something was added – your knee in his chest. As a result, your dog is more likely to think twice before jumping on you again. Positive trainers do not use positive punishment very much, if at all. Positive punishment can work and does with many dogs, but dogs who are positively punished may learn to fear the punisher, can become aggressive, may simply shut down in training, and are often reluctant to offer new behaviors for fear of being punished.
- Negative punishment. The dog’s behavior makes something good go away. (Negative means something is taken away, punishment means the behavior decreases.) Back to our jumping up example. When your dog jumps up on you with muddy paws, you turn your back on him and step away. As long as he keeps jumping up, you keep stepping away. When he stops jumping and has four paws on the floor, or even better, sits, then you reach down to pet him and feed him a treat. His behavior – jumping up – made something good, your attention, go away. Then you followed with positive reinforcement; his behavior of sitting made something good happen – you paid attention to him. Positive trainers do use negative punishment as a non-violent means of providing a negative consequence for an unwanted behavior.
- Negative reinforcement. The dog’s behavior makes something bad go away. (Negative means something is taken away, reinforcement increases the behavior.) Some trainers use a shock collar to teach their dogs to come when called. They call the dog, and push the button, holding it down until the dog has returned to the trainer. When the dog reaches the trainer, the button is released. The faster the dog returns, the quicker the shock stops. The dog’s behavior – coming quickly when called – makes the bad thing, the shock, go away. Positive trainers may use a limited amount of gentle negative reinforcement in the form of mild physical pressure, but consider shock collar training to be unacceptable.
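For readers who like to see the pattern laid out, the four principles above come down to two questions: was something added or taken away, and did the behavior increase or decrease? The short Python sketch below is only an illustration of that naming scheme. The names `Change`, `Effect`, and `quadrant` are invented for this example and do not come from Skinner’s work or from any library.

```python
from enum import Enum


class Change(Enum):
    ADDED = "positive"    # something is added after the behavior
    REMOVED = "negative"  # something is taken away after the behavior


class Effect(Enum):
    INCREASES = "reinforcement"  # the behavior becomes more likely
    DECREASES = "punishment"     # the behavior becomes less likely


def quadrant(change: Change, effect: Effect) -> str:
    """Name the operant conditioning principle for a given consequence."""
    return f"{change.value} {effect.value}"


# The four examples from the list above (descriptions are paraphrased).
examples = {
    "treat given for sitting": (Change.ADDED, Effect.INCREASES),
    "knee in the chest for jumping up": (Change.ADDED, Effect.DECREASES),
    "attention withdrawn for jumping up": (Change.REMOVED, Effect.DECREASES),
    "shock stops when the dog returns": (Change.REMOVED, Effect.INCREASES),
}

for description, (change, effect) in examples.items():
    print(f"{description}: {quadrant(change, effect)}")
```

Note that “good” and “bad” appear nowhere in the sketch. As with Skip, whether a given consequence reinforces or punishes depends on what that particular dog likes, so the effect on behavior is the only reliable guide.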
