6 Myths about Positive Reinforcement-Based Training

Positive reinforcement-based training is subject to a lot of misunderstanding and misrepresentation. Many people genuinely don’t understand how it works, and others seem to deliberately misrepresent it. Some of these misunderstandings and misrepresentations are very “sticky.”  Misunderstandings, straw men, myths—call them what you will, but they are out there and they are potent.

Here are six that are quite common. There are many more out there. For example, I didn’t even hit on “dogs trained with R+ are obese” or “R+ training only works for tricks and easy dogs” or “R+ training is bribery.”  But the following six illuminate some common misunderstandings about positive reinforcement-based training.

  1. Positive reinforcement-based training is permissive. I believe this one is a true misunderstanding for a lot of people.  Before I started studying learning theory, I certainly would have had no clue how one could use positive reinforcement as part of a training plan, for instance, to get rid of an unwanted behavior. All I could imagine was someone passing out cookies for good behavior. It seemed like a good recipe for chaos. What would one do with a cookie if the dog did something “bad”? What I didn’t know was that positive reinforcement-based trainers not only reinforce desired behaviors, but also have several humane techniques for interfering with the reinforcement for unwanted behaviors so that they don’t pay off for the animal. These include antecedent arrangement, reinforcement of alternative behaviors, and in some cases negative punishment. Positive reinforcement-based training, especially when applied to behavior problems, takes careful thought and planning. It is precise, deliberate, and the opposite of “let’s all hang out here in happy fairy rainbow land.”
  2. Positive reinforcement-based trainers just ignore bad behavior. This one also brings a very bad image to mind: a doting pet owner letting her pet jump on grandma, countersurf, and go through the trash. But the truth is quite different. What we actually do about unwanted behavior is to 1) prevent it from happening in the first place; 2) teach the dog something acceptable to do instead; and occasionally, 3) punish it using negative punishment. We know that ignoring reinforced behaviors doesn’t make them go away. But to make things a little more complicated, there are two situations where “ignoring” is used in training. One is when training new behaviors and/or associating a verbal cue with a new behavior. In these cases, if the dog makes an error, nothing happens. We do not treat. But in these situations we are not dealing with some habitual, harmful behavior that is getting reinforced some other way. It’s just a wrong guess in a guessing game. The other situation where ignoring might be used as a part of a training approach is when the animal’s behavior is being reinforced with attention. But even in that situation, we would not use ignoring by itself. I now have a whole post about the issues with ignoring: “Does Ignoring Bad Behavior Really Work?”

    [Image: Naughty dogs. Caption: Will this behavior fade away if I just ignore it?]

  3. Positive reinforcement trainers believe that nothing unpleasant should happen in the dog’s life, ever, and they try to protect their dogs from all aversives. First, this is impossible. Mild to moderate aversive stimuli are around us at all times, and we—and our animals—perform loads of behaviors to avoid or lessen them. Perhaps the dog is too hot. That’s aversive. Perhaps there is a fly buzzing around her head. That’s aversive. Perhaps the dog has to get a shot at the vet. That’s aversive! The truth is that we avoid training with aversives, even with mild ones. As I’ve written elsewhere, if a thunder-phobic dog escapes into the house when it storms, this is called natural or automatic negative reinforcement. The dog is reinforced for running into the house by gaining distance from the thunder noise. The thunder is an unavoidable aversive in life. (I help my dogs deal with it in other ways besides mere escape.) But I would never put a loud noise into a training session and use a dog’s fear of it to get a certain behavior out of her. And as for major aversives (thinking vet visit again)—we do prepare the dog for them as best we can to make them less so. That’s the opposite of using their aversive qualities.
  4. Because of #3, positive reinforcement-based trainers will do things like let their dog run out in traffic so as to avoid jerking on his collar, or avoid any medical procedure that might “hurt.” This one is almost always a straw man. I’m pretty sure the people saying it and acting like they believe it really don’t think we would stand by in an emergency and watch our dogs get hurt. In an emergency we will body block or grab or tackle or apply leash pressure to a dog who is about to do something dangerous, just like any other normal human being who cares about his or her dog. Yes, this is using an aversive. But it is not part of a teaching scenario. Different behaviors are expected and needed in difficult situations. For example, a friend might ask me to use a needle to remove a sliver that she can’t reach. I would do this if asked, even if it might mean hurting her. But because I am willing to do that, it does not follow that I am fine with training her a new job skill by poking her with a needle every time she makes an error.
  5. Positive reinforcement-based trainers use punishment but just don’t know it (or just don’t admit it). This is silly. We are generally the ones who are trying our best to leave mythology behind and learn the science behind good training. But again, the claim can come from someone who just doesn’t understand what it is we are doing; someone who figures there just has to be punishment in there somewhere! Sometimes there is. And those of us who use negative punishment know when we are using it! But a common variant of this claim is, “When you train, you don’t always give the dog the treat. You are withholding the reward and that’s punishment, har har har.” Actually it is not. As long as there is no consequence to the dog’s wrong guess it is not punishment. It is extinction at work. Extinction by itself is no picnic for the dog either, but in general we don’t use it by itself. Usually another behavior or multiple other behaviors are being reinforced, and we help the animal make the transition to performing one of those instead. We also know and freely admit that certain tools fall easily into aversive use. It’s no news that a plain old collar can be used to hurt a dog. That’s why when we start using any gear on a dog, we use counterconditioning to help the dog build pleasant associations, and we teach the dog behaviors so as to minimize the chance of discomfort. This is the opposite of using the aversive properties of a piece of gear.
  6. Positive reinforcement-based training is just as stressful on dogs as balanced or aversive-based training. Training with positive reinforcement can surely be stressful. But as I’ve written elsewhere, the stressors generally have to do with lack of skill (errors by the trainer), or an added aversive situation that wasn’t planned. It is not sensible to argue that a method that consists of giving the dog food or playing with her when she performs a desirable behavior is as aversive as a method that depends on applying discomfort, pain, or intimidation.
[Image: Clara and trash. Caption: Surely Clara won’t get in the trash if I keep ignoring her, right?]

The Commonalities

Every one of these points is focused on punishment or aversive stimuli. Clearly that is a sticking point in people’s understanding of positive reinforcement-based training. The claims also fit neatly into two categories. The first four misrepresent positive reinforcement-based training. They paint it in a ridiculous light and imply it is impossible or ineffective.  The last two blur the lines between positive reinforcement-based training and training that involves deliberate use of aversives.

In rhetorical terms, the first four are straw man arguments, and the latter two use the tu quoque fallacy in addition to the continuum fallacy. (Follow the links for definitions and examples of the individual terms.)

But as irritating as it is to read and hear these over and over, I try to keep in mind that they can be made from ignorance rather than malice. This is described nicely in the straw man link. Every one of us grew up in a culture that instructs us to use aversives to attempt to change behavior. The “cultural fog” around learning and behavior that Dr. Susan Friedman refers to makes us leery of reinforcement, and can cause us to equate it with mere indulgence or even moral corruption.

I am sure that many of the people who make these arguments are completely unfamiliar with the planning and precision that necessarily go into positive reinforcement training plans. I know I was. I got over it by listening to you folks out there who patiently explained the processes involved in positive reinforcement-based training. I hope you keep describing to the world what you do!

© Eileen Anderson 2015 eileenanddogs.com

25 thoughts on “6 Myths about Positive Reinforcement-Based Training”

  1. Thanks for this great post. I will have to share this one, because it is confusing and positive reinforcement training has been misunderstood. I’ve learned a lot by working with trainers.

  2. Very informative article. The people who do not know what training is all about or how it is done are usually the ones who are making the ignorant comments about + reinforcement training. That group has trained very biddable dogs in the traditional sense and many, if they are good at their timing, have success with it. What is not known by many is + R training is very difficult if there is no knowledge on the subject.

    1. I agree, Rachael. I hate implying that R+ training is “hard,” because there is so much that an interested beginner can do well. But there is a certain mind shift that seems to be necessary before it makes sense as a whole. Glad you liked the article.

  3. Thanks again for providing information to encourage people to put the thought and energy into learning to train both efficiently and humanely.

  4. Ah, the “you’d let your dog eat grandma’s face off!” argument. Sigh. If I’ve let things get to the point where that’s happening, whether I tackle the dog, yank his collar or hit him with a stick is beside the point.

    1. Yes, and extra points if the dog can eat grandma’s face off while knocking over a toddler and running out into traffic….

  5. I get so frustrated by these arguments, and as a mostly-vegetarian I see parallels. “But you wear leather!” It’s like they would rather point out the shortcomings than accept the positive impact you are making. I don’t claim to be purely-positive. Sometimes I get frustrated and yell at my dogs. Sometimes I take things too quickly and cause stress. But I *strive* to be purely positive – an ideal will never be perfectly attained, but that doesn’t mean we should abandon the ideal.

    1. Absolutely, about the vegetarian thing! I used to be one of those who would pick at the “purity” of people’s choices about vegetarianism. Not anymore. I think any single thing that a person does to make the world better for animals is great and I strive to support it. We can’t be purely anything in the world we live in. Thanks for the comment, Lara.

  6. Rachel para said,

    “What is not known by many is + R training is very difficult if there is no knowledge on the subject.”

    True; however, discussing it with a psychology tutor proved rather difficult for me.

    I explained the basics and showed kikopup’s Invisible Barriers video, which I think is rather impressive.

    More details were required as Bridging was not enough. I explained about the click reaching the amygdala. Her conclusion was that the dog had no free will and had been brain washed, as the amygdala registers fear. I think that comment referred to the proofing part of the video, where Splash stays on the lawn whilst Emily throws toys into the road.

    Not to be defeated, after a long time I eventually found this:

    http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3854486/

    Perhaps next time I’ll just mention that my dogs are happy and mostly well behaved because of it.

    Another great post Eileen. I love reading your blog.

    1. Wow, brain washed! That’s a new one. That looks to be an interesting article (You are so good at finding things.) Thanks for the comment!

  7. One has to just love the terminology we have to deal with…you say when withholding a reward as long as there is no consequence to the dog’s wrong guess it is not punishment, but extinction. But the lack of the reward is actually a consequence. But that really doesn’t matter.

    Instead, if in Wikipedia (operant conditioning) you compare negative punishment to extinction, in the former an existing stimulus is removed, and in the latter never offered. If that’s not confusing enough, one can also define extinction somewhat differently for classical conditioning, dealing with prediction instead of reinforcement. Go to Applied Behavior Analysis (Cooper et al.) and you’ll see more shades of grey.

    Add that extinction may refer to either a process or a state, and that punishment is a reduction in the future probability of occurrence and has no direct connection to aversives or harm, but that most people think of punishment as relating to harm or distress. Nor are eustress stressors usually aversive, so stress can be good. It’s no wonder many are confused, especially if you try to use advice from several people, even if they’re actually saying the same thing.

    Far more important is the rest of what you so clearly and accurately expressed on these issues.

    Finally, there is not and does not have to be any purity here. You can still use +R even though you tell your dog “no”, and doing so both reduces the future probability (maybe) of a behavior (punishment), and is considered (mildly) aversive by your dog and causes stress. Many compound training sessions could be found to have some aspects of classical and most quadrants of operant conditioning, together with aversives and extinction.

    Or, to paraphrase Ian Dunbar: “Give them a scalpel & they will dissect a kiss”. More important is the degree and intensity of each component, minimizing distress and focussing on reward motivation, as in LIMA.

    1. Good points, Gerry. I agree my sentence that you quoted did not touch on the actual definitions of the learning processes being discussed. The linked post does, but upon consideration I think my reference was a little reductionist. (I did leave out the classical definition on purpose though, grin.)

      I like how Pam Reid operationalizes the pertinent distinction of negative punishment (in contrast with extinction) in the realm of animal training. She says the animal needs to be able to taste or see the reinforcer for it to be able to be withdrawn (contingent on a behavior). This is a point of confusion, since it’s so easy to jump to the conclusion that treats in a treat pouch or a pocket are somehow in the dog’s sphere of possession, and that withholding them is equivalent to taking them away.

      I’m going to disagree about the lack of reward being a consequence, but I’m glad you made me think about it. I’m checking with an ABA friend, but I’m pretty sure a consequence needs to be operationalizable, as we know behaviors must be. Withholding a reward could be said to be a “dead man’s consequence.” It’s going on 24/7 except at the discrete moments when the animal gets a treat. And a dead man can do it. I’m pretty sure a non-response doesn’t qualify as a consequence in the formal sense since it doesn’t qualify as a stimulus.

      Yeah, the whole “no” thing. I don’t get exercised about it. I just point out that there are some pretty good practical reasons to develop our responses beyond “no.” And I agree that there can be many mini-examples of aversives and the less desirable quadrants going on at any time. What I disagree with, as I know you understand, is that this negates the differences between R+ based training, or following LIMA if you will, and the other approaches.

      1. On Pam Reid’s description, I’m not far away, as my distinction is immediate perception. But a bigger question is what if any difference it makes in the results. I suspect there may sometimes be a difference in intensity, but not type. That some dogs become more excited when a reward appears imminent, which affects their emotional state. Which might imply that extinction is less aversive or distressing, but -P may be more effective from increased arousal. In practice, I feel the best choice is which better fits into the overall approach.

        On my statement of lack of reward being a consequence, I have to reverse and agree with your argument. I was wrong there.

        On your last paragraph, I fully agree.

        1. Those are interesting distinctions. I was looking at the Keller and Schoenfeld book last night and came across mention of the Skinner experiments where he added punishment of responses (P+) at different times during an extinction process. Interesting stuff. In the experiments shown, the responses were suppressed during the punishment period but recovered afterwards and the two curves ended up in the same place after some time had passed. That’s an old book, so I haven’t yet looked to see where the research went from there.

          I often think for myself I generally prefer P- to extinction because one gets more immediate information that way, but I’m sure I could find exceptions to that! Thanks for your comment. Got me thinking, as usual.

          1. Yes, but Estes tries to explain some of it around that area in Keller. And further detail here would take us down a long road, with bits and pieces. However, most research doesn’t take into account prior learned behaviors, available resources and LIMA considerations, which may all be more significant here.

            On -P vs. extinction, we know that arousal increases learning, up to a point (Yerkes-Dodson law). But we have no way of really measuring distress, or if fewer/high-arousal sessions cause more or less overall distress than many/low-arousal sessions. So is -P better or worse for the dog than extinction?

            In practice, I focus on learning rate and extent (including persistence), as those are the only things we can really measure. If arousal is too low or too high, the rate will slow. If we’re causing too much distress, the rate will slow. If the extent is not persistent, we’re missing some factors. Where in general most of the impetus for change comes from +R, with all the remainder simply applying a mild inhibition against undesired behavior.

            I find the combination to be far more effective than just +R (or classical reward). It also gives you a structured response on failed attempts, rather than either ignoring them or showing frustration.

  8. Excellent piece!!

    A nice resource to share with those who are convinced that we +R trainers are standing around vainly waving hot dogs in the air as we turn our dogs loose on the interstate on a daily basis . . . !!

  9. This is a really nice accessible summary.

    My usual shorthand is ‘training without pain or fear’. Of course I say no to my dog, but he is utterly confident that the no will at most be enforced with gentle management (I’ll block the dog flap when I see him thinking about heading out to bury his pizzle in my pansy pots) rather than shouts or blows.

    It *is* a mindset, though, and that’s hard to get across. For novices who are interested, a couple of pithy articles like this are often enough to set them on the road to R+ methods.

    1. Thanks, Jo! I do so agree that it’s a mindset. It requires some basic changes about how we think about stuff. Thanks for your nice example.

