Don’t be a Jim – NRMs in training

Just as the sound of fingernails running down a blackboard sends tingles down my spine and raises the hair on the back of my head, the words “no” or “ah-ah” during training have a similar effect.

Often used to tell the dog it has got something wrong, these words are known in the training world as ‘non-reward markers’ (NRMs). In my experience, they are bandied about at high frequency, with owners sounding like Jim Trott from The Vicar of Dibley, yet there is little understanding of their impact.

So, here’s a brief overview of the four possible consequences in operant conditioning and where NRMs sit within this…

Operant conditioning is a procedure in which a behaviour becomes stronger or weaker depending on its consequences. Generally, there are four possible consequences in operant conditioning: 1) positive reinforcement – a behaviour is strengthened by the presentation of a stimulus (something the animal wants), 2) negative reinforcement – a behaviour is strengthened by the removal of an unpleasant stimulus that the animal wants to avoid, 3) positive punishment – a behaviour is weakened by the presentation of an unpleasant stimulus, and 4) negative punishment – a behaviour is weakened by the removal of a stimulus that the animal seeks out (Chance, 2003).

Positive reinforcement – Positive reinforcement training (PRT) prompts hormonal responses which play a key role in positive associative learning, memory, reward and motivation (Zellner et al., 2009; Affenzeller et al., 2016). In PRT, a clicker, whistle or word such as “yes” (a secondary reinforcer) is used to indicate to the dog that they’ve performed the behaviour we’re after. If the clicker, whistle or marker word has been appropriately applied from the start, that sound signifies to the dog that something great will immediately follow, such as food, touch or the opportunity to play (a primary reinforcer). And, with a continuous schedule of positive reinforcement for the specific behaviour we’re after, we maximise the opportunities to create a positive conditioned emotional response (+CER) to that behaviour (for example, creating a solid recall).

Punishment – By contrast, NRMs are the opposite of a clicker, whistle or a marker word; essentially, NRMs are a conditioned negative punisher (-P).  This means you’re telling the dog “I’m taking the food away from you” because the dog didn’t perform the behaviour you were after.  Hence, NRMs can lead to frustration and confusion because the dog may not be able to clearly identify what behaviour you’re actually after.  Consequently, you end up with a general suppression of behaviour because the animal cannot identify exactly what behaviour is resulting in the punisher.

In some cases, owners may use the word “no” as a positive punisher (+P), either knowingly or not! Take this example: the owner is looking to teach their dog to walk on a loose lead, the dog pulls on the lead to get closer to another dog to investigate, and the owner says “no” while yanking the dog away. Over time, “no” will become a conditioned positive punishment marker. Depending on the intensity of the “no” and the yank, the timing of when the NRM and yank were used, the context and so on, the owner may unwittingly pair punishment with the appearance of other dogs. This might then lead to other behaviours, such as barking and lunging, directed towards unfamiliar dogs when on a lead.

Whether your dog perceives NRMs as -P or +P will depend on a number of factors: how they have been taught, the animal’s character, and how they are being used. In a few cases, NRMs might actually be information that speeds up the dog’s chances of getting to the reinforcer. However, NRMs are typically used as a punisher, so owners may be reinforcing their dog one minute and punishing them the next. Blackwell et al. (2008) found that inconsistency in training methods was related to the highest aggression score, whilst Herron et al. (2009) found that direct and indirect confrontational training methods, including owners shouting “no”, were related to aggression.

Final paws for thought…

Positive reinforcement produces the most positive outcomes (Boissy et al., 2007), from faster acquisition (Hall and Wynne, 2012) and memory retention to happy learners (Rosales-Ruiz, 2011). If your dog gets a behaviour wrong, stay quiet for a few seconds – that’s enough information to let your dog know that it needs to try again. When you then click, whistle or say “yes” as the behaviour is offered, you increase the likelihood of that specific behaviour being performed again and open up further opportunities for reinforcement, which maximises the potential to create a +CER.

In short, if you can’t positively reinforce a behaviour then lower the criteria to set you and your dog up for success.
