Learning (according to Skinner)

When we think of learning, we tend to think of education: a high-minded course on critical thinking and philosophy that teaches us how to solve the problems of the world without breaking too much of a sweat. Well, education and learning are not really one and the same. When you see “learning” the way Skinner did, at least, you will begin to see it as robotic in nature. Granted, Skinner only focused on procedural memory and not explicit memory, so his theories were limited by the understanding of the field at that time. His work was still incredible, but it does not adequately explain all the facets of human behavior.

Learning (or operant conditioning, according to Skinner) is a change in behavior due to punishment and reinforcement. Before we can get into learning, we need to go over punishment and reinforcement first.

Here’s a little guide I threw together with some built-in examples. On the left side, we can see ‘Increase’ and ‘Decrease.’ These refer to increasing or decreasing the likelihood of a behavior. The positive/negative sections along the top refer to adding something to the environment (positive) or removing something from it (negative) in order to increase or decrease a behavior.

We’ll walk through each example. First, if I do my chores and that allows me to collect my allowance, then I have been positively reinforced. The act of paying me for my work (adding something) is reinforcing the work behavior (increasing the behavior).

Next, I hear a squeaking hinge on my door which is annoying me. After the squeaking begins, I’m more likely to grease it. This is negative reinforcement, because it is increasing the likelihood of fixing the hinge which will remove the squeak from my environment. I’m not as likely to fix the hinge when it is not making any sounds at all. The presence of the annoying sound is increasing the ‘fixing the hinge’ behavior, thus it is a form of reinforcement.

Punishment, on the other hand, aims to decrease target behaviors.

For positive punishment – if I fail to complete all my chores, I may get extra chores given to me. The behavior of ‘failing to do my chores’ will hopefully decrease as a result. We are adding extra chores (positive) in hope that I will not fail to do them (decrease my failure rate) – thus it is positive punishment.

Extinction, which I am grouping with negative punishment here, is a bit different. Imagine a child is acting out for the sake of attention. The child in question might be throwing stuffed animals at the wall, and every time they do, you enter their room and ask them to stop. The child finds this amusing, but you find it annoying. After a while, you simply stop going into their room to ask them to quit. You would now be using extinction. If a behavior that was reinforced suddenly stops receiving its reinforcement, the behavior tends to stop (it becomes extinct). As the child is no longer getting reinforced with your attention for throwing the stuffed animals at the wall, the child will likely stop the behavior. (Strictly speaking, behavior analysts treat the two as separate procedures: negative punishment takes away something the learner already values, while extinction withholds the reinforcer that was maintaining the behavior.)

One more time:

Positive Reinforcement – increase behavior by giving something to the learner

Positive Punishment – reduce behavior by giving something to the learner

Negative Reinforcement – increase behavior by removing a negative stimulus

Extinction (Negative Punishment) – reduce behavior by removing previous reinforcement
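
For readers who like to see a grid as code, here is a minimal Python sketch of the four quadrants above. The names `Stimulus`, `Effect`, and `classify` are my own inventions for illustration, and the fourth example follows this post's grouping of extinction under negative punishment.

```python
from enum import Enum


class Stimulus(Enum):
    ADDED = "positive"    # something is added to the environment
    REMOVED = "negative"  # something is removed from the environment


class Effect(Enum):
    INCREASE = "reinforcement"  # the behavior becomes more likely
    DECREASE = "punishment"     # the behavior becomes less likely


def classify(stimulus: Stimulus, effect: Effect) -> str:
    """Name the quadrant, e.g. 'positive reinforcement'."""
    return f"{stimulus.value} {effect.value}"


# The examples from this post, labeled by what changes and what it does.
examples = {
    "allowance for finished chores": (Stimulus.ADDED, Effect.INCREASE),
    "greasing the squeaky hinge": (Stimulus.REMOVED, Effect.INCREASE),
    "extra chores for unfinished chores": (Stimulus.ADDED, Effect.DECREASE),
    "attention withheld from toy throwing": (Stimulus.REMOVED, Effect.DECREASE),
}

for name, (stimulus, effect) in examples.items():
    print(f"{name}: {classify(stimulus, effect)}")
```

The point of the sketch is that the two questions are independent: "positive" and "negative" only describe whether something was added or removed, never whether the outcome is pleasant.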

Now that we have a good understanding of punishment and reinforcement, we’ll move on to schedules of reinforcement.

Schedules of reinforcement are fairly simple. I’m going to review the three most common ones we see in operant conditioning.

Continuous – this gets the most consistent rate of response. Every time an action occurs, the individual is reinforced. For example, when the dog sits down, I give it a treat. If I stop reinforcing the behavior, it can be viewed as a negative punishment, which will lead to extinction of the behavior. Technically speaking, even lowering the reinforcement may lead to extinction. Imagine you have been getting paid $60,000 a year for your job, then suddenly you are told your pay is getting cut to $20,000 a year. While many people would be happy to have consistent work, it’s hard for the lower amount to be reinforcing, as we have associated a much larger reinforcer with the behavior.

Interval – This is how most of us get reinforced when we go to work. Every two weeks or so, you get paid for your work. Some suggest that work productivity is a bit higher around paydays, and lower during the non-paying week. This would make sense according to operant conditioning. The further you are from the key interval, the less vigorous your response rate will be. A similar pattern shows up when the goal is a count rather than a date. Imagine you had to push a button 300 times in an experiment to get $20. The first 250 presses would go by a bit slowly, but as you approached the “home stretch” you might buckle down and hit it as fast as you can until you hit the goal. Then you would slow down again. At least, that’s the idea!

Variable – Imagine you were no longer paid at the two-week interval at your work. Instead, each day you worked you had a 10% chance to get paid for two weeks of work, regardless of how long it has been since your last check. Potentially, you could get paid every day you work there, or perhaps not at all. What would that do for your attendance? Most people would never miss a day of work in that case. This isn’t a good method of paying people, of course, as being able to predictably pay my bills is very important. Variable reinforcement will get a consistently vigorous response – it’s just like gambling!
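
To put some numbers on that 10% example, here is a rough Python sketch. The function names are made up for illustration, and I am assuming each workday is an independent 10% draw and that "two weeks" means 10 workdays.

```python
import random

P_PAYDAY = 0.10  # chance of a paycheck on any given workday (from the example)


def continuous(response_number: int) -> bool:
    """Continuous schedule: every single response is reinforced."""
    return True


def variable(rng: random.Random, p: float = P_PAYDAY) -> bool:
    """Variable schedule: each response is reinforced with probability p."""
    return rng.random() < p


def days_until_paid(rng: random.Random, p: float = P_PAYDAY) -> int:
    """Count workdays up to and including the first payday."""
    days = 1
    while not variable(rng, p):
        days += 1
    return days


# Arithmetic: with an independent chance p each day, the average wait is 1/p.
expected_wait = 1 / P_PAYDAY  # 10 workdays on average

# Chance of at least one paycheck within 10 workdays (about 0.65).
p_paid_within_10 = 1 - (1 - P_PAYDAY) ** 10

# A quick simulation as a sanity check on the arithmetic.
rng = random.Random(0)
waits = [days_until_paid(rng) for _ in range(10_000)]
simulated_wait = sum(waits) / len(waits)

print(f"expected wait: {expected_wait:.1f} workdays")
print(f"chance of a paycheck within 10 workdays: {p_paid_within_10:.2f}")
print(f"simulated average wait: {simulated_wait:.1f} workdays")
print(f"longest simulated wait: {max(waits)} workdays")
```

On average the pay works out the same as a regular two-week check, but roughly a third of the time you would go more than two weeks without one, which is why this schedule keeps people showing up and is also why it is a poor way to run payroll.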

Schedules of reinforcement and operant conditioning are things that affect us each and every day. Though a psychologist may use different terminology than other folks, we should all be fairly familiar with these examples. Understanding what we wish to do (increase or decrease a behavior) will allow us to develop strategies for learning.

When we are talking about education, however, how should we use these tools? Punishment and reinforcement are both learning tools, and both are easily misused. In a future post I will discuss in more depth why we should take care in how we praise and punish behaviors.

