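Indeed, the theory of learning through reward or punishment (Gamerboom note: in psychology, "learning" is usually synonymous with "lasting behavior change") dates back to early 20th-century research by behavioral psychologists such as Edward Thorndike and B.F. Skinner. Put simply, they found that animals are trained most effectively by pairing correct behavior with rewards, or wrong behavior with punishments. For example, give a rat some cocaine whenever it presses a lever (which sends it into a frenzy); but if the rat stops pressing the lever, scare it with a cat.
But we have still sidestepped the heart of the question: are rewards such as Honor more effective than punishments such as shame or even banning? At first glance, there seems to be a consensus that rewards are far more effective than punishments. Many parenting experts, dog trainers, and management experts hold this view. But according to the literature I consulted while writing this article, the topic is actually still debated. Which is more effective, reward or punishment? That depends largely on the type of person you are trying to change. A 2011 meta-analysis (an analysis that combines data from many individual studies) compiled by Daniel Balliet, Laetitia Mulder, and Paul Van Lange showed that, in social dilemma games, rewards and punishments used to get people to cooperate are equally effective. Humans and human interactions are inherently complex, so it is hard to reach a firm conclusion on this question.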
Modifying player behavior in League of Legends using positive reinforcement
By Jamie Madigan
What does a game developer do when its players have a bit of a reputation for being insufferable jerks? It hires a team of psychologists to tackle complex behavior modification problems with one of the oldest tricks in the book.
One of the blind spots in my gaming experience is the multiplayer online battle arena (MOBA) genre, which consists of competitive multiplayer games like DOTA, Heroes of Newerth, and League of Legends. Part of the reason I’ve never jumped into any of these massively popular games is the one-two combination of a daunting learning curve and their reputation as homes to hypercompetitive and none-too-pleasant player communities. I don’t like the idea of doing the wrong thing and getting yelled at until I cry. It’s why I don’t go to elementary school anymore.
This hasn’t escaped the attention of developers, of course, and I recently learned about efforts by Riot Games, makers of League of Legends, aimed at improving player behavior. Riot actually has a “Player Behavior Team” consisting of psychologists, human factors specialists, statisticians, and similarly educated folks who stand around in lab coats and experiment with ways to make League of Legends players act with greater sportsmanship.
It’s a hugely complex problem, but Riot seems to be using a simple behavior modification trick straight out of Psych 101 to tackle it: operant conditioning through positive reinforcement of desirable behavior.
To wit, the company recently launched a new Honor system to reward good behavior. After each match, players can give teammates and opponents accolades across categories like “Helpful,” “Friendly,” or “Honorable Opponent.” Points from these accumulate and are made visible in each player’s profile. Players are limited in how many Honor awards they can dole out, so getting one means something. Riot is also experimenting with in-game rewards like special badges and player character skins for players who amass lots of Honor.
“The Honor feature was inspired by research on feedback loops and the psychology of learning,” Jeffrey Lin, Lead Designer of Social Systems at Riot, told me when I asked him about the psychological roots of the system. “One pillar of this research suggests that speed and clarity of feedback are catalysts that can really shape behaviors.”
Indeed, learning (which in psychology is often synonymous with “lasting behavior change”) via reinforcement or punishment dates back to research in the early 20th century by pioneers like Edward Thorndike and B.F. Skinner. In brief, they found that animals could be trained most effectively by pairing rewards with desired behaviors and punishments with undesired ones. Give a rat a pile of cocaine each time it presses a lever and it will jam on that thing like a maniac. But give the rat a pile of cats and it will stop pressing the lever. Or something like that.
Research on this kind of learning developed and expanded, including its use in modifying human behavior and understanding the best ways to schedule and present the rewards and punishments. It turns out that positive reinforcement (adding something the subject likes, like Honor points) is super effective, but even more effective when presented unambiguously, meaningfully, and quickly after the desired behavior.
These lessons about specificity and timeliness of feedback for League of Legends players were taken to heart by the folks at Riot. “Knowing that speed and clarity are key,” notes Lin, “we opted to give players an extremely visible pop-up that clearly outlined the specific types of positive behaviors the player had engaged in immediately after each game. Instead of just showing that a player earned 4 Honor points we show the player the exact types of behaviors that they were Honored for.”
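To make the difference between a bare point total and behavior-specific feedback concrete, here is a minimal sketch in Python. It is not Riot's implementation: the function name `honor_summary` and the message wording are hypothetical, and only the category names come from the article.

```python
from collections import Counter


def honor_summary(awards):
    """Build a per-category message from the Honor received in one match.

    `awards` holds one category name per Honor received,
    e.g. ["Friendly", "Friendly", "Helpful"]. The message format
    is an assumption made for illustration only.
    """
    counts = Counter(awards)
    if not counts:
        return "No Honor received this match."
    # List the most frequently awarded behaviors first.
    parts = [f"{n} x {category}" for category, n in counts.most_common()]
    return "You were Honored for: " + ", ".join(parts)


message = honor_summary(["Friendly", "Helpful", "Friendly", "Honorable Opponent"])
print(message)
```

The same four awards could have been shown as "4 Honor points"; the breakdown instead tells the player which behaviors were noticed.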
So timeliness and specificity are important to creating associations between behaviors and rewards, but there’s one other facet of the Honor system that I think makes it work: its feedback schedule, that is, how often you pair the reward with the desired behavior. For example, make the pairing every tenth time and that’s called a fixed ratio schedule. Do the pairing every ten minutes and that’s basically a fixed interval schedule.
But Honor in League of Legends isn’t given out according to either of those schedules. Rather, like a slot machine, it’s essentially random: even if you behave yourself in a match, you never know for sure whether another player will give you Honor. But you learn that over time, if you exhibit good sportsmanship consistently, you’ll get Honor a lot more often. Turns out that random or variable ratio reinforcement schedules are among the most effective ways to change behavior in the long term. (For more on why this is, see my article on neurotransmitters and random loot drops in World of Warcraft.)
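The contrast between a fixed ratio schedule and a variable ratio schedule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the variable schedule is modeled as an independent 1-in-n chance per behavior, and nothing here reflects how League of Legends actually hands out Honor. A fixed interval schedule is left out because it depends on elapsed time rather than on a count of behaviors.

```python
import random


def fixed_ratio(n):
    """Reward every n-th occurrence of the desired behavior."""
    count = 0

    def schedule():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return schedule


def variable_ratio(mean_n, rng):
    """Reward each occurrence with probability 1 / mean_n.

    On average one reward arrives per mean_n behaviors, but the
    subject cannot predict which occurrence will pay off.
    """

    def schedule():
        return rng.random() < 1.0 / mean_n

    return schedule


def rewarded_trials(schedule, trials):
    """Return the 1-based trial numbers on which a reward was given."""
    return [t for t in range(1, trials + 1) if schedule()]


rng = random.Random(0)  # seeded only so repeated runs match
fixed_hits = rewarded_trials(fixed_ratio(10), 100)
variable_hits = rewarded_trials(variable_ratio(10, rng), 100)
print("fixed ratio:   ", fixed_hits)
print("variable ratio:", variable_hits)
```

The fixed ratio list is perfectly regular (every tenth trial), while the variable ratio list is irregular even though both schedules pay out at about the same long-run rate.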
This all raises the question, though: are rewards like Honor more effective than punishments like shame or even banning? At first blush it seems that the consensus is that rewards are far more effective than punishments. That’s the attitude shared by many child-rearing guides, dog trainers, and management gurus, anyway. But in the literature review I did while writing this article, it became clear that there is actually still considerable debate about the topic, and a lot of it depends on the type of people you’re trying to change. A 2011 meta-analysis (a kind of superstudy that combines data from many individual studies) by Daniel Balliet, Laetitia Mulder, and Paul Van Lange, for example, found that positive reinforcement and punishment are about equally effective for getting people to cooperate with others in social dilemma games. Humans and human interactions are complex, it turns out, so there’s little room to be definitive on the topic.
What is clear, though, is that a combination of rewards and punishments can be pretty darn effective, so it’s nice to see companies like Riot using the stick, the carrot, and whatever else they can get ahold of. Plus, the Honor system changes the scorecard to make clear that winning a match isn’t everything that matters. Having a good experience is why we play games. As Jeffrey Lin at Riot explained it:
Consider a player that just had a poor game: everyone (including him!) knew that he was the worst player on the team. He’s feeling a bit down and is considering whether to play another match at all after such a terrible performance. Suddenly, he gets a pop-up after he leaves the game that says, “Hey, 2 of your teammates thought you were really friendly and 1 of your teammates thought you were a great teammate.”
That moment changes everything. Yes, you were the worst and your team lost, but it’s OK. Without the system, this player might have just logged off with a bitter taste in his mouth. Now, we’ve nudged the negative experience into more positive territory.
Multiplayer games are social interactions. Shouldn’t our behavior in them carry the same costs and rewards as it would anywhere else? (source: Gamasutra)