Reciprocity – it’s a big deal

Reciprocity strategies

People remember where others have been reciprocal to them in the past. Humans are very good at detecting when others are not being reciprocal with them, or are breaking a stable reciprocal relationship. Previous research in this area (carried out independently by Dan Ariely and Daniel Kahneman) suggests we are cognitively more disposed to recognise patterns of cheating than patterns of trust; that is, we are more disposed to recognise irregular trust patterns than normal trust patterns.


This is understandable in the history of human development, as small groups learned to trust kin and non-kin, and to build trust with other tribes and larger groups. Much of our learned social behaviour was based around one party trying to get away with something (cheating) and another party detecting when this was occurring. There is also a body of literature that suggests that our brains are hard-wired towards detecting danger and focusing on negative information. This makes sense from an evolutionary standpoint, as behaviours were optimised to keep us safe.

“Why should you ever co-operate with another social animal on any level? …[there is evidence of] a clear-cut reciprocity. Under all sorts of circumstances, co-operation has a strong evolutionary payoff; even among non-relatives. With a condition. Which is, you’re not putting more into it than you are getting. That is reciprocal (which is reciprocal altruism).”
(Robert Sapolsky, Stanford behavioural Biologist)

Early Game Theory (1944)

An earlier evolutionary psychology test of reciprocity explored the following:

  1. Someone promises if you do one action you will get a reward, and
  2. If you do a different action you will be punished

In this research, one outcome was that a person got rewarded without taking any action; i.e. another person decided to reward them (a spontaneous act of kindness). In another scenario an individual was supposed to get rewarded and instead got punished (a cheater decided to punish them). What was discovered is that we are much better at detecting a cheater than we are at detecting spontaneous acts of kindness.

So what was the optimal strategy? This has been formalised in game theory, which has its origins in the economic models of John von Neumann and Oskar Morgenstern. The theory formalised what was once considered random choice in human game-playing, and was then extended to economics. Von Neumann showed that players repeatedly employ a strategy aimed at maximising their payoff while assuming their opponent will aim at minimising it. (Norman, 2016)

Early game theory was also supported by Nash’s Equilibrium theory.

Game theory is the study of mathematical models of conflict and co-operation between intelligent rational decision-makers. Modern game theory began with the idea of mixed-strategy equilibria in two-person zero-sum games (games in which participants’ gains and losses of utility exactly balance out).

Nash Equilibrium theory (1951)

In game theory, the Nash equilibrium is a solution concept of a non-cooperative game involving two or more players in which each player is assumed to know the equilibrium strategies of the other players, and no player has anything to gain by changing only his or her own strategy. If each player has chosen a strategy and no player can benefit by changing strategies while the other players keep theirs unchanged, then the current set of strategy choices and the corresponding payoffs constitutes a Nash equilibrium.
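
The definition above can be checked mechanically for a small game. The following is a minimal sketch in Python, assuming a hypothetical two-player game with strategies "X" and "Y"; the payoff numbers and function names are illustrative assumptions, not taken from any of the research cited here. It tests every pair of choices and keeps those where neither player can do better by changing only their own strategy.

```python
from itertools import product

# Hypothetical game (an assumption for illustration only).
# Each entry maps (row player's choice, column player's choice)
# to (row player's payoff, column player's payoff).
PAYOFFS = {
    ("X", "X"): (2, 2),
    ("X", "Y"): (0, 0),
    ("Y", "X"): (0, 0),
    ("Y", "Y"): (1, 1),
}
STRATEGIES = ("X", "Y")


def is_nash_equilibrium(row_choice, col_choice):
    """True if neither player gains by changing only their own strategy."""
    row_payoff, col_payoff = PAYOFFS[(row_choice, col_choice)]
    row_can_improve = any(
        PAYOFFS[(alternative, col_choice)][0] > row_payoff
        for alternative in STRATEGIES
    )
    col_can_improve = any(
        PAYOFFS[(row_choice, alternative)][1] > col_payoff
        for alternative in STRATEGIES
    )
    return not row_can_improve and not col_can_improve


def pure_nash_equilibria():
    """List every pair of choices that is a Nash equilibrium."""
    return [
        pair for pair in product(STRATEGIES, repeat=2)
        if is_nash_equilibrium(*pair)
    ]


print(pure_nash_equilibria())  # [('X', 'X'), ('Y', 'Y')]
```

In this made-up game there are two equilibria: once both players have settled on the same choice, neither gains by switching alone, even though one of the two outcomes pays both players less than the other.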

The prisoner's dilemma (1950), and tit-for-tat strategies (1970s)

This game was originally framed by Merrill Flood and Melvin Dresher working at RAND (Research and Development) Corporation in 1950. Albert Tucker then formalised the game with prison sentence rewards and named it the "prisoner's dilemma".

Two individuals are captured after a crime is committed. If one informs on the other to the police, then the informer will receive a lighter sentence. In the outcomes below, "co-operate" means staying loyal to the other prisoner, and "cheat" means informing on them.

There are four possible outcomes:

  1. Both individuals will co-operate
  2. Both individuals will cheat against each other
  3. Individual A co-operates and B cheats
  4. Individual B co-operates and A cheats

What you get in the prisoner's dilemma is a formal payoff for each outcome. Each prisoner tries to work out which choice gives the greatest payoff. The highest payoff happens if you cheat and the other prisoner co-operates (and is exploited).

The second highest payoff is you both co-operate. The third highest payoff is both prisoners cheat on each other. The worst possible payoff for an individual is if you co-operate and the other individual cheats (exploits you).
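
The ranking of the four outcomes can be written down as a small payoff table. This is only a sketch: the point values (5, 3, 1, 0) are a commonly used illustration and are an assumption here, as only their order comes from the description above. The table is from one prisoner's point of view.

```python
# Payoff to "you" for each (your choice, other prisoner's choice).
# The numbers are assumed for illustration; only their order matters.
PAYOFF = {
    ("cheat", "co-operate"): 5,       # you exploit the other prisoner
    ("co-operate", "co-operate"): 3,  # you both co-operate
    ("cheat", "cheat"): 1,            # you both cheat
    ("co-operate", "cheat"): 0,       # you are exploited
}


def ranked_outcomes():
    """Return the four outcomes ordered from best to worst payoff for you."""
    return sorted(PAYOFF, key=PAYOFF.get, reverse=True)


for your_choice, their_choice in ranked_outcomes():
    points = PAYOFF[(your_choice, their_choice)]
    print(f"you {your_choice}, they {their_choice}: {points}")
```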

Experiments paid out on the various scenarios to discover when it was optimal to co-operate or cheat. Individuals would try to maximise their payoff. We get a hormonal reward both when we cheat and when we co-operate.

How is play in the prisoner's dilemma optimised? This is the question Robert Axelrod (professor of Political Science) explored in the late 1970s, using computer models based on a huge number of strategy responses provided by a large sample of individuals. What he discovered when thousands of scenarios were played out was that the optimal social outcome came from tit-for-tat responses. As long as another individual co-operates, you co-operate. But as soon as an individual cheats against you, you cheat against them the next time. However, if they go back to co-operating, you also return to co-operating. Tit-for-tat always drove other responses into extinction (e.g. greedy or selfish strategies were driven to extinction in favour of more altruistic strategies).

Tit-for-tat is an optimal strategy for a number of reasons:

  1. It is socially pleasing
  2. It retaliates if someone cheats
  3. It is forgiving
  4. It is clear cut and supports probabilistic theory

Statistically, over a long time, tit-for-tat behaviour will prove to be the optimal strategy - allowing individuals to be nice, but retaliatory. This strategy can be forgiving, and is clear about how the rules work. The strategy is a major building block of social behaviour as it works with both relatives and non-relatives (i.e. kin and non-kin groups).
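
The tit-for-tat rule is simple enough to sketch in a few lines of Python. This is a minimal illustration and not Axelrod's actual tournament code: the point values, the ten-round length and the always-cheating opponent are assumptions made for the example. "C" stands for co-operate and "D" for cheat.

```python
# Points to a player for (their own move, the other player's move).
# Assumed values; only their order follows the prisoner's dilemma.
PAYOFF = {("D", "C"): 5, ("C", "C"): 3, ("D", "D"): 1, ("C", "D"): 0}


def tit_for_tat(their_history):
    """Co-operate first, then copy whatever the other player did last."""
    return their_history[-1] if their_history else "C"


def always_cheat(their_history):
    """A purely selfish opponent, used here only for comparison."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play repeated rounds; return both scores and both move histories."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)  # each side sees only the other's past
        move_b = strategy_b(history_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b, "".join(history_a), "".join(history_b)


print(play(tit_for_tat, tit_for_tat))   # (30, 30, 'CCCCCCCCCC', 'CCCCCCCCCC')
print(play(tit_for_tat, always_cheat))  # (9, 14, 'CDDDDDDDDD', 'DDDDDDDDDD')
```

Two tit-for-tat players settle into steady co-operation and earn 30 points each. Against a constant cheater, tit-for-tat is exploited only once and then retaliates, so the cheater earns far less (14) than it would have from mutual co-operation.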

But there is a vulnerability in tit-for-tat strategies: they are open to signal error or misinformation (i.e. some social signals can be ambiguous or unclear). For example, someone who co-operates can be misinterpreted as cheating where one individual has unclear information or receives input errors from a third party. When presented with social information that is ambiguous, cognitive biases can distort how people interpret signals and triggers. If one person has a bias towards negative appraisals of others, then cheating may be perceived when information is ambiguous.

In some instances a tit-for-tat strategy can be disadvantageous, for example, if an individual is wrongly accused of an action, and socially demonised (or punished) as being guilty. This causes a see-saw effect, where the two parties take turns retaliating against each other and outcomes get compromised by signal noise. When this anomaly was discovered, another strategy came to the forefront - this was forgiving tit-for-tat.
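
The see-saw effect can be reproduced with a single misread signal. The sketch below is a simplified model built on assumptions: both players use plain tit-for-tat, and in one chosen round player B wrongly perceives player A's co-operation ("C") as cheating ("D"). The function and parameter names are hypothetical.

```python
def tit_for_tat(seen):
    """Co-operate first, then copy the other player's last perceived move."""
    return seen[-1] if seen else "C"


def play_with_one_error(rounds=8, error_round=2):
    """Two tit-for-tat players; B misreads A's move once (rounds count from 0)."""
    seen_by_a, seen_by_b = [], []  # what each player believes the other did
    moves = []
    for round_number in range(rounds):
        move_a = tit_for_tat(seen_by_a)
        move_b = tit_for_tat(seen_by_b)
        # The signal error: in one round, B perceives A as cheating.
        perceived_a = "D" if round_number == error_round else move_a
        seen_by_a.append(move_b)
        seen_by_b.append(perceived_a)
        moves.append(move_a + move_b)  # e.g. "CD" = A co-operates, B cheats
    return moves


print(play_with_one_error())
# ['CC', 'CC', 'CC', 'CD', 'DC', 'CD', 'DC', 'CD']
```

After the one misreading, neither player ever cheats "first" again, yet the retaliation bounces back and forth indefinitely.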

Forgiving tit-for-tat (Axelrod, 1984)

This strategy applies within the iterated prisoner's dilemma (IPD), where the game is played over repeated rounds and the normal rules of tit-for-tat are in place. But when cheating is discovered, the individual who was cheated against is willing to be forgiving for one round (to get things back on track, or re-establish an equilibrium) as long as the other party is reciprocal in their response.

Axelrod discovered four conditions that were necessary in successful strategies, which were:

  1. Altruistic (non-cheating)
  2. Retaliating (needed to avoid blind optimism)
  3. Forgiving (needed to prevent on-going retaliation, and cheating)
  4. Non-envious (not looking to gain more than another participant)

What was discovered is that forgiving tit-for-tat outperforms normal tit-for-tat, especially where there is signal error. But another vulnerability was discovered: you could go back to co-operating while the other party continues to cheat, leaving yourself exposed to being exploited. After this vulnerability was discovered, an even better strategy emerged.
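
Both the strength and the weakness of forgiveness can be sketched in code. As an assumption, the forgiving player below overlooks a single perceived cheat and retaliates only after two in a row (a rule usually called "tit-for-two-tats"); this is one simple way to model forgiveness, not necessarily the exact rule used in the research. The occasional cheater is a hypothetical opponent invented for the example.

```python
def forgiving_tit_for_tat(seen):
    """Forgive one perceived cheat; retaliate only after two in a row."""
    return "D" if seen[-2:] == ["D", "D"] else "C"


def occasional_cheat(seen):
    """Hypothetical exploiter: cheats on every other round, starting with the first."""
    return "D" if len(seen) % 2 == 0 else "C"


def play(strategy_a, strategy_b, rounds=6, misread_round=None):
    """Repeated play; optionally B misreads A's move as cheating in one round."""
    seen_by_a, seen_by_b = [], []  # what each player believes the other did
    moves = []
    for round_number in range(rounds):
        move_a = strategy_a(seen_by_a)
        move_b = strategy_b(seen_by_b)
        seen_by_a.append(move_b)
        seen_by_b.append("D" if round_number == misread_round else move_a)
        moves.append(move_a + move_b)  # e.g. "CD" = A co-operates, B cheats
    return moves


# A single signal error is absorbed: co-operation continues.
print(play(forgiving_tit_for_tat, forgiving_tit_for_tat, misread_round=2))
# ['CC', 'CC', 'CC', 'CC', 'CC', 'CC']

# But an opponent who cheats every other round is never punished.
print(play(forgiving_tit_for_tat, occasional_cheat, rounds=4))
# ['CD', 'CC', 'CD', 'CC']
```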

Another variation on forgiving tit-for-tat is a strategy called punitive tit-for-tat. In this strategy participants can begin with punitive decisions and then switch to forgiving tit-for-tat.

Win-stay, lose-switch (Pavlov's tit-for-tat, 1992)

Following the discovery of forgiving tit-for-tat's vulnerability, another strategy emerged. A game-theory strategy known as the Pavlov strategy (not to be mistaken for Ivan Pavlov's behavioural conditioning experiments with dogs) was created by Martin Nowak and Karl Sigmund. This strategy looked at situations where ‘noise’ or signal error has been introduced into a system.

In tit-for-tat strategy there are two ways to gain, and two ways to lose. For example:

  1. All participants co-operate
  2. All participants cheat
  3. A co-operates and B cheats
  4. B co-operates and A cheats

The simple rule is, if a participant uses either of the first two strategies (all co-operate or all cheat) and they gain points or win, then they repeat the decision in their next round. However, if they play their strategy (using one of the first two strategies) and lose, they then switch to one of the bottom two strategies (A co-operates and B cheats, or B co-operates and A cheats). If they lose after choosing one of the bottom strategies, they switch again to one of the first two strategies in their next round.

Pavlov's win-stay, lose-switch is a heuristic learning strategy that bases future decisions on the outcome of previous decisions. Unlike forgiving tit-for-tat, if both participants cheated then one participant would switch again to co-operation.

Previous experiences of how others behaved, and the outcomes, were a key driver in what participants would do in their next action or turn.

In the Pavlov strategy the simple rule was, if you get some degree of reward, then you do it again the next time. If you lose through one of the tit-for-tat outcomes, then you move to a gain payoff strategy the next time. This strategy allows you to exploit someone else who is forgiving.
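
The win-stay, lose-switch rule can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the standard textbook form of the rule, where the two higher payoffs count as a "win" and the two lower payoffs as a "loss"; the point values are assumed; and the "slip" (player A cheating by mistake in one round) is a hypothetical way of introducing noise. "C" stands for co-operate and "D" for cheat.

```python
# Points to a player for (their own move, the other player's move).
PAYOFF = {("D", "C"): 5, ("C", "C"): 3, ("D", "D"): 1, ("C", "D"): 0}
WIN_THRESHOLD = 3  # assumption: scoring 3 or 5 is a win, 1 or 0 is a loss


def pavlov(my_last, their_last):
    """Win-stay, lose-switch: repeat a winning move, change a losing one."""
    if my_last is None:
        return "C"  # assumed opening move
    if PAYOFF[(my_last, their_last)] >= WIN_THRESHOLD:
        return my_last
    return "C" if my_last == "D" else "D"


def always_cooperate(my_last, their_last):
    """An unconditionally forgiving opponent, used for comparison."""
    return "C"


def play(strategy_a, strategy_b, rounds=6, slip_round=None):
    """Repeated play; optionally A cheats by mistake in one round."""
    last_a = last_b = None
    moves = []
    for round_number in range(rounds):
        move_a = strategy_a(last_a, last_b)
        move_b = strategy_b(last_b, last_a)
        if round_number == slip_round:
            move_a = "D"  # the noise: A cheats without intending to
        last_a, last_b = move_a, move_b
        moves.append(move_a + move_b)
    return moves


# Two Pavlov players recover from a mistake within two rounds.
print(play(pavlov, pavlov, slip_round=2))
# ['CC', 'CC', 'DC', 'DD', 'CC', 'CC']

# Against an unconditional co-operator, the mistake pays off and is repeated.
print(play(pavlov, always_cooperate, rounds=5, slip_round=1))
# ['CC', 'DC', 'DC', 'DC', 'DC']
```

The second run shows the exploitation described above: once cheating is rewarded and never punished, the rule keeps doing it.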

Regardless of co-operating or cheating, what is important is an optimal outcome for the individual. A key outcome in research testing was that the Pavlov strategy out-performed the other tit-for-tat strategies because it exploits forgiving participants.

Behavioural studies of animals have shown that they also employed similar tit-for-tat and reciprocal altruism (selfless concern for the well-being of others) strategies.

What research on the Pavlov strategy found about reciprocity strategies (and individual responses) was that if one party has a reputation, either for co-operation or cheating, it will affect how another individual behaves when interacting with that party. How we trust and relate to other parties could also be linked to a concept called 'schemas'. These are sets of cognitive knowledge built up over time in regards to concepts, experiences, events or situations.

For example, you can have a schema for going to the doctor - you make an appointment, attend the doctor's practice, wait to be called, explain your symptoms to the doctor, and they provide you with a diagnosis or treatment. Schemas can also be used to fill in gaps in knowledge or information. When unsure how someone might behave in a 'game' situation, you might use schemas to predict behaviour. Past behaviour is often used to try and predict future behaviour, and this extends to the perception others may have about your reputation.


In social and professional interactions it has been discovered that a state of homeostasis can be achieved and maintained if all parties involved recognise the benefits of mutual reciprocity. Humans are more prone to detecting cheating (non-reciprocal) behaviour than acts of kindness. When one party is cheating or not being reciprocal, the other party will employ equally punitive strategies. But forgiving behaviour can also be used to restore equilibrium. Where reputations are less than optimal, one party may employ a punitive approach against the party with the poor reputation, but then switch to a forgiving strategy if the other party is seen as being reciprocal.

So what does this mean for institutions? Existing customers (and potential new customers) could begin interactions with a base state of reciprocity, or forgiving tit-for-tat. However, if previous experiences (or recognised reputations) are less than optimal then they may first employ Pavlov’s tit-for-tat strategy and change to forgiving tit-for-tat if the experience is seen as a trustful or optimal one.

In all experiences, customers expect their bank or institution to be reciprocal throughout every interaction and touch-point. This includes the transparency of information and not just product-based interactions.

For example, informing customers about any and all requirements that may impact them; being transparent about their obligations; being transparent about fees and charges; informing them about related issues, opportunities or other areas they may not know or be aware of; having their best interests in mind (if a product or service is not suitable for them); not just responding by rote to questions, but understanding needs at a deeper and personal level.

In all studies of reciprocity, trust is the cornerstone of building communications and optimal relationships. So the challenge to organisations is to ensure they include patterns of support in all interactions (and at every touch point). Organisations can use a scaffolding approach in place of operant conditioning (from B. F. Skinner's learning theory, which is commonly known as the ‘carrot and stick’ approach) to create trusting interactions and communication pathways.

Another way to think about this would be to use the three-term contingency model – also referred to as the ABCs (Antecedent-Behaviour-Consequence) of behavioural psychology.

For example:

Antecedent: A customer receives notification of an unexpected fee

Behaviour: The customer is irate and phones her bank

Consequence: The customer now has a lack of trust in her bank and may start looking at other financial institutions for her business, or is less inclined to take another loan or insurance product with her main bank


Antecedent: Customer receives an unexpected notification that her mortgage repayments will be lowering immediately due to a recent interest rate cut

Behaviour: Customer feels valued and rewarded, and feels her bank has her interests at heart

Consequence: She may be less likely to be swayed by another bank offering lower interest rates, but with a reputation for not passing on rate reductions immediately


Opportunities for reciprocity

There are many touch points across most organisations that connect with customers, and each point represents an opportunity to reward customers for new or ongoing business. For example:

  • Use a simple ‘thank you’ at the beginning or end of flows and interactions (for example, when using a mobile app, a response on social media, or in an online application or payment flow)
  • Think about how to create loyalty rewards for ongoing business
  • Push insights and information, rather than having customers arduously locate and “pull” information. For example, use 'Did you know…' links to useful information that is in context with a task they are doing, or with information they are looking for
  • Ask for more contextual information if they are struggling to complete a task. A request for additional personal information that helps to resolve an issue can be seen as asking "in context". This is preferable to asking for too much personal information up-front. This follows the inverted practitioner triangle model used in health care (ref: Professor Fran Baum's model, called the 'Health Promotion Winners and Losers Triangle', 2008)
  • Be transparent when there is a need to collect personal information; and explain the benefits of supplying this information - for example, "If you provide us with X we will reduce the time to approve Y".
  • Provide clear explanations relating to rates, fees or any charges; and in addition explain how they can reduce or remove certain costs
  • Be supportive through every interaction with customers
  • Know the customers, and their situation - don’t ask them to repeatedly explain their situation with each engagement
  • Offer advice and information they may not be aware of, but that could affect their situation. For example, "If you increase your loan amount by X you will fall into a better reward bracket/tier for …"
  • Own the problem, and empathise with the customer


Sapolsky, R. (2013)
Why should I be nice? Game Theory - Sapolsky’s abridged Stanford behavioural biology 2
Available from: [24 Dec 2013]

Kahneman, D. (2011)
Thinking, Fast and Slow
Penguin Books, London

Norman, J. (2016)
Von Neumann & Morgenstern Issue "The Theory of Games and Economic Behavior" (1944)
Available from:

Ariely, D. (2010)
The Upside of Irrationality (The Unexpected Benefits of Defying Logic at Work and at Home)
Harper Collins Publishers, London

Queensland Health (2011)
Behaviour Intervention: The ABC of Behaviour
Available from: