Anti-cheating methodology

Roger Lancaster · Post by **Roger Lancaster** » Mon May 04, 2020 1:20 pm

There seems to be general acceptance that Professor Kenneth Regan's detection system has a much higher level of credibility, partly due to his academic credentials and partly because he is prepared to discuss his system publicly, than those systems operated by online platforms who - when challenged - refuse to discuss or disclose anything. I'm hesitant about questioning Professor Regan's methodology but there's one point which puzzles me.

Basically, we're talking probability theory. If one tosses a coin 10 times and it comes down heads 70% of the time, that's unremarkable. But, if one tosses a coin 100 times and it comes down heads 70% of the time, that's significant - meaning that it's almost certainly a biased coin.
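To put rough numbers on the coin example, here is a minimal Python sketch that computes the exact binomial tail probability of seeing at least 70% heads from a fair coin. The function name `tail_probability` is just an illustrative choice, and a fair coin (p = 0.5) is assumed.

```python
from math import comb

def tail_probability(n: int, k: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# At least 7 heads in 10 tosses of a fair coin
p10 = tail_probability(10, 7)
# At least 70 heads in 100 tosses of a fair coin
p100 = tail_probability(100, 70)

print(f"P(>= 7 heads in 10)   = {p10:.4f}")
print(f"P(>= 70 heads in 100) = {p100:.2e}")
```

The first probability is 176/1024, or about 17%, so 7 heads in 10 is nothing unusual. The second is below 1 in 10,000, which is why the same 70% over 100 tosses points so strongly to a biased coin.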

As I understand it, essentially the same methodology is used to detect chess cheats. It's possible, by analysing past games, to predict the frequency with which a player rated x will play the 'best move' as recommended by a strong computer program. Obviously this frequency is higher for high values of x than for low values. It's interesting to speculate how Chess.com and others manage to adjust for this variable without knowledge of a suspect's actual playing strength but enough has already been said about their detection mechanisms without my adding anything further here.

The calculation then proceeds along the lines that - if a player has a rating which leads to an expectation that he will play the 'best move' 50% of the time but he in fact plays the 'best move' 70% of the time - then, if that continues over a long enough period, he is almost certainly cheating.
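As a rough illustration of why "a long enough period" matters, the sketch below applies a simple normal approximation to the 50% versus 70% example. It is only the simplified calculation described in this post, not Professor Regan's actual model: it assumes every move is an independent trial with the same match probability, and the figure of 30 scored moves per game and the names `match_z_score` and `one_sided_p` are illustrative assumptions.

```python
from math import erfc, sqrt

def match_z_score(matches: int, moves: int, expected_rate: float) -> float:
    """Standard deviations by which the observed match count exceeds expectation."""
    expected = moves * expected_rate
    sd = sqrt(moves * expected_rate * (1 - expected_rate))
    return (matches - expected) / sd

def one_sided_p(z: float) -> float:
    """Upper-tail probability of a standard normal variable exceeding z."""
    return 0.5 * erfc(z / sqrt(2))

# Assumed: 30 scored moves per game, expected match rate 50%, observed 70%
for games in (1, 5, 10):
    moves = 30 * games
    matches = round(0.7 * moves)
    z = match_z_score(matches, moves, 0.5)
    print(f"{games:2d} game(s): z = {z:.2f}, one-sided p = {one_sided_p(z):.2e}")
```

Under these assumptions a single game at 70% gives a z-score only a little above 2, which honest players will reach now and then, whereas the same rate sustained over ten games gives a z-score close to 7.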

So far, so good, as far as I can see and - if one applies this to someone who is a regular and consistent cheater - it should be highly reliable. Where I find myself in some difficulty is in assessing how this works in the case of [for example] a reasonably strong player who doesn't normally need to cheat to win games but does so only occasionally - let's say, in one game in five. If I understand Professor Regan correctly, he believes that - by aggregating together all the 20% suspect games and subjecting them to the methodology described, a pattern of cheating will emerge.

I don't doubt that's true. But it seems to me that it's true only because, instead of taking a random sample of the player's games, he has taken a sample of 'suspect' games which - pretty much by definition - is known to be biased and non-random. That's rather like calculating the average weight of a nation's population by taking a sample of males standing over 6 foot - no matter how large the sample, the result will be inaccurate. In the case of a chess player who has played maybe 2,000 games over a lifetime, if his 5 or 6 best games were regarded as 'suspect' and singled out for analysis in the manner described, it seems to me quite likely that the results would be regarded as highly suspicious. On that basis, it seems to me that a very large number of us would be identified as probable cheats.
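The sampling concern can be sketched with a small simulation. It assumes an entirely honest player whose every move matches the engine with probability 50%, 2,000 games of 30 scored moves each, and independence between moves; all of these figures and the function names are illustrative assumptions, not anything taken from a real detection system.

```python
import random

def simulate_honest_player(games=2000, moves_per_game=30, match_rate=0.5, seed=1):
    """Return the number of engine-matching moves in each game of an honest player."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < match_rate for _ in range(moves_per_game))
        for _ in range(games)
    ]

def pooled_rate(match_counts, moves_per_game=30):
    """Overall match rate when the given games are aggregated."""
    return sum(match_counts) / (len(match_counts) * moves_per_game)

matches = simulate_honest_player()
best_six = sorted(matches, reverse=True)[:6]      # cherry-picked 'suspect' games
random_six = random.Random(2).sample(matches, 6)  # unbiased sample of the same size

print(f"All games:       {pooled_rate(matches):.3f}")
print(f"Random 6 games:  {pooled_rate(random_six):.3f}")
print(f"Best 6 games:    {pooled_rate(best_six):.3f}")
```

The best six games of this honest player show a pooled match rate far above 50%, purely because they were chosen for being the best. Any test applied to such a hand-picked subset would need to account for how the subset was selected.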

I'm sure we have a few mathematics experts on this forum. It strikes me as rather alarming if I am right, not least because it would be a very basic sampling error, and frankly I'd be delighted to be proved wrong, so perhaps someone would care to point out the weakness in my reasoning.

NickFaulks · Post by **NickFaulks** » Mon May 04, 2020 1:36 pm

Roger Lancaster wrote: ↑
Mon May 04, 2020 1:20 pm
If I understand Professor Regan correctly, he believes that - by aggregating together all the 20% suspect games and subjecting them to the methodology described, a pattern of cheating will emerge.

Where has he said that?

Michael Farthing · Post by **Michael Farthing** » Mon May 04, 2020 1:36 pm

Roger Lancaster wrote: ↑
Mon May 04, 2020 1:20 pm
The calculation then proceeds along the lines that - if a player has a rating which leads to an expectation that he will play the 'best move' 50% of the time but he in fact plays the 'best move' 70% of the time - then, if that continues over a long enough period, he is almost certainly cheating.

This looks circular to me. If a player is cheating his rating goes up to match. The higher rating then predicts that he will make the best move more frequently - which he will be doing - that's why he got the higher rating.

Matthew Turner · Post by **Matthew Turner** » Mon May 04, 2020 1:58 pm

NickFaulks wrote: ↑
Mon May 04, 2020 1:36 pm

Roger Lancaster wrote: ↑
Mon May 04, 2020 1:20 pm
If I understand Professor Regan correctly, he believes that - by aggregating together all the 20% suspect games and subjecting them to the methodology described, a pattern of cheating will emerge.
Where has he said that?

Picking 5 games out of a sample of 30 because they look suspicious goes against any scientific approach. You would need a reason for selecting those particular 5 games: it may be that they are the most recent, the ones with prize money, the ones against higher-rated players, etc. As soon as you start looking at a subset of games (or even moves within a game) you need to be extremely careful, for exactly the reasons you have outlined.

Matthew Turner · Post by **Matthew Turner** » Mon May 04, 2020 2:02 pm

Michael Farthing wrote: ↑
Mon May 04, 2020 1:36 pm

Roger Lancaster wrote: ↑
Mon May 04, 2020 1:20 pm
The calculation then proceeds along the lines that - if a player has a rating which leads to an expectation that he will play the 'best move' 50% of the time but he in fact plays the 'best move' 70% of the time - then, if that continues over a long enough period, he is almost certainly cheating.
This looks circular to me. If a player is cheating his rating goes up to match. The higher rating then predicts that he will make the best move more frequently - which he will be doing - that's why he got the higher rating.

Correct, it is a problem, but there are quite a few ways round it - comparing online rating to OTB rating for starters.

Roger Lancaster · Post by **Roger Lancaster** » Mon May 04, 2020 2:11 pm

NickFaulks wrote: ↑
Mon May 04, 2020 1:36 pm

Roger Lancaster wrote: ↑
Mon May 04, 2020 1:20 pm
If I understand Professor Regan correctly, he believes that - by aggregating together all the 20% suspect games and subjecting them to the methodology described, a pattern of cheating will emerge.
Where has he said that?

https://www.youtube.com/watch?v=loNQ__09_fE

The topic in question starts roughly 36 minutes into the interview, although I'd recommend listening to more than just that.

Roger Lancaster · Post by **Roger Lancaster** » Mon May 04, 2020 2:14 pm

Matthew Turner wrote: ↑
Mon May 04, 2020 2:02 pm

Michael Farthing wrote: ↑
Mon May 04, 2020 1:36 pm

Roger Lancaster wrote: ↑
Mon May 04, 2020 1:20 pm
The calculation then proceeds along the lines that - if a player has a rating which leads to an expectation that he will play the 'best move' 50% of the time but he in fact plays the 'best move' 70% of the time - then, if that continues over a long enough period, he is almost certainly cheating.
This looks circular to me. If a player is cheating his rating goes up to match. The higher rating then predicts that he will make the best move more frequently - which he will be doing - that's why he got the higher rating.
Correct, it is a problem, but there are quite a few ways round it - comparing online rating to OTB rating for starters.

Also, the frequency-variation with which players make the 'best move' is less than one might suppose - according to Professor Regan, and I assume these figures ignore the opening moves, from around 40% for a weakish player to around 60% for a strong one.

Matthew Turner · Post by **Matthew Turner** » Mon May 04, 2020 2:20 pm

Roger Lancaster wrote: ↑
Mon May 04, 2020 2:11 pm

NickFaulks wrote: ↑
Mon May 04, 2020 1:36 pm

Roger Lancaster wrote: ↑
Mon May 04, 2020 1:20 pm
If I understand Professor Regan correctly, he believes that - by aggregating together all the 20% suspect games and subjecting them to the methodology described, a pattern of cheating will emerge.
Where has he said that?
https://www.youtube.com/watch?v=loNQ__09_fE

The topic in question starts roughly 36 minutes into the interview, although I'd recommend listening to more than just that.

Roger,
I think you have misunderstood, the 5 games he is looking at are ALL the games in a tournament, not picking 5 games out of say 9 or more.

Matthew Turner · Post by **Matthew Turner** » Mon May 04, 2020 2:26 pm

Roger Lancaster wrote: ↑
Mon May 04, 2020 2:14 pm
Also, the frequency-variation with which players make the 'best move' is less than one might suppose - according to Professor Regan, and I assume these figures ignore the opening moves, from around 40% for a weakish player to around 60% for a strong one.

This is something of a double edged sword. Since the matching rate is considerably lower than one might expect, it is easier to catch cheaters. However, the clip talked about ignoring the first 8 moves, but it is perfectly possible that a player would know 10, 15 perhaps even 20 moves of 'computer' theory. Those extra moves can have a huge impact on the result.
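A back-of-envelope sketch of the opening-theory effect, under loudly stated assumptions: a 40-move game, the first 8 moves excluded from scoring, memorised theory matching the engine 100% of the time, and a 50% match rate once the player is on his own. None of these figures comes from the clip apart from the 8-move cut-off, and `apparent_match_rate` is a hypothetical helper.

```python
def apparent_match_rate(total_moves=40, skipped=8, known_theory=20, base_rate=0.5):
    """Match rate over the scored moves when some of them are still memorised theory."""
    scored = total_moves - skipped
    # Memorised moves that fall after the cut-off are assumed to match the engine
    book_in_scored = max(0, min(known_theory, total_moves) - skipped)
    own_moves = scored - book_in_scored
    return (book_in_scored + own_moves * base_rate) / scored

for theory in (8, 10, 15, 20):
    print(f"{theory:2d} moves of theory -> {apparent_match_rate(known_theory=theory):.1%}")
```

With 20 moves of theory, 12 of the 32 scored moves are book, so a player who matches only half of his own moves would still show close to 69% overall under these assumptions.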

Roger Lancaster · Post by **Roger Lancaster** » Mon May 04, 2020 2:29 pm

Matthew Turner wrote: ↑
Mon May 04, 2020 2:20 pm

Roger Lancaster wrote: ↑
Mon May 04, 2020 2:11 pm

NickFaulks wrote: ↑
Mon May 04, 2020 1:36 pm

Where has he said that?
https://www.youtube.com/watch?v=loNQ__09_fE

The topic in question starts roughly 36 minutes into the interview, although I'd recommend listening to more than just that.
Roger,
I think you have misunderstood, the 5 games he is looking at are ALL the games in a tournament, not picking 5 games out of say 9 or more.

Point well made, apologies. But that's a case where someone is cheating consistently. My question however remains: how does the system tackle someone who cheats only, say, once or twice in 20% of his games but following no obvious pattern?

Roger de Coverly · Post by **Roger de Coverly** » Mon May 04, 2020 2:30 pm

Roger Lancaster wrote: ↑
Mon May 04, 2020 2:14 pm
Also, the frequency-variation with which players make the 'best move' is less than one might suppose - according to Professor Regan

In a position with no immediate forcing tactics, the "best move" is subjective anyway. What an engine using Alpha Zero methods might recommend is very possibly different from Stockfish, which in turn may differ from Rybka fifteen years ago or Chess Genius fifteen years before that.

Roger de Coverly · Post by **Roger de Coverly** » Mon May 04, 2020 2:33 pm

Roger Lancaster wrote: ↑
Mon May 04, 2020 2:29 pm
My question however remains: how does the system tackle someone who cheats only, say, once or twice in 20% of his games but following no obvious pattern?

Rausis was eventually caught in part because his rating had risen enough for his play, conduct and results to come under scrutiny.

Roger Lancaster · Post by **Roger Lancaster** » Mon May 04, 2020 2:38 pm

Matthew Turner wrote: ↑
Mon May 04, 2020 2:26 pm

Roger Lancaster wrote: ↑
Mon May 04, 2020 2:14 pm
Also, the frequency-variation with which players make the 'best move' is less than one might suppose - according to Professor Regan, and I assume these figures ignore the opening moves, from around 40% for a weakish player to around 60% for a strong one.
This is something of a double edged sword. Since the matching rate is considerably lower than one might expect, it is easier to catch cheaters. However, the clip talked about ignoring the first 8 moves, but it is perfectly possible that a player would know 10, 15 perhaps even 20 moves of 'computer' theory. Those extra moves can have a huge impact on the result.

Yes, in a different thread I cited a game where I played 27 opening moves from memory and I'm sure that stronger players will have a multitude of similar experiences. But that's a different issue - I've a faint recollection [sorry, I can't immediately source this] of reading somewhere else that 8 wasn't a fixed figure and that the program recognised established opening lines.

Mick Norris · Post by **Mick Norris** » Mon May 04, 2020 3:02 pm

Roger de Coverly wrote: ↑
Mon May 04, 2020 2:30 pm

Roger Lancaster wrote: ↑
Mon May 04, 2020 2:14 pm
Also, the frequency-variation with which players make the 'best move' is less than one might suppose - according to Professor Regan
In a position with no immediate forcing tactics, the "best move" is subjective anyway. What an engine using Alpha Zero methods might recommend is very possibly different from Stockfish, which in turn may differ from Rybka fifteen years ago or Chess Genius fifteen years before that.

Having just finished Game Changer, I gather that Alpha Zero has a completely different understanding from Stockfish and other engines, so I wonder whether Alpha Zero could avoid the detection methods.

Roger de Coverly · Post by **Roger de Coverly** » Mon May 04, 2020 3:05 pm

Roger Lancaster wrote: ↑
Mon May 04, 2020 2:38 pm
I've a faint recollection [sorry, I can't immediately source this] of reading somewhere else that 8 wasn't a fixed figure and that the program recognised established opening lines.

Justin's experience may suggest otherwise for chess.com, but generally there's a recognition that trundling down, for example, Spanish theory, particularly the Marshall, to move 18 or beyond isn't evidence of external assistance.

English Chess Forum
