Arbiter - Physical Chess Games to Virtual Representation

Who:

Nicholas Vadasz (nvadasz), Daniel Cai (dcai9), and Panos Syrgkanis (psyrgkanis)

Introduction: What problem are you trying to solve and why?

Chess is popular worldwide, with thriving casual and competitive scenes. Still, one persistent challenge is converting real-world games (i.e. games played over a physical board with tangible pieces) into a virtual, string-based representation that can be evaluated, discussed, or loaded into an engine or platform without the need for a physical board. Typically this is done manually by humans, but it is feasibly automatable as well.

This challenge is an interesting computing problem because it can draw on the strengths of both deep learning and classical algorithms. Namely, the problem can be separated into a number of subtasks:

1) Physical Board to Binary Virtual Representation: Given a picture of a physical board, we should be able to construct an 8x8 grid (representing the board) where each entry is either 0 (that square on the board is empty) or 1 (that square on the board is occupied by some piece). Colloquially, we can call this the topography of the board.

2) Binary Virtual Representation to Full Representation: Given the binary virtual representation of the board and the corresponding image, we should be able to classify each positive entry as its corresponding piece (pawn, rook, etc.). At this point our task is complete, and we can write out the full representation in a standard chess notation: Forsyth-Edwards Notation (FEN) for a single position, or Algebraic Notation, the most common notation for recording moves, once consecutive positions are compared.

2.1) Bounding of BVR and FR via Game Continuity: Since we are modeling the "playing" of chess, we can constrain our possible predictions by algorithmically implementing the rules of chess (e.g. no more than 8 pawns per player). Furthermore, if we have knowledge of the previous board state, we can very strongly bound our predictions by only accepting predictions that are continuous with that previous board state (i.e. only one move has been made, the piece has moved in a legal way, and so on). Introducing classical computing into our model in this way should be an effective means of boosting our accuracy.
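
The three subtasks above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not our final design: the board encoding (an 8x8 list with rank 8 first, FEN piece letters for occupied squares, None for empty ones) and all function names are hypothetical. The continuity check is deliberately weak: it only counts how many squares changed, which is a necessary but not sufficient condition for a legal move.

```python
from collections import Counter

# Hypothetical encoding: 8 rows (rank 8 first), 8 columns (file a first).
# Each entry is a FEN piece letter (uppercase = white, lowercase = black) or None.
START = [
    list("rnbqkbnr"),
    list("pppppppp"),
    [None] * 8,
    [None] * 8,
    [None] * 8,
    [None] * 8,
    list("PPPPPPPP"),
    list("RNBQKBNR"),
]


def occupancy(board):
    """Subtask 1: the binary grid (1 = occupied, 0 = empty)."""
    return [[0 if sq is None else 1 for sq in row] for row in board]


def to_fen_placement(board):
    """Subtask 2: serialize a labeled board as the FEN piece-placement field."""
    ranks = []
    for row in board:
        out, empty = "", 0
        for sq in row:
            if sq is None:
                empty += 1
                continue
            if empty:
                out += str(empty)
                empty = 0
            out += sq
        if empty:
            out += str(empty)
        ranks.append(out)
    return "/".join(ranks)


def counts_plausible(board):
    """Subtask 2.1: cheap rule-based bounds (one king, <= 8 pawns, <= 16 pieces per side)."""
    counts = Counter(sq for row in board for sq in row if sq is not None)
    for king, pawn, is_side in (("K", "P", str.isupper), ("k", "p", str.islower)):
        total = sum(n for piece, n in counts.items() if is_side(piece))
        if counts[king] != 1 or counts[pawn] > 8 or total > 16:
            return False
    return True


def changed_squares(prev, curr):
    """Names of the squares whose contents differ between two boards."""
    files = "abcdefgh"
    return [
        f"{files[c]}{8 - r}"
        for r in range(8)
        for c in range(8)
        if prev[r][c] != curr[r][c]
    ]


def is_continuous(prev, curr):
    """Necessary (not sufficient) continuity check: a single move changes
    2 squares, en passant changes 3, and castling changes 4."""
    return 2 <= len(changed_squares(prev, curr)) <= 4


# Usage: the position after 1. e4.
after_e4 = [row[:] for row in START]
after_e4[6][4] = None  # e2 is vacated
after_e4[4][4] = "P"   # e4 is occupied
print(to_fen_placement(after_e4))  # rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR
print(changed_squares(START, after_e4))
print(is_continuous(START, after_e4))
```

A full implementation would replace is_continuous with a real legal-move generator, so that a prediction is only accepted if some legal move leads from the previous position to it.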

Related Work:

This task has seen a few scattered attempts, perhaps the most popular of which is Chessvision. Chessvision addresses a much narrower version of this problem, though: it only works for drawn 2D transcriptions of games, and it implements no bounding of possible predictions as discussed in 2.1. Still, it demonstrates the feasibility of our approach.

Data: What data are you using (if any)?

Given how much data we will need, it makes the most sense to generate our data synthetically. Namely, we hope to use Blender to generate scenes with varying boards, lighting, and chess positions. Additionally, it may be worthwhile to build out an architecture that allows fine-tuning our model on a specific chess set to improve accuracy, though this is a stretch goal.
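
As a rough sketch of how labeled scenes might be sampled before rendering, the snippet below draws a random piece placement together with a few render parameters. Everything here is an assumption for illustration: the parameter names, their ranges, and the sampling scheme are placeholders, and the Blender rendering step itself is not shown. The sampled positions respect simple piece-count bounds but are not guaranteed to be reachable in a real game.

```python
import random

# One side's full set of pieces, as FEN letters (hypothetical encoding).
FULL_SET = "K" + "Q" + "RR" + "BB" + "NN" + "P" * 8


def sample_position(rng, keep_prob=0.7):
    """Scatter a random subset of each side's pieces over an 8x8 grid.

    Kings are always kept; pawns avoid the first and last ranks.
    """
    board = [[None] * 8 for _ in range(8)]
    for side in (str.upper, str.lower):  # white, then black
        for piece in FULL_SET:
            if piece != "K" and rng.random() > keep_prob:
                continue  # this piece has been "captured"
            rows = range(1, 7) if piece == "P" else range(8)
            free = [(r, c) for r in rows for c in range(8) if board[r][c] is None]
            r, c = rng.choice(free)
            board[r][c] = side(piece)
    return board


def sample_scene(rng):
    """Hypothetical render parameters; names and ranges are placeholders."""
    return {
        "position": sample_position(rng),
        "camera_elevation_deg": rng.uniform(30.0, 90.0),
        "light_strength": rng.uniform(0.5, 2.0),
    }


scene = sample_scene(random.Random(0))  # seeded for reproducibility
for row in scene["position"]:
    print("".join(sq or "." for sq in row))
```

Because the position is sampled before the image is rendered, every generated image comes with an exact ground-truth label for free.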

Metrics: What constitutes “success?”

Success is defined rather simply in our case: we may say our model is successful if, given a picture of a chess board, we are able to construct a virtual representation matching that position. Furthermore, we hope our model will generalize well, handling pictures taken at an angle to the board, varying chess sets, different lighting, lower quality photos, etc. Additionally, it would be ideal if our model could produce its prediction in real time, allowing for the use of something like a video source.
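
One way this definition might be made measurable is sketched below. The exact-match check follows directly from the definition above; the per-square accuracy is an additional metric we are assuming here, since it gives partial credit and is more informative during training. The board encoding (8x8 lists of FEN piece letters or None) and the function names are hypothetical.

```python
def square_accuracy(pred, truth):
    """Fraction of the 64 squares whose predicted label matches the ground truth."""
    hits = sum(
        p == t
        for p_row, t_row in zip(pred, truth)
        for p, t in zip(p_row, t_row)
    )
    return hits / 64


def exact_match(pred, truth):
    """True only if every square of the predicted board is correct."""
    return pred == truth


# Usage: a board with two kings, where the prediction mislabels one square.
truth = [[None] * 8 for _ in range(8)]
truth[0][4], truth[7][4] = "k", "K"
pred = [row[:] for row in truth]
pred[7][4] = "Q"  # white king misclassified as a queen

print(square_accuracy(pred, truth))  # 0.984375 (63 of 64 squares correct)
print(exact_match(pred, truth))      # False
```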

Ethics:

Chess, perhaps humorously, has been dealing with a virtual doping scandal for some time, with the attack vector being an outside individual who can view the board, transcribe it to its virtual representation, and then feed that to a chess engine, covertly relaying the winning move back to the player. It could be argued that, were our model to be successful, it would expand this attack vector, since the manual transcription of the board would no longer be necessary. To combat this, we could feasibly obfuscate our representation or otherwise prevent it from being exported from our system.

Within chess tournaments (especially popular ones like the FIDE World Championship), it is often the job of many people to manually transcribe the board in real time for viewing and commentating purposes. While our model may make these individuals' jobs redundant in some sense, we would argue that it is still necessary for individuals to monitor and inform our model, potentially overriding it in cases of inaccuracy.