[Project] Optimization of LiveChess2FEN on Nvidia Jetson Nano (WIP)
*This is an overview page. For more details, please see the original paper and my Google Drive.
【About】
The goal of this project is to implement and deploy LiveChess2FEN (a CNN-based live chess detection system) on an Nvidia Jetson Nano, and to find optimizations and acceleration techniques that improve its overall performance.
【Why】
Automatic digitization of chess games is of great interest to players who want to broadcast their games online or analyze them with chess engines. Although previous work has shown promising results, recognition accuracy and latency still need improvement before practical, cost-effective deployment is possible. Optimizing for the Jetson Nano, a platform with limited resources, is therefore highly relevant. An automatic chess digitization system also has practical applications, such as live streaming of chess games.
【Methodologies】
Our work has three parts.
-
Model Optimization
We tried various runtime combinations, including Keras + TensorFlow (native), ONNX Runtime, and TensorRT, among others. Taking deployment difficulty for users into account, ONNX Runtime + Keras is our first choice. With these tools we reduced both the model loading time and the inference time.
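Comparing runtimes like this comes down to timing two things per backend: loading the model and running it. The following is a minimal sketch of such a timing harness, not the measurement code used in this project. The names `benchmark`, `load`, and `predict` are hypothetical, and the backend is passed in as plain callables so the same harness can wrap any runtime.

```python
import time
from typing import Any, Callable, Dict


def benchmark(
    load: Callable[[], Any],
    predict: Callable[[Any], Any],
    runs: int = 20,
) -> Dict[str, float]:
    """Time model loading once and inference averaged over `runs` calls."""
    start = time.perf_counter()
    model = load()
    load_s = time.perf_counter() - start

    # Warm-up call, excluded from the average: the first inference often
    # includes one-off setup cost (assumption, varies by backend).
    predict(model)

    start = time.perf_counter()
    for _ in range(runs):
        predict(model)
    mean_infer_s = (time.perf_counter() - start) / runs

    return {"load_s": load_s, "mean_infer_s": mean_infer_s}
```

For an ONNX Runtime backend, `load` could wrap the creation of an `onnxruntime.InferenceSession` and `predict` could wrap its `run` method. The model path and input batch would have to come from the actual deployment.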
-
Resource reuse
-
Chessboard detection
We use the method outlined in the original paper to detect the corners of the central 6x6 squares. If the result stays consistent (indicating that the board has not been moved), the system reuses the previous result. This approach is highly effective: it saved approximately 70% of the time.
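The reuse idea can be sketched as a small cache around the corner detector. This is an illustration under stated assumptions, not the project's code: `quick_detect` stands for some cheap consistency check, `full_detect` for the full board detection, and the 5-pixel tolerance is arbitrary. All three are hypothetical.

```python
from typing import Any, Callable, List, Optional, Tuple

Point = Tuple[float, float]
Corners = List[Point]  # four (x, y) corner points


def corners_match(a: Corners, b: Corners, tol: float = 5.0) -> bool:
    """Return True if every corner moved by at most `tol` pixels per axis."""
    return len(a) == len(b) and all(
        abs(ax - bx) <= tol and abs(ay - by) <= tol
        for (ax, ay), (bx, by) in zip(a, b)
    )


class CornerCache:
    """Reuse the previous corners while the board appears not to have moved."""

    def __init__(
        self,
        quick_detect: Callable[[Any], Optional[Corners]],  # hypothetical cheap check
        full_detect: Callable[[Any], Corners],  # hypothetical full detection
        tol: float = 5.0,
    ) -> None:
        self._quick_detect = quick_detect
        self._full_detect = full_detect
        self._tol = tol
        self._corners: Optional[Corners] = None

    def get(self, frame: Any) -> Corners:
        if self._corners is not None:
            estimate = self._quick_detect(frame)
            if estimate is not None and corners_match(
                estimate, self._corners, self._tol
            ):
                return self._corners  # board unchanged: skip full detection
        # First frame, or the board moved: run the expensive detection again.
        self._corners = self._full_detect(frame)
        return self._corners
```

The saving comes from how rarely the board moves during a game, so most frames take the cheap path.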
-
Chess piece recognition
We use the rules of chess to help the system recognize pieces. For instance, a king can move one square horizontally, vertically, or diagonally. With this information, the system can achieve greater accuracy. We also analyze changes in the predicted probabilities to identify which piece has been moved: the larger the change in probability on a square, the more likely it is that the moved piece is involved. However, this approach did not yield a significant improvement, saving only around 1% of the time.
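A minimal sketch of these two ideas follows, assuming the classifier returns one probability vector per square. The change measure (total variation distance), the function names, and the two-class vectors in the usage note are illustrative assumptions; the page does not specify how the change is measured.

```python
from typing import Dict, List, Sequence

Probs = Dict[str, List[float]]  # square name (e.g. "e1") -> class probabilities


def probability_change(before: Sequence[float], after: Sequence[float]) -> float:
    """Total variation distance between two probability vectors (0 to 1)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(after, before))


def most_changed_squares(prev: Probs, curr: Probs, top: int = 2) -> List[str]:
    """Rank squares by how much their predicted probabilities changed."""
    changes = {sq: probability_change(prev[sq], curr[sq]) for sq in prev}
    return sorted(changes, key=changes.get, reverse=True)[:top]


def is_king_step(src: str, dst: str) -> bool:
    """True if `dst` is exactly one square from `src` in any direction."""
    file_diff = abs(ord(src[0]) - ord(dst[0]))
    rank_diff = abs(int(src[1]) - int(dst[1]))
    return max(file_diff, rank_diff) == 1
```

For a simple move, the two squares with the largest change are candidates for the source and destination. A rule check such as `is_king_step` can then reject candidate pairs that no legal king move could produce (castling is not covered by this sketch).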
-
I/O delay
During the experiments, we found a significant hardware delay of 7 seconds. To address it, we used OpenCV and Python to implement our own camera command, which reduced the delay to 2-3 seconds.
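The page does not describe the camera command itself, so the following is only a sketch of one common way to cut capture latency with OpenCV: keep the driver buffer small and discard stale frames before decoding one. Whether buffering caused the delay in this setup is an assumption, and `open_camera`, `read_latest`, and the default of four discarded frames are hypothetical.

```python
from typing import Any, Optional


def open_camera(index: int = 0) -> Any:
    """Open a camera with OpenCV and request a one-frame driver buffer."""
    import cv2  # imported lazily so read_latest() works without OpenCV

    cap = cv2.VideoCapture(index)
    if not cap.isOpened():
        raise RuntimeError(f"cannot open camera {index}")
    # Not every backend honours this property; set() returns False if ignored.
    cap.set(cv2.CAP_PROP_BUFFERSIZE, 1)
    return cap


def read_latest(cap: Any, stale_frames: int = 4) -> Optional[Any]:
    """Discard up to `stale_frames` buffered frames, then decode the next one."""
    for _ in range(stale_frames):
        # grab() advances to the next frame without decoding it.
        if not cap.grab():
            return None
    ok, frame = cap.read()
    return frame if ok else None
```

Typical use would be `cap = open_camera(0)` once at start-up, then `frame = read_latest(cap)` whenever a board image is needed, with `cap.release()` on shutdown.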