[Computer-go] dealing with multiple local optima

Darren Cook darren at dcook.org
Fri Feb 24 07:36:09 PST 2017

> ...if it is hard to have "the good starting point" such as a trained
> policy from human expert game records, what is a way to devise one.

My first thought was to look at the DeepMind research on learning to
play video games (which I think either pre-dates the AlphaGo research,
or was done in parallel with it): https://deepmind.com/research/dqn/

DQN learns purely from trial and error, with no expert game records.
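
As a rough illustration of what "trial and error, no expert game
records" means in practice, here is a minimal sketch of tabular
Q-learning on a made-up six-cell corridor. To be clear, this is not
DeepMind's DQN, which trains a deep network on Atari frames; it is
just the simplest relative of that idea I could fit in an email. The
environment, the names step/train/greedy_policy and all parameter
values are my own assumptions for illustration.

```python
import random

N_STATES = 6        # hypothetical toy task: corridor cells 0..5, cell 5 is the goal
ACTIONS = (-1, +1)  # step left or step right


def step(state, action):
    """Move along the corridor; reward 1.0 only when the goal is reached."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values from experience alone (no expert examples)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore at random sometimes, and whenever
            # the two estimates are tied; otherwise exploit the best one.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward the observed
            # reward plus the discounted best estimate of the next state.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q


def greedy_policy(q):
    """Best learned action for every non-goal cell."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q[:-1]]
```

Calling greedy_policy(train()) should give a policy that steps right
(+1) from every cell, even though the learner starts from all-zero
values and only ever sees its own random-ish attempts. The "good
starting point" is replaced by exploration plus a reward signal.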



Darren Cook, Software Researcher/Developer
My New Book: Practical Machine Learning with H2O:
