[Computer-go] dealing with multiple local optima

terry mcintyre terrymcintyre at yahoo.com
Fri Feb 24 11:51:08 PST 2017


"seeing" is complex when the input is just a bunch of pixels.  Terry McIntyre <terrymc iintyre at yahoo.com> Unix/Linux Systems Administration Taking time to do it right saves having to do it twice. 

    On Friday, February 24, 2017 12:32 PM, Minjae Kim <xiver77 at gmail.com> wrote:
 

 But those video games have a very simple optimal policy. Consider Super Mario: if you see an enemy, step on it; if you see a hole, jump over it; if you see a pipe sticking up, also jump over it; etc.
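(Purely for illustration, not from the thread: a policy of that kind could be hand-coded in a few lines. The observation fields and action names below are hypothetical.)

    # Hypothetical hand-coded policy for a Mario-like game; "obs" is assumed
    # to expose boolean features already extracted from the pixels.
    def simple_policy(obs):
        if obs.enemy_ahead:
            return "stomp"      # step on the enemy
        if obs.hole_ahead or obs.pipe_ahead:
            return "jump"       # jump over holes and pipes
        return "run_right"      # otherwise keep moving right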

On Sat, Feb 25, 2017 at 12:36 AM, Darren Cook <darren at dcook.org> wrote:

> ...if it is hard to have "the good starting point", such as a trained
> policy from human expert game records, what is a way to devise one?

My first thought was to look at the DeepMind research on learning to
play video games (which I think either pre-dates the AlphaGo research,
or was done in parallel with it): https://deepmind.com/research/dqn/

It just learns from trial and error, no expert game records:

http://www.theverge.com/2016/6/9/11893002/google-ai-deepmind-atari-montezumas-revenge
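(As a rough sketch of what "learning from trial and error" means, here is toy tabular Q-learning with epsilon-greedy exploration; DQN replaces the table with a deep network plus experience replay and a target network. The environment interface used below is hypothetical, not a real library API.)

    import random
    from collections import defaultdict

    def q_learning(env, episodes=1000, alpha=0.1, gamma=0.99, eps=0.1):
        Q = defaultdict(float)                 # Q[(state, action)] -> value estimate
        for _ in range(episodes):
            s, done = env.reset(), False
            while not done:
                # No expert records: explore at random with probability eps,
                # otherwise act greedily on the current value estimates.
                if random.random() < eps:
                    a = random.choice(env.actions(s))
                else:
                    a = max(env.actions(s), key=lambda x: Q[(s, x)])
                s2, r, done = env.step(a)      # hypothetical: (next state, reward, done)
                target = r if done else r + gamma * max(Q[(s2, x)] for x in env.actions(s2))
                Q[(s, a)] += alpha * (target - Q[(s, a)])
                s = s2
        return Q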

Darren



--
Darren Cook, Software Researcher/Developer
My New Book: Practical Machine Learning with H2O:
  http://shop.oreilly.com/product/0636920053170.do


_______________________________________________
Computer-go mailing list
Computer-go at computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

   

