[Computer-go] AlphaZero tensorflow implementation/tutorial

uurtamo uurtamo at gmail.com
Sun Dec 9 18:31:50 PST 2018


Imagine that your score estimator has a better idea about the outcome of
the game than the players themselves.

Then you can build a stronger computer player with the following algorithm:
for every legal move, use the score estimator to evaluate the position
after that move, then play the move with the best estimated score.
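
A minimal sketch of that one-ply greedy player, in Python. The names
pick_move, legal_moves, apply_move and estimate_score are hypothetical
placeholders, not anything from cody2007's code; I am assuming the
estimator returns a number where higher is better for the side that just
moved.

```python
from typing import Callable, Iterable, Optional, TypeVar

Position = TypeVar("Position")
Move = TypeVar("Move")


def pick_move(
    position: Position,
    legal_moves: Callable[[Position], Iterable[Move]],
    apply_move: Callable[[Position, Move], Position],
    estimate_score: Callable[[Position], float],
) -> Optional[Move]:
    """Return the legal move whose resulting position scores highest.

    Returns None when there are no legal moves.
    """
    best_move = None
    best_score = float("-inf")
    for move in legal_moves(position):
        # Score the position reached by playing this move (one ply deep).
        score = estimate_score(apply_move(position, move))
        if score > best_score:
            best_move, best_score = move, score
    return best_move


# Toy stand-in for a game: positions are integers, moves add to them,
# and the "estimator" prefers positions close to 10.
toy_choice = pick_move(
    8,
    lambda p: [1, 2, -3],
    lambda p, m: p + m,
    lambda p: -abs(p - 10),
)
```

If the estimator really is better than the players, this player is at
least as good as the estimator's one-move lookahead, which is the point
of the argument.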

If you use something like Tromp-Taylor (not sure what most people use
nowadays) then you can score it less equivocally.
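
For reference, a rough sketch of Tromp-Taylor area scoring in Python. The
board encoding (a list of strings using 'B', 'W' and '.') and the function
name are my own assumptions for illustration. A player's score is the
number of their stones plus the empty points that reach only their color;
komi is left out.

```python
from typing import List, Tuple


def tromp_taylor_score(board: List[str]) -> Tuple[int, int]:
    """Return (black, white) area scores for a board of 'B', 'W', '.'."""
    rows, cols = len(board), len(board[0])
    score = {"B": 0, "W": 0}
    seen = set()
    for r in range(rows):
        for c in range(cols):
            cell = board[r][c]
            if cell in score:
                score[cell] += 1  # every stone on the board counts
            elif (r, c) not in seen:
                # Flood-fill this empty region and note which colors it touches.
                region_size = 0
                borders = set()
                stack = [(r, c)]
                seen.add((r, c))
                while stack:
                    y, x = stack.pop()
                    region_size += 1
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < rows and 0 <= nx < cols:
                            neighbor = board[ny][nx]
                            if neighbor == ".":
                                if (ny, nx) not in seen:
                                    seen.add((ny, nx))
                                    stack.append((ny, nx))
                            else:
                                borders.add(neighbor)
                # The region counts only if it reaches exactly one color.
                if len(borders) == 1:
                    score[borders.pop()] += region_size
    return score["B"], score["W"]


example = tromp_taylor_score([".B.W.", ".B.W.", ".B.W."])
```

Note that this counts stones exactly as they stand, so a stray stone
inside the opponent's area makes that area neutral until the stone is
actually captured.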

Perhaps I was misunderstanding, but if not then this could be a somewhat
serious problem.

s


On Sun, Dec 9, 2018, 6:17 PM cody2007 <cody2007 at protonmail.com> wrote:

> >By the way, why only 40 moves? That seems like the wrong place to
> economize, but maybe on 7x7 it's fine?
> I haven't implemented any resign mechanism, so felt it was a reasonable
> balance to at least see where the players roughly stand. Although I think
> I erred on the side of too few turns.
>
> >A "scoring estimate" by definition should be weaker than the computer
> players it's evaluating until there are no more captures possible.
> Not sure I understand entirely. But would agree that the scoring I use is
> probably a limitation here.
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Sunday, December 9, 2018 8:51 PM, uurtamo <uurtamo at gmail.com> wrote:
>
> A "scoring estimate" by definition should be weaker than the computer
> players it's evaluating until there are no more captures possible.
>
> Yes?
>
> s.
>
> On Sun, Dec 9, 2018, 5:49 PM uurtamo <uurtamo at gmail.com> wrote:
>
>> By the way, why only 40 moves? That seems like the wrong place to
>> economize, but maybe on 7x7 it's fine?
>>
>> s.
>>
>> On Sun, Dec 9, 2018, 5:23 PM cody2007 via Computer-go <
>> computer-go at computer-go.org> wrote:
>>
>>> Thanks for your comments.
>>>
>>> >Looks like you made it work on 7x7. 19x19 would probably give better
>>> results, especially against yourself if you are a complete novice.
>>> I'd expect that'd make me win even more against the algorithm since it
>>> would explore a far smaller fraction of the search space, right?
>>> Certainly something I'd be interested in testing, though. I would just
>>> expect it to take many more months of training, but it would be
>>> interesting to see how much performance falls apart, if at all.
>>>
>>> >To avoid cheating against gnugo, use gnugo's --play-out-aftermath
>>> parameter.
>>> Yep, I evaluate with that parameter. The problem is more that I only
>>> play 20 turns per player per game. And the network seems to like placing
>>> stones in territories "owned" by the other player. My scoring system then
>>> no longer counts that area as owned by the player. Probably playing more
>>> turns out and/or using a more sophisticated scoring system would fix this.
>>>
>>> >If I'm not mistaken, a competitive AI would need a lot more training, as
>>> Leela Zero does: https://github.com/gcp/leela-zero
>>> Yeah, I agree more training is probably the key here. I'll take a look
>>> at leela-zero.
>>>
>>> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>>> On Sunday, December 9, 2018 7:41 PM, Xavier Combelle <
>>> xavier.combelle at gmail.com> wrote:
>>>
>>> Looks like you made it work on 7x7. 19x19 would probably give better
>>> results, especially against yourself if you are a complete novice.
>>>
>>> To avoid cheating against gnugo, use gnugo's --play-out-aftermath
>>> parameter.
>>>
>>> If I'm not mistaken, a competitive AI would need a lot more training, as
>>> Leela Zero does: https://github.com/gcp/leela-zero
>>>
>>> On 10/12/2018 at 01:25, cody2007 via Computer-go wrote:
>>>
>>> Hi all,
>>>
>>> I've posted an implementation of the AlphaZero algorithm and brief
>>> tutorial. The code runs on a single GPU. While performance is not that
>>> great, I suspect it's mostly been limited by hardware (my
>>> training and evaluation have been on a single Titan X). The network can beat
>>> GNU Go about 50% of the time, although it "abuses" the scoring a little
>>> bit--which I talk a little more about in the article:
>>>
>>>
>>> https://medium.com/@cody2007.2/alphazero-implementation-and-tutorial-f4324d65fdfc
>>>
>>> -Cody
>>>
>>> _______________________________________________
>>> Computer-go mailing list
>>> Computer-go at computer-go.org
>>> http://computer-go.org/mailman/listinfo/computer-go
>>>
>>>
>>
>