[Computer-go] AlphaGo Zero SGF - Free Use or Copyright?

Álvaro Begué alvaro.begue at gmail.com
Mon Oct 30 19:20:49 PDT 2017

I am not sure how people are designing self-driving cars, but if it were up
to me, it would be very explicitly about maximizing expected utility. A
neural network can be trained to estimate the expected sum of future
rewards, usually with some exponential future discount. Actually, that's
explicitly what Q-learning does, and it's not that different from how
AlphaGo's value network works.
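
To make that concrete, here is a minimal Python sketch of the three ideas in the paragraph above: picking the action with the highest expected utility, computing a discounted sum of future rewards, and a single tabular Q-learning update. It is only an illustration under stated assumptions, not how any real car or AlphaGo is implemented. The function names (expected_utility, discounted_return, q_update) and the parameters alpha and gamma are hypothetical names chosen for this example; the utility numbers are the ones from the tree/child example quoted further down in this thread.

```python
from typing import Dict, Hashable, List, Sequence, Tuple


def expected_utility(outcomes: Sequence[Tuple[float, float]]) -> float:
    """Expected utility of one action: sum of probability * utility."""
    return sum(p * u for p, u in outcomes)


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of rewards r_t weighted by gamma**t (exponential future discount)."""
    total = 0.0
    # Work backwards so each step is: r_t + gamma * (return from t+1 onward).
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def q_update(
    q: Dict[Tuple[Hashable, Hashable], float],
    state: Hashable,
    action: Hashable,
    reward: float,
    next_state: Hashable,
    actions: List[Hashable],
    alpha: float = 0.1,
    gamma: float = 0.9,
) -> float:
    """One tabular Q-learning step; unseen (state, action) pairs count as 0."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    target = reward + gamma * best_next  # estimate of discounted future reward
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (target - old)
    return q[(state, action)]


# Illustrative numbers taken from the tree/child example in this thread.
options = {
    "A: hit the tree": [(1.0, -1000.0)],
    "B: avoid the tree": [(0.95, -5000.0), (0.05, 0.0)],
}
# A rational agent picks the option with the largest expected utility.
best = max(options, key=lambda name: expected_utility(options[name]))
```

With these utilities, best comes out as option A, because -1000 is larger than the expected utility of option B. A real system would replace the lookup table in q_update with a neural network, but the quantity being estimated is the same.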

The fact that it's hard to figure out why a neural network did what it did
is not worse than the situation with humans. We don't understand neurology
well enough to know why someone didn't see a pedestrian or a red light. And
somehow the legal system doesn't collapse. In the case of neural networks,
the case that resulted in the accident and similar cases can be added to
the training database to make future versions of the network more robust,
so over time the number of accidents should drop fast.


On Mon, Oct 30, 2017 at 6:06 PM, Pierce T. Wetter III <
pierce at alumni.caltech.edu> wrote:

> I would argue that if I were an engineer for a hypothetical autonomous car
> manufacturer, it would be critically important to keep a running
> circular buffer of all the inputs over time for the car. Sort of like how
> existing cars have Dash Cams that continuously record to flash, but only
> keep the video if you tell it to or it detects major G forces.
> To your point, I’m not sure the car would necessarily be able to tell tree
> from child, tree might be “certain large obstacle” and child is “smaller
> large obstacle”. So that would give them the same utility, -1000.
> But utility functions are rarely so straightforward in a neural network as
> you suppose.
> I think it would take differential analysis (a term I just made up) to
> determine the utility function, which is why having a continuous log of all
> the input streams is necessary.
> On Oct 30, 2017, 3:45 PM -0700, Álvaro Begué <alvaro.begue at gmail.com>,
> wrote:
> In your hypothetical scenario, if the car can give you as much debugging
> information as you suggest (100% tree is there, 95% child is there), you
> can actually figure out what's happening. The only other piece of
> information you need is the configured utility values for the possible
> outcomes.
> Say the utility of hitting a tree is -1000, the utility of hitting a child
> is -5000 and the utility of not hitting anything is 0. A rational agent
> maximizes the expected value of the utility function. So:
>  - Option A: Hit the tree. Expected utility = -1000.
>  - Option B: Avoid the tree, possibly hitting the child, if there is a
> child there after all. Expected utility: 0.95 * (-5000) + 0.05 * 0 = -4750.
> So the car should pick option A. If the configured utility function is
> such that hitting a tree and hitting a child have the same value, the
> lawyers would be correct that the programmers are endangering the public
> with their bad programming.
> Álvaro.
> On Mon, Oct 30, 2017 at 2:22 PM, Pierce T. Wetter III <
> pierce at alumni.caltech.edu> wrote:
>> Unlike humans, who have these pesky things called rights, we can abuse
>> our computer programs to deduce why they made decisions. I can see a future
>> where that has to happen. From my experience in trying to best the stock
>> market with an algorithm I can tell you that you have to be able to explain
>> why something happened, or the CEO will wrest control away from the
>> engineers.
>> Picture a court case where the engineers for an electric car are called
>> upon to testify about why a child was killed by their self-driving car. The
>> fact that the introduction of the self-driving car has reduced the accident
>> rate by 99% doesn’t matter, because the court case is about *this* car
>> and *this* child. The 99% argument is for the closing case, or for the
>> legislature, but it’s early yet.
>> The manufacturer throws up their arms and says “we dunno, sorry”.
>> Meanwhile, the plaintiff has hired someone who has manipulated the inputs
>> to the neural net, and they’ve figured out that the car struck the child,
>> because the car was 100% sure the tree was there, but it could only be 95%
>> sure the child was there. So it ruthlessly aimed for the lesser
>> probability.
>> The plaintiff’s lawyer argues that a human would have rather hit a tree
>> than a child.
>> Jury awards $100M in damages to the plaintiffs.
>> I would think it would be possible to do “differential” analysis on AGZ
>> positions to see why AGZ made certain moves. Add an eye to a weak group,
>> etc. Essentially that’s what we’re doing with MCTS, right?
>> It seems like a fun research project to try to build a system that can
>> reverse engineer AGZ, and not only would it be fun, but it's a moral
>> imperative.
>> Pierce
>> _______________________________________________
>> Computer-go mailing list
>> Computer-go at computer-go.org
>> http://computer-go.org/mailman/listinfo/computer-go