Method 'Learner.tell' is ambiguous #80

basnijholt · 2018-12-19T16:57:22Z

(original issue on GitLab)

opened by Joseph Weston (@jbweston) at 2018-05-28T12:50:24.892Z

Presently Learner.tell decides whether it has been passed a single point, or a sequence of points by checking whether both the "x" (domain) and "y" (codomain) values are iterable. If they are, then they are treated as being a sequence of points, otherwise they are treated as a single point.

This works for all the current learners because, by pure accident, we have not yet implemented learners for functions ℝ^N → ℝ^M. This recently had confusing consequences (see gitlab:#58).

We have 2 choices:

Have 2 methods (e.g. tell and tell_many) that treat the single point case and several point case respectively. This would be a pure rename of the previous (and badly named) methods add_point and add_data.
Use better logic for determining when a sequence of points is being passed, as opposed to a single point.

The text was updated successfully, but these errors were encountered:

basnijholt · 2018-12-19T16:57:24Z

originally posted by Joseph Weston (@jbweston) at 2018-05-28T12:57:26.367Z on GitLab

Option 2 gives better UX, but may be difficult to do.

The idea is to implement tell at the level of BaseLearner:

def tell(self, xs, ys):
    if is_sequence_of_points(xs, ys):
        for x, y in zip(xs, ys):
            self._tell(x, y)
    else:
        self._tell(xs, ys)

The difficult part is implementing is_sequence_of_points in general, because we don't know what a point looks like, for a general learner; a single point could a sequence of values scalar values (e.g. vector in ℝ^N).

basnijholt · 2018-12-19T16:57:25Z

originally posted by Jorn Hoofwijk (@Jorn) at 2018-05-29T09:29:03.950Z on GitLab

As we already know the dimensionality of x based on self.bounds, we could use that to determine if you are adding a single point or multiple points at once, like this (pseudocode):

def tell(self, xs, ys):
    if self.ndim == 1 and typeof xs is number:
        self._tell(xs, ys)
    elif self.ndim == 1 and xs is iterable:
        for x, y in zip(xs, ys):
             self._tell(x, y)
    elif self.ndim == 1:  # not allowed
        throw some error
    elif xs is number:  # we know self.ndim > 1
        throw some other error
    elif len(xs.shape) == 1: # we know you are only adding a single point
        self._tell(xs, ys)
    elif len(xs.shape) == 2: # you are adding a list of points
        for x, y in zip(xs, ys):
             self.tell(x, y)
    else:
        throw some more errors

Maybe somewhere we should convert the input into a numpy array if it is an iterable. (so we can call xs.shape safely) and we should check that xs and ys have the same number of entries.

basnijholt · 2018-12-19T16:57:27Z

originally posted by Joseph Weston (@jbweston) at 2018-05-29T10:25:16.095Z on GitLab

Unfortunately BaseLearner does not have bounds. We can of course provide explicit implementations of tell for each subclass, but the point was to have the common logic specified only once.

Currently subclasses should implement _tell, which only takes a single point, unless the algorithm that the learner implements can do something more efficient than "call _tell n times", in which case it should implement tell.

basnijholt · 2018-12-19T16:57:28Z

originally posted by Bas Nijholt (@basnijholt) at 2018-05-29T19:40:59.308Z on GitLab

I agree, option 2 would be the best solution.

basnijholt · 2018-12-19T16:57:29Z

originally posted by Anton Akhmerov (@anton-akhmerov) at 2018-07-02T13:59:40.269Z on GitLab

I am actually in favor of the first solution. My reason is that:

It's easy for any consumer or author to use either of these without knowing that another one exists.
To cover common cases we can provide a default implementation of both methods, so that any learner should just implement one of the two.

basnijholt · 2018-12-19T16:57:30Z

originally posted by Joseph Weston (@jbweston) at 2018-07-02T17:32:20.080Z on GitLab

I think that solution 2 is preferable from a UX standpoint, but will implementing it properly will require too much effort for too little UX reward.

I cannot think of any actual scenario where having 2 methods would actually restrict you from doing anything.

basnijholt · 2018-12-19T16:57:31Z

originally posted by Bas Nijholt (@basnijholt) at 2018-07-02T17:37:23.897Z on GitLab

OK fair enough, I agree and will close gitlab:!78.

basnijholt added the bug label Dec 19, 2018

basnijholt closed this as completed Dec 19, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Method 'Learner.tell' is ambiguous #80

Method 'Learner.tell' is ambiguous #80

basnijholt commented Dec 19, 2018

basnijholt commented Dec 19, 2018

basnijholt commented Dec 19, 2018

basnijholt commented Dec 19, 2018

basnijholt commented Dec 19, 2018

basnijholt commented Dec 19, 2018

basnijholt commented Dec 19, 2018

basnijholt commented Dec 19, 2018

Method 'Learner.tell' is ambiguous #80

Method 'Learner.tell' is ambiguous #80

Comments

basnijholt commented Dec 19, 2018

(original issue on GitLab)

basnijholt commented Dec 19, 2018

basnijholt commented Dec 19, 2018

basnijholt commented Dec 19, 2018

basnijholt commented Dec 19, 2018

basnijholt commented Dec 19, 2018

basnijholt commented Dec 19, 2018

basnijholt commented Dec 19, 2018