
Change default # of iterations in Averaged Perceptron to 10 #2305


Closed
daholste opened this issue Jan 29, 2019 · 6 comments
Labels: P2 (Priority of the issue for triage purpose: needs to be fixed at some point), perf (Performance and Benchmarking related), usability (Smoothing user interaction or experience)

Comments


daholste commented Jan 29, 2019

@justinormont figured out that setting the default # of iterations to 10 in the Averaged Perceptron learner would lead to better results.
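For context, here's a minimal sketch of what opting into 10 iterations looks like today (assuming the current ML.NET options surface, where the setting is `AveragedPerceptronTrainer.Options.NumberOfIterations`); this issue proposes making it the default:

```csharp
using Microsoft.ML;
using Microsoft.ML.Trainers;

var mlContext = new MLContext();

// Today, 10 iterations must be requested explicitly; the proposal is to
// make NumberOfIterations = 10 the default instead of 1.
var trainer = mlContext.BinaryClassification.Trainers.AveragedPerceptron(
    new AveragedPerceptronTrainer.Options
    {
        NumberOfIterations = 10
    });
```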

From: Justin Ormont
Sent: Monday, April 3, 2017 2:52:13 PM
Subject: Re: Move AveragedPerceptron defaults to iter=10

Greetings folks,

I had a chance to run larger datasets, and I think my conclusion holds.

I did a sweep of the 15GB dataset, and the 2.7TB dataset.

Sweep: 1 to 20 iterations. While the sweep is still running, it has finished most of the experiments and the pattern is pretty clear.

15GB text (note: x-axis is number of iterations, not time; y-axis is AUC)
[chart omitted]
Also run (not shown) was FastTreeBinary; its AUC, at 89.1%, is below this graph's range, and it is much, much slower.

2.7TB numeric (note: x-axis is number of iterations, not time; y-axis is AUC)
[chart omitted]

It doesn't appear that I've hit overfitting thus far in either dataset. AUC continues to increase from a low at iter=1 (far left) to a high on the right (iter=15).

How does the AP iteration count affect time?

Time was a bit odd (not a smooth graph), but generally increased with the number of iterations.

15GB text (note: x-axis is iteration count; y-axis is time)
[chart omitted]

Time was almost constant with added iterations (the noise is due to zooming). There's a ~5% runtime difference between the fastest and slowest points on this graph, with 15 iterations being fastest (likely noise).

For 1 iteration: 14,478 sec (4.0 hours)
For 10 iterations: 14,623 sec (4.1 hours)
That's a very sub-linear 1.01x growth from 1 to 10 iterations.

2.7TB numeric (note: x-axis is iteration count; y-axis is time)
[chart omitted]

Sorry, the GUI cuts off the time labels on the left; times are given below.
For 1 iteration: 111,367 sec (1.3 days);
For 10 iterations: 317,203 sec (3.7 days).
That's a sub-linear 2.8x growth from 1 to 10 iterations.

I think the 15GB text dataset fits fully in memory, which gives it a near-constant runtime vs. iterations; runtime is likely dominated by another factor, like text featurization [wild guess].
The 2.7TB dataset had to have caching turned off, so each iteration had to fetch the data from CT01; data fetch time may have dominated [wild guess].

AUC is presented because the datasets are binary. Accuracy graphs look similar, though noisier, indicating that perhaps we should look at how we're setting the binary threshold.

Memory usage
In both datasets, memory usage appears flat (plus noise) as iteration count increases.

Methodology
Both datasets are binary classification tasks, larger than those in previous experiments with AveragedPerceptron's iteration count. All experiments were run on HPC, with each experiment taking a full node until finished. Data was stored on CT01.

For the 2.7TB numeric dataset, caching, normalization, and shuffling were turned off; caching was disabled due to the dataset's size (2.7TB).

Conclusion
For AveragedPerceptron, iterations=10 seems to be an OK default for these two larger datasets; it appears the "best" (in terms of AUC/accuracy) hasn't been reached and lies above iter=15 for these.

For 10 iterations, the added duration on the 15GB dataset was negligible, and the runtime on the 2.7TB dataset grew by an additional 1.8x.

The 2.7TB dataset gains ~0.2% AUC w/ 10 iterations (a ~7% decrease in relative AUC-loss [aka, 1-AUC]). The 15GB dataset gains ~0.4% AUC w/ 10 iterations (a ~4% decrease in relative AUC-loss).
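To make the relative AUC-loss arithmetic concrete, here's the 2.7TB case worked through with a hypothetical baseline AUC of ~97.0% (the actual baseline values aren't stated above):

```csharp
// Hypothetical baseline: with AUC ≈ 97.0%, a 0.2% absolute gain works out
// to roughly the ~7% relative decrease in AUC-loss quoted above.
double aucIter1  = 0.970;                    // AUC at iter=1 (hypothetical)
double aucIter10 = 0.972;                    // AUC at iter=10: +0.2% absolute
double lossBefore = 1 - aucIter1;            // AUC-loss = 1 - AUC = 0.030
double lossAfter  = 1 - aucIter10;           // 0.028
double relativeDecrease = (lossBefore - lossAfter) / lossBefore;
Console.WriteLine($"{relativeDecrease:P1}"); // ≈ 6.7%, i.e. the ~7% above
```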

@daholste daholste changed the title Change # of default iterations in Average Perceptron to 10 Change # of default iterations in Averaged Perceptron to 10 Jan 29, 2019
@daholste daholste changed the title Change # of default iterations in Averaged Perceptron to 10 Change default # of iterations in Averaged Perceptron to 10 Jan 29, 2019

justinormont commented Jan 29, 2019

The general purpose is to make default runs better for the user.

Our current docs & benchmarks use AveragedPerceptron{iter=10}. This will simplify our user docs, and also simplify code in our AutoML sweepers.


The main body of this issue above discusses the impact on larger datasets; below is the impact on text datasets of various sizes.

We also evaluated across ~28 text datasets of various sizes:
[chart omitted]

Each line/color represents a particular ngram+chargram length, with the Pareto frontier highlighted; each connected line is a sweep across iter=N. The fastest results are to the right and the best accuracy is at the top, so points toward the top right are best.

The current default is iter=1, which does very poorly in comparison. Iter=10 sits at a nice bend in the accuracy-vs.-time curve.

You'll notice that for each featurization technique (each line), iter=10 is in a good place. For the unlabeled points, iter=10 is the 3rd point from the right on each line. The only technique with substantial gains beyond iter=10 is Trigram+Trichar.
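For reference, a sweep like this could look roughly like the sketch below in ML.NET. This is an illustrative assumption, not the actual experiment code: the data file, column names, and `ModelInput` schema are hypothetical, and it assumes the ML.NET 1.0-era API.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Trainers;

var mlContext = new MLContext(seed: 0);

// Hypothetical dataset: a tab-separated file with a boolean label and a text column.
var data = mlContext.Data.LoadFromTextFile<ModelInput>("data.tsv", hasHeader: true);
var split = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);
var featurize = mlContext.Transforms.Text.FeaturizeText("Features", "Text");

// Sweep iter = 1..20 and record AUC at each setting, as in the experiments above.
for (int iters = 1; iters <= 20; iters++)
{
    var pipeline = featurize.Append(
        mlContext.BinaryClassification.Trainers.AveragedPerceptron(
            new AveragedPerceptronTrainer.Options { NumberOfIterations = iters }));

    var model = pipeline.Fit(split.TrainSet);

    // AveragedPerceptron is uncalibrated, so evaluate without a probability column.
    var metrics = mlContext.BinaryClassification.EvaluateNonCalibrated(
        model.Transform(split.TestSet));
    Console.WriteLine($"iter={iters}: AUC={metrics.AreaUnderRocCurve:F4}");
}

public class ModelInput
{
    [LoadColumn(0)] public bool Label { get; set; }
    [LoadColumn(1)] public string Text { get; set; }
}
```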


wschin commented Jul 2, 2019

This could be a breaking change, so I think we should keep the current setting. AutoML will eventually solve this problem for us. Please feel free to reopen it if you have other concerns. Thanks.

@wschin wschin closed this as completed Jul 2, 2019
justinormont commented Jul 3, 2019

@eerhardt, @terrajobst: Would changing a hyperparameter default be considered a breaking change?

Assuming I'm reading the breaking change conversation correctly, I think this issue is specifically being called out as an example of a non-breaking change:

So I'm not sure I'm too concerned about that, if someone references something with default number of iterations 1, and we change that default to 10 per #2305 hypothetically...

I would anticipate we can refine our default hyperparameters to gravitate users more quickly toward good models. The API signature is not affected, and behavior is generally the same, though with better resulting metrics.

Some examples of us changing hyperparameters:

re: AutoML -- correct, it solves the problem -if- the user is running AutoML. Good defaults start users off on the right footing. We should make their first model great.

@justinormont justinormont reopened this Jul 3, 2019

terrajobst commented Jul 3, 2019

Presumably a hyperparameter is a property value that can be set, and that has a default if not set?

Changing default values can break people, but we generally consider this in the realm of acceptable breaking changes, unless the change specifically makes fewer scenarios work (e.g. if it disallows more inputs).


eerhardt commented Jul 8, 2019

I think we've decided that changing default values is an acceptable change in #3689. (Note, on the .NET team, we consider that any change could be a breaking change 😉.)

Note that when you change a parameter's default value, the change doesn't take effect until consuming code is recompiled. Suppose a user has code calling Foo(), where in version 1 Foo has a default parameter bool bar = false, and we change the default of bar from false to true in version 2. If that code was compiled against version 1 and executed against version 2, the value of bar will still be false. This is because in C# default values are compiled into the calling assembly.
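A quick illustration of that call-site baking, with a hypothetical Foo matching the example above:

```csharp
using System;

// Consumer code, compiled against version 1 of the library below:
Lib.Foo(); // the C# compiler emits Lib.Foo(false) into the *consumer's* assembly

// If version 2 of the library changes the default to `bool bar = true`,
// the already-compiled call above still passes false until the consumer
// recompiles, because the default value was baked in at the call site.
public static class Lib
{
    // Version 1 of the "library" (hypothetical).
    public static void Foo(bool bar = false) => Console.WriteLine(bar);
}
```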

@harishsk harishsk added the P2 Priority of the issue for triage purpose: Needs to be fixed at some point. label Jan 12, 2020
najeeb-kazmi commented

Tracking in #4749
