Commit edd528a

GalOshri authored and Shauheen committed
Add release notes for ML.NET 0.2 (dotnet#301)
* Add release notes for ML.NET 0.2
* Adding release note about TextLoader changes and additional issue/PR references
* Addressing comments: fixing typos, changing formatting, and adding references
1 parent 62da34e commit edd528a

1 file changed: +95 -0 lines changed

docs/release-notes/0.2/release-0.2.md

@@ -0,0 +1,95 @@
# ML.NET 0.2 Release Notes

We would like to thank the community for its engagement so far and for helping
us shape ML.NET.

Today we are releasing ML.NET 0.2. This release focuses on addressing
questions/issues, adding clustering to the list of supported machine learning
tasks, enabling the use of in-memory data to train models, easier model
validation, and more.

### Installation

ML.NET supports Windows, macOS, and Linux. See [supported OS versions of .NET
Core 2.0](https://github.com/dotnet/core/blob/master/release-notes/2.0/2.0-supported-os.md)
for more details.

You can install the ML.NET NuGet package from the CLI using:
```
dotnet add package Microsoft.ML
```

From the Package Manager Console:
```
Install-Package Microsoft.ML
```

### Release Notes

Below are some of the highlights from this release.

* Added clustering to the list of supported machine learning tasks

  * Clustering is an unsupervised learning task that groups sets of items
    based on their features. It identifies which items are more similar to
    each other than to other items. This might be useful in scenarios such as
    organizing news articles into groups based on their topics, segmenting
    users based on their shopping habits, and grouping viewers based on
    their taste in movies.

  * ML.NET 0.2 exposes `KMeansPlusPlusClusterer`, which implements [K-Means++
    clustering](http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf)
    with [Yinyang K-means
    acceleration](https://www.microsoft.com/en-us/research/publication/yinyang-k-means-a-drop-in-replacement-of-the-classic-k-means-with-consistent-speedup/?from=http%3A%2F%2Fresearch.microsoft.com%2Fapps%2Fpubs%2Fdefault.aspx%3Fid%3D252149).
    [This
    test](https://github.com/dotnet/machinelearning/blob/78810563616f3fcb0b63eb8a50b8b2e62d9d65fc/test/Microsoft.ML.Tests/Scenarios/ClusteringTests.cs)
    shows how to use it (from
    [#222](https://github.com/dotnet/machinelearning/pull/222)).

* Train on in-memory data objects with `CollectionDataSource`, in addition to
  loading data from a file. ML.NET 0.1 enabled loading data from a delimited
  text file. `CollectionDataSource` in ML.NET 0.2 adds the ability to use a
  collection of objects as the input to a `LearningPipeline`. See sample usage
  [here](https://github.com/dotnet/machinelearning/blob/78810563616f3fcb0b63eb8a50b8b2e62d9d65fc/test/Microsoft.ML.Tests/CollectionDataSourceTests.cs#L133)
  (from [#106](https://github.com/dotnet/machinelearning/pull/106)).

* Easier model validation with cross-validation and train-test

  * [Cross-validation](https://en.wikipedia.org/wiki/Cross-validation_(statistics))
    is an approach to validating how well your model performs statistically.
    It does not require a separate test dataset, but rather uses your
    training data to test your model (it partitions the data so different
    data is used for training and testing, and it does this multiple times).
    [Here](https://github.com/dotnet/machinelearning/blob/78810563616f3fcb0b63eb8a50b8b2e62d9d65fc/test/Microsoft.ML.Tests/Scenarios/SentimentPredictionTests.cs#L51)
    is an example of doing cross-validation (from
    [#212](https://github.com/dotnet/machinelearning/pull/212)).

  * Train-test is a shortcut to testing your model on a separate dataset.
    See example usage
    [here](https://github.com/dotnet/machinelearning/blob/78810563616f3fcb0b63eb8a50b8b2e62d9d65fc/test/Microsoft.ML.Tests/Scenarios/SentimentPredictionTests.cs#L36).

  * Note that the `LearningPipeline` is prepared the same way in both cases.

* Speed improvement for predictions: by not creating a parallel cursor for
  dataviews that only have one element, we get a significant speed-up for
  predictions (see
  [#179](https://github.com/dotnet/machinelearning/issues/179) for a few
  measurements).

* Updated `TextLoader` API: the `TextLoader` API is now code generated and was
  updated to take explicit declarations for the columns in the data, which is
  required in some scenarios. See
  [#142](https://github.com/dotnet/machinelearning/pull/142).

* Added daily NuGet builds of the project: daily NuGet builds of ML.NET are
  now available
  [here](https://dotnet.myget.org/feed/dotnet-core/package/nuget/Microsoft.ML).

Additional issues closed in this milestone can be found [here](https://github.com/dotnet/machinelearning/milestone/1?closed=1).

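To make the clustering highlight above more concrete, here is a minimal, illustrative sketch of k-means with k-means++ style seeding, written in plain Python only to keep it short. It is not ML.NET's `KMeansPlusPlusClusterer`, it does not use the ML.NET API, and it leaves out the Yinyang acceleration entirely. The function names (`kmeans_pp_seeds`, `kmeans`) and the sample points are assumptions made up for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_seeds(points, k, rng):
    """Pick k initial centers: the first uniformly at random, the rest with
    probability proportional to squared distance from the nearest center."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(squared_distance(p, c) for c in centers) for p in points]
        if sum(weights) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd iterations after seeding.

    Returns the centers and the labels from the last assignment step.
    """
    rng = random.Random(seed)
    centers = kmeans_pp_seeds(points, k, rng)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: move each center to the mean of its assigned points.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centers, labels


if __name__ == "__main__":
    # Hypothetical sample: two well-separated groups of 2-D points.
    sample = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
              (10.0, 10.0), (10.1, 9.9), (9.9, 10.2)]
    centers, labels = kmeans(sample, k=2)
    print(labels)
```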
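The cross-validation highlight above describes partitioning the training data so that different data is used for training and testing, repeated several times. The following Python sketch shows one way that partitioning could look (k-fold, without shuffling). It is a conceptual illustration under stated assumptions, not the ML.NET cross-validation API; `k_fold_splits`, `cross_validate`, and the toy mean-based "model" are hypothetical names introduced here.

```python
def k_fold_splits(n_items, n_folds):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    if not 2 <= n_folds <= n_items:
        raise ValueError("n_folds must be between 2 and n_items")
    indices = list(range(n_items))
    # Spread any remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_items, n_folds)
    start = 0
    for fold in range(n_folds):
        stop = start + base + (1 if fold < extra else 0)
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test
        start = stop


def cross_validate(data, n_folds, train_fn, score_fn):
    """Train on each training split, score on the held-out split,
    and return one score per fold."""
    scores = []
    for train_idx, test_idx in k_fold_splits(len(data), n_folds):
        model = train_fn([data[i] for i in train_idx])
        scores.append(score_fn(model, [data[i] for i in test_idx]))
    return scores


if __name__ == "__main__":
    # Toy stand-ins: the "model" is the mean of the training values and
    # the score is the mean absolute error on the held-out values.
    def train_mean(rows):
        return sum(rows) / len(rows)

    def mean_abs_error(model, rows):
        return sum(abs(r - model) for r in rows) / len(rows)

    values = [float(v) for v in range(1, 11)]
    print(cross_validate(values, 5, train_mean, mean_abs_error))
```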
### Acknowledgements

Shoutout to tincann, rantri, yamachu, pkulikov, Sorrien, v-tsymbalistyi, Ky7m,
forki, jessebenson, mfaticaearnin, and the ML.NET team for their contributions
as part of this release!
