
Commit 6e6b9a8 (parent ea4af60)

Reintroduced ordering + fixing conversion typos

14 files changed: +77 -45 lines

docs/source/index.rst

Lines changed: 1 addition & 1 deletion

@@ -26,6 +26,6 @@ decomposition, and selection of features and samples.
    intro
    installation
    reference
-   examples/index
+   tutorials
    contributing
    bibliography

docs/source/intro.rst

Lines changed: 1 addition & 1 deletion

@@ -1,5 +1,5 @@
 What's in scikit-matter?
-========================
+=======================
 
 ``scikit-matter`` is a collection of `scikit-learn <https://scikit.org/>`_
 compatible utilities that implement methods born out of the materials science

docs/source/tutorials.rst

Lines changed: 28 additions & 0 deletions

@@ -0,0 +1,28 @@
+.. include:: examples/index.rst
+   :start-after: inclusion-examples-start
+   :end-before: inclusion-examples-end
+
+.. toctree::
+   :glob:
+   :Caption: PCovR and KernelPCovR
+
+   examples/PCovR*
+
+.. toctree::
+   :glob:
+   :Caption: Feature and Sample Selection
+
+   examples/FeatureSelection*
+   examples/Selectors-Pipelines*
+
+.. toctree::
+   :Caption: Orthogonal Regression
+
+   examples/OrthogonalRegressionNonAnalytic
+
+.. toctree::
+   :Caption: Feature Reconstruction Measures
+
+   examples/PlotGFRE
+   examples/PlotPointwiseGFRE
+   examples/PlotLFRE

examples/FeatureSelection-WHODataset.py

Lines changed: 11 additions & 10 deletions

@@ -16,7 +16,8 @@
 from skmatter.feature_selection import CUR, FPS, PCovCUR, PCovFPS
 from skmatter.preprocessing import StandardFlexibleScaler
 
-
+# %%
+#
 # Load the Dataset
 # ----------------
 
@@ -78,7 +79,7 @@
 # %%
 #
 # Scale and Center the Features and Targets
-# -----------------------------------------
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 x_scaler = StandardFlexibleScaler(column_wise=True)
 X = x_scaler.fit_transform(X_raw)
@@ -95,7 +96,7 @@
 # %%
 #
 # Provide an estimated target for the feature selector
-# ----------------------------------------------------
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 
 kernel_params = {"kernel": "rbf", "gamma": 0.08858667904100832}
@@ -112,7 +113,7 @@
 
 # %%
 # PCov-CUR
-# --------
+# ^^^^^^^^
 
 
 pcur = PCovCUR(n_to_select=n_select, progress_bar=True, mixing=0.0)
@@ -121,7 +122,7 @@
 # %%
 #
 # PCov-FPS
-# --------
+# ^^^^^^^^
 
 pfps = PCovFPS(
     n_to_select=n_select,
@@ -134,7 +135,7 @@
 # %%
 #
 # CUR
-# ---
+# ^^^
 
 cur = CUR(n_to_select=n_select, progress_bar=True)
 cur.fit(X_train, y_train)
@@ -143,15 +144,15 @@
 # %%
 #
 # FPS
-# ---
+# ^^^
 
 fps = FPS(n_to_select=n_select, progress_bar=True, initialize=cur.selected_idx_[0])
 fps.fit(X_train, y_train)
 
 # %%
 #
-# (For Comparison) Recurisive Feature Addition
-# --------------------------------------------
+# (For Comparison) Recursive Feature Addition
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 
 class RecursiveFeatureAddition:
@@ -180,7 +181,7 @@ def fit(self, X, y):
 # %%
 #
 # Plot our Results
-# ================
+# ----------------
 
 
 fig, axes = plt.subplots(
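
Aside (not part of this commit): a minimal sketch of the shared selector API the
headings above refer to, using random placeholder data in place of the WHO
features; the constructor arguments mirror the ones in the diff.

    import numpy as np
    from skmatter.feature_selection import CUR, FPS, PCovCUR

    X = np.random.RandomState(0).rand(100, 20)
    y = np.random.RandomState(1).rand(100, 1)
    n_select = 5

    # plain CUR: leverage-score selection driven by X alone
    cur = CUR(n_to_select=n_select)
    cur.fit(X, y)

    # FPS seeded from the first CUR pick, as the example itself does
    fps = FPS(n_to_select=n_select, initialize=cur.selected_idx_[0])
    fps.fit(X, y)

    # PCov-CUR: mixing=0.0 weights the selection entirely toward regressing y
    pcur = PCovCUR(n_to_select=n_select, mixing=0.0)
    pcur.fit(X, y)

    X_selected = cur.transform(X)  # keep only the selected columns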

examples/FeatureSelection.py

Lines changed: 1 addition & 1 deletion

@@ -67,7 +67,7 @@
 # %%
 #
 # Non-iterative feature selection with CUR + PCovR
-# ------------------------------------------------
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 #
 # Computing a non-iterative CUR is more efficient, although can resultin poorer
 # performance for larger datasets. you can also use a greater number of
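
Aside (not part of this commit): the "non-iterative" variant mentioned here
presumably corresponds to computing the CUR leverage scores once rather than
after every pick; a hedged sketch, assuming ``CUR``'s ``recompute_every``
argument (0 = never recompute):

    from skmatter.feature_selection import CUR

    # hedged: recompute_every=0 is assumed to skip score updates between picks
    cur_noniterative = CUR(n_to_select=10, recompute_every=0)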

examples/OrthogonalRegressionNonAnalytic.py

Lines changed: 6 additions & 3 deletions

@@ -236,9 +236,10 @@ def z_scaled_square_prism(z_scaling):
 
 ax_xy.set_title("xy plane")
 
-plt.legend(bbox_to_anchor=(1, 1), loc="upper left")
+ax_xy.legend(bbox_to_anchor=(1, 1), loc="upper left")
 
-plt.show()
+fig.tight_layout()
+fig.show()
 
 # %%
 #
@@ -279,7 +280,9 @@ def z_scaled_square_prism(z_scaling):
 )
 ax_wo_orth.set_xlabel("scaling in z direction")
 ax_wo_orth.legend(loc="upper right", bbox_to_anchor=(0.7, -0.2))
-plt.show()
+
+fig.tight_layout()
+fig.show()
 
 # %%
 #
examples/PCovR-WHODataset.py

Lines changed: 12 additions & 13 deletions

@@ -7,7 +7,6 @@
 # %%
 #
 
-
 import numpy as np
 from matplotlib import pyplot as plt
 from scipy.stats import pearsonr
@@ -23,7 +22,7 @@
 # %%
 #
 # Load the Dataset
-# ================
+# ----------------
 
 
 df = load_who_dataset()["data"]
@@ -55,14 +54,14 @@
     print(X_raw[:, columns.index(ls)].min(), X_raw[:, columns.index(ls)].max())
     if ls in columns:
         X_raw[:, columns.index(ls)] = np.log10(X_raw[:, columns.index(ls)])
-y_raw = np.array(df["SP.DYN.LE00.IN"])  # [np.where(df['Year']==2000)[0]])
+y_raw = np.array(df["SP.DYN.LE00.IN"])
 y_raw = y_raw.reshape(-1, 1)
 X_raw.shape
 
 # %%
 #
 # Scale and Center the Features and Targets
-# -----------------------------------------
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 x_scaler = StandardFlexibleScaler(column_wise=True)
 X = x_scaler.fit_transform(X_raw)
@@ -79,7 +78,7 @@
 # %%
 #
 # Train the Different Linear DR Techniques
-# ========================================
+# ----------------------------------------
 #
 # Best Error for Linear Regression
 
@@ -90,7 +89,7 @@
 # %%
 #
 # PCovR
-# -----
+# ^^^^^
 
 pcovr = PCovR(
     n_components=n_components,
@@ -113,7 +112,7 @@
 # %%
 #
 # PCA
-# ---
+# ^^^
 
 pca = PCA(
     n_components=n_components,
@@ -140,10 +139,10 @@
 # %%
 #
 # Train the Different Kernel DR Techniques
-# ========================================
+# ----------------------------------------
 #
 # Select Kernel Hyperparameters
-# -----------------------------
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 param_grid = {"gamma": np.logspace(-8, 3, 20), "alpha": np.logspace(-8, 3, 20)}
 clf = KernelRidge(kernel="rbf")
@@ -168,7 +167,7 @@
 # %%
 #
 # KPCovR
-# ------
+# ^^^^^^
 
 
 kpcovr = KernelPCovR(
@@ -191,7 +190,7 @@
 # %%
 #
 # KPCA
-# ----
+# ^^^^
 
 kpca = KernelPCA(n_components=n_components, **kernel_params, random_state=0).fit(
     X_train, y_train
@@ -210,15 +209,15 @@
 # %%
 #
 # Correlation of the different variables with the KPCovR axes
-# -----------------------------------------------------------
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 for c, x in zip(columns, X.T):
     print(c, pearsonr(x, T_kpcovr[:, 0])[0], pearsonr(x, T_kpcovr[:, 1])[0])
 
 # %%
 #
 # Plot Our Results
-# ================
+# ----------------
 
 fig, axes = plt.subplot_mosaic(
     """

examples/PCovR.py

Lines changed: 2 additions & 2 deletions

@@ -164,5 +164,5 @@
 # it's important to consider the nature of the property you are learning and the samples
 # you are comparing before constructing a kernel, for example, whether the analysis is
 # to be based on whole structures or individual atomic environments. For more detail,
-# see Appendix C of [Helfrecht 2020](https://iopscience.iop.org/article/10.1088/2632-2153/aba9ef)
-# or regarding kernels involving gradients [Musil 2021](https://arxiv.org/pdf/2101.08814.pdf).
+# see Appendix C of `Helfrecht 2020 <https://iopscience.iop.org/article/10.1088/2632-2153/aba9ef>`_
+# or regarding kernels involving gradients `Musil 2021 <https://arxiv.org/pdf/2101.08814.pdf>`_.

examples/PCovR_Regressors.py

Lines changed: 4 additions & 4 deletions

@@ -33,7 +33,7 @@
 # %%
 #
 # Use the default regressor in PCovR
-# ==================================
+# ----------------------------------
 #
 # When there is no regressor supplied, PCovR uses
 # ``sklearn.linear_model.Ridge('alpha':1e-6, 'fit_intercept':False, 'tol':1e-12)``.
@@ -50,7 +50,7 @@
 # %%
 #
 # Use a fitted regressor
-# ======================
+# ----------------------
 #
 # You can pass a fitted regressor to PCovR to rely on the predetermined
 # regression parameters. Currently, scikit-matter supports ``scikit-learn``
@@ -80,7 +80,7 @@
 # %%
 #
 # Use a pre-predicted y
-# =====================
+# ---------------------
 #
 # With ``regressor='precomputed'``, you can pass a regression output :math:`\hat{Y}` and
 # optional regression weights :math:`W` to PCovR. If ``W=None``, then PCovR will determine
@@ -110,7 +110,7 @@
 # %%
 #
 # Comparing Results
-# =================
+# -----------------
 #
 # Because we used the same regressor in all three models, they will yield the same result.
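
Aside (not part of this commit): a hedged sketch of the three regressor modes
these headings describe, on random placeholder data:

    import numpy as np
    from sklearn.linear_model import Ridge
    from skmatter.decomposition import PCovR

    X = np.random.RandomState(0).rand(50, 8)
    y = np.random.RandomState(1).rand(50)

    # 1. no regressor supplied: PCovR falls back on the default Ridge above
    pcovr_default = PCovR(mixing=0.5, n_components=2).fit(X, y)

    # 2. a regressor fitted ahead of time
    ridge = Ridge(alpha=1e-6, fit_intercept=False, tol=1e-12).fit(X, y)
    pcovr_fitted = PCovR(mixing=0.5, n_components=2, regressor=ridge).fit(X, y)

    # 3. a pre-predicted y passed directly; W is left for PCovR to determine
    y_hat = ridge.predict(X)
    pcovr_pre = PCovR(mixing=0.5, n_components=2, regressor="precomputed").fit(X, y_hat)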

examples/PCovR_Scaling.py

Lines changed: 3 additions & 3 deletions

@@ -143,6 +143,6 @@
 
 # %%
 #
-# **Note: when the relative magnitude of the features or targets is important, such as
-# in load_csd_1000r, one should use the `StandardFlexibleScaler` provided by
-# ``scikit-matter``.**
+# **Note**: When the relative magnitude of the features or targets is important, such
+# as in :func:`skmatter.datasets.load_csd_1000r`, one should use the
+# :class:`skmatter.preprocessing.StandardFlexibleScaler`.
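
Aside (not part of this commit): a minimal sketch of the scaler this note
recommends; with ``column_wise=True`` each feature is standardized separately
rather than the matrix being scaled by one overall norm:

    import numpy as np
    from skmatter.preprocessing import StandardFlexibleScaler

    # columns of very different magnitude, as in the CSD-1000r case
    X = np.random.RandomState(0).rand(10, 3) * np.array([1.0, 10.0, 100.0])

    X_scaled = StandardFlexibleScaler(column_wise=True).fit_transform(X)
    print(X_scaled.std(axis=0))  # each column now has unit variance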

examples/PlotPointwiseGFRE.py

Lines changed: 1 addition & 1 deletion

@@ -50,7 +50,7 @@
 # \exp(-\gamma \|\mathbf{x}-\mathbf{x}'\|^2),\quad \gamma\in\mathbb{R}_+
 #
 # The projected RKHS features are computed using the eigendecomposition of the
-# positive-definite kernel matrix :math:`K``
+# positive-definite kernel matrix :math:`K`
 #
 # .. math::
 #     K = ADA^T = AD^{\frac12}(AD^{\frac12})^T = \Phi\Phi^T
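
Aside (not part of this commit): a hedged numpy sketch of this construction,
building :math:`\Phi = AD^{\frac12}` so that :math:`K = \Phi\Phi^T`:

    import numpy as np

    rng = np.random.RandomState(0)
    X = rng.rand(20, 3)
    gamma = 1.0

    # RBF kernel matrix K_ij = exp(-gamma * ||x_i - x_j||^2)
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-gamma * sq_dists)

    D, A = np.linalg.eigh(K)  # K = A @ np.diag(D) @ A.T, with D >= 0 up to round-off
    D = np.clip(D, 0.0, None)
    Phi = A * np.sqrt(D)  # equals A @ np.diag(np.sqrt(D)); rows are the RKHS features

    assert np.allclose(Phi @ Phi.T, K, atol=1e-8)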

examples/README.rst

Lines changed: 5 additions & 1 deletion

@@ -1,9 +1,13 @@
+.. inclusion-examples-start
+
 Examples
-########
+========
 
 For a thorough tutorial of the methods introduced in `scikit-matter`, we
 suggest you check out the pedagogic notebooks in our companion project
 `kernel-tutorials <https://github.com/lab-cosmo/kernel-tutorials/>`_.
 
 The examples presented here need on top of the `scikit-matter` dependencies
 `pandas <https://pandas.pydata.org>`_ and `matplotlib <https://matplotlib.org>`_.
+
+.. inclusion-examples-end

examples/Selectors-Pipelines.py

Lines changed: 2 additions & 2 deletions

@@ -23,7 +23,7 @@
 # %%
 #
 # Simple integration of scikit-matter selectors
-# =============================================
+# ---------------------------------------------
 #
 # This example shows how to use FPS to subselect features before training a RidgeCV.
 
@@ -50,7 +50,7 @@
 # %%
 #
 # Stacking selectors one after another
-# ====================================
+# ------------------------------------
 #
 # This example shows how to use an FPS, then CUR selector
 # to subselect features before training a RidgeCV.
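
Aside (not part of this commit): a hedged sketch of both patterns these
headings name, on random placeholder data:

    import numpy as np
    from sklearn.linear_model import RidgeCV
    from sklearn.pipeline import Pipeline
    from skmatter.feature_selection import CUR, FPS

    X = np.random.RandomState(0).rand(40, 12)
    y = np.random.RandomState(1).rand(40)

    # FPS subselects features, then RidgeCV trains on the reduced matrix
    pipe = Pipeline([("fps", FPS(n_to_select=6)), ("ridge", RidgeCV())])
    pipe.fit(X, y)

    # stacked selectors: FPS first, then CUR on the surviving features
    stacked = Pipeline(
        [("fps", FPS(n_to_select=8)), ("cur", CUR(n_to_select=4)), ("ridge", RidgeCV())]
    )
    stacked.fit(X, y)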

skmatter/datasets/descr/nice_dataset.rst

Lines changed: 0 additions & 3 deletions

@@ -2,7 +2,6 @@
 
 NICE dataset
 ############
-
 This is a toy dataset containing NICE[1, 4](N-body Iterative Contraction of Equivariants) features for first 500 configurations of the dataset[2, 3] with randomly displaced methane configurations.
 
 Function Call
@@ -11,10 +10,8 @@ Function Call
 
 Data Set Characteristics
 ------------------------
-
 :Number of Instances: 500
 :Number of Features: 160
-
 The representations were computed using the NICE package[4] using the following definition of the NICE calculator:
 
 .. code-block:: python
