Commit b5ac873

Merge pull request #13 from dayyass/add_pip
add pip install / release v0.1.0
2 parents b38b67d + 3bee084 commit b5ac873

12 files changed, +42 -14 lines changed

Diff for: .coveragerc

+1 -1
@@ -1,6 +1,6 @@
 [run]
 branch = True
-source = graph_clustering
+source = graph_based_clustering

 [report]
 exclude_lines =

Diff for: .gitignore

+2
@@ -15,3 +15,5 @@
 venv
 .idea
 dist
+
+*.egg-info/

Diff for: README.md

+6 -8
@@ -27,18 +27,14 @@ pip install graph-based-clustering

 ### Usage

-**graph-based-clustering** has two clustering methods:
-- ConnectedComponentsClustering
-- SpanTreeConnectedComponentsClustering
-
-Both of these methods has sklearn-like `fit/fit_predict` interface.
+The library has sklearn-like `fit/fit_predict` interface.

 #### ConnectedComponentsClustering

-This method makes pairwise distances matrix on the input data, uses *threshold* (parameter given by the user) to binarize pairwise distances matrix and make undirected graph, and then finds connected components to perform the clustering.
+This method computes pairwise distances matrix on the input data, and using *threshold* (parameter provided by the user) to binarize pairwise distances matrix makes an undirected graph in order to find connected components to perform the clustering.

 Required arguments:
-- **threshold** - threshold to binarize pairwise distances matrix and make undirected graph
+- **threshold** - paremeter to binarize pairwise distances matrix and make undirected graph

 Optional arguments:
 - **metric** - sklearn.metrics.[pairwise_distances](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise_distances.html) parameter (default: *"euclidean"*)
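
Note on the hunk above: the README text describes thresholding a pairwise distance matrix and labelling the connected components of the resulting undirected graph. The following is a minimal, dependency-free sketch of that idea, not the library's implementation. The function name `threshold_components` is hypothetical, the metric is fixed to Euclidean, and treating a distance equal to the threshold as an edge (`<=`) is an assumption.

```python
import math
from typing import Dict, List, Sequence


def threshold_components(
    points: Sequence[Sequence[float]], threshold: float
) -> List[int]:
    """Label points by connected components of the thresholded distance graph.

    Assumption: two points are joined by an edge when their Euclidean
    distance is <= threshold.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest, one tree per component

    def find(i: int) -> int:
        # Walk up to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Binarize the pairwise distances and merge the endpoints of every edge.
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= threshold:
                parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    labels: List[int] = []
    seen: Dict[int, int] = {}
    for i in range(n):
        labels.append(seen.setdefault(find(i), len(seen)))
    return labels


X = [[0, 1], [1, 0], [1, 1], [5, 5], [5, 6]]
print(threshold_components(X, threshold=1.5))  # [0, 0, 0, 1, 1]
```

With a smaller threshold such as 0.5 no pair is connected, so every point ends up in its own cluster.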
@@ -47,6 +43,7 @@ Optional arguments:
 Example:

 ```python3
+import numpy as np
 from graph_based_clustering import ConnectedComponentsClustering

 X = np.array([[0, 1], [1, 0], [1, 1]])
@@ -66,7 +63,7 @@ labels_pred = clustering.fit_predict(X)

 #### SpanTreeConnectedComponentsClustering

-This method makes pairwise distances matrix on the input data, consider this matrix as a graph, finds minimum spanning trees, and finaly, to perform the clustering, makes graph with *n_clusters* (parameter given by the user) connected components by removing *n_clusters - 1* edges with highest weights.
+This method computes pairwise distances matrix on the input data, builds a graph on the obtained matrix, finds minimum spanning tree, and finaly, performs the clustering through dividing the graph into *n_clusters* (parameter given by the user) by removing *n-1* edges with the highest weights.

 Required arguments:
 - **n_clusters** - the number of clusters to find
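
Note on the hunk above: the description amounts to building a minimum spanning tree over the complete distance graph and cutting its heaviest edges. Removing *n_clusters - 1* edges from a spanning tree leaves exactly *n_clusters* connected components, which is what the removed wording stated and what the new "*n-1*" presumably means. The sketch below illustrates this with Kruskal's algorithm in plain Python. It is an illustration under assumptions, not the library's code: the name `span_tree_clusters` is hypothetical, the metric is fixed to Euclidean, and ties between equal edge weights are broken by point index.

```python
import math
from typing import Dict, List, Sequence, Tuple


def _find(parent: List[int], i: int) -> int:
    # Union-find root lookup with path halving.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def span_tree_clusters(
    points: Sequence[Sequence[float]], n_clusters: int
) -> List[int]:
    """Cluster by cutting the n_clusters - 1 heaviest minimum-spanning-tree edges."""
    n = len(points)
    if not 1 <= n_clusters <= n:
        raise ValueError("n_clusters must be between 1 and the number of points")

    # Every edge of the complete graph, sorted by Euclidean weight.
    edges: List[Tuple[float, int, int]] = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )

    # Kruskal: keep an edge only if it joins two different trees.
    parent = list(range(n))
    mst: List[Tuple[float, int, int]] = []
    for weight, i, j in edges:
        ri, rj = _find(parent, i), _find(parent, j)
        if ri != rj:
            parent[ri] = rj
            mst.append((weight, i, j))

    # mst is in ascending weight order, so the heaviest edges are at the end.
    kept = mst[: len(mst) - (n_clusters - 1)]

    # Connected components of the pruned tree give the cluster labels.
    parent = list(range(n))
    for _, i, j in kept:
        parent[_find(parent, i)] = _find(parent, j)
    labels: List[int] = []
    seen: Dict[int, int] = {}
    for i in range(n):
        labels.append(seen.setdefault(_find(parent, i), len(seen)))
    return labels


X = [[0, 0], [0, 1], [4, 0], [4, 1], [10, 0]]
print(span_tree_clusters(X, n_clusters=3))  # [0, 0, 1, 1, 2]
```

Enumerating all pairs costs O(n^2) edges, which is fine for a sketch but would be a poor fit for large inputs.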
@@ -78,6 +75,7 @@ Optional arguments:
 Example:

 ```python3
+import numpy as np
 from graph_based_clustering import SpanTreeConnectedComponentsClustering

 X = np.array([[0, 1], [1, 0], [1, 1]])
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.

Diff for: notebooks/plot_cluster_comparison.ipynb

+1 -1
@@ -55,7 +55,7 @@
 "\n",
 "import sys\n",
 "sys.path.append(\"..\")\n",
-"from graph_clustering import ConnectedComponentsClustering, SpanTreeConnectedComponentsClustering\n",
+"from graph_based_clustering import ConnectedComponentsClustering, SpanTreeConnectedComponentsClustering\n",
 "\n",
 "np.random.seed(0)"
 ],

Diff for: pyproject.toml

+3
@@ -0,0 +1,3 @@
+[build-system]
+requires = ["setuptools", "wheel"]
+build-backend = "setuptools.build_meta"

Diff for: requirements.txt

-1
@@ -2,7 +2,6 @@ coverage==5.5
 jupyter==1.0.0
 matplotlib==3.4.3
 numpy==1.21.2
-pandas==1.3.3
 parameterized==0.8.1
 pre-commit==2.15.0
 scikit-learn==0.24.2

Diff for: setup.cfg

+23
@@ -0,0 +1,23 @@
+[metadata]
+name = graph-based-clustering
+version = 0.1.0
+author = Dani El-Ayyass
+author_email = [email protected]
+description = Graph-Based Clustering using connected components and spanning trees
+long_description = file: README.md
+long_description_content_type = text/markdown
+url = https://github.com/dayyass/graph-based-clustering
+project_urls =
+    Bug Tracker = https://github.com/dayyass/graph-based-clustering/issues
+classifiers =
+    Programming Language :: Python :: 3
+    License :: OSI Approved :: MIT License
+    Operating System :: OS Independent
+
+[options]
+packages = find:
+python_requires = >=3.7
+install_requires =
+    numpy >= 1.21.2
+    scikit-learn >= 0.24.2
+    scipy >= 1.7.1

Diff for: tests/test_graph_clustering.py

+6 -3
@@ -5,18 +5,21 @@
 from sklearn.metrics import rand_score
 from sklearn.preprocessing import StandardScaler

-from graph_clustering.check import (
+from graph_based_clustering.check import (
     _check_matrix,
     _check_matrix_is_square,
     _check_square_matrix_is_symmetric,
     check_adjacency_matrix,
     check_symmetric,
 )
-from graph_clustering.main import (
+from graph_based_clustering.main import (
     ConnectedComponentsClustering,
     SpanTreeConnectedComponentsClustering,
 )
-from graph_clustering.utils import _pairwise_distances, distances_to_adjacency_matrix
+from graph_based_clustering.utils import (
+    _pairwise_distances,
+    distances_to_adjacency_matrix,
+)

 from .utils import prepare_sklearn_clustering_datasets
