GPSE example #10118
Merged
Conversation
puririshi98 added a commit that referenced this pull request on Apr 4, 2025:
[Graph Positional and Structural Encoder](https://arxiv.org/abs/2307.07107) implementation as per #8310. Adapted from the original repository: https://github.com/G-Taxonomy-Workgroup/GPSE. This version is a standalone implementation that is decoupled from GraphGym and thus aims for better accessibility and a smoother integration into PyG. While the priority of this PR is to enable loading and using pre-trained models in plug-and-play fashion, it also includes the custom loss function used to train the model. Nevertheless, it might be easier to use the original repository for pre-training and fine-tuning new GPSE models for the time being.

This PR includes the following:
- `GPSE`: The main GPSE module, which generates learned encodings for input graphs.
- Several helper classes (`FeatureEncoder`, `GNNStackStage`, `IdentityHead`, `GNNInductiveHybridMultiHead`, `ResGatedGCNConvGraphGymLayer`, `Linear`, `MLP`, `GeneralMultiLayer`, `GeneralLayer`, `BatchNorm1dNode`, `BatchNorm1dEdge`, `VirtualNodePatchSingleton`) and wrapper functions (`GNNPreMP`, `GNNLayer`), all adapted from their GraphGym versions for compatibility and for loading weights pre-trained with the GraphGym/original version.
- The class method `GPSE.from_pretrained()`, which returns a model with pre-trained weights from the original repository/Zenodo files.
- `GPSENodeEncoder`: A helper linear/MLP encoder that takes the GPSE encodings precomputed as `batch.pestat_GPSE` in the input graphs, maps them to a desired dimension, and appends them to the node features.
- `precompute_GPSE`: A function that takes a GPSE model and a dataset, and precomputes GPSE encodings in-place for the given dataset using the helper function `gpse_process_batch`.
- The transform `AddGPSE`, which, in similar fashion to `AddLaplacianEigenvectorPE` and `AddRandomWalkPE`, adds the GPSE encodings to a given graph using the helper function `gpse_process`.
- The testing modules `test/test_gpse.py` and `test/test_add_gpse.py`.
- The loss function `gpse_loss` and the helper functions `cosim_col_sep` and `process_batch_idx` used in GPSE training.
- A comprehensive example in `examples/gpse.py`, provided as a separate PR in #10118.

This PR has been tested with the latest (25.01-py3) NVIDIA stack and works without any issues.

---------

Co-authored-by: Semih Cantürk <=>
Co-authored-by: rusty1s <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Rishi Puri <[email protected]>
Co-authored-by: Rishi Puri <[email protected]>
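As a rough illustration of the plug-and-play usage described in that PR, a minimal sketch of loading pre-trained weights is shown below; the import path and the `'molpcba'` weight name are assumptions based on the description above, not verified against the final API:

```python
from torch_geometric.nn import GPSE

# Load weights pre-trained on one of the available datasets; 'molpcba' is
# assumed here to be a valid name (the checkpoint is fetched from Zenodo).
model = GPSE.from_pretrained('molpcba')
model.eval()
```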
puririshi98 approved these changes on Apr 22, 2025:
LGTM, thanks!
auto-merge was automatically disabled on April 22, 2025 at 19:49.
Following #9018, this PR provides a comprehensive example in `examples/gpse.py` that computes GPSE encodings and uses them for a graph regression task on the ZINC dataset. Two methods of computing GPSE encodings are demonstrated:

`precompute_GPSE`: Given a PyG dataset, computes GPSE encodings in-place once before training, without saving them to storage. This is ideal if you want to compute the encodings only once per run (unlike a dataset transform) but do not want to save the pre-transformed dataset to storage (unlike a PyG pre-transform). By default the example runs with the `molpcba` pretrained weights; to run with pretrained weights from any other dataset, please provide the pretraining dataset name from the available options as a kwarg. A rough sketch of this workflow follows.
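A minimal sketch of the `precompute_GPSE` workflow on ZINC, assuming `GPSE` and `precompute_GPSE` are importable from `torch_geometric.nn` and that `'molpcba'` is the default weight name; this illustrates the description above and is not a copy of `examples/gpse.py`:

```python
from torch_geometric.datasets import ZINC
from torch_geometric.loader import DataLoader
from torch_geometric.nn import GPSE, precompute_GPSE

# Load pre-trained GPSE weights (the weight name is an assumption).
gpse = GPSE.from_pretrained('molpcba')

# Compute the encodings in-place, once per run; nothing is written to disk.
train_dataset = ZINC('data/ZINC', subset=True, split='train')
precompute_GPSE(gpse, train_dataset)

# Each graph now carries its encoding as `data.pestat_GPSE`.
train_loader = DataLoader(train_dataset, batch_size=128, shuffle=True)
```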
`AddGPSE` transform: A PyG transform analogous to `AddLaplacianEigenvectorPE` and `AddRandomWalkPE` that can be used as a pre-transform or transform on a PyG dataset. Using it as a transform is not recommended, since recomputing the encodings for every batch in every epoch is quite inefficient; using it as a pre-transform or going through `precompute_GPSE` is suggested instead. In either case, `torch_geometric.nn.GPSENodeEncoder` is then used to map the GPSE encodings to the desired dimension and append them to `batch.x`, preparing them as inputs to a GNN (see the sketch below).

This PR has been tested with the latest (25.01-py3) NVIDIA stack and works without any issues.
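A hedged sketch of the `AddGPSE` pre-transform path combined with `GPSENodeEncoder`; the constructor arguments and import locations are assumptions drawn from the description above, so consult the docstrings for the actual signatures:

```python
from torch_geometric.datasets import ZINC
from torch_geometric.loader import DataLoader
from torch_geometric.nn import GPSE, GPSENodeEncoder
from torch_geometric.transforms import AddGPSE

gpse = GPSE.from_pretrained('molpcba')  # assumed weight name

# As a pre-transform, the encodings are computed once and cached with the
# processed dataset, avoiding recomputation for every batch in every epoch.
dataset = ZINC('data/ZINC_GPSE', subset=True, split='train',
               pre_transform=AddGPSE(gpse))
loader = DataLoader(dataset, batch_size=128, shuffle=True)

# Maps `batch.pestat_GPSE` to a chosen dimension and appends it to `batch.x`
# (the argument names below are illustrative placeholders).
encoder = GPSENodeEncoder(dim_emb=64, dim_pe_in=32, dim_pe_out=16,
                          expand_x=False)

for batch in loader:
    batch = encoder(batch)  # `batch.x` now includes the GPSE encodings
    break
```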