92609e70-f79c-46b5-8419-55726e873cfc

Generated from the 140,000 most starred projects on GitHub as of October 2016. The model was produced by a legacy pipeline that performed no splitting or stemming of identifiers, and it was later converted with quality loss.
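
To make "no splitting" concrete, here is a minimal sketch of what identifier splitting typically looks like. It is an illustration only: `split_identifier` is a hypothetical helper and the regular expression is an assumption, not the splitting code used by any sourced.ml pipeline. Because this model was built without such a step, an identifier like `getUserName` would presumably appear as a single token rather than as `get`, `user`, `name`.

```python
import re

# Assumed splitting rule (not taken from sourced.ml): acronym runs,
# capitalized or lowercase words, and digit runs each become one part.
_PART = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|[0-9]+")


def split_identifier(identifier):
    """Split on underscores and camelCase boundaries, lowercasing each part."""
    return [part.lower() for part in _PART.findall(identifier)]


print(split_identifier("getUserName"))    # ['get', 'user', 'name']
print(split_identifier("HTTPServer_v2"))  # ['http', 'server', 'v', '2']
print(split_identifier("read_file"))      # ['read', 'file']
```

Stemming would be a further step applied to each part and is not shown here.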

Example:

from sourced.ml.models import Id2Vec
# Load the model by its ID.
id2vec = Id2Vec().load("92609e70-f79c-46b5-8419-55726e873cfc")
# len() gives the number of tokens in the model.
print("Number of tokens:", len(id2vec))

References

ID: 92609e70-f79c-46b5-8419-55726e873cfc
Uploaded: 2017-06-18 17:37:06.255615
Version: 1.0.0
File: https://storage.googleapis.com/models.cdn.sourced.tech/models%2Fid2vec%2F92609e70-f79c-46b5-8419-55726e873cfc.asdf
Size: 1.1 GB
Data collection date: October 2016
Number of (sub)tokens: 5,720,096
Number of repositories: 112,273
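
The File URL listed above contains percent-encoded slashes (`%2F`). The following sketch, using only the Python standard library, decodes it to recover the object path and a local file name. `describe_model_url` is a hypothetical helper, and reading the first path component as a storage bucket is an assumption based on the URL's shape.

```python
from urllib.parse import unquote, urlsplit

MODEL_URL = (
    "https://storage.googleapis.com/models.cdn.sourced.tech/"
    "models%2Fid2vec%2F92609e70-f79c-46b5-8419-55726e873cfc.asdf"
)


def describe_model_url(url):
    """Return (bucket, object_path, file_name) for a URL shaped like MODEL_URL."""
    path = urlsplit(url).path.lstrip("/")
    # Assumption: the first path component names the bucket.
    bucket, _, encoded_object = path.partition("/")
    object_path = unquote(encoded_object)  # turns %2F back into "/"
    file_name = object_path.rsplit("/", 1)[-1]
    return bucket, object_path, file_name


bucket, object_path, file_name = describe_model_url(MODEL_URL)
print(bucket)       # models.cdn.sourced.tech
print(object_path)  # models/id2vec/92609e70-f79c-46b5-8419-55726e873cfc.asdf
print(file_name)    # 92609e70-f79c-46b5-8419-55726e873cfc.asdf

# To fetch the file directly (about 1.1 GB), one option is:
# import urllib.request
# urllib.request.urlretrieve(MODEL_URL, file_name)
```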
License