-
Notifications
You must be signed in to change notification settings - Fork 278
Update pos_tag_transformers
function
#865
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update pos_tag_transformers
function
#865
Conversation
Hello @pavaris-pm! Thanks for opening this PR. We checked the lines you've touched for PEP 8 issues, and found:
|
Kudos, SonarCloud Quality Gate passed!
|
pos _tag_transformers
functionpos_tag_transformers
function
_blackboard_support_engine = { | ||
"bert" : "lunarlist/pos_thai", | ||
} | ||
|
||
_pud_support_engine = { | ||
"wangchanberta" : "Pavarissy/wangchanberta-ud-thai-pud-upos", | ||
"mdeberta" : "Pavarissy/mdeberta-v3-ud-thai-pud-upos", | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note that i've made a dictionary to match the model name with the input engine so that it would be easier to maintain internal code in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice
I think in general we need to update/expand the naming convention to cover our function names and model names as well. Also to better enforce it.
Current naming convention for files
#141
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bact agreed!, setting convention will reduce much complexity in future development in both user side and contributor side as well. Do you need me to put this into the pythainlp discussion section so that we can inform other contributors and also newcomers in the future too? if you're ok, i will put it in rn (of course that i will put it in Thai language krub)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes please. Very appreciated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! 💯
What does this changes
from #866, i've updated
pos_tag_transformers
function by clean up the code, add docstring, fix deprecation, and change the output format of the function to make it be the same format as other tagger in PyThaiNLPWhat was wrong
in #857 ,
pos_tag_transformers
was added which consist of 3 models, however, to call and engine, the full name of it must be specified, also the output still not the same format as another tagger. For examplewhich is very hard for the normal user to remember its entire name, and may result in more mess in the internal code if another transformers model trained on new corpus are added. we will end up with a lot of if-else condition in order to call a model in the future
How this fixes it
i've cleaned up the code to let a user call a model with parameters named
engine
andcorpus
same as what we have from the former function that ispos_tag
andpos_tag_sents
and also fix output format. This will reduce how hard to remember the entire model name. Here is the newly added version:Your checklist for this pull request
🚨Please review the guidelines for contributing to this repository.