Skip to content

Added Burrows-Wheeler transform algorithm. #1029

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Jul 17, 2019
Merged

Added Burrows-Wheeler transform algorithm. #1029

merged 16 commits into from
Jul 17, 2019

Conversation

brunohadlich
Copy link
Contributor

I added a docstring with parameters and return explanation but I'm not sure what is the right way to write it, I looked at google for a reference and only found PEP 257 that does not define exactly how these things should be written in a strict way.

I also found this stack overflow question that talks about PEP 287 and how reStructuredText (reST) format is used by Sphinx to generate documentation and seems to be a good approach.

I think it would be nice to adopt a style to standardize the project documentation. What you guys think?

…e 'python3 -m doctest -v data_structures/hashing/*.py' and 'python3 -m doctest -v data_structures/stacks/*.py' were failing not finding hash_table.py and stack.py modules.
…ere is a space after it the script will break making it much more error prone.
…. Limited line length to 79 and executed python black over all scripts.
Copy link
Member

@cclauss cclauss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work. I approve as is but consider using PEP8 and PEP484 --> PEP526...
https://pep8.org/#programming-recommendations
https://mypy.readthedocs.io/en/stable/cheat_sheet_py3.html

"""


def all_rotations(string):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use Python type hints https://mypy.readthedocs.io/en/stable/cheat_sheet_py3.html They take some getting used to but can catch bugs. Our automated testing already runs mypy on all pull requests.

Unfortunately, string is the name of a module in the Python Standard Library and str is the name of a Python data type so to avoid shadowing them, do not use those words as variable names. This is one of the very few places where I would advocate using a single letter variable name: s.

:param str string: The string that will be rotated len(string) times.
:return: A list with len(string) rotations of the parameter string.
:rtype: list[str]
:raises TypeError: If the string parameter type is not str.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type hints + a textual description might actually be more helpful than restructured text but if you stick with restructured text then avoid repeating things that are already in the type hints.

...
TypeError: The parameter string type must be str.
"""
if not (type(string) is str):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PEP8 says

Object type comparisons should always use isinstance() instead of comparing types directly.

flake8 E721 will find these automatically.

:raises TypeError: If the string parameter type is not str.
Examples:

>>> all_rotations("^BANANA|")
Copy link
Member

@cclauss cclauss Jul 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

>>> all_rotations("^BANANA|") # doctest: +NORMALIZE_WHITESPACE

This comment is a doctest directive that allows you to lose the backslashes and indent the way your want because PEP8 does not advocate the use of backslashes. It says:

The preferred way of wrapping long lines is by using Python’s implied line continuation inside parentheses, brackets and braces. Long lines can be broken over multiple lines by wrapping expressions in parentheses. These should be used in preference to using a backslash for line continuation.

...
ValueError: The parameter string must not be empty.
"""
if not (type(string) is str):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isinstance()



if __name__ == "__main__":
string = input("Provide a string that I will generate its BWT transform: ")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider doing input().strip() to remove leading and/or trailing whitespace from user input.

@brunohadlich
Copy link
Contributor Author

@cclauss my last commit takes into account your last comments. Thanks they were really helpful. If everything is ok I think it can be merged.

@cclauss cclauss merged commit e662a5a into TheAlgorithms:master Jul 17, 2019
stokhos pushed a commit to stokhos/Python that referenced this pull request Jan 3, 2021
* Added doctest and more explanation about Dijkstra execution.

* tests were not passing with python2 due to missing __init__.py file at number_theory folder

* Removed the dot at the beginning of the imported modules names because 'python3 -m doctest -v data_structures/hashing/*.py' and 'python3 -m doctest -v data_structures/stacks/*.py' were failing not finding hash_table.py and stack.py modules.

* Moved global code to main scope and added doctest for project euler problems 1 to 14.

* Added test case for negative input.

* Changed N variable to do not use end of line scape because in case there is a space after it the script will break making it much more error prone.

* Added problems description and doctests to the ones that were missing. Limited line length to 79 and executed python black over all scripts.

* Changed the way files are loaded to support pytest call.

* Added __init__.py to problems to make them modules and allow pytest execution.

* Added project_euler folder to test units execution

* Changed 'os.path.split(os.path.realpath(__file__))' to 'os.path.dirname()'

* Added Burrows-Wheeler transform algorithm.

* Added changes suggested by cclauss
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants