Encode utf8 strings. Fixes errors when non-ascii chars are used. #11

Merged
merged 3 commits into from
Oct 6, 2017
9 changes: 7 additions & 2 deletions robotframework_reportportal/model.py
@@ -54,9 +54,14 @@ def __init__(self, name=None, parent_type="SUITE", attributes=None):
         self.status = attributes["status"]

     def get_name(self):
-        assignment = "{0} = ".format(", ".join(self.assign)) if self.assign else ""
+        assign = ", ".join(self.assign).encode("utf8")
+        assignment = "{0} = ".format(assign) if self.assign else ""
         arguments = ", ".join(self.args)
-        full_name = "{0}{1} ({2})".format(assignment, self.name, arguments)
+        full_name = "{0}{1} ({2})".format(
@krasoffski krasoffski Jun 30, 2017

I don't have the context of the problem or of Robot Framework's internal implementation, so I may be wrong.
From my perspective, we need to convert everything to Unicode first, then build the string from its parts, and only then convert the single resulting string to UTF-8 bytes (in Python 3, encode returns a bytes object).

On Python 3.5:

>>> s = "Русский"
>>> print(s)
Русский
>>> type(s)
<class 'str'>
>>> rb=s.encode("utf8")
>>> repr(rb)
"b'\\xd0\\xa0\\xd1\\x83\\xd1\\x81\\xd1\\x81\\xd0\\xba\\xd0\\xb8\\xd0\\xb9'"
>>> repr("{0}".format(rb))
'"b\'\\\\xd0\\\\xa0\\\\xd1\\\\x83\\\\xd1\\\\x81\\\\xd1\\\\x81\\\\xd0\\\\xba\\\\xd0\\\\xb8\\\\xd0\\\\xb9\'"'

On Python 2.7:

>>> s = u"Русский"
>>> print(s)
Русский
>>> type(s)
<type 'unicode'>
>>> rb=s.encode("utf8")
>>> repr(rb)
"'\\xd0\\xa0\\xd1\\x83\\xd1\\x81\\xd1\\x81\\xd0\\xba\\xd0\\xb8\\xd0\\xb9'"
>>> repr("{0}".format(rb))
"'\\xd0\\xa0\\xd1\\x83\\xd1\\x81\\xd1\\x81\\xd0\\xba\\xd0\\xb8\\xd0\\xb9'"

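This needs to be checked on both py2 and py3.
Note also that Python's format accepts a bytes argument without raising, but on Python 3 it inserts the bytes repr (b'...') rather than the text, as the session above shows.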
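
To illustrate the ordering described above, here is a minimal sketch for Python 3 only. The helper names `join_then_encode` and `encode_then_format` are hypothetical and are not part of robotframework_reportportal; they only contrast encoding once at the end with encoding a part before formatting.

```python
def join_then_encode(parts):
    """Build the full text first, then encode the finished string once."""
    text = ", ".join(parts)        # still text (str)
    return text.encode("utf8")     # bytes only at the very end


def encode_then_format(parts):
    """Encode a part early, then format the bytes into a text template."""
    encoded = ", ".join(parts).encode("utf8")
    # On Python 3, str.format() inserts the repr of the bytes object here.
    return "{0} = ".format(encoded)


good = join_then_encode(["Русский", "${var}"])
bad = encode_then_format(["Русский"])
```

On Python 3, `good` decodes back to the original text, while `bad` is a text string that literally starts with `b'\xd0\xa0`, which is presumably not what should end up in a report.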

@frizzby This is a common problem with Robot Framework, but it should not be fixed with encode.

The reason

Robot Framework works exclusively with unicode strings.
Meanwhile, "some string {}".format(robot_framework_value) causes an encoding error on Python 2, because the template is a byte string and Python tries to encode the unicode value (robot_framework_value) to ASCII (please correct me if I'm wrong).
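
A rough way to see this failure without a Python 2 interpreter is sketched below. It runs on Python 3 and only emulates the implicit step: the explicit `.encode("ascii")` stands in for the conversion Python 2 performs on its own. The function names `py2_style_format` and `try_format` are hypothetical.

```python
def py2_style_format(template, value):
    """Emulate Python 2 formatting a unicode value into a byte-string template.

    Python 2 converts the unicode argument with the default ASCII codec
    before substituting it; the explicit encode below stands in for that
    implicit step.
    """
    as_bytes = value.encode("ascii")  # raises UnicodeEncodeError for non-ASCII
    return template.format(as_bytes.decode("ascii"))


def try_format(value):
    """Return the formatted string, or a short marker if encoding fails."""
    try:
        return py2_style_format("some string {}", value)
    except UnicodeEncodeError:
        return "UnicodeEncodeError"


ascii_result = try_format("plain")
non_ascii_result = try_format("Русский")
```

ASCII-only values pass through unchanged, which is why the problem tends to show up only once a keyword name or argument contains non-ASCII characters.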

How to fix it properly

  • Please remove all .encode calls.
  • Change all string literals to unicode literals. For instance (file model.py):

Before:

    def get_name(self):
        assignment = "{0} = ".format(", ".join(self.assign)) if self.assign else ""
        arguments = ", ".join(self.args)
        full_name = "{0}{1} ({2})".format(assignment, self.name, arguments)
        return full_name[:256]

After:

    def get_name(self):
        assignment = u"{0} = ".format(u", ".join(self.assign)) if self.assign else ""
        arguments = u", ".join(self.args)
        full_name = u"{0}{1} ({2})".format(assignment, self.name, arguments)
        return full_name[:256]
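
As a quick check of the suggestion above, the same logic can be pulled into a standalone function and exercised with non-ASCII input. This is only a sketch: `build_keyword_name` and its `limit` parameter are hypothetical names used for illustration, and the `u""` prefixes are accepted by Python 2.7 and by Python 3.3 or later.

```python
# -*- coding: utf-8 -*-


def build_keyword_name(assign, name, args, limit=256):
    """Build a display name from text parts only, with no encode calls."""
    assignment = u"{0} = ".format(u", ".join(assign)) if assign else u""
    arguments = u", ".join(args)
    full_name = u"{0}{1} ({2})".format(assignment, name, arguments)
    # Slicing text truncates by characters, not by encoded bytes.
    return full_name[:limit]


example = build_keyword_name([u"${результат}"], u"Ключевое слово", [u"арг1", u"арг2"])
```

Because every part stays text until the end, truncating to 256 cannot cut a multi-byte UTF-8 sequence in half, which slicing an already encoded byte string could do.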

+            assignment,
+            self.name.encode("utf8"),
+            arguments.encode("utf8")
+        )
         return full_name[:256]

     def get_type(self):