UnicodeEncodeError: 'ascii' codec can't encode character [with ugettext_lazy] #34

Closed
madisvain opened this issue Sep 28, 2016 · 8 comments · Fixed by #47

@madisvain

Seems like anymail does not handle unicode correctly.

UnicodeEncodeError: 'ascii' codec can't encode character u'\u0142' in position 0: ordinal not in range(128)
  File "django/core/handlers/base.py", line 149, in get_response
    response = self.process_exception_by_middleware(e, request)
  File "django/core/handlers/base.py", line 147, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "django/utils/decorators.py", line 149, in _wrapped_view
    response = view_func(request, *args, **kwargs)
  File "gsmtasks/users/views.py", line 250, in password_reset
    form.save()
  File "gsmtasks/users/forms.py", line 174, in save
    msg.send()
  File "django/core/mail/message.py", line 292, in send
    return self.get_connection(fail_silently).send_messages([self])
  File "anymail/backends/base.py", line 86, in send_messages
    sent = self._send(message)
  File "anymail/backends/base_requests.py", line 56, in _send
    return super(AnymailRequestsBackend, self)._send(message)
  File "anymail/backends/base.py", line 116, in _send
    response = self.post_to_esp(payload, message)
  File "anymail/backends/base_requests.py", line 69, in post_to_esp
    response = self.session.request(**params)
  File "opbeat/instrumentation/packages/base.py", line 63, in __call__
    args, kwargs)
  File "opbeat/instrumentation/packages/base.py", line 222, in call_if_sampling
    return self.call(module, method, wrapped, instance, args, kwargs)
  File "opbeat/instrumentation/packages/requests.py", line 38, in call
    return wrapped(*args, **kwargs)
  File "requests/sessions.py", line 461, in request
    prep = self.prepare_request(req)
  File "requests/sessions.py", line 394, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "requests/models.py", line 298, in prepare
    self.prepare_body(data, files, json)
  File "requests/models.py", line 452, in prepare_body
    body = self._encode_params(data)
  File "requests/models.py", line 97, in _encode_params
    return urlencode(result, doseq=True)
  File "python2.7/urllib.py", line 1357, in urlencode
    l.append(k + '=' + quote_plus(str(elt)))
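
The last frame is where the failure happens: on Python 2, `str()` applied to a unicode value encodes it with the default ASCII codec. A minimal sketch of the same failure follows, written for Python 3, where the encode step has to be spelled out. The `quote_value` helper is hypothetical and only mimics that one step; it is not the `urllib` code itself.

```python
# Python 3 sketch of what Python 2's str(u"...") did implicitly:
# encode the text with the ASCII codec.
def quote_value(value):
    """Hypothetical stand-in for the str(elt) step in Python 2's urlencode."""
    return value.encode("ascii")


subject = "Wymagane zresetowanie hasła do GSMtasks"

try:
    quote_value(subject)
    error = None
except UnicodeEncodeError as exc:
    error = exc

# The exception records which character could not be encoded.
print(repr(error.object[error.start]), error.encoding)
```

The character reported by the exception is the same `\u0142` that appears in the error message above.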
@medmunds
Contributor

Anymail is meant to handle Unicode, but there may be some cases we missed.

  • Which ESP are you using?
  • What EmailMessage attr was that Unicode character in? (If you're not sure, seeing the code where you build the EmailMessage would be helpful to narrow things down.)

@madisvain
Author

Hi, I'll hopefully be looking into this with our team as well.

  • We are using Mailgun as an ESP currently.
  • It seems to be the subject - I have attached the Sentry error here.
    screenshot 2016-09-28 21 31 04

So the test string would probably be "Wymagane zresetowanie hasła do GSMtasks". The "ł" is a Polish character, I suppose.

@medmunds
Contributor

Hmm, I'm having trouble reproducing the problem. This seems to work with the Mailgun backend:

# -*- coding: utf-8 -*-
from django.core.mail.message import EmailMessage
msg = EmailMessage(
    u"Wymagane zresetowanie hasła do GSMtasks",
    u"Kliknij aby zresetować hasło",
    '[email protected]', ['[email protected]'])
msg.send()

What version of requests do you have installed, and what version of python? (I've just tested this on python 2.7.11 with both requests 2.9.1 and 2.11.1.)
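
One way to collect those version numbers is sketched below. It assumes Python 3.8 or later for `importlib.metadata`, so it would not run on the Python 2.7 setup discussed in this thread; the `installed_version` helper is a hypothetical name used only for illustration.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Interpreter version as "major.minor.micro".
python_version = ".".join(str(part) for part in sys.version_info[:3])

print("python", python_version)
print("requests", installed_version("requests"))
```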

@medmunds
Contributor

Looking at the requests source, it seems like the only way you'd see this error (in python 2.x) is if your subject were somehow a unicode string, but isinstance(subject, unicode) returned False.

Is there any chance the subject is a Django ugettext_lazy translation string? If so, you could call django.utils.encoding.force_text on it before placing it in the EmailMessage subject.

(I'm trying to figure out whether Anymail should be calling force_text for you before handing off to requests. I have this vague memory of a similar problem with the from_email, which we ended up fixing in Django's EmailMessage code.)
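
To illustrate the situation described above, here is a rough Python 3 sketch. `LazyText` is a hypothetical stand-in for a lazy translation proxy, not Django's actual class, and `str()` plays the role that `django.utils.encoding.force_text` would play in a real project.

```python
class LazyText:
    """Hypothetical stand-in for a lazy translation proxy (not Django's class)."""

    def __init__(self, func):
        self._func = func  # called only when the text is actually needed

    def __str__(self):
        return self._func()


subject = LazyText(lambda: "Wymagane zresetowanie hasła do GSMtasks")

# The proxy renders as text, but it is not a string instance, so library
# code that branches on isinstance() can take the wrong path.
looks_like_str = isinstance(subject, str)

# Forcing the value yields a real string that is safe to pass along.
forced = str(subject)

print(looks_like_str, type(forced).__name__)
```

Forcing the subject before building the `EmailMessage` means non-Django code never sees the proxy object.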

@medmunds
Contributor

@madisvain were you able to track this down to a Django lazy translation string, or otherwise reproduce it? As I mentioned above, Anymail+Mailgun seems to handle the unicode characters just fine.

@anderspetersson

@medmunds I had the same problem and it was due to a lazy translation string, force_text fixed it.

@medmunds changed the title from "UnicodeEncodeError: 'ascii' codec can't encode character" to "UnicodeEncodeError: 'ascii' codec can't encode character [with ugettext_lazy]" on Oct 23, 2016
@medmunds
Contributor

@anderspetersson thanks for the confirmation; I updated the issue title to reflect this.

There are actually some potentially-serious problems with using ugettext_lazy but not explicitly forcing the result to non-lazy text somewhere. Even if you're not seeing encoding errors, you could end up with unexpected results. The issue isn't specific to Anymail. I posted to the django-dev list seeking clarification; will see what comes out of the discussion there.

@medmunds
Contributor

OK, I've become convinced that as a package with "Django" in its name, one of django-anymail's responsibilities is handling ugettext_lazy objects for its users. Anymail is the bridge between the Django world and non-Django ESP APIs, so it needs to handle converting Django lazy strings for consumption by those APIs.

The fix needs to go into Anymail somewhere in the base backend, because it applies to every EmailMessage attribute that might be text. (Or that might contain text -- consider message.metadata = {'dept': ugettext_lazy("Sales")}.) And it's relevant for every ESP, whether we're using requests or calling the ESP's own python API package.
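
As a rough sketch of what converting nested values might look like (this is not Anymail's actual code), the example below walks a structure such as `metadata` and replaces lazy values with plain strings. `Promise`, `LazyText`, `is_lazy_value`, and `resolve_lazy` are all hypothetical names defined here so the example stays self-contained.

```python
class Promise:
    """Hypothetical marker base class for lazy objects (stand-in, not Django's)."""


class LazyText(Promise):
    """Hypothetical lazy string that computes its text on demand."""

    def __init__(self, func):
        self._func = func

    def __str__(self):
        return self._func()


def is_lazy_value(obj):
    return isinstance(obj, Promise)


def resolve_lazy(obj):
    """Return obj with any lazy values, including nested ones, turned into str."""
    if is_lazy_value(obj):
        return str(obj)
    if isinstance(obj, dict):
        return {resolve_lazy(k): resolve_lazy(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        # Rebuild the same container type with resolved items.
        return type(obj)(resolve_lazy(item) for item in obj)
    return obj


metadata = {"dept": LazyText(lambda: "Sales"), "tags": [LazyText(lambda: "a"), "b"]}
resolved = resolve_lazy(metadata)
print(resolved)
```

Doing the conversion in one shared place is what lets it cover every attribute and every ESP, rather than relying on each caller to remember it.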

@medmunds added the bug label Oct 25, 2016
@medmunds modified the milestone: next Dec 5, 2016
medmunds added a commit that referenced this issue Dec 30, 2016
In BasePayload, ensure any Django ugettext_lazy
(or similar) are converted to real strings before
handing off to ESP code. This resolves problems where
calling code expects it can use lazy strings "anywhere",
but non-Django code (requests, ESP packages) don't
always handle them correctly.

* Add utils helpers for lazy objects (is_lazy, force_non_lazy*)
* Add lazy object handling to utils.Attachment
* Add lazy object handling converters to BasePayload attr
  processing where appropriate. (This ends up varying by
  the expected attribute type.)

Fixes #34.