StringUtils.uriDecode decodes strings with non-ASCII characters incorrectly #32360

Closed
Romanow88 opened this issue Mar 4, 2024 · 2 comments
Labels
in: core (Issues in core modules: aop, beans, core, context, expression), type: documentation (A documentation task)
Romanow88 commented Mar 4, 2024

Affects: 6.1.3

StringUtils.uriDecode takes one of two paths depending on whether "%" is found in the input. One of these paths does not handle non-ASCII characters.

Fast path:

StringUtils.uriDecode("ü", StandardCharsets.UTF_8) // returns "ü"

Replace path:

StringUtils.uriDecode("%20ü", StandardCharsets.UTF_8) // returns " �"
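
One plausible explanation for the difference, sketched here with plain JDK classes: if a decoder copies every non-"%" character into a byte buffer as a single byte, then ü (U+00FC) becomes the lone byte 0xFC, which is not valid UTF-8 and decodes to the replacement character U+FFFD. The class NaiveUriDecoder and its decode method are hypothetical stand-ins written for illustration only; they are an assumption about the mechanism, not the Spring source.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; this is NOT the Spring implementation.
final class NaiveUriDecoder {

    static String decode(String source, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(source.length());
        boolean changed = false;
        for (int i = 0; i < source.length(); i++) {
            char ch = source.charAt(i);
            if (ch == '%') {
                if (i + 2 >= source.length()) {
                    throw new IllegalArgumentException("Incomplete escape: " + source.substring(i));
                }
                int hi = Character.digit(source.charAt(i + 1), 16);
                int lo = Character.digit(source.charAt(i + 2), 16);
                if (hi == -1 || lo == -1) {
                    throw new IllegalArgumentException("Invalid escape: " + source.substring(i));
                }
                out.write((hi << 4) | lo);
                i += 2;
                changed = true;
            }
            else {
                // write(int) keeps only the low 8 bits, so a non-ASCII char
                // is stored as one raw byte instead of its encoded form.
                out.write(ch);
            }
        }
        // Fast path: nothing was percent-decoded, so the input is returned untouched.
        return changed ? new String(out.toByteArray(), charset) : source;
    }

    public static void main(String[] args) {
        Charset utf8 = StandardCharsets.UTF_8;
        System.out.println(decode("\u00FC", utf8));      // fast path, input returned as-is
        System.out.println(decode("%20\u00FC", utf8));   // replace path, raw 0xFC byte is malformed UTF-8
        System.out.println(decode("%20%C3%BC", utf8));   // fully encoded input decodes cleanly
    }
}
```

In this sketch only the second call loses the ü, which matches the behavior reported above.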
@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged or decided on label Mar 4, 2024
@sbrannen sbrannen added the in: core Issues in core modules (aop, beans, core, context, expression) label Mar 4, 2024
@sbrannen sbrannen changed the title from "StringUtils.uriDecode decodes strings with non-ascii characters incorrectly" to "StringUtils.uriDecode decodes strings with non-ASCII characters incorrectly" Mar 4, 2024
@sbrannen sbrannen added type: bug A general bug and removed status: waiting-for-triage An issue we've not yet triaged or decided on labels Mar 4, 2024
@sbrannen sbrannen self-assigned this Mar 4, 2024
@poutsma poutsma assigned poutsma and unassigned sbrannen Mar 4, 2024
@poutsma poutsma added type: documentation A documentation task and removed type: bug A general bug labels Mar 4, 2024
@poutsma poutsma added this to the 6.2.x milestone Mar 4, 2024
dumbbelloper added a commit to dumbbelloper/spring-framework that referenced this issue Mar 5, 2024
StringUtils.uriDecode now correctly handles non-ASCII characters regardless of the presence of "%" encoding.

Previously, the method took two different paths depending on whether "%" was found, leading to incorrect handling of non-ASCII characters in the absence of "%" encoding.

This fix ensures that all characters, including non-ASCII ones, are properly decoded using the provided Charset, improving the method's reliability and consistency across all inputs.

This change addresses issues with decoding multibyte characters and ensures compatibility with a wider range of character encodings, enhancing the utility's overall functionality.

spring-projects#32360
@dumbbelloper

Hello, I've submitted a Pull Request that aims to address this issue: PR #32373

poutsma (Contributor) commented Mar 5, 2024

StringUtils::uriDecode expects the input to be encoded, which means that the ü character should have been provided as %C3%BC instead. I will add a documentation note that makes this behavior clear.
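
To make the point about pre-encoded input concrete, here is a minimal sketch that uses only the JDK's URLEncoder and URLDecoder rather than Spring, so it runs without extra dependencies. The class name EncodedInputExample and its helper methods are assumptions made for illustration. Note also that the JDK classes implement form encoding (a space becomes "+"), so they stand in for StringUtils.uriDecode only loosely.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; uses JDK form encoding as a stand-in for Spring.
final class EncodedInputExample {

    // Percent-encodes the UTF-8 bytes of the raw text.
    static String encode(String raw) {
        return URLEncoder.encode(raw, StandardCharsets.UTF_8);
    }

    // Decodes percent-escapes and interprets the resulting bytes as UTF-8.
    static String decode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String encoded = encode("\u00FC");
        System.out.println(encoded);                 // %C3%BC
        System.out.println(decode("%20" + encoded)); // a space followed by the original character
    }
}
```

The idea is that the caller encodes every non-ASCII character before handing the string to a decoder, so the decoder never sees a raw ü.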

Note that StringUtils::uriDecode is a low-level, internal utility method exposed via higher-level components, such as UriComponentsBuilder, that are better at dealing with [en|de]coded URIs.
