Skip to content

allow for the hash character in configuration values #3425

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 20, 2025

Conversation

sbiscigl
Copy link
Contributor

@sbiscigl sbiscigl commented May 19, 2025

Issue #, if available:

#3412

Description of changes:

There was a issue users encountered that prevented them from using the # character in configuration value, specially sso start urls. This now makes sure that a comment inline has to include white space before the comment character.

go saw something similar as well and made the same change to align with other SDKs

Check all that applies:

  • Did a review by yourself.
  • Added proper tests to cover this PR. (If tests are not applicable, explain.)
  • Checked if this PR is a breaking (APIs have been changed) change.
  • Checked if this PR will not introduce cross-platform inconsistent behavior.
  • Checked if this PR would require a ReadMe/Wiki update.

Check which platforms you have built SDK on to verify the correctness of this PR.

  • Linux
  • Windows
  • Android
  • MacOS
  • IOS
  • Other Platforms

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@sbiscigl sbiscigl force-pushed the fix-sso-start-url-hash-char branch from 431840a to 58e3ee7 Compare May 19, 2025 19:51
@@ -348,13 +348,13 @@ R"(
;another comment
[ default ]
source_profile = base)" "\t" R"(
credential_process = echo '{ "Version": 1, "AccessKeyId": "ASIARTESTID", "SecretAccessKey": "TESTSECRETKEY", "SessionToken": "TESTSESSIONTOKEN", "Expiration": "2022-05-02T18:36:00+00:00" }'#COMMENT
credential_process = echo '{ "Version": 1, "AccessKeyId": "ASIARTESTID", "SecretAccessKey": "TESTSECRETKEY", "SessionToken": "TESTSESSIONTOKEN", "Expiration": "2022-05-02T18:36:00+00:00" }' #COMMENT
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these test cases had "invalid comments", they require white space before the actual comment

@sbiscigl sbiscigl force-pushed the fix-sso-start-url-hash-char branch 2 times, most recently from bf51c5d to 0ebee71 Compare May 19, 2025 20:07
#include <aws/core/utils/memory/stl/AWSSet.h>
#include <aws/core/utils/memory/stl/AWSStreamFwd.h>
#include <aws/core/utils/StringUtils.h>
#include <aws/core/utils/logging/LogMacros.h>
#include <fstream>

namespace {
Aws::Array<const char*, 2> COMMENT_START_SEQ{" #", " ;"};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to use this as you can use just the characters and prepend a space for the check for inline comment

Copy link
Contributor Author

@sbiscigl sbiscigl May 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a benefit to having the runtime temp value instead of the few more const bytes? personally i dont really care either way. but I err on the side of less runtime evaluation when possible.

* @param line a raw line read in from configuration.
* @return the raw line with any commented out sections removed.
*/
static Aws::String StripCommentFromLine(const Aws::String& line) {
Copy link
Contributor

@sbera87 sbera87 May 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be simplified to usage of just one delimiter list. Also need to strip out trailing whitespaces.
think of example: <code line> #comment

static Aws::String StripCommentFromLine(const Aws::String& line) {
    // return empty if line starts with comment char
    if (line.empty() || std::any_of(COMMENT_CHARS.begin(), COMMENT_CHARS.end(),
        [&line](char c) { return c == line[0]; })) {
        return {};
    }

    // find first comment char that has a space before it, then strip out code line sans trailing spaces
    size_t commentPos = Aws::String::npos;
    for (char c : COMMENT_CHARS) {
        size_t pos = line.find(' ' + Aws::String(1, c));
        if (pos != Aws::String::npos) {
            commentPos = std::min(commentPos, pos);
        }
    }

    if (commentPos == Aws::String::npos) {
        return line;
    }

    // find last non-whitespace character before the comment
    while (commentPos > 0 && std::isspace(line[commentPos - 1])) {
        --commentPos;
    }

    return line.substr(0, commentPos);
}

Copy link
Contributor Author

@sbiscigl sbiscigl May 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also need to strip out trailing whitespaces.

why does this funciton need to strip out white space? that happens in the ParseSectionDeclaration function. this function removes a inline comment. or returns a empty line for a commented line, simple as that, no need to add more functionality to it that is covered other places. there are added tests cases that actually cover that

can be simplified to usage of just one delimiter list.

' ' + Aws::String(1, c)

how would be creating a temporary value eachtime be beneficial vs having a statically defined list? its just a few more bytes to have a static value instead of a runtime value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its maintaining two hard coded lists vs one. The algorithm is far from optimal using std find, so question of static vs runtime optimization is moot imo. For pattern matching there are better algos (KMP) but thats an overkill here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there is a piece of code
<code> ;#
current logic extracts <code> including trailing space which we dont need . It may be relvant when say someone has
an commented line
# comment
in that case we don't really need to strip the leading whitespace as a line. Or if we do, caller can decide what to do with it , but the point is this helper can indicate its all spaces and no other content to the caller to avoid further parsing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its maintaining two hard coded lists vs one. The algorithm is far from optimal using std find, so question of static vs runtime optimization is moot imo.

I personally dont care, i just avoid runtime eval when i can, if you feel as if its a blocker though I can change it.

if there is a piece of code

this is a private helper function, that is not invoked outside of this class, this is encapsulating one line of code and moving it to a private functoin, no need to overthink here. the old operation didnt remove white space, so i dont see why this should now either.

Copy link
Contributor

@sbera87 sbera87 May 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the old operation shouldn't be the standard, since we have a shot at bettering it. I think we should have additional tests with lines with indented comments only (which will yield empty lines or no line depending on the final logic you choose) as well as setting a variable to string with "# or ;" in it akin to .
I don't think the current logic handles the latter case.

  configFile << R"([profile maelle])" << "\n"
             << R"(sso_account_id = 333333333333)" << "\n"
             << R"(sso_role_name = PictoAccess)" << "\n"
             << R"(sso_start_url = https://lumiere.com/for-those-who-come-after # stendahl pre nerf)" << "\n"
             << R"(                 ;# dummy comment here)" << "\n"     //this should be ok with/without empty line  
             << R"(StringWithSpaces="This is # a comment with comment chars #")" << "\n"   // this should also be handled 
             << R"(#sso_start_url = https://lumiere.com/verso)" << "\n"
             << R"(sso_region = eu-west-3)" << "\n"
             << R"(sso_registration_scopes = sso:account:access)" << "\n"
             << R"(region = eu-west-3)" << "\n"
             << R"(output = json)" << "\n";

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the old operation shouldn't be the standard, since we have a shot at bettering it.

its simple string logic there is no need to over complicate this. this is spacing we're talking about, in a file that is just a simple configuration file. we dont need to spend time boilng the ocean over optimizing here.

if you want to spend sometime working through how we can re-work configuration file parsing we can do that

as well as setting a variable to string with "# or ;" in it akin to .

thats not mentioned in the configuration file spec why do you think we be handling it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ini files support quoted strings. If the code is meant to only parse specific subset of case and is not supposed to be foolproof line stripper, I can approve with the aforementioned comments. To your point, don't we then need to validate file doesn't have quotes? because if we do it will be incorrect then and we should at least flag that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ini files support quoted strings.

from the spec

Some implementations allow values to be quoted, typically using double quotes and/or apostrophes. This allows for explicit declaration of whitespace, and/or for quoting of special characters (equals, semicolon, etc.). The standard Windows function GetPrivateProfileString supports this, and will remove quotation marks that surround the values.

spcifically in this context "Some implementations allow". Currently we do not nor have we in the past.

is not supposed to be foolproof line stripper

this is not supposed to be a foolproof INI parser, nor this change to catpure all nuances of INI parsing. this change is just to fix a customer issue. in the future we have ezpressed interest into moving to the CRT configuraiton parser, aligning with how other SDKs parse INI files in a central dependency. that migrattion is not happening as part of this change.

@sbiscigl sbiscigl force-pushed the fix-sso-start-url-hash-char branch from 0ebee71 to ae8d5ea Compare May 20, 2025 15:13
@sbera87 sbera87 self-requested a review May 20, 2025 15:39
Copy link
Contributor

@sbera87 sbera87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved with comments: Pointed out logic/implementation caveats and more test cases and negative test for validation of spec

@sbiscigl sbiscigl force-pushed the fix-sso-start-url-hash-char branch from ae8d5ea to f42af09 Compare May 20, 2025 16:42
@sbiscigl sbiscigl marked this pull request as ready for review May 20, 2025 16:42
@sbiscigl sbiscigl force-pushed the fix-sso-start-url-hash-char branch from f42af09 to 7cdf647 Compare May 20, 2025 19:55
@sbiscigl sbiscigl merged commit 1caaabd into main May 20, 2025
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants