Feature request: deep sort payload hashing #2120

Closed
1 of 2 tasks
dreamorosi opened this issue Feb 21, 2024 · 11 comments · Fixed by #2570
Labels
confirmed: The scope is clear, ready for implementation
feature-request: This item refers to a feature request for an existing or new utility
good-first-issue: Something that is suitable for those who want to start contributing
idempotency: This item relates to the Idempotency Utility

Comments

@dreamorosi
Contributor

Use case

As a customer using the Idempotency utility I want my payloads to be considered idempotent regardless of the ordering of the keys/items within.

Currently, as discovered here, hashing objects or arrays whose elements are ordered differently produces two different hashes, so the requests are treated as unique.

For reference, this is a simplified version of our hashing implementation:

const { createHash } = require("crypto");

/**
 * @param {string} data - The data to be hashed
 * @returns {string} - The hashed data
 */
const hash = (data) => {
  const hash = createHash("sha256");
  hash.update(data);
  return hash.digest("base64");
}

Now, let's take two requests:

const requestA = {
  name: "John",
  age: 30,
  city: "New York",
  address: {
    street: "5th Avenue",
    number: 123,
  },
};

const requestB = {
  city: "New York",
  name: "John",
  age: 30,
  address: {
    number: 123,
    street: "5th Avenue",
  },
};

These two requests should be considered idempotent despite having their keys ordered differently. However, in our current implementation they are treated as two different requests:

console.log(hash(JSON.stringify(requestA)) === hash(JSON.stringify(requestB))); // false

We should implement a function that sorts the objects not only at the top level but also at nested levels.

Solution/User Experience

The change should be completely transparent for customers and the API/DX of the utility should not change.

In terms of implementation, the one found in this blog post could be a good starting point:

function sortObject(object) {
  var sortedObj = {},
    keys = Object.keys(object);

  keys.sort(function (key1, key2) {
    (key1 = key1.toLowerCase()), (key2 = key2.toLowerCase());
    if (key1 < key2) return -1;
    if (key1 > key2) return 1;
    return 0;
  });

  for (var index in keys) {
    var key = keys[index];
    // typeof null is "object", so exclude null before recursing
    if (typeof object[key] == "object" && object[key] !== null && !(object[key] instanceof Array)) {
      sortedObj[key] = sortObject(object[key]);
    } else {
      sortedObj[key] = object[key];
    }
  }

  return sortedObj;
}

console.log(
  hash(JSON.stringify(sortObject(requestA))) ===
    hash(JSON.stringify(sortObject(requestB)))
); // true

We could probably improve it when it comes to detecting objects & arrays using the utilities in the commons package as well as making it type safe.
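As a rough illustration of that improvement, here is a minimal sketch. The names `isPlainObject` and `sortKeysDeep` are hypothetical and are not the helpers from the commons package. It assumes the payload is JSON-like data (plain objects, arrays, primitives) and, like the blog-post version, leaves arrays untouched.

```javascript
// Hypothetical helper: true only for plain objects (excludes null, arrays, Dates).
const isPlainObject = (value) =>
  Object.prototype.toString.call(value) === "[object Object]";

// Hypothetical name: returns a copy with keys sorted at every object level.
// Arrays and primitives are returned as-is.
const sortKeysDeep = (value) => {
  if (!isPlainObject(value)) return value;
  const sorted = {};
  // Default sort() compares UTF-16 code units, so it is case-sensitive.
  for (const key of Object.keys(value).sort()) {
    sorted[key] = sortKeysDeep(value[key]);
  }
  return sorted;
};

console.log(JSON.stringify(sortKeysDeep({ b: 1, a: { d: null, c: [2, 1] } })));
// {"a":{"c":[2,1],"d":null},"b":1}
```

One design note: the blog-post comparator lowercases keys, so keys such as "A" and "a" compare as equal and keep their insertion order, which would make the result order-dependent again. A case-sensitive sort avoids that.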

Alternative solutions

Consider other sorting functions if they perform better.
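One alternative worth measuring is sorting keys during serialization with a `JSON.stringify` replacer instead of building a sorted copy first. This is only a sketch: `stableStringify` is a hypothetical name, and whether it is actually faster would need benchmarking.

```javascript
const isPlainObject = (value) =>
  Object.prototype.toString.call(value) === "[object Object]";

// Hypothetical name. JSON.stringify calls the replacer for every value,
// including objects nested inside arrays, so one pass covers all levels.
const stableStringify = (payload) =>
  JSON.stringify(payload, (_key, value) => {
    if (!isPlainObject(value)) return value;
    const sorted = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = value[key];
    }
    return sorted;
  });

console.log(stableStringify({ b: 1, a: [{ d: 1, c: 2 }] }));
// {"a":[{"c":2,"d":1}],"b":1}
```

The resulting string can be passed to the existing `hash` function directly, so there is no separate sorting step.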

Acknowledgment

Future readers

Please react with 👍 and your use case to help us understand customer demand.

dreamorosi added the idempotency, feature-request, and confirmed labels on Feb 21, 2024
dreamorosi moved this from Triage to Backlog in Powertools for AWS Lambda (TypeScript) on Feb 21, 2024
dreamorosi added the good-first-issue and help-wanted labels on Feb 21, 2024
@karthikeyanjp
Contributor

I would like to take this and submit a PR.

@dreamorosi
Contributor Author

Hi @karthikeyanjp, sounds good!

If you have any questions please feel free to ask.

@dreamorosi
Contributor Author

Hi @karthikeyanjp, I just wanted to check if you're still working on this or if I should put the issue back on the backlog.

@dreamorosi
Contributor Author

Putting the issue back on the backlog. If anyone is interested in picking this up, please leave a comment!

@dreamorosi
Contributor Author

Hi @arnabrahman, yes that would be ideal, but it's not a strict requirement since it's used only there for now.

Regarding the usage, yes these are the places that come to mind.

Are you interested in contributing?

@arnabrahman
Contributor

Yes, I am interested. @dreamorosi

@dreamorosi
Contributor Author

Great! I'll assign the issue to you then

dreamorosi moved this from Backlog to Working on it in Powertools for AWS Lambda (TypeScript) on May 12, 2024
dreamorosi removed the help-wanted label on May 12, 2024
@arnabrahman
Contributor

I have started working on it and have some follow-up questions. If I understood the proposed solution correctly, it only sorts objects and does nothing for arrays. Also, the examples in the issue description cover objects but not arrays.

So I am assuming that for arrays we will honor the ordering and leave them as-is.

Meaning that for these two scenarios the requests are not idempotent (the array orderings are different)?

const reqA = {
  a: 30,
  b: "New York",
  c: "John",
  d: [{
    a: 30,
    c: "John",
    b: "New York"
  }]
};

const reqB = {
  b: "New York",
  a: 30,
  c: "John",
  d: [{
    a: 30,
    b: "New York",
    c: "John"
  }]
};

// Second scenario: top-level arrays
const reqA = [{
  a: 30,
  c: "John",
  b: "New York",
}]

const reqB = [{
  a: 30,
  b: "New York",
  c: "John"
}];

@dreamorosi

@dreamorosi
Contributor Author

Hi @arnabrahman, thanks for the update.

I did not think about arrays, but you make a good point.

I think if possible we should include arrays as part of this issue together with objects, so that requests like the one in your example are idempotent.
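A minimal sketch of what including arrays could look like, assuming array element order is preserved and only the keys of objects nested inside arrays are normalized. Whether elements themselves should also be reordered is not settled in this thread. The names `deepSortKeys` and `hashPayload` are hypothetical.

```javascript
const { createHash } = require("node:crypto");

const hash = (data) => createHash("sha256").update(data).digest("base64");

// Hypothetical name: sorts object keys at every depth, including objects
// nested inside arrays. Array element order is left untouched.
const deepSortKeys = (value) => {
  if (Array.isArray(value)) return value.map(deepSortKeys);
  if (Object.prototype.toString.call(value) !== "[object Object]") return value;
  const sorted = {};
  for (const key of Object.keys(value).sort()) {
    sorted[key] = deepSortKeys(value[key]);
  }
  return sorted;
};

const hashPayload = (payload) => hash(JSON.stringify(deepSortKeys(payload)));

const first = { b: "New York", a: 30, d: [{ c: "John", a: 30 }] };
const second = { a: 30, d: [{ a: 30, c: "John" }], b: "New York" };
console.log(hashPayload(first) === hashPayload(second)); // true
console.log(hashPayload([1, 2]) === hashPayload([2, 1])); // false
```

With this approach both scenarios from the previous comment hash identically, while arrays whose elements appear in a different order still count as different payloads.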

Contributor

github-actions bot commented Jun 3, 2024

⚠️ COMMENT VISIBILITY WARNING ⚠️

This issue is now closed. Please be mindful that future comments are hard for our team to see.

If you need more assistance, please either tag a team member or open a new issue that references this one.

If you wish to keep having a conversation with other community members under this issue feel free to do so.

dreamorosi moved this from Coming soon to Shipped in Powertools for AWS Lambda (TypeScript) on Jun 14, 2024

3 participants