Skip to content

EC2MetadataCredentials and ECSCredentials retry #1114

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Sep 8, 2016
2 changes: 1 addition & 1 deletion lib/config.js
Original file line number Diff line number Diff line change
Expand Up @@ -100,7 +100,7 @@ require('./credentials/credential_provider_chain');
* Currently supported options are:
*
* * **base** [Integer] — The base number of milliseconds to use in the
* exponential backoff for operation retries. Defaults to 100 ms.
* exponential backoff for operation retries. Defaults to 30 ms.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default used to be 30 ms but was changed to 100 ms with the configurable retry delays.

* * **customBackoff ** [function] — A custom function that accepts a retry count
* and returns the amount of time to delay in milliseconds. The `base` option will be
* ignored if this option is supplied.
Expand Down
18 changes: 15 additions & 3 deletions lib/credentials/ec2_metadata_credentials.js
Original file line number Diff line number Diff line change
Expand Up @@ -9,23 +9,30 @@ require('../metadata_service');
* can connect, and credentials are available, these will be used with zero
* configuration.
*
* This credentials class will timeout after 1 second of inactivity by default.
* This credentials class will by default timeout after 1 second of inactivity
* and retry 3 times.
* If your requests to the EC2 metadata service are timing out, you can increase
* the value by configuring them directly:
* these values by configuring them directly:
*
* ```javascript
* AWS.config.credentials = new AWS.EC2MetadataCredentials({
* httpOptions: { timeout: 5000 } // 5 second timeout
* httpOptions: { timeout: 5000 }, // 5 second timeout
* maxRetries: 10, // retry 10 times
* retryDelayOptions: { base: 30 } // see AWS.Config for information
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's an example, but we should probably show increasing the base retry delay above 100 (current default). That puts it more in line with the comment that says you can increase these values on line 14.

* });
* ```
*
* @see AWS.Config.retryDelayOptions
*
* @!macro nobrowser
*/
AWS.EC2MetadataCredentials = AWS.util.inherit(AWS.Credentials, {
constructor: function EC2MetadataCredentials(options) {
AWS.Credentials.call(this);

options = options ? AWS.util.copy(options) : {};
options = AWS.util.merge(
{maxRetries: this.defaultMaxRetries}, options);
if (!options.httpOptions) options.httpOptions = {};
options.httpOptions = AWS.util.merge(
{timeout: this.defaultTimeout}, options.httpOptions);
Expand All @@ -39,6 +46,11 @@ AWS.EC2MetadataCredentials = AWS.util.inherit(AWS.Credentials, {
*/
defaultTimeout: 1000,

/**
* @api private
*/
defaultMaxRetries: 3,

/**
* Loads the credentials from the instance metadata service
*
Expand Down
68 changes: 45 additions & 23 deletions lib/credentials/ecs_credentials.js
Original file line number Diff line number Diff line change
Expand Up @@ -8,16 +8,21 @@ var AWS = require('../core');
* in the container. If valid credentials are returned in the response, these
* will be used with zero configuration.
*
* This credentials class will timeout after 1 second of inactivity by default.
* This credentials class will by default timeout after 1 second of inactivity
* and retry 3 times.
* If your requests to the relative URI are timing out, you can increase
* the value by configuring them directly:
*
* ```javascript
* AWS.config.credentials = new AWS.ECSCredentials({
* httpOptions: { timeout: 5000 } // 5 second timeout
* httpOptions: { timeout: 5000 }, // 5 second timeout
* maxRetries: 10, // retry 10 times
* retryDelayOptions: { base: 30 } // see AWS.Config for information
* });
* ```
*
* @see AWS.Config.retryDelayOptions
*
* @!macro nobrowser
*/
AWS.ECSCredentials = AWS.util.inherit(AWS.Credentials, {
Expand All @@ -40,6 +45,11 @@ AWS.ECSCredentials = AWS.util.inherit(AWS.Credentials, {
*/
host: '169.254.170.2',

/**
* @api private
*/
maxRetries: 3,

/**
* Sets the name of the ECS environment variable to check for relative URI
* If changed, please change the name in the documentation for defaultProvider
Expand Down Expand Up @@ -69,22 +79,17 @@ AWS.ECSCredentials = AWS.util.inherit(AWS.Credentials, {
*/
request: function request(path, callback) {
path = path || '/';

var data = '';
var http = AWS.HttpClient.getInstance();
var httpRequest = new AWS.HttpRequest('http://' + this.host + path);
httpRequest.method = 'GET';
httpRequest.headers.Accept = 'application/json';
var httpOptions = this.httpOptions;

process.nextTick(function() {
http.handleRequest(httpRequest, httpOptions, function(httpResponse) {
httpResponse.on('data', function(chunk) { data += chunk.toString(); });
httpResponse.on('end', function() { callback(null, data); });
}, callback);
});
AWS.util.handleRequestWithTimeoutRetries(httpRequest, this, callback);
},

/**
* @api private
*/
refreshQueue: [],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the refresh queue only implemented for ECS credentials and not EC2 credentials? EC2 credentials has the same issue.

Copy link
Contributor Author

@LiuJoyceC LiuJoyceC Aug 31, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EC2MetdataCredentials already had this feature, through the MetadataService. The refreshQueue on the ECSCredentials is analogous to the loadCredentialsCallbacks on the MetadataService. They're structured slightly differently but work the same way. (refreshQueue has to retain references to the instances of ECSCredentials because the data isn't given to the callback the way it is in MetadataService).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so EC2 already supported this, excellent.


/**
* Loads the credentials from the relative URI specified by container
*
Expand All @@ -98,18 +103,33 @@ AWS.ECSCredentials = AWS.util.inherit(AWS.Credentials, {
*/
refresh: function refresh(callback) {
var self = this;
var refreshQueue = self.refreshQueue;
if (!callback) callback = function(err) { if (err) throw err; };
refreshQueue.push({
provider: self,
errCallback: callback
});
if (refreshQueue.length > 1) { return; }

function callbacks(err, creds) {
var call, cb;
while ((call = refreshQueue.shift()) !== undefined) {
cb = call.errCallback;
if (!err) AWS.util.update(call.provider, creds);
cb(err);
}
}

if (process === undefined) {
callback(AWS.util.error(
callbacks(AWS.util.error(
new Error('No process info available'),
{ code: 'ECSCredentialsProviderFailure' }
));
return;
}
var relativeUri = this.getECSRelativeUri();
if (relativeUri === undefined) {
callback(AWS.util.error(
callbacks(AWS.util.error(
new Error('Variable ' + this.environmentVar + ' not set.'),
{ code: 'ECSCredentialsProviderFailure' }
));
Expand All @@ -119,13 +139,15 @@ AWS.ECSCredentials = AWS.util.inherit(AWS.Credentials, {
this.request(relativeUri, function(err, data) {
if (!err) {
try {
var creds = JSON.parse(data);
if (self.credsFormatIsValid(creds)) {
self.expired = false;
self.accessKeyId = creds.AccessKeyId;
self.secretAccessKey = creds.SecretAccessKey;
self.sessionToken = creds.Token;
self.expireTime = new Date(creds.Expiration);
data = JSON.parse(data);
if (self.credsFormatIsValid(data)) {
var creds = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% certain since I haven't tested this myself, but if these fields aren't assigned to self, will the next batch of requests cause the credentials to be refreshed again?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assigning these fields to self is the only way the credentials can be accessed, since the callbacks for refresh are error-only callbacks, and when the callback returns with no error, the credentials are expected to have been loaded on to the instance of the credential provider itself. This is how it worked before on the ECSCredentials and how it works with the other credential providers. When a service request is made, the SDK checks directly on the credential provider for these fields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized I misunderstood your question. It is indeed necessary to assign these fields to self, I've just simply moved them up into the callbacks function, and I'm passing this creds object containing all of the needed fields into callbacks. On line 118 above (if (!err) AWS.util.update(call.provider, creds);) is where these fields are getting assigned to the credential provider instance. The reason I can no longer use self is because if there are multiple callbacks in the refreshQueue coming from multiple instances of the ECS credential provider, only the first refresh() call will actually call this.request(), so self will only refer to that one instance of the credential provider. The call.provider in the refreshQueue is a reference to the instance of the credential provider where the corresponding callback in the refreshQueue is from. Before, assigning the fields directly to self was not a problem because every single refresh() call would invoke 'this.request() and the right instance of the credential provider would be referenced.

expired: false,
accessKeyId: data.AccessKeyId,
secretAccessKey: data.SecretAccessKey,
sessionToken: data.Token,
expireTime: new Date(data.Expiration)
};
} else {
throw AWS.util.error(
new Error('Response data is not in valid format'),
Expand All @@ -136,7 +158,7 @@ AWS.ECSCredentials = AWS.util.inherit(AWS.Credentials, {
err = dataError;
}
}
callback(err);
callbacks(err, creds);
});
}
});
16 changes: 5 additions & 11 deletions lib/metadata_service.js
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,10 @@ AWS.MetadataService = inherit({
*
* * **timeout** (Number) — a timeout value in milliseconds to wait
* before aborting the connection. Set to 0 for no timeout.
* @option options maxRetries [Integer] the maximum number of retries to
* perform for timeout errors
* @option options retryDelayOptions [map] A set of options to configure the
* retry delay on retryable errors. See AWS.Config for details.
*/
constructor: function MetadataService(options) {
AWS.util.update(this, options);
Expand All @@ -61,19 +65,9 @@ AWS.MetadataService = inherit({
*/
request: function request(path, callback) {
path = path || '/';

var data = '';
var http = AWS.HttpClient.getInstance();
var httpRequest = new AWS.HttpRequest('http://' + this.host + path);
httpRequest.method = 'GET';
var httpOptions = this.httpOptions;

process.nextTick(function() {
http.handleRequest(httpRequest, httpOptions, function(httpResponse) {
httpResponse.on('data', function(chunk) { data += chunk.toString(); });
httpResponse.on('end', function() { callback(null, data); });
}, callback);
});
AWS.util.handleRequestWithTimeoutRetries(httpRequest, this, callback);
},

/**
Expand Down
9 changes: 1 addition & 8 deletions lib/service.js
Original file line number Diff line number Diff line change
Expand Up @@ -311,14 +311,7 @@ AWS.Service = inherit({
* @api private
*/
retryDelays: function retryDelays(retryCount) {
var retryDelayOptions = this.config.retryDelayOptions || {};
var customBackoff = retryDelayOptions.customBackoff || null;
if (typeof customBackoff === 'function') {
return customBackoff(retryCount);
}
var base = retryDelayOptions.base || 30;
var delay = Math.random() * (Math.pow(2, retryCount) * base);
return delay;
return AWS.util.calculateRetryDelay(retryCount, this.config.retryDelayOptions);
},

/**
Expand Down
45 changes: 45 additions & 0 deletions lib/util.js
Original file line number Diff line number Diff line change
Expand Up @@ -793,6 +793,51 @@ var util = {
if (typeof service !== 'string') service = service.serviceIdentifier;
if (typeof service !== 'string' || !metadata.hasOwnProperty(service)) return false;
return !!metadata[service].dualstackAvailable;
},

/**
* @api private
*/
calculateRetryDelay: function calculateRetryDelay(retryCount, retryDelayOptions) {
if (!retryDelayOptions) retryDelayOptions = {};
var customBackoff = retryDelayOptions.customBackoff || null;
if (typeof customBackoff === 'function') {
return customBackoff(retryCount);
}
var base = retryDelayOptions.base || 30;
var delay = Math.random() * (Math.pow(2, retryCount) * base);
return delay;
},

/**
* @api private
*/
handleRequestWithTimeoutRetries: function handleRequestWithTimeoutRetries(httpRequest, options, cb) {
if (!options) options = {};
var http = AWS.HttpClient.getInstance();
var httpOptions = options.httpOptions || {};
var retryCount = 0;

var errCallback = function(err) {
var maxRetries = options.maxRetries || 0;
if (err && err.code === 'TimeoutError' && retryCount < maxRetries) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered retrying for all the codes we retry when a service requests fails? TimeoutError is thrown by the SDK when a connection times out. I'm not sure if there are other errors the metadata service may also throw, but it shouldn't hurt to check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the PR to retry for 5xx status codes, as well as for 4xx status codes only if there is a retry-after header. I was able to reliably replicate the Metadata Service returning 429 (Too Many Requests) errors, which also returned a retry-after header which give a minimum number of seconds to wait before retrying.

retryCount++;
var delay = util.calculateRetryDelay(retryCount, options.retryDelayOptions);
setTimeout(sendRequest, delay);
} else {
cb(err);
}
};

var sendRequest = function() {
var data = '';
http.handleRequest(httpRequest, httpOptions, function(httpResponse) {
httpResponse.on('data', function(chunk) { data += chunk.toString(); });
httpResponse.on('end', function() { cb(null, data); });
}, errCallback);
};

process.nextTick(sendRequest);
}

};
Expand Down
58 changes: 43 additions & 15 deletions test/credentials.spec.coffee
Original file line number Diff line number Diff line change
Expand Up @@ -427,6 +427,12 @@ if AWS.util.isNode()

describe 'AWS.ECSCredentials', ->
creds = null
responseData = {
AccessKeyId: 'KEY',
SecretAccessKey: 'SECRET',
Token: 'TOKEN',
Expiration: (new Date(0)).toISOString()
}

beforeEach ->
creds = new AWS.ECSCredentials(host: 'host')
Expand All @@ -437,13 +443,8 @@ if AWS.util.isNode()

mockEndpoint = (expireTime) ->
helpers.spyOn(creds, 'request').andCallFake (path, cb) ->
cb null,
JSON.stringify({
AccessKeyId: 'KEY'
SecretAccessKey: 'SECRET'
Token: 'TOKEN'
Expiration: expireTime.toISOString()
})
expiration = expireTime.toISOString()
cb null, JSON.stringify(AWS.util.merge(responseData, {Expiration: expiration}))

describe 'constructor', ->
it 'allows passing of options', ->
Expand Down Expand Up @@ -479,16 +480,10 @@ if AWS.util.isNode()

describe 'credsFormatIsValid', ->
it 'returns false when data is missing required property', ->
responseData = {AccessKeyId: 'KEY', SecretAccessKey: 'SECRET', Token: 'TOKEN'}
expect(creds.credsFormatIsValid(responseData)).to.be.false
incompleteData = {AccessKeyId: 'KEY', SecretAccessKey: 'SECRET', Token: 'TOKEN'}
expect(creds.credsFormatIsValid(incompleteData)).to.be.false

it 'returns true when data has all required properties', ->
responseData = {
AccessKeyId: 'KEY',
SecretAccessKey: 'SECRET',
Token: 'TOKEN',
Expiration: (new Date(0)).toISOString()
}
expect(creds.credsFormatIsValid(responseData)).to.be.true

describe 'needsRefresh', ->
Expand Down Expand Up @@ -519,6 +514,39 @@ if AWS.util.isNode()
expect(spy.calls.length).to.equal(0)
expect(callbackErr).to.not.be.null

it 'retries up to specified maxRetries for timeout errors', (done) ->
process.env['AWS_CONTAINER_CREDENTIALS_RELATIVE_URI'] = '/path'
options = {maxRetries: 3}
creds = new AWS.ECSCredentials(options)
httpClient = AWS.HttpClient.getInstance()
spy = helpers.spyOn(httpClient, 'handleRequest').andCallFake (httpReq, httpOp, cb, errCb) ->
errCb({code: 'TimeoutError'})
creds.refresh (err) ->
expect(err).to.not.be.null
expect(err.code).to.equal('TimeoutError')
expect(spy.calls.length).to.equal(4)
done()

it 'makes only one request when multiple calls are made before first one finishes', (done) ->
concurrency = countdown = 10
process.env['AWS_CONTAINER_CREDENTIALS_RELATIVE_URI'] = '/path'
spy = helpers.spyOn(AWS.ECSCredentials.prototype, 'request').andCallFake (path, cb) ->
respond = ->
cb null, JSON.stringify(responseData)
process.nextTick(respond)
providers = []
callRefresh = (ind) ->
providers[ind] = new AWS.ECSCredentials(host: 'host')
providers[ind].refresh (err) ->
expect(err).to.equal(null)
expect(providers[ind].accessKeyId).to.equal('KEY')
countdown--
if countdown == 0
expect(spy.calls.length).to.equal(1)
done()
for x in [1..concurrency]
callRefresh(x - 1)

describe 'AWS.TemporaryCredentials', ->
creds = null

Expand Down
Loading