Skip to content

refactor MetaData.Builder to update aliasAndIndexLookup #29575

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 11 commits into from

Conversation

talevy
Copy link
Contributor

@talevy talevy commented Apr 18, 2018

It would be nice to be able to look up other indices' aliases
while validating new aliases being introduced. For example,
If we would like to prohibit certain aliases and allow others,
depending on their respective definitions.

The problem with moving the construction of this lookup data-structure
to be iterative is that we do not have a clear view of all other possible
duplicate aliases. We would raise an error with the first duplicate we see.

@talevy talevy added >non-issue review :Core/Infra/Core Core issues without another label labels Apr 18, 2018
@talevy talevy requested a review from martijnvg April 18, 2018 05:20
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@talevy talevy requested a review from bleskes April 18, 2018 05:22
It would be nice to be able to look up other indices' aliases
while validating new aliases being introduced. For example,
If we would like to prohibit certain aliases and allow others,
depending on their respective definitions.

The problem with moving the construction of this lookup data-structure
to be iterative is that we do not have a clear view of *all* other possible
duplicate aliases. We would raise an error with the first duplicate we see.
@talevy talevy force-pushed the refactor-aliasandindexlookup branch from ceda3c1 to 6d3415a Compare April 18, 2018 05:44
@talevy
Copy link
Contributor Author

talevy commented Apr 19, 2018

@martijnvg after realizing what it takes to make this change work, I'm leaning towards your suggestion that was made offline about having more validation in Metadata.Builder#build instead of the other way around. The need for this data-structure being updated as one builds MetaData would not be necessary if more validation was just added to the build method. I like the simplicity of this, but I am not sure how costly it is to add some extra loops here for every build. what do you think? If we feel like more validation can be added to the build method, then I will close this PR

@talevy talevy added WIP and removed review labels Apr 19, 2018
@bleskes
Copy link
Contributor

bleskes commented Apr 19, 2018

@talevy @martijnvg I understand your sentiment but there are also things like #27689 we should consider. We allow a list of atomic alias updates that are applied one by one. I think we need to truly execute them one by one while tracking the current state and throwing errors as soon as we get to a bad place. For example, what happens if people send two commands to update the write alias for the same alias? what will the second one use as a basis?

I like the simplicity of this, but I am not sure how costly it is to add some extra loops here for every build. what do you think?

It should be sane (i.e. not a n^2 operation) but also doesn't need to be super high performance.

If we feel like more validation can be added to the build method

As discussed validation in the build method is required regardless. We don't always construct cluster states via the formal API (i.e., during recovery we don't).

@talevy
Copy link
Contributor Author

talevy commented Apr 19, 2018

@bleskes I see, thanks. I will push forward with this. I believe I have fixed all the test failures after my last commit.

return alias;
} else {
String indexName = ((AliasOrIndex.Index) alias).getIndex().getIndex().getName();
throw new IllegalStateException("index and alias names need to be unique, but the following duplicate was found ["
Copy link
Contributor Author

@talevy talevy Apr 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand I ported this over from MetaData.Builder#build where IllegalStateException makes more sense. I believe this should be InvalidAliasNameException, will update if others agree

@talevy talevy added review and removed WIP labels Apr 19, 2018
@talevy
Copy link
Contributor Author

talevy commented Apr 27, 2018

So, I've moved along with the is_write_index feature that brought this refactor to my attention in the first place. I've implemented that with the existing "rebuild upon request" strategy [1]. If we feel like rebuilding each time is not a big deal, then I can go ahead and close this PR.
Let me know!

[1] https://github.com/talevy/elasticsearch/blob/ece9f506818e1bc6758ac3174da7a39946cebdd6/server/src/main/java/org/elasticsearch/cluster/metadata/AliasValidator.java#L185

@talevy talevy added :Distributed Indexing/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. and removed :Core/Infra/Core Core issues without another label labels Apr 27, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@talevy talevy added :Data Management/Indices APIs APIs to create and manage indices and templates and removed :Distributed Indexing/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. labels Apr 27, 2018
@talevy
Copy link
Contributor Author

talevy commented May 2, 2018

After discussion with the team, I am closing this. It is not necessary for me anymore

@talevy talevy closed this May 2, 2018
@talevy talevy deleted the refactor-aliasandindexlookup branch May 2, 2018 15:56
talevy added a commit to talevy/elasticsearch that referenced this pull request May 17, 2018
This commit introduces the concept of an `is_write_index` flag for
aliases. Before, it was invalid to write to indices which point to multiple
indices. Now, users can configure one index to have an alias with `is_write_index: true`
and others that are false. When one attempts to index to this alias, it will resolve to
the one index that is the write index.

in response to elastic#30061.

This PR rebuilds aliasAndIndexLookup every time it wants to use it for alias validation. Depends on outcome of elastic#29575 for using a pre-built data-structure.

This PR introduces a concept in aliases called `is_write_index`. When `true` on a specific
alias definition associated with an `index`, index requests using the associated alias name will
resolve to this `index` for indexing.

Write now, if one attempts to index a document using an alias that points to multiple indices, there is no way to resolve which one, so an exception is thrown. With this flag, you can have an alias that points to a specific index in a set of indices as the one to write to.

It is possible to set an index as an alias' write-index during index creation:

```
PUT foo1
{
  "aliases": {
    "bar": { "is_write_index": true }
  }
}
```

Now, index requests like `PUT bar/_doc/_id` will route to index `foo1`. Another index, `foo2`, is
free to also have an alias named "bar" pointing to it. This alias needs to be configured with `is_write_index` set to `false`. `is_write_index` is defaulted to `false` if not specified, or it can be explicitly set to `false`.

```
PUT foo2
{
  "aliases": {
    "bar": {}
  }
}
```

Now, searches against `bar` will point to both indices, but index requests will be routed to `foo1`.

These settings can be updated in the `_aliases` API like so:

```
POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "foo1",
        "alias": "bar",
        "is_write_index": false
      }
    }
  ]
}
```

How does this relate to Rollover? Rollover depended on rollover-aliases that only point to
one index since multiple would break indexing. Now that the same alias can point to multiple indices, the only requirement we have is that there is only one write index.

Here is an example scenario that didn't work before, but works with these changes:

```
PUT _template/my_template
{
  "index_patterns": ["foo-*"],
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  },
  "aliases": {
    "logs": {}
  },
  "mappings": {
    "_doc": {
      "_source": {
        "enabled": false
      }
    }
  }
}

PUT foo-000001
{
  "aliases": {
    "logs": { "is_write_index": true }
  }
}

PUT logs/_doc/_id
{
  "hello": "world"
}

POST /logs/_rollover/
{
  "conditions": {
    "max_docs":  1
  }
}

GET foo-000001 # now this is `is_write_index: false`
GET foo-000002 # now this is `is_write_index: true`
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Indices APIs APIs to create and manage indices and templates >refactoring
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants