Reindex from remote cluster #17447

Closed
clintongormley opened this issue Mar 31, 2016 · 10 comments
Labels: :Distributed Indexing/CRUD, >feature, release highlight

Comments

@clintongormley
Contributor

The reindex API should be able to pull data from an index in a remote cluster. This would allow importing an index in a 1.x cluster into a cluster running Elasticsearch 5 or later.

The syntax could look like this:

POST /_reindex
{
  "source": {
    "remote": {
      "host": "192.168.1.2:9200",
      "username": "foo",
      "password": "bar"
    },
    "index": "twitter",
    "query": {
      "match": {
        "user": "kimchy"
      }
    }
  },
  "dest": {
    "index": "new_twitter"
  }
}

The username, password, and any other options within remote would configure the HTTP client that contacts the remote server. The query and the other search options would be passed to the remote server without being parsed locally, which sidesteps any API changes between versions.
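
As a rough illustration of the proposal above, here is a minimal Python sketch that assembles the request body. The function name `build_remote_reindex_body` and its parameters are hypothetical, and the field layout simply mirrors the syntax proposed in this issue rather than any confirmed final API. The query is treated as an opaque dictionary and copied through untouched, which is the "passed without parsing" idea.

```python
import json
from typing import Any, Dict, Optional


def build_remote_reindex_body(
    host: str,
    source_index: str,
    dest_index: str,
    query: Optional[Dict[str, Any]] = None,
    username: Optional[str] = None,
    password: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a _reindex body following the syntax proposed in this issue.

    Hypothetical helper: the layout is an assumption taken from the proposal.
    """
    # Connection options live under "remote" and only configure the HTTP client.
    remote: Dict[str, Any] = {"host": host}
    if username is not None:
        remote["username"] = username
    if password is not None:
        remote["password"] = password

    source: Dict[str, Any] = {"remote": remote, "index": source_index}
    if query is not None:
        # The query is not inspected or validated here; it is forwarded as-is
        # so that the remote cluster interprets it with its own query DSL.
        source["query"] = query

    return {"source": source, "dest": {"index": dest_index}}


body = build_remote_reindex_body(
    host="192.168.1.2:9200",
    source_index="twitter",
    dest_index="new_twitter",
    query={"match": {"user": "kimchy"}},
    username="foo",
    password="bar",
)
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent as the body of `POST /_reindex` on the destination cluster.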

@nik9000
Member

nik9000 commented Mar 31, 2016

I like the syntax. I think we'll need the HTTP-based Java API for it, though.

@clintongormley
Contributor Author

Not necessarily. A simple HTTP client should be sufficient for this.

@nik9000 nik9000 self-assigned this Apr 26, 2016
@nik9000 nik9000 removed the help wanted adoptme label Apr 26, 2016
@ayushsangani
Contributor

ayushsangani commented May 4, 2016

+1 @nik9000 eagerly waiting for this to be released in 1.x

@nik9000
Member

nik9000 commented May 4, 2016

> +1 @nik9000 eagerly waiting for this to be released in 1.x

It'll be a 5.0 or 5.1 feature depending on when I get time to really start it. You should be able to connect to 1.x with it though.

@rgb4268

rgb4268 commented Jun 5, 2017

Is the _source._ingest feature working to get metadata fields from a remote source? I can't get it to work. Have tried posting in forum, no response.

@ghost

ghost commented Jun 7, 2017

Could you tell me what happens to data that is created or updated while the reindex API is running?

@nik9000
Member

nik9000 commented Jun 10, 2017

> Could you tell me what happens to data that is created or updated while the reindex API is running?

Reindex sees a consistent view of the data. It won't notice the changes.
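
To make the "consistent view" behaviour concrete, here is a toy Python sketch. It is not Elasticsearch code: the in-memory dictionaries and the helper `copy_from_snapshot` are hypothetical stand-ins for an index and a reindex run. It only shows that a copy taken from a point-in-time snapshot does not pick up writes that arrive while the copy is in progress.

```python
from typing import Callable, Dict, Optional

Doc = Dict[str, object]


def copy_from_snapshot(
    source: Dict[str, Doc],
    dest: Dict[str, Doc],
    during_copy: Optional[Callable[[], None]] = None,
) -> None:
    """Copy documents from a frozen snapshot of source into dest."""
    # Freeze the view of the source before copying anything.
    snapshot = {doc_id: dict(doc) for doc_id, doc in source.items()}
    for doc_id, doc in snapshot.items():
        if during_copy is not None:
            # Simulate writes hitting the live source while the copy runs.
            during_copy()
        dest[doc_id] = doc


source: Dict[str, Doc] = {"1": {"user": "kimchy", "likes": 1}}
dest: Dict[str, Doc] = {}


def concurrent_write() -> None:
    source["1"]["likes"] = 2                       # update an existing doc
    source["2"] = {"user": "kimchy", "likes": 0}   # create a new doc


copy_from_snapshot(source, dest, during_copy=concurrent_write)
print(dest)
```

After the run, `dest` holds only document `1` with its original `likes` value, while `source` contains both the update and the new document.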

@ghost

ghost commented Jun 10, 2017

So how can I add those changes?

@nik9000
Member

nik9000 commented Jun 10, 2017

> So how can I add those changes?

We don't have native support for that. If you ask around on discuss.elastic.co you might find people that have some solutions.
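
One workaround that is sometimes possible, and that this thread does not confirm, is a second "catch-up" reindex limited to documents changed since the first run started. The sketch below assumes every document carries a last-modified timestamp; the field name `updated_at` and the helper `build_catch_up_body` are hypothetical. It would not handle documents deleted in the meantime.

```python
from typing import Any, Dict


def build_catch_up_body(
    source_index: str,
    dest_index: str,
    timestamp_field: str,
    since: str,
) -> Dict[str, Any]:
    """Build a _reindex body that only selects recently changed documents.

    Hypothetical helper: assumes the documents have a timestamp field.
    """
    return {
        "source": {
            "index": source_index,
            # Range query: only documents modified at or after `since`.
            "query": {"range": {timestamp_field: {"gte": since}}},
        },
        "dest": {"index": dest_index},
    }


catch_up = build_catch_up_body(
    source_index="twitter",
    dest_index="new_twitter",
    timestamp_field="updated_at",      # assumed field name
    since="2017-06-10T00:00:00Z",      # when the first reindex started
)
```

Because reindexed documents keep their IDs, a catch-up pass of this kind would overwrite the stale copies in the destination index.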

@lcawl added the :Distributed Indexing/CRUD label and removed the :Reindex API label on Feb 13, 2018
@kirvk548

Hey Guys,

Have you tried changing the remote host from this format "host": "192.168.1.2:9200", to this format "host": "https://something.com" ?

Thanks for the help!
