Skip to content

Remove full flush / FlushType.NEW_WRITER #9559

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 4, 2015

Conversation

s1monw
Copy link
Contributor

@s1monw s1monw commented Feb 4, 2015

The full option and FlushType.NEW_WRITER only exists to allow
realtime changes to two settings (index.codec and index.concurrency).
Those settings are very expert and don't really need to be updateable
in realtime.

@kimchy
Copy link
Member

kimchy commented Feb 4, 2015

@s1monw I think we will need this feature to change the compression levels in upcoming Lucene 5.x? That would be a "non expert" use case, for example, in time base data, for older indexes, the compression level can be changed, and then an optimize can run to apply it.

@s1monw
Copy link
Contributor Author

s1monw commented Feb 4, 2015

@kimchy you can still close the index change settings then reopen and do your merge.

@rmuir
Copy link
Contributor

rmuir commented Feb 4, 2015

When i added best compression, i implemented it with index.codec out of laziness, because i saw someone else had done the work to restart the IW and everything like that. Its only a live setting because someone else did all that :)

Really, i just wanted to get it out there in a usable way, sooner than later, so people could experiment with it. But we could expose it differently too: as its own setting (not as a named codec), as a /archive API or something like that, or whatever.

@s1monw
Copy link
Contributor Author

s1monw commented Feb 4, 2015

@kimchy I really don't think anybody should change the codec in realtime. I think for the compression changes we need to have a higherlevel API that does this together with a clear workflow. For that I guess you are referring to an archive action on the user end. I think this should be an API that set the index to be readonly and then we can change this moving to a READ_ONLY engine as well that can optimize where each optimize operation creates it's own IndexWriter with the relevant settings?

@kimchy
Copy link
Member

kimchy commented Feb 4, 2015

btw @s1monw, I totally agree that for the mentioned advance features, I am +1 on remove this feature, its just makes the engine more complicated.

If we can find a better way to change compressions without the NEW_WRITER feature, lets remove it. I think the ability to change compression levels on the fly on an index is very important. It would be great to not remove it before we figure out in master how to change compression levels on the fly.

@kimchy
Copy link
Member

kimchy commented Feb 4, 2015

blah, I just saw the comments about the read only part, if we can keep changing the compression level in non read only mode it would be great. There are many cases that I have seen where there is still a "trickle" of document updates happening to older indices (like 2-3 days old).

@rmuir if we can somehow make it a setting we can change on the writer in Lucene live it would be great, I think it will be very beneficial not only for Elasticsearch.

@rmuir
Copy link
Contributor

rmuir commented Feb 4, 2015

@rmuir if we can somehow make it a setting we can change on the writer in Lucene live it would be great, I think it will be very beneficial not only for Elasticsearch.

I'm sorry, currently i don't think it belongs there. Being able to change the codec on the fly would add substantial risk to lucene. I don't want that complexity here, and I'm also not convinced the setting needs to be a live one in elasticsearch either.

@kimchy
Copy link
Member

kimchy commented Feb 4, 2015

I'm sorry, currently i don't think it belongs there. Being able to change the codec on the fly would add substantial risk to lucene. I don't want that complexity here, and I'm also not convinced the setting needs to be a live one in elasticsearch either.

sorry, I got confused by your initial comment. I agree that changing codec on the fly is not something that one should be able to do on IW. The way I read your initial comment is that maybe the compression level should not be a different codec, but a flag on the writer config that can be changed and applied to new segments from once its changed.

@rmuir
Copy link
Contributor

rmuir commented Feb 4, 2015

I think its good as a codec option, it allows for our whole unit testing setup to actually work. Thats an important part of containing this change, as we have to support now 2 formats with backwards compatibility, we already have enough trouble with 1 format.

Because of that I tried to do a lot of work to make this option easy to contain so we can support it. And yeah, sue me, all the APIs around that are completely driven by that: not by the user.

But had i not done this, the result would just be too much complexity and the user would not have the option at all.

@kimchy
Copy link
Member

kimchy commented Feb 4, 2015

@rmuir got it, I got confused and thought the flag for the compression was something all segment write constructs sampled and then we could do it. Looking at it, I see that its much more complicated, so fully get why its not changeable, and agreed that it should not be as it will complicate the codebase a lot.

@s1monw A suggestion you made is great. A user can still have "cold" and "warm" nodes, and even if the setting is not changeable at runtime, the setting can be set on the node level and once a shard "moves" to the node, it will inherit this setting and apply it to the newly formed shard on it.

I am good with this change in this case, we have a good story around enabling high compression in a "hot" / "cold" node scenario, which I think is the most common one where changing the compression level on an index is needed, and I love the simplicity of the engine this change will cause.

@mikemccand
Copy link
Contributor

LGTM, this is a nice cleanup.

I didn't realize we exposed this "full" option externally on the flush API.

I left one minor comment (sorry, on the commit, so it's not in the conversation here)...

The `full` option and `FlushType.NEW_WRITER` only exists to allow
realtime changes to two settings (`index.codec` and `index.concurrency`).
Those settings are very expert and don't really need to be updateable
in realtime.
@s1monw s1monw force-pushed the remove_dynamic_settings_on_engine branch from 5354c3d to 0c5599e Compare February 4, 2015 16:38
@s1monw s1monw merged commit 0c5599e into elastic:master Feb 4, 2015
@s1monw
Copy link
Contributor Author

s1monw commented Feb 4, 2015

@kimchy I am considering to port this to 1.5 those settings are very expert and I think we can get rid of it without deprecating?

@kimchy
Copy link
Member

kimchy commented Feb 4, 2015

@s1monw yea, I am good. If those are set, then it won't cause the index to fail to load, they will just be ignored. And we can recommend to set them on the node level of someone really wants to.

@s1monw s1monw added the v1.5.0 label Feb 4, 2015
@clintongormley clintongormley changed the title [ENGINE] Remove full flush / FlushType.NEW_WRITER Remove full flush / FlushType.NEW_WRITER Jun 6, 2015
@clintongormley clintongormley added :Distributed Indexing/Distributed A catch all label for anything in the Distributed Indexing Area. Please avoid if you can. :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. and removed :Engine :Distributed Indexing/Distributed A catch all label for anything in the Distributed Indexing Area. Please avoid if you can. labels Feb 13, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>breaking :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. >enhancement v1.5.0 v2.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants