Elasticsearch Delete Issues

Problem :
Some of our document types have to be deleted daily once the date they carry has expired. An alert from another tool tells us whenever such an out-of-date document appears in the search results; the end service then filters these documents out.
For the past month, the alert has been triggered every night between 12am and 7:15am, after which it stops (no more data from yesterday).

Remember that these documents are deleted daily by sending an XDELETE query to the elasticsearch cluster.
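
As a rough illustration of what such a daily delete might select, the sketch below builds a range query matching documents dated before a given day. The field name `expiry_date` and the helper `build_expired_query` are hypothetical; the actual query and endpoint used by our job are not shown in this post.

```python
from datetime import date


def build_expired_query(today: date, date_field: str = "expiry_date") -> dict:
    """Return a query body matching documents dated strictly before `today`.

    `date_field` is a placeholder name, not the real field in our mapping.
    """
    return {"query": {"range": {date_field: {"lt": today.isoformat()}}}}


# Example: everything dated before 2 June 2015 counts as expired.
body = build_expired_query(date(2015, 6, 2))
```

The body would then be sent along with the delete request; how that is done depends on the elasticsearch version and client in use.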

Experiments :
We run an hourly job to delete old data. Logically, the old data should disappear as soon as the first delete job runs after 12am, but the alert keeps firing, with the log showing that a result from yesterday was still being returned.

I increased the job frequency to every 15 minutes, with a "refresh" on the index after each job, and yet the splunk alert kept showing up until 7:15am every day. The pattern is that between 6am and 7am the number of old documents begins to decrease, and after 7:15am there are none. Even "optimize_expunge_deletes" does not shorten the time these documents keep showing up in the elasticsearch results.

Possible causes :
• The second last comment on this forum reveals that elasticsearch only marks documents as deleted and that, by default, merges only happen on an index when 10% or more of its documents have been added or deleted. As we index daily, 7:15am is probably the time when a merge happens and the documents actually get removed.

• Although we see the docID in the logs, searching the elasticsearch cluster for the document by that ID returns no results. This suggests that the head shows different results than the client API.
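
To make the 10% figure from the first point concrete, here is a minimal sketch that computes the share of deleted documents in an index. It assumes the two counts are read from the `docs.count` and `docs.deleted` values in the index stats, and it takes the 10% threshold from the forum comment at face value; the function names are made up for this example.

```python
def deleted_ratio(live_docs: int, deleted_docs: int) -> float:
    """Fraction of documents that are marked deleted but not yet merged away."""
    total = live_docs + deleted_docs
    return deleted_docs / total if total else 0.0


def past_threshold(live_docs: int, deleted_docs: int, threshold: float = 0.10) -> bool:
    """True when the deleted share reaches the (assumed) 10% merge threshold."""
    return deleted_ratio(live_docs, deleted_docs) >= threshold


# Example: 100 deleted out of 1000 total is exactly at the threshold,
# while 10 deleted out of 1000 is well below it.
at_limit = past_threshold(900, 100)
below_limit = past_threshold(990, 10)
```

If this reading is right, a nightly delete that removes only a small share of the index would stay below the threshold until the daily indexing adds enough changes to trigger a merge.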

Possible solution :
Note that the index does not use a TTL (time-to-live) value for the documents, but per the elasticsearch forums, people still see this issue when using TTL.
So supposedly, this is how lucene and elasticsearch stay efficient: they limit merges, since a merge is an expensive operation from memory's perspective.
The likely solution given by elasticsearch forum experts is to create a new index and swap indices whenever possible (for us, this means swapping indices every day), as a new index has faster searches because it has no deleted documents to skip in the lookup (overkill!!).
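
One common way to swap indices is through an index alias, so that clients keep searching the same name while the index behind it changes. The forum advice does not spell out the mechanism, so treat this as an assumption. The sketch below only builds the request body for the `_aliases` endpoint; the alias and index names are hypothetical.

```python
def build_alias_swap(alias: str, old_index: str, new_index: str) -> dict:
    """Build an `_aliases` body that moves `alias` from old_index to new_index.

    Both actions go in one request so the alias never points at nothing.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Example: point the "docs" alias at today's freshly built index.
swap = build_alias_swap("docs", "docs_2015_06_01", "docs_2015_06_02")
```

Once the alias has moved, the old index can be dropped as a whole, which avoids deleting documents one by one.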

Considering that most elasticsearch users talk about documents numbering in the millions, it is sometimes not feasible to reindex everything into a new index daily. So, an alternative quick approach can be to update the document with a "deleted" field set to true and filter out such documents.
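
A minimal sketch of this soft-delete idea follows, assuming the flag is a boolean field named `deleted` and that searches go through a `bool` query. The helper names are made up for this example, and only the request bodies are built here.

```python
def mark_deleted_body() -> dict:
    """Partial-update body that flags a document as deleted."""
    return {"doc": {"deleted": True}}


def exclude_deleted(query: dict) -> dict:
    """Wrap a query so documents flagged as deleted are left out.

    Documents without the `deleted` field still match, since `must_not`
    only excludes documents where the field is true.
    """
    return {
        "query": {
            "bool": {
                "must": [query],
                "must_not": [{"term": {"deleted": True}}],
            }
        }
    }


# Example: a hypothetical title search that skips soft-deleted documents.
search_body = exclude_deleted({"match": {"title": "report"}})
```

Flagged documents still take up space until they are really deleted, so the regular delete job would continue to run in the background.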
