Lucidworks: Segment Merging, Deleted Documents and Why Optimize May Be Bad For You

Solr merge policy and deleted docs
During indexing, whenever a document is deleted or updated, it is not removed from the index immediately; it is only “marked as deleted” in its original segment. It no longer shows up in search results (or, in the case of an update, the new version is found instead). This leads to some percentage of “waste”; your index may consist of, say, 15%-20% deleted documents.
In some situations, the amount of wasted space is closer to 50%. And there are certain situations where the percentage of deleted documents can be even higher, as determined by the ratio of numDocs to maxDocs in Solr’s …read more
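To make the ratio concrete, here is a minimal sketch of the arithmetic, assuming you already have the two counters for a core: the live document count (numDocs) and the total including documents marked as deleted (reported by Solr as maxDoc). The function name and the sample numbers are hypothetical and only illustrate the calculation.

```python
def deleted_percentage(num_docs: int, max_doc: int) -> float:
    """Return the share of the index taken up by deleted documents, in percent.

    num_docs: live (searchable) documents.
    max_doc:  live documents plus documents still marked as deleted.
    """
    if max_doc == 0:
        # An empty index has no waste.
        return 0.0
    if not 0 <= num_docs <= max_doc:
        raise ValueError("expected 0 <= num_docs <= max_doc")
    return 100.0 * (max_doc - num_docs) / max_doc


# Hypothetical counters: 1,000 slots in the index, 800 of them live.
print(deleted_percentage(800, 1000))
```

With these made-up numbers, 200 of the 1,000 documents are marked as deleted, which falls at the top of the 15%-20% range mentioned above.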

Source: Planet Code4Lib