← all papers Β· overview

Efficient stream-based Max-Min diversification with minimal failure rate

Abstract

The streaming max-min diversification problem concerns the selection of a limited and diverse sample of items out of a data stream of known finite length. The objective to be maximized is the minimum distance among any pair of selected items. We consider the irrevocable-choice sampling, where decisions need to be immediate and irrevocable while processing the items of the stream, which is a setting little studied in the literature. Standard algorithmic approaches for sequential selection disregard selection failures, which is when the last items of the stream are picked by default, to prevent delivering an incomplete selection set. This defect can be catastrophic for the max-min diversification objective. The proposed Failure Rate Minimization (FRM) is a rank-based algorithm that selects a set of diverse items and, in addition, reduces significantly the probability of having failures. We demonstrate with simulations FRM's performance comparing with existing selection strategies.

Related papers

Ranked by semantic similarity β€” how closely each paper's abstract matches this one (100% = near-identical topic).