Combinatorial Algorithms in Sorting, Searching, and Information Dissemination

Date

2020

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

This work investigates two problems in information dissemination which are made challenging by combinatorial complexity. The first is consensus sorting, in which large, noisy data sets are used to form the fitness function which guides the sorting process. The second is perpetual gossiping, which is the problem of finding efficient algorithms to maintain universal reachability in a peer-to-peer network. Consensus sorting is a randomized algorithm which uses a hill-climbing approach to achieve sortedness. It has access to large data sets, in an abstraction of what may be possible with publicly available media. It uses a consensus from the data as a fitness function to guide sorting decisions. Other research has studied sorting with noisy comparisons; we delve into the model of sorting based on consensus, to understand the reasons for failures. We compare actual sorting behavior to the ideal, and explain the disparity between the practical results and the theoretical ideals. We construct a model describing the convergence behavior of the sorting algorithm, and why it might not result in a sorted array. We discuss the conditions under which the model will hold. We provide in-depth analysis of two of the more effective sorting metrics, and how one may predict their effectiveness. We discuss the significance of metrics, and why reasonable metrics yield poor consensus sorting results. The second problem, perpetual gossiping is an all-to-all information dissemination problem on a network, in which the goal is to ensure that every node in the network is able to reach every other node via a sequence of calls in minimal time. Thus, the aim is to design efficient schemes such that information which arises at one node at any time will eventually be transmitted to every other node. There is clear applicability for perpetual gossiping in social networks, but it also benefits other practical problems, such as synchronizing information in distributed systems such as RAIDs. Optimal solutions involving a static gossiping sequence and select perpetual schemes have long been known, but general optimal perpetual gossiping schemes have proven far more challenging due to the combinatorial complexity of the problem. One basic conjecture which remains unanswered, for example, is whether it is always possible to find an optimal scheme on a tree network using solely a contiguous series of calls. This work advances the state of knowledge on perpetual gossiping schemes in a number of ways. To begin with, it demonstrates the general solvability of optimal perpetual gossiping schemes on arbitrary graphs using a combinatorial algorithm. It also generalizes the network model to subsume many practical network models, while still remaining solvable. It continues by presenting a body of conjectures relating to the optimality of schemes: properties of critical windows in schemes; sufficient conditions about leaf reachability for schemes to form complete gossips; the window size needed by walk-around schemes; observations about call arcs. The necessity of contiguity in an optimal scheme has been disproven. We continue by expanding the range of networks for which solutions are known. We show efficient optimal results for tree networks with a diameter bounded at 3. We show efficient optimal solutions for "domino" trees (ones in which in-degree and out-degree are known) and "organ pipes" (in which a tree is composed of multiple paths). A mechanism for optimizing walk-around schemes by repeating portions is shown. We also contribute by demonstrating the difficulty of the perpetual gossiping problem, by showing that finding optimal solutions for the class of walk-around schemes is an NP-complete problem.

Description

Keywords

Citation