RUNMON-RIFT: Adaptive Configuration and Healing for Large-Scale Parameter Inference
(Tuesday, October 19, 2021)
Abstract
Gravitational wave parameter inference pipelines analyze data containing unknown sources and run on distributed hardware with unreliable performance. For one specific analysis pipeline (RIFT), we have developed a flexible tool (RUNMON-RIFT) to mitigate the most common challenges introduced by these two uncertainties. First, RUNMON provides several mechanisms to identify and redress unreliable computing environments. Second, RUNMON provides mechanisms to adjust pipeline-specific run settings, including prior ranges, so that the analysis completes and the explored parameter range encompasses the physical source parameters. We demonstrate both general features with two controlled examples.
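To make the idea of adjusting prior ranges concrete, the following is a minimal sketch of one way such a check could work. It is not taken from RUNMON-RIFT. The function name `widen_prior_if_railing`, its thresholds, and the widening rule are all hypothetical, chosen only for illustration. The sketch flags a one-dimensional prior boundary as "railing" when a large fraction of posterior samples sits close to it, then extends the range on that side.

```python
from typing import Sequence, Tuple


def widen_prior_if_railing(
    samples: Sequence[float],
    prior: Tuple[float, float],
    edge_frac: float = 0.05,        # hypothetical: edge zone as a fraction of prior width
    max_edge_weight: float = 0.10,  # hypothetical: tolerated fraction of samples in that zone
    growth: float = 0.5,            # hypothetical: extension as a fraction of prior width
) -> Tuple[float, float]:
    """Return the prior range, widened on any side where samples pile up.

    Illustrative only; this is an assumed heuristic, not the RUNMON-RIFT algorithm.
    """
    lo, hi = prior
    width = hi - lo
    if width <= 0:
        raise ValueError("prior range must satisfy lo < hi")
    if not samples:
        raise ValueError("need at least one sample")

    n = len(samples)
    # Fraction of samples lying in the edge zone next to each boundary.
    near_lo = sum(1 for x in samples if x <= lo + edge_frac * width) / n
    near_hi = sum(1 for x in samples if x >= hi - edge_frac * width) / n

    # Extend only the side(s) where the posterior rails against the boundary.
    new_lo = lo - growth * width if near_lo > max_edge_weight else lo
    new_hi = hi + growth * width if near_hi > max_edge_weight else hi
    return new_lo, new_hi


# Usage: most samples crowd the upper boundary, so only that side is extended.
railing = [27.5, 28.0, 29.6, 29.7, 29.8, 29.9, 29.95, 30.0]
print(widen_prior_if_railing(railing, (20.0, 30.0)))
```

A real pipeline would also need to respect physical limits on each parameter (for example, positive masses) and rerun the analysis over the widened range; neither step is shown here.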