The problem: A major source of the lack of replicability in epidemiological studies is selective inference in the face of multiple parameters. Multiple parameters arise in this area as a result of (1) making inferences about many subgroups or strata; (2) utilizing various sources of data, each with its own parameters measuring harm, all bearing on the same question of interest; and (3) making inferences at many locations and about many diseases.
It is typical for recent epidemiological studies to involve many parameters. The larger the problem, the more serious the danger of selective inference becomes: emphasis is put only on the important findings, where importance is assessed from the same data. Examples include emphasizing only the confidence intervals for odds ratios that do not cover one, or reporting only the statistically significant findings. In these cases the reported confidence intervals fail to cover their parameters more often than the nominal level suggests, and the probability of making a false discovery is likewise higher than expected.
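The loss of coverage after selection can be illustrated with a small simulation. The sketch below is only an illustration under assumed conditions, not an analysis of any real study: the true log odds ratios are drawn from a hypothetical normal distribution, every estimate has the same standard error, and the function name and all parameter values are chosen here for the example.

```python
import random
from statistics import NormalDist


def selected_coverage(m=10000, effect_sd=0.5, se=1.0, level=0.95, seed=0):
    """Simulate m log odds ratios; return (overall coverage,
    number of selected intervals, coverage among selected intervals)."""
    rng = random.Random(seed)
    # Two-sided normal quantile, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    covered_all = 0
    n_selected = 0
    covered_selected = 0
    for _ in range(m):
        theta = rng.gauss(0.0, effect_sd)   # true log odds ratio (assumed model)
        estimate = rng.gauss(theta, se)     # observed estimate
        lo, hi = estimate - z * se, estimate + z * se
        covers = lo <= theta <= hi
        covered_all += covers
        # Selection: keep only intervals that exclude log OR = 0 (OR = 1).
        if lo > 0 or hi < 0:
            n_selected += 1
            covered_selected += covers
    return covered_all / m, n_selected, covered_selected / max(n_selected, 1)


overall, n_selected, after_selection = selected_coverage()
print(f"coverage over all intervals:      {overall:.3f}")
print(f"intervals selected (exclude OR=1): {n_selected}")
print(f"coverage among selected:          {after_selection:.3f}")
```

In this setup coverage over all intervals stays close to the nominal 95%, while coverage among the intervals selected for excluding an odds ratio of one falls clearly below it.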
Selective inference is manifested in other ways as well: selecting the results to be included in the abstract, as discussed by Kavvoura et al., or burying the less interesting results in appendices and databases. This is a legitimate strategy if selective inference is accounted for. When it is not accounted for, selective inference leads to the publication of biased conclusions that cannot be replicated, giving false impressions to the public and to decision makers.
Our approach: To tailor methods that control the false discovery rate (FDR) and the false coverage rate (FCR) under selective inference to the problems described above.
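As a point of reference for what controlling these two error rates can look like, the sketch below pairs the Benjamini-Hochberg step-up procedure (FDR control) with FCR-adjusted confidence intervals, in which the R parameters selected out of m receive intervals at level 1 - R*q/m. These are standard general-purpose procedures assumed here for illustration; they are not the tailored methods themselves. The function names, the normal approximation, and the example numbers are all hypothetical.

```python
from statistics import NormalDist


def benjamini_hochberg(p_values, q=0.05):
    """Return sorted indices rejected by the BH step-up procedure at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    # Step-up: find the largest rank whose p-value is below rank * q / m.
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            k = rank
    return sorted(order[:k])


def fcr_adjusted_intervals(estimates, std_errors, q=0.05):
    """Select parameters by BH on two-sided z-tests, then build
    intervals at level 1 - R*q/m for the R selected parameters."""
    norm = NormalDist()
    m = len(estimates)
    p_values = [2 * (1 - norm.cdf(abs(e / s)))
                for e, s in zip(estimates, std_errors)]
    selected = benjamini_hochberg(p_values, q)
    if not selected:
        return {}
    # Wider quantile than the unadjusted 1 - q/2 whenever R < m.
    z = norm.inv_cdf(1 - len(selected) * q / (2 * m))
    return {i: (estimates[i] - z * std_errors[i],
                estimates[i] + z * std_errors[i])
            for i in selected}


# Hypothetical log odds ratio estimates for eight subgroups, unit standard errors.
log_or = [3.5, 0.2, -0.1, 2.9, 0.5, -4.0, 0.3, 1.0]
for index, (lo, hi) in fcr_adjusted_intervals(log_or, [1.0] * 8).items():
    print(f"subgroup {index}: FCR-adjusted interval ({lo:.2f}, {hi:.2f})")
```

The adjusted intervals are wider than the usual marginal 95% intervals, which is the price paid for reporting intervals only for the selected subgroups.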
Our solutions: Some solutions that use the FDR from a Bayesian point of view are already available (Catelan D, Lagazio C, Biggeri A). We work on selective inference arising from subgroup analysis.