The goal of this article is to develop formal tests to evaluate the relative in-sample performance of two competing, misspecified, nonnested models in the presence of possible data instability. Whereas previous approaches to model selection are based on measures of global performance, we focus on the local relative performance of the models. We propose tests that are based on different measures of local performance and that correspond to different null and alternative hypotheses. The empirical application provides insights into the time variation in the performance of a representative Euro-area Dynamic Stochastic General Equilibrium (DSGE) model relative to that of vector autoregressions (VARs).