Research results on smart heating systems based on occupancy prediction are often difficult to reproduce and compare. Evaluating the performance of these systems through simulation or real experiments requires defining suitable scenarios and setting a large number of parameters. Because different authors rely on different scenarios and parameter settings, comparing reported performance results is often infeasible. In this paper, we argue that overcoming this problem is crucial to advancing research on smart heating systems. We outline the main factors that influence the performance of such systems and show how they can be integrated into a simple yet thorough evaluation methodology. Using parameters synthesised from real-world occupancy and weather data, we describe how this methodology can be used to establish performance bounds for smart heating systems.