The ubiquity of wearable audio devices and the importance of the auditory sense imply great potential for audio augmented reality. In this work, we propose a concept and a prototype for synthesizing spatial sounds that appear to emanate from arbitrary real objects during everyday interactions, whereby all sounds are rendered directly through the user's own earbuds rather than loudspeakers mounted on the objects. The proposed system tracks the user and the objects in real time, creates a simplified model of the environment, and generates realistic 3D audio effects. We thoroughly evaluate the usability and usefulness of such a system in a user study with 21 participants. We also investigate how an acoustic environment model improves the sense of engagement with the rendered 3D sounds.