We present a machine learning technique for recognizing discrete gestures and estimating continuous 3D hand position for mobile interaction. Our multi-stage random forest pipeline jointly classifies hand shapes and regresses metric depth of the hand from a single RGB camera. Our technique runs in real time on unmodified mobile devices, such as smartphones, smartwatches, and smartglasses, complementing existing interaction paradigms with in-air gestures.
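The two-stage idea (a discrete classifier for hand shape feeding a per-shape depth regressor) can be sketched with off-the-shelf random forests. This is an illustrative toy only: the paper's pipeline operates on image features from an RGB camera, whereas here the features, shape labels, and depth values are all synthetic stand-ins.

```python
# Hypothetical sketch of a two-stage random forest pipeline:
# stage 1 classifies the discrete hand shape, stage 2 regresses
# metric depth with a separate forest per predicted shape.
# All data below is synthetic; it is not the paper's feature set.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(0)
n, d = 600, 16
X = rng.normal(size=(n, d))                 # stand-in per-frame hand features
shape = (X[:, 0] > 0).astype(int)           # two hypothetical hand shapes
depth = 0.2 + 0.5 * X[:, 1] ** 2 + 0.1 * shape  # synthetic depth in metres

# Stage 1: discrete gesture/shape classification.
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, shape)

# Stage 2: one depth regressor trained per shape class.
regs = {
    c: RandomForestRegressor(n_estimators=50, random_state=0).fit(
        X[shape == c], depth[shape == c]
    )
    for c in (0, 1)
}

def predict(x):
    """Classify the hand shape, then regress depth with that class's forest."""
    c = int(clf.predict(x.reshape(1, -1))[0])
    z = float(regs[c].predict(x.reshape(1, -1))[0])
    return c, z

c, z = predict(X[0])
```

Conditioning the regressor on the predicted class is one plausible reading of "multi-stage"; the actual pipeline stages and features are specified in the paper itself.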