To address the insufficient utilization of multi-sensor data and the difficulty of extracting features from high-dimensional redundant data in existing engine remaining useful life (RUL) prediction methods, an engine RUL prediction model based on spatio-temporal information joint embedding and discrete fusion learning is proposed. First, a spatio-temporal information joint embedding network is designed to encode multi-sensor data, embedding both time-series information and spatial feature information so that the model can capture the correlations within the data more fully. Then, an attention-based discrete fusion variational autoencoder network is constructed, which quantizes the embedded spatio-temporal features through codebook mapping in an unsupervised manner and then fuses them in parallel through upper-lower fusion attention. Finally, a bidirectional time-series memory RUL prediction module combines the forward and backward semantic information of the key degradation features to produce the prediction. Experimental results on the C-MAPSS dataset show that the proposed method effectively improves the accuracy of engine RUL prediction and outperforms the existing methods it is compared with.
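The abstract does not spell out the codebook mapping. As a rough illustration only, the sketch below assumes it is a nearest-neighbor lookup of the kind used in vector-quantized autoencoders: each embedded feature vector is replaced by the closest entry of a codebook. The function name `quantize`, the toy codebook, and the two-dimensional features are hypothetical and are not taken from the paper. In an actual model the codebook would be learned rather than fixed.

```python
def quantize(features, codebook):
    """Map each feature vector to its nearest codebook entry.

    Assumption: "codebook mapping" means nearest-neighbor lookup under
    squared Euclidean distance; the paper's exact rule may differ.
    features: list of equal-length float lists (hypothetical embeddings).
    codebook: list of code vectors of the same length (hypothetical values).
    Returns (indices, quantized): the chosen code index per feature and
    the corresponding code vectors.
    """
    indices = []
    quantized = []
    for vec in features:
        # Squared Euclidean distance from this feature to every code vector.
        dists = [sum((a - b) ** 2 for a, b in zip(vec, code)) for code in codebook]
        # Index of the closest code vector (first one wins on ties).
        idx = min(range(len(codebook)), key=dists.__getitem__)
        indices.append(idx)
        quantized.append(codebook[idx])
    return indices, quantized


# Toy data, chosen only to make the lookup easy to follow.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
features = [[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7]]
indices, quantized = quantize(features, codebook)
```

Here each continuous feature is reduced to one discrete index, which is the sense in which the representation becomes "discrete" before any fusion step is applied.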