2606.12845

A Privacy-Preserving Framework Using Remote Data Science for Inter-Institutional Student Retention Prediction

Authors: John Fields, K M Sajjadul Islam, Ruchitha Thota, Victor Chen, Praveen Madiraju

This study explores privacy-preserving machine learning (PPML) techniques using the PySyft platform to enable collaborative prediction of student retention between institutions. We developed a remote data science (RDS) framework with a semi-air-gapped architecture consisting of high-side and low-side servers, allowing researchers from three universities to build predictive models on sensitive student data without direct data access. Using historical data from a small private university (N=720), we evaluated three synthetic data generation approaches and validated the framework through inter-institutional collaboration. The results demonstrate consistent classification performance across institutions (Macro F1: 0.690–0.695) while maintaining strict Family Educational Rights and Privacy Act (FERPA) compliance. We also propose Data-Type-Aware Templates, a novel synthetic data method that prioritizes privacy over distributional fidelity. Our findings confirm that RDS-based PPML is technically feasible for educational settings and offers a practical alternative to federated learning for small-scale inter-institutional collaborations. The code is available at https://github.com/jtfields/NAIRR240195-Privacy-Preserving-Machine-Learning.
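
The abstract reports performance as Macro F1, which averages the per-class F1 scores with equal weight, so a minority class (for example, non-retained students) counts as much as the majority class. The following is a minimal sketch of that metric in plain Python. It is not taken from the authors' repository; the function name `macro_f1` and the toy labels are illustrative assumptions.

```python
from typing import Sequence


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("at least one label is required")

    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)

    # Every class contributes equally, regardless of its frequency.
    return sum(f1_scores) / len(f1_scores)


# Toy labels (made up for illustration): 1 = retained, 0 = not retained.
score = macro_f1([1, 1, 1, 0], [1, 1, 0, 0])
print(round(score, 3))  # 0.733
```

In the toy case the per-class scores are 0.8 (retained) and about 0.667 (not retained), and the macro average weights them equally even though the classes differ in size.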

Subjects: Cryptography and Security, Machine Learning

Publish: 2026-06-11 03:18:50 UTC