Non-interactive Privacy-preserving k-NN Classifier

Machine learning tasks typically require large amounts of sensitive data to be shared, which is intrusive in terms of privacy. Outsourcing the computation to the cloud requires the server to be trusted, an unrealistic security assumption that carries a high risk of abuse or data breaches. In this paper, we propose privacy-preserving versions of the k-NN classifier that operate over encrypted data by combining order-preserving encryption and homomorphic encryption. In our experiments, the privacy-preserving variant achieves the same accuracy as the conventional k-NN classifier, at the cost of a considerable performance overhead. This overhead remains acceptable for practical use in sensitive applications, given the additional security properties the approach provides. In particular, the cloud server need not be trusted beyond correct execution of the protocol: it runs the algorithm over encrypted data and encrypted classes. As a result, the cloud server never learns the real dataset values, the number of classes, the query vectors, or their classification.