NIST Issues Blog on Protecting Privacy with Machine Learning
On December 21, 2021, the National Institute of Standards and Technology ("NIST") released a blog post outlining how organizations can safeguard consumer and customer information when employing machine learning in the delivery of their services.
Although the post centers on the use of machine learning in the medical field, its guidance applies more broadly, as NIST highlights both the advantages and disadvantages of machine learning across a variety of industries. For example, a machine learning model's memorization of sensitive information may allow a bad actor to extract private information from the model's training data.[1]
According to NIST, "differential privacy . . . can be used to quantify and bound leakage of private information from the learner's training data. In particular, it allows us to prevent memorization and responsibly train models on sensitive data." The post goes on to differentiate "differential privacy" from "model generalizations" and to "describe two approaches for training deep neural networks with differential privacy."[2]
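For context, differential privacy is typically formalized as a mathematical guarantee that the inclusion or exclusion of any single individual's record can change a model's observable behavior by only a bounded amount. A standard statement of the definition from the differential privacy literature (summarized here for illustration, not quoted from the NIST post) is:

\[
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[\mathcal{M}(D') \in S] + \delta
\]

for every pair of datasets \(D\) and \(D'\) differing in a single record and every set of outputs \(S\), where \(\mathcal{M}\) is the randomized training algorithm, \(\varepsilon\) bounds the privacy loss (smaller values mean stronger privacy), and \(\delta\) is a small allowance for the bound to fail. In practice, this guarantee is enforced by injecting calibrated noise during training, which is what allows a model to learn general patterns without memorizing any individual's data.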
Organizations should understand the specifics, advantages, and disadvantages of using artificial intelligence and machine learning and should review NIST's recommendations when employing such technologies.
If you have any questions or concerns about how machine learning and artificial intelligence can impact or assist your organization, please contact Kennedy Sutherland.
[1] Jared P. Lander (Chief Data Scientist) and Michael Beigelmacher (Data Engineer) of Lander Analytics, The Essential Guide to Quality Training Data for Machine Learning, CloudFactory (Feb. 2020) (“Training data refers to the initial data that is used to develop a machine learning model, from which the model creates and refines its rules.”).
[2] For additional information on "differential privacy," please see NIST's Differential Privacy Blog Series.