In supervised learning, a program ingests training data as sets of observations, each described by measurable attributes called features and identified by a label. For example, many emails, each labeled Spam or Not Spam.
The program expresses the features mathematically and feeds them iteratively into a function, varying the function's governing parameters until it produces the desired output: emails correctly identified as spam or not spam. The program is a machine learning model, and the iterative process is called training.
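To make that concrete, here is a minimal sketch of such a training loop: a toy spam classifier built with logistic regression and gradient descent. The feature names, counts, and data are all invented for illustration; real models use far richer features and far more data.

```python
import math

# Toy training set: each email is a feature vector of word counts,
# [count of "free", count of "winner", count of "meeting"],
# with label 1 = spam, 0 = not spam.  All data here is invented.
emails = [
    ([3, 2, 0], 1),
    ([2, 1, 0], 1),
    ([0, 0, 2], 0),
    ([1, 0, 3], 0),
]

# The governing parameters the training process will vary.
weights = [0.0, 0.0, 0.0]
bias = 0.0
rate = 0.1  # learning rate: how far each parameter moves per step

def predict(features):
    """Logistic function of a weighted sum: output near 1 means spam."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Training: repeatedly nudge the parameters to shrink the error.
for epoch in range(1000):
    for features, label in emails:
        error = predict(features) - label
        for i, x in enumerate(features):
            weights[i] -= rate * error * x
        bias -= rate * error

print(predict([4, 1, 0]))  # spam-like word counts: output near 1
print(predict([0, 0, 4]))  # meeting-like word counts: output near 0
```

The loop is the "try, vary, try again" process described above: each pass compares the model's output to the correct label and adjusts the parameters slightly in the direction that reduces the mistake.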
Programs, like people, are not perfect learners. Data scientists and engineers evaluate a model by measuring loss, how far its predictions stray from the correct answers, and variance, how much its performance changes across different sets of training data.
Too little loss on the training data may signal overfitting, where the model has memorized its training set and its results on new data are erratic, i.e., have high variance. Too much loss yields more consistent results across data sets, but predictions that are not accurate enough to be useful.
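The trade-off can be sketched in a few lines. Below, an invented "memorizer" model achieves zero training loss by storing every noisy training point, while a simple fitted line accepts some training loss. On fresh data drawn from the same underlying pattern, the memorizer's zero loss does not carry over; the simpler model holds up. The data and models are toy assumptions chosen only to illustrate the idea.

```python
import random

random.seed(0)

def truth(x):
    return 2 * x  # the underlying pattern we want to learn

def noisy_sample(n):
    """Draw n (x, y) points: the true pattern plus random noise."""
    return [(x, truth(x) + random.gauss(0, 1)) for x in range(n)]

train = noisy_sample(50)

# Overfit model: memorizes every training point, noise included.
memorized = dict(train)
def overfit(x):
    return memorized[x]

# Simpler model: a line through the origin, slope fit by least
# squares -- it cannot memorize the noise.
slope = sum(x * y for x, y in train) / sum(x * x for x, _ in train)
def fitted(x):
    return slope * x

def loss(model, data):
    """Mean squared error: the average of squared mistakes."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

test = noisy_sample(50)  # fresh data from the same pattern

print(loss(overfit, train))  # exactly zero: no training mistakes
print(loss(fitted, train))   # small but nonzero
print(loss(overfit, test))   # larger: memorized noise doesn't transfer
print(loss(fitted, test))    # stays small: the pattern generalizes
```

The memorizer looks perfect in training and disappoints on new data; the simple line looks slightly worse in training and behaves consistently, which is the loss-versus-variance tension described above.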
Data scientists feed their programs large amounts of data and spend substantial computational resources and time to enable successful machine learning. Sometimes their math yields an accurate model that runs in milliseconds on many data sets, for example, voice recognition on your phone. Sometimes their mathematical approach simply fails. Very much a case of "if at first you don't succeed, try, try again."
The math and the technology involved are non-trivial. Today’s data scientists stand on the shoulders of centuries of pragmatic mathematicians. Moore’s law and distributed big data together enable the computing scale required.