Abstract
Statistics is Important
Statistics is important to machine learning practitioners:
- Statistics is a prerequisite in most courses and books on applied machine learning.
- Statistical methods are used at each step in an applied machine learning project.
- Statistical learning is the applied statistics equivalent of predictive modeling in machine learning.
A machine learning practitioner cannot be effective without an understanding of basic statistical concepts and statistical methods, and an effective practitioner cannot excel without being aware of and leveraging the terminology and methods used in the sister field of statistical learning.
Practitioners Don’t Know Stats
Developers don’t know statistics, and this is a huge problem. Programmers don’t need to know or use statistical methods in order to develop software. Software engineering and computer science programs generally don’t include courses on statistics, let alone advanced statistical tests.
As such, it is common for machine learning practitioners coming from the computer science or developer tradition to neither know nor value statistical methods. This is a problem given the pervasive use of statistical methods and statistical thinking in the preparation of data, the evaluation of learned models, and all other steps in a predictive modeling project.
Practitioners Study The Wrong Stats
Eventually, machine learning practitioners realize the need for skills in statistics. This might start with a need to better interpret descriptive statistics or data visualizations and may progress to a need for sophisticated hypothesis tests. The problem is that they don’t seek out the specific statistical information they need. Instead, they try to read through a textbook on statistics or work through the material for an undergraduate course on statistics. This approach is slow, it’s boring, and it covers a breadth and depth of material that is beyond the needs of the machine learning practitioner.
Practitioners Study Stats The Wrong Way
It’s worse than this. Regardless of the medium used to learn statistics, be it books, videos, or course material, machine learning practitioners study statistics the wrong way. Because the material is intended for undergraduate students who need to pass a test, it is focused on theory, proofs, and derivations. This is great for testing students but terrible for practitioners who need results. Practitioners need methods that clearly state when they are appropriate, along with instruction on how to interpret the result. They need code examples that they can use immediately on their project.
A Better Way
I set out to write a playbook for machine learning practitioners that gives them only those parts of statistics that they need to know in order to work through a predictive modeling project. I set out to present statistical methods in the way that practitioners learn, that is, with simple language and working code examples. Statistics is important to machine learning, and I believe that if it is taught at the right level for practitioners, it can be a fascinating, fun, directly applicable, and immeasurably useful area of study. I hope that you agree.
Comments on the book "Statistical Methods for Machine Learning"