Выбрать главу

But the journey is far from over. We don’t have the Master Algorithm yet, just a glimpse of what it might look like. What if something fundamental is still missing, something all of us in the field, steeped in its history, can’t see? We need new ideas, and ideas that are not just variations on the ones we already have. That’s why I wrote this book: to start you thinking. I teach an evening class on machine learning at the University of Washington. In 2007, soon after the Netflix Prize was announced, I proposed it as one of the class projects. Jeff Howbert, a student in the class, got hooked and continued to work on it after the class was over. He wound up being a member of one of the two winning teams, two years after learning about machine learning for the first time. Now it’s your turn. To learn more about machine learning, check out the section on further readings at the end of the book. Download some data sets from the UCI repository (archive.ics.uci.edu/ml/) and start playing. When you’re ready, check out Kaggle.com, a whole website dedicated to running machine-learning competitions, and pick one or two to enter. Of course, it’ll be more fun if you recruit a friend or two to work with you. If you’re hooked, like Jeff was, and wind up becoming a professional data scientist, welcome to the most fascinating job in the world. If you find yourself dissatisfied with today’s learners, invent new ones-or just do it for fun. My fondest wish is that your reaction to this book will be like my reaction to the first AI book I read, over twenty years ago: there’s so much to do here, I don’t know where to start. If one day you invent the Master Algorithm, please don’t run to the patent office with it. Open-source it. The Master Algorithm is too important to be owned by any one person or organization. Its applications will multiply faster than you can license it. But if you decide instead to do a startup, remember to give a share in it to every man, woman, and child on Earth.

Whether you read this book out of curiosity or professional interest, I hope you will share what you’ve learned with your friends and colleagues. Machine learning touches the lives of every one of us, and it’s up to all of us to decide what we want to do with it. Armed with your new understanding of machine learning, you’re in a much better position to think about issues like privacy and data sharing, the future of work, robot warfare, and the promise and peril of AI; and the more of us have this understanding, the more likely we’ll avoid the pitfalls and find the right paths. That’s the other big reason I wrote this book. The statistician knows that prediction is hard, especially about the future, and the computer scientist knows that the best way to predict the future is to invent it, but the unexamined future is not worth inventing.

Thanks for letting me be your guide. I’d like to give you a parting gift. Newton said that he felt like a boy playing on the seashore, picking up a pebble here and a shell there while the great ocean of truth lay undiscovered before him. Three hundred years later, we’ve gathered an amazing collection of pebbles and shells, but the great undiscovered ocean still stretches into the distance, sparkling with promise. The gift is a boat-machine learning-and it’s time to set sail.

Acknowledgments

First of all, I thank my companions in scientific adventure: students, collaborators, colleagues, and everyone in the machine-learning community. This is your book as much as mine. I hope you will forgive my many oversimplifications and omissions, and the somewhat fanciful way in which parts of the book are written.

I’m grateful to everyone who read and commented on drafts of the book at various stages, including Mike Belfiore, Thomas Dietterich, Tiago Domingos, Oren Etzioni, Abe Friesen, Rob Gens, Alon Halevy, David Israel, Henry Kautz, Chloé Kiddon, Gary Marcus, Ray Mooney, Kevin Murphy, Franzi Roesner, and Ben Taskar. Thanks also to everyone who gave me pointers, information, or help of various kinds, including Tom Griffiths, David Heckerman, Hannah Hickey, Albert-László Barabási, Yann LeCun, Barbara Mones, Mike Morgan, Peter Norvig, Judea Pearl, Gregory Piatetsky-Shapiro, and Sebastian Seung.

I’m lucky to work in a very special place, the University of Washington’s Department of Computer Science and Engineering. I’m also grateful to Josh Tenenbaum, and to everyone in his group, for hosting the sabbatical at MIT during which I started this book. Thanks to Jim Levine, my indefatigable agent, for drinking the Kool-Aid (as he put it) and spreading the word; and to everyone at Levine Greenberg Rostan. Thanks to TJ Kelleher, my amazing editor, for helping make this a better book, chapter by chapter, line by line; and to everyone at Basic Books.

I’m indebted to the organizations that have funded my research over the years, including ARO, DARPA, FCT, NSF, ONR, Ford, Google, IBM, Kodak, Yahoo, and the Sloan Foundation.

Last and most, I thank my family for their love and support.

Further Readings

If this book whetted your appetite for machine learning and the issues surrounding it, you’ll find many suggestions in this section. Its aim is not to be comprehensive but to provide an entrance to machine learning’s garden of forking paths (as Borges put it). Wherever possible, I chose books and articles appropriate for the general reader. Technical publications, which require at least some computational, statistical, or mathematical background, are marked with an asterisk (*). Even these, however, often have large sections accessible to the general reader. I didn’t list volume, issue, or page numbers, since the web renders them superfluous; likewise for publishers’ locations.

If you’d like to learn more about machine learning in general, one good place to start is online courses. Of these, the closest in content to this book is, not coincidentally, the one I teach (www.coursera.org/course/machlearning). Two other options are Andrew Ng’s course (www.coursera.org/course/ml) and Yaser Abu-Mostafa’s (http://work.caltech.edu/telecourse.html). The next step is to read a textbook. The closest to this book, and one of the most accessible, is Tom Mitchell’s Machine Learning* (McGraw-Hill, 1997). More up-to-date, but also more mathematical, are Kevin Murphy’s Machine Learning: A Probabilistic Perspective* (MIT Press, 2012), Chris Bishop’s Pattern Recognition and Machine Learning* (Springer, 2006), and An Introduction to Statistical Learning with Applications in R,* by Gareth James, Daniela Witten, Trevor Hastie, and Rob Tibshirani (Springer, 2013). My article “A few useful things to know about machine learning” (Communications of the ACM, 2012) summarizes some of the “folk knowledge” of machine learning that textbooks often leave implicit and was one of the starting points for this book. If you know how to program and are itching to give machine learning a try, you can start from a number of open-source packages, such as Weka (www.cs.waikato.ac.nz/ml/weka). The two main machine-learning journals are Machine Learning and the Journal of Machine Learning Research. Leading machine-learning conferences, with yearly proceedings, include the International Conference on Machine Learning, the Conference on Neural Information Processing Systems, and the International Conference on Knowledge Discovery and Data Mining. A large number of machine-learning talks are available on http://videolectures.net. The www.KDnuggets.com website is a one-stop shop for machine-learning resources, and you can sign up for its newsletter to keep up-to-date with the latest developments.