This advanced machine learning book highlights many algorithms from a geometric perspective and introduces tools in network science, metric geometry, and topological data analysis through practical application.
Whether you’re a mathematician, seasoned data scientist, or marketing professional, you’ll find The Shape of Data to be the perfect introduction to the critical interplay between the geometry of data structures and machine learning.
This book’s extensive collection of case studies (drawn from medicine, education, sociology, linguistics, and more) and gentle explanations of the math behind dozens of algorithms provide a comprehensive yet accessible look at how geometry shapes the algorithms that drive data analysis.
In addition to gaining a deeper understanding of how to implement geometry-based algorithms with code, you’ll explore:
Supervised and unsupervised learning algorithms and their application to network data analysis
The way distance metrics and dimensionality reduction impact machine learning
How to visualize, embed, and analyze survey and text data with topology-based algorithms
New approaches to computational solutions, including distributed computing and quantum algorithms
About the Author
Colleen M. Farrelly is a senior data scientist whose academic and industry research has focused on topological data analysis, quantum machine learning, geometry-based machine learning, network science, hierarchical modeling, and natural language processing. Since graduating from the University of Miami with an MS in biostatistics, Colleen has worked as a data scientist in a variety of industries, including healthcare, consumer packaged goods, biotech, nuclear engineering, marketing, and education. Colleen often speaks at tech conferences, including PyData, SAS Global, WiDS, Data Science Africa, and DataScience SALON. When not working, Colleen can be found writing haibun/haiga or swimming.
Yaé Ulrich Gaba completed his doctoral studies at the University of Cape Town (UCT, South Africa) with a specialization in topology and is currently a research associate at Quantum Leap Africa (QLA, Rwanda). His research interests are computational geometry, applied algebraic topology (topological data analysis), and geometric machine learning (graph and point-cloud representation learning). His current focus lies in geometric methods in data analysis, and his work seeks to develop effective and theoretically justified algorithms for data and shape analysis using geometric and topological ideas and methods.
"The title says it all. Data is bound by many complex relationships not easily shown in our two-dimensional, spreadsheet filled world. The Shape of Data walks you through this richer view and illustrates how to put it into practice." —Stephanie Thompson, Data Scientist and Speaker
“The Shape of Data is a novel perspective and phenomenal achievement in the application of geometry to the field of machine learning. It is expansive in scope and contains loads of concrete examples and coding tips for practical implementations, as well as extremely lucid, concise writing to unpack the concepts. Even as a more veteran data scientist who has been in the industry for years now, having read this book I've come away with a deeper connection to and new understanding of my field." —Kurt Schuepfer, Ph.D., McDonalds Corporation
“A great source for the application of topology and geometry in data science. Topology and geometry advance the field of machine learning on unstructured data, and The Shape of Data does a great job introducing new readers to the subject.” —Uchenna “Ike” Chukwu, Senior Quantum Developer
"See how data looks not just as lists of numbers but as plots and graphs. The Shape of Data shows the reader how to visualize data sets and discover relations hidden in the numbers and sets. . . . In this age of large data sets and deep learning, data graphics are essential to scientists and engineers—just like this book." —David S. Mazel, Principal/Manager Systems Engineer, Regulus-Group
"Everyone who works at the border of geometry and Data Science will find the book and invaluable resource and source of inspiration. It is considerate that the R-codes used in the book have readily accessible python codes. " —Geoffrey Mboya, DPhil (Oxon), Director at Mfano Africa
"Comprehensive and exceptionally well written, The Shape of Data: Geometry-Based Machine Learning and Data Analysis in R is impressively 'reader friendly' in organization and presentation, making it an ideal instructional resource for anyone with an interest in topology, computer hacking, or mathematical/statistical computer software." —Midwest Book Review