Process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.