Welcome

This book covers supervised learning with the tidymodels framework, a collection of R packages for modeling and machine learning built on tidyverse principles.

Within supervised models, there are two main sub-categories:

Regression predicts a numeric outcome.
Classification predicts a qualitative outcome, that is, one value from an ordered or unordered set of categories.
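As a rough illustration of the two sub-categories, the sketch below fits one model of each kind with tidymodels. It is only a minimal example under assumptions that are not part of the original text: the built-in `mtcars` data set, the predictor `wt`, the outcomes `mpg` and `am`, and the object names `reg_fit`, `cars` and `cls_fit` were all chosen for illustration.

```r
library(tidymodels)

# Regression: the outcome (mpg) is numeric.
reg_fit <- linear_reg() |>
  set_engine("lm") |>
  fit(mpg ~ wt, data = mtcars)

# Classification: the outcome must be a factor.
# Here am (transmission type) is recoded from 0/1 to labelled classes.
cars <- mtcars |>
  mutate(am = factor(am, levels = c(0, 1), labels = c("automatic", "manual")))

cls_fit <- logistic_reg() |>
  set_engine("glm") |>
  fit(am ~ wt, data = cars)

# Predictions come back as tibbles:
# a numeric .pred column for regression,
# a factor .pred_class column for classification.
reg_pred <- predict(reg_fit, new_data = head(mtcars, 3))
cls_pred <- predict(cls_fit, new_data = head(cars, 3))
```

The model specification functions differ (`linear_reg()` versus `logistic_reg()`), but the `fit()` and `predict()` steps look the same in both cases, which is one of the conveniences of the framework.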

Furthermore, we follow the data science lifecycle process proposed by Wirth and Hipp (2000):

Figure 0.1: Cross Industry Standard Process for Data Mining (Wirth and Hipp, 2000)

To learn more about this data science lifecycle framework, review this presentation about CRISP-DM.

License

This online work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Acknowledgements

The content of this tutorial is mainly based on the excellent book “Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow” by Géron (2019).

The website is built with bookdown.

1 CRISP-DM