# Preface

This is included in the sample pages on Springer's website.

**Chapter 1** Introduction

Prediction Versus Interpretation; Key Ingredients of Predictive Models; Terminology; Example Data Sets and Typical Data Scenarios; Overview; Notation (15 pages, 3 figures)

# Part I: General Strategies

**Chapter 2** A Short Tour of the Predictive Modeling Process

Case Study: Predicting Fuel Economy; Themes; Summary (8 pages, 6 figures, R packages used)

This chapter is included in the sample pages on Springer's website.

**Chapter 3** Data Pre-Processing

Case Study: Cell Segmentation in High-Content Screening; Data Transformations for Individual Predictors; Data Transformations for Multiple Predictors; Dealing with Missing Values; Removing Variables; Adding Variables; Binning Variables; Computing; Exercises (32 pages, 11 figures, R packages used)

**Chapter 4** Over-Fitting and Model Tuning

The Problem of Over-Fitting; Model Tuning; Data Splitting; Resampling Techniques; Case Study: Credit Scoring; Choosing Final Tuning Parameters; Data Splitting Recommendations; Choosing Between Models; Computing; Exercises (29 pages, 13 figures, R packages used)

# Part II: Regression Models

**Chapter 5** Measuring Performance in Regression Models

Quantitative Measures of Performance; The Variance-Bias Tradeoff; Computing (4 pages, 3 figures)

**Chapter 6** Linear Regression and Its Cousins

Case Study: Quantitative Structure-Activity Relationship Modeling; Linear Regression; Partial Least Squares; Penalized Models; Computing; Exercises (37 pages, 20 figures, R packages used)

**Chapter 7** Non-Linear Regression Models

Neural Networks; Multivariate Adaptive Regression Splines; Support Vector Machines; K-Nearest Neighbors; Computing; Exercises (28 pages, 10 figures, R packages used)

**Chapter 8** Regression Trees and Rule-Based Models

Basic Regression Trees; Regression Model Trees; Rule-Based Models; Bagged Trees; Random Forests; Boosting; Cubist; Computing; Exercises (46 pages, 24 figures, R packages used)

**Chapter 9** A Summary of Solubility Models

(3 pages, 3 figures)

**Chapter 10** Case Study: Compressive Strength of Concrete Mixtures

Model Building Strategy; Model Performance; Optimizing Compressive Strength; Computing (12 pages, 5 figures, R packages used)

# Part III: Classification Models

**Chapter 11** Measuring Performance in Classification Models

Class Predictions; Evaluating Predicted Classes; Evaluating Class Probabilities; Computing (20 pages, 9 figures, R packages used)

**Chapter 12** Discriminant Analysis and Other Linear Classification Models

Case Study; Logistic Regression; Linear Discriminant Analysis; Partial Least Squares Discriminant Analysis; Penalized Models; Nearest Shrunken Centroids; Computing; Exercises (52 pages, 20 figures, R packages used)

**Chapter 13** Non-Linear Classification Models

Nonlinear Discriminant Analysis; Neural Networks; Flexible Discriminant Analysis; Support Vector Machines; K-Nearest Neighbors; Naive Bayes; Computing; Exercises (38 pages, 16 figures, R packages used)

**Chapter 14** Classification Trees and Rule-Based Models

Basic Classification Trees; Rule-Based Models; Bagged Trees; Random Forests; Boosting; C5.0; Wrap-Up; Computing (46 pages, 15 figures, R packages used)

**Chapter 15** A Summary of Grant Application Models

(3 pages, 2 figures)

**Chapter 16** Remedies for Severe Class Imbalance

Case Study: Predicting Caravan Policy Ownership; The Effect of Class Imbalance; Model Tuning; Alternate Cutoffs; Adjusting Prior Probabilities; Unequal Case Weights; Sampling Methods; Cost-Sensitive Training; Computing; Exercises (24 pages, 7 figures, R packages used)

**Chapter 17** Case Study: Job Scheduling

Data Splitting and Model Strategy; Results; Computing (13 pages, 6 figures, R packages used)

# Part IV: Other Considerations

**Chapter 18** Measuring Predictor Importance

Numeric Outcomes; Categorical Outcomes; Other Approaches; Computing; Exercises (24 pages, 10 figures, R packages used)

**Chapter 19** An Introduction to Feature Selection

Consequences of Using Non-Informative Predictors; Approaches for Reducing the Number of Predictors; Wrapper Methods; Filter Methods; Selection Bias; Misuse of Feature Selection; Case Study: Predicting Cognitive Impairment; Computing; Exercises (34 pages, 7 figures, R packages used)

**Chapter 20** Factors That Can Affect Model Performance

Type III Errors; Measurement Error in the Outcome; Measurement Error in the Predictors; Discretizing Continuous Outcomes; When Should You Trust Your Model’s Prediction?; The Impact of a Large Sample; Computing; Exercises (26 pages, 12 figures, R packages used)

# Appendix

These are included in the sample pages on Springer's website.

**Appendix A** A Summary of Various Models

**Appendix B** An Introduction to R

Startup and Getting Help; Packages; Creating Objects; Data Types and Basic Structures; Working with Rectangular Data Sets; Objects and Classes; R Functions; The Three Faces of =; The AppliedPredictiveModeling Package; The caret Package; Software Used in This Text (16 pages, 1 figure, R packages used)