Conquer the complexities of this open source statistical language
R is fast becoming the de facto standard for statistical computing and analysis in science, business, engineering, and related fields. This book examines this complex language using simple statistical examples, showing how R operates in a user-friendly context. Both students and workers in fields that require extensive statistical analysis will find this book helpful as they learn to use R for simple summary statistics, hypothesis testing, creating graphs, regression, and much more. It covers formula notation, complex statistics, manipulating data and extracting components, and rudimentary programming.
* R, the open source statistical language increasingly used to handle statistics and produce publication-quality graphs, is notoriously complex
* This book makes R easier to understand through the use of simple statistical examples, teaching the necessary elements in the context in which R is actually used
* Covers getting started with R and using it for simple summary statistics, hypothesis testing, and graphs
* Shows how to use R for formula notation, complex statistics, manipulating data, extracting components, and regression
* Provides beginning programming instruction for those who want to write their own scripts
Beginning R offers anyone who needs to perform statistical analysis the information necessary to use R with confidence.
Introduction xxi
Chapter 1: Introducing R: What It Is and How to Get It 1
Getting the Hang of R 2
The R Website 3
Downloading and Installing R from CRAN 3
Installing R on Your Windows Computer 4
Installing R on Your Macintosh Computer 7
Installing R on Your Linux Computer 7
Running the R Program 8
Finding Your Way with R 10
Getting Help via the CRAN Website and the Internet 10
The Help Command in R 10
Help for Windows Users 11
Help for Macintosh Users 11
Help for Linux Users 13
Help For All Users 13
Anatomy of a Help Item in R 14
Command Packages 16
Standard Command Packages 16
What Extra Packages Can Do for You 16
How to Get Extra Packages of R Commands 18
How to Install Extra Packages for Windows Users 18
How to Install Extra Packages for Macintosh Users 18
How to Install Extra Packages for Linux Users 19
Running and Manipulating Packages 20
Loading Packages 21
Windows-Specific Package Commands 21
Macintosh-Specific Package Commands 21
Removing or Unloading Packages 22
Summary 22
Chapter 2: Starting Out: Becoming Familiar with R 25
Some Simple Math 26
Use R Like a Calculator 26
Storing the Results of Calculations 29
Reading and Getting Data into R 30
Using the combine Command for Making Data 30
Entering Numerical Items as Data 30
Entering Text Items as Data 31
Using the scan Command for Making Data 32
Entering Text as Data 33
Using the Clipboard to Make Data 33
Reading a File of Data from a Disk 35
Reading Bigger Data Files 37
The read.csv() Command 37
Alternative Commands for Reading Data in R 39
Missing Values in Data Files 40
Viewing Named Objects 41
Viewing Previously Loaded Named-Objects 42
Viewing All Objects 42
Viewing Only Matching Names 42
Removing Objects from R 44
Types of Data Items 45
Number Data 45
Text Items 45
Converting Between Number and Text Data 46
The Structure of Data Items 47
Vector Items 48
Data Frames 48
Matrix Objects 49
List Objects 49
Examining Data Structure 49
Working with History Commands 51
Using History Files 52
Viewing the Previous Command History 52
Saving and Recalling Lists of Commands 52
Alternative History Commands in Macintosh OS 52
Editing History Files 53
Saving Your Work in R 54
Saving the Workspace on Exit 54
Saving Data Files to Disk 54
Save Named Objects 54
Save Everything 55
Reading Data Files from Disk 56
Saving Data to Disk as Text Files 57
Writing Vector Objects to Disk 58
Writing Matrix and Data Frame Objects to Disk 58
Writing List Objects to Disk 59
Converting List Objects to Data Frames 60
Summary 61
Chapter 3: Starting Out: Working With Objects 65
Manipulating Objects 65
Manipulating Vectors 66
Selecting and Displaying Parts of a Vector 66
Sorting and Rearranging a Vector 68
Returning Logical Values from a Vector 70
Manipulating Matrix and Data Frames 70
Selecting and Displaying Parts of a Matrix or Data Frame 71
Sorting and Rearranging a Matrix or Data Frame 74
Manipulating Lists 76
Viewing Objects within Objects 77
Looking Inside Complicated Data Objects 77
Opening Complicated Data Objects 78
Quick Looks at Complicated Data Objects 80
Viewing and Setting Names 82
Rotating Data Tables 86
Constructing Data Objects 86
Making Lists 87
Making Data Frames 88
Making Matrix Objects 89
Re-ordering Data Frames and Matrix Objects 92
Forms of Data Objects: Testing and Converting 96
Testing to See What Type of Object You Have 96
Converting from One Object Form to Another 97
Convert a Matrix to a Data Frame 97
Convert a Data Frame into a Matrix 98
Convert a Data Frame into a List 99
Convert a Matrix into a List 100
Convert a List to Something Else 100
Summary 104
Chapter 4: Data: Descriptive Statistics and Tabulation 107
Summary Commands 108
Summarizing Samples 110
Summary Statistics for Vectors 110
Summary Commands With Single Value Results 110
Summary Commands With Multiple Results 113
Cumulative Statistics 115
Simple Cumulative Commands 115
Complex Cumulative Commands 117
Summary Statistics for Data Frames 118
Generic Summary Commands for Data Frames 119
Special Row and Column Summary Commands 119
The apply() Command for Summaries on Rows or Columns 120
Summary Statistics for Matrix Objects 120
Summary Statistics for Lists 121
Summary Tables 122
Making Contingency Tables 123
Creating Contingency Tables from Vectors 123
Creating Contingency Tables from Complicated Data 123
Creating Custom Contingency Tables 126
Creating Contingency Tables from Matrix Objects 128
Selecting Parts of a Table Object 130
Converting an Object into a Table 132
Testing for Table Objects 133
Complex (Flat) Tables 134
Making "Flat" Contingency Tables 134
Making Selective "Flat" Contingency Tables 138
Testing "Flat" Table Objects 139
Summary Commands for Tables 139
Cross Tabulation 142
Testing Cross-Table (xtabs) Objects 144
A Better Class Test 144
Recreating Original Data from a Contingency Table 145
Switching Class 146
Summary 147
Chapter 5: Data: Distribution 151
Looking at the Distribution of Data 151
Stem and Leaf Plot 152
Histograms 154
Density Function 158
Using the Density Function to Draw a Graph 159
Adding Density Lines to Existing Graphs 160
Types of Data Distribution 161
The Normal Distribution 161
Other Distributions 164
Random Number Generation and Control 166
Random Numbers and Sampling 168
The Shapiro-Wilk Test for Normality 171
The Kolmogorov-Smirnov Test 172
Quantile-Quantile Plots 174
A Basic Normal Quantile-Quantile Plot 174
Adding a Straight Line to a QQ Plot 174
Plotting the Distribution of One Sample Against Another 175
Summary 177
Chapter 6: Simple Hypothesis Testing 181
Using the Student's t-test 181
Two-Sample t-Test with Unequal Variance 182
Two-Sample t-Test with Equal Variance 183
One-Sample t-Testing 183
Using Directional Hypotheses 183
Formula Syntax and Subsetting Samples in the t-Test 184
The Wilcoxon U-Test (Mann-Whitney) 188
Two-Sample U-Test 189
One-Sample U-Test 189
Using Directional Hypotheses 189
Formula Syntax and Subsetting Samples in the U-Test 190
Paired t- and U-Tests 193
Correlation and Covariance 196
Simple Correlation 197
Covariance 199
Significance Testing in Correlation Tests 199
Formula Syntax 200
Tests for Association 203
Multiple Categories: Chi-Squared Tests 204
Monte Carlo Simulation 205
Yates' Correction for 2 × 2 Tables 206
Single Category: Goodness of Fit Tests 206
Summary 210
Chapter 7: Introduction to Graphical Analysis 215
Box-whisker Plots 215
Basic Boxplots 216
Customizing Boxplots 217
Horizontal Boxplots 218
Scatter Plots 222
Basic Scatter Plots 222
Adding Axis Labels 223
Plotting Symbols 223
Setting Axis Limits 224
Using Formula Syntax 225
Adding Lines of Best-Fit to Scatter Plots 225
Pairs Plots (Multiple Correlation Plots) 229
Line Charts 232
Line Charts Using Numeric Data 232
Line Charts Using Categorical Data 233
Pie Charts 236
Cleveland Dot Charts 239
Bar Charts 245
Single-Category Bar Charts 245
Multiple Category Bar Charts 250
Stacked Bar Charts 250
Grouped Bar Charts 250
Horizontal Bars 253
Bar Charts from Summary Data 253
Copy Graphics to Other Applications 256
Use Copy/Paste to Copy Graphs 257
Save a Graphic to Disk 257
Windows 257
Macintosh 258
Linux 258
Summary 259
Chapter 8: Formula Notation and Complex Statistics 263
Examples of Using Formula Syntax for Basic Tests 264
Formula Notation in Graphics 266
Analysis of Variance (ANOVA) 268
One-Way ANOVA 268
Stacking the Data before Running Analysis of Variance 269
Running aov() Commands 270
Simple Post-hoc Testing 271
Extracting Means from aov() Models 271
Two-Way ANOVA 273
More about Post-hoc Testing 275
Graphical Summary of ANOVA 277
Graphical Summary of Post-hoc Testing 278
Extracting Means and Summary Statistics 281
Model Tables 281
Table Commands 283
Interaction Plots 283
More Complex ANOVA Models 289
Other Options for aov() 290
Replications and Balance 290
Summary 292
Chapter 9: Manipulating Data and Extracting Components 295
Creating Data for Complex Analysis 295
Data Frames 296
Matrix Objects 299
Creating and Setting Factor Data 300
Making Replicate Treatment Factors 304
Adding Rows or Columns 306
Summarizing Data 312
Simple Column and Row Summaries 312
Complex Summary Functions 313
The rowsum() Command 314
The apply() Command 315
Using tapply() to Summarize Using a Grouping Variable 316
The aggregate() Command 319
Summary 323
Chapter 10: Regression (Linear Modeling) 327
Simple Linear Regression 328
Linear Model Results Objects 329
Coefficients 330
Fitted Values 330
Residuals 330
Formula 331
Best-Fit Line 331
Similarity between lm() and aov() 334
Multiple Regression 335
Formulae and Linear Models 335
Model Building 337
Adding Terms with Forward Stepwise Regression 337
Removing Terms with Backwards Deletion 339
Comparing Models 341
Curvilinear Regression 343
Logarithmic Regression 344
Polynomial Regression 345
Plotting Linear Models and Curve Fitting 347
Best-Fit Lines 348
Adding Line of Best-Fit with abline() 348
Calculating Lines with fitted() 348
Producing Smooth Curves using spline() 350
Confidence Intervals on Fitted Lines 351
Summarizing Regression Models 356
Diagnostic Plots 356
Summary of Fit 357
Summary 359
Chapter 11: More About Graphs 363
Adding Elements to Existing Plots 364
Error Bars 364
Using the segments() Command for Error Bars 364
Using the arrows() Command to Add Error Bars 368
Adding Legends to Graphs 368
Color Palettes 370
Placing a Legend on an Existing Plot 371
Adding Text to Graphs 372
Making Superscript and Subscript Axis Titles 373
Orienting the Axis Labels 375
Making Extra Space in the Margin for Labels 375
Setting Text and Label Sizes 375
Adding Text to the Plot Area 376
Adding Text in the Plot Margins 378
Creating Mathematical Expressions 379
Adding Points to an Existing Graph 382
Adding Various Sorts of Lines to Graphs 386
Adding Straight Lines as Gridlines or Best-Fit Lines 386
Making Curved Lines to Add to Graphs 388
Plotting Mathematical Expressions 390
Adding Short Segments of Lines to an Existing Plot 393
Adding Arrows to an Existing Graph 394
Matrix Plots (Multiple Series on One Graph) 396
Multiple Plots in One Window 399
Splitting the Plot Window into Equal Sections 399
Splitting the Plot Window into Unequal Sections 402
Exporting Graphs 405
Using Copy and Paste to Move a Graph 406
Saving a Graph to a File 406
Windows 406
Macintosh 406
Linux 406
Using the Device Driver to Save a Graph to Disk 407
PNG Device Driver 407
PDF Device Driver 407
Copying a Graph from Screen to Disk File 408
Making a New Graph Directly to a Disk File 408
Summary 410
Chapter 12: Writing Your Own Scripts: Beginning to Program 415
Copy and Paste Scripts 416
Make Your Own Help File as Plaintext 416
Using Annotations with the # Character 417
Creating Simple Functions 417
One-Line Functions 417
Using Default Values in Functions 418
Simple Customized Functions with Multiple Lines 419
Storing Customized Functions 420
Making Source Code 421
Displaying the Results of Customized Functions and Scripts 421
Displaying Messages as Part of Script Output 422
Simple Screen Text 422
Display a Message and Wait for User Intervention 424
Summary 428
Appendix: Answers to Exercises 433
Index 461
Dr. Mark Gardener is an ecologist, lecturer, and writer working in the UK. He is currently self-employed and runs courses in ecology, data analysis, and R for a variety of organizations.