Put the power of AWS Cloud machine learning services to work in your business and commercial applications!
Machine Learning in the AWS Cloud introduces readers to the machine learning (ML) capabilities of the Amazon Web Services ecosystem and provides practical examples to solve real-world regression and classification problems. While readers do not need prior ML experience, they are expected to have some knowledge of Python and a basic knowledge of Amazon Web Services.
Part One introduces readers to fundamental machine learning concepts. You will learn about the types of ML systems, how they are used, and challenges you may face with ML solutions. Part Two focuses on machine learning services provided by Amazon Web Services. You'll be introduced to the basics of cloud computing and AWS offerings in the cloud-based machine learning space. Then you'll learn to use Amazon Machine Learning to solve a simpler class of machine learning problems, and Amazon SageMaker to solve more complex problems.
* Learn techniques for data preprocessing, basic feature engineering, data visualization, and model building
* Discover common neural network frameworks with Amazon SageMaker
* Solve computer vision problems with Amazon Rekognition
* Benefit from illustrations, source code examples, and sidebars in each chapter
The book appeals to both Python developers and technical/solution architects. Developers will find concrete examples that show them how to perform common ML tasks with Python on AWS. Technical/solution architects will find useful information on the machine learning capabilities of the AWS ecosystem.
ABOUT THE AUTHOR
ABHISHEK MISHRA has more than 19 years' experience across a broad range of enterprise technologies. He consults as a security and fraud solution architect with Lloyds Banking Group PLC in London. He is the author of Amazon Web Services for Mobile Developers.
Introduction xxiii
Part 1 Fundamentals of Machine Learning 1
Chapter 1 Introduction to Machine Learning 3
What is Machine Learning? 4
Tools Commonly Used by Data Scientists 4
Common Terminology 5
Real-World Applications of Machine Learning 7
Types of Machine Learning Systems 8
Supervised Learning 8
Unsupervised Learning 9
Semi-Supervised Learning 10
Reinforcement Learning 11
Batch Learning 11
Incremental Learning 12
Instance-based Learning 12
Model-based Learning 12
The Traditional Versus the Machine Learning Approach 13
A Rule-based Decision System 14
A Machine Learning-based System 17
Summary 25
Chapter 2 Data Collection and Preprocessing 27
Machine Learning Datasets 27
Scikit-learn Datasets 27
AWS Public Datasets 30
Kaggle.com Datasets 30
UCI Machine Learning Repository 30
Data Preprocessing Techniques 31
Obtaining an Overview of the Data 31
Handling Missing Values 42
Creating New Features 44
Transforming Numeric Features 46
One-Hot Encoding Categorical Features 47
Summary 50
Chapter 3 Data Visualization with Python 51
Introducing Matplotlib 51
Components of a Plot 54
Figure 55
Axes 55
Axis 56
Axis Labels 56
Grids 57
Title 57
Common Plots 58
Histograms 58
Bar Chart 62
Grouped Bar Chart 63
Stacked Bar Chart 65
Stacked Percentage Bar Chart 67
Pie Charts 69
Box Plot 71
Scatter Plots 73
Summary 78
Chapter 4 Creating Machine Learning Models with Scikit-learn 79
Introducing Scikit-learn 79
Creating a Training and Test Dataset 80
K-Fold Cross Validation 84
Creating Machine Learning Models 86
Linear Regression 86
Support Vector Machines 92
Logistic Regression 101
Decision Trees 109
Summary 114
Chapter 5 Evaluating Machine Learning Models 115
Evaluating Regression Models 115
RMSE Metric 117
R² Metric 119
Evaluating Classification Models 119
Binary Classification Models 119
Multi-Class Classification Models 126
Choosing Hyperparameter Values 131
Summary 132
Part 2 Machine Learning with Amazon Web Services 133
Chapter 6 Introduction to Amazon Web Services 135
What is Cloud Computing? 135
Cloud Service Models 136
Cloud Deployment Models 138
The AWS Ecosystem 139
Machine Learning Application Services 140
Machine Learning Platform Services 141
Support Services 142
Sign Up for an AWS Free-Tier Account 142
Step 1: Contact Information 143
Step 2: Payment Information 145
Step 3: Identity Verification 145
Step 4: Support Plan Selection 147
Step 5: Confirmation 148
Summary 148
Chapter 7 AWS Global Infrastructure 151
Regions and Availability Zones 151
Edge Locations 153
Accessing AWS 154
The AWS Management Console 156
Summary 160
Chapter 8 Identity and Access Management 161
Key Concepts 161
Root Account 161
User 162
Identity Federation 162
Group 163
Policy 164
Role 164
Common Tasks 165
Creating a User 167
Modifying Permissions Associated with an Existing Group 172
Creating a Role 173
Securing the Root Account with MFA 176
Setting Up an IAM Password Rotation Policy 179
Summary 180
Chapter 9 Amazon S3 181
Key Concepts 181
Bucket 181
Object Key 182
Object Value 182
Version ID 182
Storage Class 182
Costs 183
Subresources 183
Object Metadata 184
Common Tasks 185
Creating a Bucket 185
Uploading an Object 189
Accessing an Object 191
Changing the Storage Class of an Object 195
Deleting an Object 196
Amazon S3 Bucket Versioning 197
Accessing Amazon S3 Using the AWS CLI 199
Summary 200
Chapter 10 Amazon Cognito 201
Key Concepts 201
Authentication 201
Authorization 201
Identity Provider 202
Client 202
OAuth 2.0 202
OpenID Connect 202
Amazon Cognito User Pool 202
Identity Pool 203
Amazon Cognito Federated Identities 203
Common Tasks 204
Creating a User Pool 204
Retrieving the App Client Secret 213
Creating an Identity Pool 214
User Pools or Identity Pools: Which One Should You Use? 218
Summary 219
Chapter 11 Amazon DynamoDB 221
Key Concepts 221
Tables 222
Global Tables 222
Items 222
Attributes 222
Primary Keys 222
Secondary Indexes 223
Queries 223
Scans 223
Read Consistency 224
Read/Write Capacity Modes 224
Common Tasks 225
Creating a Table 225
Adding Items to a Table 228
Creating an Index 231
Performing a Scan 233
Performing a Query 235
Summary 236
Chapter 12 AWS Lambda 237
Common Use Cases for Lambda 237
Key Concepts 238
Supported Languages 238
Lambda Functions 238
Programming Model 239
Execution Environment 243
Service Limitations 244
Pricing and Availability 244
Common Tasks 244
Creating a Simple Python Lambda Function Using the AWS Management Console 244
Testing a Lambda Function Using the AWS Management Console 250
Deleting an AWS Lambda Function Using the AWS Management Console 253
Summary 255
Chapter 13 Amazon Comprehend 257
Key Concepts 257
Natural Language Processing 257
Topic Modeling 259
Language Support 259
Pricing and Availability 259
Text Analysis Using the Amazon Comprehend Management Console 260
Interactive Text Analysis with the AWS CLI 262
Entity Detection with the AWS CLI 263
Key Phrase Detection with the AWS CLI 264
Sentiment Analysis with the AWS CLI 265
Using Amazon Comprehend with AWS Lambda 266
Summary 274
Chapter 14 Amazon Lex 275
Key Concepts 275
Bot 275
Client Application 276
Intent 276
Slot 276
Utterance 277
Programming Model 277
Pricing and Availability 278
Creating an Amazon Lex Bot 278
Creating Amazon DynamoDB Tables 278
Creating AWS Lambda Functions 285
Creating the Chatbot 304
Customizing the AccountOverview Intent 308
Customizing the ViewTransactionList Intent 312
Testing the Chatbot 314
Summary 315
Chapter 15 Amazon Machine Learning 317
Key Concepts 317
Datasources 318
ML Model 318
Regularization 319
Training Parameters 319
Descriptive Statistics 320
Pricing and Availability 321
Creating Datasources 321
Creating the Training Datasource 324
Creating the Test Datasource 330
Viewing Data Insights 332
Creating an ML Model 337
Making Batch Predictions 341
Creating a Real-Time Prediction Endpoint for Your Machine Learning Model 346
Making Predictions Using the AWS CLI 347
Using Real-Time Prediction Endpoints with Your Applications 349
Summary 350
Chapter 16 Amazon SageMaker 353
Key Concepts 353
Programming Model 354
Amazon SageMaker Notebook Instances 354
Training Jobs 354
Prediction Instances 355
Prediction Endpoint and Endpoint Configuration 355
Amazon SageMaker Batch Transform 355
Data Channels 355
Data Sources and Formats 356
Built-in Algorithms 356
Pricing and Availability 357
Creating an Amazon SageMaker Notebook Instance 357
Preparing Test and Training Data 362
Training a Scikit-learn Model on an Amazon SageMaker Notebook Instance 364
Training a Scikit-learn Model on a Dedicated Training Instance 368
Training a Model Using a Built-in Algorithm on a Dedicated Training Instance 379
Summary 384
Chapter 17 Using Google TensorFlow with Amazon SageMaker 387
Introduction to Google TensorFlow 387
Creating a Linear Regression Model with Google TensorFlow 390
Training and Deploying a DNN Classifier Using the TensorFlow Estimators API and Amazon SageMaker 408
Summary 419
Chapter 18 Amazon Rekognition 421
Key Concepts 421
Object Detection 421
Object Location 422
Scene Detection 422
Activity Detection 422
Facial Recognition 422
Face Collection 422
API Sets 422
Non-Storage and Storage-Based Operations 423
Model Versioning 423
Pricing and Availability 423
Analyzing Images Using the Amazon Rekognition Management Console 423
Interactive Image Analysis with the AWS CLI 428
Using Amazon Rekognition with AWS Lambda 433
Creating the Amazon DynamoDB Table 433
Creating the AWS Lambda Function 435
Summary 444
Appendix A Anaconda and Jupyter Notebook Setup 445
Installing the Anaconda Distribution 445
Creating a Conda Python Environment 447
Installing Python Packages 449
Installing Jupyter Notebook 451
Summary 454
Appendix B AWS Resources Needed to Use This Book 455
Creating an IAM User for Development 455
Creating S3 Buckets 458
Appendix C Installing and Configuring the AWS CLI 461
Mac OS Users 461
Installing the AWS CLI 461
Configuring the AWS CLI 462
Windows Users 464
Installing the AWS CLI 464
Configuring the AWS CLI 465
Appendix D Introduction to NumPy and Pandas 467
NumPy 467
Creating NumPy Arrays 467
Modifying Arrays 471
Indexing and Slicing 474
Pandas 475
Creating Series and Dataframes 476
Getting Dataframe Information 478
Selecting Data 481
Index 485