Reference: snippets from SageMaker documentation
You might have in-house example data repositories or use publicly available datasets. Typically, you pull the dataset or datasets into a single repository.
Before using a dataset to train a model, data scientists typically explore, analyze, and preprocess it. You can use a Jupyter notebook on an Amazon SageMaker notebook instance to do so.
You should inspect the data and clean it as needed (for example, if your data has a "country name" attribute with the values "United States" and "US", you might want to edit the data to be consistent). You may also want to perform additional data transformations, such as combining attributes.
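The cleaning and attribute-combination steps above can be sketched with pandas in a notebook. The column names and values here are hypothetical, not from the SageMaker documentation:

```python
import pandas as pd

# Hypothetical dataset with an inconsistent "country" column.
df = pd.DataFrame({
    "country": ["United States", "US", "Canada", "US"],
    "price": [10.0, 12.5, 9.0, 11.0],
    "tax": [1.0, 1.25, 0.9, 1.1],
})

# Clean: normalize inconsistent country names to a single spelling.
df["country"] = df["country"].replace({"US": "United States"})

# Combine attributes: derive a total_cost feature from price and tax.
df["total_cost"] = df["price"] + df["tax"]

print(df["country"].unique())  # ['United States' 'Canada']
```

The same kind of transformation can be scaled up inside a SageMaker Processing job when the dataset no longer fits in a notebook.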
SageMaker Processing lets you run jobs to preprocess and postprocess data, perform feature engineering, and evaluate models on Amazon SageMaker easily and at scale. You can use the built-in data processing containers or bring your own containers, and submit custom jobs to run on managed infrastructure.
Once the data is ready, store it in an S3 bucket.
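Uploading the prepared dataset to S3 is a single boto3 call. A minimal sketch, where the bucket name and key prefix are placeholders you would replace with your own:

```python
bucket = "my-example-bucket"  # hypothetical bucket name
prefix = "example-project"    # hypothetical key prefix

def s3_key(local_path: str) -> str:
    """Build the S3 object key for a local file under the project prefix."""
    return f"{prefix}/{local_path.rsplit('/', 1)[-1]}"

def upload_training_data(local_path: str) -> str:
    """Upload a local file to S3 and return its s3:// URI for training."""
    import boto3  # requires AWS credentials to actually upload
    key = s3_key(local_path)
    boto3.client("s3").upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"
```

The returned s3:// URI is what you later pass to the training job as the input data location.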
Train the model
To train a model in SageMaker, you create a training job. The training job includes the following information: the URL of the Amazon S3 bucket where you've stored the training data, the compute resources you want SageMaker to use for training, the URL of the S3 bucket where you want to store the output of the job, and the Amazon ECR path where the training code is stored.
You have several options for a training algorithm: use a built-in algorithm provided by SageMaker, use a supported deep learning framework (such as TensorFlow or PyTorch) with your own training script, bring your own custom algorithm packaged in a Docker container, or subscribe to an algorithm from AWS Marketplace.
You can create a training job with the Amazon SageMaker console or the API. After you create the training job, Amazon SageMaker launches the ML compute instances and uses the training code and the training dataset to train the model. Depending on the size of your training dataset and how quickly you need the results, you can use resources ranging from a single general-purpose instance to a distributed cluster of GPU instances.
You can use SageMaker Debugger to inspect training parameters and data throughout the training process when working with the TensorFlow, PyTorch, and Apache MXNet deep learning frameworks. Debugger automatically detects and alerts users to commonly occurring errors, such as parameter values getting too large or too small.
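The training-job information described above maps directly onto the boto3 CreateTrainingJob request. A sketch that assembles the request as a dict; the instance type, volume size, and timeout are illustrative choices, not recommendations:

```python
def training_job_request(job_name: str, role_arn: str,
                         train_s3_uri: str, output_s3_uri: str,
                         image_uri: str) -> dict:
    """Assemble a CreateTrainingJob request: where the training data
    lives, what compute to use, where to write the model artifacts,
    and which training image (algorithm) to run."""
    return {
        "TrainingJobName": job_name,
        "RoleArn": role_arn,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {          # illustrative compute choice
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 30,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }

# To launch (requires credentials and a real role ARN and image URI):
# import boto3
# boto3.client("sagemaker").create_training_job(
#     **training_job_request("my-job", role_arn, train_uri, out_uri, image))
```

For a distributed GPU cluster you would raise InstanceCount and switch InstanceType to a GPU instance family; the rest of the request is unchanged.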
Evaluate the model
SageMaker saves the resulting model artifacts and other output in the S3 bucket you specified for that purpose. To evaluate your trained model, you use either the AWS SDK for Python (Boto) or the high-level Python library that Amazon SageMaker provides to send requests to the model for inferences. You use a Jupyter notebook in your Amazon SageMaker notebook instance to train and evaluate your model.
You can evaluate your model using historical data (offline testing) or live data (online testing).
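For offline testing, a common pattern is to send a held-out dataset to the model and compare its predictions against the known labels. A minimal sketch with hypothetical predictions and labels:

```python
def accuracy(predictions, ground_truth):
    """Fraction of predictions that match the held-out labels."""
    matches = sum(p == t for p, t in zip(predictions, ground_truth))
    return matches / len(ground_truth)

# Hypothetical holdout set: model predictions vs. known labels.
preds = [1, 0, 1, 1, 0]
labels = [1, 0, 0, 1, 0]
print(accuracy(preds, labels))  # 0.8
```

In practice the predictions would come from inference requests sent to the trained model, and you would likely track several metrics (precision, recall, and so on), not just accuracy.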
Deploy the model
After you train your model, you can deploy it to get predictions in one of two ways: set up a persistent endpoint with Amazon SageMaker hosting services to get one prediction at a time, or use SageMaker batch transform to get predictions for an entire dataset.
Deploying a model using Amazon SageMaker hosting services is a three-step process: create a model in SageMaker (pointing to the model artifacts and the inference code), create an endpoint configuration (specifying the models and the ML compute instances to deploy), and create the endpoint.
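The three hosting steps correspond to three boto3 calls: create_model, create_endpoint_config, and create_endpoint. A sketch building the request for each step; the variant name and instance type are illustrative:

```python
def model_request(name: str, image_uri: str,
                  artifact_s3_uri: str, role_arn: str) -> dict:
    """Step 1: CreateModel -- points SageMaker at the inference image
    and the model artifacts produced by training."""
    return {
        "ModelName": name,
        "PrimaryContainer": {"Image": image_uri,
                             "ModelDataUrl": artifact_s3_uri},
        "ExecutionRoleArn": role_arn,
    }

def endpoint_config_request(config_name: str, model_name: str) -> dict:
    """Step 2: CreateEndpointConfig -- chooses instance type and count
    for one or more production variants."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",   # illustrative variant name
            "ModelName": model_name,
            "InstanceType": "ml.m5.large", # illustrative instance type
            "InitialInstanceCount": 1,
        }],
    }

def endpoint_request(endpoint_name: str, config_name: str) -> dict:
    """Step 3: CreateEndpoint -- launches the hosted endpoint."""
    return {"EndpointName": endpoint_name,
            "EndpointConfigName": config_name}

# Applying the three steps (requires credentials):
# import boto3
# sm = boto3.client("sagemaker")
# sm.create_model(**model_request("m1", image, artifacts, role))
# sm.create_endpoint_config(**endpoint_config_request("cfg1", "m1"))
# sm.create_endpoint(**endpoint_request("ep1", "cfg1"))
```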
You can deploy a model trained with Amazon SageMaker to your own deployment target. To do that, you need to know the algorithm-specific format of the model artifacts that were generated by model training.
You traditionally re-engineer a model before you integrate it with your application and deploy it. With Amazon SageMaker hosting services, you can deploy your model independently, decoupling it from your application code.
Invoke the model
To get inferences from the model, client applications send requests to the Amazon SageMaker Runtime HTTPS endpoint. You can also send requests to this endpoint from your Jupyter notebook during testing. However, endpoints are scoped to an individual AWS account, and are not public. The URL does not contain the account ID, but Amazon SageMaker determines the account ID from the authentication token that is supplied by the caller. This means if the client application is not within the scope of your account, it cannot hit that endpoint. However, you can use Amazon API Gateway and AWS Lambda to set up and deploy a web service that you can call from such a client application.
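From within your account, the request to the endpoint goes through the sagemaker-runtime client's invoke_endpoint call. A sketch that serializes one record as CSV (a format many built-in algorithms accept); the endpoint name is a placeholder:

```python
def build_payload(features) -> bytes:
    """Serialize one record as a CSV line for the inference request."""
    return ",".join(str(f) for f in features).encode("utf-8")

def get_inference(endpoint_name: str, features) -> str:
    """Send one record to a hosted endpoint and return the raw response body."""
    import boto3  # requires credentials in the endpoint's account
    runtime = boto3.client("sagemaker-runtime")
    resp = runtime.invoke_endpoint(
        EndpointName=endpoint_name,   # e.g. "my-endpoint" (placeholder)
        ContentType="text/csv",
        Body=build_payload(features),
    )
    return resp["Body"].read().decode("utf-8")
```

A client outside your account would instead call an API Gateway URL, with a Lambda function performing this invoke_endpoint call on its behalf.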
Generate ground truth
To increase a model's accuracy, you might choose to save the user's input data and ground truth, if available, as part of the training data. You can then retrain the model periodically with a larger, improved training dataset.
Update the model
You can modify an endpoint without taking models that are already deployed into production out of service. For example, you can add new model variants, update the ML Compute instance configurations of existing model variants, or change the distribution of traffic among model variants. To modify an endpoint, you provide a new endpoint configuration. Amazon SageMaker implements the changes without any downtime.
Changing or deleting model artifacts or changing inference code after deploying a model produces unpredictable results. If you need to change or delete model artifacts or change inference code, modify the endpoint by providing a new endpoint configuration. Once you provide the new endpoint configuration, you can change or delete the model artifacts corresponding to the old endpoint configuration.
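One common endpoint modification is shifting traffic between model variants. The sketch below builds a new endpoint configuration that splits traffic between an existing variant and a new one (a canary-style rollout); the variant names, instance type, and 90/10 split are all illustrative:

```python
def two_variant_config(config_name: str, current_model: str,
                       new_model: str, new_weight: float = 0.1) -> dict:
    """A new endpoint config splitting traffic between an existing
    variant and a new one. Weights are relative; here they sum to 1."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {"VariantName": "Current", "ModelName": current_model,
             "InstanceType": "ml.m5.large", "InitialInstanceCount": 1,
             "InitialVariantWeight": 1.0 - new_weight},
            {"VariantName": "New", "ModelName": new_model,
             "InstanceType": "ml.m5.large", "InitialInstanceCount": 1,
             "InitialVariantWeight": new_weight},
        ],
    }

# Applying it without downtime (requires credentials):
# import boto3
# sm = boto3.client("sagemaker")
# sm.create_endpoint_config(**two_variant_config("cfg-v2", "model-v1", "model-v2"))
# sm.update_endpoint(EndpointName="my-endpoint", EndpointConfigName="cfg-v2")
```

As confidence in the new variant grows, you create further configurations with a larger weight for it, eventually retiring the old variant entirely.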
Monitor the model
Amazon SageMaker Model Monitor lets developers set alerts for deviations in model quality, such as data drift and anomalies.