Practice Set 34
Questions 331–340 (10 questions)
A company maintains a 2 TB dataset that contains information about customer behaviors. The company stores the dataset in Amazon S3. The company stores a trained model container in Amazon Elastic Container Registry (Amazon ECR). A machine learning (ML) specialist needs to score a batch model for the dataset to predict customer behavior. The ML specialist must select a scalable approach to score the model. Which solution will meet these requirements MOST cost-effectively?
Community votes: B (4 votes, most voted)
A data scientist is implementing a deep learning neural network model for an object detection task on images. The data scientist wants to experiment with a large number of parallel hyperparameter tuning jobs to find hyperparameters that optimize compute time. The data scientist must ensure that jobs that underperform are stopped. The data scientist must allocate computational resources to well-performing hyperparameter configurations. The data scientist is using the hyperparameter tuning job to tune the stochastic gradient descent (SGD) learning rate, momentum, epoch, and mini-batch size. Which technique will meet these requirements with LEAST computational time?
Community votes: D (7 votes, most voted)
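The behavior described in this question (stop underperforming jobs, give more resources to promising configurations) is the idea behind successive halving, the subroutine at the core of the Hyperband strategy. The following is a minimal sketch of that subroutine only, not of SageMaker's implementation. The `toy_loss` function, the candidate count of 16, and the sampled ranges are hypothetical stand-ins chosen for illustration.

```python
import random


def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Keep the best 1/eta of the configs each round and multiply the budget by eta."""
    survivors = list(configs)
    budget = min_budget
    history = []  # (budget per config, number of configs evaluated) for each round
    while len(survivors) > 1:
        history.append((budget, len(survivors)))
        # Lower loss is better, so the best configurations sort first.
        ranked = sorted(survivors, key=lambda cfg: evaluate(cfg, budget))
        keep = max(1, len(ranked) // eta)
        survivors = ranked[:keep]  # underperformers are stopped here
        budget *= eta  # survivors receive a larger budget next round
    return survivors[0], history


def toy_loss(config, budget):
    # Hypothetical stand-in for "validation loss after `budget` epochs".
    learning_rate, momentum = config
    return abs(learning_rate - 0.01) * 10 + abs(momentum - 0.9) + 1.0 / budget


rng = random.Random(0)
# Hypothetical search space: 16 random (learning rate, momentum) pairs.
configs = [(10 ** rng.uniform(-4, -1), rng.uniform(0.5, 0.99)) for _ in range(16)]

best, history = successive_halving(configs, toy_loss)
for budget, count in history:
    print(f"budget={budget} epochs, configs evaluated={count}")
print("best config:", best)
```

With 16 candidates and `eta=2`, only two configurations ever run at the largest budget, which is where the compute saving over training every candidate to completion comes from.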
An agriculture company wants to improve crop yield forecasting for the upcoming season by using crop yields from the last three seasons. The company wants to compare the performance of its new scikit-learn model to the benchmark. A data scientist needs to package the code into a container that computes both the new model forecast and the benchmark. The data scientist wants AWS to be responsible for the operational maintenance of the container. Which solution will meet these requirements?
Community votes: A (5 votes, most voted), D (3 votes), C (1 vote)
A cybersecurity company is collecting on-premises server logs, mobile app logs, and IoT sensor data. The company backs up the ingested data in an Amazon S3 bucket and sends the ingested data to Amazon OpenSearch Service for further analysis. Currently, the company has a custom ingestion pipeline that is running on Amazon EC2 instances. The company needs to implement a new serverless ingestion pipeline that can automatically scale to handle sudden changes in the data flow. Which solution will meet these requirements MOST cost-effectively?
Community votes: C (6 votes, most voted), D (5 votes)
A bank has collected customer data for 10 years in CSV format. The bank stores the data in an on-premises server. A data science team wants to use Amazon SageMaker to build and train a machine learning (ML) model to predict churn probability. The team will use the historical data. The data scientists want to perform data transformations quickly and to generate data insights before the team builds a model for production. Which solution will meet these requirements with the LEAST development effort?
Community votes: B (4 votes, most voted)
A media company wants to deploy a machine learning (ML) model that uses Amazon SageMaker to recommend new articles to the company's readers. The company's readers are primarily located in a single city. The company notices that the heaviest reader traffic predictably occurs early in the morning, after lunch, and again after work hours. There is very little traffic at other times of day. The media company needs to minimize the time required to deliver recommendations to its readers. The expected amount of data that the API call will return for inference is less than 4 MB. Which solution will meet these requirements in the MOST cost-effective way?
Community votes: B (7 votes, most voted), A (1 vote)
A machine learning (ML) engineer is using Amazon SageMaker automatic model tuning (AMT) to optimize a model's hyperparameters. The ML engineer notices that the tuning jobs take a long time to run. The tuning jobs continue even when the jobs are not significantly improving against the objective metric. The ML engineer needs the training jobs to optimize the hyperparameters more quickly. How should the ML engineer configure the SageMaker AMT data types to meet these requirements?
Community votes: D (5 votes, most voted), A (1 vote)
A global bank requires a solution to predict whether customers will leave the bank and choose another bank. The bank is using a dataset to train a model to predict customer loss. The training dataset has 1,000 rows. The training dataset includes 100 instances of customers who left the bank. A machine learning (ML) specialist is using Amazon SageMaker Data Wrangler to train a churn prediction model by using a SageMaker training job. After training, the ML specialist notices that the model returns only false results. The ML specialist must correct the model so that it returns more accurate predictions. Which solution will meet these requirements?
Community votes: B (2 votes, most voted)
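The figures in this question make the underlying problem easy to check by hand: only 100 of the 1,000 rows are churned customers, so a model that always answers "false" still looks accurate. The short calculation below uses only the numbers stated in the question; treating "false" as "did not churn" is an assumption about the label encoding.

```python
# Figures from the question: 1,000 rows, 100 of them customers who left.
total_rows = 1000
churned = 100
retained = total_rows - churned

# A degenerate model that predicts "False" (no churn) for every row.
true_positives = 0
true_negatives = retained
false_negatives = churned

accuracy = (true_positives + true_negatives) / total_rows
recall = true_positives / (true_positives + false_negatives)

print(f"accuracy={accuracy:.2f} recall={recall:.2f}")
# prints: accuracy=0.90 recall=0.00
```

The 90% accuracy hides the fact that no churned customer is ever identified, which is why the minority class needs attention before retraining.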
A banking company provides financial products to customers around the world. A machine learning (ML) specialist collected transaction data from internal customers. The ML specialist split the dataset into training, testing, and validation datasets. The ML specialist analyzed the training dataset by using Amazon SageMaker Clarify. The analysis found that the training dataset contained fewer examples of customers in the 40 to 55 year-old age group compared to the other age groups. Which type of pretraining bias did the ML specialist observe in the training dataset?
Community votes: B (7 votes, most voted), C (1 vote)
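One pretraining metric that SageMaker Clarify reports for this kind of under-representation is class imbalance (CI), defined as CI = (n_a - n_d) / (n_a + n_d), where n_a and n_d are the number of training examples in facet a and facet d. The sketch below computes it directly. The counts are hypothetical (the question gives none), and treating the 40 to 55 age group as facet d is an assumption made for illustration.

```python
def class_imbalance(n_a: int, n_d: int) -> float:
    """Class imbalance: (n_a - n_d) / (n_a + n_d), a value in [-1, 1]."""
    if n_a + n_d == 0:
        raise ValueError("at least one facet must contain examples")
    return (n_a - n_d) / (n_a + n_d)


# Hypothetical counts, not taken from the question.
n_d = 150  # customers aged 40 to 55 (the under-represented facet)
n_a = 850  # customers in all other age groups

ci = class_imbalance(n_a, n_d)
print(f"CI = {ci:.2f}")
# prints: CI = 0.70
```

A value near 0 indicates balanced facets; a value near +1 means facet a dominates the training data, and a value near -1 means facet d dominates.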
A tourism company uses a machine learning (ML) model to make recommendations to customers. The company uses an Amazon SageMaker environment and has set the hyperparameter tuning completion criteria to MaxNumberOfTrainingJobs. An ML specialist wants to change the hyperparameter tuning completion criteria. The ML specialist wants to stop tuning immediately after an internal algorithm determines that the tuning job is unlikely to improve more than 1% over the objective metric from the best training job. Which completion criteria will meet this requirement?
Community votes: C (6 votes, most voted)
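For orientation, the fragment below sketches where completion criteria sit in a tuning job configuration. It only builds a Python dictionary and makes no AWS call. The field names follow the `HyperParameterTuningJobConfig` shape of the SageMaker `CreateHyperParameterTuningJob` request as understood here and should be verified against the current API reference. The job limits are hypothetical values.

```python
# Sketch of a tuning job config fragment. No AWS request is made.
# Field names are assumed from the CreateHyperParameterTuningJob request shape;
# verify them against the current SageMaker API reference before use.
tuning_job_config_fragment = {
    "ResourceLimits": {
        "MaxNumberOfTrainingJobs": 50,  # hypothetical upper bound on jobs
        "MaxParallelTrainingJobs": 5,   # hypothetical parallelism
    },
    "TuningJobCompletionCriteria": {
        # Asks AMT to stop once its internal algorithm detects convergence.
        "ConvergenceDetected": {"CompleteOnConvergence": "Enabled"},
    },
}

criteria = tuning_job_config_fragment["TuningJobCompletionCriteria"]
print(criteria)
```

Other criteria that, as far as is known here, can appear in the same block include `TargetObjectiveMetricValue` and `BestObjectiveNotImproving` with `MaxNumberOfTrainingJobsNotImproving`.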