Practice Set 21

Questions 201–210 (10 questions)

200

An automotive company is using computer vision in its autonomous cars. The company has trained its models successfully by using transfer learning from a convolutional neural network (CNN). The models are trained with PyTorch through the use of the Amazon SageMaker SDK. The company wants to reduce the time that is required for performing inferences, given the low latency that is required for self-driving. Which solution should the company use to evaluate and improve the performance of the models?

Voted answers: C (16 votes, most voted)

201

A company's machine learning (ML) specialist is designing a scalable data storage solution for Amazon SageMaker. The company has an existing TensorFlow-based model that uses a train.py script. The model relies on static training data that is currently stored in TFRecord format. What should the ML specialist do to provide the training data to SageMaker with the LEAST development overhead?

Voted answers: D (19 votes, most voted)

202

An ecommerce company wants to train a large image classification model with 10,000 classes. The company runs multiple model training iterations and needs to minimize operational overhead and cost. The company also needs to avoid loss of work and model retraining. Which solution will meet these requirements?

Voted answers: D (19 votes, most voted)

203

A retail company uses a machine learning (ML) model for daily sales forecasting. The model has provided inaccurate results for the past 3 weeks. At the end of each day, an AWS Glue job consolidates the input data that is used for the forecasting with the actual daily sales data and the predictions of the model. The AWS Glue job stores the data in Amazon S3. The company's ML team determines that the inaccuracies are occurring because of a change in the value distributions of the model features. The ML team must implement a solution that will detect when this type of change occurs in the future. Which solution will meet these requirements with the LEAST amount of operational overhead?

Voted answers: A (17 votes, most voted), B (1 vote)
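
To make "a change in the value distributions of the model features" concrete, the sketch below computes a population stability index (PSI) between a baseline sample and a recent sample of one feature. This is only an illustration of the underlying idea, not the managed solution the question asks for. The function name, the bin count, the floor value for empty bins, and the sample values are all assumptions chosen for the example.

```python
import math


def population_stability_index(baseline, current, bins=10):
    """Return the PSI between two samples of one numeric feature.

    Bin edges come from the baseline sample; values outside the
    baseline range are clamped into the first or last bin.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant features

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each proportion so the logarithm below is always defined.
        return [max(c / len(values), 1e-6) for c in counts]

    p = proportions(baseline)
    q = proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Hypothetical feature values: the recent sample is shifted upward.
baseline_sample = [float(v) for v in range(100)]
shifted_sample = [v + 50.0 for v in baseline_sample]

psi_same = population_stability_index(baseline_sample, baseline_sample)
psi_shifted = population_stability_index(baseline_sample, shifted_sample)
```

A PSI of zero means the two binned distributions match. Larger values indicate a bigger shift, and the alerting threshold is a choice that depends on the use case.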

204

A machine learning (ML) specialist has prepared and used a custom container image with Amazon SageMaker to train an image classification model. The ML specialist is performing hyperparameter optimization (HPO) with this custom container image to produce a higher quality image classifier. The ML specialist needs to determine whether HPO with the SageMaker built-in image classification algorithm will produce a better model than the model produced by HPO with the custom container image. All ML experiments and HPO jobs must be invoked from scripts inside SageMaker Studio notebooks. How can the ML specialist meet these requirements in the LEAST amount of time?

Voted answers: C (22 votes, most voted), D (22 votes), B (14 votes)

205

A company wants to deliver digital car management services to its customers. The company plans to analyze data to predict the likelihood of users changing cars. The company has 10 TB of data that is stored in an Amazon Redshift cluster. The company's data engineering team is using Amazon SageMaker Studio for data analysis and model development. Only a subset of the data is relevant for developing the machine learning models. The data engineering team needs a secure and cost-effective way to export the data to a data repository in Amazon S3 for model development. Which solutions will meet these requirements? (Choose two.)

Voted answers: CE (17 votes, most voted), AE (5 votes)

206

A company is building an application that can predict spam email messages based on email text. The company can generate a few thousand human-labeled datasets that contain a list of email messages and a label of "spam" or "not spam" for each email message. A machine learning (ML) specialist wants to use transfer learning with a Bidirectional Encoder Representations from Transformers (BERT) model that is trained on English Wikipedia text data. What should the ML specialist do to initialize the model to fine-tune the model with the custom data?

Voted answers: B (12 votes, most voted), D (9 votes)

207

A company is using a legacy telephony platform and has several years remaining on its contract. The company wants to move to AWS and wants to implement the following machine learning features:
• Call transcription in multiple languages
• Categorization of calls based on the transcript
• Detection of the main customer issues in the calls
• Customer sentiment analysis for each line of the transcript, with positive or negative indication and scoring of that sentiment
Which AWS solution will meet these requirements with the LEAST amount of custom model training?

Voted answers: C (20 votes, most voted), B (18 votes)

208

A finance company needs to forecast the price of a commodity. The company has compiled a dataset of historical daily prices. A data scientist must train various forecasting models on 80% of the dataset and must validate the efficacy of those models on the remaining 20% of the dataset. How should the data scientist split the dataset into a training dataset and a validation dataset to compare model performance?

Voted answers: A (16 votes, most voted), D (2 votes)
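
For time-ordered data such as daily prices, a common practice is to split chronologically, so that the validation period comes strictly after the training period and no future information leaks into training. The sketch below shows one way to do that for an 80/20 split. The function name and the price values are hypothetical, and the sketch is not tied to any particular answer option.

```python
from datetime import date, timedelta


def chronological_split(rows, train_fraction=0.8):
    """Split (date, price) rows into (train, validation) by time order.

    Rows are sorted by date and cut once; nothing is shuffled, so every
    validation row is later than every training row.
    """
    ordered = sorted(rows, key=lambda row: row[0])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Hypothetical data: ten consecutive days of made-up prices.
start = date(2024, 1, 1)
prices = [(start + timedelta(days=i), 100.0 + i) for i in range(10)]

train, validation = chronological_split(prices)
```

With ten rows, the first eight days form the training set and the last two form the validation set.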

209

A retail company wants to build a recommendation system for the company's website. The system needs to provide recommendations for existing users and needs to base those recommendations on each user's past browsing history. The system also must filter out any items that the user previously purchased. Which solution will meet these requirements with the LEAST development effort?

Voted answers: C (17 votes, most voted), B (5 votes)