Practice Set 29

Questions 281–290 (10 questions)

280

A data scientist is using Amazon Comprehend to perform sentiment analysis on a dataset of one million social media posts.

Which approach will process the dataset in the LEAST time?

Voted answers: C (11 votes, most voted), B (7 votes)

281

A machine learning (ML) specialist at a retail company must build a system to forecast the daily sales for one of the company's stores. The company provided the ML specialist with sales data for this store from the past 10 years. The historical dataset includes the total amount of sales on each day for the store. Approximately 10% of the days in the historical dataset are missing sales data.

The ML specialist builds a forecasting model based on the historical dataset. The specialist discovers that the model does not meet the performance standards that the company requires.

Which action will MOST likely improve the performance for the forecasting model?

Voted answers: D (14 votes, most voted), B (11 votes)
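As background for the missing-data aspect of this scenario, one common way to handle gaps in a daily series is to fill them by linear interpolation between the nearest known days. The following is a minimal sketch in plain Python, not a statement of the intended answer. The function name `interpolate_missing` and the sample values are hypothetical.

```python
from typing import List, Optional


def interpolate_missing(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill interior None gaps by linear interpolation between known neighbours.

    Leading and trailing gaps stay None because there is no known value on
    one side to interpolate from.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    # Walk over each pair of consecutive known points and fill the days between.
    for left, right in zip(known, known[1:]):
        gap = right - left
        for i in range(left + 1, right):
            weight = (i - left) / gap
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


# Hypothetical daily sales totals; None marks a day with no recorded data.
daily_sales = [100.0, None, 120.0, None, None, 150.0, None]
print(interpolate_missing(daily_sales))
```

Linear interpolation assumes sales change smoothly between known days, so it may be a poor fit for series with strong weekly patterns; that trade-off is outside what this sketch covers.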

282

A mining company wants to use machine learning (ML) models to identify mineral images in real time. A data science team built an image recognition model that is based on a convolutional neural network (CNN). The team trained the model on Amazon SageMaker by using GPU instances. The team will deploy the model to a SageMaker endpoint.

The data science team already knows the workload traffic patterns. The team must determine the instance type and configuration for the workloads.

Which solution will meet these requirements with the LEAST development effort?

Voted answers: B (10 votes, most voted), A (6 votes)

283

A company is building custom deep learning models in Amazon SageMaker by using training and inference containers that run on Amazon EC2 instances. The company wants to reduce training costs but does not want to change the current architecture. The SageMaker training job can finish after interruptions. The company can wait days for the results.

Which combination of resources should the company use to meet these requirements MOST cost-effectively? (Choose two.)

Voted answers: BE (13 votes, most voted)
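For context on interruption-tolerant training, SageMaker offers managed spot training, which is typically paired with checkpoints saved to Amazon S3 so that an interrupted job can resume. The sketch below only builds and validates a dictionary of settings and makes no AWS calls. The keyword names follow the SageMaker Python SDK `Estimator` arguments for managed spot training; the helper name `spot_training_kwargs`, the durations, and the bucket URI are hypothetical. It is an illustration, not a statement of the intended answer.

```python
from typing import Any, Dict


def spot_training_kwargs(max_run_s: int, max_wait_s: int, checkpoint_s3_uri: str) -> Dict[str, Any]:
    """Build keyword arguments for a spot-based training job.

    max_wait covers both the time spent waiting for spot capacity and the
    training time itself, so it must not be smaller than max_run.
    """
    if max_wait_s < max_run_s:
        raise ValueError("max_wait must be at least max_run")
    return {
        "use_spot_instances": True,              # request spot capacity for training
        "max_run": max_run_s,                    # upper bound on training time, in seconds
        "max_wait": max_wait_s,                  # upper bound on waiting plus training, in seconds
        "checkpoint_s3_uri": checkpoint_s3_uri,  # where checkpoints are synced for resuming
    }


# Hypothetical values: up to 1 day of training, up to 3 days including waiting.
kwargs = spot_training_kwargs(24 * 3600, 72 * 3600, "s3://example-bucket/checkpoints/")
print(kwargs)
```

These values would be passed to an estimator along with the usual image, role, and instance settings, which are omitted here.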

284

A company hosts a public web application on AWS. The application provides a user feedback feature that consists of free-text fields where users can submit text to provide feedback. The company receives a large amount of free-text user feedback from the online web application. The product managers at the company classify the feedback into a set of fixed categories, including user interface issues, performance issues, new feature requests, and chat issues, for further action by the company's engineering teams.

A machine learning (ML) engineer at the company must automate the classification of new user feedback into these fixed categories by using Amazon SageMaker. A large set of accurate data is available from the historical user feedback that the product managers previously classified.

Which solution should the ML engineer apply to perform multi-class text classification of the user feedback?

Voted answers: B (5 votes, most voted)

285

A digital media company wants to build a customer churn prediction model by using tabular data. The model should clearly indicate whether a customer will stop using the company's services. The company wants to clean the data because the data contains some empty fields, duplicate values, and rare values.

Which solution will meet these requirements with the LEAST development effort?

Voted answers: A (14 votes, most voted), B (9 votes)

286

A data engineer is evaluating customer data in Amazon SageMaker Data Wrangler. The data engineer will use the customer data to create a new model to predict customer behavior.

The engineer needs to increase the model performance by checking for multicollinearity in the dataset.

Which steps can the data engineer take to accomplish this with the LEAST operational effort? (Choose two.)

Voted answers: BE (7 votes, most voted), BD (1 vote)
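To make the term concrete: multicollinearity means that some features are strongly linearly related to each other. A simple first check is to flag feature pairs whose Pearson correlation is close to 1 in absolute value. The sketch below uses that pairwise check in plain Python; it is a simplification, since it does not detect relationships that involve three or more features, and it is not a description of Data Wrangler's own analyses. The feature names, values, and the 0.9 threshold are hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length, non-constant columns."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlated_pairs(
    features: Dict[str, List[float]], threshold: float = 0.9
) -> List[Tuple[str, str, float]]:
    """Return feature pairs whose absolute correlation meets the threshold."""
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(features.items(), 2):
        r = pearson(col_a, col_b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical customer features; annual_spend is exactly 12 x monthly_spend.
customers = {
    "monthly_spend": [10.0, 20.0, 30.0, 40.0, 50.0],
    "annual_spend": [120.0, 240.0, 360.0, 480.0, 600.0],
    "support_calls": [3.0, 1.0, 4.0, 1.0, 5.0],
}
for name_a, name_b, r in correlated_pairs(customers):
    print(f"{name_a} vs {name_b}: r = {r:.3f}")
```

A flagged pair suggests that one of the two features is redundant and could be dropped or combined before training.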

287

A company processes millions of orders every day. The company uses Amazon DynamoDB tables to store order information. When customers submit new orders, the new orders are immediately added to the DynamoDB tables. New orders arrive in the DynamoDB tables continuously.

A data scientist must build a peak-time prediction solution. The data scientist must also create an Amazon QuickSight dashboard to display near real-time order insights. The data scientist needs to build a solution that will give QuickSight access to the data as soon as new order information arrives.

Which solution will meet these requirements with the LEAST delay between when a new order is processed and when QuickSight can access the new order information?

Voted answers: D (11 votes, most voted), C (6 votes), B (1 vote)

288

A data engineer is preparing a dataset that a retail company will use to predict the number of visitors to stores. The data engineer created an Amazon S3 bucket. The engineer subscribed the S3 bucket to an AWS Data Exchange data product for general economic indicators. The data engineer wants to join the economic indicator data to an existing table in Amazon Athena to merge with the business data. All these transformations must finish running in 30-60 minutes.

Which solution will meet these requirements MOST cost-effectively?

Voted answers: C (7 votes, most voted), B (1 vote)

289

A company operates large cranes at a busy port. The company plans to use machine learning (ML) for predictive maintenance of the cranes to avoid unexpected breakdowns and to improve productivity.

The company already uses sensor data from each crane to monitor the health of the cranes in real time. The sensor data includes rotation speed, tension, energy consumption, vibration, pressure, and temperature for each crane. The company contracts AWS ML experts to implement an ML solution.

Which potential findings would indicate that an ML-based solution is suitable for this scenario? (Choose two.)

Voted answers: DE (7 votes, most voted)