Practice Set 33
Questions 321–330 (10 questions)
A machine learning (ML) engineer uses Bayesian optimization for a hyperparameter tuning job in Amazon SageMaker. The ML engineer uses precision as the objective metric. The ML engineer wants to use recall as the objective metric. The ML engineer also wants to expand the hyperparameter range for a new hyperparameter tuning job. The new hyperparameter range will include the range of the previously performed tuning job. Which approach will run the new hyperparameter tuning job in the LEAST amount of time? [{"voted_answers": "A", "vote_count": 2, "is_most_voted": true}]
A news company is developing an article search tool for its editors. The search tool should look for the articles that are most relevant and representative for particular words that are queried among a corpus of historical news documents. The editors test the first version of the tool and report that the tool seems to look for word matches in general. The editors have to spend additional time to filter the results to look for the articles where the queried words are most important. A group of data scientists must redesign the tool so that it isolates the most frequently used words in a document. The tool also must capture the relevance and importance of words for each document in the corpus. Which solution meets these requirements? [{"voted_answers": "B", "vote_count": 4, "is_most_voted": true}]
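The requirement above, weighting a word by how often it appears in one document and by how rare it is across the corpus, can be illustrated with term frequency-inverse document frequency (TF-IDF). This is a minimal sketch, not a reference solution. It assumes whitespace tokenization, and the three-article corpus and the function name `tf_idf` are hypothetical.

```python
import math
from collections import Counter


def tf_idf(corpus):
    """Return one {word: score} dict per document in the corpus."""
    tokenized = [doc.lower().split() for doc in corpus]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each word.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            # Term frequency times inverse document frequency.
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores


# Hypothetical corpus, for illustration only.
articles = [
    "the election results dominate the election night",
    "the election and the economy",
    "the economy and the markets",
]
scores = tf_idf(articles)
```

In this sketch a word that occurs in every article, such as "the", scores zero, while "election" scores higher in the first article, where it is used more often relative to the article's length.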
A growing company has a business-critical key performance indicator (KPI) for the uptime of a machine learning (ML) recommendation system. The company is using Amazon SageMaker hosting services to develop a recommendation model in a single Availability Zone within an AWS Region. A machine learning (ML) specialist must develop a solution to achieve high availability. The solution must have a recovery time objective (RTO) of 5 minutes. Which solution will meet these requirements with the LEAST effort? [{"voted_answers": "C", "vote_count": 7, "is_most_voted": true}]
A global company receives and processes hundreds of documents daily. The documents are in printed .pdf format or .jpg format. A machine learning (ML) specialist wants to build an automated document processing workflow to extract text from specific fields from the documents and to classify the documents. The ML specialist wants a solution that requires low maintenance. Which solution will meet these requirements with the LEAST operational effort? [{"voted_answers": "D", "vote_count": 6, "is_most_voted": true}]
A company wants to detect credit card fraud. The company has observed that an average of 2% of credit card transactions are fraudulent. A data scientist trains a classifier on a year's worth of credit card transaction data. The classifier needs to identify the fraudulent transactions. The company wants to accurately capture as many fraudulent transactions as possible. Which metrics should the data scientist use to optimize the classifier? (Choose two.) [{"voted_answers": "DE", "vote_count": 7, "is_most_voted": true}, {"voted_answers": "BE", "vote_count": 3, "is_most_voted": false}]
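With only 2% of transactions fraudulent, the choice of metric changes how a classifier looks. The sketch below works through the arithmetic for two classifiers. All counts are hypothetical (10,000 transactions, 200 of them fraudulent), and the helper `precision_recall` is introduced here for illustration only.

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical data: 10,000 transactions, 2% fraudulent.
total, fraud = 10_000, 200

# Classifier A flags nothing as fraud: every legitimate transaction is correct.
acc_a = (total - fraud) / total
prec_a, rec_a = precision_recall(tp=0, fp=0, fn=fraud)

# Classifier B (hypothetical) catches 180 frauds and raises 300 false alarms.
tp_b, fp_b, fn_b = 180, 300, 20
acc_b = (tp_b + (total - fraud - fp_b)) / total
prec_b, rec_b = precision_recall(tp_b, fp_b, fn_b)
```

Under these assumed counts, classifier A has the higher accuracy even though it catches no fraud at all, while classifier B captures 90% of the fraudulent transactions at a lower accuracy. That gap is why accuracy alone is a weak guide on data this imbalanced.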
A data scientist is designing a repository that will contain many images of vehicles. The repository must scale automatically in size to store new images every day. The repository must support versioning of the images. The data scientist must implement a solution that maintains multiple immediately accessible copies of the data in different AWS Regions. Which solution will meet these requirements? [{"voted_answers": "A", "vote_count": 5, "is_most_voted": true}]
An ecommerce company wants to update a production real-time machine learning (ML) recommendation engine API that uses Amazon SageMaker. The company wants to release a new model but does not want to make changes to applications that rely on the API. The company also wants to evaluate the performance of the new model in production traffic before the company fully rolls out the new model to all users. Which solution will meet these requirements with the LEAST operational overhead? [{"voted_answers": "B", "vote_count": 3, "is_most_voted": true}]
A machine learning (ML) specialist at a manufacturing company uses Amazon SageMaker DeepAR to forecast input materials and energy requirements for the company. Most of the data in the training dataset is missing values for the target variable. The company stores the training dataset as JSON files. The ML specialist must develop a solution by using Amazon SageMaker DeepAR to account for the missing values in the training dataset. Which approach will meet these requirements with the LEAST development effort? [{"voted_answers": "B", "vote_count": 8, "is_most_voted": true}]
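For context on the data format: DeepAR training data in JSON Lines form carries a `start` timestamp and a `target` array per time series, and a missing target value can be written as a JSON `null` literal. The sketch below shows how such a record might be produced. The readings, the timestamp, and the helper name `to_deepar_record` are hypothetical.

```python
import json


def to_deepar_record(start, values):
    """Build one JSON Lines record; None marks a missing target value."""
    # json.dumps serializes Python None as the JSON literal null.
    return json.dumps({"start": start, "target": values})


# Hypothetical daily energy readings with two missing observations.
readings = [41.2, None, 39.8, None, 40.5]
line = to_deepar_record("2024-01-01 00:00:00", readings)
```

Each time series would be written as one such line in the training file, so the missing observations stay in place rather than being dropped or filled in beforehand.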
A law firm handles thousands of contracts every day. Every contract must be signed. Currently, a lawyer manually checks all contracts for signatures. The law firm is developing a machine learning (ML) solution to automate signature detection for each contract. The ML solution must also provide a confidence score for each contract page. Which Amazon Textract API action can the law firm use to generate a confidence score for each page of each contract? [{"voted_answers": "A", "vote_count": 5, "is_most_voted": true}]
A company that operates oil platforms uses drones to photograph locations on oil platforms that are difficult for humans to access to search for corrosion. Experienced engineers review the photos to determine the severity of corrosion. There can be several corroded areas in a single photo. The engineers determine whether the identified corrosion needs to be fixed immediately, scheduled for future maintenance, or requires no action. The corrosion appears in an average of 0.1% of all photos. A data science team needs to create a solution that automates the process of reviewing the photos and classifying the need for maintenance. Which combination of steps will meet these requirements? (Choose three.) [{"voted_answers": "ADE", "vote_count": 12, "is_most_voted": true}, {"voted_answers": "ABE", "vote_count": 5, "is_most_voted": false}]