Practice Set 36
Questions 351–360 (10 questions)
A data scientist uses Amazon SageMaker Data Wrangler to obtain a feature summary from a dataset that the data scientist imported from Amazon S3. The data scientist notices that the prediction power for a dataset feature has a score of 1. What is the cause of the score?
Community votes: A (2 votes, most voted)
A data scientist is conducting exploratory data analysis (EDA) on a dataset that contains information about product suppliers. The dataset records the country where each product supplier is located as a two-letter text code. For example, the code for New Zealand is "NZ." The data scientist needs to transform the country codes for model training. The data scientist must choose the solution that will result in the smallest increase in dimensionality. The solution must not result in any information loss. Which solution will meet these requirements?
Community votes: B (6 votes, most voted), D (2 votes)
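As a study aid for the dimensionality constraint, the sketch below compares how many columns two lossless encodings add for a categorical code column. It is an illustration under stated assumptions, not the answer key: the sample codes and the helper name `binary_encode` are hypothetical, and the answer options for this question are not reproduced here.

```python
def binary_encode(codes):
    """Assign each distinct code a unique fixed-width bit pattern.

    Hypothetical helper for illustration only. Each category keeps a
    distinct pattern, so the original code can always be recovered.
    """
    categories = sorted(set(codes))
    # Number of bits needed to give every category a distinct pattern.
    width = max(1, (len(categories) - 1).bit_length())
    index = {code: i for i, code in enumerate(categories)}
    table = {
        code: [(index[code] >> bit) & 1 for bit in reversed(range(width))]
        for code in categories
    }
    return table, width


# Made-up sample of supplier country codes.
codes = ["NZ", "US", "DE", "FR", "NZ", "JP"]
table, binary_width = binary_encode(codes)
one_hot_width = len(set(codes))  # one-hot needs one column per category

print("one-hot columns:", one_hot_width)
print("binary columns:", binary_width)
print("NZ ->", table["NZ"])
```

With five distinct codes, one-hot encoding adds five columns while the binary scheme adds three, and the gap widens as the number of countries grows.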
A data scientist is building a new model for an ecommerce company. The model will predict how many minutes it will take to deliver a package. During model training, the data scientist needs to evaluate model performance. Which metrics should the data scientist use to meet this requirement? (Choose two.)
Community votes: BC (3 votes, most voted)
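Predicting a number of minutes is a regression task, so error-based metrics apply. The sketch below computes two common ones, mean absolute error (MAE) and root mean squared error (RMSE), on made-up delivery times. The data and function names are assumptions for illustration, and the sketch does not claim to match the lettered options, which are not reproduced here.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average size of the errors, in minutes."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: penalizes large errors more than MAE."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


# Hypothetical delivery times in minutes.
actual = [30.0, 45.0, 60.0, 25.0]
predicted = [28.0, 50.0, 55.0, 25.0]

mae_value = mae(actual, predicted)
rmse_value = rmse(actual, predicted)
print("MAE:", mae_value)
print("RMSE:", round(rmse_value, 3))
```

Both metrics are expressed in the same unit as the target (minutes), which makes them easy to interpret for a delivery-time model.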
A machine learning (ML) specialist is developing a model for a company. The model will classify and predict sequences of objects that are displayed in a video. The ML specialist decides to use a hybrid architecture that consists of a convolutional neural network (CNN) followed by a three-layer recurrent neural network (RNN) classifier. The company developed a similar model previously but trained the model to classify a different set of objects. The ML specialist wants to save time by using the previously trained model and adapting the model for the current use case and set of objects. Which combination of steps will accomplish this goal with the LEAST amount of effort? (Choose two.)
Community votes: DE (3 votes, most voted)
A company distributes an online multiple-choice survey to several thousand people. Respondents to the survey can select multiple options for each question. A machine learning (ML) engineer needs to comprehensively represent every response from all respondents in a dataset. The ML engineer will use the dataset to train a logistic regression model. Which solution will meet these requirements?
Community votes: A (2 votes, most voted)
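When a respondent can pick several options for one question, one way to keep every selection is a multi-hot layout: one binary column per option, with 1 marking each option chosen. The sketch below shows this for a single hypothetical question. The option labels, responses, and the helper name `multi_hot` are assumptions, and this is an illustration rather than the confirmed answer.

```python
def multi_hot(responses, options):
    """Turn each respondent's set of selections into a 0/1 row.

    Hypothetical helper: one column per option, in the order given.
    """
    return [[1 if option in selected else 0 for option in options]
            for selected in responses]


# Made-up survey question with four options.
options = ["A", "B", "C", "D"]
# Each entry is the set of options one respondent selected.
responses = [{"A", "C"}, {"B"}, set(), {"A", "B", "D"}]

rows = multi_hot(responses, options)
for row in rows:
    print(row)
```

For a survey with several questions, the same idea extends to one column per (question, option) pair, which gives a fixed-width numeric input suitable for logistic regression.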
A manufacturing company stores production volume data in a PostgreSQL database. The company needs an end-to-end solution that will give business analysts the ability to prepare data for processing and to predict future production volume based on the previous year's production volume. The solution must not require the company to have coding knowledge. Which solution will meet these requirements with the LEAST effort?
Community votes: B (2 votes, most voted)
A data scientist needs to create a model for predictive maintenance. The model will be based on historical data to identify rare anomalies in the data. The historical data is stored in an Amazon S3 bucket. The data scientist needs to use Amazon SageMaker Data Wrangler to ingest the data. The data scientist also needs to perform exploratory data analysis (EDA) to understand the statistical properties of the data. Which solution will meet these requirements with the LEAST amount of compute resources?
Community votes: C (4 votes, most voted), B (2 votes), D (1 vote)
An ecommerce company has observed that customers who use the company's website rarely view items that the website recommends to customers. The company wants to recommend items to customers that customers are more likely to want to purchase. Which solution will meet this requirement in the SHORTEST amount of time?
Community votes: C (3 votes, not marked as most voted)
A machine learning (ML) engineer is preparing a dataset for a classification model. The ML engineer notices that some continuous numeric features have significantly greater values than most other features. A business expert explains that the features are independently informative and that the dataset is representative of the target distribution. After training, the model's inference accuracy is lower than expected. Which preprocessing technique will result in the GREATEST increase in the model's inference accuracy?
Community votes: A (2 votes, not marked as most voted)
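To make the scale mismatch concrete, the sketch below applies standardization (z-score scaling), one common way to put continuous features on a comparable scale. It is a minimal illustration under assumptions: the feature names and values are made up, and the sketch does not assert which lettered option is correct, because the options are not reproduced here.

```python
import statistics


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant feature carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical features on very different scales.
income = [20000.0, 40000.0, 60000.0, 80000.0]
age = [20.0, 30.0, 40.0, 50.0]

income_scaled = standardize(income)
age_scaled = standardize(age)
print([round(v, 3) for v in income_scaled])
print([round(v, 3) for v in age_scaled])
```

After scaling, both features have mean 0 and standard deviation 1, so neither dominates a distance-based or gradient-based learner purely because of its units.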
A manufacturing company produces 100 types of steel rods. The rod types have varying material grades and dimensions. The company has sales data for the steel rods for the past 50 years. A data scientist needs to build a machine learning (ML) model to predict future sales of the steel rods. Which solution will meet this requirement in the MOST operationally efficient way?
Community votes: A (2 votes, most voted)