Amazon MLS-C01: AWS Certified Machine Learning - Specialty Exam Practice Test

Total 307 questions
Question 1

A company stores its documents in Amazon S3 with no predefined product categories. A data scientist needs to build a machine learning model to categorize the documents for all the company's products.

Which solution will meet these requirements with the MOST operational efficiency?



Answer : C

Amazon SageMaker's Neural Topic Model (NTM) is designed to uncover underlying topics within text data by clustering documents based on topic similarity. For document categorization, NTM can identify product categories by analyzing and grouping the documents, making it an efficient choice for unsupervised learning where predefined categories do not exist.
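As a rough illustration, the built-in NTM algorithm can be launched with the SageMaker Python SDK along these lines. The bucket, IAM role ARN, and hyperparameter values below are placeholders, and the documents would first need to be converted to bag-of-words vectors (for example, in RecordIO-protobuf format):

```python
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder role ARN

# Resolve the built-in NTM algorithm image for the current region
ntm_image = image_uris.retrieve("ntm", region=session.boto_region_name)

ntm = Estimator(
    image_uri=ntm_image,
    role=role,
    instance_count=1,
    instance_type="ml.c5.xlarge",
    output_path="s3://example-bucket/ntm-output/",  # placeholder bucket
    sagemaker_session=session,
)
# num_topics loosely maps to the number of product categories to discover;
# feature_dim is the vocabulary size of the bag-of-words representation
ntm.set_hyperparameters(num_topics=20, feature_dim=5000, mini_batch_size=128)

train_input = TrainingInput(
    "s3://example-bucket/ntm-train/",
    content_type="application/x-recordio-protobuf",
)
ntm.fit({"train": train_input})
```

The resulting topic assignments can then be inspected to label each cluster of documents with a product category.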


Question 2

An ecommerce company wants to train a large image classification model with 10,000 classes. The company runs multiple model training iterations and needs to minimize operational overhead and cost. The company also needs to avoid loss of work and model retraining.

Which solution will meet these requirements?



Answer : D

Amazon SageMaker managed spot training allows for cost-effective training by utilizing Spot Instances, which are lower-cost EC2 instances that can be interrupted when demand is high. By enabling checkpointing in SageMaker, the company can save intermediate model states to Amazon S3, allowing training to resume from the last checkpoint if interrupted. This solution minimizes operational overhead by automating the checkpointing process and resuming work after interruptions, reducing the need for retraining from scratch.

This setup provides a reliable and cost-efficient approach to training large models with minimal operational overhead and risk of data loss.
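A minimal sketch of such a training job with the SageMaker Python SDK follows; the container image, role, and bucket names are placeholders. The training script itself must write and restore checkpoints under /opt/ml/checkpoints, which SageMaker keeps in sync with the S3 checkpoint location:

```python
from sagemaker.estimator import Estimator

role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder
training_image = "123456789012.dkr.ecr.us-east-1.amazonaws.com/image-classifier:latest"  # placeholder

estimator = Estimator(
    image_uri=training_image,
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    output_path="s3://example-bucket/output/",
    use_spot_instances=True,  # train on lower-cost Spot capacity
    max_run=36000,            # max seconds of actual training time
    max_wait=72000,           # max total seconds, including waiting for Spot capacity
    checkpoint_s3_uri="s3://example-bucket/checkpoints/",  # synced with /opt/ml/checkpoints
)
estimator.fit({"train": "s3://example-bucket/train/"})
```

If a Spot interruption occurs, SageMaker restarts the job and the script resumes from the most recent checkpoint in S3 rather than retraining from scratch.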


Question 3

A company's machine learning (ML) specialist is designing a scalable data storage solution for Amazon SageMaker. The company has an existing TensorFlow-based model that uses a train.py script. The model relies on static training data that is currently stored in TFRecord format.

What should the ML specialist do to provide the training data to SageMaker with the LEAST development overhead?



Answer : D

Amazon SageMaker script mode allows users to bring custom training scripts (such as train.py) without needing extensive modifications for specific data formats like TFRecord. By storing the TFRecord data in an Amazon S3 bucket and pointing the SageMaker training job to this bucket, the model can directly access the data, allowing the ML specialist to train the model without additional reformatting or data processing steps.

This approach minimizes development overhead and leverages SageMaker's built-in support for custom training scripts and S3 integration, making it the most efficient choice.
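For instance, the existing train.py can be passed unchanged to the SageMaker TensorFlow estimator in script mode, with the TFRecord files left in S3 as-is. The bucket name and framework versions below are assumptions:

```python
from sagemaker.tensorflow import TensorFlow

role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder

tf_estimator = TensorFlow(
    entry_point="train.py",        # the existing script, unchanged
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    framework_version="2.13",      # match the script's TensorFlow version
    py_version="py310",
)

# SageMaker copies the S3 prefix to /opt/ml/input/data/train in the container;
# train.py can read the files there with tf.data.TFRecordDataset as before.
tf_estimator.fit({"train": "s3://example-bucket/tfrecords/"})
```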


Question 4

A finance company has collected stock return data for 5,000 publicly traded companies. A financial analyst has a dataset that contains 2,000 attributes for each company. The financial analyst wants to use Amazon SageMaker to identify the top 15 attributes that are most valuable to predict future stock returns.

Which solution will meet these requirements with the LEAST operational overhead?



Answer : D

Amazon SageMaker Autopilot is a fully managed solution that automatically explores different ML models and selects the most effective ones for a given prediction task. After model training, Amazon SageMaker Clarify can generate feature importance scores, identifying the top features in a straightforward, automated manner with minimal manual intervention.

By using SageMaker Autopilot, the financial analyst can obtain the desired feature importance ranking for predictive attributes with minimal setup and low operational overhead, as opposed to manually configuring models in SageMaker.
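A sketch of launching such a job with the SageMaker Python SDK's AutoML class; the target column name and S3 paths are hypothetical. Autopilot writes a Clarify-based explainability report, including feature importance, alongside the job's output artifacts in S3:

```python
import sagemaker
from sagemaker.automl.automl import AutoML

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder

automl = AutoML(
    role=role,
    target_attribute_name="future_return",  # hypothetical label column
    max_candidates=10,
    sagemaker_session=session,
)
automl.fit(inputs="s3://example-bucket/stock-returns/train.csv", wait=True)

# The best candidate and its explainability artifacts (SHAP-based feature
# importance generated by SageMaker Clarify) can then be inspected.
job = automl.describe_auto_ml_job()
print(job["BestCandidate"]["CandidateName"])
```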


Question 5

A company has a podcast platform that has thousands of users. The company implemented an algorithm to detect low podcast engagement based on a 10-minute running window of user events such as listening to, pausing, and closing the podcast. A machine learning (ML) specialist is designing the ingestion process for these events. The ML specialist needs to transform the data to prepare the data for inference.

How should the ML specialist design the transformation step to meet these requirements with the LEAST operational effort?



Answer : C

In this scenario, Kinesis Data Streams efficiently ingests real-time event data, while Amazon Managed Service for Apache Flink (formerly Amazon Kinesis Data Analytics) is ideal for transforming and analyzing data in a continuous stream. Apache Flink allows processing of time-based windows, such as the 10-minute sliding window required here, with low operational overhead.

This combination provides an effective solution for low-latency data processing and transformation, meeting the requirements for preparing data for inference with minimal setup and serverless scalability.
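As an illustration, a 10-minute sliding (HOP) window over the event stream could be expressed with the Apache Flink Table API, which Amazon Managed Service for Apache Flink supports. The stream name, region, and event schema below are assumptions:

```python
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Hypothetical source table backed by the Kinesis data stream of user events
t_env.execute_sql("""
    CREATE TABLE podcast_events (
        user_id STRING,
        event_type STRING,   -- e.g. 'listen', 'pause', 'close'
        event_time TIMESTAMP(3),
        WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
    ) WITH (
        'connector' = 'kinesis',
        'stream' = 'podcast-events',
        'aws.region' = 'us-east-1',
        'format' = 'json'
    )
""")

# 10-minute window sliding every minute: per-user event counts for inference
features = t_env.sql_query("""
    SELECT user_id,
           HOP_END(event_time, INTERVAL '1' MINUTE, INTERVAL '10' MINUTE) AS window_end,
           COUNT(*) AS event_count
    FROM podcast_events
    GROUP BY user_id,
             HOP(event_time, INTERVAL '1' MINUTE, INTERVAL '10' MINUTE)
""")
```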


Question 6

An online delivery company wants to choose the fastest courier for each delivery at the moment an order is placed. The company wants to implement this feature for existing users and new users of its application. Data scientists have trained separate models with XGBoost for this purpose, and the models are stored in Amazon S3. There is one model for each city where the company operates.

The engineers are hosting these models on Amazon EC2 to respond to web client requests, with one instance for each model, but the instances average only 5% CPU and memory utilization. The operations engineers want to avoid managing unnecessary resources.

Which solution will enable the company to achieve its goal with the LEAST operational overhead?



Answer : B

The best solution for this scenario is to use a multi-model endpoint in Amazon SageMaker, which allows hosting multiple models on the same endpoint and invoking them dynamically at runtime. This way, the company can reduce the operational overhead of managing multiple EC2 instances and model servers, and leverage the scalability, security, and performance of SageMaker hosting services. By using a multi-model endpoint, the company can also save on hosting costs by improving endpoint utilization, paying only for the models that are loaded in memory and the API calls that are made.

To use a multi-model endpoint, the company needs to prepare a Docker container based on the open-source multi-model server, a framework-agnostic library that supports loading and serving multiple models from Amazon S3. The company can then create a multi-model endpoint in SageMaker that points to the S3 bucket containing all the models, and invoke the endpoint from the web client at runtime, specifying the TargetModel parameter according to the city of each request. This solution also enables the company to add or remove models from the S3 bucket without redeploying the endpoint, and to use different versions of the same model for different cities if needed. A minimal deployment sketch follows the references below.

References:

Use Docker containers to build models

Host multiple models in one container behind one endpoint

Multi-model endpoints using Scikit Learn

Multi-model endpoints using XGBoost
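The deployment sketch below uses the SageMaker Python SDK's MultiDataModel, assuming one model artifact per city under a shared S3 prefix; the bucket, role, XGBoost version, and model file names are placeholders:

```python
import sagemaker
from sagemaker import image_uris
from sagemaker.model import Model
from sagemaker.multidatamodel import MultiDataModel
from sagemaker.serializers import CSVSerializer

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder

# Built-in XGBoost inference image; the version is an assumption
xgb_image = image_uris.retrieve("xgboost", region=session.boto_region_name, version="1.7-1")

model = Model(image_uri=xgb_image, role=role, sagemaker_session=session)
mme = MultiDataModel(
    name="courier-models",
    model_data_prefix="s3://example-bucket/city-models/",  # one .tar.gz per city
    model=model,
    sagemaker_session=session,
)
predictor = mme.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")
predictor.serializer = CSVSerializer()

# Route each request to the model for the caller's city via TargetModel
prediction = predictor.predict("0.1,0.2,0.3", target_model="madrid.tar.gz")
```

New city models can be added simply by uploading another artifact under the same S3 prefix; SageMaker loads them on demand without a redeployment.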


Question 7

An engraving company wants to automate its quality control process for plaques. The company performs the process before mailing each customized plaque to a customer. The company has created an Amazon S3 bucket that contains images of defects that should cause a plaque to be rejected. Low-confidence predictions must be sent to an internal team of reviewers who are using Amazon Augmented Al (Amazon A2I).

Which solution will meet these requirements?



Answer : B

Amazon Rekognition is a service that provides computer vision capabilities for image and video analysis, such as object, scene, and activity detection, face and text recognition, and custom label detection. Amazon Rekognition can be used to automate the quality control process for plaques by comparing the images of the plaques with the images of defects in the Amazon S3 bucket and returning a confidence score for each defect.

Amazon A2I is a service that enables human review of machine learning predictions, such as low-confidence predictions from Amazon Rekognition. Amazon A2I can be integrated with a private workforce option, which allows the engraving company to use its own internal team of reviewers to manually inspect the plaques that are flagged by Amazon Rekognition. This solution meets the requirements of automating the quality control process, sending low-confidence predictions to an internal team of reviewers, and using Amazon A2I for manual review. A sketch of this flow follows the references below.

References:

1: Amazon Rekognition documentation

2: Amazon A2I documentation

3: Amazon Rekognition Custom Labels documentation

4: Amazon A2I Private Workforce documentation
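The sketch below wires the two services together with boto3, assuming a trained Rekognition Custom Labels model and an A2I flow definition backed by the private workforce; the ARNs, bucket, object key, and confidence threshold are all placeholders:

```python
import json
import boto3

rekognition = boto3.client("rekognition")
a2i = boto3.client("sagemaker-a2i-runtime")

BUCKET, KEY = "plaque-images", "plaque-123.jpg"  # placeholders

# Detect defects with the trained Custom Labels model (placeholder ARN)
response = rekognition.detect_custom_labels(
    ProjectVersionArn="arn:aws:rekognition:us-east-1:123456789012:project/defects/version/1",
    Image={"S3Object": {"Bucket": BUCKET, "Name": KEY}},
    MinConfidence=0,  # return all labels so we can apply our own threshold
)

labels = response["CustomLabels"]
top_confidence = max((label["Confidence"] for label in labels), default=0.0)

if top_confidence < 80:  # assumed low-confidence threshold
    # Route the image to the internal reviewers through Amazon A2I
    a2i.start_human_loop(
        HumanLoopName="plaque-123-review",
        FlowDefinitionArn="arn:aws:sagemaker:us-east-1:123456789012:flow-definition/plaque-review",
        HumanLoopInput={"InputContent": json.dumps({"image": f"s3://{BUCKET}/{KEY}"})},
    )
```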

