The field of Artificial Intelligence is rapidly expanding, and the AWS Certified AI Practitioner (AIF-C01) certification is your gateway to validating foundational knowledge in this exciting domain, particularly with AWS services. As you gear up for this crucial exam, comprehensive preparation is key. That’s where our brand-new AWS Certified AI Practitioner Mock Test comes in – designed to mirror the real exam experience and boost your confidence.
By taking this AWS AI mock test, you’ll gain valuable insights into the types of AI Practitioner exam questions you can expect. This isn’t just about memorization; it’s about understanding concepts like prompt engineering, model evaluation, and responsible AI development within the AWS ecosystem. For instance, you’ll encounter questions on AWS services like Amazon SageMaker and Amazon Bedrock.
Understanding the AWS Cloud is a valuable asset in today’s tech landscape. For detailed information about the certification, you can always refer to the official AWS Certified AI Practitioner (AIF-C01) page.
Ready to test your knowledge and get a feel for the real exam? Click the Begin button to start. Good luck!
This is a timed quiz. You will be given 5400 seconds (90 minutes) to answer all questions. Are you ready?
Identifying whether an email is 'spam' or 'not spam' is an example of what kind of ML problem?
This is a classification problem because the goal is to assign a predefined category (spam or not spam) to each email.
When evaluating a foundation model, 'latency' refers to:
Latency is the time delay between sending a request to the model and receiving a response. Low latency is critical for real-time applications.
What is a 'vector database' often used for in conjunction with RAG systems?
Vector databases are optimized for storing and querying embeddings (vector representations of data). In RAG, they store embeddings of the external knowledge source, allowing for efficient similarity searches to find relevant context.
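To make that concrete, here is a minimal Python sketch of the kind of similarity search a vector database performs in a RAG pipeline. The documents and embedding vectors below are made-up placeholders, not a real store or a real embedding model.

```python
import numpy as np

# Toy document embeddings (in practice these come from an embedding model
# and live in a vector database, not in a Python list).
documents = ["Return policy is 30 days.", "Shipping takes 3-5 days.", "We accept credit cards."]
doc_embeddings = np.random.rand(3, 8)   # placeholder vectors
query_embedding = np.random.rand(8)     # placeholder query vector

def cosine_similarity(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Score every document against the query and keep the most similar one
# as the context handed to the foundation model.
scores = [cosine_similarity(query_embedding, d) for d in doc_embeddings]
best = int(np.argmax(scores))
print("Most relevant context:", documents[best])
```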
What is a potential disadvantage of Generative AI models like LLMs?
Generative AI models, especially LLMs, can sometimes produce 'hallucinations,' which are confident but incorrect or nonsensical outputs.
What are 'tokens' in the context of Large Language Models (LLMs)?
Tokens are basic units of text (like words, subwords, or characters) that LLMs process and generate.
What does 'prompt engineering' refer to in Generative AI?
Prompt engineering is the process of designing and refining input prompts to guide a generative AI model to produce desired outputs.
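As a simple illustration (the wording and the helper below are hypothetical), prompt engineering often means wrapping the user's input in a carefully worded template that fixes the model's role, constraints, and output format:

```python
# Hypothetical prompt template: the instructions around the user's text
# steer the model toward a concise, well-structured answer.
def build_prompt(customer_message: str) -> str:
    return (
        "You are a support assistant for an online retailer.\n"
        "Answer in at most three sentences and cite the relevant policy.\n\n"
        f"Customer message: {customer_message}\n"
        "Answer:"
    )

print(build_prompt("Can I return a laptop I bought two weeks ago?"))
```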
What type of data is used in supervised learning?
Supervised learning algorithms require labeled data, where each data point is tagged with a correct output or target variable.
According to the AWS Shared Responsibility Model for AI/ML services like Amazon SageMaker, what is AWS responsible for?
AWS is responsible for the security 'of' the cloud, including the underlying infrastructure, hardware, software, networking, and facilities that run AWS Cloud services. Customers are responsible for security 'in' the cloud, such as data encryption, IAM configurations, and network traffic protection.
A company wants to build a chatbot that can answer customer queries based on its internal knowledge base. This is a common use case for:
Generative AI, particularly LLMs, excels at understanding natural language and generating human-like responses, making them ideal for building chatbots and virtual assistants.
When selecting a pre-trained foundation model for a specific task, which factor is LEAST likely to be a primary design consideration?
While the original research paper's citation count might indicate influence, factors like cost, latency, model size, and performance on relevant benchmarks are more direct and practical considerations for selecting a model for a business application.
What is the first step in a typical Machine Learning development lifecycle?
The ML development lifecycle typically begins with defining the problem you want to solve and understanding the objectives.
Which AWS service can be used to log API calls made to AWS services, including AI services, for security analysis and compliance auditing?
AWS CloudTrail records AWS API calls for your account and delivers log files to an Amazon S3 bucket, enabling security analysis, resource change tracking, and compliance auditing.
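For example, a short boto3 sketch (region and event source name are assumptions) that looks up recent CloudTrail management events generated by calls to the Amazon Bedrock API:

```python
import boto3

# CloudTrail records management API calls; here we look up recent events
# whose event source is the Bedrock API.
cloudtrail = boto3.client("cloudtrail", region_name="us-east-1")

response = cloudtrail.lookup_events(
    LookupAttributes=[
        {"AttributeKey": "EventSource", "AttributeValue": "bedrock.amazonaws.com"}
    ],
    MaxResults=10,
)

for event in response["Events"]:
    print(event["EventTime"], event["EventName"], event.get("Username", "unknown"))
```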
The 'robustness' of an AI model refers to its ability to:
Robustness means the AI system can maintain its level of performance even when faced with adversarial attacks or unexpected, noisy, or out-of-distribution inputs.
What is the primary difference between Machine Learning (ML) and traditional programming?
In traditional programming, humans write explicit rules for the computer to follow. In ML, the system learns patterns from data to make predictions or decisions without being explicitly programmed for each case.
Which stage of the foundation model lifecycle focuses on assessing the model's performance on specific tasks or benchmarks?
The evaluation stage involves testing the foundation model (either pre-trained or fine-tuned) against various benchmarks and metrics to understand its capabilities and limitations.
Which AWS service allows you to build, train, and deploy machine learning models, and also provides tools for managing the entire ML lifecycle, including features for foundation model hosting and fine-tuning?
Amazon SageMaker is a comprehensive ML platform that supports the entire ML workflow, including capabilities for working with foundation models.
Which AWS service helps you manage and enforce permissions for accessing AWS resources, including AI services?
AWS Identity and Access Management (IAM) enables you to securely control access to AWS services and resources for your users and applications.
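For illustration, a minimal identity-based IAM policy (the resource ARN is a placeholder) that grants a principal permission to invoke Bedrock foundation models and nothing else:

```python
import json

# IAM denies by default; this policy explicitly allows only bedrock:InvokeModel.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["bedrock:InvokeModel"],
            "Resource": "arn:aws:bedrock:us-east-1::foundation-model/*",  # placeholder
        }
    ],
}
print(json.dumps(policy, indent=2))
```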
Which of these is NOT typically considered a direct use case of Generative AI?
While Generative AI can assist in analyzing data that might lead to fraud detection, fraud detection itself often relies more on discriminative models (classification) to identify anomalous patterns from existing data.
Which concept is crucial for data privacy when working with AI models that process personal information?
Data minimization involves collecting and retaining only the necessary personal data for a specific, legitimate purpose, reducing privacy risks.
What is a significant legal risk associated with using Generative AI for content creation?
Generative AI models trained on copyrighted material may inadvertently produce outputs that infringe on existing intellectual property rights, leading to legal challenges.
Which of the following best describes Artificial Intelligence (AI)?
AI refers to the simulation of human intelligence in machines that are programmed to think like humans and mimic their actions. This can include learning, problem-solving, and decision-making.
What is a key characteristic of unstructured data?
Unstructured data does not have a predefined format or organization, making it more difficult to collect, process, and analyze. Examples include text documents, images, and videos.
Adjusting the 'input/output length' parameter for a foundation model primarily affects:
This parameter directly controls the maximum number of tokens the model will process as input or generate as output, impacting resource usage and the scope of the interaction.
What is 'Deep Learning' a subfield of?
Deep Learning is a specialized subfield of Machine Learning that uses neural networks with many layers (deep neural networks) to analyze various factors of data.
In the context of foundation models, 'deployment' refers to:
Deployment is the process of making a trained and evaluated foundation model available for use in applications, often via an API endpoint.
The term 'multimodal AI' refers to models that can:
Multimodal AI models are designed to process and understand information from multiple types of data, such as text, images, audio, and video, simultaneously.
What is a key consideration when choosing a pre-trained model for an application with strict low-latency requirements?
Model size and complexity directly impact inference speed (latency). Smaller, optimized models are generally preferred for low-latency applications.
What does 'explainability' in AI refer to?
Explainability (or interpretability) is the ability to explain how an AI model arrived at a particular decision or prediction in terms understandable to humans.
Which of the following is a common use case for Generative AI?
Content creation, such as generating text, images, audio, and video, is a primary use case for Generative AI.
If a foundation model consistently produces biased or unfair outputs, what is a common approach to mitigate this after deployment?
While pre-deployment mitigation is best, post-deployment strategies can include continuous monitoring for bias, collecting feedback, and potentially re-training or fine-tuning the model with more diverse and debiased data, or implementing fairness-aware post-processing techniques.
Which of these is a key method for fine-tuning foundation models to align them better with human preferences and instructions?
Reinforcement Learning from Human Feedback (RLHF) is a technique used to fine-tune language models by incorporating human feedback into the training process, helping the model generate outputs that are more helpful, harmless, and honest.
Which of these is a strategy for mitigating bias in AI datasets?
Ensuring the training dataset is diverse and representative of the population the AI will affect is a key strategy. This can involve augmenting underrepresented groups or carefully sampling data.
One of the key elements of training a foundation model (during pre-training) is:
Pre-training foundation models typically relies on self-supervised learning on massive amounts of unlabeled text and/or code, where the model learns to predict parts of the input data itself.
What is the 'Transformer' architecture known for in AI?
The Transformer architecture, introduced in the paper 'Attention Is All You Need,' is highly effective for sequence-to-sequence tasks and is the basis for many modern LLMs due to its use of attention mechanisms.
Which stage of the ML development lifecycle involves splitting data into training, validation, and test sets?
Data preparation is the stage where data is cleaned, transformed, and split into appropriate sets for training and evaluating the model.
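A common way to do this split in practice is two successive calls to scikit-learn's train_test_split, first carving out the test set and then the validation set. The arrays below are placeholders.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder feature matrix and labels.
X = np.random.rand(1000, 10)
y = np.random.randint(0, 2, size=1000)

# First hold out a test set, then split the remainder into training
# and validation sets (roughly 70/15/15 overall).
X_temp, X_test, y_temp, y_test = train_test_split(X, y, test_size=0.15, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_temp, y_temp, test_size=0.1765, random_state=42)

print(len(X_train), len(X_val), len(X_test))
```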
What is a key characteristic of a 'transformer-based model'?
Transformer models process entire input sequences simultaneously using attention mechanisms, rather than sequentially like RNNs, allowing for better parallelization and capturing long-range dependencies.
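To make the idea concrete, here is a minimal numpy sketch of the scaled dot-product attention at the heart of the Transformer; random matrices stand in for real query/key/value projections.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                           # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                                        # weighted sum of values

# Four tokens with eight-dimensional representations; every position
# attends to every other position in a single parallel operation.
Q, K, V = (np.random.rand(4, 8) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```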
What is a 'foundation model' in the context of Generative AI?
A foundation model is a large AI model pre-trained on a vast quantity of broad data, designed to be adapted (e.g., fine-tuned) to a wide range of downstream tasks.
Which AWS service is primarily used for building, training, and deploying machine learning models at scale?
Amazon SageMaker is a fully managed service that provides developers and data scientists with the ability to build, train, and deploy machine learning models quickly.
In the context of foundation model inference parameters, what does 'temperature' typically control?
Temperature is a parameter that controls the randomness of the model's output. Higher temperatures lead to more creative and diverse outputs, while lower temperatures produce more focused and deterministic outputs.
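For illustration, a hedged boto3 sketch of setting temperature (and a token limit) when calling a Bedrock model through the Converse API; the model ID, region, and response shape are assumptions and may differ in your account.

```python
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # assumed model ID
    messages=[{"role": "user", "content": [{"text": "Write a product tagline."}]}],
    inferenceConfig={
        "temperature": 0.9,   # higher -> more varied, creative output
        "maxTokens": 100,     # also caps the length of the response
    },
)
print(response["output"]["message"]["content"][0]["text"])
```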
What is the primary purpose of Amazon SageMaker Clarify?
Amazon SageMaker Clarify helps improve machine learning models by detecting potential bias and helping explain how these models make predictions.
Why is 'transparency' important in Responsible AI?
Transparency involves providing clear information about how an AI system works, its capabilities, limitations, and the data it uses, which helps build trust and allows for accountability.
Which of these is a common use case for Natural Language Processing (NLP)?
Sentiment analysis, which involves determining the emotional tone behind a series of words, is a common application of NLP.
Which ethical concern is particularly relevant to generative AI models that can create realistic but fake images or videos (deepfakes)?
The ability of generative AI to create convincing deepfakes raises significant concerns about misinformation, disinformation, and the potential for malicious use like impersonation or defamation.
What is the primary purpose of Retrieval-Augmented Generation (RAG) in applications using foundation models?
RAG enhances foundation models by providing them with access to external, up-to-date knowledge sources. The model retrieves relevant information from these sources to generate more accurate and contextually appropriate responses.
Which AWS service provides access to a range of foundation models from AI21 Labs, Anthropic, Stability AI, and Amazon, along with tools to build generative AI applications?
Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models via a single API, along with a broad set of capabilities to build generative AI applications.
What is 'model versioning' and why is it important for AI governance?
Model versioning is the practice of tracking different versions of trained models. It's important for reproducibility, rollback capabilities, auditing, and understanding how model performance changes over time.
An e-commerce company wants to group its customers into distinct segments based on their purchasing behavior without any predefined labels. Which ML technique is most suitable?
Clustering is an unsupervised learning technique used to group similar data points together based on their characteristics, without prior knowledge of the groups.
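A minimal scikit-learn sketch of this idea, using made-up purchasing features and no labels at all:

```python
import numpy as np
from sklearn.cluster import KMeans

# Placeholder purchasing features: [orders per month, average order value].
customers = np.array([
    [1, 20], [2, 25], [1, 22],       # infrequent, low spend
    [10, 200], [12, 220], [11, 210]  # frequent, high spend
])

# No labels are provided; K-Means discovers the segments on its own.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print(kmeans.labels_)  # cluster assignment for each customer
```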
What is the importance of 'data lineage' in AI governance?
Data lineage tracks the origin, movement, transformations, and usage of data throughout its lifecycle. This is crucial for auditing, ensuring data quality, and understanding model behavior in AI systems.
A business wants to use a foundation model to answer questions based on its proprietary company documents. Which approach would be most suitable for this?
Retrieval-Augmented Generation (RAG) allows the model to retrieve relevant information from the company's documents (the external knowledge base) and use that information to generate answers, reducing hallucinations and improving factual accuracy.
The 'pre-training' phase of a foundation model typically involves training on:
Pre-training foundation models involves training them on massive, diverse datasets to learn general patterns, language structures, and world knowledge.
Which of the following is a common metric used to evaluate the performance of text summarization models?
ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a set of metrics commonly used for evaluating automatic summarization and machine translation by comparing an automatically produced summary against reference summaries.
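One way to compute ROUGE in Python is the rouge-score package (assuming it is installed; the reference and candidate summaries below are made up):

```python
from rouge_score import rouge_scorer

reference = "The company reported higher profits and plans to expand into Europe."
candidate = "Profits rose and the company will expand into Europe."

# Compare the generated summary against the reference using unigram overlap
# (ROUGE-1) and longest common subsequence (ROUGE-L).
scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, candidate)
print(scores["rouge1"].fmeasure, scores["rougeL"].fmeasure)
```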
A 'human-in-the-loop' system for AI is designed to:
Human-in-the-loop systems combine machine and human intelligence, where humans can review, validate, or correct AI decisions, especially in critical or ambiguous cases.
A company wants to predict house prices based on features like size, number of bedrooms, and location. Which type of ML problem is this?
Predicting a continuous value (like price) based on input features is a regression problem.
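A tiny scikit-learn sketch of the regression setup, with placeholder house data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Placeholder features: [size in sq ft, bedrooms]; target: price in dollars.
X = np.array([[1400, 3], [1600, 3], [1700, 4], [1875, 4], [1100, 2]])
y = np.array([245000, 312000, 279000, 308000, 199000])

# Regression predicts a continuous value rather than a class label.
model = LinearRegression().fit(X, y)
print(model.predict([[1500, 3]]))  # predicted price for a new house
```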
What is 'batch inferencing' in the context of ML models?
Batch inferencing involves making predictions on a large collection of data points at once, typically in a scheduled manner, rather than in real-time.
Which of the following best describes the 'effect of inference parameters on model responses'?
Inference parameters like temperature, top-k, top-p, and max length significantly influence the style, diversity, length, and determinism of the generated output from a foundation model.
What are 'embeddings' in the context of Generative AI and NLP?
Embeddings are numerical vector representations of words, sentences, or other data types, where similar items have similar vector representations. They capture semantic meaning.
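As an illustrative sketch, this is roughly how an embedding can be generated with Amazon Titan Text Embeddings on Bedrock; the model ID, request body format, and response keys are assumptions and should be checked against the current Bedrock documentation.

```python
import json
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.invoke_model(
    modelId="amazon.titan-embed-text-v1",               # assumed model ID
    body=json.dumps({"inputText": "Where is my order?"}),
    contentType="application/json",
    accept="application/json",
)
embedding = json.loads(response["body"].read())["embedding"]

# The sentence is now a dense numeric vector; semantically similar
# sentences map to nearby vectors.
print(len(embedding), embedding[:5])
```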
What is 'data labeling' in the context of preparing data for fine-tuning a supervised ML model?
Data labeling is the process of annotating raw data (e.g., images, text) with informative labels or tags that provide the ground truth for the model to learn from during supervised fine-tuning.
Which method of fine-tuning involves adapting a pre-trained model to a new task that is similar to the one it was originally trained on, using a relatively small dataset for the new task?
Transfer learning is a technique where a model developed for a task is reused as the starting point for a model on a second, similar task. It's particularly useful when the dataset for the new task is small.
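A short Keras sketch of the pattern (the backbone, input size, and class count are illustrative choices): reuse a pretrained model, freeze its weights, and train only a small new head on the small task-specific dataset.

```python
import tensorflow as tf

# Reuse an ImageNet-pretrained backbone and keep its weights fixed.
base = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights="imagenet"
)
base.trainable = False

# Only the new classification head is trained on the small new dataset.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(3, activation="softmax"),  # 3 new target classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(new_task_images, new_task_labels, epochs=5)
```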
When preparing data for fine-tuning a foundation model, what does 'data curation' primarily involve?
Data curation involves selecting, cleaning, and organizing data to ensure it is high-quality, relevant, and suitable for the fine-tuning task.
If an AI application needs to comply with specific industry regulations (e.g., HIPAA for healthcare), what is a key responsibility of the customer using AWS AI services?
While AWS provides compliant infrastructure and services, the customer is responsible for configuring those services and building their applications in a way that meets the specific requirements of regulations like HIPAA, including data handling, access controls, and audit logging.
What is a 'business application' of Retrieval-Augmented Generation (RAG)?
RAG is highly valuable for building customer support chatbots that can access and use a company's latest product manuals or FAQs to provide accurate answers, reducing the need for frequent model retraining.
What is a best practice for securing data used to train AI models on AWS?
Encrypting data both at rest (e.g., in Amazon S3) and in transit (e.g., using TLS/SSL) is a fundamental security best practice to protect sensitive training data.
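For example, a boto3 sketch of uploading a training file to S3 with SSE-KMS encryption at rest (bucket, key, and KMS alias are placeholders); boto3 uses HTTPS/TLS by default, which protects the data in transit.

```python
import boto3

s3 = boto3.client("s3")

with open("train.csv", "rb") as data:
    s3.put_object(
        Bucket="my-training-data-bucket",          # placeholder bucket
        Key="datasets/train.csv",
        Body=data,
        ServerSideEncryption="aws:kms",            # encrypt at rest with KMS
        SSEKMSKeyId="alias/my-training-data-key",  # placeholder key alias
    )
```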
What does 'fine-tuning' a foundation model involve?
Fine-tuning involves taking a pre-trained foundation model and further training it on a smaller, task-specific dataset to adapt its capabilities to that particular task.
When deploying an AI model that handles sensitive data, what is a key security consideration for the inference endpoint?
Securing the inference endpoint with proper authentication, authorization, and network controls (e.g., using VPCs and security groups) is crucial to prevent unauthorized access and protect sensitive data.
Which of the following is a core principle of Responsible AI?
Fairness is a key principle of Responsible AI, ensuring that AI systems do not perpetuate or amplify existing biases and treat all individuals and groups equitably.