AWS Certified Generative AI Developer – Professional (AIP) Exam Prep Guide

～ Proving Your AWS Skills in the Generative AI Era ～

Introduction: What This Certification Means and How to Approach It
Exam Overview and Domains
Key Knowledge by Domain
- 3.1 Leveraging and Selecting Foundation Models
- 3.2 Prompt Engineering
- 3.3 RAG (Retrieval-Augmented Generation) Architecture
- 3.4 Fine-Tuning and Model Customization
- 3.5 Building AI Agents
- 3.6 Security and Responsible AI
- 3.7 AgentCore (New Service)
What You'll Be Able to Do in Practice
Study Methods and Resource Guide
Pre-Exam Checklist

1. Introduction: What This Certification Means and How to Approach It

This document was created as a study guide for the AWS Certified Generative AI Developer – Professional exam (commonly referred to as AIP).

What This Certification Proves

This certification demonstrates your skills in developing and optimizing generative AI applications on AWS. As of 2026, demand for engineers with hands-on generative AI experience is growing rapidly. Earning this certification clearly signals to the market that you can work with AWS's generative AI stack at a professional level.

Core Study Philosophy

Advice: This exam tests understanding, not memorization. Always keep asking yourself "Why does this service exist?" and "What problem does it solve?" as you study.

2. Exam Overview and Domains

Item	Details
Exam Name	AWS Certified Generative AI Developer – Professional
Official Launch	April 2026 onward (beta prior to that)
Question Format	Multiple choice (single and multiple answer)
Exam Duration	180 minutes
Passing Score	750 / 1000 (scaled score)
Exam Fee	¥44,000 (tax included)

Main Exam Domains

Domain	Key Themes
Domain 1	Selecting and leveraging foundation models
Domain 2	Prompt engineering
Domain 3	Designing and building RAG architectures
Domain 4	Fine-tuning and customization
Domain 5	Building AI agents
Domain 6	Security, governance, and responsible AI

3. Key Knowledge by Domain

3.1 Leveraging and Selecting Foundation Models

★ What the Exam Tests

You'll need to understand the characteristics of Foundation Models (FMs) accessible through Amazon Bedrock and be able to select the optimal model for a given use case.

Models You Need to Know

Model Family	Provider	Key Characteristics / Strengths
Claude	Anthropic	Long-form comprehension, logical reasoning, safety-focused
Titan	Amazon	Text generation, embeddings, image generation. AWS-native
Llama	Meta	Open-source lineage, highly customizable
Mistral	Mistral AI	Lightweight and fast, cost-efficient
Stable Diffusion	Stability AI	Specialized for image generation
Command/Embed	Cohere	Strong at text generation and embeddings

Selection Criteria (Frequently Tested)

Factors to consider when selecting a model:

Task type: Text generation, summarization, code generation, image generation, embeddings, etc.
Accuracy requirements: Is high precision required, or is "good enough" acceptable?
Latency requirements: Is real-time response needed?
Cost: Input/output token pricing, inference costs
Context window: Maximum number of tokens the model can accept as input
Multimodal support: Is combined text + image processing needed?

Exam Tip: Expect many questions about the trade-off between "cost optimization" and "accuracy." For questions asking "What is the most cost-efficient approach?", the correct answer usually involves starting with a smaller model and scaling up only if needed.

Key Amazon Bedrock Features

Feature	Description
Model Access	API-based access to FMs from multiple providers
Playground	GUI-based test environment to try out models
Knowledge Bases	Managed service for building RAG pipelines
Agents	Autonomous task execution with external tool integration
Guardrails	Filtering for harmful content
Model Evaluation	Performance comparison across models
Customization	Fine-tuning and continued pre-training

3.2 Prompt Engineering

★ What the Exam Tests

You'll be tested on the name, characteristics, and appropriate use of each prompting technique. Questions in the format "Which prompting technique is best for this situation?" appear frequently.

Core Prompting Techniques

Zero-Shot Prompting

Giving instructions only, with no examples. Relies entirely on the model's pre-trained knowledge.

Please summarize the following text in 3 lines:
[text]

When to use: Simple tasks where the model's general capabilities are sufficient

Few-Shot Prompting

Providing a few input–output examples before presenting the actual task.

Review: "This product is amazing!" → Sentiment: Positive
Review: "It broke and is unusable." → Sentiment: Negative
Review: "It's okay." → Sentiment:

When to use: When you need the model to follow a specific format or classification criteria

Chain-of-Thought (CoT) Prompting

A technique that guides the model through a step-by-step reasoning process. Add instructions like "Think through this step by step."

Problem: A store has 12 apples. 8 are sold, then 5 more arrive.
How many apples are there? Think through this step by step.

When to use: Complex tasks requiring mathematical reasoning or logical thinking

System Prompts

A prompt that defines the model's role, constraints, and behavior. This part is not visible to the end user.

You are an AWS technical support engineer.
Only answer questions related to AWS services.
Keep your responses under 200 characters.

When to use: Any application where consistent response quality needs to be maintained

Prompt Optimization Best Practices

Practice	Description
Be specific	Avoid vague instructions; explicitly state output format, length, and tone
Use delimiters	Separate input sections with XML tags or dividers
Iterate	Don't aim for perfection on the first try; test and refine repeatedly
Use negative instructions	Constraints like "Do not…" are also effective
Tune temperature	Low temperature = deterministic; high temperature = creative

Inference Parameters (Frequently Tested)

Parameter	Role	Effect of Value
Temperature	Controls randomness of output	Low → precise and consistent; High → diverse and creative
Top P	Limits candidate tokens by cumulative probability	Low → conservative; High → diverse
Top K	Selects from the top K candidate tokens	Small → conservative; Large → diverse
Max Tokens	Maximum number of output tokens	Affects cost and response length
Stop Sequences	String(s) that halt generation	Useful for controlling output format

Exam Tip: "Tasks where accuracy matters (code generation, fact-based answers)" → Low Temperature "Tasks where creativity matters (brainstorming, story writing)" → High Temperature This judgment call comes up constantly.

3.3 RAG (Retrieval-Augmented Generation) Architecture

★ What the Exam Tests

The exam heavily focuses on RAG's architecture, the role of each component, and vector database selection. Build patterns using Amazon Bedrock Knowledge Bases are one of the most important topics.

What Is RAG?

RAG (Retrieval-Augmented Generation) is an architecture that retrieves relevant information from external data sources and passes it as context to an LLM to generate a response.

It reduces hallucination (plausible-sounding but incorrect answers) — a known weakness of standalone LLMs — and enables accurate responses grounded in up-to-date internal data.

RAG Architecture (Processing Flow)

┌──────────────────────────────────────────────────────────────┐
│                    RAG Processing Flow                        │
│                                                              │
│  User Question                                               │
│      ↓                                                       │
│  ① Vectorize the question using an Embedding model           │
│      ↓                                                       │
│  ② Similarity search in the vector DB (semantic search)      │
│      ↓                                                       │
│  ③ Retrieve relevant documents (chunks)                      │
│      ↓                                                       │
│  ④ Inject retrieved info + original question into prompt     │
│      ↓                                                       │
│  ⑤ LLM generates a response                                  │
│      ↓                                                       │
│  Return answer to user                                       │
└──────────────────────────────────────────────────────────────┘

Data Ingestion Pipeline

┌──────────────────────────────────────────────────────────────┐
│               Data Ingestion Pipeline                         │
│                                                              │
│  Data Sources (S3, Web Crawler, etc.)                        │
│      ↓                                                       │
│  ① Load and parse documents                                  │
│      ↓                                                       │
│  ② Chunking (split documents into smaller units)             │
│      ↓                                                       │
│  ③ Vectorize using an Embedding model                        │
│      ↓                                                       │
│  ④ Store in vector DB                                        │
└──────────────────────────────────────────────────────────────┘

Chunking Strategies (Frequently Tested)

Strategy	Description	Best For
Fixed-size	Mechanically split by a set token count	Simple, fast, general-purpose
Semantic	Split by meaningful units	When semantic coherence is important
Hierarchical	Split into parent and child chunks	When both broad context and fine detail are needed
Overlapping	Overlap chunk boundaries	When you want to prevent information loss at boundaries

Exam Tip: Chunks too large → more noise, lower accuracy, higher cost Chunks too small → context is lost, meaningful answers become impossible Expect questions about "choosing the right chunk size."

Vector Database Options

Service	Characteristics	Exam Role
Amazon OpenSearch Serverless	Serverless, hybrid full-text + vector search	Most frequently tested. Appears most often in RAG questions
Amazon Aurora PostgreSQL (pgvector)	Adds vector search to a relational DB	When leveraging an existing RDB
Amazon Neptune	Graph DB + vector search	Combined with knowledge graphs
Pinecone	Third-party, purpose-built for vector search	Can connect from Bedrock Knowledge Bases
Redis Enterprise Cloud	High-speed in-memory + vector search	Low-latency requirements

Amazon Bedrock Knowledge Bases Configuration

Amazon Bedrock Knowledge Bases is a service that lets you build the RAG pipeline described above as a fully managed solution.

Supported Data Sources:

Amazon S3 (most common)
Web Crawler
Confluence
SharePoint
Salesforce

Key Configuration Options:

Choice of Embedding model (Titan Embeddings, etc.)
Choice of chunking strategy
Choice of vector DB
Metadata filtering settings

Exam Tip: "Build a chatbot that returns accurate answers using internal documents" → RAG (Bedrock Knowledge Bases) is the go-to answer. Make sure you understand why RAG is preferred over fine-tuning (data freshness, cost, ease of implementation).

RAG vs. Fine-Tuning (A Very Frequently Tested Comparison)

Aspect	RAG	Fine-Tuning
Purpose	Improve answer accuracy by referencing external knowledge	Change model behavior or style
Data freshness	Can reference the latest data in real time	Depends on data available at training time
Cost	Increased tokens at inference (added context)	Training cost (GPU time) required
Implementation complexity	Relatively simple (especially with Bedrock Knowledge Bases)	Requires data prep, training, and evaluation
Best use cases	Internal FAQs, knowledge search, referencing current info	Specific tone/format, learning domain-specific terminology

3.4 Fine-Tuning and Model Customization

★ What the Exam Tests

You'll be tested on the types of fine-tuning, their use cases, cost trade-offs, and when to use fine-tuning vs. RAG.

Comparison of Customization Approaches

Approach	Cost	Effectiveness	When to Apply
Prompt Engineering	Lowest	Limited	Always try this first
RAG	Moderate	Highly effective for knowledge expansion	When external knowledge retrieval is needed
Fine-Tuning	High	Highly effective for changing model behavior	Specialized tasks in a specific domain
Continued Pre-Training	Highest	Fundamentally adds domain knowledge	Adding a new language or specialized field

Exam Tip: When the question says "most cost-efficient" or "what should you try first," the correct answer pattern is: Prompt Engineering → RAG → Fine-Tuning, in that order.

The Fine-Tuning Process

Prepare training data: Create input–output pairs in JSONL format
Upload data to S3
Create a customization job in Bedrock
Train the model (Provisioned Throughput required)
Evaluate the custom model
Deploy and use

When to Choose Fine-Tuning

You want to teach the model a specific response style or tone
You need the model to understand industry-specific terminology or abbreviations
You need consistent output in a specific format (JSON, XML, etc.)
Prompt engineering and RAG don't achieve sufficient accuracy

3.5 Building AI Agents

★ What the Exam Tests

You'll be tested on how Amazon Bedrock Agents work, integrating actions with Knowledge Bases, and connecting to Lambda functions.

What Are Amazon Bedrock Agents?

Bedrock Agents enable an LLM to interact with external APIs and data sources to autonomously execute multi-step tasks.

Agent Components

Component	Description
Foundation Model	The LLM that serves as the agent's "brain"
Instructions	A prompt defining the agent's role and constraints
Action Groups	External operations the agent can perform (implemented via Lambda functions)
Knowledge Bases	Internal data the agent can reference (RAG)

Agent Execution Flow

User question
    ↓
Agent analyzes the question (orchestration)
    ↓
Selects actions as needed
    ├→ Search Knowledge Base → retrieve relevant info
    ├→ Invoke Lambda function → external API/DB operation
    └→ Additional reasoning needed → query model again
    ↓
Generate final answer and return to user

Defining Action Groups

Action groups are defined using an OpenAPI schema and linked to a backend Lambda function.

# Example OpenAPI schema
paths:
  /getOrderStatus:
    get:
      summary: "Get the status of an order"
      parameters:
        - name: orderId
          description: "Order ID"
          required: true

Exam Tip: "Retrieve or update data from an external system based on a user's question" → Bedrock Agents + Action Groups (Lambda) is the correct answer pattern.

3.6 Security and Responsible AI

★ What the Exam Tests

You'll be tested on configuring Guardrails, ensuring data privacy, IAM-based access control, and harmful content filtering.

Amazon Bedrock Guardrails

Feature	Description
Content Filters	Detect and block violent, sexual, or discriminatory content
Denied Topics	Refuse to respond to specific topics
Word Filters	Block specific words or phrases
PII Detection	Detect and mask personally identifiable information
Contextual Grounding	Hallucination detection (verifying alignment with source)

Security Best Practices

Least privilege with IAM: Restrict Bedrock model access to the minimum necessary
Use VPC endpoints: Access Bedrock without going through the public internet
Encrypt data: At rest (KMS) and in transit (TLS)
Audit with CloudTrail: Log all API calls
Model invocation logging: Record inputs and outputs (S3 / CloudWatch Logs)

Responsible AI Principles (Tested Points)

Fairness: Detecting and mitigating bias
Explainability: Being able to present the reasoning behind model decisions
Privacy: Handling personal data appropriately
Safety: Preventing harmful outputs
Transparency: Disclosing when content is AI-generated

Exam Tip: "How do you prevent data leakage when processing data that may contain PII with an LLM?" → Bedrock Guardrails (PII detection and masking) is the correct answer.

3.7 AgentCore (New Service: Announced 2025)

★ What the Exam Tests

Since AgentCore is a relatively new service, the exam focuses on understanding its basic positioning and key components rather than deep technical details. In some questions, AgentCore may be the ideal answer but not appear as an option — a solid grasp of the overview is enough to handle those.

What Is AgentCore?

AgentCore is a managed production infrastructure service that supports the shift from "calling a model" to "operating autonomous agents".

Before AgentCore, the dominant pattern was "call an LLM, get a response (+ RAG)." In 2025, the paradigm fundamentally shifted toward building agents that plan, execute, learn, and act autonomously.

The "6 Challenges" from POC to Production

AgentCore was built to address the following challenges:

Challenge	Description
① Accuracy	Real users don't behave the way demos assume
② Scalability	Supporting many users across many domains
③ Memory	Safe memory management across users and agents
④ Security	Access control to production systems and real data
⑤ Cost	Controlling inference token and hosting costs
⑥ Observability	Real-time visibility into agent behavior

AgentCore's 7 Key Components

← The Nobita-Type Talent Theory The Conditions for Being the Rarest Kind of Person Approaches and Limits of Weather Map Analysis Using Generative AI [Part 1] — The "Plausible Lies" Told with Absolute Confidence → © 2026 da-leca · sitemap

AWS Certified Generative AI Developer – Professional (AIP) Exam Prep Guide

～ Proving Your AWS Skills in the Generative AI Era ～

Table of Contents

1. Introduction: What This Certification Means and How to Approach It

What This Certification Proves

Core Study Philosophy

2. Exam Overview and Domains

Main Exam Domains

3. Key Knowledge by Domain

3.1 Leveraging and Selecting Foundation Models

★ What the Exam Tests

Models You Need to Know

Selection Criteria (Frequently Tested)

Key Amazon Bedrock Features

3.2 Prompt Engineering

★ What the Exam Tests

Core Prompting Techniques

Zero-Shot Prompting

Few-Shot Prompting

Chain-of-Thought (CoT) Prompting

System Prompts

Prompt Optimization Best Practices

Inference Parameters (Frequently Tested)

3.3 RAG (Retrieval-Augmented Generation) Architecture

★ What the Exam Tests

What Is RAG?

RAG Architecture (Processing Flow)

Data Ingestion Pipeline

Chunking Strategies (Frequently Tested)

Vector Database Options

Amazon Bedrock Knowledge Bases Configuration

RAG vs. Fine-Tuning (A Very Frequently Tested Comparison)

3.4 Fine-Tuning and Model Customization

★ What the Exam Tests

Comparison of Customization Approaches

The Fine-Tuning Process

When to Choose Fine-Tuning

3.5 Building AI Agents

★ What the Exam Tests

What Are Amazon Bedrock Agents?

Agent Components

Agent Execution Flow

Defining Action Groups

3.6 Security and Responsible AI

★ What the Exam Tests

Amazon Bedrock Guardrails

Security Best Practices

Responsible AI Principles (Tested Points)

3.7 AgentCore (New Service: Announced 2025)

★ What the Exam Tests

What Is AgentCore?

The "6 Challenges" from POC to Production

AgentCore's 7 Key Components

Related posts