Orchestrating Intelligent Agents: Building Workflows with OpenAI GPT OSS on Amazon SageMaker and Bedrock AgentCore

Introduction

Large Language Models (LLMs) have demonstrated remarkable proficiency in understanding and generating human language. However, creating sophisticated, real-world agentic applications necessitates robust mechanisms for workflow management, seamless tool integration, and effective context handling. Multi-agent architectures offer a powerful solution by decomposing complex systems into specialized, collaborative agents. This approach, while effective, introduces its own set of challenges related to agent coordination, memory retention, and overall workflow orchestration.

This tutorial will guide you through the process of building an agentic stock analyzer. We will leverage key AWS services and open-source frameworks to create a scalable and modular system. The core components include LangGraph for orchestrating the multi-agent framework and Amazon Bedrock AgentCore for deploying these agents.

Solution Overview

The architecture we will build comprises a pipeline of specialized agents designed to work collaboratively. A user's query initiates a workflow that is managed by Amazon Bedrock AgentCore Runtime, running entirely on AWS. This pipeline includes distinct agents, such as a Data Gathering Agent, a Stock Performance Analyzer Agent, and a Stock Report Generation Agent. Each agent is responsible for a specific phase of the stock evaluation process.

These agents interact within the Amazon Bedrock AgentCore Runtime environment. When advanced language understanding or generation is required, they invoke a GPT OSS model hosted on Amazon SageMaker AI. The model processes the input and returns structured outputs, which then inform the subsequent actions of the agents. This results in a fully serverless, modular, and scalable agentic system that effectively utilizes open-source models.

Prerequisites: Deploying GPT-OSS Models to SageMaker Inference

For organizations that need to customize their models and serving frameworks, SageMaker provides a fully managed hosting platform. It provisions the necessary infrastructure, including GPUs and serving frameworks, and manages model deployment. OpenAI's GPT-OSS models use a 4-bit quantization scheme (MXFP4) that enables rapid inference while minimizing resource consumption, and they are compatible with a range of GPU instances, including P5 (H100), P6 (B200), P4 (A100), and G6e (L40S).

The GPT-OSS models are sparse Mixture of Experts (MoE) architectures, available with 128 experts (120B parameters) or 32 experts (20B parameters). For each token, only four experts are activated, and there are no shared experts. Applying MXFP4 quantization to the MoE weights alone shrinks the checkpoints considerably, to approximately 63 GB for the 120B model and around 14 GB for the 20B model, small enough for even the larger model to run efficiently on a single H100 GPU.
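As a quick sanity check on those figures, you can compute the effective storage cost per parameter that the quoted sizes imply. The 120B model lands almost exactly at MXFP4's ~4 bits per parameter, while the 20B model averages higher because its unquantized non-MoE weights make up a larger share of the total:

```python
# Back-of-envelope check: effective bits per parameter implied by the quoted sizes.
def effective_bits_per_param(size_gb: float, params_billion: float) -> float:
    # size in GB * 8 bits per byte, divided by parameter count in billions
    return size_gb * 8 / params_billion

print(effective_bits_per_param(63, 120))  # 4.2 -- close to MXFP4's ~4 bits
print(effective_bits_per_param(14, 20))   # 5.6 -- larger share of unquantized weights
```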

To deploy these models effectively, a high-performance serving framework such as vLLM is essential. We will build a vLLM container, using a recent release that supports the GPT-OSS models, and deploy it on SageMaker AI from a SageMaker Studio environment, using JupyterLab to run the notebooks.

The deployment configuration involves specifying parameters such as the inference image URI, instance type, number of GPUs, and model name. For instance:

import json

import boto3
import sagemaker
from sagemaker.utils import name_from_base

region = boto3.session.Session().region_name
account_id = boto3.client("sts").get_caller_identity()["Account"]

inference_image = f"{account_id}.dkr.ecr.{region}.amazonaws.com/vllm:v0.10.0-gpt-oss"
instance_type = "ml.g6e.4xlarge"
num_gpu = 1
model_name = name_from_base("model-byoc")
endpoint_name = model_name
inference_component_name = f"ic-{model_name}"

# vLLM serving options, passed to the container as environment variables
config = {
    "OPTION_MODEL": "openai/gpt-oss-20b",
    "OPTION_SERVED_MODEL_NAME": "model",
    "OPTION_TENSOR_PARALLEL_SIZE": json.dumps(num_gpu),
    "OPTION_ASYNC_SCHEDULING": "true",
}

Once the deployment configuration is prepared, you can deploy the model to SageMaker AI using the following Python code:

import sagemaker
from sagemaker.compute_resource_requirements.resource_requirements import ResourceRequirements
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

role = sagemaker.get_execution_role()  # IAM role assumed by the endpoint

lmi_model = sagemaker.Model(
    image_uri=inference_image,
    env=config,
    role=role,
    name=model_name,
)

# deploy() returns a Predictor (here `llm`), used for inference requests below
llm = lmi_model.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
    container_startup_health_check_timeout=600,
    endpoint_name=endpoint_name,
    endpoint_type=sagemaker.enums.EndpointType.INFERENCE_COMPONENT_BASED,
    inference_component_name=inference_component_name,
    resources=ResourceRequirements(
        requests={"num_accelerators": num_gpu, "memory": 1024 * 5, "copies": 1}
    ),
    serializer=JSONSerializer(),
    deserializer=JSONDeserializer(),
)

After successful deployment, you can test the model with an inference request:

payload = {
    "messages": [
        {"role": "user", "content": "Name popular places to visit in London?"}
    ]
}
res = llm.predict(payload)
print("-----\n" + res["choices"][0]["message"]["content"] + "\n-----\n")
print(res["usage"])

The output will provide a list of popular places in London, demonstrating the model's ability to understand and respond to natural language queries.

Building a Stock Analyzer Agent with LangGraph

LangGraph provides a powerful framework for constructing agentic workflows. For our stock analyzer, we define three essential tools:

  • `gather_stock_data` tool: This tool retrieves comprehensive stock data for a specified ticker symbol. It collects current pricing, historical performance, financial metrics, and recent news headlines, returning this information in a structured format.
  • `analyze_stock_performance` tool: This tool performs in-depth technical and fundamental analysis of the gathered stock data. It calculates key metrics such as price trends, volatility, and overall investment scores, evaluating factors like P/E ratios, profit margins, and dividend yields to provide a thorough performance analysis.
  • `generate_stock_report` tool: This tool creates professional PDF reports from the stock data and analysis. These reports are automatically uploaded to Amazon S3 in organized, date-based folders.
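The tools themselves are ordinary Python functions. The following is a minimal, self-contained sketch of the scoring idea behind `analyze_stock_performance`; the thresholds, field names, and 3-point scale are illustrative assumptions, not the original implementation:

```python
def analyze_stock_performance(data: dict) -> dict:
    """Score a stock on a few simple fundamental heuristics (illustrative only)."""
    score = 0
    if data["pe_ratio"] < 40:          # reasonable valuation
        score += 1
    if data["profit_margin"] > 0.15:   # healthy profitability
        score += 1
    if data["dividend_yield"] > 0:     # pays a dividend
        score += 1
    return {"symbol": data["symbol"], "fundamental_score": f"{score}/3"}

# Simulated data in the shape gather_stock_data might return
sample = {"symbol": "SIM_STOCK", "pe_ratio": 44.8,
          "profit_margin": 0.243, "dividend_yield": 0.0046}
print(analyze_stock_performance(sample))  # {'symbol': 'SIM_STOCK', 'fundamental_score': '2/3'}
```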

For iterative development and local testing, you can utilize a simplified version of the system by importing the necessary functions directly from your local script. For example:

from langgraph_stock_local import langgraph_stock_sagemaker
# Test the agent locally
result = langgraph_stock_sagemaker({
    "prompt": "Analyze SIM_STOCK Stock for Investment purposes."
})
print(result)

This local testing approach allows for rapid iteration on agent logic before deploying to a scalable platform, ensuring each component functions correctly and the overall workflow yields the desired results across various stock analysis scenarios.

Deploying to Amazon Bedrock AgentCore

Once your LangGraph framework has been developed and tested locally, the next step is to deploy it to Amazon Bedrock AgentCore Runtime. Amazon Bedrock AgentCore simplifies infrastructure management by handling container orchestration, session management, and scalability, providing persistent execution environments that maintain an agent's state across multiple invocations.

First, you need to create an IAM role with the necessary permissions for Bedrock AgentCore to interact with your SageMaker endpoint. This can be done using a utility function like `create_bedrock_agentcore_role`:

from create_agentcore_role import create_bedrock_agentcore_role
role_arn = create_bedrock_agentcore_role(
    role_name="MyStockAnalyzerRole",
    sagemaker_endpoint_name="your-endpoint-name",
    region="us-west-2"
)
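`create_bedrock_agentcore_role` is a helper from the sample code. If you build the role by hand instead, the two key pieces are a trust policy for the AgentCore service principal and a permissions policy allowing invocation of your SageMaker endpoint. The sketch below only assembles the policy documents; the service principal name and placeholder account ID are assumptions to verify against the AgentCore documentation:

```python
import json

# Placeholder identifiers -- substitute your own
region, account_id = "us-west-2", "123456789012"
endpoint_name = "your-endpoint-name"

# Trust policy: lets the AgentCore service assume the role
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "bedrock-agentcore.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

# Permissions policy: lets the agent call the SageMaker endpoint
invoke_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "sagemaker:InvokeEndpoint",
        "Resource": f"arn:aws:sagemaker:{region}:{account_id}:endpoint/{endpoint_name}",
    }],
}

# These documents would then be passed to iam.create_role / iam.put_role_policy
print(json.dumps(trust_policy, indent=2))
```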

After creating the role, you can leverage the Amazon Bedrock AgentCore Starter Toolkit to streamline the deployment process. This toolkit packages your code, creates the necessary container image, and configures the runtime environment:

from bedrock_agentcore_starter_toolkit import Runtime
agentcore_runtime = Runtime()
# Configure the agent
response = agentcore_runtime.configure(
    entrypoint="langgraph_stock_sagemaker_gpt_oss.py",
    execution_role=role_arn,
    auto_create_ecr=True,
    requirements_file="requirements.txt",
    region="us-west-2",
    agent_name="stock_analyzer_agent"
)
# Deploy to the cloud
launch_result = agentcore_runtime.launch(local=False, local_build=False)

When using `BedrockAgentCoreApp`, an HTTP server is automatically created, listening on port 8080. This server implements the required `/invocations` endpoint for processing agent requests and the `/ping` endpoint for health checks, crucial for asynchronous agents. It also manages content types, response formats, and error handling according to AWS standards.
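That contract is simple enough to sketch in plain Python. The dispatcher below mirrors the two routes described above; the payload and response keys are illustrative, not the exact AgentCore wire format:

```python
import json

def handle(path: str, body: bytes = b"{}") -> tuple:
    """Dispatch for the two routes an AgentCore runtime container serves on port 8080."""
    if path == "/ping":                # health check
        return 200, json.dumps({"status": "healthy"})
    if path == "/invocations":         # agent request
        payload = json.loads(body)
        answer = f"analysis for: {payload.get('prompt', '')}"  # agent logic goes here
        return 200, json.dumps({"result": answer})
    return 404, json.dumps({"error": "not found"})

print(handle("/ping"))
print(handle("/invocations", b'{"prompt": "Analyze SIM_STOCK"}'))
```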

Following a successful deployment to Amazon Bedrock AgentCore Runtime, the agent's status will appear as Ready in the Amazon Bedrock AgentCore console.

Invoking the Agent

After the agent is created and deployed, you need to set up the invocation entry point. With Amazon Bedrock AgentCore Runtime, the invocation logic is marked with the `@app.entrypoint` decorator, which serves as the entry point for the runtime. Once deployed, you can invoke the agent using the AWS SDK:

import boto3
import json
agentcore_client = boto3.client('bedrock-agentcore', region_name='us-west-2')
response = agentcore_client.invoke_agent_runtime(
    agentRuntimeArn=launch_result.agent_arn,
    qualifier="DEFAULT",
    payload=json.dumps({
        "prompt": "Analyze SIM_STOCK for investment purposes"
    })
)

Upon invoking the stock analyzer agent through Amazon Bedrock AgentCore Runtime, the response must be parsed and formatted for clear presentation. The response processing involves several steps:

  1. Decoding the byte stream received from Amazon Bedrock AgentCore into readable text.
  2. Parsing the JSON response that contains the complete stock analysis.
  3. Extracting three main sections using regular expression pattern matching:
    • Stock data gathering section: core stock information, including the symbol, company details, current pricing, market metrics, financial ratios, trading data, and recent news headlines.
    • Performance analysis section: technical indicators, fundamental metrics, and volatility measures, combined into a comprehensive performance assessment.
    • Stock report generation section: details of the generated PDF report containing the full technical and fundamental analysis.

The system also incorporates error handling mechanisms. These gracefully manage JSON parsing errors, provide a fallback to plain text display if structured parsing fails, and offer debugging information to assist in troubleshooting parsing issues of the stock analysis response.
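A condensed sketch of what such a parser might look like follows; the section headers match the sample output shown later, while the "result" envelope key and the fallback shape are assumptions:

```python
import json
import re

# Section headers as they appear in the agent's text output
SECTION_PATTERNS = {
    "data_gathering": r"STOCK DATA GATHERING REPORT:(.*?)(?=STOCK PERFORMANCE ANALYSIS:|\Z)",
    "performance":    r"STOCK PERFORMANCE ANALYSIS:(.*?)(?=STOCK REPORT GENERATION:|\Z)",
    "report":         r"STOCK REPORT GENERATION:(.*)\Z",
}

def parse_stock_response(raw: bytes) -> dict:
    """Decode the byte stream, parse the JSON envelope, and split out the sections."""
    try:
        text = json.loads(raw.decode("utf-8"))["result"]  # envelope key is an assumption
    except (json.JSONDecodeError, KeyError):
        # Fallback to plain text if structured parsing fails
        return {"raw_text": raw.decode("utf-8", errors="replace")}
    return {name: (m.group(1).strip() if (m := re.search(p, text, re.DOTALL)) else None)
            for name, p in SECTION_PATTERNS.items()}

body = json.dumps({"result": "STOCK DATA GATHERING REPORT:\nPrice: $29.31\n"
                             "STOCK PERFORMANCE ANALYSIS:\nScore: 3/5\n"
                             "STOCK REPORT GENERATION:\nPDF uploaded"}).encode()
print(parse_stock_response(body)["performance"])  # Score: 3/5
```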

stock_analysis = parse_bedrock_agentcore_stock_response(invoke_response)

This formatted output facilitates a straightforward review of the agent's decision-making process and enables the presentation of professional stock analysis results to stakeholders, thereby completing the end-to-end workflow from model deployment to actionable business insights.

STOCK DATA GATHERING REPORT:
================================
Stock Symbol: SIM_STOCK
Company Name: Simulated Stock Inc.
Sector: SIM_SECTOR
Industry: SIM INDUSTRY
CURRENT MARKET DATA:
- Current Price: $29.31
- Market Cap: $3,958
- 52-Week High: $29.18
- 52-Week Low: $16.80
- YTD Return: 1.30%
- Volatility (Annualized): 32.22%
FINANCIAL METRICS:
- P/E Ratio: 44.80
- Forward P/E: 47.59
- Price-to-Book: 11.75
- Dividend Yield: 0.46%
- Revenue (TTM): $4,988
- Profit Margin: 24.30%
STOCK PERFORMANCE ANALYSIS:
===============================
Stock: SIM_STOCK | Current Price: $29.31
TECHNICAL ANALYSIS:
- Price Trend: SLIGHT UPTREND
- YTD Performance: 1.03%
- Technical Score: 3/5
FUNDAMENTAL ANALYSIS:
- P/E Ratio: 34.80
- Profit Margin: 24.30%
- Dividend Yield: 0.46%
- Beta: 1.165
- Fundamental Score: 3/5
STOCK REPORT GENERATION:
===============================
Stock: SIM_STOCK 
Sector: SIM_INDUSTRY
Current Price: $29.78
REPORT SUMMARY:
- Technical Analysis: 8.33% YTD performance
- Report Type: Comprehensive stock analysis for informational purposes
- Generated: 2025-09-04 23:11:55
PDF report uploaded to S3: s3://amzn-s3-demo-bucket/2025/09/04/SIM_STOCK_Stock_Report_20250904_231155.pdf
REPORT CONTENTS:
• Executive Summary with key metrics
• Detailed market data and financial metrics
• Technical and fundamental analysis
• Professional formatting for documentation

Clean Up

To avoid incurring unnecessary costs after completing your testing, it is crucial to delete the SageMaker endpoint and related resources. This can be achieved by running the following commands in your notebook environment:

import sagemaker

sess = sagemaker.Session()  # reuse your existing session if one is already defined
sess.delete_inference_component(inference_component_name)
sess.delete_endpoint(endpoint_name)
sess.delete_endpoint_config(endpoint_name)
sess.delete_model(model_name)

Additionally, you should clean up the Amazon Bedrock AgentCore resources. This can be done using the following commands:

import boto3

agentcore_control_client = boto3.client('bedrock-agentcore-control', region_name='us-west-2')
ecr_client = boto3.client('ecr', region_name='us-west-2')

runtime_delete_response = agentcore_control_client.delete_agent_runtime(
    agentRuntimeId=launch_result.agent_id
)
response = ecr_client.delete_repository(
    repositoryName=launch_result.ecr_uri.split('/')[1],
    force=True
)

Conclusion

In this tutorial, we have successfully built and deployed an end-to-end solution for a multi-agent stock analysis system. We utilized OpenAI's open-weight models deployed on Amazon SageMaker AI, orchestrated the workflow with LangGraph, and managed the deployment seamlessly using Amazon Bedrock AgentCore. This implementation showcases how organizations can leverage powerful open-source LLMs cost-effectively, enhanced by efficient serving frameworks like vLLM.

Beyond the technical implementation, this workflow offers significant business value. It can drastically reduce stock analysis processing times and increase analyst productivity by automating routine assessments. By offloading repetitive tasks to AI agents, organizations can empower their skilled analysts to focus on more complex cases and strategic relationship-building activities that are critical for business growth. This approach not only optimizes operational efficiency but also enhances the strategic capabilities of the workforce.

