AWS re:Invent 2025 - Keynote with CEO Matt Garman
By AWS Events
Summary
## Key takeaways
- **AWS $132B Revenue, 20% Growth**: AWS has grown to be a $132 billion business accelerating 20% year-over-year. The amount we grew in the last year alone, about $22 billion, is larger than the annual revenue of more than half of the Fortune 500. [03:26], [03:40]
- **P6e-GB300 NVL72 GA Launch**: I'm excited to announce the new P6e-GB300. These are powered by NVIDIA's latest GB300 NVL72 systems, and we continue to bring you the best-in-class compute for the most demanding AI workloads. [15:56], [16:05]
- **Trainium3: 5x Tokens/Megawatt**: Trainium3 UltraServers bring another huge leap forward: 4.4x more compute, 3.9 times the memory bandwidth, and five times more AI tokens per megawatt of power. [22:40], [23:09]
- **Nova 2 Pro Tops GPT-5.1 Benchmarks**: Nova 2 Pro delivers better absolute results compared to leading models like GPT-5.1, Gemini 3 Pro, and Claude Sonnet 4.5 in instruction following and agentic tool use. [31:58], [33:40]
- **Nova Forge: Custom Frontier Models**: Nova Forge is a new service that introduces the concept of open training models. You get exclusive access to a variety of Nova training checkpoints and blend in your own proprietary data to produce a model that deeply understands your information. [41:57], [42:10]
- **Kiro Agent: 6 Devs, 76 Days**: Instead of taking 30 developers 18 months to complete the project, they delivered the entire rearchitecture with only six people in 76 days using the Kiro autonomous agent. [01:41:55], [01:38:41]
Topics Covered
- AWS Grew $22B Last Year, Beats Half Fortune 500
- AI Agents Mark Inflection, Billions Coming
- Trainium3 Delivers 5x AI Tokens per Megawatt
- Nova Forge Blends Data into Frontier Training
- Frontier Agents Reduce Months to Hours
Full Transcript
[Music] [Applause] Please welcome the CEO of AWS, Matt Garman.
Welcome everyone to the 14th annual re:Invent. It's so awesome to be here. We have over 60,000 people here with us in person and almost 2 million watching online, including a bunch of you that are joining us from Fortnite out there. Uh, it's where we're streaming the keynote for the first time. Welcome to everybody, and thank you all for joining us. It is incredible to feel the energy as you walk through the halls here in Las Vegas, and it matches a lot of what I've been seeing as I've been talking to you in recent months. It has been an unbelievable year.
AWS has grown to be a $132 billion business, accelerating 20% year-over-year.
I want to put this a little bit in perspective. The amount we grew in the last year alone is about $22 billion. That absolute growth over the last 12 months is larger than the annual revenue of more than half of the Fortune 500.
And that growth is coming from across the business.
S3 continues to grow, with customers storing more than 500 trillion objects, hundreds of exabytes of data, and averaging over 200 million requests a second every day. For the third year in a row, more than half of the CPU capacity that we've added to the AWS cloud comes from Graviton.
We have millions of customers using our database services, and Amazon Bedrock is now powering AI inference for more than 100,000 companies around the world.
This year, we gave you the first building blocks for deploying and operating highly capable agents securely at enterprise scale with Bedrock AgentCore. And we're seeing incredibly strong momentum from AgentCore. In fact, just a few months since launch, the AgentCore SDK has already been downloaded more than two million times. And we announced Ocelot, our first quantum computing chip prototype. Ocelot is a breakthrough in making quantum a reality, reducing the costs of implementing quantum error correction by more than 90%.
Now, all of this starts with a foundation of secure, available, and resilient planet-scale infrastructure that is frankly unmatched anywhere. AWS has by far the largest and most broadly deployed AI cloud infrastructure anywhere in the world. Our global network of data centers spans 38 regions and 120 Availability Zones, and we've already announced plans for three more regions. In fact, in the last year alone, we've added 3.8 gigawatts of data center capacity, more than anyone in the world. And we have the world's largest private network, which has increased 50% over the last 12 months to now be more than 9 million kilometers of terrestrial and subsea cable. That's enough optical cabling to reach from the Earth to the moon and back over 11 times.
But at Amazon, everything starts with the customer. So, I want to start off by thanking all of you. Today, we have millions of customers running every imaginable use case, the largest enterprises in the world across every single industry and vertical running their businesses on us. You've transformed industries like financial services, healthcare, media and entertainment, telecommunications, even government agencies all around the world. And as you all know, at AWS, security is priority one. For us, everything is built on that foundation.
This is the reason why the US intelligence community has chosen AWS as its cloud of choice for more than a decade. It's why companies like Nasdaq have moved their trading markets to AWS, and why Pfizer chose AWS as the core for their digital transformation.
And since the beginning, we've known how important partners would be to our customers' success. That's why we're so proud to have such a massive network of partners, many of them here with us this week. Thank you to all of our partners, SaaS providers, system integrators, and solution providers serving our massive set of customers all around the world. We couldn't do it without you.
And while I personally appreciate all of our customers, I will tell you that I have a special affinity for our startup customers. More unicorn startups have been built on AWS than anywhere else. And it isn't even close. Thank you to all of you, the innovators out there.
More than ever, every startup, and AI startups in particular, are flocking to AWS. 85% of the Forbes 2025 AI 50 and 85% of the CNBC Disruptor 50 are running on AWS.
It's incredible, and I'm personally amazed at what these founders are inventing, and I thought you all might like to hear from some of them today. Let's hear first from AudioShake, the winner of last year's re:Invent Unicorn Tank pitch competition.
Let's take a rainforest, or a playground, or, no, better yet, three musicians on a street corner. Okay, what if we could just isolate the music? Now the car driving by, or just the conversation going on between the people in the background.
>> I know that guy.
>> Wait, where's our car? At AudioShake, we separate sound so that humans and machines can access it, make sense of it, and understand it in all kinds of new ways. Our multi-speaker separator is the world's first high-resolution separator of speakers into different streams. So it could be isolating individual voices in environments like call centers. We're also used across media and entertainment. But if we think about hearing and speaking impairments, there's a lot that sound separation can do to help. We work with some nonprofits in the ALS space, where they're using old recordings of their patients, separating the voices so that the voice can be cloned and the patient can speak with their original voice from before it started to degrade. When we first started, we were a three-person team. Having the infrastructure to actually get our models into the hands of real customers is something that we couldn't have done without AWS. We run our entire production pipeline on AWS, so everything from inference and storage through to job orchestration and all of production. We're moving into a world where sound should be a lot more customizable than it is today. Eventually, sound separation should be able to help people who have hearing challenges to hear the way they want to hear, while also simultaneously going deeper into helping machines make sense of the real world.
Very cool, and thank you to the AudioShake team for sharing. Now, none of what we do at AWS happens without builders, and specifically developers. AWS has always been passionate about developers, and this conference is, and frankly always has been, a learning conference. It's a little bit different, and it's dedicated to all of you out there. Thank you to every developer here and the millions of additional AWS developers all around the world. A special call out to our AWS Heroes over here. I see you. Thank you so much. Awesome. And thank you also to the million-plus members of our user group community in 129 countries all around the world.
So why do we do this? What motivates us? Why are we just as passionate today as we were 20 years ago when we first launched AWS? What drives us every day is giving you all the freedom to invent. This has been our motivation since the very beginning of AWS. We wanted to make it possible for every developer or inventor in her dorm room or garage to access the technology infrastructure and capabilities so that they could build whatever they could imagine. 20 years ago, it just wasn't possible for developers or builders to get the servers or compute capacity that they needed without investing significant capital and time. Developers were spending way too much of their time procuring servers and managing infrastructure, and not enough of that time building. We've actually felt this ourselves inside of Amazon. We had a company full of builders who had these incredible ideas of how they could make our customers' lives better, but they couldn't move as fast as they wanted.
So we asked ourselves, why not?
Why couldn't developers focus on building instead of on infrastructure?
Why couldn't we bring the time and the cost of experimentation down to zero?
Why not make every idea possible? And
we've spent the last two decades innovating towards those goals.
Giving all of you the freedom to keep inventing is why we're here today. And right now, we're witnessing an explosion of invention with AI. Every single customer experience, every single company, frankly, every single industry is in the process right now of being reinvented. And we're still in the early days of what AI is going to deliver. The technology is iterating faster than anything any of us have ever witnessed before. It wasn't that long ago that we were all testing and experimenting with chatbots. And now it seems like there's something new every day.
But when I speak to customers, and many of you out there, you haven't yet seen the returns that match up to the promise of AI. The true value of AI has not yet been unlocked. But a lot of that is changing fast, too. AI assistants are starting to give way to AI agents that can perform tasks and automate on your behalf. This is where we're starting to see material business returns from your AI investments. I believe that the advent of AI agents has brought us to an inflection point in AI's trajectory. It's turning from a technical wonder into something that delivers real value.
This change is going to have as much impact on your business as the internet or the cloud. I believe that in the future there's going to be billions of agents inside of every company and across every imaginable field.
Already we see agents accelerating healthcare discoveries, improving customer service, making payroll processing more efficient, and agents are also starting to scale people's
impact up by 10x in some cases so they have more time to invent more.
Wouldn't it be awesome if everyone could see that level of impact? We think so.
And that's why we ask the question, why not? Getting to a future of billions of agents, where every organization is getting real-world value and results from AI, is going to require us to push the limits of what's possible with the infrastructure. We're going to have to invent new building blocks for agentic systems and applications. We want to reimagine every single process and the way that all of us work. At AWS, we've been innovating at all of the layers of the stack to give you all the freedom to invent what's next. We have a lot to share. Let's get started.
First off, what are the components that you need to deliver agents that are going to truly deliver value for you? It
starts with having the most scalable and powerful AI infrastructure to power everything.
You have to have a highly scalable and secure cloud that delivers the absolute best performance for your AI workloads.
And you're going to want it at the lowest possible cost across your model training, customization, and inference. Now, that's quite easy to say, but delivering it requires optimizing across every single layer of hardware and software, and that is something that only AWS does. It turns out there are no shortcuts. Now, when you think about AI infrastructure, one of the first things that comes to mind is GPUs.
And AWS is by far the best place to run NVIDIA GPUs. We were actually the first to offer NVIDIA GPUs in the cloud, and we've been collaborating with NVIDIA for over 15 years. And what that means is that we've learned to operate GPUs at scale. In fact, if you talk to anyone who's run large GPU clusters at any other provider, they'll tell you that AWS is by far the most stable at running a GPU cluster. We're much better at avoiding node failures, and we definitely deliver the best reliability. And the reason for that is because we sweat the details, minor things like debugging BIOS issues to prevent GPU reboots. If you go to other places, they just kind of accept that as how it works, and you go on about your business. Not us. We'll investigate and root-cause every single one of them. And then we collaborate with our partners at NVIDIA to make sure that we're making constant improvements. Nothing is too small for us to be focused on. Those details really matter, and it's why we lead the industry in GPU reliability. It takes hard work and real engineering to make that happen.
And we improve on those dimensions with every generation. This year we launched our P6 generation of EC2 instances featuring the NVIDIA Blackwell processor, including the P6e-GB200 UltraServer, which provides over 20x the compute compared to our previous P5en generation. These are ideal for customers that are working with really large AI models out there. And we're doing that again today. I'm excited to announce the new P6e-GB300. These are powered by NVIDIA's latest GB300 NVL72 systems, and we continue to bring you the best-in-class compute for the most demanding AI workloads. Our full-stack approach to hardware and software, plus operational rigor, delivers you the absolute best performance and reliability for the biggest organizations in the world. This includes NVIDIA themselves, by the way, who run their large-scale gen AI cluster, Project Ceiba, on AWS, and many others like OpenAI, who are actively running on AWS today. They're using clusters of EC2 UltraServers with hundreds of thousands of chips today, GB200s and soon GB300s, and they have the ability to scale to more than tens of millions of CPUs to manage their agentic workflows. All of this is to support their ChatGPT application, which I'm sure many of you use, as well as the training of their next-generation models. Or take HUMAIN.
HUMAIN is Saudi Arabia's newly created company that's responsible for driving AI innovation in the region. We recently announced a partnership together on a groundbreaking AI zone for the Kingdom of Saudi Arabia. This partnership will bring customers high-performing infrastructure, models, and AI services like SageMaker and Bedrock, all while helping meet the Kingdom's standards for security, privacy, and responsible AI. Now, this type of work has sparked some interest in other large government organizations and in the public sector who are interested in a similar concept. And so we sat back and asked ourselves, could we deliver this type of AI zone to a broader set of customers? Maybe even something that could leverage customers' existing data centers. And that's why today we're excited to announce AWS AI Factories.
With this launch, we're enabling customers to deploy dedicated AWS AI infrastructure in their own data centers, for their exclusive use. Effectively, AWS AI Factories operate like a private AWS Region, letting customers leverage the data center space and power capacity they've already acquired. We also give them access to leading AWS AI infrastructure and services, including the very latest Trainium UltraServers or NVIDIA GPUs, and access to services like SageMaker and Bedrock. These AI Factories operate exclusively for each customer, and that separation helps them maintain the security and reliability that you get from AWS while also meeting stringent compliance and sovereignty requirements. We're super excited to see what these AI Factories unlock for customers.
Now, at AWS we have always been about choice, and if you want the absolute best in AI infrastructure, you need the best compute for AI training and inference. AWS is by far leading the way with the broadest set of options, including our groundbreaking purpose-built AI processors. AWS Trainium is our custom AI chip designed to offer the best price performance for AI workloads.
Now, customers love Trainium for what it achieves on training workloads, but I'm going to have to pause and be a little bit vocally self-critical here. Uh, people often give us a little bit of a hard time about product naming at AWS. No, no, it's true. Well, it turns out Trainium is no exception. We named it Trainium because it's designed to be an awesome chip for AI training. And it is. But as it turns out, Trainium2 is actually the best system in the world currently for inference. Customers often ask me how they can best take advantage of the benefits of Trainium, and what I tell them is: you're probably already using it and you just didn't know it.
In fact, if you look at all the inference that's running in Amazon Bedrock today, the majority is actually powered by Trainium already. And the performance advantage of Trainium is really noticeable. If you're using any of Claude's latest-generation models in Bedrock, all of that traffic is running on Trainium, which is delivering the best end-to-end response times compared to any other major provider. And that's part of the reason why we've deployed over 1 million Trainium chips to date. Now, we've gotten to a million chips in record speed, and that's because we control the whole stack. We can optimize end to end how we roll it out, and it allows us to move even faster. In fact, we've been able to ramp the volumes of Trainium2 in our data centers 4x faster than the next fastest AI chip we've ever ramped, and we're selling them as fast as we can make them. Trainium already represents a multi-billion dollar business today and continues to grow really rapidly. Now, what does it look like when all of this comes together in a system purpose-built around Trainium?
It wasn't that long ago that people had this saying that the data center was the new computer. Well, when you're training this next generation of models, it turns out that the data center campus is the new computer. One of the best models in the world today is Anthropic's Claude. We wanted to give you a little look behind the scenes at how a model like this is born, made possible by Trainium in Project Rainier. Let's take a look.
[Music] Really cool to see the massive scale that we've gotten to with Trainium so quickly. So, what's next? Last year, we announced that we were already hard at work on our next chip, Trainium3, designed to make AI workloads better, faster, and more cost-effective. Today, I'm excited to announce that Trainium3 UltraServers are now generally available. Now, these UltraServers are our most advanced, containing the very first 3-nanometer AI chip in the AWS cloud. Trainium3 offers the industry's best price performance for large-scale AI training and inference. Now, we've talked about the incredible results that we've seen with Trainium2 this past year, but Trainium3 UltraServers bring another huge leap forward: 4.4x more compute, 3.9 times the memory bandwidth, and, this one is super important, five times more AI tokens per megawatt of power.
And as a special surprise, I have a rack of our UltraServers on stage with me today. Our largest Trn3 UltraServers combine 144 Trainium3 chips acting together in a single scale-up domain connected by custom Neuron switches. This delivers a massive 362 FP8 petaflops of compute and over 700 terabytes per second of aggregate bandwidth, all in a single compute instance. And our custom-built EFA networking supports scaling these out to clusters of hundreds of thousands of chips.
No one else can deliver this for you. It requires all of these system-level pieces to be co-designed together. It requires multiple types of custom silicon. It requires scale-up and scale-out networking. It requires a detailed and integrated software stack, and of course, the industry's leading data centers.
In a real-world example of how performance improves, we took a number of open-weights models that had been optimized to run on Trainium2, and we wanted to see how they'd run on Trainium3. As one example, here's an inference benchmark for the popular open-weights GPT-OSS-120B model from OpenAI, which we ran on both Trainium2 and Trainium3. As you see here, with Trn3 we get remarkable efficiency gains over Trainium2. You see over 5x higher output tokens per megawatt, all while maintaining the same latency per user, what we call interactivity in this chart. And this is just one example. We see similar results as we run this across a number of different models, which is fantastic.
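To make that metric concrete, here's a small sketch of how a tokens-per-megawatt comparison like this one is computed. The throughput and power figures below are invented placeholders for illustration, not measured Trainium numbers.

```python
# Illustrative only: how a tokens-per-megawatt comparison is computed.
# The throughput and power figures are invented placeholders, not
# measured Trainium numbers.

def tokens_per_megawatt(tokens_per_sec: float, power_kw: float) -> float:
    """Aggregate output tokens per second, normalized to one megawatt of power."""
    return tokens_per_sec / (power_kw / 1000.0)

trn2 = tokens_per_megawatt(tokens_per_sec=50_000, power_kw=120.0)   # hypothetical
trn3 = tokens_per_megawatt(tokens_per_sec=190_000, power_kw=90.0)   # hypothetical

print(f"relative efficiency: {trn3 / trn2:.1f}x")  # ~5.1x with these placeholders
```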
We're excited to see what Trn3 is going to unlock for customers, but we're also not stopping there. I want to give you a sneak peek at what's coming around the corner. That's why I'm excited to announce that we're already hard at work on Trainium4. We're well into designing it, and we're excited about what we're seeing already. Trainium4 is going to bring massive leaps across every single dimension. Compared to Trainium3, Trainium4 will deliver six times the FP4 compute performance, four times more memory bandwidth, and two times more high-bandwidth memory capacity to support the very largest models in the world. Trainium continues to push the bounds of what's possible with AI infrastructure, so you all can be freed to push the bounds of your industry.
Let's hear from a startup that's using AWS's massive AI infrastructure to transform computational biology.
>> We're trying to create a kind of beautiful mind for science that can be a polymath across fields: materials science, chemistry, and life. But the internet and prior data only take you so far. You have to be capable of testing things in the real world. Lila is building the first of what we call AI science factories, which are an infrastructure through which AI can autonomously propose hypotheses, design experiments, and then run those experiments in the real world, with all of the results of the successes and failures flowing into models, which become superintelligent by running the scientific method itself. It won't surprise anybody that building scientific superintelligence requires a lot of computing. Lila is at trillions of tokens of scientific reasoning today. We expect that to go up by no less than 100x over the next few years. AWS is an incredible partner because as the scale and speed and intelligence of science goes up, the scale and speed of computing and the security of that process are going to be more important than ever. AWS is the best in the world at that combination. What this means for humankind is that by building a very broad new kind of scientific mind, and infrastructure that can scalably set that mind in motion to find cures, new energy technologies, new materials, and more, we can collectively pull a better future into the present.
Really amazing to see what this scale of compute is enabling customers like Lila to accomplish. It's just incredible invention that's happening at the infrastructure layer. But we also know that infrastructure is just a part of the story. We're seeing nearly every single application in the world being reinvented by AI, and we're moving to a future where inference is an integral part of every single application that everyone builds. Now, to be successful in that future, you need a secure, scalable, feature-rich inference platform that you all can build on. And that's why we developed Amazon Bedrock.
Bedrock is a comprehensive platform that helps you fast-track your generative AI applications as you move from prototype into production. With Bedrock, you get a broad choice of all the latest models. You have the ability to customize these models for your individual use case and your performance needs. You get the tools to integrate them with your data and the capabilities to add guardrails as you need them. All of this comes with the security and integrations that make it easy to build on applications and data that you already have in AWS.
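For a sense of what building on Bedrock looks like in code, here's a minimal sketch of calling a Bedrock-hosted model through boto3's Converse API. The model ID, region, and prompt are illustrative; substitute any model available in your account.

```python
# Minimal sketch: invoking a Bedrock-hosted model via the Converse API.
# The model ID and region are illustrative placeholders.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.converse(
    modelId="amazon.nova-lite-v1:0",  # any Bedrock model ID works here
    messages=[
        {"role": "user", "content": [{"text": "Summarize our Q3 support tickets."}]}
    ],
    inferenceConfig={"maxTokens": 512, "temperature": 0.2},
)

print(response["output"]["message"]["content"][0]["text"])
```

Because the Converse API is uniform across models, swapping models is typically a one-line change to `modelId`.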
And companies of every size, in every industry, all around the world are using Bedrock, customers like BMW, GoDaddy, and Strava, just to name a few. With more than twice as many customers building on Bedrock compared to just this time last year, Bedrock is seeing unprecedented momentum. But it's not just the number of customers that are using it. It's actually the volume of the usage that's quite astounding. In fact, some customers are processing a huge number of requests through Bedrock. Today, some of the largest-scale AI applications in the world all run on this platform. I actually asked the team to check for me. We now have over 50 customers that have each processed more than 1 trillion tokens through Bedrock. Incredible scale and momentum.
Now, when you all start building a gen AI application, the very first thing you likely decide on is which model you're going to use. Which is the one that gives you the best cost, the lowest latency? What's going to give you the best answers? And a lot of times, the right answer is actually a mix of different models for your applications or your agents, which is why we think model choice is so critical. We've never believed that there was going to be one model to rule them all, but rather that there would be a ton of great models out there, and it's why we've continued to rapidly build upon an already wide selection of models. We have open-weights models and proprietary models, general-purpose and specialized ones. We have really large ones and small models. And we've nearly doubled the number of models that we offer in Bedrock over the last year.
Today, I'm pleased to announce that we're introducing a whole host of new open weights models.
These include models like Google's Gemma, MiniMax M2, and NVIDIA's Nemotron. And today we have a couple of new models that are debuting to the world for the very first time. I'm excited to announce that, from Mistral AI and available immediately on Bedrock, are two new sets of open-weights models. The first is Mistral Large 3, which is a big leap forward from their Large 2 model, doubling the context window size and vastly increasing the number of model parameters by more than five times. We're also launching Ministral 3 today, a set of three models that offer really great deployment flexibility for ultra-efficient edge devices, single-GPU deployments, or advanced local operations. It's going to be super fun to see how you all use these open-weights models.
Now, in addition to providing a huge selection of third-party models on Bedrock, last year we announced Amazon Nova, Amazon's family of foundation models, delivering the industry's best price performance for many workloads out there. Over the last year, we've extended the Nova family to support more use cases and deliver more possibilities that provide real value. We've unlocked speech-to-speech use cases, for example, with Amazon Nova Sonic. And just a few weeks ago, we launched the industry's best-performing model for creating embeddings across multiple modalities with Nova Multimodal Embeddings.
And the momentum has been really fantastic. Nova has grown to be used by tens of thousands of customers today, everyone from marketing giants like Dentsu to tech leaders like Infosys, Blue Origin, and Robinhood, to innovative startups like NinjaTech AI. And today we're making Nova even better.
Announcing a new generation of Nova with Nova 2.
Nova 2 delivers cost-optimized, low-latency models with frontier-level intelligence. The Nova 2 family includes Nova 2 Lite, our fast and cost-effective reasoning model suitable for a broad set of workloads, and Nova 2 Pro, our most intelligent reasoning model for highly complex workloads. We're also introducing Nova 2 Sonic, our next-generation speech-to-speech model that enables real-time, human-like conversational AI for all of your applications.
Now, Nova 2 Lite delivers incredible price performance for many workloads that we see our customers wanting to deliver in production. Nova 2 Lite compares really favorably in industry benchmarks to models like Claude Haiku 4.5, GPT-5 Mini, and Gemini 2.5 Flash. In particular, Nova 2 Lite excels at things like instruction following, tool calling, generating code, and extracting information from documents, often matching or exceeding the performance of these comparable models at industry-leading cost performance. We think that Nova 2 Lite is going to be a real workhorse and is going to be really popular for a wide variety of your use cases out there.
Nova 2 Pro is our most intelligent reasoning model, and it's going to be great for those really complex workloads. In particular, we look at the really important areas where you need your agents to be great, and that's where Nova 2 Pro really shines: skills like instruction following and agentic tool use. In fact, on those, Nova 2 is frequently coming out on top. If you look at Artificial Analysis benchmarks in those areas, Nova 2 Pro delivers better absolute results compared to leading models like GPT-5.1, Gemini 3 Pro, and Claude Sonnet 4.5.
And for applications that need voice capabilities, Nova 2 Sonic offers industry-leading conversational quality at awesome price performance, with improved latency and significantly expanded language support. We think you're going to love it.
Now I also have one more Nova model that I want to talk about that has a unique set of capabilities.
Now, today's models are actually quite good at reasoning across one type of modality. Say they're looking at an image, or listening to audio, or outputting text, and then maybe outputting in a different modality, say reading text and then creating an image. But in the real world, you have to understand multiple modalities at the same time. Take, for example, this keynote. If you wanted a model to understand all that was going on in the keynote today, to get the nuance of everything we're saying, you'd have to listen to what I'm saying. You'd have to understand the contents of all these slides. You'd have to be able to watch the videos and understand what's going on and what we're showing. Now, let's say you want to take that same model and produce a summary output for your sales team, with a summary of all the launches we announced today along with some images and marketing material.
Now, if you wanted to do this, you could, right? This is possible today. But it means testing a wide variety of different models, stitching them all together, and trying to accomplish this outcome, which is totally doable, but quite hard. It'd be easier if you had a single model that could do all of that. And that's why I'm excited to announce Nova 2 Omni.
It's a unified model for multimodal reasoning and image generation. Nova 2 Omni is the industry's first reasoning model that supports text, image, video, and audio input, and supports text and image generation output. So that's four new industry-leading models for Amazon Nova, and we're just getting started. Up next, let's hear from Gradial, who's building some pretty cool capabilities with Nova and Bedrock.
The biggest slowdown in marketing isn't creativity, it's everything that happens after, and Gradial helps fix that.
Today, the content operations world is fairly manual. To get from a creative brief onto a website takes 4 to 6 weeks and involves 20 different steps that require designers, engineers, copywriters, and web strategists. And so what Gradial's done is connect all of those different systems so that we can go from idea to action. Our orchestration agent decides which sub-agents it actually uses, whether that's the authoring agent, the Figma agent, or the Sitecore agent, and it combines each of those agents to actually get a task done, right? And it's actually making recommendations on how your content can convert audiences better and faster.
We will forever live in a multi-model world. There isn't one model that fits all. And so AWS Bedrock and Nova give us the freedom to use efficiency where we need it, use power when we need it, and use reasoning when we need it. That was crucial for us. AWS is extremely invested in startup success, and that is very clear. I don't think that anybody got into marketing to do a link swap. I think that folks got into marketing to be creative. If you free up that time, just imagine all the things that will be created years from now.
[Music] Really amazing to see what Gradial has been able to do with Bedrock. Now, the models that are available today are really incredible, and it continues to impress me what everyone out there is able to accomplish with them. But as these models are used to power more and more mission-critical, line-of-business applications and your agentic workflows, it turns out that the AI's ability to understand your company's data is what really starts to deliver huge value for your company and for your customers. And I'll just pause here. I can't stress this strongly enough. Your data is unique. It's what differentiates you from the competition. And I see this over and over again. If your models have more specific knowledge about you and your data and your processes, you can do a lot more.
Now, the wizardry here comes when you can deeply integrate a model with your unique data and IP. But in order to do this well, it's critical that you have your data in the cloud.
So, what are the best ways to get these models to access your data? Clearly, third-party models don't start with access to your data. They don't natively know about your business, and frankly, you wouldn't want them to, since you wouldn't want your proprietary data embedded in those models where everyone else could use it. It's why the isolation that we provide inside of Bedrock is so important, to prevent your data from leaking back into the core models. Now, the most common techniques that we see people successfully use today to combine your data with the models are things like leveraging RAG or vector data to provide your chosen model with context at inference time. And these are quite effective at helping your models more effectively navigate your massive set of data and return relevant results.
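Here's a minimal sketch of that inference-time retrieval pattern. The `embed` function and `vector_index` are stand-ins for whatever embedding model and vector store you already run; only the Bedrock call is a real API.

```python
# Sketch of RAG: retrieve relevant passages from a vector index, then pass
# them to the model as context at inference time. `embed` and `vector_index`
# are stand-ins for your own embedding model and vector store.
import boto3

bedrock = boto3.client("bedrock-runtime")

def answer_with_context(question: str, embed, vector_index, model_id: str) -> str:
    query_vec = embed(question)                     # your embedding model
    passages = vector_index.search(query_vec, k=5)  # your vector store
    context = "\n\n".join(p.text for p in passages)
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    resp = bedrock.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    return resp["output"]["message"]["content"][0]["text"]
```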
Usually, what we see, though, is that this only goes so far. Almost every customer I talk to wishes that they could somehow teach the model to really understand their data, to really understand their deep domain knowledge and expertise. They want the model to know their expertise when it's making its decisions.
Let's say for a second you work at a hardware company that's looking to accelerate R&D for new products. You'd optimally want a model that understands your past products, your manufacturing preferences, your historical success and failure rates, and whatever process constraints you might have. And then you'd want something that could combine all of these to provide intelligent guidance for your design engineers. And it turns out, whatever your company is, you have this incredibly vast corpus of IP and data that would be super valuable if it were integrated into the model you use. Now, the natural question is, why not just train a custom model? It turns out there are really only two ways to do this today.
You could build your own model from scratch and include your own data in it, but of course, this is super expensive. And frankly, you probably don't have all of the data that you would need to build the general intelligence in the model. And even if you did, you may not have the in-house expertise to pre-train a frontier model anyway. So that's probably not very practical for most companies.
What most people do instead is start with an open-weights model and modify it. And there's lots of ability to customize there. You can tune weights with techniques like fine-tuning and reinforcement learning, and you can try to build something that really focuses on your use case. However, it turns out there are limits to how effective this is as well. It's really hard to teach a model a completely new domain that it wasn't already pre-trained on. And it turns out the more you customize models, the more you add a bunch of data in post-training, the more these models tend to forget some of the interesting stuff they learned earlier, the core reasoning. It's a little bit like humans trying to learn a new language. When you start really young, it's actually relatively easy to pick up. But when you try to learn a new language later in life, it's much, much harder. Model training is kind of like this, too. Now, there have been some pretty cool things done with the limited ability that you have to tune these open-weights models, but you can only go so far.
Today, you just don't have a great way to get a frontier model that deeply understands your data and your domain.
But what if it was possible? What if you could integrate your data at the right time, during the training of a frontier model, and create a proprietary model that was just for you? I think this is actually what customers really want. And so we asked ourselves, why not?
And today I'm excited to announce Amazon Nova Forge.
Nova Forge is a new service that introduces the concept of open training models. With Nova Forge, you get exclusive access to a variety of Nova training checkpoints, and then you get the ability to blend your own proprietary data together with an Amazon-curated training data set at every stage of model training. This allows you to produce a model that deeply understands your information, all without forgetting the core information it was trained on. We call these resulting models novellas. And then we let you easily upload your novella and run it in Bedrock. Let me show you how this works. Let's say you're that hardware manufacturer that we discussed earlier. You have several hundred gigabytes of data, billions of tokens, related to your past designs, your failure modes, your review notes, and so on. And you decide that you're going to start from an 80% pre-trained Nova 2 Lite checkpoint. Using our provided tool set, you blend all of your data in with that Amazon-curated training data set. And then you run the provided recipes to finish pre-training that model, but this time with all of your data included. This introduces your domain-specific knowledge, all without losing the important foundational capabilities of the model, like reasoning. Nova Forge also provides the ability to use remote reward functions and reinforcement fine-tuning to further improve your model, letting you plug real-world environments into the training loop. And because your baseline model already understands your business, these post-training techniques are actually much more effective. Once you're ready, you import this model, your novella, into Bedrock, and you run inference on it just like you would any other Bedrock model. Now, your industrial engineers can ask questions like, "What are the pros and cons of design A versus design B?" and get responses that are specific to your company's historical results, manufacturing constraints, and customer preferences.
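Nova Forge was only just announced, so there's no published SDK to show yet. The sketch below is purely hypothetical: the functions are stand-ins that simply mirror the workflow just described (pick a checkpoint, blend data, finish pre-training, import into Bedrock) and should be replaced with the real API once it's documented.

```python
# HYPOTHETICAL: stand-in functions that mirror the Nova Forge workflow from
# the keynote. None of these names are a published API.

def select_checkpoint(name: str) -> str:
    """Stand-in: choose a Nova open-training checkpoint (e.g. 80% pre-trained)."""
    return name

def blend_and_train(checkpoint: str, proprietary_data: str, curated_mix: str) -> str:
    """Stand-in: blend your tokens with the Amazon-curated set and finish
    pre-training with the provided recipes; returns a model artifact name."""
    return f"novella({checkpoint} + {proprietary_data} + {curated_mix})"

checkpoint = select_checkpoint("nova-2-lite-pretrain-80pct")  # hypothetical name
novella = blend_and_train(
    checkpoint,
    proprietary_data="s3://example-hw-corpus/",               # hypothetical bucket
    curated_mix="amazon-curated-default",                     # hypothetical mix name
)
print(novella)  # import the resulting novella into Bedrock and run inference
```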
We've already been working with a few customers to test out Nova Forge, and they're already seeing transformative results from customizing Nova's open training models.
Let's dive a little bit into the example with Reddit.
Reddit uses Genai to moderate content for multiple different safety dimensions across their chats and searches.
Fine-tuning existing models didn't get them the performance they needed. They even tried using multiple models for different safety dimensions, but it was super complex, and even then they couldn't get the accuracy they wanted for the specific requirements of their community. With Forge, however, Reddit was able to integrate their own proprietary domain data during pre-training, enabling the model to develop integrated representations that naturally combined general language understanding with their own community-specific knowledge. For the first time, they were able to produce a model that met their accuracy and cost-efficiency targets. And at the same time, it was much easier to deploy and operate.
We think this idea of open training models is going to completely transform what companies can invent with AI.
Now, here to share how Sony is transforming and reinventing their business on AWS, please welcome Chief Digital Officer and Corporate Executive Officer of Sony Group Corporation, John Kodera.
[Music] Good morning. Today, I'd like to talk to you about a word that holds a very special place for Sony: kando. The direct translation of the word into English is emotion, but kando means more than that in Japanese. It captures feelings of deep emotional connection and experience when watching a movie, listening to music, or playing a game. For us at Sony, kando is what we strive to create and deliver to our customers in all aspects of our work. Kando is at the core of who we are.
Our founders created Sony in 1946 with the dream to enrich people's lives through the power of technology, a world where technology can deliver new experiences, lifestyles, and kando. Driven by this vision, we have delivered innovative products, creating entirely new industries and customer experiences along the way. With each era of technology, from analog to digital, the internet, the cloud, Sony has reinvented itself again and again.
Today, Sony is more than a hardware technology company. We are also a leader in entertainment across games, music, pictures, and anime. There is no other company in the world like Sony, with the depth of our business portfolio and touch points with fans and creators.
One of the most remarkable successes this year is the anime movie Demon Slayer: Kimetsu no Yaiba Infinity Castle. As of late November, this film has become the highest-grossing Japanese film ever released worldwide, and the fifth highest-grossing film across all categories in 2025.
As we've done with Demon Slayer, we hope to keep delivering new kando by marrying the creators' vision with a deep understanding of their fans, and our relationship with AWS plays a pivotal role in making this happen.
One example started in the early 2010s, when I was president of the network service company for PlayStation and other Sony devices. We chose AWS as our provider for its global footprint and high standards in availability and scalability.
In 2020, for the launch of PlayStation 5, we utilized AWS building blocks for our network architecture. These services allowed us to scale out at a moment's notice and accelerated our shift to microservices, increasing deployments by 400% with one-tenth the lead time. Today, our relationship with AWS supports safe, secure, and high-quality gaming experiences for up to 129 million gamers to connect and experience kando together.
Moving forward, we see incredible potential for growing the fan community, connecting fans with similar tastes and interests across our diverse portfolio of content IPs. At the same time, we also want to better serve our creator community by providing them with more tools, connections, and insights into their fan base.
We call this the Sony engagement platform, creating deeper understanding and connection between the fans who experience content and the creators who are making it.
And one of the building blocks for the engagement platform is the Sony data ocean. It utilizes data insights generated from multiple connected data lakes. Built using AWS services, it enables us to process up to 760 terabytes of data from more than 500 data sources across Sony Group. Of course, to make the most effective use of our data and deliver for our customers, we have to effectively harness the power of AI and agents to empower our employees, augment our business capabilities, and maximize our productivity in the enterprise setting. We are actively promoting the usage of generative AI. Our homegrown enterprise LLM, built using Amazon Bedrock, has grown to over 57,000 users since its introduction two years ago, and we are serving 150,000 inference requests per day.
And today, we are integrating new agentic capabilities into our platform to enable a new level of advanced operational efficiency across our businesses. By placing Amazon Bedrock AgentCore at the center of our agentic AI system, we gain the ability to easily govern, deploy, and manage more useful agentic capabilities to accelerate our enterprise AI transformation.
And today, we are pleased to share that we are adopting Nova Forge to apply state-of-the-art customized models to our unique business and operations. We fine-tuned a Nova 2 Lite model that outperforms baseline models on tasks like reference consistency and document grounding. We are now aiming to increase the efficiency of Sony's compliance review and assessment processes by 100x.
In addition to the enterprise setting, Sony is fully committed to the responsible and ethical development and use of AI in the creative domain. We hold ourselves to the highest standards and respect the rights of creators and performers.
So, where are we going from here? We will continue to create and deliver kando, fulfilling the aspirations of both fans and creators and building meaningful connections between them.
In the future, we will continue to expand our fans' engagement with their favorite content IP across multiple entertainment genres.
As we have done with Uncharted and The Last of Us, we hope to connect fans and creators in both virtual and physical environments, including location-based entertainment.
Realizing our vision will require even greater and stronger collaboration with AWS, as well as with our creative and technology partners in the audience.
We look forward to creating and delivering even greater kando to our fans across the world. Thanks for listening. Thank you.
[Music] Thank you so much, Kodera-san. Such a great story. And just like Sony has seen, a key to success over the long run is having the right foundation for innovation. What really excites me about Sony's story is that because they have their data and their applications in the cloud on AWS, it's that much easier for them to deal with any uncertainty that comes their way. Not many companies have been able to transition as successfully as Sony from an electronics device company into a global digital media business. It's such a major change, and so cool to see how having the right technology platform in AWS has helped them on that journey.
Now, when you have your data in the cloud, it turns out you can move more rapidly, and you can adjust to any of those unexpected changes that come your way. Now, the world is not slowing down. In fact, if there's one thing that I think we can all count on, it's that more change is coming. Now, one of the biggest opportunities that is going to change everyone's business is agents. Agents are exciting because they can take action and they can get things done. They can reason dynamically, and they create workflows to solve a job in the best way without you needing to pre-program them. These agents work in non-deterministic ways, which is part of what makes them so powerful. But it also means that the foundations and tools that got us to where we are in building software aren't necessarily the ones that we need for agents.
And that's why we launched Amazon Bedrock AgentCore, delivering the most advanced agentic platform so that you all could build, deploy, and operate agents securely at scale. We designed AgentCore to be comprehensive but also modular. It has a secure, serverless runtime so that agents can run in complete session isolation. AgentCore Memory enables agents to keep context, handling both short- and long-term memory so that they can learn and get better over time. We provide an AgentCore Gateway so agents can easily discover and securely connect to tools, data, and other agents. We have AgentCore Identity, which provides a way to do secure authentication and gives you controls over what tools and data your agents can access. AgentCore Observability gives you real-time visibility into your deployed agent workflows. And we have a variety of foundational tools that allow your agents to securely execute real-world workflows, things like Code Interpreter, which gives you access to a secure code execution environment, or our managed Browser service, which provides a managed environment that makes it easy for your agents to access the internet.
AgentCore is truly unique in what it enables for building agents, and it's significantly different from anything else out there. We built AgentCore to be open and modular, so you can use it with a variety of frameworks, things like CrewAI, LlamaIndex, LangChain, or AWS's Strands Agents. You can also use it with any model out there, whether it's from the variety of models that we have in Bedrock or models like OpenAI's GPT or Gemini models. You only have to use the building blocks that you need. We don't force you as builders to go down a single fixed path. We allow you to pick and choose which services you want to use for your own situation.
AgentCore also makes it easy to deploy your agents privately and securely inside of your Amazon VPC, and then allows you to scale to thousands of sessions to support high-traffic use cases. It's also super fast and easy to deploy your agents. Agents can be deployed in under a minute with just a few lines of code.
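As an illustration of how little code this takes, here's a minimal agent hosted on AgentCore Runtime, following the starter pattern of the bedrock-agentcore Python SDK together with a Strands Agents agent. Treat the exact imports and signatures as indicative and check the SDK documentation for the current API.

```python
# Minimal sketch: an agent hosted on AgentCore Runtime, following the
# bedrock-agentcore SDK starter pattern with a Strands Agents agent.
# Verify imports and signatures against the current SDK docs.
from bedrock_agentcore import BedrockAgentCoreApp
from strands import Agent

app = BedrockAgentCoreApp()
agent = Agent()  # defaults to a Bedrock-hosted model

@app.entrypoint
def invoke(payload: dict) -> dict:
    # AgentCore Runtime calls this handler inside an isolated session.
    result = agent(payload.get("prompt", "Hello"))
    return {"result": str(result)}

if __name__ == "__main__":
    app.run()  # serve locally; deploy with the AgentCore starter toolkit
```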
And this is part of why we're seeing so much momentum with AgentCore as our customers rapidly adopt it as the foundation for their agentic applications.
We see it across industries, from regulated industries like Visa, National Australia Bank, and Rio Tinto, to ISVs like Pulumi and ADP, and startups like Cohere Health and Snorkel AI, and the momentum is really accelerating. I'm going to talk about a few of them. Now, Adena Friedman is the CEO of Nasdaq, and she and her team are moving really fast to build agents that can do real work in core areas of their business. Now, before AgentCore, they were planning on dedicating a whole team to build the foundational infrastructure they needed to reliably operate and build resilient agents that meet their very high standards. AgentCore, however, now frees them from this heavy lifting so they can just focus on building great agents.
Bristol Myers Squibb built a new agent that's able to evaluate more than 10,000 compounds across multiple hypotheses in less than an hour. This is a process that used to take their researchers four to six weeks. The company's drug discovery agent uses the Agent Core runtime for its ability to seamlessly and dynamically scale, and to keep their sensitive data secure and isolated.
We see ISVs like Workday who are building the software of the future on Agent Core. Agent Core's Code Interpreter delivered exactly what they needed: the essential features, security requirements, and data protection that were needed to power their planning agent. This capability reduces the time spent on routine planning analysis by 30%, saving them nearly 100 hours of work every month.
And you don't have to build your own agents either. Many companies are using AWS Marketplace as the trusted place to publish and procure pre-built agents, tools, solutions, and professional services. And this is where AWS partners can help you all move even faster. We're really excited about what customers have been able to do with Agent Core, but we're far from done.
One big challenge that we've seen when you're building agents is how do you get them to behave predictably and in line with your intent? What makes agents powerful is this ability to reason and act autonomously.
But that also makes it hard for you to have complete confidence that your agents aren't going to stray way out of bounds.
This is a little bit like raising a teenager. I currently have two awesome teenagers myself at home. Now, as your kids get older, you have to start giving them more autonomy and freedom so that they can learn, or "adulting" as they like to call it.
But you also want to put some ground rules in place to avoid major issues.
Think about when your kids start driving. This is the current situation that I'm in. All of a sudden, the kids have all this autonomy. There's a ton of things that they can go and do by themselves, but you still kind of want to have those guardrails in place, like you have to be home by a certain time, or you don't want to drive more than, say, 5 miles an hour over the speed limit, things like that.
One way you can actually build trust in agents is by making sure that they have the right permissions to access your tools and your data. Agent Core Identity provides a great way to do this today.
But while permissions on the tools that your agents can access are a good start, what you really want to be able to control is the specific actions that your agents can or cannot take with those tools. Think about questions like: What is the agent going to do with those tools? How can they use them? Who are the tools for? And today, customers struggle with this. You can embed policies directly in your agent's code, but because agents generate and execute their own code on the fly, these safeguards are really best effort and can only provide you weak guarantees. And they're really difficult to audit.
In practice, this means today you can't with certainty control what your agent does or does not do while also giving it the agency to go and complete these workflows on its own. As a result, most customers feel that they're blocked from being able to deploy agents to their most valuable, critical use cases. And that's why today we're announcing Policy in Agent Core.
[Applause] Policy provides you with real-time, deterministic controls for how your agents interact with your enterprise tools and your data. Now you can set up policies that define which tools and data your agents can access, but also how they access them, whether they're APIs or Lambda functions or MCP servers or popular third-party services like Salesforce or Slack. And you can also define what actions agents can perform and under what conditions. What Agent Core does is evaluate every single agent action against this policy before ever granting access to your tools or your data. We'll walk through a simple example.
Let's say you're in Agent Core Policy and, just using natural language, you define a policy. Say something like: I want you to block all refunds to customers when the reimbursement amount is greater than $1,000.
Then, under the hood, your prompt is converted to Cedar, which is a popular open-source policy language powered by the automated reasoning work behind authorization and verifiable systems inside of AWS.
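To give a flavor of what that compiled policy might look like, here's a hedged sketch in Cedar; the action name and context attribute are illustrative, not Agent Core's actual schema:

```cedar
// Illustrative Cedar policy: deny refunds over $1,000.
// The action name and context attribute are hypothetical,
// not Agent Core's real schema.
forbid (
    principal,
    action == Action::"IssueRefund",
    resource
) when {
    context.reimbursementAmount > 1000
};
```

Because Cedar policies are declarative and evaluated outside the agent's own code, they can be audited and formally analyzed in a way that prompt-embedded rules cannot.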
Once established, these policies are deployed to your Agent Core Gateway and evaluated in milliseconds, which ensures that all of your actions are checked instantly and consistently to keep your agent workflows fast and responsive. And the design of where this sits is actually super important, because this policy enforcement is outside of your agent's application code. The policy evaluation sits in between your agent and all of your data, APIs, and tools, so you can predictably control agent behavior.
Going back to our refund policy example: every agent action is checked against your policies before the agent is able to access the tools. So let's say a situation arises where a refund is over the limit that you've defined. The agent is then blocked from issuing that refund. Now that you have these clear policies in place, organizations can much more deeply trust the agents that they're building and deploying, knowing that they'll stay inside the boundaries that you've defined.
Now, of course, you all need agents to do more than just follow the explicit rules that you define.
You have to know that they're behaving in the right way. "Trust but verify" is a phrase that we've kind of co-opted at Amazon as a mental model for how you manage at scale. At AWS, we give our teams incredible autonomy. I trust our teams to go and invent for customers and execute on that mission. But I also have mechanisms that allow me to dive deep and inspect whether things are on track. I want to check that the strategic initiatives that we've identified are in fact getting done in the way that we intended.
If I go back one more time to our teenagers: I generally trust that they're following the rules, but I can still check my Ring camera to ensure that they got home on time. And I can always check the status of my Life360 app to ensure that they're within the bounds of where I expect. The same thing applies to agents. To gain confidence, you want visibility into how they're acting.
Now, customers love what they're getting with Agent Core Observability. You get real-time visibility into all your operational metrics. You can see your agent response times. You can see the computational power that's being used, your error rates, and which tools and functions are being accessed. That's all great. But in addition to how agents are performing operationally, there are other things that you actually want to know.
You want to know things like: Are your agents making the right decisions? Are they using the best tool for the job? Are their answers correct and appropriate? Are they even on brand?
These are things that are super hard to measure today. It usually requires you to have a data scientist. The data scientist is going to build some complex data pipeline. They're going to select a model that's going to try to judge the outputs of their agents. They have to build the infrastructure to serve these evaluations and then manage quotas and throttling. And each time you want to roll out a new agent, or you want to upgrade to a new version of a model that you're using, you have to do all of this work all over again.
But unlike traditional software, testing in pre-production here, even then, is really hard. You only know how your agents are going to react and respond when you have them out there in the real world. That means you have to continuously monitor and evaluate your agent behavior in real time, and then quickly react if you see them doing something that you don't like. We think we can make this a lot better.
Today, I'm excited to announce Agent Core Evaluations.
Evaluations is a new Agent Core service that helps developers continuously inspect the quality of their agents based on real-world behavior. Evaluations can help you analyze agent behavior against specific criteria like the ones I mentioned: correctness, helpfulness, harmfulness. It comes with 13 pre-built evaluators for common quality dimensions. Of course, you can always create your own custom scoring system with your own preferred prompts and models as well. And you can easily evaluate agents in the testing phase to correct any issues before you end up deploying them broadly. So now, if you're going to upgrade to a newer version of a model, as an example, you run your evaluations against your agent to make sure that it maintains the same level of helpfulness, for example, that you have in your current release. You can also use Evaluations in production to catch any of those hard-to-find quality degradations really quickly. You'll see your results in CloudWatch right alongside your Agent Core Observability insights. Agent Core Evaluations turns what used to take specialized expertise and a bunch of infrastructure heavy lifting into something that everyone can access, and it allows you to continually improve the quality of your agents. We're quite excited about it.
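The custom-evaluator idea here is essentially "LLM as judge." As a rough illustration of that pattern only, not the Agent Core Evaluations API, here's what a minimal helpfulness scorer might look like using Bedrock's Converse API; the model ID and judge prompt are assumptions, and production code would need more robust output parsing:

```python
# Schematic LLM-as-judge evaluator: score an agent response for helpfulness.
# Not the Agent Core Evaluations API; a generic sketch using Bedrock Converse.
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def score_helpfulness(question: str, agent_answer: str) -> int:
    judge_prompt = (
        "Rate the helpfulness of this answer from 1 to 5. "
        'Reply with JSON like {"score": 3}.\n\n'
        f"Question: {question}\nAnswer: {agent_answer}"
    )
    resp = bedrock.converse(
        modelId="amazon.nova-lite-v1:0",  # assumed model ID; any Bedrock model works
        messages=[{"role": "user", "content": [{"text": judge_prompt}]}],
    )
    text = resp["output"]["message"]["content"][0]["text"]
    # Toy parsing: a real evaluator would validate and retry on malformed output.
    return json.loads(text)["score"]
```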
All right, so this is Agent Core, the agentic platform that's powering the next wave of agents. We're helping you move quickly to get your agents into production without compromising or making any sacrifices, which is what we're all about. We want you to move fast while having the broadest set of capabilities to build for your own customers. Today, we're really excited that we've added two new powerful capabilities in Policy and Evaluations. And I'm really excited to see how this unlocks some really powerful production use cases.
Now, to tell us more about how they're building agents of their own to transform their business, and how they're using AWS as a key part of their agentic transformation, please welcome Shantanu Narayen, CEO and chair of Adobe.
[Music] [Applause] [Music]
Thanks, Matt. Good morning.
Hello, everyone. I'm thrilled to join you at this transformative time. We're clearly witnessing a golden era of creativity, where AI is amplifying human ingenuity and enabling people to bring their imagination to life. Adobe has been at the forefront of this revolution, from the invention of desktop publishing, to the origins of digital documents, to groundbreaking advances in imaging and video. We're constantly pushing the boundaries of what's possible.
It was actually our transformation to a cloud-based subscription model over a decade ago that marked the beginning of our relationship with AWS, because it was services like Amazon EC2 and S3 that provided us with the scalable as well as secure foundation for Adobe's innovation.
As we transition into this era of AI, AWS is helping us innovate faster with the core services that we need, as Matt said, to train models as well as deploy agents. And this allows us to focus on what Adobe does best: unleashing creativity across every facet of digital experiences for our business, which spans business professionals, consumers, creators, creative professionals, as well as marketing and IT professionals.
When it comes to AI for creativity, we're reimagining every stage of the process for people of every skill level. We do this with the knowledge that over 90% of creators are actively using creative-focused generative AI today.
And to support them, we're infusing AI into Adobe Firefly, our all-in-one destination for AI-driven creative workflows; into our flagship Creative Cloud applications like Photoshop; and into Adobe Express, the quick and easy app to create on-brand content.
As just an example, our Adobe Firefly models that power capabilities like text to image, text to video, generative fill, and generative recolor have actually been trained and run on both P5 and P6 instances, with all the data stored in S3 and FSx for Lustre. These models have been used to generate over 29 billion assets, and they enable creators to create content with unmatched creative control. And our AI assistant, now in Adobe Express, helps users redefine their entire creative process using conversational editing.
And these agentic experiences are now powered by our AI platform. And our relationship with AWS helps ensure that these agents operate efficiently and, more importantly, securely.
When it comes to productivity, PDF remains the way that people consume information. Over 40 billion PDFs have been opened and shared with Adobe Acrobat. And every year, more than 18 billion PDF files are created and edited by our customers around the world.
Today, we're integrating productivity for billions of business professionals and consumers through AI capabilities, including an AI assistant.
In August, we announced Adobe Acrobat Studio, a first-of-its-kind platform that brings together Acrobat, Adobe Express, and AI agents to enable users like you to work more efficiently with both structured and unstructured information.
Our collaboration with AWS is absolutely key here, given that Acrobat Studio uses Amazon SageMaker as well as Amazon Bedrock to access our and third-party models, helping millions of users research, strategize, analyze, and collaborate even faster.
Adobe's PDF Spaces is also a new offering that helps consumers and business professionals collaborate in conversational knowledge hubs supported by personalized AI assistants.
And finally, in the AI era, we all know that the role of marketers has evolved to the orchestration of engaging customer experiences for their consumers and customers.
To support them, we're unifying the key elements of customer engagement: the content supply chain as well as brand visibility. Adobe Experience Platform is the core foundation for driving this customer engagement, bringing together AI-powered apps and agents to drive engagement and loyalty.
It operates at the scale of over 35 trillion segment evaluations and more than 70 billion profile activations per day. The Experience Platform runs on AWS building blocks as well as an innovative cellular architecture. Our joint customers can now ingest data from sources like Redshift into the Adobe Experience Platform to create these profiles, hydrate them, and use these audiences in Adobe's Real-Time Customer Data Platform.
Key, I think, to this customer engagement is creating on-brand content that's delivered at the right time, in the right channel, at exactly the right moment, because marketers expect that the demand for content will grow 5x over the next two years, and every business needs a content supply chain to manage this. Adobe GenStudio is our solution to address this in an end-to-end fashion, and Amazon Ads is a key collaboration even here, integrating our creative and customer experiences with how creatives and marketers can bring all of these ideas to market.
And finally, brand visibility is clearly top of mind for CMOs, as we all turn to LLMs for information, recommendations, as well as purchase decisions.
We actually observed an 1100% year-over-year increase in AI traffic to US retail sites as recently as September. And with products like Adobe Experience Manager, as well as the newly available Adobe LLM Optimizer and Adobe Brand Concierge, we're helping brands stay in front of AI search.
We're really excited about the promise of Agent Core and Kiro to help us accelerate the deployment of all these new agentic capabilities. We've already had numerous successful Agent Core proofs of concept. For example, our Adobe Commerce team was able to run a prototype migration assessment using Agent Core to help our customers identify as well as solve compatibility challenges as they move to the SaaS product.
Adobe has incorporated AI into our tools for over 15 years, delivering hundreds of advances that enhance efficiency and collaboration.
99% of Fortune 100 companies have used Adobe AI in an application. Across all these categories, AWS is helping us to innovate faster, operate more efficiently, and deploy new technologies at scale: in the data layer, where we train our category-leading Adobe Firefly foundation models; in making sure, as Matt said, that we offer choice in AI models so we can continue to innovate in creative categories; through agent orchestration, where we're augmenting this ecosystem; and finally, in integrating AI into all of our apps, making it easy for customers of all types to adopt and realize value where they do their work today. It's an incredibly exciting time to stand at this intersection of human and computer interaction. And the AI transformation Adobe and AWS are driving together, I believe, will redefine digital experiences for billions of people around the world. And we couldn't be more excited to work with all of you. Thank you.
[Applause] [Music] That's great, Shantanu. Thanks so much. It's really exciting to see how Adobe is pioneering across digital experiences, all on top of AWS.
Now, with the tools and services we're providing, we know that our customers and partners out there are going to build a huge number of incredibly impactful agents. But you can also expect that some of the most capable, powerful agentic solutions are going to come directly from AWS. Let's dive into a few of those now. As we thought about which agents we should build and which experiences we could reimagine, we focused on areas where we thought we could bring some differentiated expertise to our customers. For example, it turns out Amazon has a very large, heterogeneous global workforce, and we understand the importance and, frankly, the complexity of tying together all of your enterprise data and systems to empower those employees.
We set off to build something that would empower Amazon's and our customers' corporate employees, which is why we built Amazon Quick. With Quick, our goal is to give every employee that consumer AI experience that they've come to embrace, but with the context and the data and the security that you all need to get your work done. Just earlier today, I talked about how important deep access to your company's data is when you're trying to make critical decisions. And that's one of the things that makes Quick unique and powerful. It brings together all your data sources: your structured data, like BI data and your databases and your data warehouses; your data from apps like Microsoft 365 or Jira or ServiceNow or HubSpot or Salesforce; as well as all your unstructured data, things like your own documents or your files that you have in SharePoint or Google Drive or Box. All of the data that you need to make great decisions, and we make it accessible to a powerful suite of agents.
With Quick, you get a rich set of BI capabilities that make it easy for anyone to discover insights across all of those sources of structured and unstructured data. You get a capability to do deep research. This is actually one of my personal favorite features. It allows Quick to investigate complex topics, pulling information from your internal data repositories as well as external sources on the internet, and put together a thoughtful, detailed research report, complete with source citations, so you know exactly where the information comes from. And you can create Quick Flows, which give you the ability to create little mini personal agents that can automate your everyday repetitive tasks to drive efficiency for you as an individual, and this can help your teams at your companies be much more efficient and productive at work. A few months ago, we released Quick internally at Amazon, and today we already have hundreds of thousands of users inside the company. The value that our own employees are getting from Quick has quite frankly blown us away. Teams are telling us that they're completing tasks in one-tenth the time that it used to take.
Take one example that I heard from our internal Amazon tax team. They built a Quick agent that helps them consolidate all of their sources of tax data, whether they're projects from audits or details from the internet. It performs deep research into any tax code or policy changes that might be made, and it presents all of this tax information from all those sources of data in a single view for them. They then use Quick to visualize that information, which allows them to track regulatory changes in real time.
Now, these weren't developers, these were tax people, and they were able to do this without writing any code or pulling any manual reports. Now, when a new tax law emerges, everyone can act on it quickly. It eliminated this siloed set of systems and enabled the team to stay compliant and proactive rather than reactive. And we're hearing stories like this across the company over and over again.
Another place where we're using agents to transform what's possible for you all is in customer service. It turns out that's another area where Amazon knows a lot.
Amazon Connect is a leading cloud contact center solution, and it transforms customer experiences across organizations of all sizes. Connect was a pioneer in bringing AI to the contact center with AI-powered self-service. It allows you to intelligently and automatically resolve issues. But it also combines that with AI-driven recommendations that you can use to guide your human agents. Connect gives you the ability to deliver personalized, exceptional experiences for all of your customers. And it's impressive to see how quickly Connect has grown to lead the transformation from legacy on-prem environments into cloud, AI, and agent-powered contact centers. And it's done this for global enterprises like Toyota and State Farm and Capital One and National Australia Bank, as well as for hundreds and hundreds of startups.
Customers are seeing the impact of this move to the cloud, and we're seeing this momentum really accelerate the business. In fact, it shows in the business results: earlier this year, the Connect business passed the $1 billion annualized run rate mark while helping tens of thousands of customers grow their business faster. Thank you to all of you who use Connect.
[Applause] Quick and Connect are just two examples of AWS delivering impactful agentic solutions for our customers. Up next, we're going to hear from a fast-growing startup that's also helping enterprises get more work done. To share their story about how they're transforming what's possible with agents in the enterprise, please welcome May Habib, CEO of Writer.
[Music] What if Mars, one of the largest consumer goods companies in the world, could run every ad image through compliance in seconds, saving thousands of hours as checks are done instantly?
What if AstraZeneca, makers of some of the most innovative drugs, could automate the paperwork needed to get treatments approved all over the world, saving months of painstaking manual work and getting life-saving treatments to people faster?
And what if Qualcomm, the global technology leader, could uncover the most efficient places to put marketing spend in real time, dramatically boosting campaign performance while saving millions in the process? This is not just the promise of AI. This is all happening today, right now, with agentic AI from Writer.
I'm May Habib, Writer's co-founder and CEO. And over the last five years, we've worked with the world's largest companies in the most highly regulated industries to build a platform for agentic work.
Early on, we saw a gap between the amazing things these LLMs were capable of and what would meet the enterprise's bar for reliability, security, and control.
We made a bold decision to be a full-stack platform, one that has the precision and compliance that enterprises need. It's powered by our own enterprise-grade Palmyra LLMs and delivers agents that handle the toughest enterprise workflows.
To truly scale our full-stack vision, we needed an infrastructure provider that was resilient, secure, and engineered for the enterprise.
The majority of the Fortune 500 run on AWS, including so many of our customers. So teaming up with AWS was a no-brainer.
AWS stands alone as the cloud provider that enables us to both train our frontier models and deploy our entire platform securely to our enterprise customers.
Our work with AWS started two years ago at the model layer. We had just launched our latest Palmyra LLM, and it was posting top scores on leaderboards. But as our models got larger, the computational power that was needed for both training and inference was growing. And that's where the depth of the AWS stack became a strategic advantage.
Our foundation is built on SageMaker HyperPod, which gives us a powerful service for large-scale model training. We use P5 instances, and soon P6 instances, to handle the heavy GPU workloads, and they're connected with Elastic Fabric Adapter, which is what makes the high-speed communication between nodes possible, so our training runs stay fast and synchronized.
We've also paired HyperPod with Amazon FSx for Lustre, so we get data at the speed our models need while keeping costs under control. And the results have been enormous.
We've been able to do runs in a third of the time, going from six weeks down to two weeks. And our training pipelines have become 90% more reliable.
All that work gives us the power and stability to build our latest frontier model, Palmyra X5, trained right on HyperPod. X5 gives exceptional adaptive reasoning, a massive 1 million token context window, and near-perfect accuracy in extracting business insights from even the most complex, high-volume data.
And it delivers this with incredible speed: a million-token prompt in just 22 seconds, and multi-turn function calls in 300 milliseconds, outpacing other frontier models at a fourth of the cost.
But our relationship with AWS was never just about creating fast, powerful models. It's always been about building a breakthrough AI platform that can transform how businesses operate. And with Palmyra X5 as the engine, we're delivering on that vision.
With Writer, enterprise teams at companies like Mars and AstraZeneca and Qualcomm work smarter by connecting agents to the data, to the context, and to the business know-how that transforms critical processes, all without business users needing to write a single line of code.
Playbooks are central to Writer. They let teams capture a process once, linking the tools, the data, and the systems they rely on, and turning them into repeatable, intelligent agents. A playbook becomes a living, dynamic blueprint for how great work gets done. And because they're shared across teams, the highest-impact playbooks can be scaled across organizations instantly.
And very soon, with the help of AWS, Writer is going to be including our next generation of self-evolving LLMs that can learn how organizations operate and anticipate requests on the fly. They're going to be the world's first agents that improve the more you use them.
But there is a question hanging over all of this. And for the leaders in IT, security, and compliance in the room, those who are held accountable when something goes wrong: how do we empower business teams to innovate, but do it safely and securely?
Writer has to be first and foremost an interoperable platform, one that can observe, control, and connect your agents at scale with the tools and safeguards you already trust.
Today, we're bringing that paradigm to Writer by launching a powerful suite of supervision tools built specifically for the enterprise. We're giving organizations full visibility and control across the agent life cycle: every session tracked, every output compliant, and every data connector governed in real time.
True interoperability means connecting to the systems you trust. So our platform works with the observability, guardrails, and systems that you already use. And beginning today, we're very excited to announce that Amazon Bedrock Guardrails now integrates directly with our platform.
Thank you.
That means if you've already set up your policies and safety rules in Bedrock, you can apply those exact same guardrails to your use of Writer. You don't have to rebuild anything, and you get one consistent, compliant layer of control across your entire AI stack.
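For a sense of what reusing an existing Bedrock guardrail looks like, here's a minimal sketch using the standalone ApplyGuardrail API; the guardrail ID and version are placeholders, and how Writer wires this up internally is not something the talk specifies:

```python
# Minimal sketch: evaluate content against an existing Bedrock guardrail.
# Guardrail ID/version are placeholders; this shows the public ApplyGuardrail
# API, not Writer's internal integration.
import boto3

bedrock = boto3.client("bedrock-runtime")

resp = bedrock.apply_guardrail(
    guardrailIdentifier="gr-EXAMPLE123",  # placeholder guardrail ID
    guardrailVersion="1",
    source="OUTPUT",  # check model output before returning it to the user
    content=[{"text": {"text": "Draft reply to the customer..."}}],
)

if resp["action"] == "GUARDRAIL_INTERVENED":
    # Serve the guardrail's sanctioned output instead of the raw model text.
    print(resp["outputs"][0]["text"])
```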
We also know that model choice is really important to enterprises. So, also starting today, models from Amazon Bedrock are available directly inside of the Writer platform. That means AWS and Writer customers can now build agents on Writer using a catalog of different models, from our own Palmyra family to the awesome Nova models you just heard about today, and many more, all within a single governed environment. It's the ultimate flexibility without compromising on security.
For organizations like Vanguard, longtime customers of Writer and AWS, where trust is non-negotiable, the Writer and Bedrock integrations give them the control they need to innovate responsibly at scale. And trust is how companies go from a few scattered POCs to a truly governed, enterprise-wide, impactful AI strategy. You can't scale what you don't trust.
At Writer, our vision is to empower people to transform work, and we're very proud to do it with AWS. Thank you.
[Music] [Applause] [Music] Thanks a lot, May. We're very excited to help customers like Writer make AI and agents real for their customers.
All right. One end user that we haven't talked much about yet is developers. And this turns out to be an area where AWS and Amazon have really deep expertise.
We know that by far one of the biggest pain points today for development teams who are trying to rapidly modernize their applications is dealing with technical debt. Accenture estimates that tech debt costs companies a combined $2.4 trillion a year in the US alone. And Gartner says 70% of IT budgets today are consumed by maintaining legacy systems. We knew this was an area where AI could help, and this is why we built AWS Transform: to help customers move away from legacy platforms, things like VMware and mainframes and Windows .NET. With mainframe modernization, as an example, our customers have already used Transform to analyze over a billion lines of mainframe code as they move those mainframe applications into the cloud. Using Transform, Thomson Reuters is modernizing over 1.5 million lines of code per month as they move from Windows onto Linux. We knew that helping you modernize faster would be really popular. But man, it turns out you all really dislike your legacy platforms. Yesterday at the festival grounds here in Las Vegas, as some of you might have seen, many of you tuned in and cheered as we dropped an old decommissioned rack of servers from a crane and blew it up, as an ode to crushing tech debt with AWS Transform.
Now, this was pretty fun, but there are a lot more legacy platforms that we need to go after. A lot more.
After we launched Transform last year, we quickly sat back and started prioritizing which transformations we would go after next.
We had a ton of ideas: Lambda function upgrades, Python upgrades, maybe Postgres version upgrades, or maybe people who wanted to move from C to Rust. But then we thought, what about updates to proprietary applications and libraries?
The list is almost infinite. So we asked ourselves: why not support all modernizations?
Just yesterday, we launched AWS Transform custom. Thank you. Those guys are excited about AWS Transform custom. It gives you the ability to create custom code transformation agents to modernize any code or API or framework or runtime or language translation, even programming languages or frameworks that are only used by your company. And customers are already flocking to it. We've already seen customers doing Angular to React migrations, converting VBA scripts that are embedded in their Excel sheets into Python, and converting bash shell scripts into Rust.
One great customer example is QAD, a provider of cloud-based ERP and supply chain solutions. Their customers struggled with modernizing from customized old versions of Progress Software's proprietary Advanced Business Language to the QAD adaptive ERP platform. QAD turned to AWS Transform. Engagements that used to take a minimum of two weeks to modernize are all of a sudden being completed in under three days. We're really excited to see what legacy code you're all able to transform.
Now, one of the great things about making all these transformations easier is that it leaves a lot more time for developers to invent, and that's what we get excited about. And it turns out that developers today are building faster than ever. AI software tools have seen rapid changes over the last year. We've moved from things like inline tab completion, to authoring chunks of code, to actually completing simple multi-part tasks. We really see the potential for the entire developer experience, and frankly the way that software is built, to be completely reimagined.
We're taking what's exciting about AI-powered software development, but we thought there was an opportunity to add structure to it: to make it ready for enterprises to adopt and for high-velocity code development teams to use more effectively.
And this is why we launched Kiro, the agentic development environment for structured AI coding. Kiro helps developers take advantage of the speed of AI coding, but with more structure, where they're in the driver's seat every step of the way.
Kiro has popularized this idea of spec-driven development. From simple to complex projects, Kiro works alongside developers and teams, turning prompts into detailed specs and then into working code via its advanced agents. So what you get, and what gets built, is exactly what you want and expect. Kiro understands the intent behind your prompts and helps you and your team implement very complex features in large code bases in fewer shots.
Now, the reception to Kiro has been quite frankly overwhelming. Hundreds of thousands of developers have already used Kiro since the preview launch just a few months ago. Let's hear directly from them on how transformative Kiro has been to their work.
>> I use Kiro in almost all the development I do. I ask it questions. I create specs with it.
>> With Kiro, I was able to ship more code in the last 5 months than in the past 10 years.
>> With Kiro, I'm able to work with a partner. So it feels like we're collaborating on the project together.
>> It operates the way my brain operates when solving a problem.
>> I can just say, "Hey, Kiro, remember that feature we added? Can you also write a test as well?" I can be hands off once I break the problem down and just let Kiro deliver for me.
>> I feel like my world has just opened up to a completely different perspective.
>> Everything feels possible now.
>> You can go from zero to POC 10 times faster.
>> Kiro makes me want to build more. Honestly, Kiro is just awesome.
We think you're all going to love how Kiro will transform your development work. And so I'm excited to announce today that for any qualified startup, we're giving away a year's worth of Kiro, up to 100 seats, if you apply in the next month.
We're so excited about the impact that Kiro is having on making developers' lives better each and every day. And I've frankly been amazed at the impact that this development velocity has had inside of Amazon. In fact, we've been so blown away that last week all of Amazon decided to standardize on Kiro as our official AI development environment internally. We took a look at all of the tools out there in the market, and we recognized that the best way for us to make our developers faster and more productive was to double down on Kiro, and many of you are rapidly doing the same.
Now, I want to take a quick moment and dive deeper into one of the stories we heard in this video, because I think the details are pretty eye-opening.
This was a quote from Anthony, one of our distinguished engineers. Anthony was working on a significant rearchitecture project, and he and the team originally thought that they would need about 30 developers working for 18 months to complete this work. Anthony and the team were intrigued by the potential of agentic AI to really supercharge their output. So they decided that they were going to fully leverage Kiro to deliver the project. It turned out, as the team started really digging in and seeing the full potential of agentic tools, it was better than they expected. And they saw that by leaning in on agentic development, a much smaller team could actually deliver incredible results. Instead of taking 30 developers 18 months to complete the project, they delivered the entire rearchitecture with only six people in 76 days. And with Kiro, this is not just the 10 to 20% efficiency gain that people were seeing with the first generation of AI coding tools. This is orders of magnitude more efficiency. Now, I think this is a super powerful story, and I've related it to a couple of customers over the last month or so. And invariably, I get the question: how did they do it?
Well, at first, it turns out it took the team a little bit of time to fully understand how to best leverage agentic tools. They started to see some efficiency gains right away, of course, but these were honestly a little more incremental than transformative. But a few weeks in, they had an aha moment.
They realized that they couldn't keep operating the same way they always had. They realized that getting the most out of the agents meant changing their workflows, and they wanted to lean into the strengths of the agents. And then they had to question some of the assumptions they'd always had about how they wrote software. The team learned a ton along the way and was able to spot a whole series of new opportunities for how agents could enable teams to ship faster.
The first learning was in how they interacted with the Kiro agents. In the beginning, they would feed the tools small tasks to ensure that they got reliable results back, and they would go back and forth with their tools constantly. But as they learned what the agents were good at and what they were not good at, there was an inflection point where they moved from babysitting individual tasks to directing broad, goal-driven outcomes. And this is when they saw their velocity on shipping features rapidly accelerate.
Second, they thought about moving even faster, and they recognized that they were thinking much too linearly in assigning tasks to the agent. They realized that the team's velocity was tied to how many concurrent agentic tasks they could run, and if they could have the agents do more in parallel, they'd go faster. And so they kept looking for ways to scale out their workloads.
Finally, the team observed that as they scaled out, they themselves became the bottleneck. They had to keep unblocking the agents as the agents came back needing human intervention or direction. It turns out that the longer they could get these agents to work independently, the better. One clear example is when the team looked at their commit graphs. Not surprisingly, they saw that progress stopped when everyone went to sleep. They hypothesized that if the agents could use that time to clear the backlog, the team would be able to wake up in the morning with lots more code to review and be able to keep moving faster.
So we sat back and reflected on these learnings, and we asked ourselves: why can't we have agents that are able to do all of these things? And that's why today we're introducing frontier agents.
Frontier agents are a new class of agents that are a step-function change more capable than what we have today. We generally think about three things that differentiate frontier agents. One, they're autonomous: you direct them towards a goal, and they figure out how to achieve it. Two, they have to be massively scalable: of course, individually they can perform multiple concurrent tasks, but you also have to be able to distribute work across multiple instances of each type of agent. And three, these agents need to be long-running: they may be working for hours, maybe even days, in pursuit of ambitious, sometimes frankly amorphous goals, without requiring human intervention or direction.
Let me introduce the first frontier agent we'll be launching today, and that is the Kiro autonomous agent.
The Kiro agent transforms how developers and teams build software, vastly increasing your development team's capacity to invent.
The Kiro autonomous agent runs alongside your workflow, maintaining context and automating development tasks so that your team never loses momentum. You simply assign it complex tasks from the backlog, and it independently figures out how to get that work done. Kiro can now autonomously tackle a full range of things your developers might need, from delivering new features to triaging bugs, even improving code coverage. And all of this takes place in the background, so your engineers can stay in their flow state, focusing on the big ideas.
The Kiro autonomous agent connects with the tools that you already use, like Jira and GitHub and Slack, and it uses those to build a shared understanding of your team and your work. One of the super cool things is that the Kiro agent is just like another member of your team. It actually learns how you like to work, and it continues to deepen its understanding of your code, your products, and the standards that your team follows over time. It weaves together everything you do, every spec, every discussion, every pull request, and it builds this collective memory that fuels smarter development.
Let's take an example. Let's say you need to upgrade a critical library that's used across 15 different microservices.
Now, if you were to do this with Kiro today, you'd have to first open a repo, prompt it to update the library, then review those changes, fix anything it missed, run your tests, and create a pull request. Then you'd move on to repo two and start all over: re-explaining your context, re-prompting for similar changes. And you'd do that 14 more times. Each time you did it, you'd have to approve the changes. And if you paused, or went home for the day, you'd have to remind Kiro of all the context when you started back up, since it doesn't maintain state in between sessions.
Let's take a look and see what this looks like with the new Kiro autonomous agent.
First, you'll get started in kiro.dev and kick off a task associated with your GitHub repo. You'll describe the problem that you're trying to solve, and then the agent uses all of its reasoning and knowledge from previous implementations to ask clarifying questions about anything it doesn't understand as it plans tasks.
With its deep knowledge of your entire codebase, it then quickly identifies where it needs to make updates across all the selected repositories where your libraries need updating. The agent identifies every affected repo that you have, analyzes how every service uses the library, and updates the code following your patterns. It runs full test suites and then opens 15 tested, merge-ready pull requests. All of this happens in the background while you work on something else.
And to go even faster, it scales out to more parallel tasks, each with its own context. So while you have Kiro off implementing your new library, you can also have it fix a bug that you found last night.
And this agent isn't session-based. It doesn't forget. When you give it feedback on one of your pull requests about error handling, it applies that learning to the next 14. When it sees architectural decisions similar to ones it has made in the past, it references the work that you've done before. You're not re-explaining your codebase every time. It already knows how you work, and it gets better with every single task that it does. We think that this will help you move much more quickly, and it's going to completely change the way that you think about writing code.
So that's the Kiro autonomous agent, and we're really excited about how it's going to allow you to ship more code more quickly. But one other thing that our teams quickly discovered as we started writing tons and tons more code is that you can't just accelerate writing code alone. It's only the beginning. You have to make sure that every stage of the software development life cycle can scale and accelerate at the same rate. Otherwise, you're just going to create new bottlenecks.
We realized that the same lessons we learned, directing outcomes, scaling out, extending agent autonomy, apply to almost every aspect of the development life cycle.
Now, we've said this once and we'll say it a thousand times: security has always been our number one priority at AWS. And we've been working with you all, our customers, to help you secure your products in the cloud for nearly two decades. So we naturally thought next about what a security frontier agent would look like. We know that every customer wants their products to be secure, but you have trade-offs. Where do you spend your time? Do you prioritize improving the security of existing features, or do you prioritize time on shipping new ones? At Amazon and at AWS, security is deeply embedded in everything that we do, in our development culture and our practices. We perform code reviews. We conduct security reviews of systems architecture. We do tons of pen testing, with huge teams consisting of both internal and external experts who look for vulnerabilities, all before any code ever reaches production. But it turns out most customers can't afford to do this continually. So what happens is either you don't do all of this, or you just do it a couple of times a year. And now, when development is so accelerated with AI, this can mean that multiple releases are going out the door before your code is rigorously assessed for security risks.
We have a firm belief that in order to get security right, you have to build it into everything you do from the ground up. And so I'm very excited to announce the launch of the AWS Security Agent.
This agent will help you build applications that are secure from the very beginning. AWS Security Agent helps you ship with more confidence. It embeds security expertise upstream and enables you to secure your systems more often. It proactively reviews your design documents, and it also scans your code for vulnerabilities.
And since Security Agent integrates directly with your GitHub pull requests, it provides your developers with feedback directly in their workflows.
Security Agent also helps with penetration testing. It turns pen testing from a slow, expensive process into an on-demand practice. It allows you to continuously validate your application security posture.
I'll quickly show you how it works. Let's say your company has an approved way of storing and processing credit card information. But let's say you have a developer that inadvertently works with the wrong approach. This can mean a ton of rework, and, late in the development process, it could possibly mean throwing away months of work. However, the AWS Security Agent can catch these issues early. It can even catch them in your design documents, before you write a line of code, by always looking to ensure that you're following your team's best practices. Then, when the time does come to submit your code, AWS Security Agent can review your pull request against those same requirements and flag any issues, providing you with concise remediation steps for anything that it finds. When your code's complete, you simply initiate a pen test, and the agent will immediately jump on it, giving you real-time visibility into its progress. When it's done, you get validated findings, complete with suggested remediation code to fix any issues that it does find. No more waiting for resources. No expensive external consultants. And let's say you have multiple apps that are ready to deploy to production. You can just launch multiple security agents in parallel, so you can test all of your applications and not get bottlenecked. Now you're writing code faster, and you're deploying it just as fast, because you know it's secure.
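To give a flavor of the class of check being described, here's a toy sketch, emphatically not the AWS Security Agent itself: the kind of script a team might run against changed files in a pull request to flag code that appears to log or store raw card numbers. The patterns and call names are illustrative:

```python
# Illustrative only: the kind of rule a PR security check might enforce.
# Not the AWS Security Agent; a toy scanner for apparent raw card-number
# (PAN) handling in risky calls. Patterns are deliberately crude.
import re
import sys

PAN_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")  # rough card-number shape
RISKY_CALLS = ("print", "logger", "log.", "save", "write")

def scan(path: str) -> list[str]:
    findings = []
    with open(path) as f:
        for lineno, line in enumerate(f, start=1):
            if PAN_PATTERN.search(line) and any(c in line for c in RISKY_CALLS):
                findings.append(f"{path}:{lineno}: possible raw PAN handling")
    return findings

if __name__ == "__main__":
    for issue in scan(sys.argv[1]):
        print(issue)
```

A real agent would of course reason over design documents and organizational requirements rather than grep for patterns; the sketch just illustrates the "check every pull request against a stated rule" idea.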
Now, of course, you know what comes next. You have to operate that code.
And we all know that as systems grow, the surface area of what you're operating grows as well. And that means growing DevOps work. This is something that our own teams inside of Amazon have a ton of experience with. At Amazon, we've always believed that the best way to create a great customer experience is to have developers operate their own code. We've been living DevOps for many years, and what we've learned is that, frankly, as your service scales, operations can eat up more and more of your time. We thought this is another area where we could put our expertise in your hands.
Introducing the AWS DevOps Agent.
This agent is a frontier agent that resolves and proactively prevents incidents, continuously improving your reliability and your performance. The AWS DevOps Agent investigates incidents and identifies operational improvements just like your experienced DevOps engineers would. It learns from your resources and their relationships, including things like your existing observability solutions, runbooks, code repositories, and CI/CD pipelines. It then correlates all that telemetry, code, and deployment data across all of those sources, which allows it to understand the relationships between your application resources, including, by the way, applications in multicloud and hybrid environments. Let me show you how this can transform incident response.
Let's say an incident happens and an alarm goes off. Before your on-call engineer can even check in, the AWS DevOps Agent instantly responds, diagnosing that it found elevated authentication error rates from a Lambda function that was trying to connect to your database.
It uses knowledge of your application topology and the relationships between all those different components to independently work back from the alert to find the root cause of the problem.
In this example, let's say you use Dynatrace for your observability solution. The AWS DevOps Agent uses its built-in integration with Dynatrace to provide more context for the incident. It understands all of your dependencies and knows the deployment stack that created each and every resource. When it's found the problem, let's say in this case a change that was made to your Lambda function's IAM policy, it then tells you what introduced that change. It turns out it was a simple mistake in your CDK code deployment. By the time your on-call engineer logs on, the DevOps frontier agent has already found the issue, suggested a change, and is ready for your on-call to review and approve the fix. What's even better is that it helps you prevent such an incident from happening in the future by recommending CI/CD guardrails to catch these types of policy changes before they're ever deployed. And that's it. The DevOps Agent is always on call, fast and accurate, making incident response and operations work easy.
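As a concrete illustration of the kind of CI/CD guardrail that last step describes, here's a minimal sketch of a pipeline gate for a CDK-based deploy. It uses the CDK CLI's real --security-only and --fail flags; the stack name is a placeholder, and nothing here is specific to the DevOps Agent:

```python
# Sketch of a CI/CD guardrail: block a deploy when the CDK diff contains
# security-impacting changes (e.g., IAM policy edits), forcing human review.
# Stack name is a placeholder; `cdk diff --security-only --fail` are real flags.
import subprocess
import sys

result = subprocess.run(
    ["cdk", "diff", "--security-only", "--fail", "MyServiceStack"],
    capture_output=True,
    text=True,
)
if result.returncode != 0:
    print("Security-impacting infrastructure change detected:")
    print(result.stdout)
    sys.exit(1)  # fail the pipeline; require approval before deploying
```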
Together, these three frontier agents, the Kiro autonomous agent, the AWS Security Agent, and the AWS DevOps Agent, are going to completely transform the way your teams build, secure, and operate your software.
Let's take a quick look at what your future might look like here.
>> Hey, did you run a pen test?
Still working on these unit tests.
You're funny. Like we have time for pen tests.
[Music] How does upgrading one package break five others?
>> It didn't work for me.
>> Love that for you.
>> Morning.
>> You look nice.
>> Paged at 4:00 a.m. I feel awful.
>> Cheers.
Did you use the latest pull request format? Classic.
[Music] >> We're green across the board.
>> Yes. Oh my god.
>> What?
>> We got it.
>> Look at that.
>> Today, I'm excited to announce the next leap forward. We're launching three new frontier agents. These agents can reduce a lot of the time that's spent on these really important but repetitive, time-consuming, and frankly unfulfilling tasks. This takes what used to be months of work down to hours.
>> Wo!
>> They are going to transform the way you and your teams build, secure, and operate software.
>> Any dumpster fires last night?
>> Nope. Solved for me. Approved it and went back to sleep in a few minutes. Look at
you. Thank you. All right, let's go.
I've got some fire ideas for you guys.
>> Oh, cool.
>> Now with added pen tests.
>> We think that we're only at the beginning of Frontier Agents, and we're super excited to see what you all achieve with them.
>> Automatically balances the load across the charging stations.
>> That's awesome. Good work.
[Music] I think we can all agree that's a future we'd be excited about.
Today is a big leap forward in the journey towards unlocking the value of AI. We're bringing you powerful innovations at every single layer of the stack. The innovation that's happening across AI and agents today is truly incredible.
But it turns out it's not just our AI and agentic services that are delivering a ton of new innovation this week here at re:Invent. There's a bunch of launches that AWS is very excited about. And because AWS is so broad, I know many of you were hoping to hear about our fantastic additions to our core non-AI services as well. It turns out that's actually one of the hardest parts of planning and doing this re:Invent keynote. What do you cut from the talk?
How do you fit it all in?
Well, when our teams were delivering, they said, we're going to keep up this pace of innovation. And I said, well, I don't know how to fit it into one keynote, but, you know, why not try?
So, I said, if our AWS teams can deliver at such a rapid pace, I can up my keynote game, too. I'm gonna try. We'll
see. So, if everybody can hang with me for just a few minutes longer, we're not done yet. I have 25 exciting new product launches across our core AWS services to unveil. And I'm going to give myself just 10 minutes to do it. To keep me honest here, the team is rolling out a shot clock, and you all can keep track.
Okay.
All right.
All right. Buckle up everybody. Let's
get to it. Let's start with our compute offerings. We know that one of the things that you all love is that AWS continues to offer the broadest selection of instances, so you always have the best possible instance for your application. Now, lots of you run memory-intensive applications out there like SAP HANA or SQL Server or EDA.
So, today I'm excited to announce our next-generation X family of large-memory instances.
They're powered by custom Intel Xeon 6 processors, and these instances can provide up to 50% more memory. And I'm excited to announce next-generation instances powered by AMD EPYC processors as well, giving you three terabytes of memory.
Now, you've also told us that you have a lot of really demanding CPU-heavy applications out there like batch processing and gaming. So, today we're launching our C8a instances, which are based on the latest AMD EPYC processors and give 30% higher performance.
Many of you also run EC2 instances for security or network applications. And those applications need a lot of compute and super-fast networking. For those, we're announcing our C8 instances powered by custom Intel Xeon 6 processors using the latest Nitro v6 cards. These instances deliver two and a half times higher packet performance per vCPU.
What about applications that need really ultra-fast single-thread frequency compute? You've got that, too. Introducing our M8zn instances with the absolute fastest CPU clock frequency available anywhere in the cloud. These instances are ideal for applications like multiplayer gaming, high-frequency trading, and real-time data analytics. Today, AWS is still the only provider that offers Apple Mac-based instances, and they are really popular.
So, today I'm happy to announce two new instances powered by the latest Apple hardware: the EC2 M3 Ultra Mac and the EC2 M4 Max Mac instances.
Developers can now use the latest Apple hardware to build, test, and sign Apple apps in AWS.
All right, customers love using Lambda to quickly build functions and run code at scale. Lambda works great when you want to execute code quickly, but sometimes you have a use case where your Lambda function needs to wait for a response, like waiting on an agent that's working in the background for several hours or maybe even days. We wanted to make it easy for you to program wait times directly into your Lambda functions.
So today we're announcing Lambda durable functions.
Durable functions make it easy for you to manage state and build long-running workloads with built-in error handling and automatic recovery.
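The on-stage description didn't include the SDK surface, so the following is a purely hypothetical sketch of the programming model: checkpointed steps plus a suspend-until-callback wait, with every name (the durable decorator, ctx.step, ctx.wait_for_callback) invented for illustration:

```python
# Purely hypothetical sketch of the durable-functions idea described above:
# checkpoint each step, then suspend (for hours or days) until the
# long-running agent calls back. None of these names are a real AWS API.
from my_durable_sdk import durable  # hypothetical helper library


def start_agent_job(task):
    # Placeholder for kicking off the background agent.
    return {"callback_id": f"cb-{task}"}


@durable
def handler(event, ctx):
    # Each step is checkpointed, so a crash or redeploy resumes here
    # instead of restarting the whole workflow.
    job = ctx.step(start_agent_job, event["task"])

    # Suspends the function without consuming compute until the agent
    # responds or the timeout elapses; errors trigger automatic recovery.
    result = ctx.wait_for_callback(job["callback_id"], timeout_hours=48)
    return {"status": "done", "result": result}
```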
All right, how are we doing? Eight launches in about three minutes. I'd better pick it up. All right, let's move on to storage.
>> We know you love S3. I mentioned earlier that S3 stores more than 500 trillion objects, hundreds of exabytes of data.
That is a lot of data. When we launched S3 in 2006, we had a 5 GB max object size. Then a couple of years later, we increased that to 5 TB, and that has been sufficiently large for the past decade. But data has gotten a lot bigger in the past couple of years. So we asked ourselves, what would be the object size that would meet all of your needs today? Should we double it? Triple it?
How about 10x it? I'm pleased today to announce that we're increasing the maximum object size in S3 by 10x to 50 terabytes.
But it's not just bigger, though. We knew you also wanted to make S3 faster for batch operations. So, starting today, we're improving the performance of batch operations, where large batch jobs now run 10x faster.
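As a quick sanity check on that new ceiling: S3 multipart uploads have long allowed up to 10,000 parts of up to 5 GiB each, an envelope of roughly 53.7 TB, so a 50 TB object fits with appropriately sized parts. A small worked sketch:

```python
# Quick arithmetic on the new 50 TB maximum: S3 multipart uploads allow up
# to 10,000 parts of up to 5 GiB each (~53.7 TB total), so a 50 TB object
# fits if the parts are sized appropriately.
GIB = 1024 ** 3
MAX_PARTS = 10_000           # long-standing multipart upload limit
MAX_PART_SIZE = 5 * GIB      # long-standing per-part ceiling


def min_part_size(object_bytes: int) -> int:
    """Smallest part size (bytes) that fits the object within 10,000 parts."""
    size = -(-object_bytes // MAX_PARTS)  # ceiling division
    if size > MAX_PART_SIZE:
        raise ValueError("object exceeds the multipart upload envelope")
    return size


# A 50 TB object needs 5 GB (decimal) parts, comfortably under 5 GiB.
print(min_part_size(50 * 10**12))  # 5000000000 bytes per part
```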
Last year at re:Invent, I announced S3 Tables, which is a new bucket type optimized for Iceberg tables. It's been incredibly popular. But as the volume of table data has started to quickly rise, you all have asked for ways that we can help you save money. So today, we're announcing Intelligent-Tiering for S3 Tables.
This can save you up to 80% on storage costs for the data in your S3 table buckets automatically.
You also asked us to make it easier to replicate these tables between regions so that you can get consistent query performance from anywhere. So as of today, you can now automatically
replicate your S3 tables across AWS regions and accounts.
Earlier this year, we introduced S3 Access Points for FSx for OpenZFS. This allows you to access your ZFS file system data as if it were data inside of S3. And today, we're making it possible for you to access even more of your file data this way by expanding S3 Access Points for FSx to include support for NetApp ONTAP.
Now, ONTAP customers can also access their data seamlessly, just as if it were in S3. Now, one of the fastest-growing data types that you all have are vector embeddings, which are used to make it easier for your AI models to search for and make sense of your data. Earlier
this year, we announced a preview of S3 Vectors, which is the first cloud object store with native support to store and query vectors. And today, I'm happy to announce the general availability of S3 Vectors.
You can now store trillions of vectors in a single S3 bucket and reduce the cost of storing and querying them by 90%.
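For a feel of the workflow, here is a rough boto3 sketch of writing and querying vectors in an S3 vector bucket; the bucket and index names are hypothetical, and the request shapes are an approximation rather than a confirmed API reference:

```python
# Rough sketch of the S3 Vectors workflow described above: write embeddings
# into a vector bucket, then run a similarity query. The client name and
# request shapes are approximations, not a confirmed API reference.
import boto3

s3v = boto3.client("s3vectors")  # assumed service name

# Store an embedding with some filterable metadata (names hypothetical).
s3v.put_vectors(
    vectorBucketName="my-vector-bucket",
    indexName="docs",
    vectors=[{
        "key": "doc-001",
        "data": {"float32": [0.12, -0.48, 0.33, 0.91]},
        "metadata": {"source": "s3://my-data/doc-001.txt"},
    }],
)

# Nearest-neighbor query against the same index.
resp = s3v.query_vectors(
    vectorBucketName="my-vector-bucket",
    indexName="docs",
    queryVector={"float32": [0.10, -0.50, 0.30, 0.90]},
    topK=3,
)
print(resp["vectors"])
```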
Now, I expect that many of you will use S3 Vectors in concert with a high-performance vector database. Today, the most popular way to do low-latency search of your vector embeddings is with an index in Amazon OpenSearch. But many of you have asked us, is there a way to speed up the process of creating that index for all of my data? So, today we're excited to announce GPU acceleration for vector indexes in Amazon OpenSearch. By using GPUs to index that data, you can now index data 10x faster at a quarter of the cost.
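For context on what's being sped up: OpenSearch vector search is typically backed by a k-NN index over a knn_vector field. A minimal sketch with the opensearch-py client, using a hypothetical endpoint, index name, and dimension:

```python
# Minimal sketch of the kind of OpenSearch vector index whose build is being
# GPU-accelerated: a k-NN index with a knn_vector field. Endpoint, index
# name, and dimension are hypothetical.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=["https://my-domain.example.com:9200"])  # assumed

client.indices.create(
    index="embeddings",
    body={
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "vector": {
                    "type": "knn_vector",
                    "dimension": 768,  # must match your embedding model
                    "method": {
                        "name": "hnsw",    # graph-based ANN index
                        "engine": "faiss",
                        "space_type": "l2",
                    },
                },
            },
        },
    },
)
```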
All right, how are we doing? Four minutes left. Awesome. All right, 15 launches and just a few minutes left. Let's keep going.
Let's move on to EMR, which is our popular big data processing service. We launched EMR Serverless four years ago, and customers love it because it takes a lot of the muck out of running petabyte-scale processing. But it turns out that today it isn't quite muck-free. Customers still have to provision and manage their EMR storage, but not anymore.
As of today, we're eliminating the need for you to provision local storage for your EMR Serverless clusters.
All right, let's move on to security.
Today, tens of thousands of you rely on GuardDuty to monitor and protect your accounts, applications, and data from threats. This past summer, we added GuardDuty's extended threat detection to Amazon EKS. And we're pleased with the momentum we're seeing. So, of
course, naturally, we didn't stop there.
And today we're adding this capability to ECS.
Now you can use AWS's most advanced threat detection capabilities for all of your containers and all of your EC2 instances as well. These are both enabled for all GuardDuty customers at no additional cost.
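If you want to verify which GuardDuty protections are active on your account, the detector APIs report feature status; a short boto3 sketch, assuming at least one detector exists in the region:

```python
# Short sketch: list which GuardDuty features are enabled on your detector,
# e.g. to confirm threat-detection coverage. Assumes at least one detector
# exists in the current region.
import boto3

gd = boto3.client("guardduty")
detector_id = gd.list_detectors()["DetectorIds"][0]

detector = gd.get_detector(DetectorId=detector_id)
for feature in detector.get("Features", []):
    print(f"{feature['Name']}: {feature['Status']}")
```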
Every customer wants to find and fix security issues quickly. The faster and easier, the better. That's why we have Security Hub, which aggregates security data from AWS and third-party sources and helps you identify potential security problems. Earlier this year, we previewed an enhanced version of Security Hub. And today, I'm excited to announce that Security Hub is GA. Today, we're also announcing several new capabilities, including near real-time risk analytics, a trends dashboard, and a new streamlined pricing model.
Ops teams out there live and die by their log data, but that log data is everywhere. It's in CloudTrail logs and VPC Flow Logs and WAF logs and logs from third parties like Okta and CrowdStrike. We thought that we could make that better. So today, we're announcing a new unified data store in CloudWatch for all of your operational, security, and compliance data. It automates log data collection from AWS and third parties and stores it in S3 or in S3 Tables to make it easier and faster to
find issues and unlock new insights. All
right, stick with me everyone. We're
almost there. We're on the home stretch.
Let's move on to databases. Now, I'm not going to make you raise your hand. Don't
worry. But I know that many of you out there are still supporting some legacy SQL Server and Oracle databases. Getting
off of them is hard, but at least AWS makes it easier to manage them. We see
you out there. And don't worry, we're here to help.
One thing I hear from many of you is that your legacy databases have grown very large over time. They're actually
bigger than what we support in RDS.
So, I'm excited to announce we're increasing the storage capacity for RDS for SQL Server and Oracle from 64 terabytes to 256 terabytes.
This also delivers a 4x improvement in IOPS and I/O bandwidth. This is going to make it a lot easier to migrate existing workloads from on-prem and scale them on AWS.
We also want to provide you with more controls to help you optimize your SQL Server licenses and manage your costs. So, we're doing a couple of things. Starting today, you can specify the number of vCPUs that are enabled for your SQL Server database instance. This helps you reduce the per-CPU licensing costs that you get from Microsoft. And today we're also adding support for SQL Server Developer Edition, so you can build and test your applications with no licensing fee.
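The vCPU control builds on the existing RDS optimize-CPUs mechanism, where processor features are set per instance; a minimal boto3 sketch with a hypothetical instance identifier (the Developer Edition support is the new part and isn't shown here):

```python
# A minimal sketch of capping licensed vCPUs on an RDS SQL Server instance
# via the long-standing "Optimize CPUs" processor features. The instance
# identifier is hypothetical.
import boto3

rds = boto3.client("rds")
rds.modify_db_instance(
    DBInstanceIdentifier="legacy-sqlserver-prod",  # hypothetical name
    ProcessorFeatures=[
        {"Name": "coreCount", "Value": "4"},       # fewer licensed cores
        {"Name": "threadsPerCore", "Value": "1"},  # disable SMT if desired
    ],
    ApplyImmediately=False,  # apply in the next maintenance window
)
```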
Oh, the outro music's coming on. All right, that means I only have a few seconds left. But there's one last thing that I think all of you are going to love. Several years ago, we launched Compute Savings Plans as a way to simplify making commitments across our entire set of compute offerings. And
since that day, I've regularly been asked, when can I get a unified savings plan for databases?
And here it is. Starting today, we're launching database savings plans.
These can save you up to 35% across all your usage for our database services.
All right. Boom. Did it. A whole second keynote's worth of new capabilities for all of you in under 10 minutes. Now you
have four full days to go out and learn.
Dive deep into the details and start inventing.
Thank you all for coming to reinvent.
Enjoy.
[Music]