AMA: Black Friday Peak Performance with PAIGE
By Vercel
Summary
## Key takeaways
- **2M Concurrent Requests Peak**: We got almost 2 million concurrent requests per second at the highest peak. It was a total of 86 billion requests for the entire holiday and our uptime went above the 99.99%. [02:01], [02:28]
- **Blissfully Boring Black Friday**: Last year, Black Friday was the first Black Friday sale on the Vercel platform after replatforming. It was blissfully boring in the best way possible with record-breaking numbers, site speed, and performance stability without any problems. [02:32], [03:23]
- **Elasticity Beats Fixed Bandwidth**: Prior to Vercel, our hosting provider was not elastic with a certain amount of bandwidth and CPU, leading to downtime or slowing if thresholds were exceeded. Now on Vercel, elasticity ensures stability even for sudden spikes like Taylor Swift promoting our jeans, just resulting in a higher bill. [05:21], [06:09]
- **Instant Rollbacks Zero Downtime**: Instant rollbacks give me less headaches as I could get a Slack message about an issue and roll back while on a call without skipping a beat. The history goes far back to check previous deployments and determine the right one with zero downtime. [11:35], [12:01]
- **Stress Tests Built Confidence**: In the first year before Black Friday 2024, we met with Vercel prior and did stress tests on the website to see how it would behave under Black Friday-level visitors and clicks, and everything went super smooth. This year, higher trust from proven performance means focusing on stability with code freeze. [19:37], [20:19]
- **24/7 Support Owns P1 Tickets**: I take the lead on any P1 tickets with 24/7 support, and with enterprise support, Vercel can access our codebase to dig into root causes without finger-pointing between vendors. It's nice to have but we've had fewer issues on Vercel. [13:00], [15:15]
Topics Covered
- Blissfully Boring Black Friday Wins
- Elasticity Enables Fearless Scaling
- Instant Rollbacks Eliminate Headaches
- Unified Vercel-Next.js Fixes Root Causes
Full Transcript
Hello everyone, and welcome to our webinar. I'm your host, Sebastian Contreres. I'm a senior development success engineer here at Vercel, and today with us is Michael Aoya, director of software engineering at PAIGE. As announced before, we will be diving into how PAIGE achieved a successful Black Friday 2024 with Vercel and the technicalities behind it. Remember that this session is also an AMA, so feel free to drop your questions in the chat. My co-host Shon will be collecting the questions, and we'll cover as many as we can during the Q&A. And as usual, for any questions that we don't get to answer, we'll reach out to you via email. So, let's get started. Hi, Michael. Thanks for accepting our invitation. Please introduce yourself to the audience and tell us about your role and about PAIGE.
>> Hey, guys. And yes, hello Sebastian. So, my name is Michael. I am the director of software engineering at PAIGE. I've been at the company for about five and a half years now. My role is mainly leading a team of developers at PAIGE. Our biggest role is on the ecom side, with custom development on the website, but our company is an omnichannel experience. For those unfamiliar with PAIGE as a brand or company, we have over 20 retail stores throughout the country, and this past year we started to open up internationally with stores in the UK, and we just opened our first store in Macau, China, which is a big accomplishment. In addition to retail, we have the online e-commerce channel, which is mainly what I manage, and we also do wholesale with higher-end department stores such as Bloomingdale's, Nordstrom, Neiman Marcus, that kind of tier. So we're across many different channels, but we're definitely in the apparel space.
>> Okay, that's good. So I would like to start the conversation with Black Friday 2024 from the Vercel perspective. What you're seeing on the screen are the numbers: we got almost 2 million concurrent requests per second at the highest peak, it was a total of 86 billion requests for the entire holiday, and our uptime went above 99.99%. So my first two questions are: how was Black Friday 2024 for you on the client side, and how has this experience with Vercel shaped your Black Friday preparation for 2025?
>> All right. Yeah. So last year's Black Friday was the first Black Friday sale on the Vercel platform. We replatformed onto Vercel, so this was a big test for us, just because you always kind of want to prove it in a sense. And I'm happy to report that the term I always use for this is that it was blissfully boring in the best way possible. What I mean by that is, obviously, you're always concerned about stability, site speed, and all that, and everything went out without any problems, very smooth. I was able to enjoy my Thanksgiving without having to huddle over a computer and put out fires that year. But in addition to site speed and performance stability, we had record-breaking numbers. So it was wins across the entire board.
>> And for 2025, is there something that you want to do differently? How is it with...
>> You know, we set a high bar for 2024, so 2025 is really about trying to beat that bar. But now, with another year under our belt with Vercel, there have been a lot of new features and optimizations that we've made, so we go into this Black Friday and Cyber Monday even more confident than we were last year.
>> That's good. That's actually what I wanted to talk about next: let's dive a little bit into scalability, because it's a core aspect when we have all these traffic spikes. On Vercel's side we have several mechanisms, and we definitely made some improvements. The first one is Fluid compute, which is our new infrastructure model that allows you to execute functions in parallel within the instances and also scale the instances as we did before. We also have automatic regional failover for your Vercel Functions to avoid disruptions in case your main region fails, so you can set up a second region that can take over. Then there is the global edge network, which is the CDN part: part of dealing with these high amounts of traffic is serving cached or static content closer to your users. We also have mechanisms for ISR, where we can revalidate on demand and put fresh content in the cache as well. With that being said, we can affirm that Vercel has proven to scale. But moving to your specific solution, what technical aspects have you considered for your enterprise architecture given these elastic Vercel capabilities?
>> Yeah, so I think it's important to know where we came from before having these types of features. In our previous iteration of the website, prior to Vercel, our hosting provider was not elastic. We were given a certain amount of bandwidth and CPU, and we knew that if the website went over that threshold, whether it was visitors, clicks, whatever it may be, we would actually experience some downtime, some slowness on the site, just an overall bad experience, which did occur in the previous version. So now, being on Vercel and having the website be elastic is huge for us. As someone who manages the website and whose main focus is on stability, having it be elastic means that if, let's just say, Taylor Swift went on Instagram Live and told the world to go buy PAIGE jeans, I know that our site is still going to be stable. The elasticity is just going to add resources, and the only thing I'm going to get at the end of the month is a higher bill because of that. But you don't want to miss out on those opportunities or, in this case, for Black Friday and Cyber Monday, those sales. So it's much more important to have the peace of mind, when you sleep and when you wake up, that the site will scale to our enterprise needs.
>> Yeah, that's definitely a priority for us as well. We tested Fluid compute during the Super Bowl this year, with those kinds of spikes, and it scales well. I think scaling compute is very important in the sense that it's one of the things that breaks first. You can have cache for static websites, but nowadays dynamic features are everywhere, so having an elastic platform behind them really makes a difference.
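The caching idea discussed here, serving cached content and refreshing it on demand, can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not how Vercel's ISR or CDN is implemented: `RevalidatingCache`, `ttlMs`, and the `render` callback are hypothetical names introduced for illustration, and the model re-renders synchronously on the next request rather than serving stale content in the background.

```typescript
// Minimal in-memory model of a cache with time-based and on-demand
// revalidation. All names are hypothetical; this is not Vercel's ISR.
type Entry = { value: string; renderedAt: number };

class RevalidatingCache {
  private entries = new Map<string, Entry>();
  private render: (key: string) => string; // produces fresh content
  private ttlMs: number;                   // how long an entry counts as fresh
  private now: () => number;               // injectable clock

  constructor(
    render: (key: string) => string,
    ttlMs: number,
    now: () => number = () => Date.now(),
  ) {
    this.render = render;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Serve the cached value while it is fresh; re-render when it is
  // missing or older than the TTL.
  get(key: string): string {
    const hit = this.entries.get(key);
    const t = this.now();
    if (hit !== undefined && t - hit.renderedAt < this.ttlMs) {
      return hit.value;
    }
    const value = this.render(key);
    this.entries.set(key, { value, renderedAt: t });
    return value;
  }

  // On-demand revalidation: drop the entry so the next request re-renders.
  revalidate(key: string): void {
    this.entries.delete(key);
  }
}
```

In this model, a price change on a product page would call `revalidate("/product/123")` (a made-up path), so the next visitor gets fresh content without waiting for the TTL to expire, while all other requests keep hitting the cache.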
>> Okay. So, for my last question, I would like to focus more on Vercel features. When it comes to Black Friday, enterprises use Vercel based on the unique conditions that they have. We present Vercel's scalability features, security features, and workflow features as different dimensions, but not all companies interact with them in the same way.

For example, if we talk about monitoring and observability, there are companies that do everything from the dashboard, and they go to Observability Plus or the real-time logs. You also have usage alerts that notify you if you're reaching your allocation of MIUs, and we also have the new anomaly alerts in Observability, which are in beta. Those will notify you if something goes wrong in terms of errors and also if you have unusual usage, and they can pipe into Slack or email as well. But we have other companies that go outside of the dashboard. What they have is log drains: they pipe all the logs from Vercel into an external platform, and there they have their dashboards as well.

When you look at it from a platform security perspective, it's kind of similar, in the sense that you have things that we provide by default. The DDoS protection is there; you don't need to do anything to get it working. But you also have things at the other extreme, like BotID, which is our Kasada-based advanced protection against bots, where you actually have to modify the code to make it work. And in between you have things like WAF custom rules that you can configure in the dashboard, which are typically designed for your custom business requirements, and some opt-in features like managed rulesets. So you can define how involved you want to be with platform security.

Workflow is also very interesting, because there are different ways in which you can handle it. Some companies have Instant Rollback as the way to deal with something going wrong in production: there is the deployment sequence, and you can select a previous one and say, now you are my live deployment, and that happens with zero downtime. That's one way, but there are also flags and rolling releases, which is our newest deployment strategy. And you also have the people: the 24/7 support team, our SLAs, which will depend on your contract, and also weekday support, which is kind of like my role, where we can meet beforehand, review your configurations, and check that everything is going correctly.

So, with all those features that I presented, I lost my breath there.
The question is: if you think about your Black Friday 2025 playbook, which Vercel features have you considered a must-have, and what roles and responsibilities have you defined for your team in relation to those features with Vercel?
>> Yeah, I will start off by saying that's definitely a lot of features, and for those watching this, don't stress out about thinking you have to implement all of these at once. Even we are still slowly picking away at things we want to add from this list. We're not fully there. But we do use a diverse set from this list, and it is managed or supported by different members of the team with different tiers of permissions on the site. So I'll go over a few specifics, just so everyone has an example. Starting with something like the real-time logs: we have our QA, who periodically checks these logs to see if anything is popping up that we might need to create a bug ticket for in the back end because we're seeing specific 4xx or 5xx errors. In terms of platform security, that's always at the top of the list in terms of priority, especially for myself. So over the past year we've really started to dial in on specific WAF custom rules, as well as tested your bot protection in terms of implicit rule sets. Then Instant Rollback: I think this is the one that gives me fewer headaches throughout the year compared to before, because it's instant.
We could be on this call right now, I could get a Slack message saying, hey, there's this issue, and I could roll back while doing this webinar and not even skip a beat. That's how good it is. And the history goes all the way back: we can even look into the previous deployments and see if the issue was present there to determine which one we want to roll back to. So we have all the history. And then one thing that we've been doing recently is that our business side, more on the digital analytics side, is starting to leverage the Web Analytics within Vercel. They're not a technical team at all, but we give them view-only permissions, and they've been looking at the Web Analytics to track how some of their campaigns are going, because Vercel, being the hosting provider, has access to some of the endpoints and query parameters. So that's a new one. And then, sorry, lastly, the 24/7 support. It's another big one that gives me fewer headaches throughout the year. I will first and foremost say Sebastian is not someone who's new to me. When we first joined Vercel, we also built our website on Next.js, and Next.js is under the Vercel umbrella. Not only did we get support on Vercel itself, but also on Next.js, and Sebastian was very helpful in getting us across the finish line there. It's one of those things where you may not need it every day, but when you need it, you're sure glad they're there. So in terms of ticket support, I definitely take the lead on any P1 tickets. But I will say, because we have good support and not nearly as many issues as in the past, it's nice to have, but we're happy to say we haven't had to leverage it as much as we would if we weren't on Vercel.
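The rollback decision Michael describes, looking back through previous deployments to find one where the issue was not present, can be sketched as a small selection function. This is an illustration only: the `Deployment` shape and `pickRollbackTarget` are hypothetical and are not part of any Vercel API, and the `issuePresent` flag stands in for whatever manual or automated check a team runs against each deployment.

```typescript
// Hypothetical shape of a deployment record; not a Vercel API type.
type Deployment = {
  id: string;
  createdAt: number;     // epoch milliseconds
  issuePresent: boolean; // result of checking this deployment for the bug
};

// Pick the most recent deployment that does not show the issue,
// or undefined when every deployment in the history is affected.
function pickRollbackTarget(history: Deployment[]): Deployment | undefined {
  return [...history]
    .sort((a, b) => b.createdAt - a.createdAt) // newest first, input untouched
    .find((d) => !d.issuePresent);
}
```

Choosing the newest unaffected deployment, rather than simply the previous one, matters when a bug shipped several deployments ago and was only noticed later.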
>> I appreciate that. Yeah, what we did before is one of those weekday support things that I was talking about. It was good to see, especially at that time, the different caching strategies that can make a difference when it comes to something like Black Friday: it's not only the CDN cache, you can also use the data cache or things like that to protect against traffic spikes. And you mentioned something that I would also like to highlight: with the 24/7 support team, you take ownership of those tickets. That's important, because anyone on the team can technically go and file a ticket, but the more control there is over who files it and when it's filed, the more it helps with how we articulate contingency actions in a coordinated way. That is a challenge if you have someone who creates a support ticket but is not involved in the decision-making when we need to escalate something on the client side. So I think that's good as a...
>> I would even add, and this is more for anyone listening who's also building on Next.js: at least with the enterprise support, you guys are able to dig a little bit deeper into root causes. There have been times where you gained access to our codebase for an issue and actually looked into it, which is kind of a beautiful thing about Next.js and Vercel working well together. Historically, you would have a lot of vendors pointing the finger at each other. But because our framework and our hosting provider are the same, we can work solely on the issue in-house without having to go to a bunch of different people who are pointing the blame somewhere else. So it's a much better work process when it's all combined into one.
>> Yeah, we try to keep everything good for our customers for sure.
>> So those were all the questions I had.
So I think we can open the Q&A section.
Sharon.
Okay. So our first question is: how can I be notified by Vercel if something goes wrong? I think the new feature, which is part of Observability Plus and is called anomaly alerts, is definitely worth checking. You can define whether you want to send the notification to Slack or email, and it is also attached to error spikes, so in case there is a spike in errors, it will notify you proactively before it becomes something worse. You also have, by default, the notifications for alerts and usage, which are also related, because if something goes wrong, usage tends to spike. So I recommend that you verify your notification settings before Black Friday, because we have seen cases where no one is taking care of the email address that is configured there, or something like that. So make sure that you have everything in place.
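To make the idea of an error-spike alert concrete, here is a minimal sketch of one simple way to flag a spike: compare the latest window's error rate with the average of recent windows. This is not the algorithm behind Vercel's anomaly alerts, which is not described in this session; `WindowStats`, `isErrorSpike`, and the default thresholds are assumptions chosen purely for illustration.

```typescript
// Request and error counts for one time window (for example, one minute).
// Hypothetical type, introduced for this sketch only.
type WindowStats = { requests: number; errors: number };

function errorRate(w: WindowStats): number {
  return w.requests === 0 ? 0 : w.errors / w.requests;
}

// Flag a spike when the latest error rate exceeds the baseline average
// by `factor` and also clears an absolute floor (`minRate`), so that a
// jump from 0.01% to 0.05% does not page anyone.
function isErrorSpike(
  baseline: WindowStats[],
  latest: WindowStats,
  factor: number = 3,    // illustrative default, not a recommendation
  minRate: number = 0.01, // illustrative default, not a recommendation
): boolean {
  if (baseline.length === 0) return false; // nothing to compare against
  const avg =
    baseline.reduce((sum, w) => sum + errorRate(w), 0) / baseline.length;
  const rate = errorRate(latest);
  return rate >= minRate && rate > avg * factor;
}
```

A check like this could run over counts exported through a log drain and post to a chat webhook when it returns `true`; the thresholds would need tuning against real traffic.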
Okay. Another question I see is: what is the best way to receive support from Vercel in case something happens? For sure, it is to open a support case. On holidays and things like Black Friday, whenever you open a support ticket with Vercel, there is a severity category, and if your production traffic or your production site is suffering, that gets prioritized for the entire team. We also have our product team and our platform team, of course, making sure that everything works smoothly, but a support case is the official way in which we can track SLAs, times, and responses. Some of our enterprise customers have us in their Slack channels, and they can reach out to us there, like with the weekday support, but the initial step should always be a support case.
Okay, another question is: where can I find the downtime of a Vercel incident? This is good. I think one way is through your support ticket; normally those go under investigation, and someone is going to check on that. But also check our Vercel status page. On the Vercel status page, we update any incident, and we put the downtime there in case you need to correlate downtimes across other parts of your stack.
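Correlating downtimes across a stack comes down to checking which time windows overlap. The sketch below shows that check on plain data. It assumes the incident windows have already been copied by hand from a status page or ticket; `Interval`, `overlaps`, and `correlate` are hypothetical helpers, and no status-page API is used or implied.

```typescript
// A labelled time window, with start and end in epoch milliseconds.
// Hypothetical type, introduced for this sketch only.
type Interval = { start: number; end: number; label: string };

// Two windows overlap when each one starts before the other ends.
function overlaps(a: Interval, b: Interval): boolean {
  return a.start < b.end && b.start < a.end;
}

// For each of our own outages, list the provider incidents that overlap it.
function correlate(
  outages: Interval[],
  incidents: Interval[],
): Map<string, string[]> {
  const result = new Map<string, string[]>();
  for (const outage of outages) {
    const matches = incidents
      .filter((incident) => overlaps(outage, incident))
      .map((incident) => incident.label);
    result.set(outage.label, matches);
  }
  return result;
}
```

An outage with an empty list of matches points toward a cause in your own code or another vendor, which helps decide where to escalate first.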
Okay. So, this one: where can I find overall recommendations to prepare for Black Friday? We do have a production checklist, and we also have some official documentation on Black Friday itself. If you search for something like "preparing for Black Friday Vercel," it will show you. But, Michael, I don't know if you want to describe how it was for you: how did you engage with us when we were getting closer to Black Friday 2024? How was the process for you?
>> To clarify, you mean prepping and communicating with you guys just to make sure we're all in a good place? Is that what you're referring to?
>> Yeah, and a little bit of the process. I will start with that, because it's kind of the standard procedure: we send the production checklist whenever we're getting close. But what else?
>> Yeah. So, I would say it's a bit different from our first year. Our first year, I wouldn't say we had trust issues, but we just hadn't seen the proof yet, so we wanted to prep a little bit more. During that time we met with you guys beforehand, and we ended up doing some stress tests on the website to see how it would behave as if it were Black Friday and Cyber Monday in terms of visitors and clicks, and everything went super smooth. So we were very confident. With enterprise support, we obviously have direct communication with our account reps. We meet on a recurring basis throughout the year, we're always aligned on goals, and we understand that at this point it's more about stability. And so this year, going into it, there's been higher trust, because it was proven last year. We have our bearings going into this, we know exactly what we can expect, and it's no different from last year. So I will say it's been easy. I don't know how else to really say it, but it's a great feeling. Knock on wood, of course. But we're in a place right now where we're on code freeze, and the name of the game is just site speed and stability.
>> Yeah, I think after the code freeze it's like, knock on wood, let's hope for the best. And we really hope that you have a very boring Black Friday this year as well, Michael.
>> Appreciate it.
>> And that's pretty much all the time we have for today. So, thank you so much for joining us. Both my email and Michael's are here on the screen, so if you have any other questions, feel free to reach out directly. Thanks again for being part of today's session, and we hope to see you at the next one. Goodbye.