
AMA: Black Friday Peak Performance with PAIGE

By Vercel

Summary

## Key takeaways

- **2M Concurrent Requests Peak**: We got almost 2 million concurrent requests per second at the highest peak. It was a total of 86 billion requests for the entire holiday and our uptime went above 99.99%. [02:01], [02:28]
- **Blissfully Boring Black Friday**: Last year, Black Friday was the first Black Friday sale on the Vercel platform after replatforming. It was blissfully boring in the best way possible, with record-breaking numbers, site speed, and performance stability without any problems. [02:32], [03:23]
- **Elasticity Beats Fixed Bandwidth**: Prior to Vercel, our hosting provider was not elastic: we had a fixed amount of bandwidth and CPU, which led to downtime or slowness if thresholds were exceeded. Now on Vercel, elasticity ensures stability even for sudden spikes like Taylor Swift promoting our jeans, just resulting in a higher bill. [05:21], [06:09]
- **Instant Rollbacks Zero Downtime**: Instant rollbacks give me fewer headaches, as I could get a Slack message about an issue and roll back while on a call without skipping a beat. The history goes far back, so we can check previous deployments and determine the right one, with zero downtime. [11:35], [12:01]
- **Stress Tests Built Confidence**: In the first year, before Black Friday 2024, we met with Vercel beforehand and did stress tests on the website to see how it would behave under Black Friday-level visitors and clicks, and everything went super smooth. This year, higher trust from proven performance means focusing on stability with a code freeze. [19:37], [20:19]
- **24/7 Support Owns P1 Tickets**: I take the lead on any P1 tickets with 24/7 support, and with enterprise support, Vercel can access our codebase to dig into root causes without finger-pointing between vendors. It's nice to have, but we've had fewer issues on Vercel. [13:00], [15:15]

Topics Covered

  • Blissfully Boring Black Friday Wins
  • Elasticity Enables Fearless Scaling
  • Instant Rollbacks Eliminate Headaches
  • Unified Vercel-Next.js Fixes Root Causes

Full Transcript

Hello everyone and welcome to our webinar. I'm your host, Sebastian Contreres. I'm a senior developer success engineer here at Vercel, and today with us is Michael Aoya, director of software engineering at PAIGE. As announced before, we will be diving into how PAIGE achieved a successful Black Friday 2024 with Vercel and the technicalities behind it. Remember that this session is also an AMA, so feel free to drop your questions in the chat. My co-host Shon will be collecting the questions and we'll cover as many as we can during the Q&A. And as usual, for any questions that we don't get to answer, we'll reach out to you via email. So, let's get started.

Hi, Michael. Thanks for accepting our invitation. Please introduce yourself to the audience and tell us about your role and about PAIGE.

>> Hey, guys. And yes, hello Sebastian. So, my name is Michael. I am the director of software engineering at PAIGE. Been at the company for about five and a half years now. My role is mainly leading a team of developers at PAIGE. Our biggest role is on the ecom side with custom development on the website, but our company is an omnichannel experience. So for those unfamiliar with PAIGE as a brand or company, we have over 20 retail stores throughout the country. This past year we also started opening internationally, with stores in the UK, and we just opened our first store in Macau, China, which is a big accomplishment. In addition to retail, we have the online e-commerce channel, which is mainly what I manage, and then we also do wholesale with higher-end department stores such as Bloomingdale's, Nordstrom, Neiman Marcus, that kind of tier. So we're across many different channels, but we're definitely in the apparel space.

>> Okay, that's good. So I would like to start the conversation with Black Friday 2024 from the Vercel perspective. What you're seeing on the screen are the numbers: we got almost 2 million concurrent requests per second at the highest peak, a total of 86 billion requests for the entire holiday, and our uptime went above 99.99%. So my first two questions are: how was Black Friday 2024 for you on the client side, and how did this experience with Vercel shape your Black Friday preparation for 2025?

>> All right. Yeah. So last year, Black Friday was the first Black Friday sale on the Vercel platform. We had replatformed onto Vercel, so this was a big test for us, just because you always kind of want to prove it, in a sense. And I'm happy to report that the term I always use for this is that it was blissfully boring in the best way possible. What I mean by that is obviously you're always concerned about stability, site speed, and all that, and everything went out without any problems, very smooth. I was able to enjoy my Thanksgiving without having to huddle over a computer and put out fires that year. But in addition to just site speed and performance stability, we had record-breaking numbers. So it was wins across the entire board.

>> And for 2025, is there something that you want to do differently?

>> You know, we set a high bar in 2024, so 2025 is really about trying to beat that bar. But now, with another year under our belt with Vercel, there have been a lot of new features and optimizations that we've made to go into this Black Friday, Cyber Monday even more confident than we were last year.

>> That's good. That's actually what I wanted to talk about next. Let's dive a little bit into scalability, because it's a core aspect when we have all these traffic spikes. Vercel has mechanisms for this, and we definitely made some improvements. The first one is Fluid compute, our new infrastructure model that allows you to execute functions in parallel within the instances and also scale the instances as we did before. We also have automatic regional failover for your Vercel Functions to avoid disruptions in case your main region fails, so you can set up a second region that can take over. Then there is the global edge network, which is the CDN part: part of dealing with these high amounts of traffic is serving cached content or static content closer to your users. But we also have all those mechanisms for ISR, where we can revalidate on demand and put fresh content in the cache as well. So with that being said, we can affirm that Vercel has proven to scale. But more on your specific solution: what technical aspects have you considered for your enterprise architecture, given these elastic Vercel capabilities?

>> Yeah, so I think it's important to know where we came from before having these types of features. In our previous iteration of the website, prior to Vercel, our hosting provider was not elastic. We were given a certain amount of bandwidth and CPU, and we knew that if the website went over that threshold, whether it was visitors, clicks, whatever it may be, we would actually experience either some downtime, some slowness in the site, just an overall bad experience, which did occur in the previous version. So now being on Vercel and the website being elastic is huge for us. As someone who manages the website and whose main focus is on stability, having it be elastic means that if, let's just say, Taylor Swift went on Instagram Live and told the world to go buy PAIGE jeans, I know that our site is still going to be stable. The elasticity is going to just add resources, and the only thing that I'm going to get at the end of the month is a higher bill because of that. But you don't want to miss out on those opportunities or, in this case, for Black Friday, Cyber Monday, those sales. So it's much more important to have the peace of mind, when you sleep and when you wake up, that the site will scale to our enterprise needs.

>> Yeah, that's definitely a priority for us as well. We tested Fluid compute during the Super Bowl this year, so it had those spikes, and it scales well. I think scaling compute is very important in the sense that it's one of the things that break first. You can have caching for static websites, but nowadays dynamic features are everywhere, so having an elastic platform behind it really makes a difference.

>> Okay. So for my last question, I would like to focus more on Vercel features.

When it comes to Black Friday, enterprises use Vercel based on the unique conditions that they have. We present Vercel's scalability features, security, and the different workflow features along different dimensions, but not all companies interact with them in the same way.

For example, if we talk about monitoring and observability, there are companies that do everything from the dashboard, and they go to Observability Plus or the real-time logs. You also have usage alerts that notify you if you're reaching your allocation of MIUs, and we have the new anomaly alerts in observability, which are in beta, and those will also notify you if something goes wrong in terms of errors or usage, and that can be piped into Slack or email as well. But we have other companies that go outside the dashboard. What they have is log drains, so they pipe all the logs from Vercel into an external platform, and there they have their dashboards as well.

When you look at it from a platform security perspective, it's kind of similar in the sense that you have things that we provide by default. The DDoS protection is there; you don't need to do anything to get it working. But you also have things at the other extreme, like BotID, which is our Kasada-based advanced protection for bots, where you actually have to modify the code to make it work. And in between you have things like WAF custom rules that you can set up in the dashboard, which are typically designed for your custom business requirements, and some opt-in features like managed rulesets. So you can define how involved you want to be with the platform security.

Workflow is very interesting because there are different ways in which you can handle it. Some companies have instant rollbacks as the way to deal with something going wrong in production. There is this sequence of deployments, and you can select a previous one and say, "now you are my live deployment," and that will happen with zero downtime. So that's one way, but there are also flags and rolling releases, which is our newest deployment strategy.

And you also have the people. You have the 24/7 support team, our SLAs that will depend on your contract, and also weekday support, which is kind of like my role, where we can meet beforehand and check your configurations and see if everything is going correctly.

So with all those features that I presented, I lost my breath there. The question is: if you think about your Black Friday 2025 playbook, which Vercel features have you considered a must-have, and what roles and responsibilities have you defined for your team in relation to those features with Vercel?

>> Yeah, I will start off by saying that's definitely a lot of features, and for those maybe watching this, don't stress out thinking you have to implement all of these at once. Even we are still slowly picking away at things we want to add from this list. We're not fully there. But we do use a diverse number of things on this list, and they are either managed or supported by different members of the team, with different tiers of permissions on the site. So I'll go over a few specifics just so everyone has an example.

Starting with something like the real-time logs: we have our QA, who checks these logs on a consistent basis to see if anything is popping up that we might need to create a bug ticket for in the back end because we're seeing specific 4xx or 5xx errors.

In terms of platform security, that's always at the top of the list in terms of priority, especially for myself. Over the past year we've really started to dial in on specific WAF custom rules, as well as testing your bot protection in terms of implicit rule sets.

Instant rollbacks: I think this is the one that gives me fewer headaches throughout the year compared to before, because it's instant. We could be on this call right now, I could get a Slack message that, hey, there's this issue, and I could roll back while doing this webinar and not even skip a beat. That's how good it is. And the history goes all the way back. We can even look into the previous deployments and see if the issue was present there to determine which one we want to roll back to. So we have all the history.

One thing we've been doing recently is that our business side, more on the digital analytics side, is starting to leverage the Web Analytics within Vercel. They're not a technical team at all, but we give them view-only permissions, and they've been looking at the web analytics to track how some of their campaigns are going, because Vercel, being the hosting provider, has access to some of the endpoints and query parameters. So that's a new one.

And then, lastly, the 24/7 support. It's another big one that gives me fewer headaches throughout the year. I will first and foremost say Sebastian is not someone that's new to me. When we first joined Vercel, we also built our website on Next.js, and Next.js is under the Vercel umbrella. Not only did we get support on Vercel itself, but also on Next.js, and Sebastian was very helpful getting us across the finish line there. It's one of those things where you may not need it every day, but when you need it, you're sure glad they're there. In terms of ticket support, I definitely take the lead on any P1 tickets. But I will say, because we have good support and not nearly as many issues as in the past, it's nice to have, but we're happy to say we haven't had to leverage it as much as we would if we weren't on Vercel.
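The log review Michael describes, looking for 4xx and 5xx responses that may deserve a bug ticket, can be sketched as a small triage function. This is only an illustration under stated assumptions: the `LogEntry` shape, the `summarizeErrors` function, and the sample paths are hypothetical and do not reflect the actual format of Vercel's logs or PAIGE's process.

```typescript
// Hypothetical log entry shape; real log exports have their own fields.
type LogEntry = { path: string; status: number };

type ErrorSummary = {
  clientErrors: number;        // count of 4xx responses
  serverErrors: number;        // count of 5xx responses
  byPath: Map<string, number>; // error count per request path
};

// Count 4xx and 5xx responses and group them by path so that recurring
// failures stand out as candidates for a bug ticket.
function summarizeErrors(entries: LogEntry[]): ErrorSummary {
  const summary: ErrorSummary = { clientErrors: 0, serverErrors: 0, byPath: new Map() };
  for (const { path, status } of entries) {
    if (status < 400 || status > 599) continue; // not an error response
    if (status < 500) summary.clientErrors += 1;
    else summary.serverErrors += 1;
    summary.byPath.set(path, (summary.byPath.get(path) ?? 0) + 1);
  }
  return summary;
}

// Usage with made-up sample data.
const sample: LogEntry[] = [
  { path: "/", status: 200 },
  { path: "/cart", status: 500 },
  { path: "/cart", status: 502 },
  { path: "/old-page", status: 404 },
];
const report = summarizeErrors(sample);
```

Grouping by path is a design choice for this sketch: a single 404 is usually noise, while repeated 5xx responses on one route are more likely to point at a real defect.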

>> I appreciate that. Yeah, what we did before is one of those weekday support things that I was talking about. It was good to see, especially at that time, the different caching strategies that can make a difference when it comes to something like Black Friday. It's not only the CDN cache; you can also use the data cache or things like that to protect against traffic spikes. And you mentioned something that I would also like to highlight: with the 24/7 support team, you take ownership of those tickets. That's important because anyone on the team can technically go and file a ticket, but the more control there is over who files it and when it's filed, the easier it is to articulate coordinated contingency actions. That is a challenge if you have someone who creates a support ticket but is not involved in the decision-making if we need to escalate something on the client side. So I think that's good.

>> I would even add, and this is more for anyone listening who's also building on Next.js: at least with the enterprise support, you guys are able to dig a little bit deeper into root causes. There have been times where you gained access to our codebase for an issue and actually looked into it, which is kind of a beautiful thing with Next.js and Vercel working well together. Historically, you would have a lot of vendors pointing the finger at each other. But because our framework and our hosting provider are the same, we can work on the issue in-house without having to go to a bunch of different people who are pointing the blame somewhere else. So it's a much better work process when it's all combined into one.

>> Yeah, we try to keep everything good for our customers for sure.

So those were all the questions I had. I think we can open the Q&A section, Sharon.

Okay. So our first question is: how can I be notified by Vercel if something goes wrong? I think the new feature, which is part of Observability Plus and is called anomaly alerts, is definitely worth checking. You can define whether you want the notification sent to Slack or email, and it is also attached to error spikes. So in case there is an error spike, it will notify you proactively before it becomes something worse. You also have, by default, the notifications for alerts and usage, which are also related, because if something goes wrong, usage tends to spike. So I recommend that you verify your notification settings before Black Friday, because we have seen cases where no one is taking care of the email address that is configured there, or something like that. So make sure that you have everything set up there.
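As a rough illustration of what "notify on an error spike" can mean, the sketch below compares the latest interval's error count against the average of the preceding intervals. This is not how Vercel's anomaly alerts work internally, which the session does not describe; the function `isErrorSpike` and its `factor` and `minErrors` thresholds are assumptions made up for this example.

```typescript
// errorCounts: errors per fixed time interval, oldest first.
// A spike is flagged when the latest interval has at least `minErrors`
// errors and exceeds `factor` times the average of the earlier intervals.
// Both thresholds are arbitrary defaults chosen for this sketch.
function isErrorSpike(errorCounts: number[], factor = 3, minErrors = 10): boolean {
  if (errorCounts.length < 2) return false; // no baseline to compare against
  const latest = errorCounts[errorCounts.length - 1] ?? 0;
  const history = errorCounts.slice(0, -1);
  const baseline = history.reduce((sum, n) => sum + n, 0) / history.length;
  return latest >= minErrors && latest > baseline * factor;
}

// Usage with made-up counts per five-minute window.
const quietThenSpike = isErrorSpike([2, 3, 2, 40]);  // sudden jump over a low baseline
const steadyTraffic = isErrorSpike([20, 22, 21, 25]); // high but stable error volume
```

The `minErrors` floor avoids alerting on tiny absolute numbers, such as going from one error to four, which matters for the point made above about alerts reaching someone who will actually act on them.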

Okay. Another question I see is: what is the best way to receive support from Vercel in case something happens? For sure, it is to open a support case. On holidays and things like Black Friday, whenever you open a support ticket with Vercel there is a severity category, and if your production traffic or your production site is suffering, that gets prioritized for the entire team. We also have our product team and our platform team, of course, making sure that everything works smoothly, but a support case is the official way in which we can track SLAs, times, and responses. Some of our enterprise customers have us on Slack channels and they can reach out to us there, like with the weekday support, but the first step should always be a support case.

Okay, another question is: where can I find the downtime of a Vercel incident? This is a good one. One way is through your support ticket; normally incidents go under investigation and someone is going to check on that. But also check our Vercel status page. On the status page we update any incident and we put the downtime there, in case you need to correlate downtimes across other stacks on your platform.

Okay. So, this one: where can I find overall recommendations to prepare for Black Friday? We do have a production checklist, and we also have some official documentation on Black Friday itself. If you search for something like "preparing for Black Friday Vercel," it will show up. But Michael, I don't know if you want to describe how it was for you. How did you engage with us when we were getting closer to Black Friday 2024? How was the process for you?

>> To clarify, you mean like prepping communication with you guys just to make sure we're all in a good place? Is that what you're referring to?

>> Yeah, and a little bit of the process. I will start with that because it's kind of the standard procedure: we send the production checklist whenever we're approaching the date. But what else?

>> Yeah. So, I would say it's a bit different from our first year. Our first year, I wouldn't say we had trust issues, but we just hadn't seen the proof yet, so we wanted to prep a little bit more. During that time we met with you guys beforehand, and we ended up doing some stress tests on the website to see how it would behave as if it were Black Friday, Cyber Monday in terms of visitors and clicks, and everything went super smooth. So we were very confident. With enterprise support, we have direct communication with our account reps. We meet on a recurring basis throughout the year. We're always aligned on goals and understand that at this point it's more about stability. And so this year, going into it, it's been more about higher trust, because it was proven last year. We have our bearings going into this, we know exactly what we can expect, and it's no different than last year. So I will say it's been easy. I don't know how else to really say it, but it's a great feeling. Knock on wood, of course. But we're in a place right now where we're on code freeze, and the name of the game is just site speed and stability.

>> Yeah, I think after the code freeze it's like, knock on wood, let's hope for the best. And we really hope that you have a very boring Black Friday this year as well, Michael.

>> Appreciate it.

>> And that's pretty much all the time we have for today. So, thank you so much for joining us. Both my and Michael's emails are here on the screen, so if you have any other questions, feel free to reach out directly. Thanks again for being part of today's session, and we hope to see you on the next one. Goodbye.
