A few years back, I went down a (way too deep) rabbit hole on how to build the fastest possible computer for Excel. After seeing this post, I thought I'd share my data and results.
I had this idea after working a job that had some insanely large Excel sheets for financial computations. These sheets could have been converted to something like Power Query or Python… but oh boy, that would have taken forever. We're talking sheets that took 30-60 minutes to calculate, and which were embedded into the core of the firm's processes. So even if I did speed them up through better design, my boss would not have been happy.
So… I set out with the help of a friend to find the fastest possible computer to run monster Excel sheets. To do this, my friend and I tested RAM size, CPU speed, and number of CPU cores. And the answer was not what I expected.
RAM Size (GB)
Online and at work, I always heard how important RAM size was to fast Excel. Well, this is true… to a point. RAM (the computer's short-term memory) only becomes a problem if the workbook is so big that your computer starts running out of space. So, if your RAM is too small, like 4 or 8 GB, it becomes a bottleneck. However, once your RAM is big enough, the returns rapidly diminish.
Here's what we saw:
RAM (GB) | Minutes to Process Monster Excel Book |
---|---|
8 | 17 |
16 | 9 |
28 | 8 |
32 | 7.5 |
56 | 6 |
Graph: https://imgur.com/a/XYl9fXP
So, based on the above, less than 16 GB of RAM can cause slowdowns. But after that, your gains are pretty limited. The max speedup we saw was around 3 times (17 minutes down to 6) if you started out with 8 GB on a monster sheet.
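To make the diminishing returns concrete, here is a small sketch that turns the RAM table into speedup factors relative to the 8 GB run. The numbers come straight from the table; the helper name `speedups` is just made up for illustration.

```python
# Timings copied from the RAM table above: GB of RAM -> minutes to calculate.
ram_minutes = {8: 17, 16: 9, 28: 8, 32: 7.5, 56: 6}

def speedups(timings, baseline_key):
    """Return {config: baseline_time / time} for every row in the table."""
    baseline = timings[baseline_key]
    return {key: baseline / minutes for key, minutes in timings.items()}

ram_speedup = speedups(ram_minutes, baseline_key=8)

for gb, factor in ram_speedup.items():
    print(f"{gb:>2} GB: {factor:.2f}x faster than 8 GB")
```

Most of the gain shows up in the first step (8 GB to 16 GB); every step after that adds much less.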
CPU Speed
I also heard all the time that faster CPUs would really affect Excel speed. So, moving from an i3 to an i7 processor should make a massive difference. Well, we tested this out… and while it helped, it certainly wasn't groundbreaking.
CPU Speed (GHz) | Minutes to Process Monster Excel Book |
---|---|
2.3 | 16 |
3.4 | 8.5 |
3.5 | 7.9 |
3.7 | 7.35 |
Graph: https://imgur.com/a/HZnmywY
So, CPU speed certainly helps… but it's still limited, particularly because at the time of our research, it was hard to find chips much faster than those above. Nowadays, I see chips like the i9 that reach 6 GHz, so theoretically you could get 3-4 times faster by maximizing your CPU speed.
CPU Cores
Something no one ever talked about was how the number of cores affected processing time, but holy moly, this was the goldmine! We were pretty shocked at how much the number of cores impacted processing time.
Cores | Minutes to Process Monster Excel Book |
---|---|
8 | 16 |
16 | 4 |
20 | 3 |
64 | 1.3 |
72 | 1 |
96 | 0.6 |
Graph: https://imgur.com/a/lq6KrZU
And here was our winner! Core count makes a HUGE difference to Excel speed, and we were able to see an improvement of close to 30 times (16 minutes at 8 cores down to 0.6 minutes at 96 cores, roughly 27x)!
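For anyone who wants to check the scaling, here is a rough sketch that computes the speedup and a "scaling efficiency" for each row of the cores table, relative to the 8-core run. The timings are from the table; the function name and the efficiency definition (speedup divided by the increase in core count) are my own choices for illustration. An efficiency above 1.0 means the time dropped faster than the core count grew, which may mean something besides core count differed between the machines (that part is a guess, since the full VM specs are not listed here).

```python
# Timings copied from the cores table above: core count -> minutes to calculate.
core_minutes = {8: 16, 16: 4, 20: 3, 64: 1.3, 72: 1, 96: 0.6}

def scaling_rows(timings, baseline_cores):
    """Return (cores, speedup, efficiency) tuples relative to the baseline.

    speedup    = baseline minutes / minutes
    efficiency = speedup / (cores / baseline cores); 1.0 is perfectly linear.
    """
    base_minutes = timings[baseline_cores]
    rows = []
    for cores in sorted(timings):
        speedup = base_minutes / timings[cores]
        efficiency = speedup / (cores / baseline_cores)
        rows.append((cores, speedup, efficiency))
    return rows

rows = scaling_rows(core_minutes, baseline_cores=8)
for cores, speedup, efficiency in rows:
    print(f"{cores:>2} cores: {speedup:5.1f}x speedup, efficiency {efficiency:.2f}")
```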
So, why does this happen?
Here's our explanation: Excel is optimized pretty well to run parallel processes. With RAM, you're increasing the amount of space to run these processes… but if there is already enough space, then it won't help much. With CPU speed, it's like trying to move all your belongings across the country by buying a fancy faster car (like a Ferrari). Sure, the car may get there quicker, but it's going to take a ton of trips, and just making a single car faster will have a limited effect. But increasing CPU cores is like buying 50 slow cars (a fleet of Honda Civics). Sure, they may not be as quick, but the sheer volume of cars makes up for it since there are far, far fewer trips back and forth.
How can you take advantage of this?
We performed all our testing on virtual PCs from Azure, and used a massive Excel book filled with complex calculations such as SUMIF, COUNTIF, etc. These virtual PCs ranged in price anywhere between $200 and $3000 a month to run. So, if you really want fast Excel speed, you can log into a VM from Microsoft with a ton of cores, and do your Excel there. Just don't forget to turn it off afterwards… because you'll rack up costs fast. You don't want to be surprised by that bill.
OR, what you can do is build a beast of a PC. This can get real expensive, but if your work is valuable enough (financial/stonks), it may be worth it. For example, the AMD Ryzen Threadripper chips (up to 96 cores) would work incredibly well… but get ready to drop a few thousand dollars on the CPU alone. If you do this, settle for moderate (but not tiny) RAM and CPU speed, and put almost all your money into the cores.
Now, something to keep in mind is that if you use formulas like INDIRECT, these can kill your speed no matter what computer you are using. INDIRECT forces Excel to calculate in a single-threaded manner, bottlenecking everything… so avoid, avoid, avoid if you care about speed. There are a few other functions and features of Excel like this too, so keep a watch out for them, because even a beast computer won’t help much in these scenarios.
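To put the "turn it off afterwards" advice in numbers, here is a back-of-the-envelope sketch. It assumes the monthly price is for a VM left running around the clock and that billing is prorated by the hour; both are assumptions on my part, so check the actual pricing for whatever VM you pick. `HOURS_PER_MONTH`, `hourly_rate`, and `session_cost` are made-up names for this example.

```python
# Assumption: an average month has 8760 / 12 = 730 hours.
HOURS_PER_MONTH = 730

def hourly_rate(monthly_price):
    """Approximate hourly price, assuming the monthly figure means 24/7 use."""
    return monthly_price / HOURS_PER_MONTH

def session_cost(monthly_price, hours_used):
    """Approximate cost of running the VM for only `hours_used` hours."""
    return hourly_rate(monthly_price) * hours_used

# The price range quoted above: $200 to $3000 per month.
for monthly in (200, 3000):
    two_hours = session_cost(monthly, hours_used=2)
    print(f"${monthly}/month VM: about ${two_hours:.2f} for a 2 hour session, "
          f"${monthly} if left on all month")
```

Under those assumptions, a short session on even the priciest machine costs a few dollars, while forgetting to shut it down costs the full monthly price.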
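One standard way to reason about why single-threaded work caps the benefit of extra cores is Amdahl's law. To be clear, this is a textbook model and not something we measured: the parallel fractions below are purely illustrative assumptions, and `amdahl_speedup` is a name made up for this sketch.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup = 1 / ((1 - p) + p / n).

    parallel_fraction (p) is the share of the work that can be spread
    across cores; the rest runs on a single thread no matter what.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

# Illustrative (not measured) parallel fractions, on a 96 core machine.
for p in (0.50, 0.90, 0.99):
    print(f"{p:.0%} parallel on 96 cores: {amdahl_speedup(p, 96):.1f}x")
```

The more of the workbook that is stuck on one thread, the less those extra cores can do for you, which is the same point as the INDIRECT warning above.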
So, what did I do with this information?
A friend and I built an Excel add-in called Yeet Sheets that hooked Excel up to a super fast computer in the Azure cloud, so that when you clicked the "calculate" button, hour-long workbooks would take like 2 minutes. At one point, we were using something like 400 core PCs to test speed, and holy moly, it was insanely fast. Shout out to my friend who helped me here, because he's a beast in coding + smarts.
Unfortunately, there was not a lot of demand in the finance sector for this add-in, so we ended up shutting Yeet Sheets down a few years ago and it's no longer available. There were a few reasons for this, including that large data processing is being done more and more in tools like Python. In addition, there can be clever ways to make Excel faster through proper design rather than maxing out the PC hardware, though these ways can take a lot of optimizing by an Excel expert to get right. But we certainly learned a lot along this path!
Anyways, I thought r/excel might enjoy this examination, and maybe it can get some of you out there the lightning fast upgrade you deserve 🙂
Interesting. Thanks for sharing.
Very cool. Thanks for writing this up. I love the name Yeet Sheet.
I enjoyed this post.
One thing I’d like to mention here is that if you are doing this at home you can get second hand Xeon workstations on the cheap on eBay. I picked up a quad core HP Z440 for $175 AUD. This included 32 GB of ECC Ram and a Samsung 980 Pro NVMe M.2 drive.
I upgraded the 4GB Quadro GPU to an 8GB Radeon RX 6600 and can easily upgrade to a higher core count Xeon but I’m not CPU constrained.
I mostly work in Power BI where I find RAM is the biggest bottleneck for large Power Query workloads. I have not found core speed or core count to make a huge impact on performance for working in Power BI Desktop.
Thanks for testing and thanks for sharing this information. 👍🏽
Regarding core count, it occurs to me it may also tie to available cache. Just curious if you explored onboard cache sizes at all.
Great post! Very insightful.
“Yeet Sheets”
That’s awesome! 🤣👍
Ha, I unfortunately went through a similar exercise. I was working on a deal last year where we would get raw data dumps for an Excel file with 800k lines. The partner on the deal hated Power Query, because how dare software and hardware advance past his era of Excel, so I had to run our analysis using just run of the mill formulas on this shitty data that needed multiple passes before coming clean.
It took a godforsaken amount of time on a gen7 X1 ThinkPad when I was in the office. Back home I was upgrading my photo editing rig for a bump up in MP and noticed dramatically better performance with 32 GB of RAM and, I think, a mid tier Ryzen and an old Nvidia 1070: the calcs went from 30-some minutes down to the mid teens. Upgrading to an Intel i7 12700K and an equivalent mobo gave some increase in performance, but ramping up to 64 GB of RAM was the kicker that got it to a handful of minutes. I did put in a 3060 Ti to help with AI photo processing, but that was less of an Excel boost and more of a huge one in Lightroom.
This! This is THE shit! The most useful thing I saw on the internet this whole week!
Should definitely include skus since the speed of a cpu is not only dependent on the clock speed but also the architecture which dictates the ipc. You could be testing very different architectures that give very different results.
Excellent post. It made my save list.
Now for power bi plssss
Yeah really good insights! As someone who is going to be in the market for a computer really soon this is just *chefs kiss* !
Why do people hate databases ?
This is so useful to know, the different RAM sizes and CPU cores are crazy to see honestly
Wish I had an award to give. Amazing stuff right there!
Fantastic work.
Not sure how feasible this is, but isn’t there a place where you can still host/sell the Yeet Sheet for the (few) people who might be interested in this?
Seems a shame to stop selling something since you put all the work into it anyway and now it’s collecting dust.
At the very minimum you might get coffee money.
I knew Excel had decent concurrency and parallelisation support, but didn’t realise it went that far. Obviously there are diminishing returns, but 16-20 cores seems to be the sweet spot.
I don’t really use large data sets anymore – most of my work is done on more technical workbooks with smaller data sets, but always good to know.
Maybe I just am missing something in the post, but how much Ram was used on the CPU speed and # of cores test? I assume 16 since that was the diminishing gains breakpoint, but wanted to confirm.
But what I’m seeing is all of the tests had diminishing returns as they scaled higher. Getting a computer with 16 GB of RAM with a 16 core processor at about 3.4 GHz (which could probably be achieved through overclocking if there isn’t a stock model at this speed) would net a significant amount of the gains and be relatively affordable for the average person. Let the high finance peeps blow their money for every last second they can win back.
There are a few things I will add to this…
Formats in Excel take up memory. The more distinct formats, the more memory. This can be mitigated somewhat by using styles, but too many styles lead to the same problem. When Excel is out of memory for displaying formatting, you will not see that formatting displayed, even if the formatting is in place.
The worst case of this I saw was a workbook in the days of 32-bit Excel where the user kept copies of past weeks’ reports for the fiscal year as separate worksheets. By the middle of the fiscal year each worksheet’s formatting was no longer displayed. Yes, each worksheet had the same formatting, but because each worksheet was separate, and cells were formatted directly without using styles, each worksheet’s “bold underlined blue cell” was considered a separate format.
VBA in Excel is single threaded. Frequent use of UDFs based on VBA will not benefit from having more cores. This is why LAMBDA and Name Manager for UDFs is superior to VBA based UDFs.
Thank you for this. I am nowhere in your league and I recently did a minor upgrade to my PC, but the biggest difference was going from 8 to 20 cores. This was not the goal, just something faster than the old motherboard. Seriously, processes that took minutes now take seconds and times when I had to be careful about starting a macro before all the calculations were complete are no longer a significant concern.
You just cost me a lot of money on a new pc I’ll be buying Monday, but I’ll be thankful for years.
Great job op, thanks for sharing
Great work! It would be awesome to see what variables you held constant and at what level while testing the one. Or possibly running more tests to see how varying 2 at a time impact performance. E.g., is core performance sensitive to RAM. Does one bottleneck the other/is there an efficiency horizon? Additionally I’m curious if you or this community knows if diff kinds of functions or sheet structures can take better advantage of multi core, etc..
Thx again for the big lift
Godbless you.
Nice, I’m pleasantly surprised at how well it scales with the core count.
You still have to do Intel vs AMD chart – AMD has affordable 12 or 16 cores (ok, 12 are the affordable ones) with large L3 cache that may or may not shine for excel. Then maybe using some super fast storage.
Wow…probably the most informative post I have read to date that explains how excel works and also how CPU configurations affect it. Kudos for taking the time and energy you may have spent to understand all this, but more importantly for explaining to us laymen 🙏🏽🙏🏽
For the core count in your study, are they all physical cores, or do they include hyperthreaded cores?
Thanks for the great writeup!
Very cool. You did the work, good Reddit person
Ya lmao. I have an i9 12900k with 64 gigs of ram with a 3080ti.
I literally have zero lag.
What were the Excel and OS version(s) used?
Thanks for this. Good info!
Insightful post
As an abuser of power query (excel or pbi), am interested to know if cores impact equally so.
If Excel is too slow, you should use another tool.
And as for PowerQuery, I fucking hate it.
Amazing effort but I have to ask…are you controlling other variables when you test one? E.g. i3 vs i7 is a terrible way to test CPU clock.
The core count results look very odd too. It’s usually understood that doubling core count doesn’t double performance, for good reasons. Yet the result here shows better than linear improvement (double core count -> performance more than double).
Perhaps you could share more information on the system specs etc
Ram clock rate/latency?
CPU cache sizes?
anything that goes beyond the “buying a pc from a discounter ad i get in the mail specs” would be interesting…
this only tells: bigger ram and fast cpu freq seems better without going into detail.
maybe team up with some other subreddit where people are obsessed with their builds and want as many benchmarks as possible 🙂
also try different files/calculations since some need the FPU more than others
This is like crack-cocaine for me. Thank you