Breaking the GPU market with Tesla M40

Video on YouTube on the Miyconst channel – https://www.youtube.com/watch?v=v_JSHjJBk7E

During these crazy times of inflation and mining driving GPU prices to the moon along with Dogecoin, a “Xeon 2678 v3” of the GPUs had to come. Scrolling through YouTube, a video of a guy testing the Tesla K80 came up: a beasty compute card with 24 GB of RAM and dual GK210 chips. Sure, it was powerful, but the Kepler architecture aged like French cheese in the sun, unlike GCN, and its dual-GPU nature does not help; I made a mental note about looking for its Maxwell successor. That same video probably also made its way to Kostiantyn Cherniavskyi aka Miyconst, who messaged me about the Tesla M40 shortly after. We decided to buy some cards to test. I am going to report here on our findings and all the contraptions required to make these amazing cards work and save us in these times of GPU shortage!

The story behind the chip

The Maxwell architecture was a turning point for nVidia, designed with the rising laptop market in mind, just like Intel was doing on the CPU side with Haswell in the same era. Let’s have a small flashback: after the notoriously power-hungry Fermi in 2010, strongly compute-focused with huge DP performance, in 2012 they decided to design Kepler with more focus on graphical performance, cutting the DP ratio and reaping efficiency gains thanks to the 28nm node.

nVidia though, just like AMD, had to stick with the same node for a whole three years, from 2012 to 2015, as TSMC had delays with the 20nm node (which ended up being cancelled entirely – but that just meant moving back on schedule to 16nm, unlike what a certain blue company did later with 7nm..), so the efficiency gains in power and die space in Maxwell came from clever redesign and cuts to the FP64 unit ratio. And this time, unlike Kepler’s “hybrid” GK110, they didn’t even increase it on their top-tier die, the GM200 at the core of the M40, the Titan X Maxwell and the 980 Ti. This makes the Tesla M40 the most “purely graphical” compute card ever made by nVidia.

M40: a headless monster

But… it has no display outputs! And here, my friends, in a twist of events it’s actually Windows 10 that comes to help: recent versions have the ability to manually specify a render GPU different from the one driving the actual display. This feature is a byproduct of muxless dual-GPU designs, which allow laptops to keep the display wired to the iGPU only, waking up the dGPU just to render the 3D scene when needed and disabling it completely when the iGPU alone can handle the load, like basic office or browsing activities. With a simple command we can (somewhat impolitely) ask the nVidia driver to switch the M40 to graphics mode, reboot, go to the control panel and manually specify which apps we want to run on the Tesla.
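For reference, here is a sketch of how that driver-model switch looks with nvidia-smi (run from an elevated prompt on Windows; I am assuming the M40 shows up as GPU 0 – check the index on your system first):

```shell
# List each GPU with its current driver model (TCC = compute only, WDDM = usable for graphics)
nvidia-smi --query-gpu=index,name,driver_model.current --format=csv

# Ask the driver to switch GPU 0 from TCC to WDDM (0 = WDDM, 1 = TCC).
# A reboot is required before the change takes effect.
nvidia-smi -i 0 -dm 0
```

After the reboot, the per-app GPU selection lives in Settings → System → Display → Graphics settings.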
And here’s where we are all happy gaming on our new shiny 12 GB server GPU, right?
Well… not so fast, my friends.

… running very hot

Being designed for server operation, the M40 expects the sysadmin to provide the very powerful and noisy industrial fans that keep the 250W beast tamed: its simple blower-style cooler has no heatpipes and is just 1.5 slots wide, and the beautiful green plastic shroud does not help at all in the process. There are two important things we need to do here: implement a proper cooling solution, and make sure the card runs as efficiently as the silicon lottery allows with vBIOS modding.
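While experimenting with cooling, it helps to watch the card in real time; a simple sketch using nvidia-smi’s query interface (field names as in current drivers – adjust the GPU index if the M40 is not GPU 0):

```shell
# Poll temperature, power draw and SM clock every 2 seconds,
# to verify the new cooling keeps the GM200 out of thermal throttling
nvidia-smi -i 0 --query-gpu=temperature.gpu,power.draw,clocks.sm --format=csv -l 2
```

If the SM clock starts dropping while the temperature climbs toward the throttle point, the cooling solution is not keeping up.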

Stay tuned for updates!