BARE FEATS - real world Mac speed tests

PERFORMANCE TWEAKS: Doom 3 for Mac

Originally posted March 11th, 2005, by rob-ART morgan, mad scientist
Includes technical discussion on Doom 3 programmers' attempts to optimize Doom 3 for the Mac.

Mac gamers have been anxiously awaiting Doom 3 for the Mac. Hope springs eternal that the Mac version of Doom 3 will perform as well on the Mac as it does on the PC. In our "Mac versus PC" game test page, we ran Doom 3 at 1600x1200 with Video Quality set to HIGH and the first four Advanced Options all set to YES. The G5/2.5GHz Power Mac with the two fastest graphics cards was handily beaten by "high end" Windows PCs.

There are already some excellent articles on Doom 3 for the Mac at other sites. (See links below.) We decided to do an article probing ways to squeeze out a little more performance with only slight loss of visual quality and realism. These are the steps we've found helpful:

STAGE ONE: Run at 1600x1200 with Video Quality set to HIGH; First FOUR Advanced Options set to YES. This is the starting point and the settings we used to compare the Mac to the PC in another article.

STAGE TWO: Turn off SHADOWS in ADVANCED OPTIONS -- a step suggested by the developer.

STAGE THREE: Turn off Anisotropic Filtering with the console command (image_anisotropy 0). This filter only affects distant details and, in my opinion, won't make a noticeable difference in realism during game play. (In the case of the ATI X800 XT, we used the ATI Displays utility to override the Anisotropic Filtering settings in Doom 3, because it gave us faster times whether set at 8 or OFF.)

STAGE FOUR: Switch from Trilinear Filtering to Bilinear. This is done by replacing:
seta image_filter "GL_LINEAR_MIPMAP_LINEAR"
with
seta image_filter "GL_LINEAR_MIPMAP_NEAREST" (in the config file).
This tweak helped the GeForce 6800 Ultra but didn't do anything for the Radeon X800 XT.
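If you'd rather not hand-edit the config file, the Stage Four swap can be scripted. Here is a minimal Python sketch, not anything shipped with the game: the function name set_image_filter is made up for illustration, and appending the line when none exists is an assumption about how the config file is read. It works on the text of the file, so you would still read and write DoomConfig.cfg yourself.

```python
import re

TRILINEAR = "GL_LINEAR_MIPMAP_LINEAR"    # value named in the article
BILINEAR = "GL_LINEAR_MIPMAP_NEAREST"    # value named in the article


def set_image_filter(config_text, mode=BILINEAR):
    """Return config_text with its 'seta image_filter' line set to mode.

    Hypothetical helper: it only edits text, it does not touch any file.
    """
    pattern = re.compile(r'^([ \t]*seta[ \t]+image_filter[ \t]+)"[^"]*"',
                         re.MULTILINE)
    new_text, count = pattern.subn(
        lambda m: m.group(1) + '"' + mode + '"', config_text)
    if count == 0:
        # Assumption: if the line is missing, adding it at the end is enough.
        sep = "" if (not config_text or config_text.endswith("\n")) else "\n"
        new_text = config_text + sep + 'seta image_filter "%s"\n' % mode
    return new_text


if __name__ == "__main__":
    before = 'seta image_filter "GL_LINEAR_MIPMAP_LINEAR"\n'
    print(set_image_filter(before), end="")
```

Passing TRILINEAR as the second argument puts the original setting back, which is handy when comparing runs.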

STAGE FIVE: This tweak only works on ATI Radeon graphics cards. We used ATIccelerator II to overclock the Radeon X800 XT (core clock set 5% faster to 500MHz and memory clock set 11% faster to 1100MHz). (We don't recommend this. It can cause permanent damage to your graphics card. If you see screen artifacts, then change the settings back to default immediately.)

For the record, we only saw a 5% gain with overclocking when all the tweaks were set. But at the "normal" High Quality setting with Shadows ON and Anisotropy set to 8, the gain from overclocking is 13%.

The graph below shows the gains we achieved with each stage of tweaking. We included the times from various Windows PC configurations as a reference point:

LEGEND for graph
Opteron 252 = AMD Opteron 252 (2.6GHz, dual processors, PCI-express)
Athlon FX55 = AMD Athlon FX55 (2.6GHz, PCI-Express)
Xeon GeFU = Intel Dual Xeon GeFU 3.4GHz (PCI-Express)
G5/2.5 = G5/2.5GHz Power Mac (8X AGP)
GeFU = nVidia GeForce 6800 Ultra (Mac 8X AGP or PC PCI-Express)
X850 = ATI Radeon X850 XT (PC edition, PCI-Express)
X800 = ATI Radeon X800 XT (Mac 8X Edition)
(1) = STAGE ONE, (2) = STAGE TWO, (3) = STAGE THREE, (4) = STAGE FOUR
(5) = STAGE FIVE
Blue bars = PCs running at the same settings as untweaked Macs
Red bar = fastest tweaked Mac

CONCLUSION ON OUR TWEAKS of Doom 3 for the Mac
Even with up to five tweaks, the fastest Macs with the fastest graphics cards could not catch up to the Windows PCs. The Mac is a very sophisticated personal computer. Should we care if it can't play games as well as the "crude" PC? I say, "YES!" Why can't the Mac's amazing architecture, rich development tools, and state of the art OS combine to provide top notch OpenGL performance in the newest 3D game? "Why does Doom 3 run so slow on the Mac compared to the PCs?"

MAC GAME PERFORMANCE BRIEFING FROM THE DOOM 3 DEVELOPERS
Glenda Adams, Director of Development at Aspyr Media, has been involved in Mac game development for over 20 years. I asked her to share a few thoughts on what attempts they had made to optimize Doom 3 on the Mac and what barriers prevented them from getting it to run as fast on the Mac as on comparable Windows PCs. Here's what she wrote:

"Just like the PC version, timedemos should be run twice to get accurate results. The first run the game is caching textures and other data into RAM, so the timedemo will stutter more. Running it immediately a second time and recording that result will give more accurate results.

The performance differences you see between Doom 3 Mac and Windows, especially on high end cards, is due to a lot of factors (in general order from smallest impact to largest):

1. PowerPC architectural differences, including a much higher penalty for float to int conversion on the PPC. This is a penalty on all games ported to the Mac, and can't be easily fixed. It requires re-engineering much of the game's math code to keep data in native formats more often. This isn't 'bad' coding on the PC -- they don't have the performance penalty, and converting results to ints saves memory and can be faster in many algorithms on that platform. It would only be a few percentage points that could be gained on the Mac, so it's one of those optimizations that just isn't feasible to do for the speed increase.

2. Compiler differences. gcc, the compiler used on the Mac, currently can't do some of the more complex optimizations that Visual Studio can on the PC. Especially when inlining small functions, the PC has an advantage. Add to this that the PowerPC has a higher overhead for function calls, and not having as much inlining drops frame rates another few percentage points.

3. More robust and modern OpenGL implementation on OS X. The fact that OpenGL is engineered from the ground up on OS X to be accessible from many applications at once is wonderful for the rest of the world, but does have a performance hit for games. Sharing GL with the rest of the system invokes a small overhead that Windows doesn't have, since Windows can basically assume GL is just in use for one application.

4. OpenGL framework/drivers split on OS X. On Windows, ATI and nVidia are responsible for the OpenGL code all the way from the hardware to the game. On the Mac, Apple handles the top layers of OpenGL and then hands data off to the video card drivers. On Windows this allows the video card manufacturers to do some more direct optimizations that make sure data gets passed to the card as fast as possible. The Mac can't short circuit that process, since there is a fairly well defined boundary between GL and the video card drivers. This is complicated by the more modern GL implementation on OS X as well- Apple can't just put in a bunch of hacks to shove data around the wall and into the cards, just for the game.

5. And the last, but definitely most important factor: Amount of time Apple/ATI/nVidia have had to optimize specifically for Doom 3. On Windows, ATI/NVIDIA spent multiple programmer years tuning their OpenGL implementations for Doom 3, starting back over a year ago while the game was still in development. Apple/ATi/NVIDIA have done an immense amount of work on OS X's GL in the last 3-4 months, but there is no way they could get as much done as the dozens of Windows engineers working on the problem for over a year. 10.3.8 includes a huge number of GL optimizations that make a big difference in Doom 3, and the game wouldn't have been in any shape to ship without these. One of the biggest things ATi & nVidia do on the PC for Doom 3 is have application specific OpenGL optimizations just for the game. They can detect Doom3 is the application using GL, and even which shaders it is downloading -- then they can shift to a mode that is highly optimized just for those cases.

The good news on all of these fronts, especially the last one, is that Doom 3 is such a highly visible benchmarking application, Apple/ATI/NVIDIA/Aspyr are all going to be continuing to work on increasing performance over the coming months/years. Just like what happened with Quake 3, the Mac OS matured, video card drivers got more optimized, and the game was tweaked so that eventually Mac performance is now as good or better than comparable PC hardware (I'd be really interested to see benchmarks with Quake 3 with the original shipping Mac app & version of OS X versus the latest app & current OS on the same hardware). Games drive hardware and the OS, and Doom 3 will likely push Apple to upgrade consumer video cards and continue to spend engineering time in the future to speed up OpenGL."

WHY HIGH RESOLUTION AND HIGH SETTINGS ARE COOL
The whole idea of running at high resolutions like 1600x1200 is to increase the sense of realism. Textures and objects are more detailed and focused. And you have no need to turn on Anti-Aliasing since jaggies are minimal.

It was interesting that when I ran the G5/2.5 Power Mac with Radeon X800 XT at 1600x1200 High Quality (No Shadows), our average wasn't much slower than running at 800x600 Medium Quality (No Shadows). My point is that with high end graphics cards, you don't reach ultra high framerates at low settings, but you don't lose significant speed at high settings. What's really cool is running at 1920x1200 custom resolution on my 23" Cinema. I felt totally immersed in the game and I averaged 48 fps. Of course, if I turn Shadows back on at 1600x1200 or 1920x1200, the framerate drops into the mid 30s.

One might ask, "What should be the minimum target average framerate?" I say 40 fps. Let me explain.

The human eye won't detect stutter unless your framerate drops below 16 frames per second. Even if your Mac averages 25 fps, your framerate will often drop to under 10 fps when you encounter complex fight scenes. So the most important number is your MINIMUM framerate, not the average.

It would be nice if Doom 3 captured and published the minimum framerate like UT2004 does. You might try turning on the FPS display while running Doom 3 to observe how low the framerate dips while running Demo1. (In console mode, enter "com_showFPS 1".) You want to see if your framerate drops below 16. If it does, you might want to try lower quality settings or lower resolution. If you go strictly by Doom 3's published average at the end of the timedemo run, your goal should be to average 2.5 to 3 times the 16 fps "stutter threshold" to avoid any detectable jerking. Therefore, your target average when running the benchmark sequence should be 40 fps or higher.
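For anyone who wants to check the arithmetic, here is a small Python sketch of the rule of thumb above. The function names and the sample readings are invented for illustration; only the 16 fps threshold and the 2.5x to 3x multipliers come from this article.

```python
STUTTER_THRESHOLD_FPS = 16  # the "stutter threshold" used in this article


def target_average_range(threshold=STUTTER_THRESHOLD_FPS, low=2.5, high=3.0):
    """Return the (low, high) target average fps for a given threshold."""
    return threshold * low, threshold * high


def dips_below_threshold(fps_samples, threshold=STUTTER_THRESHOLD_FPS):
    """True if any observed fps reading falls under the threshold."""
    return min(fps_samples) < threshold


if __name__ == "__main__":
    print(target_average_range())            # (40.0, 48.0)

    # Hypothetical readings jotted down from the on-screen FPS display.
    readings = [52, 47, 31, 14, 38]
    average = sum(readings) / len(readings)
    print(average, dips_below_threshold(readings))
```

The made-up readings show why the average alone can mislead: they average in the mid 30s, yet one of them is under 16.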

COLLABORATION & EDUCATION
Robert Uyehara and his Macologist crew have been sharing their test data and insights with me since I first started testing Doom 3. They have put together two excellent articles on Doom for the Mac. The first article gives you the history of Doom and includes some early test results. Their Performance Followup article contains extensive test results for various Macs at 640x480, 800x600, and 1024x768, along with detailed explanations on factors that affect performance.

RUNNING YOUR OWN DOOM 3 BENCHMARKS
If you want to run the standard benchmark that comes with Doom 3, it's easy.
1. Choose the video settings you want. (You may have to quit and relaunch after some changes.)
2. Enter console mode (Control + Option + ~).
3. Type "timedemo demo1" (without the quotes, of course), then press the RETURN key. It will load and play back a captured fight sequence, then display the average frame rate on the screen.
4. You'll notice the first run jerks and stutters a lot. That's because it is busy caching textures and other data into RAM, as Glenda pointed out above. If you return to Console and run "timedemo demo1" a second time, you'll notice it starts right away and runs smoothly. Use the results from the second run as the indicator of your Mac's Doom 3 performance.

(NOTE: One reader informed me that adding a "1" parameter at the end of the "timedemo" command (timedemo demo1 1) will pre-cache the run. I tried it but I still get 1 to 2 fps faster on the second run. So even if you pre-cache the first run, I suggest making two runs and noting the second run's result.)
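If you log your runs in a script or spreadsheet, the "use the second run" rule is easy to encode. This is only a sketch in Python: the function name and the sample numbers are hypothetical, and you would type in the averages that Doom 3 displays yourself.

```python
def reported_fps(run_averages):
    """Pick the figure to report from consecutive timedemo averages.

    Follows the article's rule: the first run is treated as a warm-up
    and the second run is the result. Returns (second_run, warmup_gain).
    """
    if len(run_averages) < 2:
        raise ValueError("need at least two timedemo runs")
    first, second = run_averages[0], run_averages[1]
    return second, second - first


if __name__ == "__main__":
    # Hypothetical averages typed in by hand after two runs.
    result, gain = reported_fps([46.5, 48.0])
    print(result, gain)   # 48.0 1.5
```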

If you make a lot of benchmark runs, I have a suggestion for a short cut:
Go into the DoomConfig.cfg (in User/Library/Application Support/Doom 3 folder). Open it with TextEdit. Use the "bind" command to assign console commands to a function key. In my case, I entered one change:
bind "F1" "timedemo demo1" (or bind "F1" "timedemo demo1 1" for pre-caching)
Now all I have to do to run the benchmark is to press the F1 function key after launching Doom3. You don't have to go to console mode to start the timedemo run and you don't even have to click "OK" after the first run is complete. Just hit F1 again.
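The same edit can be made with a few lines of Python if you switch bindings often. This is a rough sketch, not a tool that comes with Doom 3: add_bind is a made-up name, it works on the text of the config rather than the file itself, and replacing any earlier binding for the same key is my own design choice. It is probably safest to edit the file while the game is not running, since I assume the game may rewrite its config.

```python
def add_bind(config_text, key, command):
    """Return config_text with key bound to command.

    Hypothetical helper: any existing 'bind "<key>" ...' line is dropped
    and the new binding is appended at the end.
    """
    prefix = 'bind "%s" ' % key
    kept = [line for line in config_text.splitlines()
            if not line.strip().startswith(prefix)]
    kept.append('bind "%s" "%s"' % (key, command))
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    print(add_bind("", "F1", "timedemo demo1"), end="")
    # bind "F1" "timedemo demo1"
```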

There are other little short cuts that save time during benchmarking -- like pressing ESC to skip the startup video or programming a function key to EXIT.

We posted another page on Doom 3. This time we chose mid-range Macs and ran at 800x600 Medium with Shadows off. Check it out.

RELATED LINKS

Bare Feats runs Doom 3 on various mid-range Macs at 800x600 Medium with Shadows OFF.

Mac versus PC in 3D Gaming (Our related article where we match up the fastest G5 Power Mac configurations against some of the fastest PCs.)

Mac versus PC running Photoshop, After Effects, etc.

Doom 3 "First Look" by MacWorld. (Includes benchmark results)

SharkeyExtreme compares the PC versions of the GeForce 6800 GT and Ultra with the Radeon 9800 XT and X800 XT running Doom 3.

Anandtech Doom 3 shootout with PC versions of GeForce 6800 GT, 6800 Ultra, Radeon X800 XT, and 9800 XT.

WHERE TO BUY VARIOUS GRAPHICS CARDS FOR YOUR POWER MAC and MAC PRO

For your Mac Pro, you have the following 16X PCI Express (PCIe) options:
The GeForce 7300 GT (16X, 256MB, dual-link DVI + single-link DVI port) is the default. We recommend the Radeon X1900 XT (16X, 512MB, two dual-link DVI ports) as a CTO option. It's much faster than the GeForce 7300 GT and just as fast as the expensive Quadro FX 4500. According to Alias/Autodesk, the X1900 XT is the only graphics card without limitations when using Maya 8.5. To custom order your Mac Pro with the Radeon X1900 XT, go to the Apple Store and click on the Mac Pro graphic.

If you didn't order the Radeon X1900 XT with your Mac Pro, you can order it as an aftermarket kit. Go to the Apple Store and click on DISPLAYS in the left margin or do a search on "X1900."

NOTE: Mac Pro PCIe graphics cards will not work in Power Mac G5s with PCIe slots -- and vice versa. Nor will Windows PC PCIe graphics cards work in the Mac Pro.

Graphics Card Options for the Dual-Core or Quad-Core G5 with 16X PCI Express slot:
The best option for your Dual-Core or Quad-Core G5 with PCIe slots is the ATI Radeon X1900 G5 Mac Edition released in November 2006. You can buy it directly from ATI's Online Store for $299 (with "trade up" allowance).

It's also sold by Small Dog Electronics and Other World Computing.

The following cards only work on a G5 Power Mac with 8X AGP slot:
The "G5 only" Radeon X800 XT Mac Edition (8X AGP, 256MB, ADC + Dual-Link DVI port) is available from ATI Online Store, Apple's Online Store, Buy.com, Other World Computing, and Small Dog Electronics. (The MSRP is $299)

Apple's Online Store is no longer selling the GeForce 6800 GT or Ultra, which had Dual-Dual-Link DVI ports (for two 30" Cinemas).

The "G5 only" Radeon 9800 Pro Mac Special Edition (8X AGP, 256MB, ADC + DVI port) is no longer made by ATI.

The following cards work on both the G5 Power Mac (8X AGP) and G4 Power Macs with 2X or 4X AGP:
Other World Computing has the new ATI Radeon 9800 Pro Mac (2X/4X AGP, 256MB, DVI + VGA ports) graphics card in stock for $259. ATI has it on their Online Store for $249. The SKU number is 100-435058, in case you want to make sure you are getting the right card.

ATI Online Store, Buy.com and Other World Computing have the Radeon 9600 Pro PC and Mac Edition (4X AGP, 256MB, DVI + Dual-Link DVI port) as well. It's compatible with late model G4 Power Macs and all G5 Power Macs with AGP slots. Priced at $199 MSRP it is the lowest priced AGP graphics card with Dual-Link DVI support.

WHERE TO BUY G5 POWER MACS
For new and refurbished G5 Power Macs, check with Small Dog and Power Max.

WHERE TO BUY WINDOWS PCs
There are many places to buy PCs. We want to plug WhisperPC since they provided the Intel Dual Xeon 3.4GHz.

The AMD Athlon FX55 (2.6GHz) was provided courtesy of @XiComputer. To price the Athlon FX55, visit their "configure now" page.

© 2005 Rob Art Morgan
"BARE facts on Macintosh speed FEATS"