SOD: Sidebar Diversion

I couldn’t get the idea out of my head that the Avatar rendering cluster required 1 petabyte of storage. However, this slide show of the facilities used for filming the actors opened my eyes. [eye opening slide show]

The petabyte is required not just for the finished product. It’s needed to store all the sensor and camera data as well. Okay. I accept that Weta needed 1PB. How does one go about creating a petabyte storage facility? What are the tradeoffs? How much does it cost to build and then to maintain?

I need to get this out of my head and free up some brain cycles to continue with my Seeds of Discontent series. This article is a sidebar.

Disclaimer: I’m using a server build derived from the Seeds of Discontent series (see also, Seeds of Discontent: System Benchmarking Hardware). It isn’t a traditional server room server but it serves as a baseline talking point. Besides, I can get budgetary pricing off newegg.com. It’s good enough for the purposes of this discussion. I’m sure the staff at Weta (or ILM or Dreamworks or Pixar) would have more insight into what *really* works.

I’ll choose the 4U chassis for this exercise since it better approximates the airflow of a desktop chassis. The 50U rack is taller than usual, but it allows for twelve 4U servers plus a switch.

In the Seeds of Discontent series, the HDD array uses the cheapest drives available; the final size is not important. In this sidebar discussion, final HDD capacity *is* important. This is not a single desktop box but a compute and storage cluster.

Each server functions as a compute unit as well as a storage unit.

A commodity multi-core CPU together with two commodity PCI Express GPU subsystems comprise the compute unit. Primary storage and swap for the compute unit is a two-disk RAID 0 SSD. The GPU cards have their own RAM while the CPU has two triple-channel banks of DDR3.

The server also hosts an HDD array which is not private storage but part of a larger storage cluster. For this exercise, I arbitrarily choose GlusterFS.

I’ve budgeted $2,500 USD per server (less the HDD array).

(circa mid-November 2010)
case+PSU       $ 200
mainboard        230
SSD              135
SSD              135
DDR3 24GB        550
CPU              850
GPU              200
GPU              200
--------------------
              $2,500 USD

In addition, I’ve budgeted $1,000 per rack and $1,000 per switch. A fully constructed rack sans HDD is $32,000. YMMV.

rack         $ 1,000
switch         1,000
servers       30,000 (2,500 * 12 servers/rack)
--------------------
             $32,000 USD

Let’s add the HDD to build out a petabyte cluster. Since I’m using the Asus Rampage III (admittedly not a server mainboard), the two GPUs fully consume the PCI Express lanes. There aren’t any lanes left for a RAID card. The two SSD drives occupy the two SATA III channels, leaving the seven SATA II channels for the HDD array. Each server then adds seven HDD (and each rack adds 84 HDD) to the cluster.

As seen from this list (circa mid-November 2010), the lower capacity drives are not the cheapest drives per gigabyte.

SKU               $USD  cnt    GB   $/GB $K/PB  HDD/PB
---------------   ----  ---  ----  ----- -----  ----
WD5002ABYS-20PK   1750   20   500  0.175  175   2000
WD7502ABYS-20PK   2500   20   750  0.167  167   1333
0A39289-20PK      2700   20  1000  0.135  135   1000
WD7501AALS-20PK   1550   20   750  0.103  103   1333
WD6401AALS-20PK   1300   20   640  0.102  102   1563
WD5000AADS-20PK    900   20   500  0.090   90   2000
0F10381-20PK       900   20   500  0.090   90   2000
WD1001FALS-20PK   1800   20  1000  0.090   90   1000
WD5000AAKS-20PK    880   20   500  0.088   88   2000
WD6400AARS          55    1   640  0.086   86   1563
WD7500AADS-20PK   1250   20   750  0.083   83   1333
ST3500418AS         40    1   500  0.080   80   2000
WD7500AADS          55    1   750  0.073   73   1333
WD20EVDS-20PK     2800   20  2000  0.070   70    500
0F10383-20PK      1400   20  1000  0.070   70   1000
WD10EALS-20PK     1350   20  1000  0.068   68   1000
WD10EARS            65    1  1000  0.065   65   1000
WD10EARS-20PK     1200   20  1000  0.060   60   1000
WD15EARS-20PK     1700   20  1500  0.057   57    667
ST31000528AS        50    1  1000  0.050   50   1000
ST31500341AS        60    1  1500  0.040   40    667

Downselect the cheapest drive at each capacity point greater than or equal to 1TB.

SKU               $USD  cnt    GB   $/GB $K/PB  HDD/PB
---------------   ----  ---  ----  ----- -----  ----
WD20EVDS-20PK     2800   20  2000  0.070   70    500
ST31000528AS        50    1  1000  0.050   50   1000
ST31500341AS        60    1  1500  0.040   40    667

I want to minimize the number of drives while minimizing costs. The Seagate 1TB drive is both more expensive per GB and requires more drives per PB than the Seagate 1.5TB drive. It is immediately eliminated. The competition is between the Western Digital 2TB drive and the Seagate 1.5TB drive.

If cost were the only issue, then the Seagate drive would win. If drive count were the only issue then the Western Digital drive would win. To get closer to an answer, let’s build out the storage cluster.

Petabyte Cluster with Just a Bunch of Disks

        ======= count ======  ========= cost ==========
Drive   HDD   Servers  Racks    HDD     Rack     Total
-----  -----  -------  -----  -------  -------  -------
2.0TB    500       72      6   70,000  192,000  262,000
1.5TB    667       96      8   40,000  256,000  296,000
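For anyone who wants to play with the assumptions, the rows above can be reproduced with a small script. This is only a sketch under the assumptions already stated (1 PB = 1,000 TB raw, 7 HDD per server, 12 servers per rack, $32,000 per built rack sans HDD); rounding may differ from the table by a server or a few dollars.

#!/bin/bash
# JBOD sizing sketch: capacity in TB and price per drive in USD
for spec in "2.0 140" "1.5 60"; do
  set -- $spec
  awk -v tb="$1" -v price="$2" '
    function ceil(x) { return (x == int(x)) ? x : int(x) + 1 }
    BEGIN {
      drives  = ceil(1000 / tb)      # 1 PB = 1,000 TB of raw capacity
      servers = ceil(drives / 7)     # 7 free SATA II ports per server
      racks   = ceil(servers / 12)   # 12 servers per rack
      hdd  = drives * price
      rack = racks * 32000           # $32,000 per built rack, sans HDD
      printf "%.1fTB: %4d HDD  %3d servers  %2d racks  HDD $%d  racks $%d  total $%d\n",
             tb, drives, servers, racks, hdd, rack, hdd + rack
    }'
done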

Even though the 2TB drives cost $30,000 more than the 1.5TB drives, the total cluster cost is $34,000 higher for the 1.5TB choice. Furthermore, there is no redundancy to protect against drive failure. How likely is a drive to fail? It’s not just likely to happen. It will happen. If a single drive will on average fail in five years, then a pool of 500 drives will on average see a failure every 1/100 of a year (or roughly two drives a week).
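A quick sanity check on that arithmetic (assuming a flat five-year average life, ignoring infant mortality and wear-out):

$ awk 'BEGIN {
    drives = 500; life_years = 5
    days = life_years * 365 / drives   # mean days between failures in the pool
    printf "one failure every %.2f days, about %.1f per week\n", days, 7 / days
  }'
one failure every 3.65 days, about 1.9 per week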

Without debate, I posit that it is not possible to make nightly backups of a petabyte storage cluster. Firstly, we’d need a second petabyte. Secondly, that’s a lot of data to move and the cluster needs to continuously run compute jobs (rendering). The solution is redundancy, either through local RAID on each machine (e.g., RAID 6) or through GlusterFS replication (analogous to RAID 10 but at the cluster level).

Petabyte Cluster with RAID-6

        ======= count ======  ========= cost ==========
Drive   HDD   Servers  Racks    HDD     Rack     Total
-----  -----  -------  -----  -------  -------  -------
2.0TB    700      100    8.3   98,000  268,000  366,000
1.5TB  1,087      156     13   56,000  416,000  472,000

Petabyte Cluster with RAID-10

        ======= count ======  ========= cost ==========
Drive   HDD   Servers  Racks    HDD     Rack     Total
-----  -----  -------  -----  -------  -------  -------
2.0TB  1,000      144     12  140,000  384,000  524,000
1.5TB  1,334      192     16   80,000  512,000  592,000

The drive count really starts to add up, leading to more frequent drive failures. With RAID 10 (or the equivalent GlusterFS replication scheme), the operations team can expect to replace five to six drives per week.

But the larger question is, how many servers are needed for the compute cluster? What if rendering needed no more than 60 compute nodes? If we fixed the compute node count at 60, we would need to add more drives per server. For the sake of discussion, assume we could load 24 drives per server, but that doubles the cost per server before including drive costs (i.e., 2 * $2,500 = $5,000 per server sans HDD).

Furthermore, assume we’re using GlusterFS replication for the storage cluster redundancy. This pushes the drive count up but avoids the complexity of building and maintaining a local 24-drive RAID system in each server.
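I haven’t built this, but a replicated GlusterFS volume spanning these servers would look roughly like the following. The hostnames, brick paths and volume name are made up; with replica 2, GlusterFS mirrors bricks in consecutive pairs as listed.

# from node01, after 'gluster peer probe node02', 'gluster peer probe node03', ...
gluster volume create render-vol replica 2 \
    node01:/bricks/hdd01 node02:/bricks/hdd01 \
    node03:/bricks/hdd01 node04:/bricks/hdd01
gluster volume start render-vol

# any compute node then mounts the whole cluster as one filesystem
mount -t glusterfs node01:/render-vol /mnt/render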

Petabyte 60 Server Cluster with RAID-10

        ======= count ======  ========= cost ==========
Drive   HDD   Servers  Racks    HDD     Rack     Total
-----  -----  -------  -----  -------  -------  -------
2.0TB  1,000       60      5  140,000  160,000  300,000
1.5TB  1,334       60      5   80,000  160,000  240,000

There is a $60,000 capital cost difference between the two clusters. The 1.5TB cluster suffers roughly a third more drive failures, while the larger 2TB drive costs more per GB.

Drive Failure Rate
(5 Year time to fail)
Petabyte 60 Server Cluster with RAID-10

       == failure cost ==   HDD   failrate  === month ===
Drive  HDD  labor  subtot  units   (days)   fails   cost
-----  ---  -----  -----   -----  --------  -----  ------
2.0TB  140     70  $ 210   1,000     1.825   16.4  $3,518
1.5TB   60     70    130   1,334     1.368   21.9   2,851

These numbers presume that both drives have the same MTBF. Digging a bit further, we find that the WD20EVDS claims 1 million hours MTBF and the ST31500341AS claims 750,000 hours MTBF. That is, the operations staff can expect the Seagate drives to fail at a rate one third greater than that of the Western Digital drives.

Sidenote: The two drives are in different classes. The Seagate drive spins at a faster rate (7,200 RPM) and claims better performance. The slower Western Digital drive (5,400 RPM) claims consistency and lower power. However, the slower spindle speed is fine for the storage cluster, which acts as secondary storage.

I will not attempt to sort out the apples-to-oranges comparison between the two drive manufacturers. I shall take my previous calculations and adjust the 1.5TB drive failure rate up by a third (21.9 failures/month × 1,000,000 / 750,000 ≈ 29.2).

Drive Failure Rate
(Adjusted Fail Rate)
Petabyte 60 Server Cluster with RAID-10

                        original      adjusted
       fail    HDD   === month ===  === month ===
Drive  cost   units  fails   cost   fails   cost
-----  -----  -----  -----  ------  -----  ------  
2.0TB  $ 210  1,000   16.4  $3,518   16.4  $3,518
1.5TB    130  1,334   21.9   2,851   29.2  $3,801

A reversal of recurring costs. Does it matter? No. Not really. In my opinion, it’s more important to minimize the day-to-day operation hassles. The three hundred bucks a month (one way or the other) is noise. The $60K difference in initial capital costs is significant but not as significant as reliable operations.

Rumor has it that both Seagate and Western Digital will soon release a 3TB drive.

Update 2010-12-17: Xbit Labs reports on Hitachi’s new sixth-generation perpendicular magnetic recording (PMR) which “enable 3.5″ hard drives with 4TB or even 5TB capacities.”

My final ponderings on this fantasy cluster look at the impact of future HDD capacities. For this, I simply speculate on the cost per gigabyte. If you have pricing information more closely tied to reality, please let me know. 🙂

Drive Failure Rate
(unadjusted 5year rates)
Petabyte 60 Server Cluster with RAID-10 (or equivalent)

       $/GB   == failure cost ==   HDD   failrate  === month ===
Drive         HDD  labor  subtot  units   (days)   fails   cost
-----  -----  ---  -----  -----   -----  --------  -----  ------
5.0TB  0.100  500     70    570     400     4.564    6.6  $3,762
4.0TB  0.090  360     70    430     500     3.651    8.2   3,526
3.0TB  0.080  240     70    310     667     2.737   11.0   3,410
2.0TB  0.070  140     70    210   1,000     1.825   16.4   3,518
1.5TB  0.040   60     70    130   1,334     1.368   21.9   2,851

This isn’t the end of the tradeoff line. At some point, 3.5 inch HDD will yield to the 2.5 inch form factor. The larger drives just won’t be available. That dynamic will change the equation. GPU cards will become increasingly more capable. CPU core counts will increase. RAM costs will decline. Fewer servers will be needed. Fewer racks. Less power. I’m sure in my naïveté I’ve underestimated much here. However, I do believe one day the entire data center used to build Avatar will fit inside a 40 foot shipping container. And then, some time later–but not much later–that compute power will shrink to fit in a 20 foot container. And so on. The important point is that the capital costs (excluding facilities) for this fantasy cluster are under a million dollars. And it gets cheaper by the day.

SOD: System Benchmarking Hardware

I’ve been working on and off on the next Seeds of Discontent article. This is a tidbit of an upcoming post but I wanted to push it out now since I need it published for a divergent sidebar article. It stands here bald and raw. That’s life.

This system uses most of the Rampage III resources. The two Radeon cards will completely consume the PCI Express lanes (2 x16). The 24GB DDR3 will fill the six DRAM slots.

There are 7 SATA II and 2 SATA III ports.

The two SATA III ports connect the two SSD drives as a RAID Level 0. All non-user data is served from this drive (e.g., the operating system and benchmark applications).

One SATA II port connects the DVD writer.

Six SATA II ports connect the six HDD as a RAID Level 0. The HDD were selected for their cache (64MB), interface (SATA III, though connected to SATA II), spindle speed (7200 rpm, not as fast as 10,000 or 15,000 rpm but much, much cheaper) and price (cheap).
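If the stripes were built with Linux software RAID rather than the motherboard controller, the setup might look something like this sketch (device names are illustrative, not necessarily what the board enumerates):

# two-SSD stripe for the OS and applications
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sda /dev/sdb

# six-HDD stripe for the working data
mdadm --create /dev/md1 --level=0 --raid-devices=6 \
      /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh

mkfs.ext4 /dev/md1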

The two video cards were selected for being current and for being cheap (sort of).

Not shown in the picture are four 1920 x 1080 LCD monitors. These monitors pivot, making them nice for development (4320 x 1920 combined in portrait orientation).

Item 1                                         $  60
Antec Three Hundred Computer Case
Black Steel ATX Mid Tower

Item 2                                           180
CORSAIR AX850 850W Power Supply

Item 3                                           230
ASUS Rampage III LGA 1366 Intel X58
Micro ATX Intel Motherboard
N82E16813131658

Item 4                                           880
Intel Core i7-970 Gulftown 3.2GHz LGA 1366
130W Six-Core Desktop Processor
L2 Cache: 6 X 256KB
L3 Cache: 12MB
BX80613I7970

Item 5                                           225
Item 6                                           225
Patriot Gamer Series (3 x 4GB)
SDRAM DDR3 1333 (PC3 10666)
PGS312G1333ELK

Item 7                                            70
Item 8                                            70
Item 9                                            70
Item 10                                           70
Item 11                                           70
Item 12                                           70
Western Digital 3.5" Caviar
640GB 7200 RPM SATA 6.0Gb/s
Cache: 64MB
WD6402AAEX

Item 13                                          134
Item 14                                          134
Crucial RealSSD C300 2.5" SSD
64GB MLC SATA 6.0Gb/s
CTFDDAC064MAG-1G1

Item 15                                           10
SilverStone Bay Converter
mounting bracket 3.5" to 2 x 2.5"
SDP08

Item 16                                          185
Item 17                                          185
ASUS Radeon HD 6850
EAH6850 DirectCU/2DIS/1GD5
1GB 256-bit GDDR5 PCI Express 2.1 x16
EAH6850

Item 18                                           11
HIS CrossFire Bridge Interconnect Cable
HCFBC4065

Item 19                                          250
Item 20                                          250
Item 21                                          250
Item 22                                          250
Acer 1920 x 1080 LCD Monitor
B243H

Item 23                                           25
LITE-ON 24X DVD Writer Black
iHAS424-98

Mid-Peninsula Aikido

Calligraphy rendering of Aikido characters

These are my notes made while looking for an Aikido dojo in the San Francisco Bay’s mid-peninsula region. Specifically the communities in and around Palo Alto, CA. (image: wikimedia.org)

[Update 2011-12-20: I have chosen to train at Aikido Center and started attending Zazen and Aikido classes in October 2011.]

I won’t get into the ‘why’ I’m looking for an Aikido dojo in this posting. Most of what I would say can be gleaned from Stan Wrobel’s excellent book, Aikido for Self Discovery: Blueprint for an Enlightened Life. I’ve included a short book review.

I made this handy google map of the Aikido dojos listed below. There is a KML file for the Google Earth crowd. Longitude and latitude coordinates were converted from the street addresses using capelinks.com.

I have included the Sensei for each dojo. It’s partly out of respect and partly out of a sense that the Sensei defines the nature of a particular dojo.

I met Doran Sensei once, briefly, some ten years ago. In those few minutes, he made a lasting impression. You cannot honestly talk about Aikido on the mid-peninsula without including Doran Sensei and his contributions.

I never met Kushida Sensei (he lives in Michigan) but I attribute his influence at Enbukan dojo. I trained for two years under a former student of Kushida Sensei and have respect for the schools under his watch.

The Aikido Center (aka Aikido Center of Mountain View) has been in its current location for many years. The dojo has a strong zen element. That is zen as in ‘zen practice’ not to be confused with ‘zen decor’.

Aikido of Mountain View has roots to the beginnings of Aikido in the area through Nadeau Sensei. I may be mistaken but I believe the dojo was once located on Charleston near what is now REI.

The Institute of Transpersonal Psychology (ITP) has a dojo but I’m under the impression that it is open only to students, faculty and staff. Ironically, I’m a Masters Program student at ITP but since I’m in the Global Program, I don’t attend classes on campus. I live nearby so maybe I will be eligible. In any event, if/when I find out more information, I’ll update the listing. Frager Sensei is also Dr. Robert Frager, co-founder (along with Dr. James Fadiman) of ITP. Aikido is an integral part of the ITP Resident Program.

I don’t know anything more about the remaining dojo than what I’ve read on their websites. I recommend reading their material first hand.

On the matter of dues (or tuition), I included the figures I found on their websites. I feel the amount of the dues is not really important. Note that all the schools charge about the same amount (hundred bucks a month, give or take). This is a really expensive part of the country; it is very expensive to operate a dojo here. My guess is that the dues keep the doors open. Barely.

Only the adult class schedule is given. Most schools have a separate children’s schedule on their website.

The following list is in alphabetical order by dojo name. The information reflects what was listed on each dojo’s respective website circa mid-November 2010. Things change with time. Check with the school for the latest information.

Aikido Center
Shiohira Sensei

2560 Wyandotte Avenue, #A
Mountain View, CA. 94043

Monday      18:30-20:00        basic
            20:00-20:30        open
Tuesday     17:45-18:30        zazen ?
            18:30-20:00        basic
            20:00-20:30        open
Wednesday   18:30-20:00        beginners
            20:00-20:30        open
Thursday    17:45-18:30        zazen ?
            18:30-20:00        basic
            20:00-20:30        open
Friday      18:30-20:00        beginners
            20:00-20:30        open
Saturday    10:00-11:00        basic
            12:20-13:00        basic

$100/month
Aikido of Mountain View
Nadeau Sensei

1924 Plymouth Street
Mountain View, CA 94043

Monday      18:30-20:00        fundamental
Tuesday     18:30-20:00        basic
Wednesday   18:30-20:00        basic
Thursday    18:30-20:00        basic
Saturday    09:30-11:00        weapons
            11:00-12:30        basic
Sunday      11:00-12:30        basic

$100/month
Aikido West
Doran Sensei

3164 Bay Road
Redwood City, CA 94063

Monday      06:30-07:00        basics
            18:00-19:00        basics
            19:00-20:15        general
Tuesday     18:00-19:00        basics
            19:00-20:15        general
Wednesday   06:30-07:00        basics
            18:00-19:00        basics
            19:00-20:15        general
Thursday    18:00-19:00        basics
            19:00-20:15        general
Friday      18:30-20:00        general
Saturday    08:45-09:45        basics
            10:00-11:00        general
Sunday      08:30-09:45        weapons
            10:00-11:00        general

$95/month
Aikido@Noon
Hale Sensei

501 Laurel Street,
Menlo Park, CA 94025

Monday      12:00-12:50
Wednesday   12:00-12:50

$333/3months resident
 449/3months non-resident
(Through City of Menlo Park)
Enbukan Dojo
Kushida Sensei

209 West Evelyn Avenue
Mountain View, CA 94041

Monday      19:00-19:50        2
            20:00-20:50        3
Tuesday     18:15-18:55        intro
            19:00-19:50        1
            20:00-20:50        2
Wednesday   19:00-19:50        3
            20:00-20:50        2
Thursday    18:15-18:55        intro
            19:00-19:50        2
            20:00-20:50        1
Friday      19:00-20:30        open
            20:00-20:50        3

$90/month
Institute of Transpersonal Psychology
Frager Sensei

1069 East Meadow Circle
Palo Alto, CA 94303

?           ?                  ?

Tuition unknown
Rocky Valley Aikido
Heery Sensei
Tuazon Sensei

1211 Arbor Road
Menlo Park, CA 94025

Tuesday     19:00-20:00        St. Raymond School
Wednesday   17:30-18:30        Mitchell Park
Thursday    19:00-20:00        St. Raymond School

$60/month
Sierra Aikido
Levenbach Sensei
Bennett Sensei

2134 Old Middlefield Way, Suite A
Mountain View, CA 94043

Monday      19:00-20:00        weapons
Tuesday     12:00-13:00
Wednesday   19:00-20:00
Thursday    12:00-13:00
Friday      19:00-20:00
Saturday    11:00-12:00        weapons

$100/month
Stanford Aikido
Doran Sensei

Arrillaga Family Sports Center
Wrestling Room
641 Campus Drive
Stanford, CA 94305

Monday      12:00-13:00
Tuesday     12:00-13:00
Wednesday   12:00-13:00
Thursday    12:00-13:00
Friday      12:00-13:00

$ 50/quarter  Student
 105/quarter  Stanford Faculty/Staff
 225/quarter  Non-Stanford Affiliated
  15/quarter  Anyone paying dues at Aikido West

Will it Blend?

In the previous article Seeds of Discontent, I jotted down a few benchmarks to compare OS X, Windows and Linux performance at a system level. In this article, I explore further a test built around graphics rendering engines. In particular, I was impressed by Sintel, an open source movie built with (among others) Blender.

I don’t know how long it takes to assemble the movie from source assets into a shipping product but I’ll venture it is quite a compute intensive process. I got the idea for this benchmark while reading on the sintel.org home page that the project is re-rendering the film for 4K and that it should be ready later this year (still a few months away). I didn’t know what 4K was so I looked it up. That led me to compile a table of film formats and supported Frames per Second (FPS). I also appended three rows for HDTV, of which only 720p and 1080p are real. 4320p is something bandied about as a future format but I couldn’t find material online that would lead me to believe it’s anything but a concept. I included it to define the upper envelope edge. (Note: different formats support different levels of FPS.)

Format   (Width × Height)  Frames per Second (FPS)
-------  ----------------  ----------------------------
2k 16:9 (2,048 × 1,152)  24, 25, 30, 50, 60, 100, 120
2K 2:1  (2,048 × 1,024)  24, 25, 30, 50, 60, 100, 120
2K ANA  (1,408 × 1,152)  24, 25, 30, 50, 60, 100, 120
 
3k 16:9 (3,072 × 1,728)  24, 25, 30, 50, 60
3K 2:1  (3,072 × 1,536)  24, 25, 30, 50, 60
3K ANA  (2,112 × 1,728)  24, 25, 30, 50, 60

4k 16:9 (4,096 × 2,304)  24, 25, 30
4K 2:1  (4,096 × 2,048)  24, 25, 30
4K HD   (3,840 × 2,160)  24, 25, 30
4K ANA  (2,816 × 2,304)  24, 25, 30

4.5k WS (4,480 × 1,920)  24, 25, 30

720p    (1,280 ×   720)  24, 25, 30, 50, 60
1080p   (1,920 × 1,080)  24, 25, 30, 50, 60
4320p   (7,680 × 4,320)  24, 25, 30, 50, 60

From this, I built a spreadsheet table of “Gigabytes per Second” of uncompressed data and converted the table into a set of data files compatible with gnuplot. Below are plots for 24 and 60 FPS for three formats.
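The arithmetic behind those data files is simple. Here is a sketch, assuming 3 bytes per pixel (uncompressed 24-bit color; a production pipeline may well use deeper color, which scales the numbers accordingly):

$ awk -v fps=24 'BEGIN {
    # label width height triples; 3 bytes per pixel assumed
    split("1080p 1920 1080  4K-16:9 4096 2304  4320p 7680 4320", f)
    for (i = 1; i <= 9; i += 3)
      printf "%-8s %.3f GB/s\n", f[i], f[i+1] * f[i+2] * 3 * fps / 1e9
  }'
1080p    0.149 GB/s
4K-16:9  0.679 GB/s
4320p    2.389 GB/s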

Gigabytes per Second, 1080p, 4K 16:9, 4320p, 24 frames per second

Gigabytes per Second, 1080p, 4320p, 60 frames per second

I chose 24 FPS since it is a standard film frame rate. I excluded 25 and 30 FPS since they were so close to 24 FPS that they didn’t add any additional insight into designing the test. (25 and 30 FPS happen to be half of 50 and 60 FPS, which I discuss next). One datapoint to consider is that Blu-Ray supports 1080p at no greater than 24 FPS.

Just as the world is split between driving on the left or right hand side of the road, the world’s electrical power is split between 50 and 60 Hz. Neither is right or wrong but they are different and the two don’t mix well (like people driving on both the left and right side of the same road don’t mix well). I currently live in a 60Hz world so I biasedly chose the familiar but I could have easily chosen 50Hz. They are so close to each other in context of this exercise that the differences are negligible.

I chose two HD formats (4320p and 1080p) plus one film format (4K 16:9). The film format file size falls between the two HD formats. Note that 4K doesn’t support a frame rate above 30 FPS and isn’t included in the 60 FPS graph.

Note that I haven’t included compression in this test as I believe it to be a different strain on the system.

My thought for the Blender test was to establish a benchmark that would be inherently obvious and understandable to anyone, especially to those without any previous knowledge of video rendering or video formats. It’s more intuitive to say, “This system renders the video 1000 times slower than real time.” It also makes for a nice roadmap chart. Consider time on a linear x-axis and rendering ratio on a logarithmic y-axis.

A second roadmap milestone map could be what format/fps combination(s) achieve real time rendering status for a specific system on a specific date.

It would be interesting to see the performance of an ad hoc office cluster vs. a 40 foot shipping container of specialized hardware. If this sounds crazy, juxtapose any of the recent systems from Tom’s Hardware System Builder Marathon against the original IBM PC. (Or the ENIAC ;^)

In the conclusions section, I’d want to see analysis/conjecture about the test results regarding system bottlenecks. In an attempt to compare the systems, I’d also want to see the individual system price divided by the performance ratio. For example,

System A
Cost         $1200 USD
Ratio        0.0005245  (1830:1)
Cost/Ratio   2,196,000

System B
Cost         $1153 USD
Ratio        0.0005736  (1735:1)
Cost/Ratio   2,021,209

Cost Comparison
System A  : System B
    1,200 : 1,153
    1.040 : 1  (A is 4% more expensive than B)

Ratio Comparison
System A  : System B
    1,830 : 1,735
    1.055 : 1  (A is 5.5% slower than B)

Cost/Ratio Comparison
System A  : System B
2,196,000 : 2,021,209
    1.086 : 1  (A is 8.6% more expensive
                to finish the same job as B)
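The same comparison spelled out as a script, using the hypothetical System A and System B figures above:

$ awk 'BEGIN {
    costA = 1200;    costB = 1153        # USD
    ratioA = 1830;   ratioB = 1735       # render time : real time
    cprA = 2196000;  cprB = 2021209      # cost/ratio from the tables above
    printf "cost        %.3f : 1\n", costA / costB
    printf "ratio       %.3f : 1\n", ratioA / ratioB
    printf "cost/ratio  %.3f : 1\n", cprA / cprB
  }'
cost        1.041 : 1
ratio       1.055 : 1
cost/ratio  1.086 : 1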

If we compare operating environments (Windows 7, OS X, Ubuntu, Fedora) on the same system and then repeat those tests across various systems (Dell, Apple, Whitebox), we start to get an apples-to-apples comparison of price, performance and price/performance ratios.

Note: the spreadsheet (numbers, exported to excel), data files, gnuplot scripts and images are available for download.

[update 2010-10-05: I just read at thenextweb.com that rendering Avatar requires 1 petabyte of drive space. That’s in 3D but it still makes me wonder why there’s a wide gap between my calculations of a hypothetical 2 hour movie and that of Avatar. The article at information-management.com gives a reference of 17.28 gigabyte/minute. At 166 minutes, this is 2.8 terabytes (a long way off from a petabyte). Perhaps there are a lot of intermediate stages for each frame which must be saved? Nevertheless, I believe the test is still interesting.]

Seeds of Discontent

I like reading Tom’s Hardware Guide. I liked it better in the site’s early days when it wasn’t so JavaScript and Flash heavy and the articles were idiomatically ‘German-English’. A lot happens in twelve or so years. Still, Tom’s is the best source of information on the web.

I read a CPU benchmark this morning which got me wondering whether there was any possibility of benchmarking on an operating system other than Windows. I concluded that for the kind of benchmarks seen on Tom’s–not really.

Many (most?) of the benchmarks rely on software only available on Windows. The platforms are varied while holding the software and operating system constant. This makes sense as Tom’s Hardware is, well, primarily a hardware site.

But what could we learn by holding the hardware constant and varying the operating systems? A lot, I believe.

One point I’ve noticed in reading Tom’s benchmarks over the years is the increasingly predictable nature of the results. Motherboards often perform within a few percentage points of each other. There is some variance among CPUs but it is rarely unpredictable. The conclusions often center around cost-performance comparisons to see if the premium parts are justified.

What really motivated me to write this blog entry was a video card benchmark. The two major vendors (AMD née ATI, Nvidia) have long been locked in fierce battle. The benchmark seems to keep the software constant and only varies the video card hardware, but the lines are not so cleanly drawn. Each video card has a unique architectural philosophy and design. Software modules are written to detect and take special advantage of these differences in video cards.

I argue that a systems approach might make for an interesting benchmark and would likely yield surprising results. Furthermore, such a benchmark would spark the desktop religious wars that have subsided in recent years (e.g., 68000 vs x86, PowerPC vs x86, IPX vs TCP/IP, RLL vs ESDI, Word vs. WordPerfect, Microsoft vs. Novell …). It seems the good fight has now moved on to smartphones and tablets. The desktop market has become docile.

My aim is to whack the hive. This is my vision.

1. This benchmark has a name: “Seeds of Discontent”

Why this name? It fits. How?

Ah. A little about me. My first job out of college was writing assembly language graphics routines used in medical imaging. On a 12MHz 80286. This introduced me to graphics hardware leading to a job at National Semiconductor (yes, National once developed graphics chips) and a deeper understanding of PC architecture. In the late nineties, I moved on to a small consumer electronics startup and eventually into pure software. I’ve seen the industry from several perspectives.

The early PC days placed importance on CPU speed for better performance and hardware integration (combining many chips into a single chip) for lower cost. The early days are gone and the funeral was the death of Comdex which happened a few years before the last Comdex. Those last few Comdex shows were zombies and those of you who were there know what I mean.

Nowadays, the big advances in hardware don’t spur the kind of religious fervor of old. They evoke a semi-interested “Oh, that’s kinda cool” response (e.g., PATA to SATA, HDD to SSD, dual core to quad core to six core). Interesting but not controversial.

This isn’t to say the hardware designers are sleeping. They are not. It’s just that hardware advancements alone won’t produce the kinds of dramatic performance improvements year-over-year that we saw during the eighties and nineties. The kinds of hardware advances we now see require specialized software to take advantage of those advances. And the industry knows it.

The greatest potential for seeing surprising results in a benchmark is not in swapping out CPUs or mainboards. The greatest potential lies in swapping out the operating systems. If there is one scrap of religiosity left in the industry, it’s in operating systems. As it happens, the OS is also the component that now has the biggest potential to influence the industry.

Which brings me to my point. I believe the right set of benchmark tests could pit OS vs OS and spur the kind of religious spat that leads to real progress. With peace breaking out all over the industry, whipping up a bit of OS hooliganism might just spark a real revolution. As David W. once told me, “Man’s best friend is his dogma.” Today, there is nothing more religious in the industry than operating systems.

2. “Seeds of Discontent” has a mission: I see you, OS

Chicken and egg.

Multicore CPUs and graphics engines both require specialized software to take advantage of the hardware. But today’s changing hardware is a moving target. It’s hard for a small software firm to write specialized software for an ever-changing platform. But without software running on a platform (e.g., graphics), there is no “tie down” for a hardware vendor to stick with a stable API (I’m looking at you, Nvidia and AMD).

In the multicore CPU arena, it takes a partner in the OS vendor to take full advantage of the multicore architecture. This isn’t to say multithreaded software is held back by today’s OS, but rather that today’s OSes don’t do enough to create fertile ground for multithreaded/multicore software development. (Yeah, the penguinistas are going ape-shit about now but I stick with my proposition.)

The real problem, as I see it, is that it takes a coordinated effort to make the necessary software development changes. Nobody wants to spend their money first. Nobody. Especially those with no money (the solo developer with no resources other than time).

Disclaimer: I am a macinista.

Apple’s Snow Leopard (I have no information on Lion) arrived with much fanfare about Grand Central Dispatch, LLVM, OpenCL, blah blah blah and other multicore magic. I love the candy-coated icons for Core-X. But how has that translated into better software support for modern hardware? You don’t hear much about it either way.

“Seeds of Discontent” is a set of benchmarks meant to expose an adequacy gap and challenge each OS to buck up and meet the others. Really. Can anyone name a benchmark today by which one OS can embarrass another? No.

The real value-add an OS can deliver is hardware abstraction. (Unless you are the dominant OS–Microsoft–software developers are under-motivated to deliver software customized for specialized hardware under the OS.) I want OS vendors motivated to deliver better performance with cost-effective effort for software vendors. Not exclusively, but in a multi-vendor environment.

Benchmarks give a measurable performance comparison between operating systems. And the operative word here is “System”. Given a fixed hardware platform, how do the various systems perform? It’s not which comes first, “chicken or egg”; what comes first is the benchmark and the “gap of shame.” Nobody wants to be last. Nobody.

3. Who?

Comparing various operating systems presupposes a fixed hardware platform (I say). It also presupposes running the same software on all systems. Technically, that is impossible. Well, impossible except from a systems perspective.

Software makes OS calls to handle many functions. Mostly this is to abstract the hardware details away from the application. What’s needed then is software that has been compiled for each of the target systems. Since I’m interested in system performance (i.e., what you see when you use the system), I don’t care about how the software is written. If a package is not optimized to use the special benefits offered by the OS, I don’t care why; in my world, the experience is simply diminished.

So the one thing that is constant in a benchmark is the hardware. This isn’t such a problem for Linux and Windows but does present a problem for OS X. I want OS X included. Here lies a fork in the road. Does the benchmark use an Apple computer and install all the OSes on that machine? Or does the benchmark use a white-box PC and install all the OSes on that machine?

I vote for the latter. This means the only way to make a proper benchmark is to build a hackintosh. This presents a problem since Apple forbids it. I recommend, then, including a “Brand-X” OS. Of course, the people actually performing the benchmark cannot comment, but I would want Brand-X to be OS X, ostensibly the latest shipping version at the time of the benchmark. Of course, Brand-X could be anything. Maybe even Solaris ;^). But what about the other operating systems?

Brand-X
Windows
Ubuntu
Fedora

4. What?

Operating systems alone do not a benchmark make. What software applications could be used for the tests? Tom’s uses software that’s only available on Windows so those won’t work. I give here a short candidate list. This isn’t a definitive list but a starting point for discussion. All packages are available for compilation and execution on all platforms.

4.1 inkscape: complex vector graphics generation
4.2 gimp: complex bit image manipulation
4.3 blender: complex 3D image manipulation, rendering. (Could we get the raw sources to render the entirety of Sintel? See the timing sketch after this list.)
4.4 aqsis renderer: another rendering engine
4.5 brl-cad: yet another rendering engine
4.6 handbrake: DVD ripping
4.7 ffmpeg: audio transcoding
4.8 R, rgl: converting data into visualization (e.g., httpd logfiles)
4.9 activemq: build with unit tests (requires java, maven)
4.10 tesseract-ocr: OCR, convert scanned pages to text files
4.11 nanoc: convert text file + gutenberg to static HTML site
4.12 nginx: file serving of static assets (see 4.11)
4.13 quake (II, 4, spasm): compile, performance benchmark
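As a rough illustration of how the Blender case might be timed, assuming a scene file named scene.blend (any of the published Sintel production files would do), render a single frame in background mode and note the wall-clock time:

$ time blender -b scene.blend -o //frame_ -F PNG -f 1

Repeat the same command on each OS installed on the same hardware and the render-time-to-real-time ratio falls out directly.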

Conclusion: Will this go anywhere? I hope so. But I believe for it to come alive requires a large community to care about it. For them to care about it requires a benchmark that defines a tangible result. The difference between 63 and 68 frames per second in some game that only runs on one platform is largely irrelevant; it only says that if you spend a few hundred dollars more, you get a few more frames per second.

What would make sense? Well, for example, how many frames per second could one render the raw source files for the short film Sintel at HD 1080p? My guess is that right now it is under one frame per second. That sets a milestone. I really don’t know. Maybe it is greater than 1 FPS on a desktop machine running a quad-core CPU and some sort of GPU acceleration. Then again, maybe not. But there will be a price point for that. So maybe a better benchmark is $US/FPS or €/FPS (that is, cost of the system over frames per second).

Another benchmark that puts pressure on the OS could be quake (pick your poison/version). If the only objective were FPS, then I’m confident that all OS versions could (with effort) come out about the same when running on the same platform. However, a benchmark which relied upon the OS (e.g., OpenGL, DirectX) for rendering would provide a more apples-to-apples comparison. The astute reader will immediately recognize that DirectX has an advantage in games. That’s the point. If the non-Windows camp hopes to compete, it has to compete with the experience of games on DirectX.

This extends to the other benchmarks as well. Apple’s (remember, I’m a fan) deployment of Grand Central Dispatch, OpenCL and LLVM in Snow Leopard was exciting but I want to see it make its way into real software. I also use CentOS on my servers so I have a vested interest in the evolution of Fedora. Since I’m interested in Fedora, I’m interested in the other major Linux OS, Ubuntu. And since OS X, Fedora and Ubuntu are minority players and these benchmarks set a bar, I’m interested in Windows. In the end, a battle for tangible system level benchmark performance benefits all.

I don’t believe the desktop is dead. After all, software written for consoles, smartphones or tablets (or even other desktops) is written on desktops. The industry is asleep. I want to whack the hive.

“Breakin’ the law!” –Beavis

filenames with spaces

I downloaded Unity last week. My first bit of installation geekery was to push the documentation to marmaduke, my CentOS server. I dropped the documentation directory into a virtual host and let Apache serve the index (since I solved my Apache Directory Indexing problem a while back).

However, I found that the pages had missing images. Looking at the HTML source, I found that some of the image URLs used ‘images’ (lower case i) while others used ‘Images’ (upper case I). To make matters worse, CSS and javascript files were also stored in the images directory.

I wanted to simply change all occurrences of Images to images. The Unity docs have a lot of files. How many? Just under four thousand.

$ find -name '*html' | wc --lines
3952

Shouldn’t be too hard of a script. However, piping the output of find didn’t work since many filenames contained spaces. Spaces in filenames cause problems.

The bash for loop treated the spaces as token markers. To solve this, I simply dumped the filenames to a tempfile and read them back in. I’m sure there’s a one-liner somewhere (one is shown after the script) but this solved my problem.

Solution: bash, find and perl search and replace over multiple files and the filenames contain spaces.

#!/bin/bash

# Collect the target filenames. Many contain spaces, so read them back
# one full line at a time instead of letting the shell word-split.
find -name '*html' > filenames

while IFS= read -r line; do
  perl -pi -e 's/Images/images/g' "${line}"
done < filenames

rm filenames
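For the record, the one-liner does exist. Null-delimited output sidesteps the whitespace problem entirely (standard GNU find and xargs, though I haven’t re-run it against this particular tree):

find . -name '*html' -print0 | xargs -0 perl -pi -e 's/Images/images/g'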