EC2 and S3 Success Story

I’ve been building systems lately on Amazon’s Elastic Compute Cloud (EC2). At first, I was only interested in Amazon’s Simple Storage Service (S3) after seeing the SmugMug slide show.

I hadn’t really considered using EC2 since we had more servers in colocation than I really needed. But I had a file storage problem. When you have a thousand files, you stick them in a directory. When you have a million files, you cannot simply stick them in a single directory. You distribute them across multiple directories. What a PITA.
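The usual trick for distributing millions of files across directories is to hash each file name and use the first few hex characters of the digest as nested bucket directories. This is a minimal sketch of the idea (not necessarily how MogileFS does it internally); the two-level scheme and MD5 are assumptions:

```java
import java.security.MessageDigest;

public class HashedPath {
    // Map a file name to a two-level directory path using the first
    // four hex characters of its MD5 digest,
    // e.g. "report.xml" -> "ab/cd/report.xml".
    static String pathFor(String fileName) throws Exception {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        byte[] digest = md5.digest(fileName.getBytes("UTF-8"));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02x", b));
        // Two levels of 256 buckets each keeps any one directory small,
        // even with millions of files.
        return hex.substring(0, 2) + "/" + hex.substring(2, 4) + "/" + fileName;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(pathFor("report.xml"));
    }
}
```

With a reasonably uniform hash, a million files land at roughly fifteen per directory — exactly the bookkeeping S3 let me stop doing.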

My first thought was to use MogileFS. It handles the directory hashing for you and distributes redundant copies of files across multiple servers. I had extra servers. Sweet. But before I rushed off and started building my shiny new filesystem, I wanted to check out the competitors. That led me to SmugMug. And that led me to S3.

I work at a tiny startup. I had a problem and very few developers to ask for help. Every hour I needed from them was a significant impact on another project. And dammit, all the open projects were on fire. I needed to solve my file system problem, and fast.

So up on S3 the files went. XML files. Beaucoup XML files.

It was painless. It was simple. It was cheap. The monthly S3 cost is a fraction of a server’s cost in colocation. Sweet!

Wait! If that’s so yummy, why not move XML processing up to EC2? Our XML processing load was increasing…increasingly increasing. I rewrote our XML processing app, built a custom Amazon Machine Image (CentOS + Apache + Tomcat) and fired it up. Nice!

Building the machine image was a pain but worth the effort. I learned a lot about CentOS that I didn’t previously know or really understand. However, I wish I had a real system administrator on staff. It would have hurt less.

One of the goals for the EC2-based XML processing was to shift from offline XML processing to a RESTful web service. That is, rather than queue the XML processing in a single process, I needed to finish the XML processing during the HTTP request. On demand processing. Done in seconds (not tens of minutes). And handle multiple concurrent processing requests.

Here is the EC2 <--> S3 connection. For each file received for processing, I write dozens to hundreds of files to S3 plus open scads of HTTP connections to other web servers. Running these in a single thread burned precious time. Even though we “write” to S3, the underlying mechanism is another HTTP request.

Simple. Build a thread pool for the HTTP requests and run multiple threads concurrently. That worked swimmingly, but for one issue: it didn’t take long until I started seeing “Too many open files” exceptions in the logs.
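The thread-pool approach is the standard `java.util.concurrent` pattern: submit each upload as a task to a fixed-size pool instead of running them one after another in the request thread. This sketch stubs out the actual HTTP work with a `Callable`; the pool size and task shape are assumptions, not the app’s real code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class UploadPool {
    // Run independent upload tasks on a fixed-size pool and collect results.
    // Each in-flight HTTP request holds an open socket, and every socket
    // counts against the process's open-file limit.
    static List<String> runAll(List<Callable<String>> tasks, int poolSize)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<String> results = new ArrayList<String>();
            // invokeAll blocks until every task has completed.
            for (Future<String> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<String>> tasks = new ArrayList<Callable<String>>();
        for (int i = 0; i < 8; i++) {
            final int n = i;
            tasks.add(new Callable<String>() {
                public String call() {
                    // Stand-in for an S3 PUT or outbound HTTP request.
                    return "PUT file-" + n + ".xml: done";
                }
            });
        }
        System.out.println(runAll(tasks, 4));
    }
}
```

With a pool of four threads, eight uploads take roughly two round-trip times instead of eight — which is exactly where the open-file pressure comes from.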

Normally, the limit on open files is quite adequate. But bolt Apache’s HttpClient to the backend of your webapp, supercharge it with a healthy thread pool, and you will overwhelm the default settings. CentOS will not “garbage collect” the spent file descriptors from completed HTTP requests fast enough — and every socket counts as an open file.

The solution: up the limits on open files. The default is 1024. Simply edit /etc/security/limits.conf and change the soft and hard values for nofile. I’m sure there is a maximum, but these values have been working for me. The right sizes depend on your workload, so pick values appropriate for your own system.

#*               soft    core            0
#*               hard    rss             10000
#@student        hard    nproc           20
#@faculty        soft    nproc           20
#@faculty        hard    nproc           50
#ftp             hard    nproc           0
#@student        -       maxlogins       4
*                soft    nofile          8192
*                hard    nofile          65536
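To confirm the new limits are in effect, check them from a fresh shell. Note that limits.conf is applied at login via pam_limits, so the commands below only show the new values after you log back in (or restart the service):

```shell
# Soft limit on open files for the current process.
ulimit -Sn

# Hard limit on open files (the ceiling the soft limit can be raised to).
ulimit -Hn
```

If Tomcat is started from an init script rather than a login shell, the limits may need to be raised in that script as well.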

What was the net result of moving XML processing and storage up to the Amazon cloud? Retired 60% of the servers in colocation. Built a scalable infrastructure. Reduced overall monthly hosting costs. Fewer moving parts.

Now, if only I had a system administrator…
