<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	>

<channel>
	<title>Ruby, Rails, and Technology</title>
	<atom:link href="http://tech.notaproblem.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://tech.notaproblem.com</link>
	<description>Exploring performance and deployment related issues</description>
	<pubDate>Tue, 22 Sep 2009 07:16:46 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.7.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Apache + Passenger vs. Nginx + Passenger: which is best?</title>
		<link>http://tech.notaproblem.com/2009/09/22/hello-world/</link>
		<comments>http://tech.notaproblem.com/2009/09/22/hello-world/#comments</comments>
		<pubDate>Tue, 22 Sep 2009 07:16:46 +0000</pubDate>
		<dc:creator>DrMark</dc:creator>
		
		<category><![CDATA[Rails]]></category>

		<category><![CDATA[Apache]]></category>

		<category><![CDATA[deployment]]></category>

		<category><![CDATA[Nginx]]></category>

		<category><![CDATA[Passenger]]></category>

		<category><![CDATA[Performance]]></category>

		<category><![CDATA[testing]]></category>

		<guid isPermaLink="false">http://tech.notaproblem.com/?p=1</guid>
		<description><![CDATA[Apache + Passenger vs. Nginx + Passenger: which is best? Read on for the clear winner&#8230;
Ok, I admit it. I am a Mongrel cluster user. Yes, I am familiar with Passenger and Thin and most of the other deployment options. Yes, I am aware they are supposedly better or faster or stronger or whatever when [...]]]></description>
			<content:encoded><![CDATA[<p>Apache + Passenger vs. Nginx + Passenger: which is best? Read on for the clear winner&#8230;</p>
<p>Ok, I admit it. I am a Mongrel cluster user. Yes, I am familiar with Passenger and Thin and most of the other deployment options. Yes, I am aware they are supposedly better or faster or stronger or whatever when compared to a Mongrel cluster. However, up until now we have taken the safe and proven deployment strategy of using Mongrel clusters for our production projects (with all the challenges that entails&#8230;).</p>
<p>I have been interested in Apache + Passenger since Hongli Lai announced it because of the supposedly easier configuration and its good performance. I was impressed with the performance increase of Passenger + REE relative to Mongrel just like everyone else. When Passenger was announced we started examining it. We even went as far as deploying it on our staging servers and internal servers with good success. However, for various reasons, we never deployed our main production applications using Passenger.</p>
<p>Well, the recent introduction of Nginx + Passenger has caused us to examine our deployment options again. Nginx has long been regarded as using less memory and having better performance with cached content than Apache. There have been countless articles purporting that one option is better than another.</p>
<p>I don&#8217;t like speculation when it comes to deployment options. I prefer proof. In addition, keep in mind that this article is for my own benefit. Whatever option I find to be the best (for me) will be the option we use in the future. So, I have an interest in doing a thorough evaluation.</p>
<p>In this article, I make a very careful analysis comparing the performance of Nginx + Passenger to Apache + Passenger.</p>
]]></content:encoded>
			<wfw:commentRss>http://tech.notaproblem.com/2009/09/22/hello-world/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Engine Yard Solo vs. EC2onRails: Pay Engine Yard extra or do it yourself?</title>
		<link>http://tech.notaproblem.com/2009/05/24/engine-yard-solo-vs-ec2onrails-pay-engine-yard-extra-or-do-it-yourself/</link>
		<comments>http://tech.notaproblem.com/2009/05/24/engine-yard-solo-vs-ec2onrails-pay-engine-yard-extra-or-do-it-yourself/#comments</comments>
		<pubDate>Mon, 25 May 2009 04:46:23 +0000</pubDate>
		<dc:creator>DrMark</dc:creator>
		
		<category><![CDATA[Performance Testing]]></category>

		<category><![CDATA[Rails]]></category>

		<category><![CDATA[Amazon EC2]]></category>

		<category><![CDATA[EC2]]></category>

		<category><![CDATA[EC2onRails]]></category>

		<category><![CDATA[Engine Yard]]></category>

		<category><![CDATA[Nginx]]></category>

		<category><![CDATA[Passenger]]></category>

		<category><![CDATA[Ruby on Rails]]></category>

		<guid isPermaLink="false">http://tech.notaproblem.com/?p=81</guid>
		<description><![CDATA[Engine Yard is well known in the Rails community for their expertise in deploying and scaling Rails applications. When Engine Yard recently introduced Engine Yard Solo, I was very interested since I have been using the open source project EC2onRails. In particular, I wondered if you should pay Engine Yard for the Solo product when you [...]]]></description>
			<content:encoded><![CDATA[<p>Engine Yard is well known in the Rails community for their expertise in deploying and scaling Rails applications. When Engine Yard recently introduced <a title="Engine Yard Solo" href="http://www.engineyard.com/solo" target="_blank">Engine Yard Solo</a>, I was very interested since I have been using the open source project <a title="EC2onRails" href="http://ec2onrails.rubyforge.org/" target="_blank">EC2onRails</a>. In particular, I wondered if you should pay Engine Yard for the Solo product when you can get EC2 running on your own for free (excluding Amazon EC2 charges).</p>
<p>After hours of testing and analysis, I have decided that the answer depends upon your skill level. Read on for the details.</p>
<h3><span id="more-81"></span>The contestants: Engine Yard Solo vs. EC2onRails</h3>
<h4>Engine Yard Solo</h4>
<p>First up, we have Engine Yard&#8217;s new <a title="Engine Yard Solo" href="http://www.engineyard.com/solo" target="_blank">Solo</a> product. According to Engine Yard:</p>
<p style="padding-left: 30px;"><strong>&#8220;Solo</strong> is an inexpensive, web-based platform for on-demand management of your Ruby on Rails web application on Amazon’s cloud computing infrastructure. Manage your web application from development to deployment. You get the <a href="http://www.engineyard.com/solo/features">Engine Yard Ruby on Rails hosting and deployment expertise</a> wrapped in an easy-to-use interface.&#8221;</p>
<p>The idea behind Solo appears to be that many people would like the flexibility of using <a title="Amazon's EC2 environment" href="http://aws.amazon.com/ec2/" target="_blank">Amazon&#8217;s EC2 environment</a> but lack the technical expertise to configure the server themselves. Solo enables someone with little or no technical knowledge to set up and manage an EC2 server. To show how easy the system is to use, Engine Yard has a number of <a title="View Engine Yard Solo screencasts" href="http://www.engineyard.com/solo" target="_blank">screencasts online</a>.</p>
<p><strong>The Good:</strong> I created a Solo account and used it for several hours over the course of a few days in order to write this article. Happily, I can report that Solo performs as advertised. You will be able to very quickly get a Rails project up and running with Solo (assuming you have a fair understanding of deploying Rails applications. See below for a discussion). The Engine Yard team has successfully removed some of the complexity of running your own Rails server. They also provide a well optimized and secure server.</p>
<p><strong>The not so Good:</strong> Given that Engine Yard appears to be targeting Solo at less knowledgeable users, I have a few issues with Engine Yard&#8217;s Solo.</p>
<ol>
<li><em>Cost</em>: In order to use Solo, you pay an additional $0.06 per hour for a small instance<sup class='footnote'><a href='#fn-81-1' id='fnref-81-1'>1</a></sup> <span style="text-decoration: line-through;">(plus a premium over Amazon&#8217;s prices for the storage and bandwidth)</span> <sup class='footnote'><a href='#fn-81-2' id='fnref-81-2'>2</a></sup>. This works out to be a minimum of $43.20 per small instance per month <span style="text-decoration: line-through;">(plus storage and bandwidth charges)</span>.<sup class='footnote'><a href='#fn-81-3' id='fnref-81-3'>3</a></sup></li>
<li><em>Lack of instructions</em>: The instructions for new users are not sufficient.<sup class='footnote'><a href='#fn-81-4' id='fnref-81-4'>4</a></sup> After you log in to Solo, you are greeted by a very clean layout, with no instructions or tips on what to do next. Should you &#8220;Create New Environment&#8221; or &#8220;Create New Instance&#8221; or one of the other choices? An inexperienced user may very well be lost at this stage. One of the things that 37Signals does very well is provide tips in order to get a new user started.</li>
<li><em>Deployment process</em>: The steps that you must take in order to deploy a new application (like adding gems and Unix packages) are spread across numerous places in the system. These items should be placed into a more logical sequence or in some sort of wizard.</li>
<li><em>Log files</em>: The process of viewing deploy log files is less than optimal. My application crashed several times during deployment until I had all of the configuration options set properly.<sup class='footnote'><a href='#fn-81-5' id='fnref-81-5'>5</a></sup> The deploy log files can only be viewed in the browser. Unfortunately, they were very large, which caused trouble for the browser even on my machine, which has 8 cores and 16GB of RAM. The log files should be downloadable.</li>
<li><em>Login process</em>: This is a minor issue but there is no obvious login link. If you go to the Solo page, you will see a sign up link, but no link for returning users.<sup class='footnote'><a href='#fn-81-6' id='fnref-81-6'>6</a></sup></li>
</ol>
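<p>The monthly cost figure in the list above can be reproduced with a few lines of Ruby. This is only a sketch: the method name and constants are hypothetical, and it assumes the same always-on, 30-day month described in footnote 3. Working in whole cents avoids floating point rounding surprises.</p>

```ruby
# Hypothetical helper: monthly Solo premium for one always-on small instance.
SOLO_PREMIUM_CENTS_PER_HOUR = 6   # $0.06 per hour, from the list above
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30               # same 30-day assumption as footnote 3

def monthly_premium_cents(cents_per_hour, hours_per_day = HOURS_PER_DAY, days = DAYS_PER_MONTH)
  cents_per_hour * hours_per_day * days
end

cents = monthly_premium_cents(SOLO_PREMIUM_CENTS_PER_HOUR)
puts format("$%.2f per month", cents / 100.0)   # prints "$43.20 per month"
```

<p>Other instance sizes carry different hourly premiums (see footnote 1), so the first argument would change accordingly.</p>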
<p>Overall, these are minor issues that can easily be addressed in future versions of Solo.</p>
<p><strong>The take so far:</strong> The current version of Solo is an excellent start. A few wizards and the liberal use of tips and suggestions could make this into a killer app.</p>
<h4>EC2 on Rails</h4>
<p>Next, we have <a title="EC2onRails" href="http://ec2onrails.rubyforge.org/" target="_blank">EC2onRails</a>. EC2onRails is an open source project led by Paul Dowman. According to the EC2 on Rails website:</p>
<p style="padding-left: 30px;">&#8220;EC2 on Rails is an Ubuntu Linux server image for <a href="http://www.amazon.com/b/ref=sc_fe_l_2/102-6342260-7987311?ie=UTF8&amp;node=201590011&amp;no=3435361">Amazon’s EC2 hosting service</a> that’s ready to run a standard Ruby on Rails application with little or no customization. It’s a Ruby on Rails <a href="http://en.wikipedia.org/wiki/Virtual_appliance">virtual appliance</a>.&#8221;</p>
<p>Paul Dowman and the other EC2onRails committers (like Adam Greene (aka skippy)) deserve recognition for the massive amount of work they have done.<sup class='footnote'><a href='#fn-81-7' id='fnref-81-7'>7</a></sup></p>
<p><strong>The Good:</strong> The best part of the EC2onRails system is that it is open source. There is no charge for using the code and you have the ability to tailor it to your specific application. The EC2onRails code works well for creating an Apache + Mongrel cluster server. The EC2onRails team has done a good job configuring the various parts of the server. You can easily create an all-in-one server (with a web server, application server, and database) or split your application server (Rails) and the database onto separate boxes. Once you have a running server, it is monitored and maintained with tasks like database backups happening automatically.</p>
<p><strong>The not so Good:</strong> Keep in mind that we are evaluating the experimental branch of EC2onRails running Nginx+Passenger, not the production branch of 0.9.9.1. So, some of my comments here will be addressed by the time this code reaches production quality. Given that this is an open source project, its weaknesses are like those of many other open source projects.</p>
<ol>
<li><em>Support</em>: The EC2onRails project is open source and lacks official support. It is intended for more advanced users. If you have an issue, your best bet is to reach out to the very helpful <a title="Visit the EC2onRails Google group" href="http://groups.google.com/group/ec2-on-rails-discuss" target="_blank">Google group</a>.</li>
<li><em>Documentation</em>: The documentation is good for deploying an Apache+Mongrel cluster configuration but is under revision for the Nginx+Passenger configuration. There are also items that are not covered in the documentation. For example, there are many configuration options and steps in the process that may require you to search the Google group or read the source code.</li>
<li><em>Ease of use</em>: The system is not as easy to use as Solo. There are a number of steps that must be done manually in order to get a running server (e.g. deploying SSH keys). To be fair, most of these steps only need to be performed one time.</li>
<li><em>Nginx+Passenger is alpha</em>: Support for Nginx+Passenger is currently experimental. Because of this, there is no AMI image available for you to use. This means that you must first create the AMI (a time consuming step) then configure your EC2onRails server.</li>
<li><em>The future?</em> Now that Ubuntu is <a title="Read more about the Ubuntu EC2 images" href="http://www.ubuntu.com/ec2" target="_blank">offering EC2 instances</a>, the future of an important piece of the EC2onRails infrastructure is unclear. Eric Hammond&#8217;s <a title="http://ec2ubuntu-build-ami.notlong.com" href="http://ec2ubuntu-build-ami.notlong.com" target="_blank">ec2ubuntu-build-ami script</a> is used when building a new AMI <span style="text-decoration: line-through;">and it may not be supported by him in the future</span>.<sup class='footnote'><a href='#fn-81-8' id='fnref-81-8'>8</a></sup> This should not affect EC2onRails end users, but may require changes to the core EC2onRails code.</li>
</ol>
<p>Overall, the EC2onRails project is a good example of a successful open source project. It has its rough spots but there is a good group of people using it and it is free :). Most of the issues listed above will be addressed by the time the code becomes official.</p>
<p><strong>The take so far:</strong> The current version of EC2onRails offers an inexpensive and relatively easy way to get a Rails application running on EC2 using Apache+Mongrel cluster. If you want to deploy an Nginx+Passenger server, you will have a fair bit of work because the code is still experimental.</p>
<p>Here is my analysis of EC2onRails versus Engine Yard Solo so far:</p>
<table style="text-align: left;" border="0" align="center">
<tbody>
<tr>
<th width="50%"><strong>EC2onRails</strong></th>
<th width="50%"><strong>Engine Yard Solo</strong></th>
</tr>
<tr>
<td valign="top"><p><em>Advantages</em>:</p>
<ul>
<li>Free</li>
<li>Very customizable</li>
<li>Use as many instances as you like</li>
</ul>
<p><em>Disadvantages</em>:</p>
<ul>
<li>Limited support options (Google groups)</li>
<li>Can be difficult to set up if you want Nginx</li>
<li>Nginx support is experimental</li>
<li>Out of date instructions</li>
</ul>
</td>
<td valign="top"><p><em>Advantages</em>:</p>
<ul>
<li>Easy to use</li>
<li>Nice web-based UI</li>
<li>Some Engine Yard support (limited to web-based support forums)</li>
<li>Reliable and proven</li>
</ul>
<p><em>Disadvantages</em>:</p>
<ul>
<li>Cost</li>
<li>Insufficient instructions / tips for new users</li>
<li>Solo provides one instance per environment up to a maximum of three environments</li>
<li>Limited customization options</li>
</ul>
</td>
</tr>
</tbody>
</table>
<p>So which one you use depends so far on your expertise. If you have good technical skills, use EC2onRails and save the money. If you are less technically inclined, use Solo. However, we have not yet examined <strong>PERFORMANCE</strong>! In short, is the Engine Yard Solo server faster than an EC2onRails server?</p>
<h3>Hypothesis:</h3>
<p>The performance of an Engine Yard Solo server will be the same as an EC2onRails server when configured using Nginx and Passenger.</p>
<h3>Setup:</h3>
<p>For these tests, I used 4 EC2 instances configured as shown here:</p>
<div id="attachment_130" class="wp-caption aligncenter" style="width: 478px"><img class="size-full wp-image-130" title="performance-test-setup-eysolo_vs_ec2onrails" src="http://tech.notaproblem.com/wp-content/uploads/2009/05/performance-test-setup-eysolo_vs_ec2onrails.png" alt="Test server configuration" width="468" height="361" /><p class="wp-caption-text">Test server configuration</p></div>
<p>Great care is required when doing performance tests to ensure your results are legitimate. There are many factors that affect server performance and not accounting for any of these factors can make your results invalid. For example, in many web server performance tests that I read, the authors simply run Apache Bench (ab) against a couple of configurations then proclaim a winner. A couple of potential problems with this approach are 1) the results are not statistically testable and 2) something may have changed on the server between test runs.</p>
<p>In order to get valid results, we need to take multiple samples of the performance. With those samples we need to get both the mean and the standard deviation. When we have this information, we can perform a two-sample t-test to statistically prove that one configuration is better. Fortunately, the httperf tool provides the information that we need.</p>
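<p>As a rough sketch of the test described above, the following Ruby computes a two-sample t statistic from summary statistics alone. It assumes Welch's unequal-variance form of the test, which may differ from the variant used for this article, and the method name is hypothetical. The inputs are the rounded means, standard deviations, and sample counts reported in the &#8220;No ActiveRecord&#8221; table later in this post.</p>

```ruby
# Welch's two-sample t statistic from summary statistics
# (mean, standard deviation, sample count) for configurations a and b.
def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b)
  var_a = sd_a**2 / n_a.to_f   # squared standard error of mean a
  var_b = sd_b**2 / n_b.to_f   # squared standard error of mean b
  t = (mean_b - mean_a) / Math.sqrt(var_a + var_b)
  # Welch-Satterthwaite approximation for the degrees of freedom
  df = (var_a + var_b)**2 / (var_a**2 / (n_a - 1) + var_b**2 / (n_b - 1))
  [t, df]
end

# EC2onRails: 13.9 +/- 0.6 over 53 samples; Solo: 16.1 +/- 1.2 over 52 samples
t, df = welch_t(13.9, 0.6, 53, 16.1, 1.2, 52)
puts format("t = %.2f, df = %.1f", t, df)
```

<p>With those rounded inputs the statistic comes out near 11.8 with roughly 75 degrees of freedom. The two-sided critical value at the 99% level for that many degrees of freedom is about 2.64, so a difference of this size would be judged significant.</p>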
<p>For these tests, each test server instance was configured exactly the same. Each instance was created using the EC2onRails code from <a title="View my Nginx+Passenger branch on Github" href="http://github.com/DrMark/ec2onrails/tree/nginx-passenger" target="_blank">my fork on Github</a>. They were EC2 Small Instances and had minimal services running. The EC2onRails server was also created using my EC2onRails branch. The Engine Yard Solo server was created using the Solo web-based GUI. Both servers used the exact same Rails code<sup class='footnote'><a href='#fn-81-9' id='fnref-81-9'>9</a></sup> and had the exact same data in the database. All tests were performed at the same time on both configurations. Each test was then repeated on the other configuration to ensure the testing server was not biasing the results.</p>
<p>Details for the EC2onRails server:</p>
<ul>
<li>Ubuntu 8.04</li>
<li>Rails 2.3.2</li>
<li>Nginx 0.6.36</li>
<li>Passenger 2.2.2</li>
<li>MySQL</li>
<li>Memcached</li>
</ul>
<p>Testing process: (all of these steps are performed on each testing server at the same time)</p>
<ul>
<li>Warm up each server using ab:<br />
ab -n 100 -c 1 ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com/</li>
<li>Run this a second time to ensure the server is ready.</li>
<li>Check that all requests were served successfully.</li>
<li>Using an automated script:
<ul>
<li>Send 5 requests using httperf to estimate the time required to get 45 samples.</li>
<li>Get at least 45 samples using httperf.</li>
</ul>
</li>
<li>Run the test again using the other testing server.</li>
<li>Verify that the two samples from different testing servers are not statistically different from one another.</li>
</ul>
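<p>The automated script itself is not shown here, so the following is only a sketch of how the pilot step could work. It assumes httperf records one reply-rate sample per 5-second window (which is consistent with the tables below, where a run of about 268 seconds yields 53 samples). The method names are hypothetical, and the host is the same placeholder used in the ab command above.</p>

```ruby
SAMPLE_INTERVAL_SECONDS = 5   # assumed: one httperf reply-rate sample per 5 seconds

# Estimate how many requests are needed to collect target_samples,
# based on how long a short pilot run took.
def requests_needed(pilot_requests, pilot_seconds, target_samples = 45)
  rate = pilot_requests / pilot_seconds.to_f            # requests per second in the pilot
  seconds = target_samples * SAMPLE_INTERVAL_SECONDS    # minimum run time
  (rate * seconds).ceil
end

# Build the httperf command line; host and uri are placeholders.
def httperf_command(host, uri, num_conns)
  "httperf --server #{host} --port 80 --uri #{uri} --num-conns #{num_conns}"
end

# Example: the 5 pilot requests took half a second in total.
n = requests_needed(5, 0.5)
puts httperf_command("ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com", "/", n)
```

<p>Because the goal is <em>at least</em> 45 samples, a real script would probably pad the estimate somewhat, since the server slows down or speeds up between the pilot and the full run.</p>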
<h2>Results:</h2>
<p>The test results for the EC2onRails server are not statistically different from the test results in <a title="EC2onRails - 2 Small Instances or 1 High-CPU Medium Instance?" href="http://tech.notaproblem.com/2009/05/17/ec2onrails-2-small-instances-or-1-high-cpu-medium-instance/" target="_blank">my last post</a>. This gives me confidence in the validity of these results.</p>
<h3>No ActiveRecord:</h3>
<p>The first set of tests was performed against a page that doesn&#8217;t access the database. This page is not cached and has numerous images, scripts, and partials. This should be a test of the instance&#8217;s ability to process Rails requests. If the Solo instance is better optimized it should show up in this test.</p>
<p>The results:</p>
<table border="1" width="80%" align="center">
<tbody>
<tr>
<th scope="col"></th>
<th scope="col">EC2onRails</th>
<th scope="col">Engine Yard Solo</th>
</tr>
<tr>
<td>Duration</td>
<td>268.18</td>
<td>264.87</td>
</tr>
<tr>
<td>Average (requests/sec.)</td>
<td>13.9</td>
<td>16.1</td>
</tr>
<tr>
<td>Std.Dev.</td>
<td>0.6</td>
<td>1.2</td>
</tr>
<tr>
<td>Samples</td>
<td>53</td>
<td>52</td>
</tr>
<tr>
<td>Max</td>
<td>15.2</td>
<td>18.2</td>
</tr>
<tr>
<td>Min</td>
<td>12.6</td>
<td>13.2</td>
</tr>
<tr>
<td>Avg. High</td>
<td>15.1</td>
<td>18.5</td>
</tr>
<tr>
<td>Avg. Low</td>
<td>12.7</td>
<td>13.7</td>
</tr>
<tr>
<td>2xx</td>
<td>3714</td>
<td>4259</td>
</tr>
</tbody>
</table>
<p><strong>Conclusion:</strong> The Solo instance is statistically proven to be faster at the 99% confidence level. The Solo instance is about 16% faster than the EC2onRails server at serving standard Rails requests.</p>
<h3>Moderate Database usage:</h3>
<p>The next set of tests was performed against a page that has moderate database usage. This page accesses eight different models. The page is not cached and has several images, scripts, and partials. This should be a test of the instance&#8217;s ability to process Rails requests as well as handle the database load.</p>
<p>The results:</p>
<table border="1" width="80%" align="center">
<tbody>
<tr>
<th scope="col"></th>
<th scope="col">EC2onRails</th>
<th scope="col">Engine Yard Solo</th>
</tr>
<tr>
<td>Duration</td>
<td>267.66</td>
<td>263.20</td>
</tr>
<tr>
<td>Average (requests/sec.)</td>
<td>6.9</td>
<td>8.3</td>
</tr>
<tr>
<td>Std.Dev.</td>
<td>0.3</td>
<td>0.4</td>
</tr>
<tr>
<td>Samples</td>
<td>53</td>
<td>52</td>
</tr>
<tr>
<td>Max</td>
<td>7.4</td>
<td>9.0</td>
</tr>
<tr>
<td>Min</td>
<td>6.3</td>
<td>7.0</td>
</tr>
<tr>
<td>Avg. High</td>
<td>7.4</td>
<td>9.0</td>
</tr>
<tr>
<td>Avg. Low</td>
<td>6.3</td>
<td>7.5</td>
</tr>
<tr>
<td>2xx</td>
<td>1860</td>
<td>2193</td>
</tr>
</tbody>
</table>
<p><strong>Conclusion:</strong> The Solo instance wins again by a fair margin. The Solo instance is statistically proven to be faster at the 99% confidence level. I was not expecting the Solo instance to perform 20% faster.</p>
<h3>Heavy Database usage:</h3>
<p>This set of tests was performed against a page that has heavy database usage. This page accesses numerous models. The page is not cached and has several images, scripts, and partials. This should be a test of the instance&#8217;s ability to process Rails requests as well as handle the database load. With the additional optimization, the Solo instance will probably win by a large margin.</p>
<p>The results:</p>
<table border="1" width="80%" align="center">
<tbody>
<tr>
<th scope="col"></th>
<th scope="col">EC2onRails</th>
<th scope="col">Engine Yard Solo</th>
</tr>
<tr>
<td>Duration</td>
<td>298.60</td>
<td>232.45</td>
</tr>
<tr>
<td>Average (requests/sec.)</td>
<td>2.0</td>
<td>2.2</td>
</tr>
<tr>
<td>Std.Dev.</td>
<td>0.1</td>
<td>0.1</td>
</tr>
<tr>
<td>Samples</td>
<td>59</td>
<td>46</td>
</tr>
<tr>
<td>Max</td>
<td>2.2</td>
<td>2.4</td>
</tr>
<tr>
<td>Min</td>
<td>1.8</td>
<td>2.0</td>
</tr>
<tr>
<td>Avg. High</td>
<td>2.2</td>
<td>2.4</td>
</tr>
<tr>
<td>Avg. Low</td>
<td>1.8</td>
<td>2.0</td>
</tr>
<tr>
<td>2xx</td>
<td>585</td>
<td>505</td>
</tr>
</tbody>
</table>
<p><strong>Conclusion:</strong> The Solo instance wins again, although by a smaller margin. The Solo instance is statistically proven to be faster at the 99% confidence level. I thought that the Solo instance would perform better in this test. I was not expecting the Solo instance to only perform 10% faster.</p>
<h3>Overall Conclusion:</h3>
<p>The test results were surprising. The Engine Yard Solo server is 10%-20% faster than an EC2onRails instance. Hopefully the EC2onRails team can investigate performance and close this performance gap.</p>
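<p>The percentages quoted in the conclusions follow directly from the average request rates in the three tables. A small sketch of that arithmetic (the constant and method names are hypothetical; the numbers are the rounded averages reported above):</p>

```ruby
# Average requests/sec. from the three result tables above.
RESULTS = {
  "No ActiveRecord"         => { ec2onrails: 13.9, solo: 16.1 },
  "Moderate database usage" => { ec2onrails: 6.9,  solo: 8.3 },
  "Heavy database usage"    => { ec2onrails: 2.0,  solo: 2.2 }
}

# Relative improvement of candidate over baseline, as a whole percentage.
def percent_faster(baseline, candidate)
  ((candidate - baseline) / baseline * 100).round
end

RESULTS.each do |page, rate|
  puts "#{page}: Solo is about #{percent_faster(rate[:ec2onrails], rate[:solo])}% faster"
end
```

<p>This yields 16%, 20%, and 10% for the three pages, matching the 10%-20% range stated here.</p>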
<h3>The Winner!</h3>
<p>Picking a winner in this case is difficult. My recommendation will depend upon your level of expertise. Engine Yard Solo does offer a fast and convenient way to create EC2 instances. Solo is also easier to use than EC2onRails and requires less technical knowledge to operate.</p>
<p>If you are not comfortable building your own server, I suggest that you seriously consider Engine Yard&#8217;s Solo. Solo is a reasonably priced way to get a professionally configured Rails server in the EC2 environment. You should also consider some of the other solutions like <a title="Visit the Rightscale website" href="http://www.rightscale.com/" target="_blank">RightScale</a> (which is much more expensive than Solo).</p>
<p>If you are comfortable setting up your own server, I recommend EC2onRails. It is a great project that is constantly evolving. Personally, I have difficulty justifying the cost of Solo. It seems to me that you are paying $43.20 per month for Engine Yard to configure a server for you. I would find a one-time setup charge much more appealing than paying over and over. The EC2onRails project allows you to set up EC2 instances for free that perform almost as well as the Solo instance.</p>
<p>In short, both choices are excellent. The best choice for you will be up to you <img src='http://tech.notaproblem.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<h4>What&#8217;s next?</h4>
<p>In future posts, I will explore the various deployment options. Next up is Apache+Passenger vs. Nginx+Passenger.</p>
<h5><span style="text-decoration: underline;">Footnotes</span>:</h5>
<div class='footnotes'>
<div class='footnotedivider'></div>
<ol>
<li id='fn-81-1'>The charges for other instance sizes vary. See the Solo pricing for details <span class='footnotereverse'><a href='#fnref-81-1'>&#8617;</a></span></li>
<li id='fn-81-2'>Michael Mullany from Engine Yard was kind enough to let me know they don&#8217;t mark up Amazon&#8217;s prices for storage and bandwidth. I will be more careful with my fact checking next time <img src='http://tech.notaproblem.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> <span class='footnotereverse'><a href='#fnref-81-2'>&#8617;</a></span></li>
<li id='fn-81-3'>Assuming continuous usage of 24 hours per day for 30 days. <span class='footnotereverse'><a href='#fnref-81-3'>&#8617;</a></span></li>
<li id='fn-81-4'>I am positive that a usability study would confirm this assertion. <span class='footnotereverse'><a href='#fnref-81-4'>&#8617;</a></span></li>
<li id='fn-81-5'>Several of the crashes during deployment turned out to be caused by the ThinkingSphinx (TS) plugin. TS preloads the models which bombs when the database is empty. I ultimately ended up writing a Capistrano task to disable the TS plugin prior to running the db:create tasks for my normal deployments. Unfortunately, I didn&#8217;t see a way to pass custom tasks to EY. I ultimately had to create a branch of my code with TS disabled, then enable after deployment. <span class='footnotereverse'><a href='#fnref-81-5'>&#8617;</a></span></li>
<li id='fn-81-6'>Yes, I realize the signup page is also the login page and I realize that I could just bookmark https://login.engineyard.com. However, there should be a link to the login form for returning users (who may visit infrequently). <span class='footnotereverse'><a href='#fnref-81-6'>&#8617;</a></span></li>
<li id='fn-81-7'>I also have a number of commits in the code. See my <a title="Go to my Github account" href="http://github.com/DrMark/ec2onrails/tree/nginx-passenger" target="_self">Nginx+Passenger github branch</a> for details. <span class='footnotereverse'><a href='#fnref-81-7'>&#8617;</a></span></li>
<li id='fn-81-8'>Great news! In the comments Eric Hammond says that some version of the build script will be maintained for the foreseeable future. Thanks Eric! <span class='footnotereverse'><a href='#fnref-81-8'>&#8617;</a></span></li>
<li id='fn-81-9'>The code used in these tests is from one of our live production sites. It is the same code used for all of my testing. <span class='footnotereverse'><a href='#fnref-81-9'>&#8617;</a></span></li>
</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://tech.notaproblem.com/2009/05/24/engine-yard-solo-vs-ec2onrails-pay-engine-yard-extra-or-do-it-yourself/feed/</wfw:commentRss>
		</item>
		<item>
		<title>EC2onRails - 2 Small Instances or 1 High-CPU Medium Instance?</title>
		<link>http://tech.notaproblem.com/2009/05/17/ec2onrails-2-small-instances-or-1-high-cpu-medium-instance/</link>
		<comments>http://tech.notaproblem.com/2009/05/17/ec2onrails-2-small-instances-or-1-high-cpu-medium-instance/#comments</comments>
		<pubDate>Mon, 18 May 2009 03:12:18 +0000</pubDate>
		<dc:creator>DrMark</dc:creator>
		
		<category><![CDATA[Performance Testing]]></category>

		<category><![CDATA[Rails]]></category>

		<category><![CDATA[Amazon EC2]]></category>

		<category><![CDATA[EC2]]></category>

		<category><![CDATA[EC2onRails]]></category>

		<category><![CDATA[Nginx]]></category>

		<category><![CDATA[Passenger]]></category>

		<category><![CDATA[Ruby on Rails]]></category>

		<guid isPermaLink="false">http://tech.notaproblem.com/?p=20</guid>
		<description><![CDATA[I recently watched the excellent RailsLab videos on tuning and optimizing Rails applications. One of the videos suggests that a great way to scale a Rails application is to separate the web and database components onto separate machines. We have been happily using Amazon's EC2 system for about a year. Amazon's recent introduction of the High-CPU Instance (High-CPU) made me curious how it would perform when compared against two Small Instances in a cluster (Cluster), since they would cost the same.]]></description>
			<content:encoded><![CDATA[<p>I recently watched the excellent <a title="RailsLab" href="http://railslab.newrelic.com/" target="_blank">RailsLab</a> videos on tuning and optimizing Rails applications. One of the videos suggests that a great way to scale a Rails application is to separate the web and database components onto separate machines. We have been happily using Amazon&#8217;s EC2 system for about a year. Amazon&#8217;s recent introduction of the High-CPU Instance (High-CPU) made me curious how it would perform when compared against two Small Instances in a cluster (Cluster), since they would cost the same.</p>
<p>You will be happy to know that after hours of rigorous testing, I have identified a clear winner. Read on for the details.</p>
<p><span id="more-20"></span>For those of you that are not familiar with Amazon&#8217;s specifications for the two instance types:</p>
<table style="text-align: left;" border="0" align="center">
<tbody>
<tr>
<th> <strong>Small Instance (default)</strong></th>
<th><strong>High-CPU Medium Instance</strong></th>
</tr>
<tr>
<td>
<ul>
<li> 1.7 GB memory</li>
<li> 1 <span class="caps">EC2</span> Compute Unit (1 virtual core with 1 <span class="caps">EC2</span> Compute Unit)</li>
<li> 160 GB instance storage (150 GB plus 10 GB root partition)</li>
<li> 32-bit platform</li>
<li> I/O Performance: Moderate</li>
<li> Price: $0.10 per instance hour</li>
</ul>
</td>
<td>
<ul>
<li> 1.7 GB of memory</li>
<li> 5 <span class="caps">EC2</span> Compute Units (2 virtual cores with 2.5 <span class="caps">EC2</span> Compute Units each)</li>
<li> 350 GB of instance storage</li>
<li> 32-bit platform</li>
<li> I/O Performance: Moderate</li>
<li> Price: $0.20 per instance hour</li>
</ul>
</td>
</tr>
</tbody>
</table>
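<p>To make the &#8220;they would cost the same&#8221; comparison concrete, here is a small sketch that totals the hourly price and EC2 Compute Units for each option, using only the figures from the table above. The constant and method names are hypothetical, and prices are kept in whole cents.</p>

```ruby
# Figures copied from the table above; prices in cents per instance hour.
SMALL    = { cents_per_hour: 10, compute_units: 1 }
HIGH_CPU = { cents_per_hour: 20, compute_units: 5 }

# Total hourly price and compute units for `count` instances of one type.
def totals(instance, count)
  { cents_per_hour: instance[:cents_per_hour] * count,
    compute_units: instance[:compute_units] * count }
end

cluster = totals(SMALL, 2)      # two Small instances
single  = totals(HIGH_CPU, 1)   # one High-CPU Medium instance
puts "Cluster:  #{cluster[:cents_per_hour]} cents/hour, #{cluster[:compute_units]} compute units"
puts "High-CPU: #{single[:cents_per_hour]} cents/hour, #{single[:compute_units]} compute units"
```

<p>Both options come to 20 cents per hour. On paper the High-CPU instance offers more compute units for that price, while the Cluster spreads the work across two machines with their own memory and I/O, which is why a measurement is needed rather than a guess.</p>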
<p>Following the RailsLab recommendation means that you place the web / application server and database on separate instances for best performance. I am sure that two small instances configured as below would outperform a single small instance. However, we wondered how the High-CPU instance would compare against two Small instances when configured like this:</p>
<div id="attachment_39" class="wp-caption aligncenter" style="width: 478px"><img class="size-full wp-image-39" title="blog-server-diagram1" src="http://tech.notaproblem.com/wp-content/uploads/2009/05/blog-server-diagram1.png" alt="RailsLab recommended configuration" width="468" height="335" /><p class="wp-caption-text">Two Small instances vs. One High-CPU instance</p></div>
<h3>Hypothesis:</h3>
<p>Two Small instances will outperform one High-CPU instance in a standard Rails application.</p>
<h3>Setup:</h3>
<p>For these tests, I used 5 EC2 instances configured as shown here:</p>
<div id="attachment_45" class="wp-caption aligncenter" style="width: 478px"><img class="size-full wp-image-45" title="blog-server-diagram-test_setup2" src="http://tech.notaproblem.com/wp-content/uploads/2009/05/blog-server-diagram-test_setup2.png" alt="Testing setup" width="468" height="535" /><p class="wp-caption-text">Testing setup</p></div>
<p>Great care is required when doing performance tests to ensure your results are legitimate. Many factors affect server performance, and failing to account for any one of them can invalidate your results. For example, in many web server performance tests that I read, the authors simply run Apache Bench (ab) against a couple of configurations and then proclaim a winner. Two potential problems with this approach are 1) the results are not statistically testable and 2) something may have changed on the server between test runs.</p>
<p>In order to get valid results, we need to take multiple samples of the performance and compute both the mean and the standard deviation of those samples. With this information, we can perform a two-sample t-test to check whether the difference between two configurations is statistically significant rather than noise. Fortunately, the httperf tool provides the information that we need.</p>
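<p>As a rough illustration of that comparison, here is a minimal sketch in Python. It assumes Welch&#8217;s variant of the two-sample t-test (which does not require equal variances); that choice and the function name <code>welch_t</code> are mine and are not necessarily what was used for this post. The inputs are the mean, standard deviation, and sample count reported in the No ActiveRecord results table further down.</p>

```python
import math

def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    se2_a = sd_a ** 2 / n_a  # squared standard error of sample A
    se2_b = sd_b ** 2 / n_b  # squared standard error of sample B
    t = (mean_b - mean_a) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Summary figures from the No ActiveRecord table (requests/sec.)
t, df = welch_t(14.0, 0.7, 46, 22.1, 1.8, 58)
print(f"t = {t:.1f}, df = {df:.0f}")
```

<p>The t statistic is then compared against the critical value from a t table for the chosen confidence level. For samples of this size, the two-sided 99% critical value is roughly 2.6, so a statistic well above that indicates a significant difference.</p>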
<p>For these tests, each instance was configured identically. Each instance was created using the EC2onRails code from my fork on Github. Both the Small Instance cluster (Cluster) and the High-CPU Instance (High-CPU) were running the same version of all code. They also had the exact same data in the database. The code used in these tests is from a live production site. All tests were performed at the same time on both configurations. Each test was then repeated with the testing servers swapped to ensure the testing server was not biasing the results.</p>
<p>Details:</p>
<ul>
<li>Ubuntu 8.04</li>
<li>Rails 2.3.2</li>
<li>Nginx 0.6.36</li>
<li>Passenger 2.2.2</li>
<li>MySQL</li>
<li>Memcached</li>
</ul>
<p>Process: (all of these steps are performed on each testing server at the same time)</p>
<ul>
<li>Warm up each server using ab:<br />
ab -n 100 -c 1 ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com/</li>
<li>Run this a second time to ensure the server is ready.</li>
<li>Check that all requests were served successfully.</li>
<li>Using an automated script:
<ul>
<li>Send 5 requests using httperf to estimate the time required to get 45 samples.</li>
<li>Get at least 45 samples using httperf.</li>
</ul>
</li>
<li>Run the test again using the other testing server.</li>
<li>Verify that the two samples from different testing servers are not statistically different from one another.</li>
</ul>
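<p>The sizing step of that automated script could look something like the sketch below. The script used for this post is not shown, so the function names, the one-sample safety margin, and the placeholder host are assumptions. The sketch relies on httperf taking one rate sample every 5 seconds, and it only builds the command line rather than running it.</p>

```python
import math

SAMPLE_INTERVAL_SEC = 5  # httperf records one rate sample every 5 seconds

def connections_needed(pilot_requests, pilot_seconds, target_samples=45, margin_samples=1):
    """Estimate how many requests keep the run going long enough for the target samples."""
    rate = pilot_requests / pilot_seconds  # requests/sec. seen in the pilot run
    run_seconds = (target_samples + margin_samples) * SAMPLE_INTERVAL_SEC
    return math.ceil(rate * run_seconds)

def httperf_command(server, uri, num_conns):
    """Build the httperf argument list (nothing is executed here)."""
    return ["httperf", "--server", server, "--uri", uri, "--num-conns", str(num_conns)]

# Hypothetical pilot result: 5 requests completed in 0.5 seconds
n = connections_needed(5, 0.5)
print(" ".join(httperf_command("ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com", "/", n)))
```

<p>A slow page yields a small request count and a fast page a large one, which keeps every run at roughly the same length regardless of the page under test.</p>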
<h2>Results:</h2>
<h3>No ActiveRecord:</h3>
<p>The first set of tests was performed against a page that doesn&#8217;t access the database. This page is not cached and has numerous images, scripts, and partials. This should be a test of the instance&#8217;s ability to process Rails requests. Given the additional &#8220;Compute Units&#8221; available to the High-CPU instance, I expect it to win easily.</p>
<p>The results:</p>
<table border="1" width="80%" align="center">
<tbody>
<tr>
<th scope="col"></th>
<th scope="col">Cluster (two Small Instances)</th>
<th scope="col">High-CPU Medium Instance</th>
</tr>
<tr>
<td>Duration (sec.)</td>
<td>233.15</td>
<td>291.26</td>
</tr>
<tr>
<td>Average (requests/sec.)</td>
<td>14.0</td>
<td>22.1</td>
</tr>
<tr>
<td>Std.Dev.</td>
<td>0.7</td>
<td>1.8</td>
</tr>
<tr>
<td>Samples</td>
<td>46</td>
<td>58</td>
</tr>
<tr>
<td>Max</td>
<td>15.4</td>
<td>24.5</td>
</tr>
<tr>
<td>Min</td>
<td>11.3</td>
<td>15.8</td>
</tr>
<tr>
<td>Avg. High</td>
<td>15.4</td>
<td>25.7</td>
</tr>
<tr>
<td>Avg. Low</td>
<td>12.6</td>
<td>18.5</td>
</tr>
<tr>
<td>2xx</td>
<td>3268</td>
<td>6428</td>
</tr>
</tbody>
</table>
<p><strong>Conclusion:</strong> The High-CPU instance is statistically proven to be faster at the 99% confidence level. The High-CPU instance is about 58% faster than the Cluster at serving standard Rails requests.</p>
<h3>Moderate Database usage:</h3>
<p>The next set of tests was performed against a page that has moderate database usage. This page accesses eight different models. The page is not cached and has several images, scripts, and partials. This should be a test of the instance&#8217;s ability to process Rails requests as well as handle the database load. With the additional memory and dedicated machine for the database, the Cluster might have a chance here.</p>
<p>The results:</p>
<table border="1" width="80%" align="center">
<tbody>
<tr>
<th scope="col"></th>
<th scope="col">Cluster (two Small Instances)</th>
<th scope="col">High-CPU Medium Instance</th>
</tr>
<tr>
<td>Duration (sec.)</td>
<td>233.15</td>
<td>247.61</td>
</tr>
<tr>
<td>Average (requests/sec.)</td>
<td>6.7</td>
<td>11.3</td>
</tr>
<tr>
<td>Std.Dev.</td>
<td>0.4</td>
<td>0.5</td>
</tr>
<tr>
<td>Samples</td>
<td>52</td>
<td>49</td>
</tr>
<tr>
<td>Max</td>
<td>7.4</td>
<td>12.1</td>
</tr>
<tr>
<td>Min</td>
<td>5.4</td>
<td>9.6</td>
</tr>
<tr>
<td>Avg. High</td>
<td>7.5</td>
<td>12.3</td>
</tr>
<tr>
<td>Avg. Low</td>
<td>5.9</td>
<td>10.3</td>
</tr>
<tr>
<td>2xx</td>
<td>1760</td>
<td>2812</td>
</tr>
</tbody>
</table>
<p><strong>Conclusion:</strong> The High-CPU instance wins again by a large margin. The High-CPU instance is statistically proven to be faster at the 99% confidence level. I thought that the Cluster would perform better in this test. I was not expecting the High-CPU instance to perform 69% faster.</p>
<h3>Heavy Database usage:</h3>
<p>This set of tests was performed against a page that has heavy database usage. This page accesses numerous models. The page is not cached and has several images, scripts, and partials. This should be a test of the instance&#8217;s ability to process Rails requests as well as handle the database load. With the additional memory and dedicated machine for the database, the Cluster might have a chance here.</p>
<p>The results:</p>
<table border="1" width="80%" align="center">
<tbody>
<tr>
<th scope="col"></th>
<th scope="col">Cluster (two Small Instances)</th>
<th scope="col">High-CPU Medium Instance</th>
</tr>
<tr>
<td>Duration (sec.)</td>
<td>252.18</td>
<td>268.19</td>
</tr>
<tr>
<td>Average (requests/sec.)</td>
<td>2.0</td>
<td>4.4</td>
</tr>
<tr>
<td>Std.Dev.</td>
<td>0.1</td>
<td>0.2</td>
</tr>
<tr>
<td>Samples</td>
<td>50</td>
<td>53</td>
</tr>
<tr>
<td>Max</td>
<td>2.2</td>
<td>4.4</td>
</tr>
<tr>
<td>Min</td>
<td>1.6</td>
<td>3.7</td>
</tr>
<tr>
<td>Avg. High</td>
<td>2.2</td>
<td>4.4</td>
</tr>
<tr>
<td>Avg. Low</td>
<td>1.8</td>
<td>3.6</td>
</tr>
<tr>
<td>2xx</td>
<td>494</td>
<td>1073</td>
</tr>
</tbody>
</table>
<p><strong>Conclusion:</strong> The High-CPU instance wins again by a large margin. The High-CPU instance is statistically proven to be faster at the 99% confidence level. I thought that the Cluster would perform better in this test. I was not expecting the High-CPU instance to perform 120% faster.</p>
<h3>Cached Page performance:</h3>
<p>This set of tests was performed against a page that was cached. It has several images, scripts, and partials. This should be a test of the instance&#8217;s ability to serve pure web server (nginx) requests. Since nginx uses such small amounts of memory and CPU, both configurations might be similar.</p>
<p>The results:</p>
<table border="1" width="80%" align="center">
<tbody>
<tr>
<th scope="col"></th>
<th scope="col">Cluster (two Small Instances)</th>
<th scope="col">High-CPU Medium Instance</th>
</tr>
<tr>
<td>Duration (sec.)</td>
<td>725.40</td>
<td>642.11</td>
</tr>
<tr>
<td>Average (requests/sec.)</td>
<td>93.0</td>
<td>350.4</td>
</tr>
<tr>
<td>Std.Dev.</td>
<td>18.5</td>
<td>46.4</td>
</tr>
<tr>
<td>Samples</td>
<td>144</td>
<td>128</td>
</tr>
<tr>
<td>Max</td>
<td>122.2</td>
<td>399.6</td>
</tr>
<tr>
<td>Min</td>
<td>39.9</td>
<td>180.4</td>
</tr>
<tr>
<td>Avg. High</td>
<td>130</td>
<td>443.2</td>
</tr>
<tr>
<td>Avg. Low</td>
<td>56.0</td>
<td>257.6</td>
</tr>
<tr>
<td>2xx</td>
<td>67500</td>
<td>224999</td>
</tr>
</tbody>
</table>
<p><strong>Conclusion:</strong> The High-CPU instance wins again by a very large margin. The High-CPU instance is statistically proven to be faster at the 99% confidence level. I thought that the two configurations would perform similarly in this test. I was not expecting the High-CPU instance to perform 277% faster.</p>
<h3>UPDATE: Testing performance under load<sup class='footnote'><a href='#fn-20-1' id='fnref-20-1'>1</a></sup></h3>
<h3>Load Testing:</h3>
<p>The advantage of having a separate database server should become more pronounced as the site becomes more heavily used. The tests so far measured responsiveness, not the ability to handle load. To test load, I used ab with 12 concurrent users like this: &#8220;ab -n 6186 -c 12 http://174.129.193.145/page/id&#8221;. Each test was tailored to last about three minutes.</p>
<p>The results for the moderate database usage page:</p>
<table border="1" width="80%" align="center">
<tbody>
<tr>
<th scope="col"></th>
<th scope="col">Cluster (two Small Instances)</th>
<th scope="col">High-CPU Medium Instance</th>
</tr>
<tr>
<td>Concurrency Level</td>
<td>12</td>
<td>12</td>
</tr>
<tr>
<td>Average (requests/sec.)</td>
<td>6.55</td>
<td>18.62</td>
</tr>
<tr>
<td>Time per Request</td>
<td>1831.977 [ms] (mean)</td>
<td>644.545  [ms] (mean)</td>
</tr>
<tr>
<td>Time per Request</td>
<td>152.665 [ms] (mean, across all concurrent)</td>
<td>53.712 [ms] (mean, across all concurrent)</td>
</tr>
<tr>
<td colspan="3"><strong>Percentage of requests served within a certain time (ms)</strong></td>
</tr>
<tr>
<td>50%</td>
<td>1374</td>
<td>621</td>
</tr>
<tr>
<td>66%</td>
<td>1575</td>
<td>682</td>
</tr>
<tr>
<td>75%</td>
<td>1709</td>
<td>728</td>
</tr>
<tr>
<td>80%</td>
<td>1812</td>
<td>759</td>
</tr>
<tr>
<td>90%</td>
<td>2198</td>
<td>837</td>
</tr>
<tr>
<td>95%</td>
<td>2974</td>
<td>896</td>
</tr>
<tr>
<td>98%</td>
<td>7700</td>
<td>963</td>
</tr>
<tr>
<td>99%</td>
<td>15966</td>
<td>1039</td>
</tr>
<tr>
<td>100%</td>
<td>47130 (longest)</td>
<td>1864</td>
</tr>
</tbody>
</table>
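<p>The two &#8220;Time per Request&#8221; rows are both derived from the throughput and the concurrency level. The sketch below reproduces them from the rounded requests/sec. figures in the table, so expect small differences from the values ab printed. The function name is my own.</p>

```python
def time_per_request_ms(requests_per_sec, concurrency):
    """Return (mean time per request, mean across all concurrent requests) in ms."""
    across_all = 1000.0 / requests_per_sec  # wall-clock time per completed request
    per_request = across_all * concurrency  # what each of the concurrent users waits
    return per_request, across_all

for label, rps in (("Cluster", 6.55), ("High-CPU", 18.62)):
    per_request, across_all = time_per_request_ms(rps, 12)
    print(f"{label}: {per_request:.0f} ms per request, {across_all:.1f} ms across all concurrent requests")
```

<p>In other words, with 12 users waiting in parallel, each user sees about 12 times the wall-clock time the server spends per completed request.</p>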
<p><strong>Conclusion:</strong> The High-CPU instance wins again by a very large margin. Notice in particular that the Cluster is not only slower overall, but some of its requests queue up and take a very long time to serve: the slowest 2% took more than 7 seconds. Serving only 80% of your requests in under 2 seconds is simply not acceptable in a production environment. On the other hand, the High-CPU instance is able to serve 100% of the requests in under 2 seconds.</p>
<p>I thought that the Cluster would perform better in this test. I was not expecting the High-CPU instance to perform 184% faster. Apparently the High-CPU machine is so much more powerful that even having the database on its own box isn&#8217;t enough for the small cluster to win.</p>
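<p>For reference, the &#8220;percent faster&#8221; figures quoted in the conclusions can be reproduced from the average requests/sec. values in each table. This is a small sketch of that arithmetic; the function name and the dictionary layout are my own.</p>

```python
def percent_faster(baseline_rps, candidate_rps):
    """How much faster the candidate is than the baseline, as a whole percentage."""
    return round((candidate_rps / baseline_rps - 1) * 100)

# (Cluster average, High-CPU average) in requests/sec., taken from the tables above
averages = {
    "No ActiveRecord": (14.0, 22.1),
    "Moderate database usage": (6.7, 11.3),
    "Heavy database usage": (2.0, 4.4),
    "Cached page": (93.0, 350.4),
    "Load test (12 concurrent users)": (6.55, 18.62),
}

for test_name, (cluster, high_cpu) in averages.items():
    print(f"{test_name}: High-CPU is {percent_faster(cluster, high_cpu)}% faster")
```

<p>Note that &#8220;120% faster&#8221; means 2.2 times the throughput, not 1.2 times.</p>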
<h3>Overall Conclusion:</h3>
<p>The test results were very surprising. I expected the performance between the two configurations to be similar, particularly in database intensive tests.</p>
<p>Given the dramatically better performance of the High-CPU instance in every test situation, I can&#8217;t recommend using a cluster of two Small instances for a Rails application. Your money will be much better spent using a single High-CPU instance.</p>
<p>In future posts, I will explore the various deployment options. Next up is Apache+Passenger vs. Nginx+Passenger.</p>
<div class='footnotes'>
<div class='footnotedivider'></div>
<ol>
<li id='fn-20-1'>These tests were performed at the same time as the ones above. I simply neglected to post them originally. <span class='footnotereverse'><a href='#fnref-20-1'>&#8617;</a></span></li>
</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://tech.notaproblem.com/2009/05/17/ec2onrails-2-small-instances-or-1-high-cpu-medium-instance/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Aloha!</title>
		<link>http://tech.notaproblem.com/2009/05/17/im-a-test-junkie/</link>
		<comments>http://tech.notaproblem.com/2009/05/17/im-a-test-junkie/#comments</comments>
		<pubDate>Sun, 17 May 2009 22:54:21 +0000</pubDate>
		<dc:creator>DrMark</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://tech.notaproblem.com/?p=16</guid>
		<description><![CDATA[Aloha everyone!
I have decided to set up a blog to talk about Ruby, Rails, and other tech related issues. I hope that you find something useful. If you have questions, comments, or suggestions, feel free to drop me a note.
]]></description>
			<content:encoded><![CDATA[<p>Aloha everyone!</p>
<p>I have decided to set up a blog to talk about Ruby, Rails, and other tech related issues. I hope that you find something useful. If you have questions, comments, or suggestions, feel free to drop me a note.</p>
]]></content:encoded>
			<wfw:commentRss>http://tech.notaproblem.com/2009/05/17/im-a-test-junkie/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
