Heroku is Best Modeled as a Bottle
Author: maratishe@gmail.com -- created 150708
This study takes the same basic approach that I presented recently under the name Cloud Probing -- you can search and find papers/slides on the topic. The idea is to optimize VM/container topologies dynamically on the Service Provider (SP) side rather than rely on whatever toolkit Amazon (among a few others) offers you at the time. Trust me, the current offerings are rather meager. Hence the idea.
From there, it is only a half-step to this study. This time the idea is to apply the same concept of probing to container populations -- Heroku and Docker being the two popular platforms where you get to play with those. Docker is DIY and nicely fits the original cloud probing idea. Heroku takes some work. This blog post shows the results of this work, leading up to the Bottle Model representing the Heroku space.
Experimental Setup
It is impossible (or very hard) to probe Heroku apps from inside Heroku. Besides, it might be better to do it from the outside towards the inside anyway. So, just like in the first Cloud Probing case, a population of VMs was scattered across the 9 Amazon regions and used as probes. Probing is very simple -- a VM sends a probe (a download) at a random time and with a random size, and captures the HTTP throughput, which is easily calculated from the size and the time it took the request to complete.
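To make that concrete, here is a minimal sketch of such a probe in Python. The target URL and the size query parameter are my assumptions -- the post only says that each probe downloads a random-size payload at a random time and computes throughput from the size and the elapsed time.

```python
# Hypothetical probe loop -- the app URL and ?size= parameter are assumptions,
# not part of the original setup.
import random
import time
import urllib.request

APP_URL = "https://example-bottle-app.herokuapp.com/payload"  # hypothetical test app

def probe_once(max_kb: int = 1024) -> float:
    """Download a random-size payload and return throughput in KB/s."""
    size_kb = random.randint(1, max_kb)
    start = time.time()
    with urllib.request.urlopen(f"{APP_URL}?size={size_kb}") as resp:
        data = resp.read()
    elapsed = time.time() - start
    return len(data) / 1024.0 / elapsed

while True:
    print(probe_once())
    time.sleep(random.uniform(1, 60))  # probe at random times
```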
Separately from Amazon EC2 and Heroku, there is also a local manager that uses Heroku and EC2 APIs in order to orchestrate the whole thing.
Let me omit the details about how the Heroku app is scaled -- it is a simple technique that regularly brings the app down to scale=1 and then grows it back up to scale=15. The hope is that newly created dynos get mapped to different VMs and different PMs -- and, ultimately, that there is a difference between VMs and PMs which the probing can grasp and turn into optimization. This basic approach worked well on Amazon, where there was a tangible difference both across and inside DCs (regions).
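For reference, a rough sketch of such a scale cycle using the Heroku Platform API formation endpoint might look as follows; the app name, token, and sleep intervals are placeholders, not values from the actual experiment.

```python
# Sketch of the scale cycling: drop to scale=1, grow back to scale=15.
# App name, token, and timings are placeholders.
import time
import requests

HEROKU_API = "https://api.heroku.com"
APP = "example-bottle-app"          # hypothetical app name
HEADERS = {
    "Accept": "application/vnd.heroku+json; version=3",
    "Authorization": "Bearer <HEROKU_API_TOKEN>",
    "Content-Type": "application/json",
}

def scale_web(quantity: int) -> None:
    """Set the number of web dynos via the Platform API formation endpoint."""
    resp = requests.patch(
        f"{HEROKU_API}/apps/{APP}/formation/web",
        headers=HEADERS,
        json={"quantity": quantity},
    )
    resp.raise_for_status()

while True:
    scale_web(1)          # collapse the app, hoping new dynos land elsewhere
    time.sleep(300)
    for n in range(2, 16):
        scale_web(n)      # grow back up to scale=15, one step at a time
        time.sleep(60)
```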
Results
The results are tough to visualize because of the high number of permutations across EC2 regions and Heroku population sizes -- the latter measured in dynos, or the scale parameter in the API. One interesting fact immediately catches the eye: the throughput is much higher (relative to other regions) when the EC2 probes are located in the Virginia DC. We can see some minor dependence on how many dynos are used, but not at the level of so-called statistical significance. The difference between Virginia and other places, on the other hand, cannot be ignored. The conclusion here is obvious -- it looks like Heroku lives in the Virginia DC. I did not actually check whether Heroku advertises this fact openly, so this is not a claim of a universe-changed-forever discovery. It is simply a visual inspection of the results.
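For a sense of what this comparison involves, the sketch below groups probe samples by EC2 region and by dyno count and prints mean throughput per group. It assumes a hypothetical probes.csv with region, scale, and throughput columns and is not the original analysis code.

```python
# Group probe samples by EC2 region and by scale (dyno count) and compare means.
# Assumes a CSV with columns: region, scale, throughput (KB/s).
import csv
from collections import defaultdict
from statistics import mean

by_region = defaultdict(list)
by_scale = defaultdict(list)

with open("probes.csv") as f:
    for row in csv.DictReader(f):
        tput = float(row["throughput"])
        by_region[row["region"]].append(tput)
        by_scale[int(row["scale"])].append(tput)

print("mean throughput by EC2 region:")
for region, vals in sorted(by_region.items(), key=lambda kv: -mean(kv[1])):
    print(f"  {region:15s} {mean(vals):8.1f} KB/s")

print("mean throughput by dyno count (scale):")
for scale, vals in sorted(by_scale.items()):
    print(f"  scale={scale:<3d} {mean(vals):8.1f} KB/s")
```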
Now, having removed Virginia from the results and redrawn the plots, we can see a more level performance. This time the DCs can be roughly split into two groups -- the upper 4 rows are relatively better in terms of throughput to the Heroku app than the others. Note that those are mostly US- or Europe-based DCs.
Outcome: the Bottle Model
The Bottle Model is a simple model based on the above results. They clearly show the presence of a bottleneck effect, where the bottleneck itself is the exit of the Virginia DC. I cannot quite put my finger on what actually causes the bottleneck. It may be the Heroku DNS and routing logic that needs to advertise the apps to the outside world, while inside the Virginia DC the logic might be simplified. Whatever the cause, the bottleneck is there (a rough sketch of the model follows the list below). The practical value of this finding is obvious:
(1) Heroku app managers should be aware that Heroku puts a limit on throughput -- not intentionally, but because of its design. One can also assume that throughput hogging is a major concern, as competition among Heroku apps (especially hogs) is tougher given the bottleneck.
(2) If one's app has both EC2/S3 and Heroku parts, then one is better off putting both inside the Virginia DC. End users might prefer the EC2/S3 parts to be physically closer to them, but even in such cases more (and higher-priority) traffic is exchanged between the EC2/S3 and Heroku parts than between either of those parts and the end users.
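To make the Bottle Model a bit more concrete, here is one way to write it down. This is my own formalization under the shared-exit reading of the results, with purely illustrative numbers: a flow crossing the Virginia DC exit gets at most a fair share of a common exit capacity, while a flow that stays inside the DC is limited only by its own path.

```python
# My own formalization of the Bottle Model (not from the original post):
# a flow crossing the Virginia DC exit is capped both by its own path
# capacity and by a fair share of the shared exit ("bottle neck") capacity.
def bottle_throughput(path_capacity: float,
                      bottle_capacity: float,
                      concurrent_flows: int) -> float:
    """Per-flow throughput under the shared-bottleneck reading (same units as inputs)."""
    return min(path_capacity, bottle_capacity / max(concurrent_flows, 1))

# Illustrative numbers only: with few competing flows the path capacity
# dominates; with many flows, the fair share of the exit becomes the cap.
print(bottle_throughput(path_capacity=900.0, bottle_capacity=5000.0, concurrent_flows=2))   # -> 900.0
print(bottle_throughput(path_capacity=900.0, bottle_capacity=5000.0, concurrent_flows=50))  # -> 100.0
```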
That's all. Enjoy.
Written with my own local WYSIWYG editor.