Load Testing
I have to confess that load testing is one of my favorite
activities in software development. It is stressful, takes a long
time, but usually you learn a lot about your application and the
technologies you are using in it. And seeing the numbers to get better
as you cleverly optimize your application is one of the greatest geek pleasures.
Anyway, we were getting closer to launching the first version
of StoryTroop and we decided it was time to test how the application
would behave in our infrastructure. This way, we would have time to
change it, if the numbers were bad.
We chose to host StoryTroop on Heroku Cedar
stack. We didn't want the hassle of maintaining our own servers
and since our architecture was based on Play! Framework and MongoDB, both supported by Heroku,
it seemed a good option to start with.
We chose the Heroku
MongoHQ addon to host our data and it took me one full day to
generate the data load we needed to run the test.
Bruno Souza had created a nice JMeter installation on Amazon
AWS and we fired several machines to simulate many simultaneous users.
In order to monitor the results, we installed the New Relic Heroku addon. New Relic
gives you nice graphs and it would be very useful during the test.
We started with a few users and 1 Heroku dyno and the results
were good, with an average execution time of 600ms per request. But
when we increased to 50 simultaneous users the average went up to 8s
per request. Yes, you read it correctly. Very bad. The Heroku queue
was always close to zero, what bugged us. If there was no queue in the
dyno, why was it taking so long to respond? Well, in my experience,
when you have performance problems in an application like this, the
database is the usual suspect. So, a MongoDB query optimization was in order.
MongoDB provides a tool that is a life saving when trying to
find slow queries: the Profiler.
When you activate the Profiler, MongoDB will save statistics on a
special collection that you can query later. There is a small overhead
when running the Profiler, but it is a small one and the tool will
help you a lot.
We activated the Profiler executing this command on the
MongoDB console:
db.setProfilingLevel(1,50);
which means "log only the operations slower than 50ms".
When you want to deactivate the profiler, just run:
db.setProfilingLevel(0);
And to see the slowest operations, you can run:
db.system.profile.find().sort({millis:-1});
Once you did a few optimizations, you can see the latest results running:
db.system.profile.find().sort({$natural:-1});
As expected, we had some pretty slow queries going on. After changing a few indexes, the execution time for 50 simultaneous users dropped from 8s per request to 564ms per request. Much better, huh?
So, here are a few tips on index optimization on MongoDB:
1. Use explain() to understand your queries: this command will show you if an index is used or not and how many objects are scanned to find the answer. Here is an example:
db.Post.find({"hashtags": {$in :
["storytroop"]}}).explain();
{
"cursor" : "BtreeCursor post_timeline",
"nscanned" : 1,
"nscannedObjects" : 1,
"n" :
1,
"millis" : 27,
"nYields" : 0,
"nChunkSkips" :
0,
"isMultiKey" : true,
"indexOnly" : false,
"indexBounds"
: {
"hashtags" : [
[
"storytroop",
"storytroop"
]
]
}
}
The cursor field indicates that MongoDB is using an index
called post_timeline, so that's good. nscanned is the number of items
examined and n is the number of objects returned by the query. As a
general rule, you want to have nscanned as close as possible to n. So,
this query is behaving very well. Another important field to look at
is millis. It will give you the time the query will take to run, in milliseconds.
2. Remember that MongoDB uses only one index per query. See
this example:
db.Post.ensureIndex({"username":1});
db.Post.ensureIndex({"lastUpdate":1});
db.Post.find({"username":
"johndoe"}).sort({"lastUpdate":1}).explain();
{
"cursor" : "BtreeCursor
lastUpdate_1",
"nscanned" :
534741,
"nscannedObjects" : 534741,
"n" : 48382,
"millis" :
7517,
"nYields" : 1,
"nChunkSkips" : 0,
"isMultiKey" :
false,
"indexOnly" : false,
"indexBounds" : {
"lastUpdate" :
[
[
{
"$minElement" : 1
},
{
"$maxElement" : 1
}
]
]
}
}
See that the number of scanned
items is bigger than the number of returned objects? Now, try this:
db.Post.ensureIndex({"username":1, "lastUpdate":1});
db.Post.find({"username": "johndoe"}).sort({"lastUpdate":1}).explain();
{
"cursor" : "BtreeCursor
username_1_lastUpdate_1",
"nscanned" :
48382,
"nscannedObjects" : 48382,
"n" : 48382,
"millis" :
177,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" :
false,
"indexOnly" : false,
"indexBounds" : {
"username" :
[
[
"johndoe",
"johndoe"
]
],
"lastUpdate" : [
[
{
"$minElement" : 1
},
{
"$maxElement" : 1
}
]
]
}
}
Now n == nscanned and the query is a lot faster (see the millis
field). So, the lesson here is to match your index with your queries.
Run explain to see if you are using an efficient index.
3. Use limit: that's pretty obvious, but for large collections
you should always specify a limit to retrieve only a few items each time.
4. Try to retrieve only the fields you really need: if you are
using frameworks like Morphia, for example, to map your collection to
objects, the default will be to retrieve all the fields of a
collection. But many times, you just need a few fields to do what you
have to do. If you limit the number of fields retrieved, the query
will be a lot more efficient.
See the time to execute this:
db.Post.find().explain();
{
"cursor" : "BasicCursor",
"nscanned" : 534741,
"nscannedObjects" : 534741,
"n" :
534741,
"millis" : 18123,
"nYields" : 504,
"nChunkSkips" :
0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds"
: {
}
}
compared
to:
db.Post.find({},{"username":1}).explain();
{
"cursor" : "BasicCursor",
"nscanned" : 534741,
"nscannedObjects" : 534741,
"n" :
534741,
"millis" : 429,
"nYields" : 0,
"nChunkSkips" :
0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds"
: {
}
}
You can retrieve
only a few fields if you are using Morphia as well. Here is how:
Post.q().retrievedFields(true, "username").asList();
Just be careful: all the other fields will be null. So, make
sure you are really not using the other fields in your code.
So, after doing this index optimization phase, we had a better
response time. 50 simultaneous users in 564ms per request. We than ran
another test with 200 simultaneous users and we got 1.9s per request.
Ok, not so good, but hey, we can just start a new dyno and see the
number go down, right? Wrong. With a second dyno the average per
request was 3.2s. Exactly. With 2 dynos the results were a lot worse.
That required an additional investigation. Back to MongoDB.
If a second dyno was making things worse, chances were that
more requests were generating more stress on the database. Reviewing
our queries again, I was suspicious of a group operation we were
using. It should be a very fast operation, since it was iterating over
a few items, but reading MongoDB documentation again, in how
concurrency work
(http://www.mongodb.org/display/DOCS/How+does+concurrency+work), I
found something that I had missed the first time I read it:
"The group command takes a read lock and does not allow any
other threads to execute JavaScript while it is running."
Lock? Hmmm... that does not sound good. So, I refactored the
code to get rid of the group operation. After this, the 200
simultaneous users running with 2 Heroku dynos had an average
execution time of 475ms per request. What a difference, huh?
And this was our first load testing session for StoryTroop. We
will run other tests as new features are created but so far, I guess
we have enough for launching a pretty fast web site, specially since
we'll still add caching and other optimizations. More fun ahead.





