Real, happened in Tampa, FL, United States
05-02-2012

Load Testing

I have to confess that load testing is one of my favorite activities in software development. It is stressful and takes a long time, but you usually learn a lot about your application and the technologies you are using in it. And seeing the numbers get better as you cleverly optimize your application is one of the greatest geek pleasures.

Anyway, we were getting closer to launching the first version of StoryTroop and we decided it was time to test how the application would behave in our infrastructure. This way, we would have time to change it, if the numbers were bad.

We chose to host StoryTroop on the Heroku Cedar stack. We didn't want the hassle of maintaining our own servers, and since our architecture was based on the Play! Framework and MongoDB, both supported by Heroku, it seemed like a good option to start with.

We chose the Heroku MongoHQ add-on to host our data, and it took me one full day to generate the data load we needed to run the test.

Bruno Souza had created a nice JMeter installation on Amazon AWS and we fired up several machines to simulate many simultaneous users.

In order to monitor the results, we installed the New Relic Heroku add-on. New Relic gives you nice graphs and would be very useful during the test.

We started with a few users and 1 Heroku dyno, and the results were good, with an average execution time of 600ms per request. But when we increased to 50 simultaneous users, the average went up to 8s per request. Yes, you read that correctly. Very bad. The Heroku queue was always close to zero, which bugged us. If there was no queue in the dyno, why was it taking so long to respond? Well, in my experience, when you have performance problems in an application like this, the database is the usual suspect. So, a MongoDB query optimization was in order.

MongoDB provides a tool that is a life saver when trying to find slow queries: the Profiler. When you activate the Profiler, MongoDB will save statistics about operations in a special collection that you can query later. There is some overhead when running the Profiler, but it is small and the tool will help you a lot.

We activated the Profiler by executing this command on the MongoDB console:

db.setProfilingLevel(1,50);

which means "log only the operations slower than 50ms".

When you want to deactivate the profiler, just run:

db.setProfilingLevel(0);

And to see the slowest operations, you can run:

db.system.profile.find().sort({millis:-1});

Once you have done a few optimizations, you can see the latest results by running:

db.system.profile.find().sort({$natural:-1});
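
The profile entries are regular documents, so you can filter them like anything else. For example, to look only at operations on our Post collection that took more than 100ms (assuming the database is called storytroop here, just for illustration):

db.system.profile.find({"ns": "storytroop.Post", "millis": {$gt: 100}}).sort({millis:-1}).limit(10);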

As expected, we had some pretty slow queries going on. After changing a few indexes, the execution time for 50 simultaneous users dropped from 8s per request to 564ms per request. Much better, huh?

So, here are a few tips on index optimization on MongoDB:

1. Use explain() to understand your queries: this command will show you whether an index is used or not and how many objects are scanned to find the answer. Here is an example:


    db.Post.find({"hashtags": {$in : ["storytroop"]}}).explain();
    {
        "cursor" : "BtreeCursor post_timeline",
        "nscanned" : 1,
        "nscannedObjects" : 1,
        "n" : 1,
        "millis" : 27,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "isMultiKey" : true,
        "indexOnly" : false,
        "indexBounds" : {
              "hashtags" : [
                      [
                           "storytroop",
                           "storytroop"
                     ]
           ]
      }
}


The cursor field indicates that MongoDB is using an index called post_timeline, so that's good. nscanned is the number of items examined and n is the number of objects returned by the query. As a general rule, you want to have nscanned as close as possible to n. So, this query is behaving very well. Another important field to look at is millis. It gives you the time the query took to run, in milliseconds.

2. Remember that MongoDB uses only one index per query. See this example:

    db.Post.ensureIndex({"username":1});
    db.Post.ensureIndex({"lastUpdate":1});
    db.Post.find({"username": "johndoe"}).sort({"lastUpdate":1}).explain();
    {
        "cursor" : "BtreeCursor lastUpdate_1",
        "nscanned" : 534741,
        "nscannedObjects" : 534741,
        "n" : 48382,
        "millis" : 7517,
        "nYields" : 1,
        "nChunkSkips" : 0,
        "isMultiKey" : false,
        "indexOnly" : false,
        "indexBounds" : {
            "lastUpdate" : [
                [
                    {
                        "$minElement" : 1
                    },
                    {
                        "$maxElement" : 1
                    }
                ]
            ]
        }
    }
    
See that the number of scanned items is bigger than the number of returned objects? Now, try this:

    db.Post.ensureIndex({"username":1, "lastUpdate":1});

    db.Post.find({"username": "johndoe"}).sort({"lastUpdate":1}).explain();

    {
        "cursor" : "BtreeCursor username_1_lastUpdate_1",
        "nscanned" : 48382,
        "nscannedObjects" : 48382,
        "n" : 48382,
        "millis" : 177,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "isMultiKey" : false,
        "indexOnly" : false,
        "indexBounds" : {
            "username" : [
                [
                    "johndoe",
                    "johndoe"
                ]
            ],
            "lastUpdate" : [
                [
                    {
                        "$minElement" : 1
                    },
                    {
                        "$maxElement" : 1
                    }
                ]
            ]
        }
    }
    
Now n == nscanned and the query is a lot faster (see the millis field). So, the lesson here is to match your indexes with your queries. Run explain() to see if you are using an efficient index.
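
A quick sanity check while doing this is to list the indexes the collection already has, to make sure the one you think you are hitting actually exists:

    db.Post.getIndexes();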

3. Use limit: that's pretty obvious, but for large collections you should always specify a limit to retrieve only a few items each time.
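
For example, reusing the Post collection from the previous tip, something like this pulls only the 20 most recent posts of a user instead of the whole result set (the 20 is arbitrary, of course):

    db.Post.find({"username": "johndoe"}).sort({"lastUpdate": -1}).limit(20);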

4. Try to retrieve only the fields you really need: if you are using a framework like Morphia, for example, to map your collections to objects, the default will be to retrieve all the fields of a document. But many times, you just need a few fields to do what you have to do. If you limit the number of fields retrieved, the query will be a lot more efficient.

See the time to execute this:

    db.Post.find().explain();
    {
        "cursor" : "BasicCursor",
        "nscanned" : 534741,
        "nscannedObjects" : 534741,
        "n" : 534741,
        "millis" : 18123,
        "nYields" : 504,
        "nChunkSkips" : 0,
        "isMultiKey" : false,
        "indexOnly" : false,
        "indexBounds" : {
        }
    }
    
compared to:

    db.Post.find({},{"username":1}).explain();
    {
        "cursor" : "BasicCursor",
        "nscanned" : 534741,
        "nscannedObjects" : 534741,
        "n" : 534741,
        "millis" : 429,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "isMultiKey" : false,
        "indexOnly" : false,
        "indexBounds" : {
        }
    }
    
You can retrieve only a few fields if you are using Morphia as well. Here is how:

Post.q().retrievedFields(true, "username").asList();

Just be careful: all the other fields will be null. So, make sure you are really not using the other fields in your code.

So, after this index optimization phase, we had a better response time: 50 simultaneous users at 564ms per request. We then ran another test with 200 simultaneous users and got 1.9s per request. Ok, not so good, but hey, we can just start a new dyno and see the number go down, right? Wrong. With a second dyno the average per request was 3.2s. Exactly. With 2 dynos the results were a lot worse. That required additional investigation. Back to MongoDB.

If a second dyno was making things worse, chances were that more requests were generating more stress on the database. Reviewing our queries again, I was suspicious of a group operation we were using. It should have been a very fast operation, since it was only iterating over a few items, but reading the MongoDB documentation on how concurrency works (http://www.mongodb.org/display/DOCS/How+does+concurrency+work) again, I found something that I had missed the first time I read it:

"The group command takes a read lock and does not allow any other threads to execute JavaScript while it is running."

Lock? Hmmm... that does not sound good. So, I refactored the code to get rid of the group operation. After this, the test with 200 simultaneous users running on 2 Heroku dynos had an average execution time of 475ms per request. What a difference, huh?
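
I don't have the exact group call in my notes anymore, so the snippet below is just a hedged sketch of the kind of change, with made-up field choices (counting posts per username for a hashtag): the first version runs JavaScript on the server, which is what takes the lock; the second one fetches only the field being grouped on, hitting an index, and does the counting outside the server's JavaScript engine (here in the shell; in an application, in regular application code).

    // Before (roughly): a server-side group. The reduce function is JavaScript
    // executed inside MongoDB, which takes the lock mentioned above.
    db.Post.group({
        key: { "username": 1 },
        cond: { "hashtags": "storytroop" },
        reduce: function(doc, out) { out.count += 1; },
        initial: { count: 0 }
    });

    // After (roughly): retrieve only the grouped field and count on the client.
    var counts = {};
    db.Post.find({ "hashtags": "storytroop" }, { "username": 1 }).forEach(function(doc) {
        counts[doc.username] = (counts[doc.username] || 0) + 1;
    });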

And this was our first load testing session for StoryTroop. We will run other tests as new features are created, but so far I guess we have enough to launch a pretty fast web site, especially since we'll still add caching and other optimizations. More fun ahead.



Real, happened in São Paulo, SP, Brazil
05-31-2012

Error pages

I have to admit that I love clever error pages. If an error is already bad, why do error pages have to look awful as well?

Today Patricia sent us two options for our 404 error page and they are both awesome. A 404 error happens when a page is not found. Here they are:

We ended up choosing the first option, but we liked the second one so much that it became our 500 error page. A 500 error happens when there is an application error. Yes, when we messed up. You can see the final version of the 404 error page if you access a page that does not exist, like this one: http://www.storytroop.com/this-does-not-exist. The 500 error page you will hopefully never see :-)



Real, happened in São Paulo, SP, Brazil
July 2012

How to explain your startup in a business card

The problem with an innovative service is how to explain it in a few seconds. People are not going to give you half an hour so you can go over all your features and a long explanation of your vision.

So, when we were thinking about creating our business cards, we wanted to use them to also help us explain StoryTroop. It would not be easy, because you know how small a business card is. After a 4-hour bike ride thinking about it, the idea came to me of using the back of the card to show stories that would be different if the reader had access to more than one perspective. We could print several different stories on different cards and make them really cool. Then, by seeing how the story would be different on the card, people would understand our service. Here is my first sketch of the idea:


Ugly, I know, but it was just a sketch. Once we decided to do it, the problem was to find an affordable, good quality service that would implement our vision. We had used Moo.com in the past and I was a big fan of them. Moo.com prints cards with different backs at a very affordable price. Besides, their quality is amazing. So, Bruno and I wrote the stories, Marco did an awesome job creating the final design with a designer, and a few weeks later we got the cards by mail. Here they are:


Pretty good, huh? Our only problem now is that people don't want to have just one card. They want to have one of each story! Good job, Moo.com.

