As an architect, I have the very interesting job of having to understand the challenges in existing systems along multiple dimensions while still adding new functionality. Recently I was approached to design a system to store API usage for billing. It came, of course, with the fine print: "it should not have any impact on the existing system". Now we all know there is no "free lunch", but how far can we really go with this?
To start with the simplest design, let's just introduce a database, say MySQL, with a table that stores a Customer-Id, an Api-Id and a Call-Count. The database is initialized with a set of rows covering all combinations of Api-Ids and Customer-Ids. Pre-populating every combination allows us to ignore the existence/non-existence of a particular row, so the hot path can be a plain UPDATE.
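A minimal sketch of that setup, assuming JDBC with placeholder connection details and illustrative ID ranges; the table and column names follow the UPDATE statement used below:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Statement;

public class ApiTrackerSetup {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/billing", "user", "password")) {
            try (Statement st = conn.createStatement()) {
                // One counter row per (customer, API) combination.
                st.execute("CREATE TABLE IF NOT EXISTS apitracker ("
                        + "CustId INT NOT NULL, "
                        + "ApiId INT NOT NULL, "
                        + "CallCount BIGINT NOT NULL DEFAULT 0, "
                        + "PRIMARY KEY (CustId, ApiId))");
            }
            // Pre-populate every combination so the hot path never has to
            // check whether a row exists.
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT IGNORE INTO apitracker (CustId, ApiId, CallCount) VALUES (?, ?, 0)")) {
                for (int custId = 1; custId <= 100; custId++) {   // illustrative ranges
                    for (int apiId = 1; apiId <= 20; apiId++) {
                        ps.setInt(1, custId);
                        ps.setInt(2, apiId);
                        ps.addBatch();
                    }
                }
                ps.executeBatch();
            }
        }
    }
}
```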
The flow of events is self-explanatory: a client makes an API call, the call is recorded by firing the SQL 'update apitracker set CallCount = CallCount + 1 where CustId = ? and ApiId = ?', after which we proceed to execute the existing business logic.
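In code, the hot path might look something like this (a sketch; the surrounding request-handling plumbing is assumed):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class ApiUsageRecorder {
    // The tracking UPDATE runs synchronously before the business logic,
    // so its latency is added to every single API call.
    public void handleApiCall(Connection conn, int custId, int apiId) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "UPDATE apitracker SET CallCount = CallCount + 1 "
                + "WHERE CustId = ? AND ApiId = ?")) {
            ps.setInt(1, custId);
            ps.setInt(2, apiId);
            ps.executeUpdate();
        }
        // ... existing business logic executes here ...
    }
}
```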
How did this perform?
Updates | 2 Threads (ms) | 4 Threads (ms) | 8 Threads (ms) | Avg Rate (updates/sec) |
---|---|---|---|---|
100 | 505 | 406 | 486 | 215 |
1000 | 5094 | 4549 | 5232 | 202 |
10000 | 55306 | 56644 | 54070 | 181 |
It looks like we have an average rate of updating 200 records per second, which works out to roughly 5ms per update. Let's say we have 50 million API calls a day: 50,000,000 x 5ms comes to 250,000 seconds, or roughly 70 hours, of added latency spread across the day's calls.
Can we do better?
I decided to use Redis. The INCR functionality of Redis, which atomically increments a key's value, fits the requirement perfectly. Since Redis is not just an in-memory cache and can also persist its state to disk, the information is also durable, which is important in our scenario.
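Recording a call then becomes a single INCR. A sketch using the Jedis client (the key naming scheme is my assumption; a real multi-threaded service would use a JedisPool rather than a single connection):

```java
import redis.clients.jedis.Jedis;

public class RedisUsageRecorder {
    private final Jedis jedis = new Jedis("localhost", 6379); // placeholder host/port

    // One key per (customer, API) pair. INCR is atomic, so concurrent
    // calls never lose updates and no read-modify-write cycle is needed.
    public long recordCall(int custId, int apiId) {
        return jedis.incr("usage:" + custId + ":" + apiId);
    }
}
```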
Now, the requirement states that this data must be stored indefinitely, so using a database is still a good option, except that the database will be used in a slightly different manner. For the day's API recording, it's Redis that will do the job; we'll then copy the data from Redis to MySQL during off-peak hours every day.
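The off-peak copy could be a simple batch job along these lines (again a sketch: the key format matches the recorder above, and the GET-then-DEL reset leaves a small window in which concurrent increments could be lost; GETDEL on Redis 6.2+, or per-day key namespaces, would close it):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.ScanParams;
import redis.clients.jedis.ScanResult;

public class NightlyFlushJob {
    // Copies the day's counters from Redis into MySQL, then resets them.
    public void flush(Jedis jedis, Connection conn) throws Exception {
        try (PreparedStatement ps = conn.prepareStatement(
                "UPDATE apitracker SET CallCount = CallCount + ? "
                + "WHERE CustId = ? AND ApiId = ?")) {
            String cursor = ScanParams.SCAN_POINTER_START;
            ScanParams params = new ScanParams().match("usage:*").count(1000);
            do {
                ScanResult<String> batch = jedis.scan(cursor, params);
                for (String key : batch.getResult()) {
                    long count = Long.parseLong(jedis.get(key));
                    String[] parts = key.split(":"); // usage:<custId>:<apiId>
                    ps.setLong(1, count);
                    ps.setInt(2, Integer.parseInt(parts[1]));
                    ps.setInt(3, Integer.parseInt(parts[2]));
                    ps.executeUpdate();
                    jedis.del(key); // start the next day's count from zero
                }
                cursor = batch.getCursor();
            } while (!"0".equals(cursor));
        }
    }
}
```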
How did this perform?
Updates | 2 Threads (ms) | 4 Threads (ms) | 8 Threads (ms) | Avg Rate (updates/sec) |
---|---|---|---|---|
100 | 16 | 5 | 15 | 8333 |
1000 | 47 | 47 | 38 | 22727 |
10000 | 326 | 267 | 255 | 35377 |
In this case, it looks like we have an average rate of updating 20,000 records per second (22,146 to be exact, but who cares?). It's a staggering improvement of 100 times. At roughly 0.05ms per update, the same 50 million daily calls now cost 50,000,000 x 0.05ms, or about 2,500 seconds, bringing the added overhead down from roughly 70 hours to about 42 minutes.
Note: Redis was configured with AOF using appendfsync everysec, as well as the default save options of save 900 1, save 300 10 and save 60 10000. Moving data to a slower but more flexible system during off-peak hours can really boost the performance of a system when the data isn't needed immediately. That was exactly my scenario, and it could therefore be leveraged to maintain system performance.
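For reference, those settings as they would appear in redis.conf:

```
appendonly yes        # enable the append-only file (AOF)
appendfsync everysec  # fsync the AOF once per second
save 900 1            # default RDB snapshot rules:
save 300 10           #   snapshot after N seconds if at least M keys changed
save 60 10000
```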