Overview
So you've decided to use AWS Lambda for your application. Lambdas integrate well with AWS services such as S3 and are a good choice for handling notifications generated by those services. Now that you've settled on Lambdas for the task, which programming language will you write your functions in?
What's your use case?
Though serverless has been a buzzword for some time now, if you're serious about building a successful application, you must take such buzzwords with a pinch of salt. Just because a technology is new does not mean it's the right one for you. I remember developing a YouTube-like video-playing mobile application built purely on AWS Lambdas. Though the tech stack looked good on paper, the performance was abysmal. It's important to understand what your requirements are in terms of:
- Responsiveness
- Running time of the function logic
- Frequency of invocations
Responsiveness
As I mentioned earlier, we used Lambda functions to respond to user actions made within the video mobile app. Obviously such an app has to treat responsiveness as one of its top priorities. Unfortunately, Lambdas are not great at this. The AWS Lambda service has come a long way, but nothing beats the performance of an application working from local memory. Remember, Lambdas have no state, so you'll have to rely on distributed caching systems like Redis, which means multiple network trips to get the data you need even if it's cached. Lambdas are, however, great for responding to backend events, triggering additional workflows and the like.
Running time of the function logic
Lambda functions aren't supposed to be long-running. AWS Lambda used to cap execution time at 5 minutes; that maximum has since been raised to 15 minutes. A Lambda is supposed to do one thing, do it well, and on top of that finish within a couple of seconds.
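One way to respect the execution limit is to check how much time is left before starting each piece of work. The sketch below is only an illustration: the event shape, the `process` helper and the one-second `SAFETY_MARGIN_MS` are assumptions made for the example, while `get_remaining_time_in_millis()` is the method the Lambda Python runtime provides on the context object.

```python
SAFETY_MARGIN_MS = 1000  # assumed buffer left for cleanup and returning a response


def process(item):
    """Hypothetical unit of work; stands in for the real function logic."""
    return item.upper()


def handler(event, context):
    """Process items until the remaining time budget runs low."""
    items = event.get("items", [])  # hypothetical event shape
    processed = []
    for item in items:
        # The Lambda Python runtime exposes the time left before the timeout.
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            break
        processed.append(process(item))
    # Hand back whatever was not reached so a caller could re-submit it.
    return {"processed": processed, "remaining": items[len(processed):]}
```

Stopping early and returning the unprocessed items is just one possible design; the point is that the function decides when to stop instead of being killed at the timeout.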
Frequency of invocations
Each AWS Lambda executes in its own container. When a Lambda executes for the first time, a considerable amount of time is spent on initialization. If the Lambda is invoked again not "too long" after, the same container is reused. The idea is to keep Lambdas from going out of context so they don't have to spend time on initialization and are ready to execute immediately. Some systems regularly invoke Lambdas just to keep them "warm".
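Container reuse is easy to observe from inside a function, because module-level code runs only when a new container is initialized. Here is a minimal sketch in Python; the `warmup` key in the event is a made-up convention for a scheduled keep-warm ping, not something Lambda defines.

```python
import time

# Module-level code runs once per container, i.e. during a cold start.
_CONTAINER_STARTED_AT = time.time()
_invocation_count = 0


def handler(event, context):
    """Report whether this invocation hit a fresh or a reused container."""
    global _invocation_count
    _invocation_count += 1
    cold_start = _invocation_count == 1

    # Hypothetical keep-warm ping: return right away without doing real work.
    if isinstance(event, dict) and event.get("warmup"):
        return {"warmed": True, "cold_start": cold_start}

    return {
        "cold_start": cold_start,
        "container_age_s": time.time() - _CONTAINER_STARTED_AT,
    }
```

Anything expensive to set up, such as SDK clients or parsed configuration, can live at module level in the same way so that warm invocations skip the cost.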
The programming language used
The programming language chosen has a bearing on all three points discussed. Compiled languages that require a heavy runtime, like the JVM or the .NET runtime, spend more time the first time they execute than languages like NodeJS or Python, which don't have a heavy runtime. Here is a comparison of a Lambda function which:
- Responds to S3 put events
- Reads first 4K of the S3 object
- Tries to find a string within the data read
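For reference, a function with the three behaviours listed above could look roughly like this in Python. This is a sketch and not the code that was benchmarked: the search string `NEEDLE` is a placeholder, and the optional `s3_client` parameter is added only so the handler can be exercised without AWS. It relies on boto3's `get_object` with a `Range` header to fetch just the first 4K.

```python
import urllib.parse

NEEDLE = b"ERROR"    # hypothetical search string
PREFIX_BYTES = 4096  # "first 4K" of the object

_s3 = None  # created lazily, then reused while the container stays warm


def _default_client():
    global _s3
    if _s3 is None:
        import boto3  # bundled with the Lambda Python runtime
        _s3 = boto3.client("s3")
    return _s3


def find_in_prefix(s3_client, bucket, key, needle=NEEDLE):
    """Fetch only the first PREFIX_BYTES of the object and search them."""
    # HTTP byte ranges are inclusive, so 0-4095 is exactly 4096 bytes.
    resp = s3_client.get_object(
        Bucket=bucket, Key=key, Range=f"bytes=0-{PREFIX_BYTES - 1}"
    )
    return needle in resp["Body"].read()


def handler(event, context, s3_client=None):
    """Handle an S3 put notification; one result per record."""
    client = s3_client or _default_client()
    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        results.append(
            {"bucket": bucket, "key": key,
             "found": find_in_prefix(client, bucket, key)}
        )
    return results
```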
Here we compare the performance of the Lambda written in Python with one written in Java (Java 11) for a cold start:
Language | Billed duration (ms) | Init duration (ms) | Max memory used (MB) |
---|---|---|---|
Python | 262 | 408.89 | 71 |
Java | 13087 | 404.08 | 152 |
Next we compare the performance of the Lambda written in Python with one written in Java (Java 11) for a warm start:
Language | Billed duration (ms) | Init duration (ms) | Max memory used (MB) |
---|---|---|---|
Python | 164 | 0 | 72 |
Java | 835.81 | 0 | 158 |
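To put the two tables in perspective, the short script below computes how much larger the Java figures are than the Python ones. The numbers are copied from the tables above; the units (milliseconds and megabytes) are assumed to be the ones Lambda uses in its invocation reports. Since Lambda charges by duration, the billed-duration ratio gives a rough idea of the cost difference, assuming both functions are configured with the same memory size.

```python
# (billed duration ms, init duration ms, max memory used MB), from the tables above
MEASUREMENTS = {
    "cold": {"Python": (262, 408.89, 71), "Java": (13087, 404.08, 152)},
    "warm": {"Python": (164, 0, 72), "Java": (835.81, 0, 158)},
}


def java_to_python_ratios(start):
    """Return how many times larger the Java figures are for a start type."""
    py_billed, _, py_mem = MEASUREMENTS[start]["Python"]
    java_billed, _, java_mem = MEASUREMENTS[start]["Java"]
    return {
        "billed_duration": java_billed / py_billed,
        "max_memory": java_mem / py_mem,
    }


for start in ("cold", "warm"):
    r = java_to_python_ratios(start)
    print(f"{start}: billed duration {r['billed_duration']:.1f}x, "
          f"max memory {r['max_memory']:.1f}x")
```

On these figures the Java function is billed for roughly 50 times as long as the Python one on a cold start and roughly 5 times as long on a warm start, while using a little over twice the memory in both cases.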
Conclusion
If Lambdas operate on the periphery of the application, perhaps filtering data or sending notifications to start core workflows, it makes sense to use a language with a small memory footprint like NodeJS or Python. You'll save on costs and not lose anything in the bargain.