What are the limitations of AWS Lambda

Deep Dive AWS Lambda - Part II

Welcome to the second part of this miniseries that focuses on how to design Lambda functions and how to optimize your code. This article builds on concepts that have already been explained in previous blog posts. Be sure to take a look at our previous blog posts “Introduction to Serverless Computing with AWS Lambda” and “Deep Dive AWS Lambda - Part I” for a solid overview of the topic.

The handler method

When you create a Lambda function, you specify a handler method. This method is generated by default and is the entry point for a request served by the function. The function receives two objects as parameters:

The eventObject contains details about the event that called the Lambda function. For example, the function could receive data from an HTTP POST request. In Python, the eventObject is usually a dictionary. However, the object type can vary. Both the content and the type of the object are determined by the type of call.

The contextObject provides information about the Lambda function. You may need to know how much time is left before the function times out or what the memory limit of the function is. This type of information and more can be retrieved from the contextObject. A full list can be found in the official documentation [1].

Some Lambda functions return a value. Whether they need to depends strongly on the execution model, which was discussed in the first part of the Deep Dive. If the function is called synchronously, the calling service expects a response to its request to AWS Lambda. The function must therefore return a value to the service, otherwise the further execution of the entire application is blocked. Asynchronous calls, on the other hand, do not wait for a response. A return statement is therefore unnecessary.
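To make the two parameters and the return value more concrete, here is a minimal sketch of a Python handler. It assumes the function is called synchronously with an HTTP-style event that carries a JSON string under the key "body"; that event shape and the response layout are assumptions for illustration, not something every trigger provides.

```python
import json


def lambda_handler(event, context):
    """Entry point: Lambda passes in the event and the context object."""
    # Assumption: the caller sends an HTTP-style event with a JSON string in "body".
    body = json.loads(event.get("body") or "{}")

    # The context object exposes runtime information about this invocation.
    remaining_ms = context.get_remaining_time_in_millis()
    memory_limit = context.memory_limit_in_mb

    # A synchronous caller waits for this return value.
    return {
        "statusCode": 200,
        "body": json.dumps({
            "received": body,
            "remaining_ms": remaining_ms,
            "memory_limit_mb": memory_limit,
        }),
    }
```

For an asynchronous call the same handler would work, but nobody would read the returned dictionary.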

Best practice: design

Many of the topics in this section are closely related to microservice architecture design. This is no surprise, as AWS Lambda is itself part of a microservice environment and at the same time an integral part of many microservice architectures.

The first important rule is the separation of business logic and Lambda logic. The handler method belongs to the latter. It should only be used to extract information from the eventObject or, if necessary, from the contextObject. Further methods within the Lambda function should then process the extracted information. Once processing is complete, a response may be returned to the handler method, which in turn returns it to the calling service. This principle enables developers to focus unit and integration tests on the business logic alone.
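The separation described above could look roughly like the following sketch. The function name calculate_total and the "items" key in the event are hypothetical and only serve as an illustration.

```python
def calculate_total(items):
    """Business logic: knows nothing about Lambda, events, or contexts."""
    return sum(item["price"] * item["quantity"] for item in items)


def lambda_handler(event, context):
    """Lambda logic: extract the input, delegate, and return the result."""
    items = event.get("items", [])
    return {"total": calculate_total(items)}
```

Because calculate_total takes plain Python data, a unit test can call it directly without constructing an event or a context object.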

Another important rule relates to the architecture of the code. Functions should only serve one purpose. Don't write functions that do more than one thing. This will help other developers understand and maintain your code.

Lambda functions must also be stateless: no state should be stored in the context of the Lambda function itself. The code only runs when the function is triggered by an event. If something needs to be persisted, consider using S3 or DynamoDB. Both services scale horizontally and are easy to use. As a rule of thumb, use DynamoDB when you need millisecond latency and your data is changing quickly. If throughput is not important and the data is not changing much, use S3.
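As an illustration of keeping state outside the function, the following sketch increments a counter that lives in DynamoDB. The function name record_visit, the key "user_id", and the attribute "visits" are hypothetical; the table object is passed in and is assumed to behave like a boto3 DynamoDB Table resource.

```python
def record_visit(table, user_id):
    """Increment a per-user counter that is stored in DynamoDB, not in the function."""
    response = table.update_item(
        Key={"user_id": user_id},
        UpdateExpression="ADD visits :one",      # atomic increment on the server side
        ExpressionAttributeValues={":one": 1},
        ReturnValues="UPDATED_NEW",              # return the new counter value
    )
    return int(response["Attributes"]["visits"])
```

In a deployed function, the table would typically be created once with boto3.resource("dynamodb").Table(...) and handed to record_visit by the handler. Passing it in as a parameter also makes it easy to substitute a stand-in during tests.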

Finally, you should only include the dependencies that you need. This is easier said than done, but it has a significant impact on deployment and production performance. Integrating entire SDKs into your code results in a larger deployment package and thus longer deployment processes. In addition, the function's runtime needs more time to become available (see also cold and warm start in the introductory article). To control dependencies in your code, see another of our blog articles: Dependency Management for AWS Lambda

Best practice: code

The AWS Developer Documentation has sections titled "Working With ...". Each of these sections gives specific tips for working with a supported programming language, but covers the same six topics for each language:

  • Handler
  • Deployment package
  • Context
  • Logging
  • Errors
  • Tracing

The following part deals with "Logging, Errors and Tracing", "Environment Variables" and "Recursive Code". If you want to learn more about the other topics, be sure to read our previous articles.

Logging, errors and tracing

Logging is important to developers and operational team members alike. It helps in developing code and also in troubleshooting. When a developer runs a Lambda function, they can include output statements in the code that are automatically logged by the service.

For Python, you can insert output statements directly into the code. However, the larger the project, the more modules are developed, which is why a structured logging approach becomes necessary. Also, the project eventually moves from development, debugging, and testing to production. Different logging levels should be considered for these stages.

A module from the standard library can of course be used for structured logging in Python. It offers the following features:

  • Control of the logging level
  • Definition of storage locations for log files
  • Predefined template for log levels
  • Source information integrated in log entries
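A minimal sketch of level-controlled logging follows, assuming the module meant above is Python's standard logging module. The environment variable name LOG_LEVEL and the "records" key in the event are hypothetical choices for this example.

```python
import logging
import os

# LOG_LEVEL is a hypothetical variable name; it defaults to INFO when unset.
logger = logging.getLogger(__name__)
logger.setLevel(os.environ.get("LOG_LEVEL", "INFO").upper())


def lambda_handler(event, context):
    records = event.get("records", [])
    logger.debug("Full event: %s", event)                 # emitted only at DEBUG
    logger.info("Processing %d record(s)", len(records))  # emitted at INFO and below
    return {"processed": len(records)}
```

With this arrangement, development and test deployments could run at DEBUG while production stays at INFO or WARNING, without any change to the code.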

An even better tool (Python only) is "AWS Lambda Powertools", which is hosted on GitHub [2]. This extension enables the logging of information (similar to the standard library module), but also contains the libraries for AWS X-Ray. AWS X-Ray is a tracing service that can be used to track API calls between different services and to find bottlenecks in the workflow.

Environment variables

Environment variables make it possible to change the behavior of the function without updating code. By default, the variables are encrypted at rest, so sensitive parameters can be stored in the environment variables instead of in the code. It is possible to use a key other than the default, and it is also possible to encrypt the variables on the client side before entering them into AWS Lambda.

However, there are restrictions on the environment variables (in addition to the naming requirements):

  • Keys must not be reserved by Lambda
  • The total size of all environment variables cannot exceed 4 KB

One use case for environment variables is software testing. Suppose you want to test a function that connects to a production database. You want to connect to the test database for the test. Updating the function code with each test is prone to errors. If you forget to change the parameter again before the function processes production events again, you may lose data. It is safer to use two Lambda functions with identical code but different values for the database connection.
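The idea of identical code with different connection values can be sketched as follows. The variable names DB_HOST and DB_NAME and the default "app" are hypothetical, and the handler only reports the target instead of opening a real connection.

```python
import os


def get_database_config():
    """Read connection settings from the environment instead of hard-coding them."""
    return {
        "host": os.environ["DB_HOST"],             # required: raises KeyError if missing
        "name": os.environ.get("DB_NAME", "app"),  # optional, with a default
    }


def lambda_handler(event, context):
    config = get_database_config()
    # A real function would open a connection here; this sketch only reports the target.
    return {"database": f'{config["host"]}/{config["name"]}'}
```

The test function and the production function would then differ only in the values configured for these variables.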

The limitations described above have shown that there are certain environment variables that Lambda reserves. These contain information about the runtime and the function itself. For more information, please visit the developer documentation.

Recursive code

The shortest and probably most important piece of advice: avoid using recursive code. Recursion increases the execution time of a function and can therefore delay the workflow. This is particularly dangerous if the function is called synchronously. Another example of recursive code is when your function calls itself using a software development kit (SDK). This increases the number of simultaneous executions very quickly.
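When a function does invoke itself, one way to keep this under control is a depth counter that travels with the event. The sketch below assumes a client object that behaves like the boto3 Lambda client; the names MAX_DEPTH, next_payload, invoke_self, and the "depth" key are hypothetical.

```python
import json

MAX_DEPTH = 5  # hypothetical limit chosen for this sketch


def next_payload(event):
    """Return the payload for the next self-invocation, or None to stop."""
    depth = event.get("depth", 0)
    if depth >= MAX_DEPTH:
        return None  # limit reached: do not invoke again
    return {**event, "depth": depth + 1}


def invoke_self(lambda_client, function_name, event):
    """Asynchronously re-invoke the function unless the depth limit is hit."""
    payload = next_payload(event)
    if payload is None:
        return False
    lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="Event",  # asynchronous: do not wait for the result
        Payload=json.dumps(payload).encode("utf-8"),
    )
    return True
```

Inside a handler, function_name could be taken from context.function_name, so that the function never needs to hard-code its own name.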

If recursive code cannot be avoided, termination conditions should always be implemented to avoid accidental blocking or too many simultaneous calls.

Thank you for taking the time to learn about best practices in function design and how to organize your code. The next article will cover “Concurrency” in AWS Lambda and “Monitoring” Lambda functions.

[1] https://docs.aws.amazon.com/lambda/latest/dg/python-context.html
[2] https://github.com/awslabs/aws-lambda-powertools-python

Further information on the topic (Eng.):