Serverless Computing: My First Lambda Function

This blog describes to some level of detail how I created my first AWS Lambda function (https://aws.amazon.com/lambda/).

Overview

You are about to embark on a steep learning curve. Defining and executing a AWS Lambda function (in short: function) is not only typing in the function specification, its implementation and then invoking it. A lot more is involved – the below blog provides a general flow (not a command-by-command tutorial – for that several references are provided).

From a high level, the architecture looks as follows:

+--------+  +-------------+  +-----------------+  +----------------+
| Client |->| Api Gateway |->| Lambda Function |->| Implementation |
+--------+  +-------------+  +-----------------+  +----------------+
                  |                  |
            +--------------------------------+
            | Identity and Access Management |
            +--------------------------------+

A client invoking a function does so via the API Gateway. For the function to be executed, its implementation has to be provided. When using Java for the implementation, the implementation has to be uploaded in a specific packaging. Identity and Access Management (IAM) governs various access points from a security perspective.

Super briefly, as a summary, in order for a function invocation to be successful the following has to be in place (a lot of moving parts):

User

  • needs to know API Gateway URL (it is shown when selecting a stage within the Stages link on the API Gateway page)
  • Must have an access key and secret key for key-based authentication (configured in IAM)

Api Gateway

  • API definition as well as resource specification

Lambda Function

  • Function specification
  • Function implementation uploaded
  • Function policy allowing API Gateway access

Identity and Access Management (IAM)

  • User, group and role definition
  • Access policy definition assigned to invoking user (directly or indirectly) for API gateway

Aside from the function implementation everything can be specified on the AWS web page. The function implementation is uploaded by means of a defined packaging.

AWS Account

All specification and definition activity takes place in context of an AWS account. If you don’t have one then you need to create one. Chances are you purchased an item on Amazon before; in this case you have an AWS account already.

Identity and Access Management (IAM)

Initially I setup two users. One called “apiDev”, and a regular user.

Then I created two groups “apiDevelopers” and “apiUsers”. apiDevelopers has the policy AdministratorAccess assigned. This allows apiDev to create all artifacts necessary to implement and to invoke a function. I logged in as apiDev for creating the function and all necessary artifacts.

The group apiUsers has no policy assigned initially, however, it will get a (function execution) policy assigned that is going to be specifically created in order to access the function. This establishes a fine-grained permissions allowing the users of the group to execute the function.

Function Definition

The function definition is separate from the function implementation. A function is created without it having an implementation necessarily at the same time. In my case I am using Java and the implementation has to be uploaded in a specific packaging format; and that upload is distinct from specifying the function in AWS Lambda.

A function definition consists of a name, the selection which language (runtime) is going to be used as well as an execution role. The latter is necessary for a function to write e.g. into the Amazon CloudWatch logs. However, a function specification does not include the function parameters or return values. A function specification does not contain its signature. Any input/output signature specification is absent and only the code will contain the authoritative information.

The phrase “creating a function” can therefore refer to different activities, e.g., just the function specification in AWS Lambda, or including its implementation via e.g. an upload.

The instructions for creating the function and its implementation is here: https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-create-api-as-simple-proxy-for-lambda.html. I chose the execution role lambda_basic_execution.

As a side note, AWS Lambda is different from anonymous lambda functions (https://en.wikipedia.org/wiki/Anonymous_function).

Function Implementation

Being an Intellij user I created a separate project for implementing the function. It turns out the easiest way approaching the function implementation was to create a gradle project from scratch using the Intellij project creation option for gradle, and then fill in the AWS function implementation (rather starting with the function implementation and trying to turn it into a gradle project afterwards).

Once the function is developed it has to be uploaded to AWS Lambda in form of a specific packaging. The process of creating the corresponding zip file is here: https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-create-api-as-simple-proxy-for-lambda.html#api-gateway-proxy-integration-lambda-function-java and here: https://docs.aws.amazon.com/lambda/latest/dg/lambda-java-how-to-create-deployment-package.html.

The upload only happens when pressing the “save” button on the AWS Lambda page and it’ll take a while as the package tends to be several GB. Once uploaded one or more tests can be defined on the web page and executed. While this is not a practical unit test approach, it allows to execute the function without an API Gateway integration in place.

After the function implementation (I choose to implement a function computing Fibonacci numbers) the AWS Lambda user interface looks like this:

Note: this screen dump was taken after I integrated the function with the API Gateway; therefore the API Gateway trigger it is displayed in the UI.

API Gateway

One way invoking (“triggering”) a function is via the API Gateway. This requires the specification of an API and creating a reference to the function. The simplest option is using the proxy integration that forwards the invocation from the API Gateway to the function (and its implementation).

The instructions for creating the API in the API Gateway and its implementation is here: https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-create-api-as-simple-proxy-for-lambda.html. I chose the Lambda Function Proxy integration.

An API specification must be deployed into a stage in order for it to be accessible. Stages are a mechanism to e.g. implement the phases of development, testing, or production.

Analogous to AWS Lambda, the API Gateway also allows direct testing within the web user interface and this can be used for some initial testing (but is not feasible for integration testing as it is manual).

Once the relationship between the API definition in the API Gateway and the function is established via resource specifications, the function can be invoked from external to Amazon using e.g. Postman.

After the implementation the API Gateway user interface looks like this:

Note on security: by default the API is not secured, meaning, everybody who knows the URL is free to call it and invoke the associated function.

Securing the Function

There are two different locations in the invocation chain that require security consideration and policy setup:

  • First, the API in the API Gateway needs to be protected
  • Second, the function in AWS Lambda needs to be protected

Securing an API is outlined here:

https://docs.aws.amazon.com/apigateway/latest/developerguide/permissions.html This differentiates accessing the API definition and invoking an API.

For invoking an API I created a policy and attached it to the group apiUsers. Any user within this group is allowed to invoke the API that I created. In addition, I set the authorization to AWS_IAM (see above figure) and that means that the invoking client has to provide the access and secret key when invoking the API.

The function in AWS Lambda is secured using a function policy that can be seen when clicking on the symbol with the key in the AWS Lambda user interface. In my case is states that the API Gateway can access the function when invoked through a specific API.

Note (repeat from earlier): when defining an API in the API Gateway access is open, aka, anybody knowing the URL can execute the function behind the API. While the URL contains the API identifier (and that is randomly generated) and highly unlikely to be guessed, still, access is open.

Once the access policy is defined and put in place, access will be limited according to the policy. However, access restriction is not immediate, it takes some (short) time to become effective.

Function Invocation

In order to invoke the function, a client (in my case Postman) requires the URL. The URL can be found when clicking a stage in the Stages link in the API Gateway UI.

I opted for the IAM authorization using access key and secret key. That needs to be configured in the authorization setting of Postman (it also requires the AWS Region to be specified). No additional headers are required.

As I have defined a POST method, the payload has to be added as well. In my case this is a simple JSON document with one property.

POST /test/ HTTP/1.1
Host: <API URL>
content-type: application/json
Cache-Control: no-cache

{
"fib": 7
}

Once the invocation is set up, and once a few invocations took place, the API Gateway Dashboard will show the number of invocations, errors, etc., separated for API Gateway as well as Lambda functions.

Summary

Defining the first function is an effort as many pieces have to fall in place correctly for it to work out and many mistakes will happen most likely along the way. However, the ecosystem is quite large and has many questions already answered; in addition, AWS has a lot of documentation, which is mostly accurate (but not quite 100%).

The first function, as defined above, now gives me a jump-off platform to investigate and to experience AWS Lambda functions further. Stay tuned for many more blogs that explore a huge variety of aspects and concepts of serverless distributed computing.

Go Serverless!

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

 

Serverless Computing: What is it?

In a nutshell: “Just upload code and execute it”.

Serverless Computing

From my viewpoint as an engineer, serverless computing means that I can implement as well as use cloud functionality without having to establish and to manage server deployments.

A quote from Amazon states: “Serverless computing allows you to build and run applications and services without thinking about servers. Serverless applications don’t require you to provision, scale, and manage any servers. You can build them for nearly any type of application or back-end service, and everything required to run and scale your application with high availability is handled for you” (https://aws.amazon.com/serverless/).

For example, if business logic has to be executed, I develop one or more functions (procedures), deploy those “into a serverless cloud” and invoke them without having to worry, for example, about initialization, containers, web or application servers, deployment descriptions or scaling.

Or, when I need database access, I create a database service instance “in the cloud” and use it. Of course, I have to possibly (maybe – maybe not) reason about capacity, but I don’t have to find hardware, find, install and maintain database software images, scale the instances, and so on.

There are many explanations and discussions of serverless computing, for example, https://martinfowler.com/articles/serverless.html or https://martinfowler.com/bliki/Serverless.html, among many others. A clear-cut technical definition of “serverless computing” is still missing.

Serverless Computing: Why is it Interesting?

There are many reasons why serverless computing is interesting and appealing. Cost is one factor, hardware utilization is another. The above references outline many, many more.

However, from a software or service engineering perspective there are many important reasons why this relatively new concept is very much worth exploring and considering for future development and possibly migration from traditional approaches like, for example, Kubernetes.

For one, the focus and the effort on managing server deployments and hardware environments is extremely reduced or removed altogether, including aspects like scaling or failure recovery. This frees up engineering, QA, dev ops and ops time and resources to focus on the core business functionality development.

More importantly, a serverless development and execution environment restricts the software architecture, implementation, testing and deployment in major ways. This reduction in variance allows execution optimization and enables a significant increase in development quality and dependability.

Serverless Computing: What are the Choices?

As in the case of any new major development, several providers put forward their specific implementations, and this is the case for serverless computing as well. I don’t attempt to provide a comprehensive list here, but instead refer to the following page as one example that collected some providers: http://www.nkode.io/2017/09/12/serverless-frameworks.html.

One observation is that there are vendor specific as well as vendor unspecific implementations of serverless computing. It is in the eye of the beholder (based on use cases and requirements) to determine the most applicable environment.

Serverless Computing: Changing Planets

It is easy to label serverless computing as “The Next Big Thing”. However, I believe that a real fundamental shift is taking place that in a major ways breaks with the historical development and engineering of distributed computing. Remote procedure call (RPC – https://en.wikipedia.org/wiki/Remote_procedure_call), Distributed Computing Environment (DCE – https://en.wikipedia.org/wiki/Distributed_Computing_Environment), CORBA (https://en.wikipedia.org/wiki/Common_Object_Request_Broker_Architecture), REST (https://en.wikipedia.org/wiki/Representational_state_transfer), just to name a few, have all in common that the design and engineering has to take hardware and resource topology and deployment into consideration, including scaling, and recovery. Furthermore, it had to reason about “local” and “remote”.

Serverless computing, as it currently is implemented, takes away the “distribution” and deployment aspect to very large extent or even completely. This will change significantly how software engineering approaches system and service construction. Time will tell the real impact, of course, but I’d expect major shifts from a software engineering perspective.

What’s Next: Journey Ahead

From a very pragmatic viewpoint, serverless computing is an alternative software development and execution environment for (distributed) services. And this defines the journey ahead: figuring out how the various aspects of software engineering are realized, like for example

  • Function implementation
  • Procedure implementation
  • Error handling, exception handling and logging
  • Failure recovery
  • Scaling (out and in)

The blogs following this will explore these and other aspects over time.

Summary

Serverless computing, and serverless functions in particular, are very appealing developments, especially from the viewpoint of software development/engineering as well as scaleable execution.

Let’s get hands-on and see what the upside potentials are as well as where the limits and issues lie.

Go Serverless!

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Computing Technology Inflection Point: Happening Right Now

I believe that computing in industry is at a computing technology inflection point right now based on some supporting observations:

  • Databases. Through the NoSQL movement in the database world a shift is taking place:
    1. New database management systems are being created, made available and put into production
    2. The new database systems are by enlarge schema-less and/or support dynamic schema changes
    3. Many support JSON as data structures or data model that is inherently supported by Javascript
  • Server. The Javascript language is enabled as server-side programming language:
    1. Google’s V8-Engine together with Node.js and its Eco-system makes Javascript a real contender as a server-side language
    2. Javascript does not have a class/instance model and supports schema-less programming, including dynamic schema changes
    3. Integrated Development Environments (IDEs) provide full Javascript support
  • User Interface. The combination of HTML5/Javascript is gaining traction:
    1. HTML5 in conjunction with Javascript is a real powerful web development technology set
    2. Javascript libraries are being built that can span all types of user interfaces and user devices (including mobile)
    3. It is possible to have a streaming query from the database all the way to the user interface; its frictionless

Combining all these observations, two major characteristics of a new computing technology stack are crystallizing very clearly:

  • One cross-layer programming language (Multi-tier Programming)

    • It is possible to use the same language in the User Interface, Server and Database, i.e., across all architectural layers
  • Inherent schema-less computing and dynamic schema change support
    • All layers shift to schema-less technology and models

I believe that this is a real inflection point happening right now and will cause a major shift in system design, engineering and evolution. It leaves behind the notion that each layer in a typical technology stack must have its specialized language. It also leaves behind the notion that every concept across domains should be best represented in the Class-Instance paradigm. It opens up the ability to dynamically evolve within the systems’ schema-less or dynamic schema change capabilities and it supports the reuse of logic and types across all layers without loosing or changing semantics.

If this combination of technology continues to gain major traction, I would not be surprised to see that some of the first operating system prototypes written in Javascript will become mainstream at some point in time. And, of course, it would be interesting to see how far the processor manufacturers are in their thought processes or research projects with putting dedicated Javascript execution functionality on the processors themselves.

Being around at this computing technology inflection points is exciting and professionally lays before us interesting choices in business decisions as well as career decisions.