This blog has explained a few Serverless concepts with code samples:

This blog entry shows how to use AWS Lambda to store a Twitter user's tweets in Couchbase. Here are the high-level components:

 

[Image: lambda-twitter-couchbase]

The key concepts are:

The complete sample code for this blog is available at github.com/arun-gupta/twitter-n1ql.

Serverless Application Model

Serverless Application Model, or SAM, defines simplified syntax for expressing serverless resources. SAM extends AWS CloudFormation to add support for API Gateway, AWS Lambda and Amazon DynamoDB. Read more details in Microservice using AWS Serverless Application Model and Couchbase.

For our application, the SAM template is available at github.com/arun-gupta/twitter-n1ql/blob/master/template-example.yml and shown below:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Twitter Feed Analysis using Couchbase/N1QL
Resources:
  TrumpFeed:
    Type: AWS::Serverless::Function
    Properties:
      Handler: org.sample.twitter.TwitterRequestHandler
      Runtime: java8
      CodeUri: s3://arungupta.me/twitter-feed-1.0-SNAPSHOT.jar
      Timeout: 30
      MemorySize: 1024
      Environment:
        Variables:
          COUCHBASE_HOST: <value>
          COUCHBASE_BUCKET_PASSWORD: <value>
      Role: arn:aws:iam::598307997273:role/microserviceRole
      Events:
        Timer:
          Type: Schedule
          Properties:
            Schedule: rate(3 hours)

What do we see here?

  • The function is packaged and available in an S3 bucket.
  • The handler class is org.sample.twitter.TwitterRequestHandler and is at github.com/arun-gupta/twitter-n1ql/blob/master/twitter-feed/src/main/java/org/sample/twitter/TwitterRequestHandler.java. It looks like:
    public class TwitterRequestHandler implements RequestHandler<Request, String> {
    
        @Override
        public String handleRequest(Request request, Context context) {
            if (request.getName() == null)
                request.setName("realDonaldTrump");
            
            int tweets = new TwitterFeed().readFeed(request.getName());
            
            return "Updated " + tweets + " tweets for " + request.getName() + "!";
        }
        
    }

    By default, this class reads the tweets from Donald Trump's Twitter handle. There will be more fun on that coming in a subsequent blog.

  • COUCHBASE_HOST and COUCHBASE_BUCKET_PASSWORD are environment variables that provide the EC2 host where the Couchbase database is running and the password of the bucket.
  • Functions can be triggered by different events. In our case, the function is triggered every three hours. More details about the expression used here are at Schedule Expressions Using Rate or Cron.
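
The handler shown above falls back to a default handle when the incoming request carries no name. Here is a minimal sketch of that defaulting logic in isolation. The Request class below is a hypothetical stand-in for the one in the sample repository, and HandleDefaults, resolveHandle and the extra blank-string check are assumptions added for illustration only:

```java
// Hypothetical input POJO; the real Request class lives in the sample repository.
class Request {
    private String name;

    public String getName() { return name; }

    public void setName(String name) { this.name = name; }
}

// Hypothetical helper that isolates the "use a default handle" decision.
class HandleDefaults {
    static final String DEFAULT_HANDLE = "realDonaldTrump";

    // Returns the requested handle, or the default when none was supplied.
    // Treating a blank name like a missing one is an assumption; the
    // original handler only checks for null.
    static String resolveHandle(Request request) {
        String name = request.getName();
        if (name == null || name.trim().isEmpty()) {
            return DEFAULT_HANDLE;
        }
        return name;
    }
}
```

Keeping the decision in one small method makes it easy to check without invoking the Lambda function or calling Twitter.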

Fetching Tweets using Twitter4J

Tweets are read using the Twitter4J API, an unofficial Java library that provides an abstraction over the Twitter REST API. Here is a simple example:

 

Twitter twitter = getTwitter();
Paging paging = new Paging(page, count, sinceId);
List<Status> list = twitter.getUserTimeline(user, paging);

Twitter4J Docs and Javadocs are pretty comprehensive.

A single timeline request to the Twitter API returns at most 200 tweets. The Lambda function is invoked every 3 hours, and @realDonaldTrump does not tweet 200 times every 3 hours, at least not yet. If the frequency does reach that dangerous level, we can adjust the rate to trigger the Lambda function more frequently.
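
The sizing argument above is simple arithmetic, and can be sketched as follows. This is only an illustration under the assumption that one invocation performs one fetch of at most 200 tweets; ScheduleSizing and its method names are hypothetical and not part of the sample code:

```java
// Hypothetical helper: checks whether a schedule keeps up with a tweet rate.
class ScheduleSizing {
    // Assumed upper bound on tweets returned by one fetch.
    static final int MAX_TWEETS_PER_FETCH = 200;

    // True when the tweets expected between two invocations fit in one fetch.
    static boolean fitsInOneFetch(double tweetsPerHour, int intervalHours) {
        return tweetsPerHour * intervalHours <= MAX_TWEETS_PER_FETCH;
    }

    // Largest whole number of hours between invocations that still fits in
    // one fetch. Returns 0 when even a one-hour interval is too long.
    static int maxIntervalHours(double tweetsPerHour) {
        if (tweetsPerHour <= 0) {
            throw new IllegalArgumentException("tweetsPerHour must be positive");
        }
        return (int) Math.floor(MAX_TWEETS_PER_FETCH / tweetsPerHour);
    }
}
```

For example, at 10 tweets per hour the 3-hour schedule is comfortably within the limit, while at 100 tweets per hour the interval would need to drop to 2 hours or less.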

A JSON representation of each tweet is stored in Couchbase Server using the Couchbase Java SDK. AWS Lambda also supports Node.js, Python and C#, so you can use the Couchbase Node.js SDK, Couchbase Python SDK or Couchbase .NET SDK to write these functions as well.

The Twitter4J API allows us to fetch all tweets posted after a particular tweet, identified by its ID. This ensures that duplicate tweets are not fetched. To find that ID, we sort the stored tweets by ID and pick the most recent one. This is done with a simple N1QL query:

SELECT id FROM twitter ORDER BY id DESC LIMIT 1

The syntax is very SQL-like. There will be more on this in a subsequent blog.
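
To make the de-duplication idea concrete, here is a small sketch that simulates it with in-memory lists instead of Couchbase and Twitter. It is not the code from the sample repository: SinceIdFilter, latestStoredId and newerThan are hypothetical names, and returning -1 for an empty store is an assumed convention:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory illustration of the "since id" technique.
class SinceIdFilter {
    // Mirrors "SELECT id FROM twitter ORDER BY id DESC LIMIT 1":
    // returns the largest stored id, or -1 when nothing is stored yet.
    static long latestStoredId(List<Long> storedIds) {
        long max = -1L;
        for (long id : storedIds) {
            if (id > max) {
                max = id;
            }
        }
        return max;
    }

    // Keeps only tweets strictly newer than sinceId, which is what a
    // since-id based fetch is expected to return.
    static List<Long> newerThan(List<Long> fetchedIds, long sinceId) {
        List<Long> result = new ArrayList<>();
        for (long id : fetchedIds) {
            if (id > sinceId) {
                result.add(id);
            }
        }
        return result;
    }
}
```

If the stored ids are 101, 105 and 103, the latest stored id is 105, so a timeline containing 104, 105, 106 and 107 yields only 106 and 107 as new tweets.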

Store Tweets in Couchbase

Finally, we want to store the retrieved tweets in Couchbase.

The value of the COUCHBASE_HOST environment variable is used to connect to the Couchbase instance. The value of the COUCHBASE_BUCKET_PASSWORD environment variable is used to connect to the secure bucket where all JSON documents are stored. It is critical that the bucket be password protected and that the password not be specified directly in the source code. More on this in a subsequent blog.
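
One way to read these two settings, sketched under the assumption that the function should fail fast when either variable is missing, is shown below. CouchbaseSettings and its members are hypothetical names; only the variable names COUCHBASE_HOST and COUCHBASE_BUCKET_PASSWORD come from the SAM template:

```java
import java.util.Map;

// Hypothetical holder for the two settings defined in the SAM template.
class CouchbaseSettings {
    final String host;
    final String bucketPassword;

    private CouchbaseSettings(String host, String bucketPassword) {
        this.host = host;
        this.bucketPassword = bucketPassword;
    }

    // Takes the environment as a map so the lookup can be exercised without
    // changing the real process environment.
    static CouchbaseSettings fromEnvironment(Map<String, String> env) {
        return new CouchbaseSettings(
                require(env, "COUCHBASE_HOST"),
                require(env, "COUCHBASE_BUCKET_PASSWORD"));
    }

    // Fails fast with a clear message when a variable is absent or empty.
    private static String require(Map<String, String> env, String key) {
        String value = env.get(key);
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + key);
        }
        return value;
    }
}
```

Inside the Lambda function this would be called as CouchbaseSettings.fromEnvironment(System.getenv()), so neither the host nor the password appears in the source code.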

The JSON document is upserted (insert or update) in Couchbase using the Couchbase Java API:

bucket.upsert(jsonDocument);

 

This Lambda Function has been running for a few days now, and has captured 258 tweets from @realDonaldTrump.

[Image: serverless-lambda-couchbase-twitter-bucket]

…And an interesting analysis of his tweets is coming shortly!



AWS Serverless Lambda Scheduled Events to Store Tweets in Couchbase

About The Author
- Arun Gupta is the vice president of developer advocacy at Couchbase. He has been building developer communities for 10+ years at Sun, Oracle, and Red Hat. He has deep expertise in leading cross-functional teams to develop and execute strategy, planning and execution of content, marketing campaigns, and programs. Prior to that he led engineering teams at Sun and is a founding member of the Java EE team. Gupta has authored more than 2,000 blog posts on technology. He has extensive speaking experience in more than 40 countries on myriad topics and is a JavaOne Rock Star. Gupta also founded the Devoxx4Kids chapter in the US and continues to promote technology education among children. An author of a best-selling book, an avid runner, a globe trotter, a Java Champion, and a JUG leader, he is easily accessible at @arungupta.
