Posts Tagged REST

Scala Parser Combinators => Win

Parser combinators are such a cool concept, and a very nice tool to have in your toolkit if you are working on external DSLs. I've been playing with them a little bit recently. Combining different parsers using higher-order functions is fun, especially if you are using Scala.

Parser combinators are provided as a library in Scala, on top of the core language. Let's use an example to walk through the details ...

Problem: HTTP's Accept header gives the client a way to say which content types it prefers. For this exercise let's just parse the header into a list of AcceptHeader objects. Sorting the list by quality factor (the q value) is trivial and not related to this discussion, so I'm skipping it.

According to the HTTP specification the grammar for the Accept header is  --

Accept = "Accept" ":" #( media-range [ accept-params ] )
media-range = ( "*/*"
| ( type "/" "*" )
| ( type "/" subtype )
) *( ";" parameter )
accept-params = ";" "q" "=" qvalue *( accept-extension )
accept-extension = ";" token [ "=" ( token | quoted-string ) ]

Example of such a header is: application/html;q=0.8, text/*, text/xml, application/json;q=0.9

Approach

The goal now is to parse the header value into a list of AcceptHeader objects, where an AcceptHeader is defined as a case class:

See below for a possible approach on parsing the accept header using the combinator technique:

Note: I did not implement accept-extension defined in the specification's grammar in this example.

Now let's look at various aspects of the code:

  • First look at the lazy val acceptEntry:
    • mediaType <~ "/" indicates that the result of parsing the slash ("/") is irrelevant; only the result on the left (that of mediaType) is carried forward.
    • ~ is a method in the Parsers trait that stands for the sequential combinator.
    • opt marks the quality factor (q) as optional.
    • ^^ is a method in the Parsers trait -- it has a parser on its left and a function on its right (one that does some case matching, in this case). If the parser on the left succeeds, the function on the right is applied to the parse result.
  • The subsequent lines expand and define each of the parsers defined in acceptEntry
    • regex is defined for media type and subtype allowing for alpha-numeric, hyphen (-) and asterisk(*) values
    • For qualityFactor: ";" ~> "q" ~> "=" ~> floatingPointNumber -- ignores all the parsed results on the left as we are only interested in knowing what the value of q is, which is defined as a floatingPointNumber
  • Now jump back to the first line, which says accept is rep1sep(acceptEntry, ","). rep1sep is a method in the Parsers trait. We are saying that the accept entry repeats one or more times, with entries separated by a comma (",").
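The pieces walked through above can be put together roughly as follows. This is a minimal sketch and not the original listing: the field names of AcceptHeader, the object name AcceptHeaderParser, and the parse helper are assumptions, and it expects the scala-parser-combinators module on the classpath (the package shipped inside the standard library in the Scala 2.8 era).

```scala
import scala.util.parsing.combinator.JavaTokenParsers

// Assumed shape of the case class; the field names are illustrative.
case class AcceptHeader(mediaType: String, mediaSubType: String, qualityFactor: Float)

object AcceptHeaderParser extends JavaTokenParsers {
  // One or more entries separated by commas.
  lazy val accept: Parser[List[AcceptHeader]] = rep1sep(acceptEntry, ",")

  // type "/" subtype, optionally followed by ;q=<number>.
  // The parentheses matter here: ~ binds tighter than <~.
  lazy val acceptEntry: Parser[AcceptHeader] =
    (mediaType <~ "/") ~ mediaSubType ~ opt(qualityFactor) ^^ {
      case t ~ st ~ q => AcceptHeader(t, st, q.map(_.toFloat).getOrElse(1.0f))
    }

  // Alphanumerics, hyphen and asterisk, as described in the text.
  lazy val mediaType: Parser[String] = """[a-zA-Z0-9*-]+""".r
  lazy val mediaSubType: Parser[String] = """[a-zA-Z0-9*-]+""".r

  // Drop ";", "q" and "=" and keep only the number (returned as a String).
  lazy val qualityFactor: Parser[String] = ";" ~> "q" ~> "=" ~> floatingPointNumber

  // Hypothetical helper: an empty list signals a parse failure.
  def parse(headerValue: String): List[AcceptHeader] =
    parseAll(accept, headerValue).getOrElse(Nil)
}
```

Calling AcceptHeaderParser.parse with the example header from earlier should give four entries, with the q value defaulting to 1.0 wherever the header omits it. Whitespace after the commas is skipped by the default whitespace handling of the regex-based parsers.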

You may test the functionality via

Output: List(AcceptHeader(application,html,0.8), AcceptHeader(text,*,1.0), AcceptHeader(text,xml,1.0), AcceptHeader(application,json,0.9))

We just scratched the surface here. Debasish Ghosh's DSLs in Action dedicates a chapter to parser combinators, which helped me quite a bit in furthering my understanding. (I highly recommend Ghosh's book if you are contemplating implementing DSLs.)


Conditional Requests with Lift

ETag (or entity tag) values and/or the Last-Modified time of the resource are typically used to make requests conditional. I'm only discussing ETags here; swapping in the Last-Modified time is trivial, so I'm skipping it.

In this post I'm concentrating on deep ETags, where the application developer can generate and compare ETags based on the underlying domain objects, database tables, etc.

The other kind of ETags, the shallow ones, can be supported at the framework level. They rely on a hash of the representation: a web framework can compute the ETag value from the response representation and compare it with the value the client sent. Shallow ETags are useful for saving bandwidth but do not eliminate the computation on the server side. (Expect a post on shallow ETags soon.)

Conditional GET

Conditional GET is a great way to conserve bandwidth. An intermediary cache may check with the origin server whether the resource has changed since it last received a representation. The server responds either with the new representation, if the resource state changed, or with only the headers and a 304 Not Modified status.

Let's start by defining a Product class that uses Lift's Mapper (as the ORM). Also note the use of the CreatedUpdated trait, which automatically adds two timestamp fields -- createdAt and updatedAt, set on insert and update operations respectively.

There are various strategies for generating ETags; I'm using one based on the updatedAt field (its Long value). Let's first see this in action and get back to the implementation details in a bit. I'm using cURL to test.

Request and Response for a Product of known ID

For subsequent requests the client sends the ETag value provided by the server. See the If-None-Match header in the request below. Adding this header makes the request a conditional one. If the resource hasn't changed, the server sends back only the headers with a 304 status (see below).

As far as implementation is concerned, the relevant portion of the code is provided below:


The value of the If-None-Match header from the request is compared with the resource's ETag value, and the server then responds with either 304 (Not Modified) or 200 (OK). Note that the value of If-None-Match can be a list of ETag values separated by commas, which is accounted for in the code above. NotModifiedResponse used above could very well be a standard subclass of LiftResponse in the framework. Regardless, you can create one yourself as a thin wrapper around Lift's InMemoryResponse.
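Separately from the Lift response classes, the header comparison itself can be sketched in plain Scala. This is only an illustration under stated assumptions: the object and function names are hypothetical, the ETag is assumed to be the quoted epoch-millisecond value of updatedAt, and the comma split is naive (it assumes entity tags never contain commas).

```scala
object ConditionalGet {
  // Assumed ETag scheme: the quoted epoch-millisecond value of updatedAt.
  def etagFor(updatedAtMillis: Long): String = "\"" + updatedAtMillis + "\""

  // If-None-Match uses weak comparison, so a leading W/ is ignored.
  private def normalize(tag: String): String = tag.trim.stripPrefix("W/")

  // Naive split on commas; assumes entity tags do not contain commas themselves.
  def parseTags(headerValue: String): List[String] =
    headerValue.split(",").toList.map(normalize).filter(_.nonEmpty)

  // 304 when any supplied tag (or the * wildcard) matches the current ETag, else 200.
  def statusFor(ifNoneMatch: Option[String], currentEtag: String): Int = {
    val matched = ifNoneMatch.map(parseTags).exists { tags =>
      tags.contains("*") || tags.contains(normalize(currentEtag))
    }
    if (matched) 304 else 200
  }
}
```

A request without the header is treated as unconditional and gets the full 200 response; only a matching tag short-circuits to 304.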

Conditional PUT

Conditional PUT is a great way to ensure that the client is updating the most recent version of the resource state. The client does a GET first, obtains the ETag value, and uses it in the If-Match header (see below). The use of If-Match makes the update a conditional request. The server can enforce this by rejecting any update that lacks an If-Match header.

If the ETags match, the resource state is updated. The server responds with 204 (No Content) and the new ETag value.

Suppose some other client that doesn't have the updated ETag value tries to send an update. The server responds with 412 (Precondition Failed) and the new ETag header value (shown below).

Implementation-wise, the code below compares the ETags and responds with either 204 or 412, indicating success or failure of the conditional update. (It also checks the request's content type and the existence of the resource and responds appropriately.)
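The decision behind those status codes can be sketched in plain Scala, independent of Lift. The names here are hypothetical, and two choices are assumptions rather than anything the post states: a missing resource maps to 404, and an update without If-Match is rejected with 428 (Precondition Required). The content-type check is left out.

```scala
object ConditionalPut {
  // currentEtag is None when the resource does not exist.
  def statusFor(ifMatch: Option[String], currentEtag: Option[String]): Int =
    (currentEtag, ifMatch) match {
      case (None, _)       => 404 // nothing to update
      case (Some(_), None) => 428 // assumed choice for an unconditional update
      case (Some(current), Some(sent)) =>
        // If-Match may carry several tags, or * for "any current version".
        val tags = sent.split(",").map(_.trim)
        if (tags.contains("*") || tags.contains(current)) 204 else 412
    }
}
```

A stale client, one whose tag no longer equals the current ETag, falls through to 412 and has to GET the resource again before retrying.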

Just as in the case of GET, I added NoContentResponse and PreConditionFailedResponse; both are wrappers around InMemoryResponse.

Complete source of the service is here, just in case.


Book Review: REST in Practice

The Book

Title: REST in Practice (Hypermedia and Systems Architecture)

Authors: Jim Webber, Savas Parastatidis, Ian Robinson

Publisher: O'Reilly Media

Review

A couple of years ago, the authors of this book penned one of the finest articles explaining the principles of REST, titled How to GET a Cup of Coffee. I was thrilled when I first heard that the same authors were expanding the concepts into book form! Now that the book is out and I have finished reading it, here are some of my thoughts ...

This book covers a wide spectrum of ideas related to RESTful systems, including RPC-style systems, CRUD-based services, hypermedia systems, caching, Atom syndication and publishing protocol, security, and the semantic web. The key is to see HTTP as an application-level protocol and not as a transport protocol. Starting from that basic understanding, each chapter in the book deals with various integration challenges in the enterprise. The heart of this book is its focus on building systems in a web-centric way.

As the concepts evolve from chapter to chapter they are evaluated against Richardson's maturity model. At the base of Richardson's model are the systems that use RPC style (HTTP as a transport protocol). The next level up, Level 1, holds the systems that work in a resource-oriented model. Endpoints give way to thinking in terms of resources and URIs (e.g., an OrderRequest endpoint where a particular function on an order is invoked vs. Order as a resource, example.org/order/1234).

Going up the pyramid, Level 2 maturity is attained by conforming to a uniform interface (HTTP verbs) and well-known HTTP response codes. There are many systems that claim to be RESTful but don't go beyond Level 2 (I don't want to sound pedantic, just pointing it out!). There are other articles and books with good details about Level 1 and Level 2 systems. If there is one takeaway from this book, I have to say it's the understanding of Level 3 of the maturity model, hypermedia systems. The HATEOAS (Hypermedia as the Engine of Application State) principle has often been discussed in various forums but is perhaps not that well understood.

As with their InfoQ article, Restbucks, a coffee store web application, is built up as the discussion proceeds from simple concepts to more advanced ones. It is a domain that almost everyone can identify with, which puts the focus on the technical discussion rather than on domain model intricacies. REST in Practice, as the name suggests, takes the approach of implementing the concepts as they are discussed; Java and .NET are used in the book. Reading code is sometimes easier than understanding abstract concepts, if you are like me.

The discussion of Atom and the Atom Publishing Protocol is one of the best. If you don't have a low-latency (microsecond-level) requirement, the Atom format deserves serious consideration when designing event-driven systems. In the penultimate chapter the authors compare Web services (SOAP and the WS-* stack) with web-based (REST) systems. They compare both models at great length with respect to security, reliability, and transaction management (including two-phase commits). A compelling read for anybody who is trying to get the hang of what these models offer.

Subbu Allamaraju's RESTful Web Services Cookbook is one of my favorites on the topic. I was fortunate enough to read that book during its draft stage, which helped me immensely in understanding and reinforcing some of my concepts. This book, REST in Practice, helped me further in understanding more advanced topics like the semantic web (RDF, OWL) and event-driven system integration. I thoroughly enjoyed the book, and would certainly recommend it to all REST enthusiasts (and doubters).


JavaOne 2010 – My Impressions (Part 1)

Bad

Let's start with the bad, so that everything else is an improvement by comparison. Yes, the food at the conference was awful. I thought it was only me, looking out for vegetarian options, but apparently many people I know who eat mostly anything under the Sun (no disrespect) were equally disappointed. There is huge scope for improvement in this area.

Some other aspects that need attention -- for the first couple of days it was a huge effort to find the right conference room in the Hilton and in Parc 55; folks sitting towards the back couldn't see the slides that well because of the placement of the screen and projector; and the wifi was poor and worked only intermittently. I hope Oracle considers moving JavaOne to Moscone in the future.

Good

Now that we have put the worst behind us I will go to the other extreme and explain what was best. Yes, I attended some good sessions, which I will go over in a minute. But the best part was meeting people and exchanging ideas. I got to meet a whole lot of people I have known for a while, people I had worked with online or exchanged tweets with but was meeting in person for the first time. What's more, I even found a childhood friend with whom I had lost contact for the last few years!

Sessions

I will summarize some of the sessions that I liked at the conference (skipping the ones that didn't appeal to me much):

Script Bowl 2010: A Scripting Languages Shootout

Clojure, Groovy, JRuby and Scala are the contending languages this time. They were presented by Rich Hickey, Graeme Rocher, Nick Sieger and Dick Wall respectively. Roberto Chinnici of Oracle acted as a judge/coordinator. Two rounds of presentations took place -- first round dealing with the language features and the second one presenting community features. Liked what I saw about Spock.

It was all in rapid-fire mode, as the time allotted was only one hour. I had expected a competition in which a specific problem or two would be assigned to each language's representative, followed by an evaluation of how each language approaches the problem. But apparently that was not the case.

The winner was announced based on the audience applause at the end, just for fun, I guess. Groovy came out on top, with Scala a close second.

Advanced Java API for RESTful Web Services (JAX-RS)

Paul Sandoz and Paul Chinnici started off with an overview of JAX-RS -- going over basic annotations (@Path, @Produces, etc.). Runtime resource resolution was dealt with in good detail. The middle part of the presentation covered integrating JAX-RS with EJBs and CDI (Yawn! I'm not much into CDI).

Then they woke me up with some good discussion over content negotiation and conditional request support of JAX-RS. JAX-RS has got great content negotiation support covering media type, character set, language, and content encoding. Variant API is nice and flexible.

Equally nice is its support for conditional requests (GETs and PUTs) for caching representations and for concurrency control. They demoed ETag validation and the evaluatePreconditions flow.

"If you think you understand Java Generics, then actually you don't understand it!" - a quote from the presenters that drew some laughs. The speakers went over how they had to deal with the type erasure issue and a workaround of using GenericEntity to preserve type information at runtime. They finished the presentation by discussing the pluggable exception handling mechanism of JAX-RS.

Overall, a nice presentation. I had expected them to discuss hypermedia support in Jersey, but they couldn't get to it because of time constraints, as Paul Sandoz told me when I asked him about it after the talk.

The Next Big Java Virtual Machine Language (NBJL)

Stephen Colebourne of OpenGamma was the speaker for this talk, which was potentially controversial from the title itself. There are a lot of folks who swear by their language of choice, so, for what it's worth, I thought it would be interesting to see what conclusion the presenter would arrive at, and based on what analysis. (There is also another section of people who completely reject the notion of an NBJL. They consider that polyglot programming is here to stay, that there is not going to be one big successor, and that you choose the language that best serves the task at hand. Let's leave that viewpoint for another day!)

Colebourne defined what he thinks a next big language would be, one that's widely used (big job market) with supporting ecosystem and community. Examples: C, C++, Java, C#, Cobol, VB, Perl, PHP, Javascript, etc. So NBJL is one that challenges Java and eventually displaces it.

A lot of design decisions were made at various points while Java was evolving. Many of them appeared to be right at the time, and only after some experience did people start to realize that there are better approaches. The speaker went over a few items that are sore points in Java and what we have learnt from real-world experience, from technology advancements, and from the new breed of languages. Colebourne went over checked exceptions, primitives, arrays, the 'everything is a monitor' concept, static methods, method overloading, and generics as some areas where an NBJL could provide solid alternatives.

Colebourne presented what Java can do to evolve while still maintaining the "feel of Java". The bigger challenges for Java at this point are supporting Properties, Continuations, Control abstraction, Traits, Immutability, Design by Contract, and Reified generics. Another important point the speaker made was that the NBJL should be a "blue collar" language, a language of the masses. An average developer should be able to pick it up rather quickly and be productive in a reasonable time frame.

Towards that end the presenter looked at various languages (Clojure, Groovy, Scala, Fantom) and discarded every one of them with some justification. Here is where it gets interesting. He concluded with the thought that maybe Java should come up with a backward-incompatible version and fix its warts. Regardless of how you respond to the concluding thought, it was an excellent presentation overall, covering a wide spectrum of concepts. If you are interested in hearing from the presenter himself, check this out.

NoSQL Alternatives: Principles and Patterns for Building Scalable Applications

Nati Shalom of GigaSpaces presented data scalability patterns that emerged out of various NoSQL projects. He compared the new breed of technology against the traditional database scaling approach in terms of consistency, transaction and query semantics.

Some of the factors affecting scalability needs -- social networks have changed the web experience in recent years. Read-mostly applications have become read/write, and mostly predictable traffic has become viral, with huge spikes at certain times triggered by events. The SaaS model and the cloud were mentioned as factors too. The speaker suggested that the economic downturn has forced corporations to work efficiently, and that throwing more expensive hardware at the problem is no longer an appealing solution.

Another factor the speaker suggested was that disk failures are up and a lot higher than what the vendors actually report (3% actual vs 0.5% reported). With advancements in technology, memory can be 100 - 1000x more efficient than disk. RAMClouds become much more attractive for applications with high throughput requirements. New hardware makes it possible to store the entire data set in memory.

The presenter went over various alternatives: In-memory (GigaSpaces), Key/Value, Column (BigTable, HBase), Document model (CouchDB), Graph (Neo4J). Shalom discussed common principles behind the NoSQL alternatives, such as: design for failure; scale through partitioning of the data; maintain reliability through replication; provide flexibility between consistency, availability and partition failure; dynamic scaling. Some other (not so common) principles discussed were document model support, SQL query support, MapReduce, transaction management, and security.

To be continued ...

See here for part 2 of the series.


Lift and Content Negotiation

Overview

This is a follow-up to my previous post on Lift and REST, which discussed URI matching and content negotiation. Lift supports the HTTP Accept header by responding with a representation that matches the media type provided in the header. However, out of the box it currently doesn't support the quality factor (q parameter) of the Accept header. Here I will sketch an approach for adding that support, and along the way let's explore another compelling feature of Lift's REST support.

HTTP Accept Header

Check RFC 2616 for detailed explanation on Accept and other HTTP headers. Let's go over this based on an example: If a client sends an Accept header something like the following --

Accept: application/xml; q=0.8, application/json

is interpreted as: I prefer JSON representation but if you don't have it XML is my second choice.

The quality factor, or q parameter, is what specifies the preference. q is a decimal ranging from 0.0 to 1.0 and follows the media type, separated from it by a semicolon (;). If no q parameter is specified it defaults to 1.0, indicating first choice among the options provided.
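To make the defaulting rule concrete, here is a small plain-Scala sketch that ranks the entries of an Accept header value by q. It is not the code from the gist mentioned later in this post; AcceptRanking, MediaPreference and rank are hypothetical names, other media-type parameters are ignored, and an unparsable q value is assumed to fall back to 1.0.

```scala
import scala.util.Try

object AcceptRanking {
  final case class MediaPreference(mediaType: String, q: Double)

  // Split on commas, read an optional q=<value> parameter, and sort by q (highest first).
  def rank(headerValue: String): List[MediaPreference] =
    headerValue.split(",").toList.map(_.trim).filter(_.nonEmpty).map { entry =>
      val parts = entry.split(";").map(_.trim)
      val q = parts.tail
        .find(_.startsWith("q="))
        .flatMap(p => Try(p.drop(2).toDouble).toOption)
        .getOrElse(1.0) // no q parameter means 1.0
      MediaPreference(parts.head, q)
    }.sortWith(_.q > _.q) // stable sort: ties keep their order in the header
}
```

For the header value used above, rank("application/xml; q=0.8, application/json") puts application/json (q defaulting to 1.0) ahead of application/xml (0.8).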

Approach

Before going into the approach for supporting the q parameter (for your Lift-based application), let's look at one of the things that you definitely want to see in a web framework: decoupling business logic from the representation. Lift doesn't disappoint in this area. In my case the same business logic (authorizing the user and making the database lookups) serves every request, and the appropriate representation is returned independently of it.

serveJx in RestHelper is there precisely for that purpose. The following code provides the URI matching rule (matches /api/user/{user_id} for GET requests) and returns an object that implements the Convertable trait. It also defines an implicit def that converts the object to the appropriate representation (XML or JSON, in this case).

Relevant pieces of the case class User are provided below. If you are familiar with Squeryl you may have already recognized the annotations on the constructor arguments; otherwise don't worry about them. Squeryl is an excellent Scala-based ORM (I like Squeryl quite a bit, but that's a topic for another blog post!). User implements Convertable, meaning it provides the two representations -- XML and JSON, via the toXml and toJson functions respectively.

So with all the above in place, the following request would result in an XML or JSON response based on the Accept header. The logic for identifying the appropriate representation is in RestHelper's implicit def jxSel, shown below. As RestHelper does not support the q parameter out of the box, UserManagementService, which extends RestHelper, changes the behavior by overriding jxSel.

Actual Parsing: Parsing of the Accept header and determining the representation is done using functions in the ClientMediaPreference object. Instead of embedding the code in this already lengthy post, here is a link to the gist covering the parsing logic.

Conclusion

There are a couple of areas where I still have to tighten up the code, but that's the general idea. One of the to-do items is to send a 406 Not Acceptable response if the representation that a client asks for is not implemented by the server.
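The 406 item could look roughly like this in plain Scala. It is a hedged sketch rather than the service's code: Negotiation and choose are hypothetical names, the input is assumed to be (media type, q) pairs that were already parsed from the Accept header, and wildcard ranges such as */* are not handled.

```scala
object Negotiation {
  // preferences: (media type, q) pairs as sent by the client.
  // Right(mediaType) is the best supported match; Left(406) means none is acceptable.
  def choose(preferences: List[(String, Double)], supported: Set[String]): Either[Int, String] =
    preferences
      .filter { case (_, q) => q > 0.0 } // q=0 marks a type as not acceptable
      .sortWith(_._2 > _._2)             // highest q first, ties keep header order
      .collectFirst { case (mediaType, _) if supported.contains(mediaType) => mediaType }
      .toRight(406)
}
```

Returning an Either keeps the decision separate from how the framework builds the actual response, so the same function can back both the XML/JSON selection and the error path.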

Before closing, let's continue checking off another item from Tilkov's litmus test (as we did in the last post) ...

Can I easily use the same business logic while returning different content types in the response?

Answer: Yes, the framework is flexible in this aspect.


Lift and REST: URI Matching and Content Negotiation

As a Scala enthusiast I'm currently evaluating Lift, particularly its REST support. This could become a series of posts on the topic as I try to understand the framework better and maybe subject it to the litmus test proposed by Stefan Tilkov. Of course, I will not be making a judgment call on whether the framework is RESTful or not, as I don't want to get into a dogmatic discussion!

So far I like what I see: it is a productive framework with a nice set of features (and has cherry-picked some of the best ideas from other frameworks too), and what's more, it's Scala-based (my current favorite language).

Setup

Getting started instructions on Liftweb's wiki worked perfectly fine for me. I used instructions for Maven and I'm using Scala 2.8.

URI Matching

RestHelper is the trait that you may want to extend for building your REST-based web services. Lift's URI dispatch follows the templating approach, in which part of the URI is left to be filled in by the client before it is submitted. For example, id can be sent dynamically by the user: http://example.org/api/user/{id}

The following block of code can take care of the above URI matching ...

The List of strings holds the tokens after each "/" (forward slash) in the URI. The above code indicates that a GET request with that specific URI pattern can be handled, and that when such a pattern is encountered the method userDetails is invoked. If the URI is http://example.org/api/user/101, 101 is bound to the id variable.

Box[LiftResponse] is the return type that's expected. Box goes by the same notion as Option in Scala. When you have stuff to return you would respond with Full(LiftResponse) and Empty when there is no response. So here is an example of how you can URI match and dispatch for a sample CRUD (I'm using simple User account management).

Ideally I would like to write this in a single case block without breaking it into multiple units as I did above. But when I add more than three case statements of this complexity (which is not that much, IMO) the Scala compiler takes way too long and eventually gives up with an OffsetTooBigException. This issue has been in the bug tracker for a while now, and the above approach of splitting the matchers into multiple units is based on the workaround suggested there.

The pattern matching is flexible and works great for multiple tokens too. Something like http://example.org/user/{userId}/address/{addrId} can be extracted using a very similar pattern like above with values for userId and addrId binding to their corresponding variables.
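The token-list idea can be tried out without Lift at all. The sketch below is plain Scala with hypothetical names (UriMatching, tokens, describe); it uses Option where Lift would use Box and matches on the HTTP method and the path tokens directly, so it illustrates the pattern shape rather than Lift's actual extractors.

```scala
object UriMatching {
  // Split "/api/user/101" into List("api", "user", "101").
  def tokens(path: String): List[String] = path.split("/").toList.filter(_.nonEmpty)

  // Option stands in for Lift's Box: Some when a pattern matches, None otherwise.
  def describe(method: String, path: String): Option[String] =
    (method, tokens(path)) match {
      case ("GET", "api" :: "user" :: id :: Nil) =>
        Some(s"user $id")
      case ("GET", "user" :: userId :: "address" :: addrId :: Nil) =>
        Some(s"address $addrId of user $userId")
      case _ =>
        None
    }
}
```

Literal strings in the pattern must match exactly, while lowercase names such as id, userId and addrId bind to whatever token sits in that position.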

Here is an excellent article on Scala's partial functions and pattern matching that you may find useful in the context of this discussion.

Content Negotiation

Content negotiation is one of the core concepts of RESTful systems where a client can indicate which media type(s) it prefers. Also within the media types it can specify the order of preference. Client does this by using Accept header. For example, consider the following Accept header --

Accept: application/xml;q=0.8, application/json;q=0.9

It indicates to the server that a) it can accept XML and JSON formats and b) it prefers JSON over XML, by providing a higher value for the q parameter. (The q value ranges from 0.0 to 1.0; a higher value indicates a stronger preference.)

Ok, so how does Lift fare in this area? Mixed results --

  • It recognizes which media type to serve by the Accept header. So for example Accept:application/json header from the client is matched to case "api" :: "user" :: id :: _ XmlJson _ => ...
  • Similarly for XML, an Accept header with text/xml is matched to the XmlGet method just fine. One issue I encountered here is that it only recognizes text/xml as the XML media type and not application/xml. application/xml is actually preferable to text/xml, generally speaking. One reason off the top of my head is that text/* media types ignore the encoding specified in the content, so if you declare a specific encoding (e.g., UTF-8) in the XML declaration it will be ignored.
  • It doesn't respect the q parameter value. So for a header such as Accept: text/xml; q=0.8, application/json;q=0.9 it still serves the client XML, as it only checks whether text/xml is present in the header.

So let's look at Tilkov's first question in the list:

Does the framework respect that an HTTP message does not only consist of a URI? I.e., is dispatch done at least based on the HTTP verb, the URI, the Content-type and Accept headers?

My answer, at this point: Yes dispatch works great in terms of -- HTTP verb, the pattern matching URI templates and Accept headers (partially). Lift has to get its act together in further tightening its content negotiation support.

[I would love to contribute some patches for this and more (like hypermedia support, which I will elaborate on in future posts). Lift folks, you have a volunteer here!]

Stay tuned!

Update: Follow-up is posted here: Lift and Content Negotiation
