Wednesday, September 9, 2009

Resource Oriented Architectures: Being "In the Web"

REST seems like an interesting approach to simplifying how a client gets the data it needs from a particular web service. Its main benefit comes from reducing itself to four verbs, GET, POST, PUT, and DELETE, which lets the builders and maintainers of sites simplify their outward-facing structure. As long as these four know how to handle themselves, it doesn't much matter what's going on behind the scenes or how it is getting done: the user is pulling the data they need and making updates as necessary.
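To make the four-verb idea concrete, here is a minimal sketch in Python. It is not from the article: the `ResourceStore` class, the `/accounts` path, and the status codes chosen are my own assumptions, and a real service would sit behind an HTTP server rather than a method call.

```python
from typing import Dict, Optional, Tuple


class ResourceStore:
    """Hypothetical in-memory backend; callers only ever see the four verbs."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}
        self._next_id = 1

    def handle(
        self, verb: str, path: str, body: Optional[str] = None
    ) -> Tuple[int, Optional[str]]:
        """Return (status code, payload) for one request."""
        if verb == "GET":
            # Read a resource without changing anything.
            if path in self._items:
                return 200, self._items[path]
            return 404, None
        if verb == "POST":
            # Create a new resource under a collection; the store picks the id.
            new_path = f"{path.rstrip('/')}/{self._next_id}"
            self._next_id += 1
            self._items[new_path] = body or ""
            return 201, new_path
        if verb == "PUT":
            # Create or fully replace the resource at a known path.
            created = path not in self._items
            self._items[path] = body or ""
            return (201 if created else 200), None
        if verb == "DELETE":
            if path in self._items:
                del self._items[path]
                return 204, None
            return 404, None
        # Anything outside the four verbs is rejected.
        return 405, None
```

Whatever replaces the dictionary behind `handle` (a database, another service), the caller's side of the conversation stays the same, which is the decoupling the approach is selling.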

The issue, however, is that the author seems to believe that this should be a universally adopted format, that all the information connected to the web should be accessible in this way. But there are some serious problems with this. Firstly, the author seems supremely confident that relying on already in-place security measures will prevent the leakage of data. Fine, but what about instances of data trading that happen entirely with the consent of those who control the data sources, but not of the "owner" of the data? What if my credit card company decides to announce to a few corporate partners that it isn't blocking GETs to customer usage data? Also, I find the claim that this system will be more secure because it is easier to understand spurious at best. I will grant that easier-to-understand systems give a better chance of closing obvious holes, but I cannot fathom that this will act as a panacea for security woes, which, given how much time the author spends talking about those concerns, seems to be his idea. I suppose we will simply have to have faith that SSL is going to be used a lot more in accessing these things (as well as for almost all PUT and POST requests, to prevent a duped context from spamming the servers and data backend into oblivion).

Finally, the last point about putting data at the forefront: I'm not sure I really see the benefit of this system as a user. Perhaps tying in some of the metadata captured in the RDF will help with applications of the semantic web, but when I use web services I am rarely going to these locations for static data. I go for processes I can run on data I have or data located there, and not a lot of what I am after is basic reporting or pinpoint updating. In these cases, the caching advantage disappears (since I'm running specific-to-me GETs), and the simplification to four verbs increases the number of middle steps that must be taken, either by adding an additional call between what I am trying to do and the backend processes, or by breaking it down into many smaller component calls.
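The caching point can be illustrated with a toy model. This is only a sketch under my own assumptions: the cache is keyed purely on the URL, nothing ever expires, and the `/report/daily` URLs are made up.

```python
from typing import Iterable, Tuple


def cache_stats(urls: Iterable[str]) -> Tuple[int, int]:
    """Count (hits, misses) for a cache keyed only on the request URL."""
    seen = set()
    hits = 0
    misses = 0
    for url in urls:
        if url in seen:
            hits += 1
        else:
            # First time this exact URL is requested: go to the origin.
            seen.add(url)
            misses += 1
    return hits, misses


# Four users asking for the same static report share one cached copy.
shared = ["/report/daily"] * 4

# Four users asking for their own view each produce a distinct URL.
personal = [f"/report/daily?user={u}" for u in ("ann", "bob", "cy", "dee")]
```

Under these assumptions `cache_stats(shared)` gives three hits and one miss, while `cache_stats(personal)` gives four misses: once every GET is specific to me, there is nothing for anyone else's request to reuse.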

I find it odd that the author also insists that this technique will help with load balancing problems because the entirety of the context is passed in as an argument, which makes switching to a less-burdened server "easier". I'll grant that this would make switching simple (after all, everything we need to know about the request is passed in), but what about the underlying data needed to make this happen? Unless I am lucky enough to hit the cache, I will still have to hit the underlying data at some point, and unless that data is stored in this other location as well, I am simply passing the buck down to the back-end rather than keeping it at the middle tier.
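A rough sketch of what I mean, with everything in it assumed for illustration: two hypothetical middle-tier servers, each with its own cache, sharing a single `Backend`, and a round-robin `dispatch` standing in for the load balancer.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import Dict, List


@dataclass
class Backend:
    """The single underlying data store that every server depends on."""
    data: Dict[str, str]
    hits: int = 0

    def read(self, key: str) -> str:
        self.hits += 1
        return self.data[key]


@dataclass
class Server:
    """Stateless middle-tier server: all context arrives with the request."""
    name: str
    backend: Backend
    cache: Dict[str, str] = field(default_factory=dict)

    def handle(self, request: Dict[str, str]) -> str:
        key = request["resource"]
        if key not in self.cache:
            # A cache miss on this server still lands on the shared backend.
            self.cache[key] = self.backend.read(key)
        return f"{self.name}:{request['user']}:{self.cache[key]}"


def dispatch(servers: List[Server], requests: List[Dict[str, str]]) -> List[str]:
    """Round-robin the requests; any server can answer any of them."""
    pool = cycle(servers)
    return [next(pool).handle(request) for request in requests]
```

Because each request carries its full context, either server can answer it. But if the same resource is requested twice and the two requests land on different servers, the backend is read twice, so the load has moved sideways in the middle tier without getting any lighter underneath.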

I suppose the summary of my thoughts here would be this: other than as a way to enforce clean interfaces between page requests and the underlying mechanics, which makes those mechanics easy to change, I'm not really seeing a point to going through the effort of changing your system to use this. That is especially true since most users wouldn't see a single change: building the system to be easy to use would reduce all these calls to the same links and buttons they already use.
