Condos and Clouds: Patterns in SaaS Applications Q&A
For context, please make sure you've attended the webinar, presented by Pat Helland.
Q: Is Google Docs SaaS?
A: You could look at Google Docs as a SaaS. I have no problem with that. You can also look at things like Amazon.com (retail) as a SaaS.
Q: Can you provide good references that address the security issues involved in Cloud Computing?
A: Fundamentally, there are no really good references to address security in the cloud. Security in the cloud is about trust: If I have my own datacenter, I can manage physical security for my app. If I go to a datacenter hosting company, I can get my own racks, network, and VPN but still need to trust the provider with physical access and not to screw up the power and air conditioning. If I go to an IaaS provider (e.g. Amazon AWS), I need to trust that they won't eavesdrop on my data. If I go to a PaaS provider (e.g. Salesforce) that stores my data in the same database tables as other customers, I need to trust that the data will neither be snooped on nor lost.
Q: In composite request services, what do you do if a service is not available? (Even with redundancy it can fail.)
A: In most composite environments, the service is a pool. As mentioned in the talk, the services are stateless (but may go fetch state from a session manager). The underlying "concierge services" need to ensure the availability of these composite request services. That means having a pool, increasing the provisioning if the pool is getting low, and restarting machines when they fail. One additional answer that is different but relevant has to do with designing complex systems to cope when only a subset of the services is available. In the 2007 Dynamo paper, Amazon quoted more than 150 service requests per page render for the Amazon.com retail site. If you, as a user, watch the page refresh, you can see the responsiveness of the various portions of the page (e.g. the recommendations may be slow while the reviews may be fast). In many cases, the sites are designed to cope with different response times and even missing portions of the work.
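As a rough illustration of a page that copes with slow or missing portions, here is a minimal Python sketch. It is not how any particular site does it; the function name render_page, the service names, and the timeout value are all assumptions made up for the example. Each service is called in parallel, and any portion that fails or misses the shared deadline is simply left out (returned as None) so the rest of the page can still render.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def render_page(services, timeout_s=0.2):
    """Call every service in parallel under one shared deadline.

    `services` maps a page-portion name to a zero-argument callable.
    A portion that raises or misses the deadline is returned as None.
    (Hypothetical helper, for illustration only.)
    """
    results = {}
    pool = ThreadPoolExecutor(max_workers=max(1, len(services)))
    try:
        futures = {name: pool.submit(fn) for name, fn in services.items()}
        deadline = time.monotonic() + timeout_s
        for name, future in futures.items():
            remaining = max(0.0, deadline - time.monotonic())
            try:
                results[name] = future.result(timeout=remaining)
            except Exception:
                # Timeout or service failure: render the page without it.
                results[name] = None
    finally:
        # Do not block the page on stragglers still running in the pool.
        pool.shutdown(wait=False)
    return results


def slow_recommendations():
    time.sleep(0.5)  # simulates a service that misses the deadline
    return ["book A", "book B"]


def broken_ads():
    raise RuntimeError("ad service is down")


page = render_page(
    {
        "reviews": lambda: "4.5 stars",
        "recommendations": slow_recommendations,
        "ads": broken_ads,
    },
    timeout_s=0.1,
)
print(page)
```

The design choice here is a single deadline for the whole page rather than one timeout per service, so one slow portion cannot stretch the total render time.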
Q: Please also relate key performance indicators (KPIs) to SLAs. How do we know the SLA is being met?
A: From my perspective, KPIs are a set of criteria that includes the SLAs (and arguably they are just different nomenclature for the same thing). You know the SLA is being met by measure, measure, measure! It is essential to have monitoring tools that capture each incoming request and its matching response and record how long the request takes. This should typically be expressed as a percentile (e.g., 99.9% of the requests completed within X ms). It is also very important to capture these SLA measurements on the components within the stack so that the lower-level services can have their responsiveness tracked, too.
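To make the percentile idea concrete, here is a small Python sketch of the check "at least N% of requests completed within X ms". The function name sla_met and the sample numbers are assumptions for illustration, not part of any monitoring product.

```python
from fractions import Fraction


def sla_met(latencies_ms, threshold_ms, target_pct):
    """Return True if at least target_pct percent of the measured
    requests completed within threshold_ms. (Hypothetical helper.)"""
    if not latencies_ms:
        raise ValueError("no requests measured")
    within = sum(1 for ms in latencies_ms if ms <= threshold_ms)
    # Exact rational arithmetic avoids float rounding right at the boundary.
    achieved = Fraction(within, len(latencies_ms))
    required = Fraction(str(target_pct)) / 100
    return achieved >= required


# 1,000 made-up samples: 999 fast requests and one slow outlier.
samples = [20] * 999 + [900]
print(sla_met(samples, threshold_ms=100, target_pct=99.9))
```

The same function can be run against the measurements of each lower-level service, each with its own threshold.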
Q: Is this the start of a programming language of patterns (PLOP) for cloud services?
A: You could look at this as a programming language pattern but I tend to think of this as more of an API to a platform as a service.
Q: How can we handle the security problems?
A: As I mentioned above, the security issues revolve around trust. Providers of the cloud computing services (be it Infrastructure-as-a-Service, Platform-as-a-Service, or Software-as-a-Service) need to uphold a level of trust. My wife is reluctant to provide our credit card to anyone on the web but Amazon.com has earned her trust and our credit card is used on that web site with great frequency. The same holds true up and down the stack as we implement both Infrastructure-as-a-Service and the emerging Platform-as-a-Service solutions.
Q: The services that you are calling, can they be in different public clouds??
A: Well…maybe they could be in different public clouds. We are not yet seeing the kind of SLA guarantees that would make me expect that soon, though.
Q: Are you talking about three-layer object class/component design or otherwise?
A: This is much less about an object/class/component design than it is about a service-oriented architecture. Objects/classes/components are GREAT but I tend to see them as similar to using standards available at Home Depot in building a home or office building. Knowing that the specs for a doorknob are standard makes it easier to construct the doors in a building. When looking at the interaction across services, it's much more like ensuring that the flow of boxes delivered by FedEx from one building to another meet the expected shape and form of the boxes. Now, when I talk about the "Condos and Clouds" analogy, it's kind of like ensuring I have a large enough loading dock to receive the incoming boxes delivered by FedEx. Comparing this to objects/classes/components is not quite the point (even though that is awesome technology).
Q: Caches are implemented in RDBMS?
A: Maybe…RDBMSs are cool but frequently have scaling challenges. Building a cache in an RDBMS would (perhaps) be easy at first but may start to crumble under web scale. Typically, you would use a key-value store of some kind, which is likely to have better scaling characteristics. Now, don't get me wrong, RDBMSs have a lot of value but that value is not really in providing this kind of cache. So, an RDBMS would likely be the wrong choice since its benefits are not super helpful here and its weaknesses (i.e. scale) can be a big problem.
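For readers who want to see the shape of a key-value cache, here is a minimal Python sketch. It assumes a plain in-memory dict as a stand-in for a real key-value store; the class name KeyValueCache, the time-to-live policy, and the loader callback are all hypothetical choices for the example.

```python
import time


class KeyValueCache:
    """In-memory stand-in for a key-value store used as a cache.
    (Hypothetical class; a real deployment would talk to a
    distributed store instead of a local dict.)"""

    def __init__(self, ttl_s, clock=time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value, or call loader(key) on a miss
        or after the entry has expired, and cache the result."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader(key)
        self._entries[key] = (now + self._ttl_s, value)
        return value


backend_calls = []


def load_from_backend(key):
    backend_calls.append(key)  # stands in for an expensive lookup
    return key.upper()


cache = KeyValueCache(ttl_s=60)
first = cache.get_or_load("product-42", load_from_backend)
second = cache.get_or_load("product-42", load_from_backend)
print(first, second, len(backend_calls))
```

Nothing here needs joins, transactions, or a schema, which is why the strengths of an RDBMS do not buy much for this job.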
Q: How far does the user connection speed affect the performance of a cloud application? Do cloud applications define a minimum speed requirement?
A: The typical cloud application doesn't count the user connection as a part of the SLA. Typically, you capture the time of the incoming request when it hits your servers and then catch the time your answer leaves the cloud application. This gives you your user-facing SLA.
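One way to picture this server-side measurement is a small timing wrapper around a request handler. The following Python sketch is an assumption-laden illustration: the decorator name timed, the handler, and the list used to collect samples are invented for the example.

```python
import time
from functools import wraps


def timed(record):
    """Decorator that reports how long a handler took, in milliseconds,
    from the moment the request reaches the handler until the response
    is ready to leave. Network time to the user is not included.
    (Hypothetical helper.)"""

    def decorator(handler):
        @wraps(handler)
        def wrapper(request):
            start = time.perf_counter()
            try:
                return handler(request)
            finally:
                # Record even when the handler raises.
                record((time.perf_counter() - start) * 1000.0)

        return wrapper

    return decorator


latencies_ms = []


@timed(latencies_ms.append)
def handle(request):
    return request.upper()  # stands in for real application work


response = handle("hello")
print(response, len(latencies_ms))
```

The recorded samples are the raw material for a percentile-style SLA check.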
Q: Any references on details for "Big Data" + Event Pub Sub please?
A: The best reference I can give is the Percolator paper from Google. See "Large-scale Incremental Processing Using Distributed Transactions and Notifications," by Daniel Peng and Frank Dabek - OSDI 2010.
Q: Trust...but verify. How do I verify?
A: I don’t know…trust is complicated and there's no simple answer.
Q: Would you recommend a full SOA as a practice to model relations between modules, and in particular a REST implementation, as a good practice to achieve scalability and statelessness? At least for synch, and messaging pattern for asynch?
A: SOA is a great abstraction for isolation and independence. REST is a great way to talk about separating concerns BUT YOU HAVE TO USE IT CORRECTLY. None of these patterns is magic unless the application/interface designer chooses the right abstraction. REST is a great protocol (with a clunky name) that is all about defining how a service can expose a user-specific abstraction for how it uses the service. The "representation" is the user's view of the service. I think of it as the way I present myself in public with my finest clothes on. A REST call (or Representational State Transfer) is a way to whack on the external view of the service's public face to accomplish some action. Now, this is great if the semantics of the facade and the semantics of the changes made to the facade offer easy-to-use (and correct) behavior. A wonderful thing about REST is that it layers nicely over HTTP and other similar protocols. While I love messaging, I think messaging and REST are simply different syntaxes over what (hopefully) is a nice semantic.
Q: I'm still learning about cloud services. You talked about SLA-controlled "auto" provisioning. Is this state of the art or still coming or a current goal??
A: This "auto" provisioning is an emerging feature. I'm sure that some of the most advanced sites have functionality similar to this (but perhaps not all the way to what I described). Right now, I don't know of environments for Platform-as-a-Service that offer all of the features I've postulated in this talk. The talk is intended to promote discussion and push things forward.
Q: Would you recommend building a new solution on a full PaaS model, or just adopting an IaaS and building the rest in-house for better flexibility, scalability, and avoiding bugs in the PaaS?
A: There's no doubt that the IaaS stuff has clearly defined behavior (such as VMs and filesystem support). It is increasingly stable and interesting. To use these IaaS environments requires that the programmer effectively be a systems programmer, capable of writing to a VM and building an app on top. I am excited to see the beginnings of Platform-as-a-Service features (examples being Salesforce's Force.com and Google's AppEngine). As time goes on, I expect these PaaS offerings to get richer and make things even better. When I was first starting in the industry, I remember getting a LOT of crap for using a programming language somewhat like C to implement a database system. The conventional wisdom was that real system programmers used assembler. In many ways, these folks were right because the performance was MUCH better at the time for the assembly folks. Over time, I was right and the stability and performance of systems written in high-level languages clearly won the day. You can make similar arguments for sticking to IaaS in today's cloud system.
Q: What's the difference between SaaS and PaaS?
A: SaaS (or Software as a Service) is the supplying of an application's behavior over the web. All the retail over the web stuff (and much more) fits that definition…if the user is over the web, it's SaaS. PaaS (Platform as a Service) is an environment that lets a programmer (who may be constrained in their programming environment) develop an application that is hosted on the web and provides a SaaS application.
Q: Can you give example of cloud regarding SaaS?
A: Sure…Amazon.com's retail site. A great cloud-based SaaS.
Q: What could make banks move to the cloud?
A: My wife and I do online banking…well, really she does all that. So, my bank is using the cloud as a SaaS. Now, on the other hand, the hosting of the application and the storage of the bank's data are within the bank's datacenters, making it a private environment. There are legal and trust issues that make it hard (so far) for banks to stuff all of their core business into the cloud. You will see an emerging set of the non-core stuff starting to move to third-party cloud providers, though.
A: So, this article says that you can steal keys in the cloud and nothing is safe. On the one hand, I know that these kinds of incidents do happen (and RSA as an encryption company has an interest in raising our level of concern). As in all things, we will learn how to establish trust in those that earn it. Will there be problems? Absolutely! Will we learn which systems have earned the users' trust? You bet.
Q: How can the user confirm that data is being backed up and is recoverable if a catastrophe happens?
A: Darn…this is a hard question. How do you know your doctor is prescribing the right medicine? How do you know your city can cope with an earthquake or hurricane? The right thing to do is ASK your cloud provider what they do, how they back up your data, and what will be done when stuff goes wrong. STUFF WILL ALWAYS HAPPEN. The challenge is how do you prepare for it, what do you do when it happens, and how transparent are you with what really did happen?
Q: How do you handle failure?
A: By expecting it. I don't mean to be flippant but all resilient systems are built on the premise that everything will break and failure is normal.
Q: How are these services paid for by front-end service users?
A: There's no single answer for that. Perhaps the SaaS app running on top sells stuff or advertises. If it is a multitenanted environment, the tenants need to pay and they need to have a strategy for recouping their costs.
Q: Would you mind giving us more details on how Salesforce handles multitenancy?
A: I'd refer you to the SIGMOD 2009 paper, "The Design of the Force.com Multitenant Internet Application Development Platform," by Craig Weissman and Steve Bobrowski.
Q: How can multitenancy be used for customization? Can this be part of the engineering work for the cloud or should it be kept as an application concern?
A: Again, the Weissman and Bobrowski paper describes a bunch about customization in a multitenanted environment.
Q: Will front-end disclaimers protect clients from the few laws trying to be enforced on cloud service data? Can a specific data item be removed quickly?
A: There's no one answer to this. Even in a privately controlled datacenter it's not easy to delete all the data you may want to delete in front of a subpoena.
Q: Considering that various cloud providers use different APIs, how would you advise to work on fine-grained cloud utilization?
A: This is simply an emerging area. Different providers will have different offerings (with different strengths and weaknesses). Hopefully, a dominant way of expressing a PaaS interface will emerge over time. There may be many of them, each with different strengths.
Q: Any comments on private cloud integrating with legacy, non-cloud regarding access control and resource throttling issues?
A: No really strong comments other than the economics will cause us to create mechanisms to surround and interface with legacy systems. You can never throw it all out at once.
Q: How is service sold? Is it based on usage? Or just flat monthly fees?
A: Different cloud providers will have different models.
Q: What typical cloud concierge services do you see in the future?
A: I just described these in the talk.
Q: What is the best cloud service provider in cost terms?
A: They are all different with different strengths and weaknesses. I work for one that is doing very well, though!
Q: A major obstacle for Europeans is that many cloud services are hosted in a legal context of the US that doesn't conform to EU privacy laws. Even if the servers are located in the EU, if the headquarters are located in the U.S....any ideas?
A: This is a great conundrum. I love the discussion of this in the Berkeley paper, "Above the Clouds: a Berkeley View of Cloud Computing."
Q: If we store our private data on the cloud, and then suddenly think of deleting that data, what is the assurance that the data has actually been deleted from the cloud?
A: It depends on the provider's policy and your trust in the provider. This is even more of a concern when faced with a "blind subpoena" where the cops come to the cloud provider to get your data and you don't even know it!
Q: Does Auto Placement happen on demand? e.g., copying happens only after current resources are not enough?
A: This is yet to be defined… these mechanisms don't yet exist.
Q: Could you talk a little about law projects regarding public clouds? (Thanks! What a fantastic webinar!)
A: Thanks for your kind words. I would point you to the Berkeley paper for some interesting thoughts on the legal issues. It's a great paper.