• 28Feb

    Why Self Bootstrapping Deployments?

    Regardless of whether you are operating within a cloud environment or not (internal Eucalyptus or external AWS), I have always preached the need to architect software in a cloud friendly manner. Ensuring your deployment process is cloud friendly is one of the most important steps.

    Historically, sysadmins and development teams have relied on a server being pre-configured prior to deploying their application. Things such as java, the application container of choice (tomcat), libraries, file paths, etc. are usually either set up manually when a new server is provisioned or baked into an image that is built after all that work is done. This approach has the following downsides:

    • When upgrades to an OS are needed, it requires manually making a new image
    • When upgrades to java or the application are needed, it is a manual process outside of deployment and still requires the making of a new image
    • Overrides to configuration files and other such changes are usually not tracked, or are held only in one person’s knowledge

    When running in the cloud, new machines come and go, and we need processes that adapt along with that. RightScale has done a great job with their RightScripts, which use a pull model. A small base image is made so that when a server boots, it contacts the RightScale infrastructure and automatically runs any install/deployment scripts to get the server to a ready state.

    The aforementioned approach is ideal, but it can be overkill in smaller enterprises or when not dealing with a full-blown cloud infrastructure. That doesn’t mean we can’t use a solution that gets the vast majority of the benefit in a cloud friendly manner.

    Practical Cloud Friendly Approach

    Instead of baking knowledge into an image so that servers contact a deployment engine when they boot, we’ll deploy to a set of known servers; growing or shrinking that set only involves modifying a configuration file. The basis is as follows:

    • Third party distributions (java, app container, etc) as well as the actual in house application to deploy should all be posted to a web location or Maven repository as part of your build procedures.
    • All deployment scripts, configuration, SQL, etc. should be also included in your distribution and hosted at the same web location or Maven repository.
    • When a deploy is performed, each server is logged into (some in parallel) and the appropriate scripts are fetched using wget from the distribution repository, run, and then cleaned up. If a script needs other files (such as the JRE distribution), those are fetched from the distribution repository as well.
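
    The fetch, run, and clean up cycle in the last step can be sketched roughly as follows. This is only an illustration in Python, not the scripts from the zip file: the original process used wget and shell on the target server, and the function name, its parameters, and the repository URL layout are all assumptions made for the example.

```python
import subprocess
import tempfile
import urllib.request
from pathlib import Path


def fetch_run_cleanup(repo_url, script_name, interpreter="sh"):
    """Fetch one script from the distribution repository, run it, clean up.

    repo_url and script_name are hypothetical; urllib stands in for the
    wget call described in the text.
    """
    with tempfile.TemporaryDirectory(prefix="deploy-") as workdir:
        local_path = Path(workdir) / script_name
        # Fetch: download the script into a throwaway working directory.
        url = f"{repo_url.rstrip('/')}/{script_name}"
        with urllib.request.urlopen(url) as response:
            local_path.write_bytes(response.read())
        # Run: execute inside the working directory so that any extra files
        # the script downloads for itself are removed along with it.
        result = subprocess.run(
            [interpreter, str(local_path)],
            cwd=workdir,
            capture_output=True,
            text=True,
        )
    # Clean up: leaving the "with" block deletes the directory and contents.
    return result.returncode, result.stdout
```

    Because nothing survives outside the temporary directory, every deployment starts from a fresh copy of whatever the repository currently holds.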

    I used ANT along with shell scripts to perform the task with the following flow generally happening:

    • deploy - deploy java, app container, hyperichq, and in house app to the target servers
    • stop - stop app container
    • ddl - execute DDL
    • dml - execute DML
    • activate - change softlink on target servers to point to the pending deployment
    • start - start app container (if it is running, it will be stopped first)
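
    As a rough sketch of how the flow above hangs together, the phases can be driven in order from a plain list of servers, stopping as soon as one phase fails. The phase names come from the list; the one-host-per-line file format, the function names, and the run_phase callback are assumptions for illustration and are not taken from the ANT build.

```python
# Phase names come from the list above; everything else here is hypothetical.
PHASES = ["deploy", "stop", "ddl", "dml", "activate", "start"]


def parse_servers(config_text):
    """Read one hostname per line, ignoring blank lines and '#' comments."""
    servers = []
    for line in config_text.splitlines():
        host = line.split("#", 1)[0].strip()
        if host:
            servers.append(host)
    return servers


def run_release(servers, run_phase, phases=PHASES):
    """Run the phases in order and stop at the first one that fails.

    run_phase(phase, servers) is a caller-supplied callable returning True
    on success; it decides whether a phase touches every server or runs
    once (as a database step might). Returns the phases that completed.
    """
    completed = []
    for phase in phases:
        if not run_phase(phase, servers):
            break
        completed.append(phase)
    return completed
```

    Growing or shrinking the farm then means adding or removing a line in the server list; the phases themselves do not change.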

    A zip file containing all the necessary scripts and tools to deploy a Web App inside an application container such as JBoss is linked at the end of this article.
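
    The activate step in the flow above comes down to repointing a symbolic link at the newly deployed directory. A minimal sketch, assuming a POSIX filesystem and hypothetical directory names, is to build the new link under a temporary name and rename it over the old one so the switch happens in a single step:

```python
import os
from pathlib import Path


def activate(release_dir, current_link):
    """Point current_link at release_dir, replacing any previous link."""
    release_dir = Path(release_dir)
    current_link = Path(current_link)
    # Build the new link next to the old one under a temporary name.
    pending = current_link.with_name(current_link.name + ".pending")
    if pending.is_symlink() or pending.exists():
        pending.unlink()
    os.symlink(release_dir, pending)
    # Renaming over the existing link swaps it in one step on POSIX systems,
    # so the app container never sees a missing path.
    os.replace(pending, current_link)
```

    Rolling back is the same call with the previous release directory.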

    Benefits

    There are many benefits to using the aforementioned process, of which I will highlight my favorites:

    • No Secrets - all configuration, modifications, steps, etc. are self-documented in your source control repository and automated as part of a deployment. Therefore no hidden surprises.
    • Everything is a Deployment - whether you’re rolling out a regular release, upgrading java, or switching to a whole new server farm, nothing changes! Since it’s all part of the deployment process no special action is needed.
    • No Ghosts - Sometimes people will make changes to your environment (e.g. tweak a java config) and it is unknown until you move servers or upgrade java. Since a fresh copy of everything is used with each deployment, there are no “ghosts” or artifacts left behind from the previous application.

    Zip file containing example implementation of model talked about in this article.

  • 20Jan

    After settling down from the holidays and getting into my new role at work, there were a few news items from January of the new 2010 year that stood out.

    Terrastore is a new distributed document store that grabs my interest due to my years working for Thomson Reuters and continually looking for better ways to store datasets in a scale friendly way. Terrastore looks to accomplish what I was trying to accomplish with Project Voldemort, which was promising, with the addition of the following:

    • Add and remove nodes dynamically to/from your running cluster with no downtime and no changes at all to your configuration
    • Install a fully working cluster in just a few commands and no XML to edit
    • Terrastore automatically scales your data: documents are partitioned and distributed among your nodes, with automatic and transparent re-balancing when nodes join and leave.

    The above items really help make managing such a system hands off, which is the only way to survive in the cloud. Additionally, having the cluster up and running without XML configuration files that have shared values and need to be tweaked when you start the cluster is a big positive, as it helps in self bootstrapping configurations.
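
    To make the idea of automatic re-balancing concrete, here is a small sketch of consistent hashing, one common way to partition documents so that adding a node only moves the keys the new node takes over. This is a generic illustration and not Terrastore's actual algorithm; the class name, the replica count, and the choice of SHA-256 are all assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring mapping document keys to node names."""

    def __init__(self, nodes=(), replicas=64):
        self.replicas = replicas  # ring points per node, smooths the spread
        self._ring = []           # sorted list of (hash, node) points
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        for i in range(self.replicas):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [point for point in self._ring if point[1] != node]

    def node_for(self, key):
        if not self._ring:
            raise LookupError("ring has no nodes")
        # First ring point at or after the key's hash, wrapping at the end.
        index = bisect.bisect(self._ring, (self._hash(key), ""))
        return self._ring[index % len(self._ring)][1]
```

    With a ring like this, a node joining takes over only the slices of the key space next to its own points, and the other nodes keep everything else they already held.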

    If you haven’t heard Cliff Click talk yet, then fix that and watch one of his presentations. At the JVM Languages Summit in 2009, he presented A Crash Course in Modern Hardware, which takes me back to my Computer Organization classes in college, with added information on recent processors and I/O buses.

  • 06Dec

    I’ve been quite busy lately with the Thanksgiving holiday and additionally I recently changed jobs. I now work for Global Crossing and am excited at the new challenges and opportunities in front of me. Recently I was able to get through my various industry articles and readings. I would like to highlight some really important and worthwhile ones here:

    • Facebook is now following Google’s lead and moving towards batteries built in to the power supplies of each server.
    • Some details were released about the backend of WoW, pretty impressive.
    • If you have not been following the NoSQL movement over the past year or so, Jonathan Ellis put together a great article summarizing the main players in this space. After reading Jonathan’s post I’m going to look into Cassandra more. I had previously favored Project Voldemort, but the ability to add/remove live nodes is important to me.
    • Randy Shoup, Distinguished Architect for eBay, gave a wonderful presentation on eBay’s Challenges and Lessons from Growing an eCommerce Platform to Planet Scale (PDF | video).
    • Nati Shalom put together a nice article on the relationship between disk and RDBMS and why it inherently makes databases so easy to break.

    I found Randy Shoup’s presentation the most interesting of my readings this weekend. Anytime experience can be summarized into general recommendations for other people in the industry, it is a huge win for all and makes the information more worthwhile.

  • 22Oct

    Jeff Rothschild, Vice President of Technology at Facebook, recently gave a presentation entitled “High Performance at Massive Scale – Lessons learned at Facebook” to the University of California, San Diego. Jeff did a great job summarizing the challenges Facebook had to overcome. The following items I found of most importance:

    • They are moving towards a PHP compiler that creates machine code and expect a 40% performance gain once it is complete.
    • Even though MySQL is used exclusively for database storage, not a single JOIN is performed. Instead they do manipulations in the web/service tier and feed the cache.
    • They average 250,000 cache requests per server
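
    The second point, doing the work of a JOIN in the web/service tier and feeding the cache, might look roughly like the sketch below. The talk did not show code, so this is a generic cache-aside illustration: the plain dict standing in for the cache, the fetch_post and fetch_user callables standing in for single-table MySQL lookups, and the field names are all hypothetical.

```python
def load_cached(cache, kind, ident, loader):
    """Cache-aside read: use the cached row, else load it and store it."""
    key = (kind, ident)
    if key not in cache:
        cache[key] = loader(ident)
    return cache[key]


def posts_with_authors(post_ids, cache, fetch_post, fetch_user):
    """Combine posts and their authors in application code, without a JOIN.

    fetch_post and fetch_user each look up one row by primary key; the
    combining that a SQL JOIN would do happens here instead.
    """
    rows = []
    for post_id in post_ids:
        post = load_cached(cache, "post", post_id, fetch_post)
        author = load_cached(cache, "user", post["user_id"], fetch_user)
        rows.append(
            {"post_id": post_id, "text": post["text"], "author": author["name"]}
        )
    return rows
```

    Each row is fetched from the database at most once and then served from the cache, which keeps the database queries to simple primary-key lookups.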

    I was a little disappointed that Jeff did not talk about managing deployments and application releases, which is always tricky at large scale.

    Direct Link To Video: http://video-jsoe.ucsd.edu/asx/JeffRothschildFacebook.asx

  • 21Oct

    At the end of last month, I gave a presentation at Rochester Institute of Technology (RIT) to the Computer Science Community (CSC) on behalf of Thomson Reuters. I decided to focus it on technologies and best practices that the students will most likely not experience until working full time in a corporate environment. Therefore the presentation was broken down into three main parts:

    • Thomson Reuters and Westlaw overview
    • Enterprise Development
    • Grails

    I also made many references to “Cloud Friendly” design of software, which I think is extremely important regardless of whether you intend to run your application in the cloud or not. I will go into this in more detail in a future post. The audience was receptive and I was happy to see many questions answered and positive remarks.


  • 04Apr

    I was lucky enough to catch a lecture by Vinton Cerf, Vice President and Chief Internet Evangelist at Google, yesterday at RIT. He was a very energetic speaker and I enjoyed his talk very much. Below are some of the highlights:

    • “IP version 4 only has a 32 bit address space and that is my fault.”
    • Dr. Cerf wants our mobile phones to be viewed as controllers to all devices around us that will eventually be on the internet (home stereo, lights, projectors, etc.)
    • Dr. Cerf uses Arch Rock products in his home to monitor temperature and humidity in every room of his house.
    • There are multiple cloud vendors/networks available today, however they are all different and have different interfaces. He views this as the same situation as in the days of the ArpaNet, when networks were different and he helped to create TCP/IP and gateways, and therefore believes it is an excellent area for research today.
    • Bit Rot was stated to be another huge problem that we will face in future years and generations

    I was also lucky enough to be able to ask Dr. Cerf a question directly in the Q&A part of the lecture. My question was how to innovate and create a common interface between the cloud platforms when the internet has become so commercialized, as opposed to back when Dr. Cerf worked on the ArpaNet and it was basically a military experiment. His response was that in academia the commercial and political boundaries are not always present, and that leveraging organic growth and open source community support is the best way to accomplish the task.

    If Dr. Cerf comes to an institution near you, I highly suggest hearing him speak.

  • 04Apr

    The Cloud Computing Expo was quite the event and extremely tiring. The days were long and a lot of knowledge was shared. Below are some highlights from my notes and experiences.

    Werner Vogels (Amazon)

    • Back in 2001 Amazon.com engineers joked that Amazon.com systems were built with WD40 and Duct Tape
    • “We quickly learned databases don’t scale”
    • With each user request to Amazon.com, approximately 200-300 internal services are hit
    • Each service has a development team and that team is responsible for everything with that service, including deployment and operations
    • Wanted services to be tools not frameworks, and therefore be free standing so a customer can use S3 without needing to use any other AWS
    • Stax allows enterprises to take existing J2EE apps and run them directly in AWS

    Dave Douglas (Sun)

    • Dave had a great presentation, he opened with a top 10 list of “Things You Didn’t Know About Cloud Computing” of which I would like to highlight three:
      1) Al Gore invented cloud computing in 1989
      2) Amazon only has three customers: Animoto, SmugMug, and the NY Times
      3) IBM blank blank Cloud Computing blank blank JCL blank blank fully punch card compatible
    • Sun Cloud will be open summer 2009
    • Dave showed a prototype UI for assembling services in the cloud. Concept is you want a scalable/reliable database, no need to set it up and configure it, just drag a DB icon into your diagram.
    • The Sun Cloud API is definitely worth checking out. It is a REST / JSON API licensed under Creative Commons, with some cool new attributes like being completely self discovering after the initial request
    • OpenOffice will soon have a “Save to Cloud…” option. The goal Sun has is to bring the notion of the cloud up to the end user level.

    Reuven Cohen (Enomaly)

    • Reuven gave a brief overview of Enomaly and his involvement with cloud computing
    • Talked about how, in the talks he had with the various players in the industry while forming the Cloud Manifesto, companies did not want to be open in their discussions; hence the core problem.
    • The majority of his session was open discussion by everyone there, and it was proposed that a Customer Council was needed so a unified voice of the community can be presented.
    • I found this session to be quite enjoyable due to the open discussion nature and hearing everyone’s remarks

    Doug Tidwell (IBM)

    • Doug is an amazing presenter with amazing technical talent who can also keep a room laughing
    • Doug presented on Service Component Architecture (SCA) and used Tuscany to illustrate it
    • As software is developed to run in the cloud and use cloud services, using SCA is more important than ever
    • Doug had a twitter feed up during his presentation and encouraged the audience to leave tweets and he checked it throughout the presentation. It was quite interesting.

    Other items and areas that are worth mentioning:

    • Majority of folks representing large enterprises are still waiting for the following two items before moving to the cloud
      1) Better security and certifications
      2) Ability to run 50% in one cloud and 50% in another cloud so they can handle disaster situations where one cloud experiences trouble or goes bankrupt
    • We have reached the peak of hype with cloud computing and are now in a disillusionment stage
    • RightScale is working on integration with Eucalyptus and will be announcing full details and services end of April!
    • When in the cloud, software load balancers are really your only option and therefore become crucial, as hardware load balancers are not an option. Zeus was there touting their products.
    • Microsoft’s Azure Services Platform looks very impressive. I am looking forward to digging into it deeper and hopefully breaking it down on here.
  • 26Mar

    I sat down this evening and looked at the schedule of sessions at the Cloud Expo next week. I’m looking forward to it all as well as meeting up with some colleagues in Thomson Reuters at the Times Square office.

    3/30 - Monday

    3/31 - Tuesday

    4/01 - Wednesday

  • 21Mar

    I’ll be attending the Cloud Computing Expo in NYC at the end of the month. I’ll be there all three days thanks to a complimentary VIP Gold Pass from ParaScale. I’m looking forward to hearing great presentations, meeting lots of people, and partaking in engaging discussions.

    If you are in NYC for this event, send me an email.

  • 21Mar

    Amazon Web Services (AWS) has long been the forerunner of the cloud computing industry. The people who research, follow, and use AWS are usually an enthusiastic group and love to talk about it and how it can be applied to applications and architectures. There are various AWS User Groups out there and I was lucky enough to find out there was one in my area (Rochester, NY).

    Earlier this week I attended a Rochester Amazon Web Services User Group (RAWSUG) event hosted at RIT. Although the majority of what was covered was an overview presentation of AWS for those not familiar with it, it was great to see a room full of local AWS enthusiasts.

    The presenters were Mitch Garnaat from Cloud Right and David Kavanagh from Direct Thought. They did a great job presenting and I enjoyed asking them questions and meeting them afterwards. Chris Moyer, from Cloud Right, also joined to support questions. Here are some of the items I noted that stood out from the session:

    • The ability to mount a read only version of an EBS filesystem on more than one machine has been talked about by Amazon and will potentially be on the radar for release in the future.
    • David has been working on a slick iPhone interface for AWS and gave a demo. I liked it, however I wish the effort had been put towards a mobile web version instead so that it could benefit all mobile platforms.
    • Confirmation that Amazon will be providing a hardware load balancing solution; availability date was TBD

    I enjoyed the event greatly and look forward to the next one and potentially working with David, Mitch, and Chris in the future. If you are looking to master the cloud, finding a group of other cloud enthusiasts that you can meet with and share news/ideas is a great approach. I highly recommend checking out the AWS User Group page and attending an event near you.
