Cloud Workflow Language Overview
Cloud workflows are written in a Cloud Workflow Language. Why a new language? The short answer is that the runtime characteristics of a process are fundamentally different from those of a typical program execution. Processes tend to execute over long periods of time in an event-driven manner. Much of the time the process sleeps, waiting for the next event to occur, at which point it wakes up, makes some decisions, kicks off new actions, and goes to sleep again. Each time it wakes up, the process can be executed by a different server, so that changes in load, platform reconfigurations, and machine failures don't affect process execution. Cloud workflows are also often parallel: an event may kick off several actions at once. Sometimes several parallel strategies are initiated and the first one to succeed is pursued while the others are aborted. The Cloud Workflow Language is designed to support these characteristics natively, thereby simplifying the writing of robust and powerful workflows. However, an important design criterion is to keep the language as similar to regular scripting languages as possible, so that it is intuitive even for users who do not usually author workflows. The hope is to create a language that is consistent and easy to pick up, yet exposes all the basic constructs needed to build powerful workflows.
At its core, a Cloud Workflow Language allows writing a sequence of expressions. Such expressions may describe the control flow of a process or may specify actions that need to be taken on resources. A cloud workflow is thus a sequence of expressions evaluated one after the other. This does not mean that all activities must run sequentially; for example, the concurrent expression allows for running activities concurrently.
Before going further into the details of the language, here is an example of a complete cloud workflow that can be used to launch servers in order (database first, then application servers). This cloud workflow uses an application name given as an input to find the database servers by name and to build a tag used to retrieve the app servers.
define launch_app($app)
  concurrent timeout: 30m do # Launch database master and slave concurrently and wait for up to 30 minutes 
    sub do # Launch database master 
      @db_master = rs_cm.servers.get(filter: ["name==" + $app + "_db_master"]) # Get server using its name 
      assert size(@db_master) == 1 # Make sure there is one and only one server with that name 
      @db_master.launch() # Launch server 
      sleep_until(@db_master.state == "operational") # Wait for it to become operational
    end 
    sub do # Launch database slave 
      @db_slave = rs_cm.servers.get(filter: ["name==" + $app + "_db_slave"]) 
      assert size(@db_slave) == 1 
      @db_slave.launch() 
      sleep_until(@db_slave.state == "operational") 
      @@slave = @db_slave # Save reference to slave in global variable so it can be used later 
    end 
  end 
 
  # This won't execute until both servers are operational 
  @@slave.run_executable(recipe_name: "db::do_init_slave") # Init the slave 
  @apps = rs_cm.servers.get(filter: ["name==appserver"]) # Retrieve all the servers with appserver in their name ...
  @apps.launch() # ...and launch them 
 
end 
The code should be fairly self-explanatory; a few hints that may help:
- Variables whose names start with a $ symbol contain JSON values (strings, numbers, booleans, etc.), while variables whose names start with an @ symbol contain collections of resources (deployments, servers, instances, etc.). Variables that contain collections of resources are referred to as references to differentiate them from variables that contain JSON values. The language also supports global variables prefixed with $$ and global references prefixed with @@, whose values are accessible to the entire process definition and do not follow the usual scoping rules.
- Resources have actions and fields that can be called with the . operator (e.g. @db_master.launch(), @db_master.state). Actions take parentheses while fields do not.
- Code written in Cloud Workflow Language always deals with collections of resources and never with a single resource. This explains why @apps.launch() ends up launching all the application servers.
Resource actions (launch() and run_executable() in the definition above) are a special kind of expression that allows interacting with external systems such as the RightScale platform. A resource encapsulates external state, and its actions allow managing that state. The state is made available through fields (e.g. state in the definition above). This is similar in nature to objects (resources), methods (actions), and members (fields) in an object-oriented language, with the distinction that fields are read-only and can only be updated through actions.
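The object-oriented analogy can be made concrete with a small Python sketch. This is only an illustration of the idea, not how the workflow engine is implemented: the Server and ServerCollection classes and the "inactive" starting state are hypothetical names chosen for the example. It shows a field that can be read but not assigned, an action that is the only way to change state, and a collection whose action applies to every member.

```python
class Server:
    """Hypothetical stand-in for one external resource."""

    def __init__(self, name):
        self.name = name
        self._state = "inactive"  # illustrative initial state (assumption)

    @property
    def state(self):
        # Field: readable, but there is no setter, so it cannot be assigned.
        return self._state

    def launch(self):
        # Action: the only way to change the encapsulated state.
        self._state = "operational"


class ServerCollection:
    """Hypothetical collection: fields and actions apply to every member."""

    def __init__(self, servers):
        self._servers = list(servers)

    def __len__(self):
        return len(self._servers)

    @property
    def state(self):
        # Reading a field on a collection yields one value per member.
        return [s.state for s in self._servers]

    def launch(self):
        # Calling an action on a collection calls it on each member.
        for s in self._servers:
            s.launch()


apps = ServerCollection([Server("app1"), Server("app2")])
apps.launch()
print(apps.state)  # ['operational', 'operational']
```

Attempting `apps.state = [...]` raises AttributeError in this sketch, which mirrors the rule that fields change only through actions.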
Resources can be located using the resource type actions (get() in the definition above).
Functions are built-in helpers that provide logic that gets run in the engine itself (size() and sleep_until() in the definition above).
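To give a feel for what such helpers do, here is a rough Python analogy of size() and sleep_until(). The text does not say how the engine evaluates sleep_until(); the polling loop, the poll_interval parameter, and the simulated server_state() function below are assumptions made purely for illustration.

```python
import time


def size(collection):
    """Number of resources in a collection."""
    return len(collection)


def sleep_until(condition, poll_interval=0.01):
    """Re-evaluate condition() until it is true, sleeping between checks.

    Polling is an assumption; a real engine may wake up on events instead.
    """
    while not condition():
        time.sleep(poll_interval)


# Simulated server that reports "operational" on the third state check.
checks = {"count": 0}


def server_state():
    checks["count"] += 1
    return "operational" if checks["count"] >= 3 else "booting"


servers = ["db_master"]
assert size(servers) == 1  # same guard as the assert in the workflow above
sleep_until(lambda: server_state() == "operational")
print(checks["count"])  # 3
```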
Finally, expressions can also be adorned with attributes, which attach additional behavior to the execution of the expression (such as error handling or timeouts). In the definition above, the timeout attribute is used to guarantee that the process will not wait for more than 30 minutes for both database servers to become operational.
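The effect of a timeout on a concurrent block can be approximated in Python as follows. This is a sketch under stated assumptions, not the engine's behavior: the launch() and launch_concurrently() functions are hypothetical, boot times are simulated with short sleeps, and raising TimeoutError when the deadline passes is an assumed outcome, since the text only says the process will not wait longer than the limit.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def launch(name, boot_seconds):
    """Hypothetical launch: pretend the server needs boot_seconds to come up."""
    time.sleep(boot_seconds)
    return name + " operational"


def launch_concurrently(tasks, timeout):
    """Run all tasks in parallel; fail if any is unfinished after `timeout` seconds."""
    pool = ThreadPoolExecutor(max_workers=len(tasks))
    futures = [pool.submit(launch, name, secs) for name, secs in tasks]
    done, not_done = wait(futures, timeout=timeout)
    pool.shutdown(wait=False)  # do not block on tasks that overran the limit
    if not_done:
        raise TimeoutError(f"{len(not_done)} task(s) still running after {timeout}s")
    return [f.result() for f in futures]


# Both simulated servers come up well within the limit.
results = launch_concurrently([("db_master", 0.05), ("db_slave", 0.1)], timeout=2)
print(results)  # ['db_master operational', 'db_slave operational']
```

As in the workflow above, the code after the concurrent step only runs once every branch has finished; if one branch overruns the limit, the whole step fails instead of waiting indefinitely.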
Put together, resource actions, resource fields, resource type actions, functions and attributes make up the bulk of the language.