Einstein (programming language)
Introduction
Einstein is an open source fourth-generation programming language (4GL) written on top of the Deesel 3GL programming language. It is a flow-based programming language supporting message-based constructs such as read, write, route, join and split.
The language is built around the integration of multiple systems and around parallel and distributed execution.
resource "time(schedule=cron):0/10 * * * * ?" everyTenSeconds;
resource "time(schedule=cron):0/1 * * * * ?" everySecond;
resource "stack:TimeStack" stack;
(<< "time:yyyy-MM-dd'T'HH:mm:ss.SSSZ" ) >> "std:The time according to Einstein is ";
listen everyTenSeconds "bool:true" (read all stack >> "std:");
listen everySecond "bool:true" (write stack);
The Language
Resources
Resources can best be compared to variables in 3GLs. A resource is identified by a URI. URIs can best be compared to types in Java: they are (mostly) static, refer to actual resources and can provide functionality. These are only the closest comparisons available; because Einstein is not a 3GL, the similarities can be quite superficial.
URIs in Einstein are almost the same as standard URIs (with URLs as a subset), except that they support two extra features:
• Metadata - information which relates to a given provider (or scheme in URL/URI parlance).
• Nesting - a URI can have an indefinite amount of nesting.
So a URI in Einstein could look like:
"cache(timeout=2000):dynamic(lang=groovy):jms(impl=RabbitMQ):$payload.destinationAddress"
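To make the two extensions concrete, the following Python sketch splits such a URI into its nested layers. It is not Einstein's own parser: the function name parse_einstein_uri, the regular expression and the rule that a layer is "name, optional (key=value) metadata, colon" are assumptions made for illustration only.

```python
import re

# Assumed layer shape: name(key=value,...): with the metadata part optional.
_LAYER = re.compile(r"^([A-Za-z]\w*)(?:\(([^)]*)\))?:")


def parse_einstein_uri(uri):
    """Split a nested URI into (provider, metadata) layers plus the remainder."""
    layers = []
    rest = uri
    while True:
        match = _LAYER.match(rest)
        if match is None:
            break  # what is left belongs to the innermost provider
        name, raw_meta = match.group(1), match.group(2)
        metadata = {}
        if raw_meta:
            for pair in raw_meta.split(","):
                key, _, value = pair.partition("=")
                metadata[key.strip()] = value.strip()
        layers.append((name, metadata))
        rest = rest[match.end():]
    return layers, rest


layers, rest = parse_einstein_uri(
    "cache(timeout=2000):dynamic(lang=groovy):jms(impl=RabbitMQ):$payload.destinationAddress"
)
for name, metadata in layers:
    print(name, metadata)
print(rest)
```

A naive split like this cannot tell a nested provider from a colon that is part of the innermost address, so a real implementation would need stricter rules.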
Messages
All data in Einstein takes the form of a message. A message contains a payload and associated metadata which relates to such concerns as routing and security. The payload takes the form of a DataObject. DataObjects are a means to interact with payloads without knowing the nature of the data in the payload, so one may query, join or split a payload without knowing whether it is XML or an object graph. This abstraction is essential to allow the instructions and operators of the language to perform the same operations no matter what data they are dealing with. Once you have written a splitting router, for example (by combining the split and route instructions), it will perform that operation on any data it receives, improving re-use.
Messages themselves do not directly expose their state. Instead, an action must be passed to a message and, depending on the action, the appropriate state is passed back to the action. This inversion of control helps to ensure that, as far as possible, every interaction with a message can be audited, which makes it easier to debug a message's lifecycle.
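The inversion of control described here can be sketched in Python as follows. The class names Message and Action, the apply method and the audit_log attribute are hypothetical and are not taken from Einstein's implementation; the sketch only shows the idea of a message handing state to an action rather than exposing it.

```python
class Action:
    """A named operation that declares which part of the message it needs."""

    def __init__(self, name, needs, run):
        self.name = name
        self.needs = needs  # "payload" or "metadata"
        self.run = run


class Message:
    """Holds a payload and metadata but never hands them out directly."""

    def __init__(self, payload, metadata=None):
        self._payload = payload
        self._metadata = dict(metadata or {})
        self.audit_log = []  # names of every action applied to this message

    def apply(self, action):
        # The message records the interaction, then decides which state
        # the action is allowed to see.
        self.audit_log.append(action.name)
        state = self._payload if action.needs == "payload" else self._metadata
        return action.run(state)


message = Message("hello", {"route": "jms://redQueue"})
upper = message.apply(Action("uppercase", "payload", str.upper))
route = message.apply(Action("read-route", "metadata", lambda meta: meta["route"]))
print(upper, route, message.audit_log)
```

Because every access goes through apply, the audit trail is complete without any cooperation from the actions themselves.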
Execution Groups
Einstein has four different types of execution groups: sequential (flow), tuple, map and competing. The primary reasons for the four groups are to support parallel processing and routing.
Flow
This is the usual way that instructions are executed in Einstein. A flow group looks like this:
{ read "my:firsturl"; execute "java:org.me.MyService"; write "my:secondurl"; }
The braces denote a sequential block. In a sequential block the result of the previous instruction is the input to the next instruction.
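The chaining rule of a sequential block can be illustrated with a short Python sketch. The run_sequential helper and the three lambdas standing in for read, execute and write are hypothetical; they only model how each result becomes the next input.

```python
from functools import reduce


def run_sequential(instructions, initial=None):
    """Run instructions in order, feeding each result into the next one."""
    return reduce(lambda value, instruction: instruction(value), instructions, initial)


# Hypothetical stand-ins for the read / execute / write instructions.
written = []
flow = [
    lambda _: "payload from my:firsturl",        # read
    lambda text: text.upper(),                   # execute
    lambda text: written.append(text) or text,   # write, then pass the value on
]
result = run_sequential(flow)
print(result)
```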
Tuple
A tuple-based execution group allows instructions to be invoked independently of each other, with the combined results being available after execution. For example:
[read "text:1", read "text:2", read "text:3"]
This will produce a tuple of ["1","2","3"]. The order of execution is not guaranteed and may or may not be parallel, but the order of the results is guaranteed to match the order of the instructions. The input for each instruction is the result of the previous sequential operation.
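The ordering guarantee can be sketched in Python with a thread pool: instructions may run in any order, but results are collected in instruction order. The run_tuple helper is a hypothetical name, and using threads is an assumption; the text only says that execution may or may not be parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def run_tuple(instructions, value=None):
    """Run every instruction on the same input; results keep instruction order."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(instruction, value) for instruction in instructions]
        # Collecting in submission order fixes the result order,
        # regardless of which instruction finishes first.
        return tuple(future.result() for future in futures)


result = run_tuple([lambda _: "1", lambda _: "2", lambda _: "3"])
print(result)
```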
Map
Map-based execution is the most flexible, as it can work in one of two different ways. A map-based execution group can have its members referred to by the routing instruction, for example:
route "java:org.me.MyRouter" [ red : write "jms://redQueue", blue : write "http://blueserver.com/Service" ]
A map-based group can also be executed directly:
[ red : get "text:roses", blue : read "text:violets" ]
In the above example the result would be a map of ["red":"roses", "blue":"violets"]. Maps are essentially named tuples.
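Both uses of a map group can be sketched in Python with a dictionary of named instructions. The helpers run_map and route are hypothetical, and the sketch assumes that a router selects exactly one member by name, which the text does not state explicitly.

```python
def run_map(group, value=None):
    """Execute every named instruction and return a map of name to result."""
    return {name: instruction(value) for name, instruction in group.items()}


def route(router, group, value=None):
    """Let a router pick one member of the group by name and run only that one."""
    return group[router(value)](value)


colours = {
    "red": lambda _: "roses",
    "blue": lambda _: "violets",
}
everything = run_map(colours)                    # direct execution
chosen = route(lambda message: "blue", colours)  # routed execution
print(everything, chosen)
```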
Un-sequenced/Competing
A competing group has all of its instructions executed at potentially the same time; however, unlike a tuple, the value of the group is the first result returned, not the combined results. This is useful when accessing multiple resources for the same data (e.g. price feeds).
( read "http://myfirstserver.com/PriceFeed", read "http://mysecondserver.com/PriceFeed" )
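A first-result-wins group can be sketched in Python with concurrent.futures. The run_competing helper and the two simulated feeds are hypothetical; the sleeps merely stand in for servers that respond at different speeds.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_competing(instructions, value=None):
    """Start every instruction and return whichever result arrives first."""
    pool = ThreadPoolExecutor(max_workers=len(instructions))
    futures = [pool.submit(instruction, value) for instruction in instructions]
    done, _ = wait(futures, return_when=FIRST_COMPLETED)
    pool.shutdown(wait=False)  # do not block on the slower instructions
    return next(iter(done)).result()


def slow_feed(_):
    time.sleep(0.3)
    return "price from slow server"


def fast_feed(_):
    time.sleep(0.01)
    return "price from fast server"


winner = run_competing([slow_feed, fast_feed])
print(winner)
```

In this sketch an exception raised by the fastest instruction is propagated to the caller; whether a failed instruction should instead be skipped in favour of the next result is a design choice the text does not address.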
The Models
Data Models
Data Models allow Einstein to interact with a wide variety of rich data types without understanding the semantics of the underlying data set. For example, you can split an XML fragment in the same way that you would split a java.util.List. Because instructions interact with the various data models through a common abstraction, data types can be converted into one another easily and code is highly reusable.
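The idea of one split operation over several data models can be sketched in Python using type-based dispatch. The names split and split_and_describe are hypothetical, and a Python list stands in for java.util.List; Einstein's actual data model interface is not shown in the text.

```python
import xml.etree.ElementTree as ET
from functools import singledispatch


@singledispatch
def split(payload):
    """Split a payload into its parts, whatever its underlying type."""
    raise TypeError(f"no data model registered for {type(payload).__name__}")


@split.register(list)
def _split_list(payload):
    return list(payload)


@split.register(ET.Element)
def _split_xml(payload):
    return list(payload)  # iterating an Element yields its child elements


def split_and_describe(payload):
    # Written once against split(), reused for every registered data model.
    return [getattr(part, "text", part) for part in split(payload)]


from_list = split_and_describe(["a", "b"])
from_xml = split_and_describe(ET.fromstring("<items><i>a</i><i>b</i></items>"))
print(from_list, from_xml)
```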
Execution Models
Execution Models are another very important aspect of Einstein. They provide an abstraction of how instructions will be executed. Execution models can be entirely user defined, allowing the potential for Einstein to execute on a variety of different platforms, from simple, direct execution through SEDA to grid/fabric models. Execution Models themselves provide the model for distributed variables, stacks, execution and communication; all the features of a system are affected by the Execution Model.
The Execution Model specifically provides means for performing these actions:
• Executing one or more instructions.
• Iterating over a data set with one or more instructions.
• Manipulating a stack which relates to the means of execution.
• Manipulating variables in a way which relates to the means of execution.
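The first two actions in this list can be sketched in Python as a pluggable interface. The class and method names below are hypothetical, and stacks and variables are left out for brevity; the point is only that calling code stays the same when the model is swapped.

```python
from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor


class ExecutionModel(ABC):
    @abstractmethod
    def execute(self, instructions, value):
        """Run each instruction against value and return the results in order."""

    def iterate(self, data, instruction):
        """Apply one instruction to every item of a data set."""
        tasks = [lambda _, item=item: instruction(item) for item in data]
        return self.execute(tasks, None)


class DirectExecutionModel(ExecutionModel):
    def execute(self, instructions, value):
        return [instruction(value) for instruction in instructions]


class ThreadedExecutionModel(ExecutionModel):
    def execute(self, instructions, value):
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(instruction, value) for instruction in instructions]
            return [future.result() for future in futures]


def double_all(model, numbers):
    # Caller code is identical whichever model is plugged in.
    return model.iterate(numbers, lambda n: n * 2)


direct = double_all(DirectExecutionModel(), [1, 2, 3])
threaded = double_all(ThreadedExecutionModel(), [1, 2, 3])
print(direct, threaded)
```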
Why would you require multiple execution models? Unlike a low-level programming language, Einstein is designed to work across multiple processors and physical nodes. The reason is simply that applications designed to scale need to spread across all the processing resources available to them, but not all applications, and not all parts of an application, need to scale in the same way. If you need to do a fast local iteration, you do not want its instructions to be executed in the same way as the parts of the application that are distributed across a grid of processing nodes.
The most common execution models are immediate/direct, multi-threaded and distributed.
Variables
A variable in a direct execution model is trivial to implement, but what about in a distributed system? The concept of variables is still needed, but they are more likely to be implemented using a distributed state management system such as Coherence, Ehcache or Terracotta.
Transaction Model
The Transaction Model determines how transactions should be started and managed.
Exception Model
The Exception Model determines what should be done when an error occurs during an instruction's execution. This may include retrying, rolling back transactions and so forth.
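One possible policy of this kind, retry a fixed number of times and then roll back, can be sketched in Python. The run_with_policy helper, its retries parameter and the rollback callback are hypothetical and do not reflect Einstein's actual Exception Model interface.

```python
def run_with_policy(instruction, value, retries=2, rollback=None):
    """Retry a failing instruction; roll back and re-raise once retries run out."""
    attempts = retries + 1
    for attempt in range(1, attempts + 1):
        try:
            return instruction(value)
        except Exception:
            if attempt == attempts:
                if rollback is not None:
                    rollback()  # e.g. undo the enclosing transaction
                raise


calls = []


def flaky(value):
    # Fails on the first two calls, succeeds on the third.
    calls.append(value)
    if len(calls) < 3:
        raise RuntimeError("temporary failure")
    return value.upper()


result = run_with_policy(flaky, "ok")
print(result, len(calls))
```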
External links
• Project Website: http://einstein.codecauldron.org
Wikimedia Foundation. 2010.