- Open Source Data Integration
The Open Source Data Integration framework from the [http://snaplogic.org SnapLogic] project ([http://www.snaplogic.org Open Source Data Integration Framework]) is an open source framework for enterprise-scale data integration. The framework is used to design and run data integration processes as pipelines on an open source server, using components built with the framework's open source graphical designer or coded as Python scripts.
Use cases
The open source data integration framework addresses the need to repurpose different sources of data within an enterprise. Rather than hand-coding scripts with ad hoc tools and languages, developers get a coordinated method of accessing, assimilating and publishing data. The framework encourages reuse of components and processes without unnecessarily burdening developers. It uses a RESTful architecture and standards-based interfaces for continuity. Data integration use cases include data migration, data refactoring, enterprise mashups, business analytics and web services. The framework is designed for technical and business users at all levels.
Pipeline metaphor
Pipelines are created in the framework's designer tool using resources built from standard open source and custom components. Pipelines transform data from one or more graphically expressed resources. The resources are instances of components and are easily customized in a forms-based GUI. The Python scripting language allows developers to create additional components and to run pipelines in a command line mode.
Pipelines are executed in a lightweight server process running directly alongside a database, on an application or file server, or on a standalone integration server across a network. The server portion of the framework is open source, and instances can be distributed or centralized. In the distributed case, pipelines can run on the same server where the data resides. In the centralized case, one or more dedicated servers can securely access resources anywhere on the network. In both cases, data administrators keep the control they want, while business users securely receive the data they need through URIs.
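The pipeline idea described above can be sketched in a few lines of plain Python. This is only an illustration of the metaphor, not the framework's actual API: the function names (`csv_reader`, `filter_records`, `sort_records`, `json_writer`, `run_pipeline`) and the sample data are hypothetical, and the sketch uses only the Python standard library.

```python
import csv
import io
import json
from operator import itemgetter


def csv_reader(text):
    """Connector (hypothetical): yield one dict per CSV row, keyed by the header row."""
    yield from csv.DictReader(io.StringIO(text))


def filter_records(records, predicate):
    """Transformer (hypothetical): pass through only records where predicate is true."""
    return (rec for rec in records if predicate(rec))


def sort_records(records, field):
    """Transformer (hypothetical): sort by one field; consumes the whole stream first."""
    return iter(sorted(records, key=itemgetter(field)))


def json_writer(records):
    """Connector (hypothetical): serialize the record stream as a JSON array."""
    return json.dumps(list(records))


def run_pipeline(csv_text):
    """Wire the components together: CSV in, filter, sort, JSON out."""
    records = csv_reader(csv_text)
    records = filter_records(records, lambda rec: rec["country"] == "US")
    records = sort_records(records, "name")
    return json_writer(records)


if __name__ == "__main__":
    # Made-up sample data, used only to exercise the sketch.
    sample = "name,country\nZoe,US\nAnders,SE\nBob,US\n"
    print(run_pipeline(sample))
```

Each stage takes a stream of records and returns a new one, so stages can be reordered or reused in other pipelines, which is the kind of component reuse the framework aims for.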
Key features
The open source data integration framework provides:
* A browser-based drag-and-drop interface for combining data integration connectors and transformers into pipelines.
* A full programmatic interface in Python for building connectors and customizing pipelines.
* A command line interface to allow pipelines to run from shell commands, cron jobs, and other interfaces.
* A metadata repository for storing connectors, transformers and pipelines for re-use.
* A collection of free connectors, transformers, pipelines and examples for common data sources such as QuickBooks, SalesForce, Oracle and Apache.
* A collection of free extensions, such as the PHP extension package.
* An open source server to run connectors, transformers and pipelines.
Additional features
The framework contains a collection of standard connectors and transformers.
Connectors:
* database readers / writers
* RSS reader / generator
* XML reader
* JSON writer
* CSV reader / writer
* fixed width file reader / writer
Transformers:
* join
* sort
* filter
* lookup
* aggregate
* merge
* mixer
* sequence generator
* type converter
* date dimension
* user defined computations
Implementation
The browser-based, drag-and-drop designer interface requires a Web browser with the Adobe Flash plugin. Either Python 2.4.x or 2.5.x is required; Python 2.5.1 is recommended for new installations. The open source framework and its open source server stack can be downloaded from the community site as a self-installing package.
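As a small illustration of the Python requirement above, an installer or startup script could check an interpreter version against the supported series. This is a sketch under stated assumptions, not code from the framework: the names `SUPPORTED_SERIES`, `RECOMMENDED`, `is_supported` and `is_recommended` are hypothetical, and only the version numbers come from the text.

```python
# Supported release series and recommended version, taken from the text above.
SUPPORTED_SERIES = {(2, 4), (2, 5)}
RECOMMENDED = (2, 5, 1)


def is_supported(version):
    """Return True if a (major, minor, micro) tuple belongs to a supported series."""
    return tuple(version[:2]) in SUPPORTED_SERIES


def is_recommended(version):
    """Return True if the version is exactly the recommended release."""
    return tuple(version[:3]) == RECOMMENDED


if __name__ == "__main__":
    # Hypothetical candidate versions, checked without touching the running interpreter.
    for candidate in [(2, 4, 4), (2, 5, 1), (2, 6, 0)]:
        print(candidate, is_supported(candidate), is_recommended(candidate))
```

Passing the version in as a tuple, rather than reading `sys.version_info` inside the function, keeps the check easy to test; a caller could pass `sys.version_info` directly, since it supports the same slicing.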
Supported operating systems
* Windows XP and Server 2003
* Ubuntu 6.10/7.04 (server or desktop)
* Fedora Core 6
* RedHat Enterprise Linux 4/5
* CentOS 4 / 5
References
External links
* [http://snaplogic.org Project web site]
* [http://blog.snaplogic.org Project blog]
* [http://alexfletcher.typepad.com/all_bets_off/2007/04/snaplogic_innov.html Open Source Unleashed blog]
Wikimedia Foundation. 2010.