Talend Joins Google to Propose Dataflow as an ASF Incubator Project
Dataflow will be the first Apache Software Foundation project to offer a set of SDKs that abstract both the definition and the execution of data processing pipelines, supporting complex enterprise data ingestion and integration patterns, including routing as well as data and message transformations. Among the various Big Data open source projects, the data processing space is probably the most active and promising. There are many data processing engines and frameworks out there: some are fully open source, like Apache Spark, Apache Flink, and Apache Apex, while others are packaged and available as a service, such as Google Dataflow. Most of these Apache projects combine streaming and batch data processing and provide APIs at various levels to help developers build pipelines or data flows programmatically. Google is helping to lead this charge with an abstraction layer that allows pipelines defined with the Dataflow SDK to run on different runtime environments.
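To make the idea of such an abstraction layer concrete, here is a minimal toy sketch in Python. It is not the Dataflow SDK: every name in it (`Pipeline`, `BatchRunner`, `StreamingRunner`, `build_pipeline`) is hypothetical and chosen only for illustration. It shows a pipeline being defined once as a plain description of steps, then executed by two interchangeable runners.

```python
from typing import Any, Callable, Iterable, Iterator, List, Tuple

# Hypothetical names throughout: this is an illustration of the concept,
# NOT the Dataflow SDK API.


class Pipeline:
    """Records transforms as data; defining a pipeline executes nothing."""

    def __init__(self) -> None:
        self.steps: List[Tuple[str, Callable[[Any], Any]]] = []

    def map(self, fn: Callable[[Any], Any]) -> "Pipeline":
        self.steps.append(("map", fn))
        return self

    def filter(self, pred: Callable[[Any], bool]) -> "Pipeline":
        self.steps.append(("filter", pred))
        return self


class BatchRunner:
    """Materializes the whole input, then applies one step at a time."""

    def run(self, pipeline: Pipeline, source: Iterable[Any]) -> List[Any]:
        data = list(source)
        for kind, fn in pipeline.steps:
            if kind == "map":
                data = [fn(x) for x in data]
            else:  # "filter"
                data = [x for x in data if fn(x)]
        return data


class StreamingRunner:
    """Pushes each element through every step as soon as it arrives."""

    def run(self, pipeline: Pipeline, source: Iterable[Any]) -> Iterator[Any]:
        for item in source:
            keep = True
            for kind, fn in pipeline.steps:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):  # "filter" rejected this element
                    keep = False
                    break
            if keep:
                yield item


def build_pipeline() -> Pipeline:
    # One definition: trim whitespace, drop empty records, upper-case the rest.
    return Pipeline().map(str.strip).filter(bool).map(str.upper)
```

With this sketch, `BatchRunner().run(build_pipeline(), records)` and `list(StreamingRunner().run(build_pipeline(), records))` give the same result for the same input, even though one processes the data set as a whole and the other processes it element by element. The pipeline definition never changes; only the choice of runner does.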