Talend’s Offering for Big Data ETL

August 13th, 2013 by Chris Brinkman

Every day, more companies are realizing the potential of big data to deliver insights into very large volumes of data faster than ever before. The ability to perform this type of analysis on ever-larger volumes of data is very tempting to many corporations. The problem for many companies is that the landscape of products, solutions and strategies designed to work with big data is very crowded and seems to change on a daily basis.  Many of these solutions are similar in that they promise to let relatively non-technical users perform this type of development without the normal learning curve.

Talend’s open source offering, Talend Open Studio, fits into this landscape and promises the following features:

  • A solution for developing, testing, and deploying data management projects
  • A unified platform that makes data management and application integration easy
  • Vast productivity gains for developers through an easy-to-use graphical environment

Talend Open Studio

Talend Open Studio, in short, allows developers to drag, drop and connect the common components used in an ETL process.  Talend Open Studio then generates and submits MapReduce programs that run against your big data store.  MapReduce is the core programming model Hadoop uses to process large data sets.  The benefits of a solution like this are:

  • Ability to develop MapReduce programs, which are complicated by themselves, using a graphical user interface
    • This allows your development staff to become productive in MapReduce more quickly, without the normal learning curve, training, etc.
  • Talend Open Studio works with all the leading Hadoop offerings like Amazon, Cloudera, Greenplum, and Hortonworks
  • Talend Open Studio integrates with all the popular existing Hadoop tools like HBase, Hive, Pig, Sqoop, Cassandra, etc.
    • So if there is already an investment in any of these tools, that investment is not thrown away and can be used with Talend’s offering
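
For readers who have not worked with MapReduce directly, the model itself can be sketched in a few lines. The example below is only an illustration of the map, shuffle and reduce steps using plain Python on an in-memory list. It is not the code Talend Open Studio generates, and it does not use the Hadoop API. The function names (map_phase, shuffle, reduce_phase, run_job) are hypothetical and chosen just for this sketch.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for each word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key.

    In a real cluster the framework does this between the map and
    reduce steps; here it is simulated with a dictionary.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce over an in-memory list of records."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(run_job(["big data", "big insights"]))
```

Even this toy version shows why hand-written MapReduce has a learning curve: the developer has to think in terms of key/value pairs and separate phases rather than a single pass over the data, which is the part a graphical tool aims to hide.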

While Talend Open Studio is an open source solution and can be downloaded at no cost, there is no support provided with the free version.  Talend offers subscription-based products that provide the same functionality as the free version but also include a warranty, support from Talend and additional features like versioning, dashboards and a shared repository.

Hadoop is clearly becoming the solution of choice for companies interested in performing deep analysis on very large volumes of data.  One of the challenges is a shortage of development talent that understands and has worked with the plethora of Hadoop-based offerings.  Talend Open Studio is targeted at this market and provides features and capabilities that shorten the learning curve in implementing this type of big data analysis.

Does your company have a big data initiative?

If yes, what tools/programs do you use to manage big data and why? Please share your thoughts!
