Monday, April 27, 2015

Apache Ignite (incubating) vs Tachyon

After discovering that my explanation of the differences between Apache Ignite (incubating) and the Tachyon caching project had been deleted, I found out that my attempt to clarify the situation was purged as well.
At about the same time, I got a private email from the tachyon-users Google group explaining to me that my message "was deleted because it was a marketing message".

So, it looks like any message even slightly critical of the Tachyon project will be deleted as a 'marketing msg', in true FOSS spirit! Looks like the community building got off on the wrong foot on that one. So, I have decided to post the original message, which of course was sent back to me via email the moment it got posted in the original thread.

Judge for yourself:
From:  <kboudnik@gmail.com>
Date: Fri, Apr 10, 2015 at 11:46 PM
Subject: Re: Apche Ignite vs Tachyon
To: tachyon-users@googlegroups.com

You're just partially correct, actually.

Apache Ignite (incubating) is a fully developed In-Memory Computing (IMC) platform (aka data fabric). "Supporting for Hadoop ecosystem" is one of the components of the fabric. And it has two parts:
 - file system caching: a fully transparent cache that gives a significant performance boost to HDFS IO. In a way it's similar to what Tachyon tries to achieve. Unlike Tachyon, the cached data is an integral part of a bigger data fabric that can be used by any Ignite services.
 - MR accelerator that allows you to run "classic" MR jobs on the Ignite in-memory engine. Basically, Ignite MR (much like its SQL and other computation components) is just a way to work with data stored in the cluster memory. Shall I mention that Ignite MR is about 30 times - that's 3000% - faster than Hadoop MR? No code changes are needed, BTW ;)

When you say that "Tachyon... support big data stack natively." you should keep in mind that Ignite Hadoop acceleration is very native as well: you can run MR, Hive, HBase, Spark, etc. on top of IgniteFS without changing anything.

And here's the catch, BTW: file system caching in Ignite is a part of its 'data fabric' paradigm, like the services, advanced clustering, distributed messaging, ACID real-time transactions, etc. Adding the HDFS and MR acceleration layer was pretty straightforward as it was built on the advanced Ignite core, which has been in real-world production for 5+ years. However, it is very hard to achieve the same level of enterprise computing when you start from an in-memory file system like Tachyon. Not bashing anything - just saying.

I would encourage you to check ignite.incubator.apache.org: read the docs, try version 1.0 from https://dist.apache.org/repos/dist/release/incubator/ignite/1.0.0/ (setup is a breeze) and join our Apache community. If you are interested in using Ignite with Hadoop - Apache Bigtop offers this integration, including seamless cluster deployment, which lets you get started with a fully functional cluster in a few minutes.

In full disclosure: I am an Apache Incubator mentor for the Ignite project.

With best regards,
  Konstantin Boudnik

On Thursday, April 9, 2015 at 7:39:00 PM UTC-7, Pengfei Xuan wrote:
>
> To my understanding, Apache Ignite (GridGain) grows up from traditional

2 comments:

  1. Cos! You have always been honest, ever since I've known you from the JavaSoft days a decade ago. Keep it going.

  2. Congrats, Cos! Your post has been included in Hadoop Weekly #119.
