Comparing Spark Connectors for MongoDB and Hadoop

Apache Spark is one of the fastest-growing big data projects in the history of the Apache Software Foundation.

With its memory-oriented architecture, flexible processing libraries and ease of use, Spark has emerged as a leading distributed computing framework for real-time analytics.

The table below compares two connectors for using Spark with MongoDB: the MongoDB Connector for Hadoop and the Stratio Spark-MongoDB Connector.

Feature                                              | MongoDB Connector for Hadoop           | Stratio Spark-MongoDB Connector
-----------------------------------------------------+----------------------------------------+--------------------------------
Machine Learning                                     | Yes                                    | Yes
SQL                                                  | Not currently                          | Yes
DataFrames                                           | Not currently                          | Yes
Streaming                                            | Not currently                          | Not currently
Python                                               | Yes                                    | Yes (using SparkSQL syntax)
Use MongoDB secondary indexes to filter input data   | Yes                                    | Yes
Compatibility with MongoDB replica sets and sharding | Yes                                    | Yes
MongoDB Support                                      | Yes (read and write)                   | Yes (read and write)
HDFS Support                                         | Yes (read and write)                   | Partial (write only)
Support for MongoDB BSON Files                       | Yes                                    | No
Commercial Support                                   | Yes (with MongoDB Enterprise Advanced) | Yes (provided by Stratio)


* Source: MongoDB whitepaper
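
One way to put the comparison to work is to encode it as data and ask which connector covers a given set of requirements. The snippet below is only a minimal sketch of that idea: the dictionary transcribes the table above (keeping just the base answer and dropping qualifiers such as "read and write"), and the names FEATURES and connectors_supporting are hypothetical helpers made up for this illustration. They are not part of either connector's API.

```python
# Feature matrix transcribed from the comparison table above.
# Only the base answer is kept; qualifiers such as "read and write" are dropped.
# All names here are illustrative and belong to neither connector's API.
HADOOP = "MongoDB Connector for Hadoop"
STRATIO = "Stratio Spark-MongoDB Connector"

FEATURES = {
    "Machine Learning": {HADOOP: "Yes", STRATIO: "Yes"},
    "SQL": {HADOOP: "Not currently", STRATIO: "Yes"},
    "DataFrames": {HADOOP: "Not currently", STRATIO: "Yes"},
    "Streaming": {HADOOP: "Not currently", STRATIO: "Not currently"},
    "Python": {HADOOP: "Yes", STRATIO: "Yes"},
    "Use MongoDB secondary indexes to filter input data": {HADOOP: "Yes", STRATIO: "Yes"},
    "Compatibility with MongoDB replica sets and sharding": {HADOOP: "Yes", STRATIO: "Yes"},
    "MongoDB Support": {HADOOP: "Yes", STRATIO: "Yes"},
    "HDFS Support": {HADOOP: "Yes", STRATIO: "Partial"},
    "Support for MongoDB BSON Files": {HADOOP: "Yes", STRATIO: "No"},
    "Commercial Support": {HADOOP: "Yes", STRATIO: "Yes"},
}


def connectors_supporting(required):
    """Return the connectors whose table entry is a plain "Yes" for every required feature."""
    return [
        connector
        for connector in (HADOOP, STRATIO)
        if all(FEATURES[feature][connector] == "Yes" for feature in required)
    ]


if __name__ == "__main__":
    # Needs SQL and DataFrames: only the Stratio connector qualifies.
    print(connectors_supporting(["SQL", "DataFrames"]))
    # Needs full HDFS and BSON file support: only the Hadoop connector qualifies.
    print(connectors_supporting(["HDFS Support", "Support for MongoDB BSON Files"]))
    # Needs streaming: neither connector qualifies according to the table.
    print(connectors_supporting(["Streaming"]))
```

Treating "Partial" as not supported is a deliberately strict choice for this sketch; a real evaluation would look at the qualifier (here, write-only HDFS access) before ruling a connector out.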
