To process large volumes of data you want to do the work in


Part 1: 150-180 word, critical reply to this forum discussion with references.

The different types of NoSQL databases can be listed as Key-Value store, document-based, column-bases store, and lastly graph-based. NoSQL databases is very flexible in storing data because it stores the data in its natural form. It does not alter, conform or try to strip down the data in order to make it compatible with the database.

We use applications hosted on this databases on daily basis such as LinkedIn, Facebook, our banking apps, google map, and etc. Linked-in uses document-oriented because it most likely uses JSON (JavaScript object notation) and it stores documents that are semi-structured.

Google Map is a great example of column-family table because even though it handles a large amount of data, yet not all the data it stores is necessary in a retrieval process. Banking applications rely more on the document-oriented database because in Banking, data is constantly being written or stored but, data is also retrieved as well. Lastly, Social Network uses graph database in order to easily navigate relationships.

Two main advantages of graph database are node and relationship. Social network is built on the basis of connections between users and creating multiple (millions and billions) connections. In Graph database "relationships take first priority in graph databases.

This means your application doesn't have to infer data connections using things like foreign keys or out-of-band processing, such as MapReduce" ((Neo4j, n.d.). The popular NoSQL databases on the market per category are as follows: Amazon S3-Dynamo was created for key-value store, CouchDB was created for document-based store, HBase and Cassandra were created for column-based store, and Neo4j was created for graph- based store (Kumar, n.d.).

Part 2: 150-180 word, critical reply to this discussion with references.

Hadoop ecosystem of software packages, including MapReduce, HDFS, and a whole host of other software packages to support the import and export of data into and from HDFS (the Hadoop Distributed FileSystem). It runs in clusters with large distributed file system to support large scale computation.

To process large volumes of data, you want to do the work in parallel, and typically across many servers. Hadoop manages the distribution of work across many servers in a divide-and-conquer methodology known as MapReduce. Since each server houses a subset of your overall data set, MapReduce lets you move the processing close to the data to minimize network accesses to data that will slow down the task.

Some of the futures of Hadoop are:

Capable of storing and processing complex data: Hadoop is Capable of storing and processing complex unstructured datasets with out of data loss.

Great Computational ability: Hadoop can ran multiple machines in parallel in distributed model for fast processing

Highly Scalable: Hadoop supports horizontal scalability, from single machine to thousands without having to administer extensively.
Lesser Faults: Hadoop minimize network failures by redirecting jobs to other nodes if one node fails

NoSQL, on the other hand, is referring to non-relational or at least non-SQL database solutions such as HBase. It is about real-time, interactive access to data. NoSQL use cases often entail end user interactivity, like in web applications, but more broadly they are about reading and writing data very quickly.

Some of the futures of NoSQL are:

Multi-Model: NoSQL database are very flexible when handling data. They can use structured, unstructured equally

Distributed: NoSQL database that is designed to distributed at global scale. It scattered in multiple locations involving multiple data centers for read and write operation

Easily Scalable: NoSQL is built with a master less peer to peer architecture with all nodes being the same. This improves performance and allowing for continuous availability and high read and write speed.

Solution Preview :

Prepared by a verified Expert
Database Management System: To process large volumes of data you want to do the work in
Reference No:- TGS02682356

Now Priced at $10 (50% Discount)

Recommended (94%)

Rated (4.6/5)