Transactional integrity is relevant only when data is created, updated, or deleted. It is therefore not pertinent in pure data warehousing and mining contexts, where data is loaded once and then only read. By the same token, batch-centric Hadoop-based analytics on warehoused data is not subject to transactional requirements.

Many data sets, such as web traffic log files, social networking status updates (including tweets or Buzz posts), advertisement click-through records, road-traffic data, stock market tick data, and game scores, are primarily, if not exclusively, written once and read many times. Data sets that are written once and read multiple times have limited or no transactional requirements.
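The write-once, read-many pattern can be made concrete with a small sketch. This is a hypothetical illustration (the file path and record shape are invented): records such as web log events are only ever appended, never updated in place, so repeated reads need no transactional coordination.

```python
# Hypothetical sketch of a write-once, read-many data set:
# records are appended exactly once and then read any number of times.
import json
import tempfile

def append_events(path, events):
    # Append-only writes: existing records are never modified or deleted.
    with open(path, "a") as f:
        for event in events:
            f.write(json.dumps(event) + "\n")

def read_events(path):
    # Reads can be repeated freely; no locking or transactions are needed
    # because the data never changes after it is written.
    with open(path) as f:
        return [json.loads(line) for line in f]

log = tempfile.NamedTemporaryFile(suffix=".log", delete=False).name
append_events(log, [{"url": "/home", "status": 200}])
append_events(log, [{"url": "/about", "status": 404}])
print(len(read_events(log)))  # → 2
```

Because nothing is ever rewritten, concurrent readers always see a consistent prefix of the log, which is why such data sets sit comfortably outside an RDBMS.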

Some data sets are updated and deleted, but these modifications are often limited to a single item rather than a range within the data set. In other cases, updates are frequent and involve range operations. If range operations are common and the integrity of updates is required, an RDBMS is the best choice. If atomicity at the level of an individual item is sufficient, then column-family databases, document databases, and a few distributed key/value stores can guarantee it. If a system needs transactional integrity but can tolerate a window of inconsistency, eventual consistency is a possibility. HBase and Hypertable offer row-level atomic updates, coordinating consistent state through Paxos-style consensus services (ZooKeeper and Hyperspace, respectively). MongoDB offers document-level atomic updates. NoSQL databases that follow a master-slave replication model also lean toward stronger consistency, since all writes are funneled through a single master.
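The distinction between single-item atomicity and range integrity can be sketched with a toy in-memory store (the class and method names below are invented for illustration). A per-key lock makes each individual read-modify-write atomic, which is roughly the guarantee column-family and document stores give for one row or document; nothing here protects a multi-key range update, which is the case where an RDBMS transaction earns its keep.

```python
# Sketch, under assumed names: single-item atomicity via one lock per key.
import threading

class ItemAtomicStore:
    """Toy store offering atomic updates to ONE item at a time."""

    def __init__(self):
        self._data = {}
        self._locks = {}
        self._registry_lock = threading.Lock()

    def _lock_for(self, key):
        # Lazily create exactly one lock per key.
        with self._registry_lock:
            return self._locks.setdefault(key, threading.Lock())

    def update(self, key, fn, default=0):
        # Atomic read-modify-write for a single item; there is no way
        # to hold several keys' locks together, so a "range" update
        # across keys is NOT atomic in this model.
        with self._lock_for(key):
            self._data[key] = fn(self._data.get(key, default))
            return self._data[key]

store = ItemAtomicStore()
threads = [threading.Thread(target=lambda: store.update("hits", lambda v: v + 1))
           for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(store.update("hits", lambda v: v))  # → 100
```

A real-world analog of the single-item case is MongoDB's `update_one` with operators like `$inc`, which is atomic for the one document it touches.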

Source of information: NoSQL
