Wednesday, 17 June 2015

Pig use case - change data capture

Create the following tables in MySQL and populate them with data.

Tuesday, 16 June 2015

Cassandra General Questions


  1. What do you mean by "seed" exactly?
    A: The IP address of a node in the cluster, configured on a new node just before it is introduced to the cluster.

Cassandra Keys Primary, Partition, and Cluster IN(), and Range Queries


This post explains the new CQL3 terminology and the model changes since the legacy Thrift architecture. The examples and exercises given here assume that your queries do not use ALLOW FILTERING, so that they run as fast as possible. ALLOW FILTERING is, however, also used in the highest-performing systems for carefully selected and benchmarked usage patterns.

Monday, 15 June 2015

Summary of Hadoop concepts

Basics of Hadoop

  • Apache Hadoop is a software framework.
  • Hadoop is based on the concept of divide and conquer.
  • It uses distributed processing for computation and storage.
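
The divide-and-conquer idea in the points above can be sketched in a few lines of Python. This is only an illustration and does not use Hadoop's API: the function names (split_into_chunks, count_words, merge_counts, word_count) are made up for this example, the chunks stand in for data blocks, and a local thread pool stands in for the nodes of a cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(lines, chunk_size):
    """Divide: cut the input into fixed-size chunks (stand-ins for data blocks)."""
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]


def count_words(chunk):
    """Conquer: count the words of one chunk, independently of all other chunks."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def merge_counts(partials):
    """Combine: merge the per-chunk counts into a single result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def word_count(lines, chunk_size=2, workers=2):
    """Run the three steps; the thread pool is an assumed stand-in for cluster nodes."""
    chunks = split_into_chunks(lines, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, chunks))
    return merge_counts(partials)


if __name__ == "__main__":
    sample = ["hadoop stores data", "hadoop processes data", "pig runs on hadoop"]
    print(word_count(sample))
```

Because each chunk is counted without looking at any other chunk, the work can be spread over as many workers as there are chunks, which is the property a distributed framework relies on.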