ZooKeeper: Distributed Process Coordination, by Flavio Junqueira and Benjamin Reed

Apache ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these kinds of services are used in some form or another by distributed applications.
Why should you learn about ZooKeeper? If you are using an application such as HBase, Neo4j, Solr, (more...)

Learned – Data Science Boot Camp: Acquiring and Transforming Big Data

I found a video lecture series: Data Science Boot Camp. It's great for learning, and I hope readers will benefit from it. I learned a bit from it today and downloaded the example code to play with. Lesson 1: I learned to use a Flume agent and to read the data with Hive.
Terminal 1: run (more...)

Learned a bit – ACCUMULO – set classpath in accumulo-site.xml

Thank you for the comment on my post, Learned a bit - ACCUMULO. Following it, I modified "<name>general.classpaths</name>" in the conf/accumulo-site.xml file. It works -:)
       Add the following for hadoop-2.0
       $HADOOP_PREFIX/share/hadoop/hdfs/. (more...)

Learned a bit – ACCUMULO – generate_monitor_certificate.sh

After the Learned a bit - ACCUMULO post, I could monitor Accumulo at http://localhost:50095/status. How do you use HTTPS? It's very easy: create a certificate and modify the accumulo-site.xml file.
[root@centos01 bin]# ./generate_monitor_certificate.sh
What is your first and last name?
  [Unknown]:  Surachart Opun
What is the name of your organizational unit?

Accumulo – Application Development, Table Design, and Best Practices

The NSA started building Accumulo in 2008 and used the Google BigTable architecture as a starting point. Accumulo is a NoSQL database built around a simple key/value data store. BTW, Accumulo joined the Apache community in 2011.
Why is Accumulo interesting? Security! It's developed from the BigTable data model and (more...)
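
To make the key/value idea a bit more concrete, here is a minimal Python sketch of a sorted key/value store where each key carries a visibility label, which is how Accumulo's cell-level security is usually described. This is only a toy model, not the Accumulo API: the names `Key`, `is_visible`, and `scan`, the sample table, and the simplified label syntax (labels combined with `&`, `|`, and parentheses) are all assumptions made for illustration.

```python
import re
from typing import Dict, List, NamedTuple, Set, Tuple


class Key(NamedTuple):
    """Hypothetical stand-in for an Accumulo-style key."""
    row: str
    family: str
    qualifier: str
    visibility: str  # e.g. "admin&(audit|ops)"; empty means visible to everyone
    timestamp: int


def is_visible(expression: str, auths: Set[str]) -> bool:
    """Evaluate a simplified visibility expression against a set of authorizations.

    Operators are applied left to right; use parentheses when mixing & and |.
    """
    tokens = re.findall(r"[A-Za-z0-9_]+|[&|()]", expression)
    if not tokens:
        return True  # unlabeled cells are readable by any user
    pos = 0

    def parse_expr() -> bool:
        nonlocal pos
        result = parse_term()
        while pos < len(tokens) and tokens[pos] in ("&", "|"):
            op = tokens[pos]
            pos += 1
            rhs = parse_term()
            result = (result and rhs) if op == "&" else (result or rhs)
        return result

    def parse_term() -> bool:
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = parse_expr()
            pos += 1  # skip the closing ")"
            return value
        return tok in auths  # a plain label: does the user hold it?

    return parse_expr()


def scan(table: Dict[Key, str], auths: Set[str]) -> List[Tuple[Key, str]]:
    """Return visible entries in key order (newest timestamp first within a cell)."""
    def order(item: Tuple[Key, str]):
        k = item[0]
        return (k.row, k.family, k.qualifier, k.visibility, -k.timestamp)

    return [(k, v) for k, v in sorted(table.items(), key=order)
            if is_visible(k.visibility, auths)]


# Made-up sample data for the sketch.
table = {
    Key("user1", "info", "name", "", 1): "Alice",
    Key("user1", "info", "ssn", "admin&audit", 1): "123",
    Key("user2", "info", "name", "public|admin", 1): "Bob",
}

public_view = [v for _, v in scan(table, {"public"})]
admin_view = [v for _, v in scan(table, {"admin", "audit"})]
```

With these sample rows, a reader holding only the `public` authorization never sees the `ssn` cell, while a reader holding both `admin` and `audit` sees every cell. The same table serves both readers; only the authorizations passed to the scan differ.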