Past few months I have been meeting with clients and discussing their potential need of Big Data. The discuss gets to the bottom of , do they really need the Big Data ? The below link to my ITNext article talks about As big data goes bigger,IT managers are challenged with the task of identifying data that qualifies for big and finding appropriate solutions to process it.
Click Here To Read Full Article (more...)
While doing a comparison analysis for building a reference architecture for Big Data technology stumbled on a very impressive Open source Big Data Technology mashup . Thanks to http://www.bigdata-startups.com/ . The most impressive part of this mashup is breaking the whole Big Data operational paradigm into multiple stages and giving available opensource technology.
Hope This Helps
Sunil S Ranka
“Superior BI is the antidote to Business Failure”
Recently at one of the client, we had a situation , where in Hadoop was taking lot longer that anticipated time to generate a file. The graph needed the file as an input, but since file was not getting generated on time, Endeca graph was picking up partially created file, causing data issue. After looking into the issue, the best bet was to have a task dependency, we looked into running clover ETL (more...)
Last few weeks I have been engaged with a customer, helping them them with remediation of Endeca project. During remediation, faced a typical challenge, where all the graphs and EQLs were erroring out. After doing some research found out that its a known issue . I spent good amount (more...)
Wishing you all readers!! a very happy new year. 2013 is over and dawn of 2014 has arrived. It just feel like yesterday and now we are here sitting and waiting for the year number to change. By the time I am writting blog, Australia, Mumbai and Dubai (more...)
Volume on BigData being the constant challenge, as an administrator, you will have to keep a tab on the data growth, at the same time you need to make sure there is spurge growth of unwanted objects or folders. Typically you would want to be worried about the (more...)
Recently I have spending most of my time on Big Data projects,using CDH 4.X. Understanding key component of hadoop infrastruture is very necessary, But the MapReduce (MR) is the most important for processing and aggregrating data. For getting the best of the performance, one needs to know (more...)
With replication and fault tolerance, an inbuilt feature of Hadoop. I was always curious to know how blocks are replicated. Got this information while reading “Hadoop The Definitive Guide Edition – 3 ” in chapter 3 “The Hadoop Distributed Filesystem”. Thought would be interesting to share.
Out of the box hadoop provides a benchmarking mechanism for your cluster. While doing the same on Cloudera cluster, it was a fun ride, hence thought will share the same to reduce the pain and increase the fun.
Before you begin anything, set the HADOOP_HOME.The below command (more...)
While while running a simple
beeline > select count(*) from samvi_test_table;
Got the following error.
Job Submission failed with exception 'org.apache.hadoop.security.AccessControlException(Permission denied: user=root, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker. (more...)