Oracle Corp at useR! Conference 2013 #useR2013 #rstats

This year’s R User Conference took place in Albacete, Spain. The conference has gathered R professionals and enthusiasts from all over the world since 2004, when it first began in Vienna. The sponsors this year were Revolution Analytics, Google, RStudio, Oracle, and TIBCO. Other companies such as OpenAnalytics and Mango Solutions were also present with a booth. Besides sponsoring the (more...)

The 3 ways Hadoop will change your Business Intelligence

“It’s the analytics, stupid!” Obviously, no offense is intended to the dear reader. It’s a wake-up call for all the people who are excited about Hadoop but lack BI vision. The BI people who lack infrastructure vision are also to blame. To blame for what? We’ll see later in this (more...)

InfoQ: Running the Largest Hadoop DFS Cluster

Since attending a Big Data event, Frankfurter Datenbanktage 2013, I have started to look at non-relational techniques as well. The RDBMS is not the correct and fitting answer to every data-related IT challenge.

I have frequently wondered how Facebook (more...)

What’s all the fuss about Big Data?

Mar 6, 2013

Big Data is the collective term for data sets so large and potentially complex that they are difficult to handle with traditional tools and applications such as relational database management systems. Scientists in the fields of physics, genetics and meteorology were early examples of those who encountered Big Data.



Everything you ever wanted to know about Big Data, but had no PDF to carry around!

Back in March 2012 I experienced an air mileage overflow: almost straight from Madrid I took a flight to Israel to speak at a Big Data conference, only to be back in Lisbon and fly again to Johannesburg in South Africa to meet several customers in the retail and manufacturing area. Back in Lisbon, I packed again for London [...]

Hadoop! What is it good for? Absolutely … everything!

In times of hysteria people tend to use their reptilian brain. This sub-brain, which has been with us since we were fish, or tadpoles, is what kicks in when we face the unknown. In computer science or information technology, organizations tend to hold on to emotions and rely less and less on reasoning. Could it be [...]

Linux 6 Transparent Huge Pages and Hadoop Workloads

This past week I spent some time setting up and running various Hadoop workloads on my CDH cluster. After some Hadoop jobs had been running for several minutes, I noticed something quite alarming: the system CPU percentages were extremely high.
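As a rough illustration of what a "system CPU percentage" measures, the share of time spent in kernel mode can be estimated from two samples of the aggregate `cpu` line in `/proc/stat`. This is only a minimal sketch, not the method used in the post; the function names `system_cpu_percent` and `read_cpu_line` are hypothetical, and it assumes the standard Linux field order (user, nice, system, idle, iowait, irq, softirq, steal).

```python
def system_cpu_percent(before, after):
    """Estimate the percentage of CPU time spent in kernel mode.

    Both arguments are the aggregate "cpu ..." line from /proc/stat,
    sampled at two different points in time.
    """
    def fields(line):
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line")
        return [int(value) for value in parts[1:]]

    b, a = fields(before), fields(after)
    deltas = [new - old for old, new in zip(b, a)]

    # Only the first eight counters are summed: user, nice, system, idle,
    # iowait, irq, softirq, steal. Guest time is already included in user.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0

    # Index 2 is the "system" counter; irq and softirq are tracked separately.
    return 100.0 * deltas[2] / total


def read_cpu_line(path="/proc/stat"):
    """Return the first line of /proc/stat (the aggregate cpu counters)."""
    with open(path) as handle:
        return handle.readline()
```

In practice one would call `read_cpu_line()` twice, a second or so apart, and pass both lines to `system_cpu_percent`; tools such as `top` and `vmstat` report a comparable figure.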

Platform Details

This cluster is composed of 2s8c16t (2 sockets, 8 cores, 16 threads) Xeon L5630 nodes with 96 GB of RAM, running CentOS Linux 6.2 with Java 1.6.0_30. The details of those are:

$ cat /etc/redhat-release
CentOS release 6.2 (Final)

$ uname -a
Linux chaos 2.6.32-220.7.1.el6.x86_64 #1 SMP Wed Mar 7 00:52:02 GMT 2012  (more...)

Buzz Around Non-Relational DBs

Reposting from my other blog

Last Saturday we (GITPRO – Global Indian Tech Professionals Association) arranged a Tech Talk on NoSQL (non-relational, actually) DBs and scaling Hadoop. It was very well attended. In the general introduction session, many of those who introduced themselves mentioned their interest in Hadoop and (more...)