On the tenth day of Christmas, my true love gave to me Ten lords a-leaping. The topic of Big Data is often encountered when talking about NoSQL, so let’s give it a nod. In 1998, Sergey Brin and Larry Page invented an algorithm for ranking web pages (The Anatomy of a Large-Scale Hypertextual Web Search […]
For the last few weeks I have been engaged with a customer, helping them with the remediation of an Endeca project. During the remediation, I faced a typical challenge: all the graphs and EQLs were erroring out. After doing some research, I found out that it's a known issue. I spent a good amount (more...)
There is an interesting article on Forbes where Paul Sonderegger from Oracle is making the case that you have to jump onto the “Big Data” bandwagon without delay if you want to avoid your big-data-using competitors crushing you.
But he would say that, wouldn’t he?
In reality, most companies already (more...)
[This entry is part 3 of 3 in the series Hadoop Streaming]
In our first MapReduce with Hadoop Streaming in bash article, we took a collection of Stephen Crane poems and used a MapReduce job to calculate ‘term frequency’, meaning we counted the number of times each word in (more...)
[This entry is part 2 of 2 in the series Hadoop Streaming]
In MapReduce with Hadoop Streaming in bash – Part 1 we found the ‘term frequency’ of words within a collection of documents. For the documents I chose 8 Stephen Crane poems, and our bash Map and Reduce (more...)
[This entry is part 1 of 1 in the series Hadoop Streaming]
So to commemorate my recent certification and because my Java absolutely sucks, I decided to do a common algorithm using Hadoop Streaming.
Hadoop Streaming allows you to write MapReduce code in any language that can process (more...)
The Oracle guys running the Big Data 4 the Enterprise Meetup are always apologetic about marketing. The novelty is quite amusing. They do this because most Big Data Meetups are full of brash young people from small start-ups who use cool open source software. They choose cool open source software (more...)
In his recent Forbes report, Greg Satell lays down 5 steps to get Big Data working in your business. The first four are very well captured, but it was the fifth that really caught my attention: “Adopt a Big Data Mindset”. This is exactly where I want to drill in (more...)
Thank you to all who attended my sessions at the NYOUG Fall Conference this morning. I appreciate you spending your most precious commodity, your time, with me. I sincerely hope you found both presentations enlightening as well as entertaining.
Please see the details of the sessions below along with the (more...)
Yesterday's UKOUG Analytics event was a mixture of presentations about OBIEE with sessions on the frontiers of data analysis. I'm not going to cover everything, just dipping into a few things which struck me during the day.
During the day somebody described dashboards as "Fisher Price activity centres for managers". (more...)
This year’s R User Conference took place in Albacete (Spain). The conference has gathered R professionals and enthusiasts from all over the world since 2004, when it first began in Vienna. The sponsors this year were Revolution Analytics, Google, RStudio, Oracle, and TIBCO. Other companies like OpenAnalytics and Mango Solutions were also present with a booth. Besides sponsoring the (more...)
Big Data – The New Information
Before asking the crystal ball what Big Data can do for you, sit back and think about these four questions: Where’s the new information? Where could it be? If it was in the right place, what could happen? (Challenges of the main industries) What are the (more...)
Hello readers of my infrequent blog posts! I have started a new job, working on documentation for Cloudera, specifically for the Impala project, which is bringing fast interactive SQL to the Hadoop ecosystem. Read the Impala documentation. Download the Impala software. Get the QuickStart VM to play around with a (more...)
“It’s the analytics, stupid!” Obviously, no offense is intended toward the dear reader. It’s a wake-up call for all the people who are excited about Hadoop but lack BI vision. The BI people who lack infrastructure vision are also to blame. Blame for what? We’ll see later in this (more...)
Since I attended a Big Data event, Frankfurter Datenbanktage 2013, I have also started to look at non-relational techniques. The RDBMS is not the correct and fitting answer to every aspect of data-related IT challenges.
Frequently I wondered about how Facebook (more...)
What’s all the fuss about Big Data?
Big Data is the collective term for data sets so large and potentially complex that they are difficult to handle using traditional tools and applications such as relational database management systems. Scientists in the fields of physics, genetics and meteorology were early examples of those who encountered Big Data.