Top 100 analytics companies ranked and scored by Mattermark

Let us move on from grass-eating sauropods and talk about who’s who in the analytics space.

Analytics companies are a dime a dozen. Everybody who provides a freaking dashboard is an analytics company. Anybody who merely mentions Google, Facebook, Hadoop, etc. in the same sentence is somehow into Big Data. Haven’t you stumbled across company pages where they claim to be experts in analytics and big data but they want you to schedule a (more...)

Bed bugs in Boston – Analysis of Boston 311 public dataset

Digging into the Boston public dataset can reveal interesting and juicy facts.

There is nothing juicy about bed bugs, but the data on Boston’s open bed bug cases is quite interesting and worth a look.

We uploaded the entire 50 MB data dump, which is around 500K rows, into the Data Visualizer and filtered the category for Bed Bugs. Splitting the date into its date hierarchy components, we then plotted the month (more...)
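The filter-and-split step described above can be sketched in a few lines of Python. This is only an illustration, not the Data Visualizer's method: the column names `open_dt` and `type`, the date format, and the sample rows are all assumptions standing in for the real 311 export.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical sample standing in for the 311 export; the real column
# names and date format may differ.
SAMPLE_CSV = """open_dt,type
2014-07-03 10:15:00,Bed Bugs
2014-07-21 08:02:00,Bed Bugs
2014-08-11 14:30:00,Bed Bugs
2014-08-12 09:00:00,Pothole Repair
2014-01-05 16:45:00,Bed Bugs
"""


def bed_bug_cases_by_month(csv_file, category="Bed Bugs"):
    """Count rows of the given category per calendar month (1-12)."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        if row["type"] != category:
            continue  # keep only the category we are filtering for
        # Split the timestamp into its components and keep the month.
        opened = datetime.strptime(row["open_dt"], "%Y-%m-%d %H:%M:%S")
        counts[opened.month] += 1
    return counts


monthly = bed_bug_cases_by_month(io.StringIO(SAMPLE_CSV))
for month in sorted(monthly):
    print(month, monthly[month])
```

To run this against a real export, pass an open file handle instead of the `io.StringIO` sample and adjust the column names to match the file's header.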

Using rlwrap with Apache Hive beeline for improved readline functionality

rlwrap is a nice little wrapper in which you can invoke command-line utilities and get them to behave with full readline functionality, just like you’d get at the bash prompt. For example, up/down arrow keys to move between commands, but also Home/End to go to the start/finish of a line, and even Ctrl-R to search through command history to rapidly find a command. It’s one of the standard config changes I’ll make to any system (more...)

Say “Big Data” One More Time (I dare you!)

This is quick. Saw it on Twitter this morning and it is just too funny not to share. Matt Aslett (@maslett), October 16, 2014: “Best slide of #Strataconf already? pic.twitter.com/hy8LnEnWhk” Have a great day!

Another Great OpenWorld

Steve at the Delphix Booth

Last week I attended Oracle OpenWorld 2014, and it was an outstanding event filled with great people, awesome sessions, and a few notable experiences.

Personally I thought the messaging behind the conference itself wasn’t as amazing and upbeat as OpenWorld 2013, but that’s almost to be expected. Last year there was a ton of buzz around the introduction of Oracle 12c, Big Data was a buzzword that people were totally excited (more...)

Oracle OpenWorld 2014 is over – What’s next?

Last week Oracle OpenWorld 2014 took place in San Francisco. I did not have the pleasure of attending this event, but thanks to social media and the World Wide Web it was possible to follow the highlights. If we check out the keynote of Thomas Kurian, we can learn that there are three major trends: Big…

Adding Oracle Big Data SQL to ODI12c to Enhance Hive Data Transformations

An updated version of the Oracle BigDataLite VM came out a couple of weeks ago, and as well as updating the core Cloudera CDH software to the latest release it also included Oracle Big Data SQL, the SQL access layer over Hadoop that I covered on the blog a few months ago (here and here). Big Data SQL takes the SmartScan technology from Exadata and extends it to Hadoop, presenting Hive tables (more...)

Major take-aways from Oracle OpenWorld 2014 – some relevant conclusions

Oracle OpenWorld 2014 is over. Just under a week, full to the brim with information, events, people, energy, plans, hopes and expectations. I have learned many, many things: small things, important facts and huge insights. I have also met many great people. In this article, I will attempt to sum up the largest themes of the conference as I have interpreted them. In subsequent publications, I will focus on some of them as well as discuss less grand but (more...)

News and Updates from Oracle Openworld 2014

It’s the Saturday after Oracle OpenWorld 2014, and I’m now home from San Francisco and back in the UK. It’s been a great week as usual, with lots of product announcements and updates to the BI, DW and Big Data products we use on current projects. Here’s my take on what was announced this last week.

New Products Announced

From a BI and DW perspective, the most significant product announcements were around Hadoop and (more...)

Oracle Big Data Information Management Reference Architecture

A few months ago I wrote a blog post about the Oracle Reference Architecture for Information Management. There is a new Oracle Big Data Information Management Reference Architecture online now. If you want to find out more about it, please check the links below: Big Data Information Management Reference Architecture; Oracle Information Management Architecture…

Responding in Real-Time with Big Data By Mala Ramakrishnan

| Aug 4, 2014

For an organization to respond in real-time it needs to acquire or develop systems
that can respond in real-time. Such systems need to be able to rapidly
determine that a response is required and determine also what the
appropriate and relevant response should be – they need to decide when
and how to act. These kinds of decision-making systems are known as
Decision Management Systems. To ensure that a response is delivered in
real-time, more (more…)

Permissions for both HDFS and local fileSystem paths

| Jul 18, 2014

Hi All,

Permission issues are one of the key sources of errors while setting up a Hadoop cluster. While debugging one such error, I found the table below on http://hadoop.apache.org/. It’s a good scorecard to keep handy.

Permissions for both HDFS and local fileSystem paths

The following table lists various paths on HDFS and local filesystems (on all nodes) and recommended permissions:

Filesystem Path User:Group Permissions
local dfs.namenode.name.dir hdfs:hadoop drwx------
local dfs.datanode.data.dir (more...)
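
When applying permissions from a table like this, the listing-style mode string usually has to be turned into the octal form that chmod-style tools accept. The helper below is a minimal sketch of that conversion; the function name `symbolic_to_octal` is hypothetical, and it deliberately ignores setuid, setgid and sticky-bit markers.

```python
def symbolic_to_octal(listing_mode):
    """Convert an ls-style mode such as 'drwx------' to octal digits like '700'.

    Assumes exactly 10 characters: one file-type character followed by nine
    permission characters. Special bits (s, t) are not handled in this sketch.
    """
    if len(listing_mode) != 10:
        raise ValueError("expected 10 characters, e.g. 'drwx------'")
    bits = 0
    # Skip the leading file-type character; every non-dash character sets a bit.
    for char in listing_mode[1:]:
        bits = (bits << 1) | (char != "-")
    return format(bits, "03o")


print(symbolic_to_octal("drwx------"))  # prints 700
```

The resulting digits are what you would pass to `chmod` on the local filesystem or to `hdfs dfs -chmod` for HDFS paths.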

Big Data doom mongers need to look outside of the marketing department

In every change there are hype machines that overplay and sages who call doom. Into the Big Data arena steps David Searls to proclaim, in an article over at ZDNet, that Big Data is a myth and simply hype which is set to burst: "But big data, he said, is nothing more than the myth that collecting vast amounts of data can help companies know customers better than those customers even know

Editor’s Choice award at ODTUG Kscope14: NoSQL and Big Data for the Oracle Professional

My paper on NoSQL and Big Data won the Editor’s Choice award at ODTUG Kscope14. Here are some key points from the paper: The relational camp made serious mistakes that limited the performance and usefulness of the relational model. NoSQL is based on the incorrect premise that tables in the relational model must be mapped to […]

Hadoop for Oracle Professionals Article on Oracle Scene

Oracle Scene (the publication of the United Kingdom Oracle Users Group) has published my article "Hadoop for Oracle Professionals", where I have attempted, like many others, to demystify terms such as Hadoop, Map/Reduce and Flume. If you are interested in Big Data and everything that comes with understanding it, you might find it useful.

A PDF version of the article can be downloaded here: http://www.proligence.com/art/oracle_scene_summ14_hadoop.pdf

MDM isn’t about data quality, it’s about collaboration

I'm going to state a sacrilegious position for a moment: the quality of data isn't a primary goal in Master Data Management. Now, before the perfectly correct 'Garbage In, Garbage Out' statement, let me explain. Data Quality is certainly something that MDM can help with, but it's not actually the primary aim of MDM. MDM is about enabling collaboration; collaboration is about the cross-reference

Lipstick on the iceberg – why the local view matters for IT evolution

There is a massive amount of IT hype that is focused on what people see: it's about the agile delivery of interfaces, about reporting, visualisation and interactional models. If you could weigh hype then it is quite clear that 95% of all IT is about this area. It's why we need development teams working hand-in-hand with the business, it's why animations and visualisation are massively important.

How to select a Hadoop distro – stop thinking about Hadoop

Sqoop, Flume, Pig, ZooKeeper. Do these mean anything to you? If they do then the odds are you are looking at Hadoop. The thing is that while that was cool a few years ago, it really is time to face the fact that HDFS is a commodity, MapReduce is interesting but not feasible for most users, and the real question is how we turn all that raw data in HDFS into something we can actually (more...)

Need for Defining Reference Architecture For Big Data

Hi Fellow Big Data Admirers,

With big data and analytics playing an influential role in helping organizations achieve a competitive advantage, IT managers are advised not to deploy big data in silos but instead to take a holistic approach toward it and define a base reference architecture even before contemplating positioning the necessary tools.

My latest print media article (5th in the series) for CIO magazine (ITNEXT) talks extensively about the need for a reference architecture in (more...)

Data Lakes will replace EDWs – a prediction

Over the last few years there has been a trend of increased spending on BI, and that trend isn't going away. The analyst predictions, however, have understandably been based on the mentality that the choice was between a traditional EDW/DW model or Hadoop. With the new 'Business Data Lake' type of hybrid approach it's pretty clear that the shift is underway for all vendors to have a hybrid