Connecting OBIEE 11.1.1.9 to Hive, HBase and Impala Tables for a DW-Offloading Project

In two previous posts this week I talk about a client request to offload part of their data warehouse top Hadoop, taking data from a source application and loading it into Hive tables on Hadoop for subsequent reporting-on by OBIEE11g. In the first post I talked about hosting the offloaded data warehouse elements on Cloudera Hadoop CDH5.3, and how I used Apache Hive and Apache HBase to support insert/update/delete activity to the fact and (more...)

Loading, Updating and Deleting From HBase Tables using HiveQL and Python

Earlier in the week I blogged about a customer looking to offload part of the data warehouse platform to Hadoop, extracting data from a source system and then incrementally loading data into HBase and Hive before analysing it using OBIEE11g. One of the potential complications for this project was that the fact and dimension tables weren’t append-only; Hive and HDFS are generally considered write-once, read-many systems where data is inserted or appended into a file (more...)

Using HBase and Impala to Add Update and Delete Capability to Hive DW Tables, and Improve Query Response Times

One of our customers is looking to offload part of their data warehouse platform to Hadoop, extracting data out of a source system and loading it into Apache Hive tables for subsequent querying using OBIEE11g. One of the challenges that the project faces though is how to handle updates to dimensions (and in their case, fact table records) when HDFS and Hive are typically append-only filesystems; ideally writes to fact tables should only require (more...)

Better Data Modeling: 7 Differentiating Characteristics of Data Vault 2.0

Hard to believe that the 2nd Annual World Wide Data Vault Consortium (WWDVC15) is NEXT WEEK in beautiful Stowe Vermont. It promises to be an excellent event. The speakers include myself, Claudia Imhoff, Dan Linstedt (the inventor of Data Vault), Scott Ambler, Roelant Vos, Dirk Lerner and many more. The focus will be DV 2.0, […]

OBIEE 11.1.1.9 Now Enables HiveServer2 and Cloudera Impala Datasources

As you all probably know I’m a big fan of Oracle’s BI and Big Data products, but something I’ve been critical of is OBIEE11g’s lack of support for HiveServer2 connections to Hadoop clusters. OBIEE 11.1.1.7 supported Hive connections using the older HiveServer1 protocol, but recent versions of Cloudera CDH4 and CDH5 use the HiveServer2 protocol by default and OBIEE 11.1.1.7 wouldn’t connect to them; not unless you switched (more...)

Presentation Slides and Photos from the Rittman Mead BI Forum 2015, Brighton and Atlanta

It’s now the Saturday after the two Rittman Mead BI Forum 2015 events, last week in Atlanta, GA and the week before in Brighton, UK. Both events were a great success and I’d like to say thanks to the speakers, attendees, our friends at Oracle and my colleagues within Rittman Mead for making the two events so much fun. If you’re interested in taking a look at some photos from the two events, I’ve put (more...)

Oracle Pre-Built Developer VMs and VMBox

Virtual machines (VM) are not new –it has been around for quite some time, and as a consultant I find myself use them all the time. As a matter of fact, just on my laptop and external drive there are at least 15 or 20 different virtual environment which I use for testing, experimenting, and for creating new blog posts.

The thing with virtual machines that you need to be a little more than just (more...)

Friday Philosophy – Know Your Audience

There are some things that are critical for businesses that can be hidden or of little concern to those of us doing a technical job. One of those is knowing who your customers are. It is vital to businesses to know who is buying their products or services. Knowing who is not and never will buy their products is also important (don’t target the uninterested) and knowing and who is not currently buying and might (more...)

Handouts – Introducing Oracle’s Information Management Reference Architecture

One of the great things of working in the Oracle Business Analytics industry is the fact that there is a very active community. Both online as well as offline. Oracle supports these activities where possible. Last week I attended an offline session at Oracle HQ in the Netherlands. This session was a event organized by…Read more Handouts – Introducing Oracle’s Information Management Reference Architecture

Meet Sonra at CeBIT 2015

We will be presenting the Sonra Hadoop Quick Start Appliance at CeBIT next week in Hanover. Meet and greet us in Hall 2, Stand D52 (C58).

At Sonra we understand the difficulties faced by businesses when they begin their Big Data journey. We help you get started in days or weeks and immediately reap the benefits of Big Data. Sonra have packaged optimised Hadoop Supermicro hardware with MapR, the prime Hadoop distribution, and added our (more...)

Thumbs up for Hadoop User Group Ireland Meetup!

HUG_Ireland_logo_smallWe got excellent feedback for our first Hadoop User Group Ireland meetup. We wined, dined, and entertained more than 100 Hadoopers (and there was even beer left at the end of the night).

If you want to find out more about Sonra’s Hadoop Data Warehouse Quick Starter Solutions you can contact me or connect with me on LinkedIn.

For those of you who missed the event I have posted some pictures below. We have recorded (more...)

Hadoop User Group Ireland Meetup

HUG_Ireland_logoJoin MapR and Sonra for the Hadoop User Group Ireland Meetup on 23 February at 6 pm at the Wayra offices (O2/Three building). You’ll learn more about the MapR distribution for Apache Hadoop through use cases, case studies and an introduction to the benefits of using the MapR platform.

Come by for this content-packed first event ending with the opportunity to socialise over beer and pizza kindly provided by Sonra.

 

Agenda:

What is (more...)

Patching the Big Data Appliance

I have been patching engineered systems since the launch of the Exadata V2 and recently i had the opportunity to patch the BDA we have in house. As far as comparisons go, this is were the similarities stop between Exadata and a Big Data Appliance (BDA) patching.
Our BDA is a so called startes rack consisting of 6 nodes running a hadoop cluster, for more information about this read my First Impressions blog post. On (more...)

Oracle Business Analytics Update

Last week I visited the Oracle Business Analytics Partner Update at Oracle NL in Utrecht, the Netherlands. The NL (Pre-) Sales gave us an update of the Oracle Analytics Roadmap. There is a lot of movement in the world of Oracle Analytics. I have written about the publications during Oracle Open World 2014. When we look back…Read more Oracle Business Analytics Update

Presentaties februari 2015

Presentaties februari 2015

Voor februari 2015 staan er diverse presentaties op de planning met name op het gebied van architectuur en agile:

Kennissesies agile architectuur bij Ordina

Op 5 februari organiseert Ordina een kennissessie over agile architectuur in de praktijk. Net als op het LAC zal ik een presentatie houden over de waarde van architectuur en time-to-market. Waarin ik laat zien dat juist architectuur waarde en snelheid kan bieden bij de time-to-market. Dit aan de (more...)

Big Data and the importance of Meta-Data

Data isn't really respected in businesses, you can see that because unlike other corporate assets there is rarely a decent corporate catalog that shows what exists and who has it.  In the vast majority of companies there is more effort and automation put into tracking laptops than there is into cataloging and curating information. Historically we've sort of been able to get away with this

Security Big Data – Part 7 – a summary

Over six parts I've gone through a bit of a journey on what Big Data Security is all about. Securing Big Data is about layers Use the power of Big Data to secure Big Data How maths and machine learning helps Why its how you alert that matters Why Information Security is part of Information Governance Classifying Risk and the importance of Meta-Data The fundamental point here is that

Securing Big Data Part 6 – Classifying risk

So now your Information Governance groups consider Information Security to be important you have to then think about how they should be classifying the risk.  Now there are docs out there on some of these which talk about frameworks.  British Columbia's government has one for instance that talks about High, Medium and Low risk, but for me that really misses the point and over simplifies the

Securing Big Data Part 5 – your Big Data Security team

What does your security team look like today? Or the IT equivalent, "the folks that say no".  The point is that in most companies information security isn't actually something that is considered important.  How do I know this?  Well because basically most IT Security teams are the equivalent of the nightclub bouncers, they aren't the people who own the club, they aren't as important as the

Securing Big Data – Part 4 – Not crying Wolf.

In the first three parts of this I talked about how Securing Big Data is about layers, and then about how you need to use the power of Big Data to secure Big Data, then how maths and machine learning helps to identify what is reasonable and was is anomalous. The Target Credit Card hack highlights this problem.  Alerts were made, lights did flash.  The problem was that so many lights flashed and