Hive (HiveQL) SQL for Hadoop Big Data



In this  post I will share my experience with an Apache Hadoop component called Hive which enables you to do SQL on an Apache Hadoop Big Data cluster.

Being a great fun of SQL and relational databases, this was my opportunity to set up a mechanism where I could transfer some (a lot)  data from a relational database into Hadoop and query it with SQL. Not a very difficult thing to do these days, actually (more...)

Old ventures and new adventures

I have some news, two items actually.

First, today (it’s still 18th June in California) is my blog’s 8th anniversary!

I wrote my first blog post, about Advanced Oracle Troubleshooting, exactly 8 years ago, on 18th June 2007 and have written 229 blog posts since. I had started writing and accumulating my TPT script collection a couple of years earlier and now it has over 1000 files in it! And no, I don’t remember (more...)

Replicating Hive Data Into Oracle BI Cloud Service for Visual Analyzer using BICS Data Sync

In yesterday’s post on using Oracle Big Data Discovery with Oracle Visual Analyzer in Oracle BI Cloud Service, I said mid-way through the article that I had to copy the Hadoop data into BI Cloud Service so that Visual Analyzer could use it; at present Oracle Visual Analyzer is only available as part of Oracle BI Cloud Service (BICS) so at some point the data prepared by Big Data Discovery had to be moved (more...)

Notes on analytic technology, May 13, 2015

1. There are multiple ways in which analytics is inherently modular. For example:

  • Business intelligence tools can reasonably be viewed as application development tools. But the “applications” may be developed one report at a time.
  • The point of a predictive modeling exercise may be to develop a single scoring function that is then integrated into a pre-existing operational application.
  • Conversely, a recommendation-driven website may be developed a few pages — and hence also a few (more...)

Oracle Pre-Built Developer VMs and VMBox

Virtual machines (VM) are not new –it has been around for quite some time, and as a consultant I find myself use them all the time. As a matter of fact, just on my laptop and external drive there are at least 15 or 20 different virtual environment which I use for testing, experimenting, and for creating new blog posts.

The thing with virtual machines that you need to be a little more than just (more...)

Meet Sonra at CeBIT 2015

We will be presenting the Sonra Hadoop Quick Start Appliance at CeBIT next week in Hanover. Meet and greet us in Hall 2, Stand D52 (C58).

At Sonra we understand the difficulties faced by businesses when they begin their Big Data journey. We help you get started in days or weeks and immediately reap the benefits of Big Data. Sonra have packaged optimised Hadoop Supermicro hardware with MapR, the prime Hadoop distribution, and added our (more...)

Thumbs up for Hadoop User Group Ireland Meetup!

HUG_Ireland_logo_smallWe got excellent feedback for our first Hadoop User Group Ireland meetup. We wined, dined, and entertained more than 100 Hadoopers (and there was even beer left at the end of the night).

If you want to find out more about Sonra’s Hadoop Data Warehouse Quick Starter Solutions you can contact me or connect with me on LinkedIn.

For those of you who missed the event I have posted some pictures below. We have recorded (more...)

Using parameters and variables in Hive CLI

| Feb 19, 2015

In this blog post we look at how we can address a shortcoming in the Hive ALTER TABLE statement using parameters and variables in the Hive CLI (Hive 0.13 was used).

There’s a simple way to query Hive parameter values directly from CLI
You simply execute (without specifying the value to be set):

SET <parameter>;

for instance

SET hive.exec.compress.output;
---
hive.exec.compress.output=false

You may use those parameters directly in (more...)

Hadoop User Group Ireland Meetup

HUG_Ireland_logoJoin MapR and Sonra for the Hadoop User Group Ireland Meetup on 23 February at 6 pm at the Wayra offices (O2/Three building). You’ll learn more about the MapR distribution for Apache Hadoop through use cases, case studies and an introduction to the benefits of using the MapR platform.

Come by for this content-packed first event ending with the opportunity to socialise over beer and pizza kindly provided by Sonra.

 

Agenda:

What is (more...)

Get Hadoop Certified with Free Training from MapR

If you want to upskill and get certified on Hadoop you can now do so for free. Thanks to MapR. Over the next couple of weeks they are rolling out their on-demand Hadoop training courses. The highlight of the first batch of courses is Developing Hadoop Applications on Yarn.

Related posts

Big Data and the importance of Meta-Data

Data isn't really respected in businesses, you can see that because unlike other corporate assets there is rarely a decent corporate catalog that shows what exists and who has it.  In the vast majority of companies there is more effort and automation put into tracking laptops than there is into cataloging and curating information. Historically we've sort of been able to get away with this

Security Big Data – Part 7 – a summary

Over six parts I've gone through a bit of a journey on what Big Data Security is all about. Securing Big Data is about layers Use the power of Big Data to secure Big Data How maths and machine learning helps Why its how you alert that matters Why Information Security is part of Information Governance Classifying Risk and the importance of Meta-Data The fundamental point here is that

Hadoop and Oracle Technologies on BI Projects

Last night I attended an event powered by oGH and OBUG. Mark Rittman was invited to talk about; ‘Hadoop and Oracle Technologies on BI Projects’. This event has been organized to inform us about Hadoop combined with Oracle Technologies. Next to that the event was also meant as a start up of a BI / Warehousing SIG.…Read more Hadoop and Oracle Technologies on BI Projects

Big Data: Hadoop and Oracle technologies explained

MarkRittmanUnder the title “Hadoop and Oracle technologies on BI projects” Mark Rittman flew to The Netherlands on the 14th of July to visit the Oracle Usergroup Holland.

As I had obviously heard a lot about Hadoop, I never really did anything further with it and left it to a synaptic link to Gwen Shapira. This lack of action created a kind of threshold in the understanding of the technology. When I heard about this (more...)

Securing Big Data Part 6 – Classifying risk

So now your Information Governance groups consider Information Security to be important you have to then think about how they should be classifying the risk.  Now there are docs out there on some of these which talk about frameworks.  British Columbia's government has one for instance that talks about High, Medium and Low risk, but for me that really misses the point and over simplifies the

Securing Big Data Part 5 – your Big Data Security team

What does your security team look like today? Or the IT equivalent, "the folks that say no".  The point is that in most companies information security isn't actually something that is considered important.  How do I know this?  Well because basically most IT Security teams are the equivalent of the nightclub bouncers, they aren't the people who own the club, they aren't as important as the

Securing Big Data – Part 4 – Not crying Wolf.

In the first three parts of this I talked about how Securing Big Data is about layers, and then about how you need to use the power of Big Data to secure Big Data, then how maths and machine learning helps to identify what is reasonable and was is anomalous. The Target Credit Card hack highlights this problem.  Alerts were made, lights did flash.  The problem was that so many lights flashed and

Securing Big Data – Part 3 – Security through Maths

In the first two parts of this I talked about how Securing Big Data is about layers, and then about how you need to use the power of Big Data to secure Big Data.  The next part is "what do you do with all that data?".   This is where Machine Learning and Mathematics comes in, in other words its about how you use Big Data analytics to secure Big Data. What you want (more...)

Securing Big Data – Part 2 – understanding the data required to secure it

In the first part of Securing Big Data I talked about the two different types of security.  The traditional IT and ACL security that needs to be done to match traditional solutions with an RDBMS but that is pretty much where those systems stop in terms of security which means they don't address the real threats out there, which are to do with cyber attacks and social engineering.  An ACL is only

Introduction to Big Data, Hadoop and NoSQL

In the last few weeks I participated in the training of a DBA course in John Bryce education center in Israel.

The course is titled “Master DBA” – it’s an 8 month evening course to train new DBAs from head to tail. It’s divided into two parts; the first part is about SQL, PL/SQL, and OU “Oracle Database Administration Workshop” parts 1 and 2. The students are then encouraged to take the OCA and OCP (more...)