Apache Impala Internals Deep Dive with Tanel Poder + Gluent New World Training Month

We are running a “Gluent New World training month” this July and have scheduled 3 webinars on the following Wednesdays!

The first webinar, with Michael Rainey, is going to cover modern alternatives to the traditional old-school “ETL on a RDBMS” approach for data integration and sharing. Then on the next Wednesday I will demonstrate some of the Apache Impala SQL engine’s internals, with commentary from an Oracle database geek’s angle (I plan to get pretty (more...)

Machine Learning Algorithm Cheat Sheet

Apr 10, 2017

With so many algorithms around, it's always a struggle to find out which algorithm could be suitable for the problem I want to solve. Microsoft has done an amazing job of giving us a place to start. Please find attached the Machine Learning Algorithm Cheat Sheet.


Hope This Helps

Sunil S Ranka


I’m speaking at Advanced Spark Meetup & attending Deep Learning Workshop in San Francisco

In case you are interested in the “New World” and happen to be in the Bay Area this week (19 & 21 Jan 2017), there are two interesting events that you might want to attend (I’ll speak at one and attend the other):

Advanced Spark and TensorFlow Meetup

I’m speaking at the advanced Apache Spark meetup and showing different ways of profiling applications, with the main focus on CPU efficiency. This is a free Meetup in San Francisco hosted (more...)

GNW05 – Extending Databases With the Full Power of Hadoop: How Gluent Does It

It’s time to announce the next webinar in the Gluent New World series. This time I will deliver it myself (and let’s have some fun :-)

Details below:

GNW05 – Extending Databases With the Full Power of Hadoop: How Gluent Does It

NB! If you want to move to the "New World" - offload your data and workloads to Hadoop, without having to re-write your existing applications - check out Gluent. We are making history! (more...)

Gluent Podcast with Mark Rittman

Mark Rittman has been publishing his podcast series (Drill to Detail) for a while now, and I sat down with him at the UKOUG Tech 2016 conference to discuss Gluent and its place in the new world.

This podcast episode is about 49 minutes long and explains why I decided to go on to build Gluent a couple of years ago and where I see the enterprise data world going in (more...)

Database Migration and Integration using AWS DMS



Amazon Web Services (AWS) recently released a product called AWS Database Migration Service (DMS) to migrate data between databases.

The experiment

I have used AWS DMS to try a migration from a source MySQL database to a target MySQL database, a homogeneous database migration.

The DMS service puts a resource in the middle, the Replication Instance (an automatically created EC2 instance), plus source and target Endpoints. Then you move data from the source (more...)

Gear up for #AIOUG OTN Yathra’ 2016

Guys, AIOUG is back again with OTN Yathra’ 2016. It is a series of technology evangelist events organized by the All India Oracle Users Group, touring six cities across the length and breadth of the country. It was my extreme pleasure to be a part of it in 2015 and I’m pleased to announce that … Continue reading

More Animals in Big Data Zoo – Big Data Landscape for 2016

Mar 25, 2016

Hi All

While surfing the net I stumbled upon the Big Data Landscape for 2016 image, and it was very impressive to see so many new animals in the Big Data Zoo.

 

New Animals

Hope This Helps

Sunil S Ranka


Getting Started With Sample Programs for Apache Kafka 0.9

Read this article on my new blog.

Ted Dunning and I have worked on a tutorial that explains how to write your first Kafka application. In this tutorial you will learn how to install and start Kafka, and how to create and run a producer and a consumer.

You can find the tutorial on the MapR blog: Getting Started with Sample Programs for Apache Kafka 0.9

Query existing HBase tables with SQL using Apache Phoenix

Spending a bit more time with Apache Phoenix and reading my previous post again, I realised that you can use it to query existing HBase tables. That is, NOT tables created using Apache Phoenix, but tables created directly in HBase, the columnar NoSQL database in Hadoop.

I think this is cool as it gives you the ability to use SQL on an HBase table.

To test this, let's say you log in to HBase and you create an HBase (more...)
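
As a rough illustration of the idea (not taken from the original post): Phoenix can expose an existing HBase table through a view whose table name, column family and column qualifiers match the HBase ones, with each name double-quoted to preserve its case. The small Python helper below only builds such a statement as a string. The table name t1, column family f1, qualifier val, key column name pk and the VARCHAR types are all assumptions made for the example, and VARCHAR only fits if the values were written to HBase as strings.

```python
def phoenix_view_ddl(table, family, qualifiers, key_column="pk"):
    """Build a Phoenix CREATE VIEW statement over an existing HBase table.

    All names passed in are hypothetical examples; nothing here talks to
    HBase or Phoenix, it only assembles the SQL text.
    """
    def quote(name):
        # Phoenix upper-cases unquoted identifiers, so names are wrapped in
        # double quotes to match the case-sensitive HBase names exactly.
        if '"' in name:
            raise ValueError(f"identifier must not contain a double quote: {name!r}")
        return '"' + name + '"'

    # The HBase row key is mapped to the view's primary key column.
    columns = [quote(key_column) + " VARCHAR PRIMARY KEY"]
    # Every other column is addressed as "family"."qualifier".
    for qualifier in qualifiers:
        columns.append(quote(family) + "." + quote(qualifier) + " VARCHAR")

    return "CREATE VIEW " + quote(table) + " (" + ", ".join(columns) + ")"


if __name__ == "__main__":
    print(phoenix_view_ddl("t1", "f1", ["val"]))
```

The generated statement would then be run from a Phoenix client such as sqlline. Because the identifiers are quoted, later queries have to quote them in the same way, for example SELECT "pk", "val" FROM "t1".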

Apache Phoenix, SQL is getting closer to Big Data



Here is a post about another project in the Big Data world which, like Apache Hive from my previous post, enables you to do SQL on Big Data. It is called Apache Phoenix.

Phoenix is a bit different, and a bit closer to my heart too. As I read the documentation on Apache Phoenix, the words 'algebra' and 'relational algebra' came up a few times, and that means only one thing: SQL! The use of (more...)

Hive (HiveQL) SQL for Hadoop Big Data



In this post I will share my experience with an Apache Hadoop component called Hive, which enables you to do SQL on an Apache Hadoop Big Data cluster.

Being a great fan of SQL and relational databases, this was my opportunity to set up a mechanism where I could transfer some (a lot of) data from a relational database into Hadoop and query it with SQL. Not a very difficult thing to do these days, actually (more...)

Meet Sonra at CeBIT 2015

We will be presenting the Sonra Hadoop Quick Start Appliance at CeBIT next week in Hanover. Meet and greet us in Hall 2, Stand D52 (C58).

At Sonra we understand the difficulties faced by businesses when they begin their Big Data journey. We help you get started in days or weeks and immediately reap the benefits of Big Data. Sonra have packaged optimised Hadoop Supermicro hardware with MapR, the prime Hadoop distribution, and added our (more...)

Thumbs up for Hadoop User Group Ireland Meetup!

We got excellent feedback for our first Hadoop User Group Ireland meetup. We wined, dined, and entertained more than 100 Hadoopers (and there was even beer left at the end of the night).

If you want to find out more about Sonra’s Hadoop Data Warehouse Quick Starter Solutions you can contact me or connect with me on LinkedIn.

For those of you who missed the event I have posted some pictures below. We have recorded (more...)

Hadoop User Group Ireland Meetup

Join MapR and Sonra for the Hadoop User Group Ireland Meetup on 23 February at 6 pm at the Wayra offices (O2/Three building). You’ll learn more about the MapR distribution for Apache Hadoop through use cases, case studies and an introduction to the benefits of using the MapR platform.

Come by for this content-packed first event ending with the opportunity to socialise over beer and pizza kindly provided by Sonra.

 

Agenda:

What is (more...)

Patching the Big Data Appliance

I have been patching engineered systems since the launch of the Exadata V2, and recently I had the opportunity to patch the BDA we have in house. As far as comparisons go, this is where the similarities between Exadata and Big Data Appliance (BDA) patching stop.
Our BDA is a so-called starter rack consisting of 6 nodes running a Hadoop cluster; for more information about this, read my First Impressions blog post. On (more...)

Big Data and the importance of Meta-Data

Data isn't really respected in businesses. You can see that because, unlike other corporate assets, there is rarely a decent corporate catalog that shows what exists and who has it. In the vast majority of companies there is more effort and automation put into tracking laptops than there is into cataloging and curating information. Historically we've sort of been able to get away with this

Securing Big Data – Part 7 – a summary

Over six parts I've gone through a bit of a journey on what Big Data Security is all about:

1. Securing Big Data is about layers
2. Use the power of Big Data to secure Big Data
3. How maths and machine learning helps
4. Why it's how you alert that matters
5. Why Information Security is part of Information Governance
6. Classifying Risk and the importance of Meta-Data

The fundamental point here is that

Securing Big Data Part 6 – Classifying risk

So now that your Information Governance groups consider Information Security to be important, you then have to think about how they should be classifying the risk. Now there are docs out there on some of these which talk about frameworks. British Columbia's government has one, for instance, that talks about High, Medium and Low risk, but for me that really misses the point and oversimplifies the

Securing Big Data Part 5 – your Big Data Security team

What does your security team look like today? Or the IT equivalent, "the folks that say no". The point is that in most companies information security isn't actually something that is considered important. How do I know this? Well, because basically most IT Security teams are the equivalent of nightclub bouncers: they aren't the people who own the club, they aren't as important as the