Learned a little bit about importing data from MySQL into HDFS using Sqoop

I had a chance to read a book, Hadoop Real-World Solutions Cookbook (thank you ^______^). It made me wonder why I had never tested Sqoop. As you know, Sqoop is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as (more...)

The Hadoop hump – why enterprises struggle to move from Proof of Concept to Enterprise deployment

At the recent Hadoop Summit in Amsterdam I noticed something that has been bothering me for a while. Lots of companies have done some great Proofs of Concept with Hadoop, but they rarely turn those into full-blown operational solutions. To be clear, I'm not talking about the shiny, shiny (more...)

What’s your take on RDBMS and NoSQL?

My take is that application developers have belatedly but correctly concluded that an RDBMS is not the best tool for every application. For example, relational algebra, relational calculus, and SQL are not the best tools for graph problems. As another example, weblogs are non-transactional and don’t benefit from the ACID (more...)

Why NoSQL became MORE SQL and why Hadoop will become the Big Data Virtual Machine

A few years ago I wrote an article about "When Big Data is a Big Con" which talked about some of the hype issues around Big Data.  One of the key points I raised was about how many folks were just slapping on Big Data badges to the same old (more...)

The 3 ways Hadoop will change your Business Intelligence

“It’s the analytics, stupid!” Obviously, no offense is intended to the dear reader. It’s a wake-up call for all the people who are excited about Hadoop but lack BI vision. The BI people who lack infrastructure vision are also to blame. Blame for what? We’ll see later in this (more...)

InfoQ : Running the Largest Hadoop DFS Cluster

Since attending a Big Data event, Frankfurter Datenbanktage 2013, I have also started to look at non-relational techniques. The RDBMS is not the correct and fitting answer to every data-related IT challenge.

I have frequently wondered how Facebook (more...)

What’s all the fuss about Big Data?

Mar 6, 2013

Big Data is the collective term for data sets so large and potentially complex that they are difficult to handle using traditional tools and applications such as Relational Database Management Systems. Scientists in the fields of physics, genetics and meteorology were among the earlier groups to encounter Big Data.

 

However,

One database to rule them all?

Perhaps the single toughest question in all database technology is: Which different purposes can a single data store serve well? — or to phrase it more technically — Which different usage patterns can a single data store support efficiently? Ted Codd was on multiple sides of that issue, first suggesting (more...)

Notes and links, February 17, 2013

1. It boggles my mind that some database technology companies still don’t view compression as a major issue. Compression directly affects storage and bandwidth usage alike — for all kinds of storage (potentially including RAM) and for all kinds of bandwidth (network, I/O, and potentially on-server).

Trading off less-than-maximal compression (more...)
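
As a rough illustration of how compression affects the number of bytes stored or sent, here is a minimal Python sketch using the standard zlib module. The sample log line and the helper name compressed_sizes are made up for this example; it is not taken from the post, and real database compression schemes are far more specialized.

```python
import zlib


def compressed_sizes(data, levels=(1, 6, 9)):
    """Return {zlib level: compressed size in bytes} for the given payload.

    Lower levels use less CPU; higher levels try harder to shrink the data.
    """
    return {level: len(zlib.compress(data, level)) for level in levels}


# Hypothetical, highly repetitive machine-generated data (an assumption for the demo).
sample = b"2013-02-17 GET /index.html 200\n" * 1000
sizes = compressed_sizes(sample)

if __name__ == "__main__":
    print("raw bytes:", len(sample))
    for level, size in sizes.items():
        print("zlib level", level, "->", size, "bytes")
```

Every byte saved here is a byte that does not have to be stored, read from disk, or pushed over the network, which is why the choice of compression level is a trade-off against CPU time.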

Comments on Gartner’s 2012 Magic Quadrant for Data Warehouse Database Management Systems — evaluations

To my taste, the most glaring mis-rankings in the 2012/2013 Gartner Magic Quadrant for Data Warehouse Database Management are that it is too positive on Kognitio and too negative on Infobright. Secondarily, it is too negative on HP Vertica, and too positive on ParAccel and Actian/VectorWise. So let’s consider those (more...)

Comments on Gartner’s 2012 Magic Quadrant for Data Warehouse Database Management Systems — concepts

The 2012 Gartner Magic Quadrant for Data Warehouse Database Management Systems is out. I’ll split my comments into two posts — this one on concepts, and a companion on specific vendor evaluations.

Links:

  • Maintaining working links to Gartner Magic Quadrants is an adventure. But as of early February, 2013, this (more...)

We don’t use databases; we don’t use indexes

Whenever salespeople phone Mogens Norgaard, he puts them off by saying that he just doesn’t use the products that they are calling about. When the office furniture company phones, he says “We don’t use office furniture.” When the newspaper company phones, he says “We don’t read newspapers.” When the girl scouts phone, he probably says [...]

Learn – FATAL mapred.JobTracker: ENOENT: No such file or directory

Nothing much for this post; I just needed to keep this information in my blog. After I installed Hadoop (rpm), I started the namenode and datanode, but had an issue with the jobtracker. When I started it, it showed the error: FATAL mapred.JobTracker: ENOENT: No such file or directory.
-bash-4.1$ hadoop jobtracker &
[1] 14455
-bash-4.1$ 13/01/04 20:07:30 INFO mapred.JobTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting JobTracker
STARTUP_MSG:   host = centos/192.168.111.80
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.1.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1411108; compiled by 'hortonfo' on Mon Nov 19 (more...)

MapReduce Design Patterns Building Effective Algorithms and Analytics for Hadoop and Other Systems By Donald Miner, Adam Shook

What is MapReduce? It is a computing paradigm for processing data that resides on hundreds of computers. BTW, you can read more about it on Wikipedia or other pages, which will tell you more.
As you know, MapReduce is the heart of Hadoop. If you are interested in Hadoop, you cannot avoid learning about MapReduce; it's really important.
If you are a developer or someone who is interested in design patterns for the MapReduce framework, I'd like to mention a book from O'Reilly: MapReduce Design Patterns: Building Effective Algorithms and Analytics for Hadoop and Other Systems, by Donald Miner and Adam Shook. All (more...)
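
To make the MapReduce idea a little more concrete, here is a minimal single-process sketch of the classic word count in plain Python. It is not Hadoop code and is not taken from the book; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration, and a real cluster would run the map and reduce steps in parallel on many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would do."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["hadoop runs mapreduce", "mapreduce runs jobs"]))
```

The point of splitting the work this way is that each mapper only needs its own slice of the input and each reducer only needs the values for its own keys, so the same logic can be spread over many computers.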

Some trends that will continue in 2013

I’m usually annoyed by lists of year-end predictions. Still, a reporter asked me for some, and I found one kind I was comfortable making.

Trends that I think will continue in 2013 include:

Growing attention to machine-generated data. Human-generated data grows at the rate business activity does, plus 0-25%. Machine-generated (more...)

Notes on Microsoft SQL Server

I’ve been known to gripe that covering big companies such as Microsoft is hard. Still, Doug Leland of Microsoft’s SQL Server team checked in for phone calls in August and again today, and I think I got enough to be worth writing about, albeit at a survey level only,

Subjects (more...)

Learn – hadoop-fuse-dfs


I read Mounting HDFS and found something very interesting. As people know, HDFS is the Hadoop Distributed File System, so I wanted to test hadoop-fuse-dfs a bit. I had an Impala installation for testing only, and I thought it would be easy to use it to learn.
-bash-4.1$ df
Filesystem           1K-blocks      Used Available Use% Mounted on
rootfs                24607156   4370100  19987060  18% /
devtmpfs               1534680       200   1534480   1% /dev
tmpfs                  1544584 (more...)

Everything you ever wanted to know about Big Data, but had no PDF to carry around!

Back in March 2012 I experienced an air mileage overflow: almost straight from Madrid I picked up a flight to Israel to speak at a Big Data conference, only to be back in Lisbon and fly again to Johannesburg in South Africa to meet several customers in the retail and manufacturing area. Back in Lisbon I packed again for London [...]

Hadoop! What is it good for? Absolutely … everything!

In times of hysteria people tend to use their reptilian brain. This sub-brain, which has been with us since we were fish or tadpoles, is what kicks in when we face the unknown. In computer science and information technology, organizations tend to hold on to emotions and rely less and less on reasoning. Could it be [...]

Comic: How to write CV for NoSQL

Original Post can be viewed at Comic: How to write CV for NoSQL

This is a pretty old comic from Geek&Poke. Enjoy!

AskDba.org Weblog