As Gluent is all about gluing together the old world and new world in enterprises, it’s time to announce the Gluent New World webinar series!
The Gluent New World sessions cover the important technical details behind new advancements in enterprise technologies that are arriving into mainstream use.
These seminars help you to stay current with the major technology changes that are inevitably arriving into your company soon (if not already). You can make informed decisions about what to learn next – (more...)
A very popular tool for data scientists is RStudio. This tool allows you to interactively work with your R code, view the R console, view the graphs and charts you create, and manage the various objects and data frames you create, as well as having easy access to the R help documentation. Basically, it is a core everyday tool.
The typical approach is to have RStudio installed on your desktop or laptop. What this really means is that (more...)
Spark is an open source Apache project that provides a framework for multi-stage in-memory analytics. Spark is based on the Hadoop platform and can interface with Cassandra, OpenStack Swift, Amazon S3, Kudu and HDFS. Spark comes with a suite of analytic and machine learning algorithms, allowing you to perform a wide variety of analytics on your distributed Hadoop platform. This allows you to generate data insights, data enrichment and data aggregations for storage on (more...)
Read this article on my new blog
Ted Dunning and I have worked on a tutorial that explains how to write your first Kafka application. In this tutorial you will learn how to:
Install and start Kafka
Create and Run a producer and a consumer
You can find the tutorial on the MapR blog:
Getting Started with Sample Programs for Apache Kafka 0.9
I’m happy to announce that the last couple of years of hard work are paying off and the Gluent Offload Engine is in production now! After beta testing with our early customers, we are now completely out of stealth mode and are ready to talk more about what exactly we are doing :-)
Check out our new website and product & use case info here!
“You’ve been running! Take a selfie, see how exercise changes you!” I smile when that message pops into the notifications list on my Android smartphone when using the Basis Peak. It's all part of what endears me to using it even more to track my activity and sleep patterns.
The “smile-o-meter” approach of the Basis Peak Photo Finish feature is a great mix of the analog and digital, leveraging familiar smartphone functionality to (more...)
Cloudera welcomes InfoCaptor as a certified partner for data analytics and visualization. InfoCaptor delivers self-service BI and analytics to data analysts and business users in enterprise organizations, enabling more users to mine and search for data that uncovers valuable business insights and maximizes value from an enterprise data hub.
Rudrasoft, the software company that specializes in data analytics dashboard solutions, announced today that it has released an updated version of its popular InfoCaptor software, which (more...)
Spending a bit more time with Apache Phoenix in my previous post, I realised that you can use it to query existing HBase tables. That is, NOT tables created using Apache Phoenix, but tables created directly in HBase - the columnar NoSQL database in Hadoop.
I think this is cool as it gives you the ability to use SQL on an HBase table.
To test this, let's say you log in to HBase and you create an HBase table like (more...)
Phoenix is a bit different, and a bit closer to my heart too. As I read the documentation on Apache Phoenix, the words 'algebra' and 'relational algebra' came up a few times, and that means only one thing: SQL! The use of (more...)
In this post I will share my experience with an Apache Hadoop component called Hive which enables you to do SQL on an Apache Hadoop Big Data cluster.
Being a great fan of SQL and relational databases, this was my opportunity to set up a mechanism where I could transfer some (a lot of) data from a relational database into Hadoop and query it with SQL. Not a very difficult thing to do these days, actually (more...)
We will be presenting the Sonra Hadoop Quick Start Appliance at CeBIT next week in Hanover. Meet and greet us in Hall 2, Stand D52 (C58).
At Sonra we understand the difficulties faced by businesses when they begin their Big Data journey. We help you get started in days or weeks and immediately reap the benefits of Big Data. Sonra have packaged optimised Hadoop Supermicro hardware with MapR, the prime Hadoop distribution, and added our (more...)
Join MapR and Sonra for the Hadoop User Group Ireland Meetup on 23 February at 6 pm at the Wayra offices (O2/Three building). You’ll learn more about the MapR distribution for Apache Hadoop through use cases, case studies and an introduction to the benefits of using the MapR platform.
Come by for this content-packed first event ending with the opportunity to socialise over beer and pizza kindly provided by Sonra.
I have been patching engineered systems since the launch of the Exadata V2, and recently I had the opportunity to patch the BDA we have in house. As far as comparisons go, this is where the similarities stop between Exadata and Big Data Appliance (BDA) patching.
Our BDA is a so-called starter rack consisting of 6 nodes running a Hadoop cluster; for more information about this, read my First Impressions blog post. On (more...)
Data isn't really respected in businesses. You can see that because, unlike other corporate assets, there is rarely a decent corporate catalog that shows what exists and who has it. In the vast majority of companies, more effort and automation are put into tracking laptops than into cataloging and curating information.
Historically we've sort of been able to get away with this
Over six parts I've gone through a bit of a journey on what Big Data Security is all about.
Securing Big Data is about layers
Use the power of Big Data to secure Big Data
How maths and machine learning help
Why it's how you alert that matters
Why Information Security is part of Information Governance
Classifying Risk and the importance of Meta-Data
The fundamental point here is that
So now that your Information Governance groups consider Information Security to be important, you then have to think about how they should be classifying the risk. There are docs out there on some of these which talk about frameworks. British Columbia's government has one, for instance, that talks about High, Medium and Low risk, but for me that really misses the point and oversimplifies the
What does your security team look like today?
Or the IT equivalent, "the folks that say no". The point is that in most companies information security isn't actually something that is considered important. How do I know this? Well, because basically most IT Security teams are the equivalent of nightclub bouncers: they aren't the people who own the club, they aren't as important as the