Getting started with Apache Flink and Kafka

Read this article on my new blog. Introduction: Apache Flink is an open source platform for distributed stream and batch data processing. Flink is a streaming dataflow engine with several APIs for creating data-stream-oriented applications. It is very common for Flink applications to use Apache Kafka for data input and output. This article will guide you through the steps to use Apache

Streaming Analytics in a Digitally Industrialized World

Get an introduction to streaming analytics, which gives you real-time insight from captured events and big data. There are applications across industries, from finance to wine making, though there are two primary challenges to be addressed. Did you know that a plane flying from Texas to London can generate 30 million data points per flight? As Jim

Setting up Spark Dynamic Allocation on MapR

Apache Spark can use various cluster managers to execute applications (Standalone, YARN, Apache Mesos). When you install Apache Spark on MapR you can submit applications in Standalone mode or using YARN. This article focuses on YARN and Dynamic Allocation, a feature that lets Spark add or remove executors dynamically based on the workload. You can

Save MapR Streams messages into MapR DB JSON

In this article you will learn how to create a MapR Streams consumer that saves all the messages into a MapR-DB JSON table. Install and run the sample MapR Streams application: the steps to install and run the applications are the same as the ones defined in the following article: MapR Streams application. Once you have the default producer and

Getting Started with MapR Streams

You can find a new tutorial that explains how to deploy an Apache Kafka application to MapR Streams. The tutorial is available here: Getting Started with MapR Streams. MapR Streams is a new distributed messaging system for streaming event data at scale, and it’s integrated into the MapR converged platform. MapR Streams uses the Apache Kafka API, so

Getting Started With Sample Programs for Apache Kafka 0.9

Ted Dunning and I have worked on a tutorial that explains how to write your first Kafka application. In this tutorial you will learn how to install and start Kafka, and how to create and run a producer and a consumer. You can find the tutorial on the MapR blog: Getting Started with Sample Programs for Apache Kafka 0.9

Using Apache Drill REST API to Build ASCII Dashboard With Node

Apache Drill has a hidden gem: an easy-to-use REST interface. This API can be used to query, profile and configure the Drill engine. In this blog post I will explain how to use the Drill REST API to create ASCII dashboards using Blessed Contrib. The ASCII dashboard looks like this: (screenshot). Prerequisites: Node.js, Apache Drill 1.2. For this post, you will use the SFO

Convert CSV file to Apache Parquet… with Drill

A very common use case when working with Hadoop is to store and query simple files (CSV, TSV, ...) and then, to get better performance and more compact storage, convert these files into a more efficient format such as Apache Parquet. Apache Parquet is a columnar storage format available to any project in the Hadoop ecosystem. Apache Parquet has the following

Apache Drill : How to Create a New Function?

Apache Drill allows users to explore any type of data using ANSI SQL. This is great, but Drill goes even further and allows you to create custom functions to extend the query engine. These custom functions have all the performance of any of the Drill primitive operations, but allowing that performance makes writing these functions a little trickier

Introduction to MongoDB Security

Last week at the Paris MUG, I had a quick chat about security and MongoDB, and I have decided to create this post to explain how to configure the out-of-the-box security available in MongoDB. You can find all the information about MongoDB Security in the following documentation chapter: In this post, I won't go into the detail about

Moving My Beers From Couchbase to MongoDB

A few days ago I posted a joke on Twitter: "Moving my Java from Couchbase to MongoDB" (Tugdual Grall, @tgrall, January 26, 2015). So I decided to move it from a simple picture to a real project. Let’s look at the two phases of this so-called project: moving the data from Couchbase to MongoDB; updating the application code to use

Everybody Says “Hackathon”!

TL;DR: MongoDB & Sage organized an internal hackathon. We used the new X3 Platform, based on MongoDB, Node.js and HTML, to add cool features to the ERP. This shows that “any” enterprise can (should) do it to: look differently at software development, build strong team spirit, have fun! Introduction: Like many of you, I have participated in multiple hackathons where developers, designers and

Nantes MUG : Event #2

Last night the Nantes MUG (MongoDB Users Group) had its second event. More than 45 people signed up and joined us at the Epitech school (thanks for this!). We were lucky to have 2 talks from local community members: How “MyScript Cloud” uses MongoDB, by Mathieu Ruellan, and Aggregation Framework, by Sebastien Prunier. How “MyScript Cloud” uses MongoDB: First of all, if you do not know MyScript I

How to create a pub/sub application with MongoDB?

Introduction: In this article we will see how to create a pub/sub application (messaging, chat, notification) based entirely on MongoDB (without any message broker like RabbitMQ, JMS, ...). So, what needs to be done to achieve this: an application "publishes" a message (in our case, we simply save a document into MongoDB); another application, or thread, subscribes to these events and will receive

Big Data… Is Hadoop the good way to start?

In the past 2 years, I have met many developers and architects who are working on “big data” projects. This sounds amazing, but quite often the truth is not that amazing. TL;DR: You believe that you have a big data project? Do not start with the installation of a Hadoop cluster (the "how"). Start by talking to business people to understand their problem (the "why"). Understand the data you must

Introduction to MongoDB Geospatial feature

This post is a quick and simple introduction to the geospatial features of MongoDB 2.6 using a simple dataset and queries. Storing Geospatial Information: As you know, you can store any type of data, but if you want to query it you need to use some coordinates and create an index on them. MongoDB supports three types of indexes for geospatial queries: 2d Index: uses simple coordinates (longitude,

db.person.find( { "role" : "DBA" } )

Wow! It has been a while since I posted something on my blog. I have been very busy, moving to MongoDB, learning, learning, learning… Finally I can breathe a little and answer some questions. Last week I helped my colleague Norberto deliver a MongoDB Essentials Training in Paris. This was a very nice experience, and I am eager to deliver it on my own. I was happy to see that

Pagination with Couchbase

If you have to deal with a large number of documents when running queries against a Couchbase cluster, it is important to use pagination to get rows by page. You can find some information in the "Pagination" chapter of the documentation, but I want to go into more detail and (more...)

How to implement Document Versioning with Couchbase

Introduction: Developers often ask me how to "version" documents with Couchbase 2.0. The short answer is: the clients and server do not expose such a feature, but it is quite easy to implement. In this article I will use a basic approach, and you will be able to extend (more...)

Deploy your Node/Couchbase application to the cloud with Clever Cloud

Introduction: Clever Cloud is the first PaaS to provide Couchbase as a service, allowing developers to run applications in a fully managed environment. This article shows how to deploy an existing application to Clever Cloud. I am using a very simple Node application that I have documented in a previous article: (more...)