Convert CSV file to Apache Parquet… with Drill

A very common use case when working with Hadoop is to store and query simple files (CSV, TSV, ...), then convert these files into a more efficient format, for example Apache Parquet, to get better performance and more efficient storage. Apache Parquet is a columnar storage format available to any project in the Hadoop ecosystem. Apache Parquet has the following

Apache Drill : How to Create a New Function?

Apache Drill allows users to explore any type of data using ANSI SQL. This is great, but Drill goes even further and lets you create custom functions to extend the query engine. These custom functions perform as well as any of Drill's primitive operations, but enabling that performance makes writing these functions a little trickier

Introduction to MongoDB Security

Last week at the Paris MUG, I had a quick chat about security and MongoDB, and I decided to write this post to explain how to configure the out-of-the-box security available in MongoDB. You can find all the information about MongoDB security in the following documentation chapter: In this post, I won't go into detail about

Moving My Beers From Couchbase to MongoDB

A few days ago I posted a joke on Twitter: "Moving my Java from Couchbase to MongoDB" (Tugdual Grall, @tgrall, January 26, 2015). So I decided to move it from a simple picture to a real project. Let's look at the two phases of this so-called project: moving the data from Couchbase to MongoDB, and updating the application code to use

Everybody Says “Hackathon”!

TL;DR: MongoDB & Sage organized an internal hackathon. We used the new X3 Platform, based on MongoDB, Node.js and HTML, to add cool features to the ERP. This shows that “any” enterprise can (should) do it to: look differently at software development, build strong team spirit, have fun! Introduction: Like many of you, I have participated in multiple hackathons where developers, designers and

Nantes MUG : Event #2

Last night the Nantes MUG (MongoDB Users Group) had its second event. More than 45 people signed up and joined us at the Epitech school (thanks for this!). We were lucky to have 2 talks from local community members: How “MyScript Cloud” uses MongoDB, by Mathieu Ruellan, and Aggregation Framework, by Sebastien Prunier. How “MyScript Cloud” uses MongoDB: First of all, if you do not know MyScript I

How to create a pub/sub application with MongoDB?

Introduction: In this article we will see how to create a pub/sub application (messaging, chat, notification) based entirely on MongoDB (without any message broker such as RabbitMQ, JMS, ...). So, what needs to be done to achieve this? An application "publishes" a message; in our case, we simply save a document into MongoDB. Another application, or thread, subscribes to these events and will receive

Big Data… Is Hadoop the good way to start?

In the past 2 years, I have met many developers and architects who are working on “big data” projects. This sounds amazing, but quite often the truth is not that amazing. TL;DR: You believe that you have a big data project? Do not start with the installation of a Hadoop cluster (the "how"). Start by talking to business people to understand their problem (the "why"). Understand the data you must

Introduction to MongoDB Geospatial feature

This post is a quick and simple introduction to the geospatial features of MongoDB 2.6, using a simple dataset and queries. Storing Geospatial Information: As you know, you can store any type of data, but if you want to query it you need to use coordinates and create an index on them. MongoDB supports three types of indexes for geospatial queries: 2d Index: uses simple coordinates (longitude,

db.person.find( { "role" : "DBA" } )

Wow! It has been a while since I posted something on my blog. I have been very busy: moving to MongoDB, learning, learning, learning… Finally I can breathe a little and answer some questions. Last week I helped my colleague Norberto deliver a MongoDB Essentials Training in Paris. This was a very nice experience, and I am eager to deliver it on my own. I was happy to see that

Pagination with Couchbase

If you have to deal with a large number of documents when querying a Couchbase cluster, it is important to use pagination to get rows page by page. You can find some information in the "Pagination" chapter of the documentation, but I want to go into more detail and

How to implement Document Versioning with Couchbase

Introduction: Developers often ask me how to "version" documents with Couchbase 2.0. The short answer is: the clients and server do not expose such a feature, but it is quite easy to implement. In this article I will use a basic approach, and you will be able to extend

Deploy your Node/Couchbase application to the cloud with Clever Cloud

Introduction: Clever Cloud is the first PaaS to provide Couchbase as a service, allowing developers to run applications in a fully managed environment. This article shows how to deploy an existing application to Clever Cloud. I am using a very simple Node application that I have documented in a previous article:

SQL to NoSQL : Copy your data from MySQL to Couchbase

TL;DR: Look at the project on GitHub. Introduction: During my recent interactions with the Couchbase community, I was asked how to easily import data from a current database into Couchbase. My answer was always the same: take an ETL tool such as Talend to do it; just write a

Create a Couchbase cluster in less than a minute with Ansible

TL;DR: Look at the Couchbase Ansible Playbook on my GitHub. Introduction: When I was looking for a more effective way to create my cluster, I asked some sysadmins which tools I should use to do it. The answer I got during OSDC was neither Puppet nor Chef, but Ansible.

Six months as Technical Evangelist at Couchbase

Already 6 months! It has already been 6 months since I joined Couchbase as Technical Evangelist. This is a good opportunity to take some time to look back. So, first of all, what is a Developer/Technical Evangelist? Hmm, it depends on each company/product, but let me tell you what it is for

Screencast : Fun with Couchbase, MapReduce and Twitter

I have created this simple screencast to show how you can use Couchbase to do some real-time analysis based on a Twitter feed. The key steps of this demonstration are: inject Tweets using a simple program available on my GitHub (Couchbase-Twitter-Injector); create views to index and query the Tweets by user name, tags

Easy application development with Couchbase, Angular and Node

A friend of mine wants to build a simple system to capture ideas and votes. Even though you can find many online services to do that, I think it is a good opportunity to show how easy it is to develop a new application using Couchbase and Node.js. So

How to get the latest document by date/time field?

I read this question on Twitter; let me answer it in this short article. First of all, you need to be sure your documents have an attribute that contains a date ;), something like: To get the "latest hired employee" you need to create a view, and emit

Introduction to Collated Views with Couchbase 2.0

Most applications have to deal with "master/detail" types of data: breweries and beers, departments and employees, invoices and items, ... This is necessary, for example, to create an application view like the following: With Couchbase, and many other document-oriented databases, you have different ways to deal with this,