Introduction
In the final installment of our series on Hive UDFs, we're going to tackle the least intuitive of the three types: the User Defined Aggregating Function. While they're challenging to implement, UDAFs are necessary if we want functions for which the distinction between map-side and reduce-side operations is opaque (more...)
Introduction
In our ongoing exploration of Hive UDFs, we've covered the basic row-wise UDF. Today we'll move to the UDTF, which generates multiple rows for every row processed. This UDF built its house from sticks: it's slightly more complicated than the basic UDF and gives us an opportunity to explore (more...)
Introduction
In our ongoing series of posts explaining the ins and outs of Hive User Defined Functions, we're starting with the simplest case. Of the three little UDFs, today's entry built a straw house: simple, easy to put together, but limited in applicability. We'll walk through important parts of the (more...)
Introduction
User-defined Functions (UDFs) have a long history of usefulness in SQL-derived languages. While query languages can be rich in their expressiveness, there's just no way they can anticipate all the things a developer wants to do. Thus, the custom UDF has become commonplace in our data manipulation toolbox.
Apache (more...)