Migrating Python ML Models to other languages

In a previous blog post I mentioned experiencing some performance issues when using Python ML in production. We needed something quicker, and the languages we considered were C, C++, Java and Go.

But the data science team used R and Python, with just a few more people using Python than R on the team.

One option was to rewrite everything into the language used in production. As you can imagine no-one wanted (more...)

Cat or Dog — Image Classification with Convolutional Neural Network

The goal of this post is to show how a convnet (CNN, Convolutional Neural Network) works. I will be using the classic cat/dog classification example described in François Chollet’s book, Deep Learning with Python. The source code for this example is available on François Chollet’s GitHub. I’m using this source code to run my experiment.

A convnet works by abstracting image features from fine detail to higher-level elements. An analogy can be drawn with the way humans think. (more...)

Jupyter Notebook for retrieving JSON data from REST APIs

If data is available from REST APIs, Jupyter Notebooks are a fine vehicle for retrieving that data and storing it in a meaningful, processable format. This article introduces an example of such a dataset:

Oracle OpenWorld 2018 was a conference that took place in October 2018 in San Francisco. Over 30,000 attendees participated and visited some 2000 sessions. Raw data from the session catalog is available from an API – a REST service that (more...)

The Full Oracle OpenWorld and CodeOne 2018 Conference Session Catalog as JSON data set (for data science purposes)


Oracle OpenWorld and CodeOne 2018 are two co-located conferences that took place in October 2018. Some 2000 sessions presented by over 2500 presenters form the core of these conferences. Many details are known about each of the sessions and the speakers – from title, abstract, room (size), date and time, session slides, session type and key topics to first name, last name, Twitter handle, picture, company, job title and bio.

A data set is available (more...)

Build it Yourself — Chatbot API with Keras/TensorFlow Model

It is not as complex to build your own chatbot (or assistant, the new trendy term for a chatbot) as you may think. Various chatbot platforms use classification models to recognize user intent. While you obviously get a strong head start when building a chatbot on top of an existing platform, it never hurts to study the background concepts and try to build it yourself. Why not use a similar model yourself? Chatbot implementation (more...)

Understanding Nested Lists Dictionaries of JSON in Python and AWS CLI


After lots of hair pulling and bouts of frustration, I was able to grasp this nested list and dictionary thingie in the JSON output of AWS CLI commands such as describe-db-instances and others. If you run describe-db-instances for rds or describe-instances for ec2, you get a huge pile of JSON mumbo-jumbo with all those curly and square brackets studded with colons and commas. The output is heavily nested.


For example, if you do:

aws rds (more...)
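
As a minimal sketch of how such nested output can be walked in Python: the JSON below is a hand-made, heavily trimmed stand-in for describe-db-instances output (the instance names and hostnames are invented, and real output has many more keys). Curly brackets become dictionaries and square brackets become lists, so reaching a value is a matter of alternating keys and indexes.

```python
import json

# Hand-made, trimmed stand-in for `aws rds describe-db-instances` output.
# The instance names and endpoint values are invented for illustration.
raw = """
{
  "DBInstances": [
    {
      "DBInstanceIdentifier": "demo-db-1",
      "Engine": "postgres",
      "Endpoint": {"Address": "demo-db-1.example.internal", "Port": 5432}
    },
    {
      "DBInstanceIdentifier": "demo-db-2",
      "Engine": "mysql",
      "Endpoint": {"Address": "demo-db-2.example.internal", "Port": 3306}
    }
  ]
}
"""

data = json.loads(raw)           # top level: a dictionary
instances = data["DBInstances"]  # its value: a list of dictionaries

# dict -> list -> dict -> dict: index the list, then key into each dict
first_port = instances[0]["Endpoint"]["Port"]

# Collect one field from every element of the list
identifiers = [inst["DBInstanceIdentifier"] for inst in instances]

print(first_port)
print(identifiers)
```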

Game of Thrones Series 8: Real Time Sentiment Scoring with Apache Kafka, KSQL, Google’s Natural Language API and Python

Hi, Game of Thrones aficionados, welcome to GoT Series 8 and my tweet analysis! If you missed any of the prior season episodes, here are I, II and III. Finally, after almost two years, we have a new series and something interesting to write about! If you didn't watch Episode 1, do it before reading this post as it might contain spoilers!

Let's now start with a preview of the starting scene of Episode (more...)

Python transforming Categorical to Numeric

When preparing data for input to machine learning algorithms you may have to perform certain types of data preparation.

In most enterprise solutions all or most of these tasks are automated for you, but in many languages they aren’t. The enterprise solutions are about ‘automating the boring stuff’ so that you don’t have to worry about it and waste valuable time doing boring, repetitive things.

The following examples illustrate a number of ways to record (more...)
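
One way to turn a categorical column into numbers can be sketched with the standard library alone. This illustrates two generic techniques, integer label encoding and one-hot encoding, and is not necessarily the approach the post goes on to show; the `colours` data is made up.

```python
# Made-up categorical column
colours = ["red", "green", "blue", "green", "red"]

# Label encoding: map each distinct category to an integer.
# Sorting the categories makes the mapping deterministic.
categories = sorted(set(colours))
label_of = {cat: idx for idx, cat in enumerate(categories)}
label_encoded = [label_of[c] for c in colours]

# One-hot encoding: one 0/1 column per category, in `categories` order.
one_hot = [[1 if c == cat else 0 for cat in categories] for c in colours]

print(label_of)
print(label_encoded)
print(one_hot[0])
```

Label encoding implies an ordering between categories that may not exist in the data, which is why one-hot encoding is often preferred for unordered categories.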

named tuple to JSON – Python

In pgdb – PostgreSQL DB API, the cursor which is used to manage the context of a fetch operation returns a list of named tuples. The field names of these named tuples are the same as the column names of the database query.

An example of a row from the list of named tuples –


Row(log_time=datetime.datetime(2019, 3, 20, 5, 41, 29, 888000, tzinfo=), user_name='admin', connection_from='72.20.208.64:21132', command_tag='INSERT', message='AUDIT: SESSION,1,1,WRITE,INSERT,TABLE,user.demodml,"insert into user.demodml (id) values (1),(2),(3),(4),(5),(6),(7),(8),(9),(11);",',  (more...)
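
A minimal sketch of the conversion, assuming rows shaped like the one above. The `Row` type here is built by hand with `collections.namedtuple` rather than fetched through pgdb, and it carries only three shortened placeholder fields. `_asdict()` turns a named tuple into a dictionary, and `default=str` tells `json.dumps` to fall back to `str()` for values such as `datetime` that JSON cannot represent natively.

```python
import datetime
import json
from collections import namedtuple

# Hand-built stand-in for a row returned by the cursor (assumption:
# only three of the fields are reproduced here).
Row = namedtuple("Row", ["log_time", "user_name", "command_tag"])
rows = [
    Row(datetime.datetime(2019, 3, 20, 5, 41, 29), "admin", "INSERT"),
]

# _asdict() maps each field name to its value
records = [row._asdict() for row in rows]

# default=str is called for any value json cannot serialize itself
as_json = json.dumps(records, default=str)
print(as_json)
```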

Publishing Machine Learning API with Python Flask

Flask is fun and easy to set up, as it says on the Flask website. And that's true. This microframework for Python offers a powerful way of annotating a Python function with a REST endpoint. I’m using Flask to publish an ML model API so that it can be accessed by third-party business applications.

This example is based on XGBoost.

For better code maintenance, I would recommend using a separate Jupyter notebook in which the ML model API will be published. Import Flask (more...)

Selecting Optimal Parameters for XGBoost Model Training

There is always a bit of luck involved when selecting parameters for Machine Learning model training. Lately, I have been working with gradient boosted trees and XGBoost in particular. We are using XGBoost in the enterprise to automate repetitive human tasks. While training ML models with XGBoost, I created a pattern for choosing parameters, which helps me to build new models quicker. I will share it in this post; hopefully you will find it useful too.

I’m (more...)

Prepare Your Data for Machine Learning Training

To me, the process of preparing data for Machine Learning model training looks somewhat similar to the process of preparing food ingredients to cook dinner. You know in both cases it takes time, but then you are rewarded with a tasty dinner or a great ML model.

I will not be diving into the data science subject here or discussing how to structure and transform data. It all depends on the use case and there are so (more...)

Jupyter Notebook — Forget CSV, fetch data from DB with Python

If you read a book, article or blog about Machine Learning, chances are high that it will use training data from a CSV file. There is nothing wrong with CSV, but let’s think about whether it is really practical. Wouldn’t it be better to read data directly from the DB? Often you can’t feed business data directly into ML training; it needs pre-processing: changing categorical data, calculating new data features, etc. The data preparation/transformation step can be done quite easily with (more...)
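
The idea can be sketched with Python's built-in `sqlite3` module standing in for the real database. The `orders` table, its columns, and the derived `is_large` feature are all hypothetical; the point is only that the query itself can do part of the pre-processing before the rows reach the training code.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a business database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "paid", 120.0), (2, "open", 35.5), (3, "paid", 80.0)],
)

# Encode the categorical column and derive a new feature in SQL
query = """
    SELECT amount,
           CASE status WHEN 'paid' THEN 1 ELSE 0 END AS status_paid,
           CASE WHEN amount > 100 THEN 1 ELSE 0 END AS is_large
    FROM orders
    ORDER BY id
"""
training_rows = conn.execute(query).fetchall()
conn.close()

print(training_rows)
```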

Python List

This blog post is about appending data elements to a list in Python.

Suppose we have a simple list “x”. We will look at different ways to append elements to this list.

x = [1, 2, 3]

The “append” method appends only a single element

>>> x
[1, 2, 3]
>>> x.append(4)
>>> x
[1, 2, 3, 4]
>>>

>>> x.append(5, 6, 7)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError:  (more...)
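
As the truncated traceback above suggests, `append` accepts exactly one argument. A short sketch of the usual alternatives for adding several elements, using standard list operations (this may or may not match what the rest of the post covers):

```python
x = [1, 2, 3]

x.append(4)          # adds one element
x.extend([5, 6, 7])  # adds each element of the iterable
x += [8]             # in-place concatenation, same effect as extend

y = [1, 2, 3]
y.append([4, 5])     # the list itself becomes a single nested element

print(x)
print(y)
```

Note the difference in the last case: appending a list does not flatten it, so `y` ends up with four elements, the last of which is itself a list.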

Time SQL Execution with Python

I've said before in this blog how I find Python to be very useful for doing various things, including processing data to or from an Oracle database. Here is another example where a relatively simple and straightforward piece of Python code can deliver something that is very useful - in this case measuring the elapsed time of SQL queries executed on an Oracle database. We often need to execute a given SQL query and see (more...)
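
A minimal sketch of the timing idea, with the standard-library `sqlite3` module standing in for an Oracle connection so the example is self-contained. The `timed_query` helper and the table `t` are hypothetical names, not code from the post. All rows are fetched inside the timed region, since a query is not really finished until its results have been retrieved.

```python
import sqlite3
import time


def timed_query(conn, sql, params=()):
    """Run a query, fetch all rows, and return (rows, elapsed_seconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - start
    return rows, elapsed


# Stand-in database: 1000 integers in a single-column table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])

rows, elapsed = timed_query(conn, "SELECT COUNT(*) FROM t WHERE n % 2 = 0")
conn.close()

print(rows[0][0], f"{elapsed:.6f}s")
```

`time.perf_counter()` is used rather than `time.time()` because it is a monotonic, high-resolution clock intended for measuring short intervals.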

How to install Python 3 on Oracle Linux

You can install Python 3 on your Oracle Linux 7 environment with three simple steps:

sudo yum install -y yum-utils
sudo yum-config-manager --enable *EPEL
sudo yum install -y python36

The first step, in case you don’t have it yet on your system, is to install the yum-utils package. This package includes the yum-config-manager which allows you … Continue reading "How to install Python 3 on Oracle Linux"

Tweet Escalation to Your Support Team — Sentiment Analysis with Machine Learning

I have published an article on Towards Data Science. In it I explain an end-to-end technical solution that would help to streamline your company's support process, with a focus on airline support requests received from Twitter. It could save the support department a lot of time and money if they knew in advance which requests are more critical and must be handled with higher priority.

Read the full article here - Solution to automate tweet sentiment (more...)

Python & Oracle 1

Although Python is an interpreted language, it is a very popular programming language. You may ask yourself why it is so popular. The consensus answer points to several factors. For example, Python is a robust high-level programming language that lets you:

  • Get complex things done quickly
  • Automate system and data integration tasks
  • Solve complex analytical problems

You find Python developers throughout the enterprise. Development, release engineering, IT operations, and support (more...)

API for Amazon SageMaker ML Sentiment Analysis

Assume you manage a support department and want to automate some of the workload that comes from users requesting support through Twitter. You are probably already using a chatbot to send replies back to users. But this is not enough: some of the support requests must be treated with special care and handled by humans. How do you tell when a tweet should be escalated and when not? The book Machine Learning for Business has an answer. (more...)

How to: Adding Speech to Oracle Digital Assistant; Talk to me Goose

At Oracle Code One in October, and also at DOAG in Nuremberg, Germany in November, I presented on how to go beyond your regular chatbot. This presentation contained a part on exposing your Oracle Digital Assistant over Alexa and also a part on face recognition. I finally found the time to blog about it. In this blog post I will share details of the Alexa implementation in this solution.