Expanding array to multiple rows – Athena

A single row in Athena table is stored as —

select id, course, date from demo.course_tab where id='1234567892'

 	id	course	                                                                 date
 1234567892	[95c3c1bc5873, 2e345b2eb678, 027b02599f4a, 8695a580520b, 5d453355d415, cdcc7682070b]	2019-06-13

The datatype for course column is array(string). Now, how can you get the output in below format –

 	id	        course          date

1	1234567892	95c3c1bc5873	2019-06-13
2	1234567892	2e345b2eb678	2019-06-13
3	1234567892	027b02599f4a	2019-06-13
4	1234567892	8695a580520b	2019-06-13
5	1234567892	5d453355d415	2019-06-13
6	1234567892	cdcc7682070b	2019-06-13

(more...)

S3 bucket size using aws cli

To get the total objects count and total size for folder within S3 bucket recursively –


aws s3 ls --summarize --human-readable --recursive s3://content-dev/mywork/dumb | grep 'Total'

Total Objects: 1922
   Total Size: 297.4 GiB

named tuple to JSON – Python

In pgdb – PostgreSQL DB API, the cursor which is used to manage the context of a fetch operation returns list of named tuples. These named tuples contain field names same as the column names of the database query.

An example of a row from the list of named tuples –


Row(log_time=datetime.datetime(2019, 3, 20, 5, 41, 29, 888000, tzinfo=), user_name='admin', connection_from='72.20.208.64:21132', command_tag='INSERT', message='AUDIT: SESSION,1,1,WRITE,INSERT,TABLE,user.demodml,"insert into user.demodml (id) values (1),(2),(3),(4),(5),(6),(7),(8),(9),(11);",',  (more...)

PostgreSQL – LWLock:multixact_offset

On one of the PostgreSQL db , the active sessions were waiting on IO:SLRURead and LWLock:multixact_offset causing application latency.

Multioffset

As per PostgreSQL doc, SLRURead is waiting for a read of an SLRU page and multixact_offset is waiting for I/O on a multixact offset buffer.

Diagnosing the issue took some time so cut to the chase

1. Identified the SQLs, relation (and its size) for which these wait events were high. These table had ~1500 DMLs (more...)

Python List

This blog post is about appending data elements to list in Python.

Suppose we have a simple list “x”, we will look at different ways to append elements to this list.

x = [1, 2, 3]

The “append” method appends only a single element

>>> x
[1, 2, 3]
>>> x.append(4)
>>> x
[1, 2, 3, 4]
>>>

>> x.append(5, 6, 7)
Traceback (most recent call last):
File "", line 1, in
TypeError:  (more...)

Happy New Year 2019

Wish you all a very Happy, Healthy and Successful New Year 2019. Have a blessed year ahead!

Cheers!!

PostgreSQL – CPU Utilization and Index

One of the Production Aurora PostgreSQL instance running on db.r4.16xlarge instance (64 vCPU and 488 GB ) was reporting high CPU Utilization spiking upto 100%.

Screen Shot 2018-12-09 at 11.14.49 AM

With such issues, one of the first thing is to look for the SQLs with high buffers shared hit. I have built a small tool called pgsnap which is something similar to AWR respostory in Oracle maintaining the SQL stat history.  So, with pg_stat_statements and hist_pg_stat_statements(that’s what (more...)

Python – Flatten List of Lists

Itertools is one of the most powerful module in Python. Today I had requirement to flatten list of lists and itertools made it so easy.

My list —

>> val = [['a','b'],'c',['d','e','f']]

Required Result

['a', 'b', 'c', 'd', 'e', 'f']

How do you do it? Itertools to the resuce —

>>> list(chain.from_iterable(val))
['a', 'b', 'c', 'd', 'e', 'f']

So simple !!

Python — Enumerate

Suppose you have a dataset (data) and want to find every 5th item.  How would you do it?

data = [ 1, 3, 5, 7, 9, 11, 13, 15, 17, 19]

The first thing which could come to mind in using slice, but that won’t work as its based on index and index starts from 0.

>>> data[::5]
[1, 11]

The answer should be = [9, 19]

This is where enumerate comes in play. (more...)

AWS DMS – Target TableName Differs

AWS DMS is a tool that supports both homogenous and heterogeneous migration, helping to migrate to aws cloud.

During most of the migrations, the source and target table names remain the same, in which case the Mappings.json file is pretty simple. As an example (Oracle to PostgreSQL)

 {
    "rules":
    [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "object-locator":
            {
                "schema-name": "DEVO",
                "table-name": "TEST_DEMO"
            },
            "rule-action": "include"
        },
        {
            "rule-type": "transformation",
            "rule-id": "2",
            "rule-name": "2",
             (more...)

PostgreSQL – Unique constraint and null value

An important behavior in PostgreSQL to know about is the duplicate null values do not violate unique constraints.

Oracle

SQL> create table test (id number (2,0), 
                        country varchar(20) not null, 
                        state varchar(20)
                       );

Table created.

SQL> alter table test add constraint pk_test_id primary key (id);

Table altered.

SQL> alter table test add constraint uniq_test_cs unique (country, state);

Table altered.

PostgreSQL

admin@test # create table test (id numeric(2,0), 
                                country character varying(20) not null, 
                                state character varying(20)
                                 (more...)

FillFactor for UPDATE

What is FillFactor in PostgreSQL?

As per offical Doc —

fillfactor (integer)

The fillfactor for a table is a percentage between 10 and 100. 100 (complete packing) is the default. When a smaller fillfactor is specified, INSERT operations pack table pages only to the indicated percentage; the remaining space on each page is reserved for updating rows on that page. This gives UPDATE a chance to place the updated copy of a row on the (more...)

S3 Bucket – IllegalLocationConstraintException

A quick one !!

Creating bucket failed with “IllegalLocationConstraintException” —

[ec2-user@ip-172-31-33-128 ~]$ aws s3api create-bucket --bucket labtestrand --region us-west-2

An error occurred (IllegalLocationConstraintException) when calling the CreateBucket operation: The unspecified location constraint is incompatible for the region specific endpoint this request was sent to.

Solution —

[ec2-user@ip-172-31-33-128 ~]$
[ec2-user@ip-172-31-33-128 ~]$ aws s3api create-bucket --bucket labtestrand --region us-west-2 --create-bucket-configuration LocationConstraint=us-west-2
{
"Location": "http://labtestrand.s3.amazonaws.com/"
}
[ec2-user@ip-172-31-33-128 ~]$

Oracle to Postgres — Index Skip Scan

I have been working on Oracle to Postgres migration and it has been a fun ride.

One of the recent encounters during migration was performance issue in Postgres for a query, actually quite a simple query.

SELECT count(*) AS num_incidents FROM audit_cs acs INNER JOIN audit_changes ac ON acs.audit_change_id = ac.audit_change_set WHERE acs.object_id=$1 AND ac.path = 'Flag' AND ac.new_value = 'sign' 

Let’s look at the execution plan in Postgres

+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 (more...)

Oracle and More…

It has been over a year I wrote my last blog post. I would admit I was being lazy to write, but the good part is not anymore !!

I plan to take out some time and write at least 2 blogs every month and share my learning with the community and to learn from the community.

With the current pace of change in technology, it very important to keep up with it, though sometimes it (more...)