Topic: This post is about measuring Apache Spark workload metrics for performance investigations.
If you are already familiar with Apache Spark and Jupyter notebooks you may want to go directly to the example notebook and code. If you want additional context and introduction to the topic of using Spark on notebooks, please read on.
Topic: this post is about a simple implementation with examples of IPython custom magic functions for running SQL in Apache Spark using PyS
Introduction: performance troubleshooting of a slow query using parallel query execution in a Hadoop cluster
The idea for this post comes from a performance troubleshooting case that has come up recently on our database services. It started with a user reporting slow response time from a query (more...)
Machine learning and neural networks in particular, are currently hot topics in data processing. Many tools and platform are now easily available to work and experiment (more...)
IPython/Jupyter notebooks are one of the leading free platforms for data analysis, with many advantages, notably the interactive web-based interface and a large ecosystem of readily available packages for data analysis and visualization. Moreover IPython/Jupyter notebooks are a very handy format for sharing code and data as you will see in the examples.
See also (more...)
Previous work and motivations
Tools for dynamic tracing are very useful for troubleshooting and internals investigations of Oracle workloads. Dynamic tracing probes on the OS/kernel, can be used to measure the details for I/O latency for example. Moreover probes on the Oracle (more...)
Apache Impala is an open source massively parallel processing (MPP) SQL Query Engine for Apache Hadoop. This post explores the use of IPython for querying Impala and generates from the notes of a few tests I ran recently on our systems. For completeness please that that several additional options exist to query Impala, some (more...)
SystemTap and dynamic tracing tools in general give administrators great control on their systems with the relatively little additional effort to learn the new tools. In this post you will see of how SystemTap that can be used to modify data on the fly at runtime. The outcome is a form of "live patching". Examples are provided on how to apply these ideas to Oracle SQL parsing functionality. This type of "guru mode" use (more...)
The reason for a tool like PerfSheet.js is to make the analysis of AWR data easier by providing a graphical interactive interface and by automating several repetitive steps of data extraction and chart preparation. Pivot charts provide a flexible and easy to use way to navigate (more...)
The recent progress and maturity of some of the Linux dynamic tracing tools has raised interest in applying these techniques to Oracle troubleshooting and performance investigations. See Brendan Gregg's web pages for summary and future developments on dynamic traces for Linux. Some recent work on applying these tools and techniques to Oracle can be found (more...)
Understanding the workload is an important part of troubleshooting activities. We seek answers to questions like: what is the system doing, where is the time spent, which code paths are most used, what are the wait events, etc. Sometimes the relevant diagnostic data is easy to find, other times we need to dig (more...)
Many thanks to UKOUG for delivering once more a technical conference of very high quality and to all the participants who make the (more...)
Context: The case of the DB Time > CPU Time + Wait Time
Oracle instrumentation provides wait event and CPU time accounting, a powerful and readily accessible data source for performance troubleshooting. An Oracle session at a given point in time is either on CPU, for example when processing data from cache, (more...)
The problem: Operations with high latency on a filesystem and/or a storage volume can sometimes be attributed to just a few disks 'misbehaving', possibly because they are suffering mechanical failures and/or are going to break completely in the near future.
I/Os of high latency on just a few disks can then appear (more...)
Motivations: The drill down of I/O latency is an important technique for troubleshooting and benchmarking storage. Average latency values can hide details of what is happening on the storage. Think for example of storage systems with flash and spindles, each serving I/O at different latency. Moreover averaging the measured values over time can hide details in case of varying (more...)
Why: Latency histograms (and by extension wait event histograms) provide very useful information when troubleshooting performance for systems exhibiting response time with multi-mode distribution. In such cases average wait values are often not sufficient to understand the behavior of the system under study and histograms provide a finer level of details. A (more...)