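Introduction to the InfiniBand Network on Exadata

In an Exadata environment, data transfer between nodes runs over InfiniBand. This post gives a brief introduction to the hardware and software involved.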


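Sun Datacenter InfiniBand Switch 36 - a 36-port InfiniBand switch


Each X4 and X5 rack contains two switches (both leaf switches).

X3 and X2 half and full racks contain three switches (two leaf switches and one spine switch).

In the figure below (X5-2, full rack), the green section is the switches. There are three units in total: a 48-port Cisco Ethernet switch in the middle, with a 36-port InfiniBand switch above and below it.

On X5, the switch positions in the rack are fixed regardless of configuration: U20 and U22 hold the IB switches, and U21 holds the Cisco Ethernet switch.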

Every 2-socket compute and storage node contains one PCIe-based Mellanox HCA (Host Channel Adapter) card with two InfiniBand 4X QDR ports, each rated at 40 Gb/s in one direction.
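The 40 Gb/s figure is the signaling rate of a 4X QDR port. As a rough worked example, the sketch below derives it from the per-lane QDR rate (10 Gb/s, with 8b/10b encoding) and compares it with the host bus. The PCIe 2.0 x8 slot width is an assumption made for illustration and is not stated in this post.

```python
# Rough link-rate arithmetic, one direction only.
# InfiniBand QDR: 10 Gb/s signaling per lane, 8b/10b line encoding.
# PCIe 2.0: 5 GT/s per lane, also 8b/10b encoded.
ENCODING_EFFICIENCY_8B10B = 8 / 10


def link_rates(lanes, lane_signaling_gbps, efficiency=ENCODING_EFFICIENCY_8B10B):
    """Return (signaling rate, usable data rate) in Gb/s for one direction."""
    signaling = lanes * lane_signaling_gbps
    return signaling, signaling * efficiency


# One 4X QDR port: 4 lanes at 10 Gb/s each.
ib_signaling, ib_data = link_rates(lanes=4, lane_signaling_gbps=10.0)

# ASSUMPTION: the HCA sits in a PCIe 2.0 x8 slot (not stated in the post).
pcie_signaling, pcie_data = link_rates(lanes=8, lane_signaling_gbps=5.0)

print(f"4X QDR port : {ib_signaling:.0f} Gb/s signaling, {ib_data:.0f} Gb/s data")
print(f"PCIe 2.0 x8 : {pcie_signaling:.0f} GT/s raw, {pcie_data:.0f} Gb/s data")
print(f"Two active ports would need {2 * ib_data:.0f} Gb/s of bus bandwidth")
```

Under these assumptions a single port already matches what a PCIe 2.0 x8 slot can carry, which is consistent with running the second port as a standby rather than in parallel.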


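X2 and X3 use a PCIe 2.0-based HCA card. Because of the bus speed limit, the two ports run in active/passive mode, and failover is provided by the native Linux NIC bonding feature (set up through configuration files). In ifconfig output, bondib0 appears as the bonded logical interface, and ib0 and ib1 as the two physical interfaces.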
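One way to see which port is currently carrying traffic in an active/passive bond is the status file that the Linux bonding driver exposes under /proc/net/bonding/. The sketch below parses that text. The sample content is illustrative only (it was not captured from an Exadata node), and the path assumes the bond is named bondib0 as described above.

```python
from pathlib import Path

# Illustrative sample in the format of /proc/net/bonding/<bond>.
# This is NOT output from a real Exadata node.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: ib0
MII Status: up
MII Polling Interval (ms): 100

Slave Interface: ib0
MII Status: up

Slave Interface: ib1
MII Status: up
"""


def parse_bonding_status(text):
    """Extract the bonding mode, the active slave and all slaves."""
    status = {"mode": None, "active_slave": None, "slaves": []}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            status["mode"] = value
        elif key == "Currently Active Slave":
            status["active_slave"] = value
        elif key == "Slave Interface":
            status["slaves"].append(value)
    return status


# Assumed path; fall back to the sample when the bond does not exist here.
path = Path("/proc/net/bonding/bondib0")
text = path.read_text() if path.exists() else SAMPLE

info = parse_bonding_status(text)
standby = [s for s in info["slaves"] if s != info["active_slave"]]
print("mode   :", info["mode"])
print("active :", info["active_slave"])
print("standby:", ", ".join(standby))
```

After a failover the "Currently Active Slave" line changes to the other port, so comparing two runs of this sketch is a simple way to confirm that a switch-over happened.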

Exadata X5 PDU – CLI already in use

Exadata X5-2 and X4-8B racks are delivered with the “Enhanced” PDU metering units connected via the Cisco switch. Although the documentation says they should have static addresses, they don’t. You need to configure them manually using a serial console connection; this is described in my earlier post here.

However, if you forget to exit the serial console connection to the PDU and then try to log in using SSH later, you’ll get the following message:

login as:  (more...)

Exadata Software is Released

Good new features have been announced, such as Flash Cache and Flash Log statistics in AWR, an automatic ASM data redundancy check even when a storage server is shut down by pressing the power button or through ILOM, preventing Flash Cache population during cell-to-cell rebalance, disabling SSH on storage servers, and running CellCLI commands from compute nodes via the new ExaCLI utility. Find details in the MOS

DBMCLI for Exadata Database Servers

Monitoring database servers for hardware failures in Exadata used to be an issue. We have CELLCLI for storage servers, where we can perform many administration tasks and set the SMTP configuration to receive e-mail about hardware failures. For the DB servers we were able to configure the SMTP client on the database servers' ILOMs for email alerts, but it was not as straightforward and easy a configuration as we do in

Favorite Way: Migrating to Exadata

There are lots of considerations to be taken into account when migrating databases to the Exadata. It’s like any other migration. DBAs and other stakeholders of the system have to evaluate what to migrate and what not to, physical and logical settings, versions and many other things.

I’m not delving into migration preparation or strategies and methods here; instead, I want to mention which method I like best for migrating databases to Exadata. By (more...)

Restoring a Database to a New Diskgroup

I had the pleasure of rebuilding an Exadata rack for a customer a while back, and it provided a pretty good refresher in backup and recovery for me.  As DBAs, we back up databases all the time, but the restores are performed much less frequently.  In the case of this rack, there were several databases across multiple ASM diskgroups.  One of the goals of the rebuild was to consolidate all of the databases into a (more...)

Shrink/Grow Exadata diskgroups

One of the important tasks that I foresee after an initial Exadata deployment, mostly before the database goes into production, is to balance/resize the Exadata diskgroups (DATA & RECO). Generally, the space is distributed as 80 (DATA), 20 (RECO) or 40 (DATA), 60 (RECO), depending on the database backup option you choose while deploying. In one of our Exadata setups, we didn't need such a large RECO, so we shrunk RECO and increased the DATA diskgroup (more...)
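To make the two default splits concrete, here is a small arithmetic sketch. The 100 TB total is a hypothetical figure chosen only to keep the numbers readable; it is not taken from the post, and real sizes depend on the rack configuration and ASM redundancy.

```python
def split_space(total_tb, data_pct):
    """Split total usable space between DATA and RECO by percentage."""
    if not 0 < data_pct < 100:
        raise ValueError("data_pct must be between 0 and 100 (exclusive)")
    data_tb = total_tb * data_pct / 100
    reco_tb = total_tb - data_tb
    return data_tb, reco_tb


TOTAL_TB = 100.0  # hypothetical usable capacity, for illustration only

for data_pct in (80, 40):
    data_tb, reco_tb = split_space(TOTAL_TB, data_pct)
    print(f"{data_pct}/{100 - data_pct} split: "
          f"DATA = {data_tb:.1f} TB, RECO = {reco_tb:.1f} TB")
```

Shrinking RECO and growing DATA amounts to moving along this split, for example from 40/60 toward 80/20, while the total stays the same.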

Oracle 12 and latches

Oracle DBAs who are so old that they remember the days before Oracle 11.2 probably remember the tuning efforts for latches. I can still recall the latch number for cache buffers chains off the top of my head: number 98. In the older days this was another number, 157.

But it seems latches have become less of a problem in the modern days of Oracle 11.2 and higher. Still, when I generate heavy (more...)

Latest Release of Oracle Exadata Exachk and HealthCheck

Announcing the release of Exachk Version Exachk is a tool that is used to quickly determine the health of the entire Exadata infrastructure. Some of the key components that are checked include the database, cell servers and the network. The tool also reviews the configuration of the system and compares it with best practices. […]

The post Latest Release of Oracle Exadata Exachk and HealthCheck appeared first on VitalSoftTech.

How long does a logical IO take?

This is a question that I played with for a long time. There have been statements on logical IO performance (“Logical IO is x times faster than Physical IO”), but nobody could answer the question of what the actual logical IO time is. Of course you can see part of it in the system and session statistics (v$sysstat/v$sesstat), statistic name “session logical reads”. However, if you divide the number of logical reads by the total time (more...)

JSON support in Exadata and later

Some time ago Oracle announced that RDBMS has built-in support for JSON processing. A little later it was also mentioned that you have support for JSON in the Exadata storage servers for offloading. This is probably a lot more exciting to users of JSON than it is to me as I’m not a developer. However, whenever an announcement such as the one I’m referring to is made I would like to (more...)

Exadata’s onecommand fails to validate NTP servers on storage servers

This will be a simple and short post on an issue I had recently. I got the following error while running the first step of onecommand – Validate Configuration File:

2015-07-01 12:31:03,712 [INFO  ][    main][     ValidationUtils:761] SUCCESS: NTP servers on machine exa01db02.local.net verified successfully
2015-07-01 12:31:03,713 [INFO  ][    main][     ValidationUtils:761] SUCCESS: NTP servers on machine exa01db01.local.net verified successfully
2015-07-01 12:31:03,714 [INFO  ][    main][     ValidationUtils:778] Following errors were found...
2015-07-01 12:31:03,714 [INFO  ][     (more...)
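When a validation log like this gets long, it can help to pull out the per-host results programmatically. The sketch below parses lines in the layout shown above (timestamp, level, thread, source, message) using the three complete lines from the excerpt. The regular expression and the helper names are my own assumptions, not part of onecommand.

```python
import re

# Layout assumed from the excerpt:
# <timestamp> [<level>][<thread>][<source>] <message>
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<level>\w+)\s*\]"
    r"\[\s*(?P<thread>[^\]]+?)\s*\]"
    r"\[\s*(?P<source>[^\]]+?)\s*\] "
    r"(?P<message>.*)$"
)
HOST_OK = re.compile(r"^SUCCESS: NTP servers on machine (\S+) verified successfully$")

SAMPLE_LINES = [
    "2015-07-01 12:31:03,712 [INFO  ][    main][     ValidationUtils:761] "
    "SUCCESS: NTP servers on machine exa01db02.local.net verified successfully",
    "2015-07-01 12:31:03,713 [INFO  ][    main][     ValidationUtils:761] "
    "SUCCESS: NTP servers on machine exa01db01.local.net verified successfully",
    "2015-07-01 12:31:03,714 [INFO  ][    main][     ValidationUtils:778] "
    "Following errors were found...",
]


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


entries = [e for e in (parse_line(line) for line in SAMPLE_LINES) if e]

# Hosts whose NTP check passed, in log order.
verified_hosts = []
for entry in entries:
    ok = HOST_OK.match(entry["message"])
    if ok:
        verified_hosts.append(ok.group(1))

print("NTP verified on:", ", ".join(verified_hosts))
```

Comparing the verified hosts against the full list of machines in the configuration file is then a quick way to see which servers failed the check.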

HAIP and Exadata

If you’ve run an exachk report, you may have seen the following message with regard to your databases:

Status | Type | Message | Status On | Details
FAIL | Database Check | Database parameter CLUSTER_INTERCONNECTS is NOT set to the recommended value | db01:dbm011, db02:dbm012 | View

This check is commonly seen when a database is created on Exadata without using the custom “Exadata” templates included with the Database Configuration Assistant (DBCA).  These customized templates include a multitude of recommended parameter settings found in (more...)

MGMTDB not automatically created on Exadata X5 and GI

While deploying an X5 Full Rack recently, I found that the Grid Infrastructure Management Repository had not been created by onecommand. The GIMR database was optional in and became mandatory in and should be automatically installed with Oracle Grid Infrastructure 12c release 1 ( For a reason unknown to me, that didn’t happen and I had to create it manually. I’ve checked all the log files (more...)

Investigating the full table direct path / buffered decision.

A lot of blogposts and other internet publications have been written on the full segment scan behaviour of a serial process starting from Oracle version 11gR2. This behaviour is the Oracle engine making a decision between scanning the blocks of a segment into the Oracle buffercache or scanning these blocks into the process’ private process global area (PGA). This decision is even more important on the Exadata platform, because the Oracle engine must have made (more...)

dbnodeupdate.sh post upgrade step fails on Exadata storage software

I’ve done several Exadata deployments in the past two months and had to upgrade the Exadata storage software on half of them. The reason was that units shipped before May had their Exadata storage software version of

The upgrade process of the database nodes ran fine but when I ran dbnodeupdate.sh -c for completing post upgrade steps I got an error that the system wasn’t on the expected (more...)

A Deep Dive into ASM redundancy in Exadata

I gave this presentation at the Serbia Oracle User Group event in Zlatibor and at the Harmony 2015 event in Tallinn last month. Both were very well organized events that I really enjoyed; I learned a lot and met many Oracle experts. I prepared this presentation to answer the following questions, which I think are really important for Exadata Database Machine administrators: - To what degree, disk and

How do I change DNS servers on Exadata storage servers

This is just a quick post to highlight a problem I had recently on another Exadata deployment.

For most customers, the management network on Exadata is routable and the DNS servers are accessible. However, in a recent deployment for a financial organization this wasn’t the case and the storage servers were NOT able to reach the DNS servers. The customer provided a different set of DNS servers within the management network which were still (more...)

How to configure Power Distribution Units on Exadata X5

I’ve done several Exadata deployments recently and I have to say that, of all the components, the PDUs were the hardest to configure. It is important to note that, unlike earlier generations of Exadata, the PDUs in X5 are Enhanced PDUs and not Standard.

Reading the public documentation (Configuring the Power Distribution Units) it says that on PDUs with three power input leads you need to connect the middle power lead to the power source. Well I’ve done (more...)

IO Resource Manager for Pluggable Databases in Exadata

Another interesting topic that goes over and above the CDB Resource Manager Plans I described earlier this week is the implementation of IORM Plans for Pluggable Databases. Pluggable Databases are an interesting object for studies, and I like to research things. When 12c came out there was initially no support for offloading; you need to be on cell software 12.1.x.x.x for full 12c support on Exadata. One aspect I was particularly (more...)