An Oracle server – How Fast for £1,000

mwidlake | Jul 27, 2010 02:54 +0000

Question: how fast an Oracle server can you create for £1,000?

{I’d really appreciate feedback and suggestions on this particular post}

The power of domestic PCs continues to grow, with four-core chips becoming pretty much standard and starting RAM looking more like 4GB than 2GB, with 8GB quite reasonable. So, how quick an Oracle server can you make based on a domestic PC? After all, those of us who play with Oracle in our spare time tend to use such machines and, in fact, they are often not far off what our smaller servers at the office really are. When I worked at the Wellcome Trust Sanger Institute, we had to make our IT budget pounds go a long way. We were, after all, a charity with a limited budget but also a scientific organisation with a huge demand for data and processing. So we used a lot of cheap kit.

I’m seriously thinking of giving this a go. I need a new PC anyway and so I am willing to use it, at least initially, to see what can be done.

If I do this, I’m going to need to set some boundaries on the exercise. How about:

  • The Oracle licence is being ignored in the cost {and please, I don’t need to be told how the licence can be more than the hardware costs!}. The OS cost is included, though.
  • I am not aiming for enterprise-level resilience, so I am not going to consider hot-swappable components, dual redundant power supplies or things like that.
  • I am going to use new kit, so no scavenging or buying second-hand. It must all be easily available and repeatable.
  • I will use local storage in the server or connected to ports available on the server.
  • It will support a database of 1TB in size {yet to be designed}.
  • Oracle v11. Enterprise edition but nothing special like TimesTen or Exadata (unless Oracle are willing to sell me an Exadata box for a grand, then I’ll consider it).
  • I’m not considering backup and recovery performance {and this would be a serious oversight if this was a real system, but most places have central backup/recovery facilities}.

I would also have a few other things to decide.

The main one is “Do I use Linux or Windows?” Yes, you are all probably shouting “Linux!!!” but I have never been a Linux sys admin (I was an incredibly poor HP-UX system admin for 3 months though) so it will take me more time to deal with issues under Linux. In work situations I have always had access to people who know all this stuff to sort out issues, but in this case I will be doing this on my own. On the other hand, you can just chuck Oracle on a standard Windows box and it works, and as a rule hardware just works under Windows. If I decide to use USB3 ports, for example, is it going to be a major pain getting drivers under Linux? But then if I want the fastest Oracle box under a grand why would I slow it down with Windows and spend money on the licence? I just want the box to run Oracle and a workload.

The second “software” decision is, how do I measure performance? I think I could be getting to grips with Dom Giles’ excellent Swingbench {BTW, nice tag line on that page, Dom :-) }. But it runs on Java and guess what boys and girls? I’ve never been a Java developer. How limited are my skills! So that would take some of my precious spare time up too.

I’d love feedback on this, I’d love to know what hardware suggestions you would make, what you think about the overall idea, what else I need to consider to make the tests valid… I have a few ideas already for the hardware architecture and the intention would be to try lots of things but I’ll save that for a second post. After all, if I get no feedback I might just spend the money on a gaming machine and a week’s walking in the Lake District instead.

And if anyone wants to help with the cost, please send cheques to….


Team Work & The Science of Slacking

mwidlake | Jul 23, 2010 00:59 +0000

We all know that working in a team is more efficient than working on your own (and I did say a week or two back how I was enjoying the rare privilege of working in a team of performance guys). Many of us also know about team dynamics and creating a balanced team of ideas people, completer-finishers, implementers, strategists and so forth. Those of us who have been exposed to training courses or books on team management know all these good things about teams and how we are supposed to get the most out of them.

How many of us, though, have been introduced to the work of the French agronomist Max Ringelmann and the aspect of teams named after him, the Ringelmann Effect? In summary the Ringelmann Effect proposes that people in teams try less hard than they do when working alone. Especially if they think no one is watching them.

Back at the start of the 20th century Ringelmann tested out his ideas using a tug-of-war experiment. He would get people to pull on a rope as hard as they could and record their efforts using a strain gauge. Then he would get them to pull on the rope as part of a team, from 2 to 8 people. As soon as people were part of a team, they pulled less hard. With two people in the team, each pulled 93% as hard as on their own, with three people this dropped down to 85% and with 4 it was just 77%. By the time there were 8 people in the team, effort was down to 50%.
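To see what those percentages mean for the team as a whole, here is a small sketch in Python. The per-person figures are the ones quoted above; the entry for a team of one is simply the 100% baseline, and the function name is made up for illustration.

```python
# Per-person effort relative to pulling alone, as quoted in the post.
# The entry for a team of 1 is the baseline by definition.
EFFORT_BY_TEAM_SIZE = {1: 1.00, 2: 0.93, 3: 0.85, 4: 0.77, 8: 0.50}


def effective_pullers(team_size: int) -> float:
    """Team output expressed as a number of people pulling flat out."""
    return team_size * EFFORT_BY_TEAM_SIZE[team_size]


for size in sorted(EFFORT_BY_TEAM_SIZE):
    print(f"{size} people pull like {effective_pullers(size):.2f} individuals")
```

On those figures a team of eight delivers about the same pull as four people working flat out.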

This idea of shirking work more and more as the team increased in size became established in modern psychology and was given Mr Ringelmann’s name. Psychologists explain that when someone is part of a group effort then the outcome is not solely down to the individual and, as such, is not totally in their control. This acts as a demotivating factor and the person tries that little bit less hard. The larger the team, the greater the demotivation and the more significant the drop in effort. Ringelmann found that effort was down to 50% in a team of 8 so how bad can the impact of the team be? I think most of us have at least witnessed, and quite possibly been in, the position of feeling like just a cog in a massive corporate team machine. Thoroughly demotivating (though, of course, we all of us still tried as hard as we could, didn’t we?).

The effect is also known under the far more entertaining title of Social Loafing.

Monsieur Ringelmann was far kinder at the time and pointed out that these chaps pulling on the rope could well have been suffering from a lack of synergy. They had not been trained together to pull as a team so that could account for the drop in effort, they were not synchronising their effort.

However, in the 1970s Alan Ingham at Washington University revisited Ringelmann’s work and he was far sneakier. Sorry, he was a more rigorous scientist. He used stooges and blindfolds in his team of rope-pullers, and put the one poor person pulling for real at the front of the team. Thus he could record the effort of the individual. Ingham found that there was indeed a drop in efficiency due to the team not pulling as one. But sadly, this was not the main factor. It remained that the drop in effort was mostly down to the perceived size of the rest of the team. The bottom line was proven to be the human capacity to try less hard when part of a team and that the drop in effort was directly proportional to the size of the team.

We are of course not immune to this effect in the IT world and someone has even gone to the effort of checking that out, James Suleiman and Richard T Watson.

It seems the ways to reduce this problem are:-

  • Don’t give people boring jobs.
  • Don’t give the same job to several people and let them know they all have the same job.
  • Ask people how they are getting on and give them mini-goals along the way.
  • Actually reward them for success. Like saying “thank you” and NOT giving them yet another boring, hard job to do as they did the last one so well.

I think it is also a good argument for keeping teams small {I personally think 5 or 6 people is ideal} and splitting up large projects such that a single team can cope. Then give tasks to individuals or pairs of people.

If you like this sort of thing you might want to check out one of my first blog posts (though it is more an angry rant than a true discussion of the topic) which was on the Dunning-Kruger effect, where some people are unaware of their own limitations – though I did not know it was called the Dunning-Kruger effect until others told me, which only goes to show that maybe I am not aware of my own limits… Read the comments or click through to the links from there to get a better description of some people’s inability to gauge their own inabilities.


My laptop has a Bug

mwidlake | Jul 20, 2010 08:02 +0000

My laptop is suffering from bugs, and I’m not talking software.

It is warm and sunny here in the Southeast of England, which is not always the case during the British Summer, and I am suffering an invasion of little insects. Specifically Thrips or Thunderbugs. They are called Thunderbugs as they are supposed to appear in numbers when a thunderstorm is brewing. Like most Old Wives’ Tales it is utter rubbish. But kind of true too…

If you do not know, a thrip is usually a small insect about 0.15 mm wide and maybe 0.4mm long. So small, but visible. About the size of this:

,

Yep, a comma on an average LCD panel. And that is where the problem is. One has got into my laptop and under my screen and it is sure to die. It is currently scurrying around at the far left of the screen and I’m considering a mercy killing before it wanders further across the screen into prime acreage. I had this before on my old laptop. In that case it died in the middle of the screen and for ever more has looked suspiciously like a comma, or single ‘quote’, causing me confusion when it falls on top of emails, Word documents and…. code. It really was a pain when it came to code. Even now, if I use that old machine it sometimes catches me out. It can merge with a letter in new and exciting ways, to subtly change a word or command.

I’m obviously not alone, a quick web search threw up some other people complaining of the same issue.

And of course it is common knowledge that “bugs” in computing really did start out as insects getting fried in the electronics and valves of the very first machines in the mid-20th century. I wonder if that is really true or just another old myth? James Higgins seems to think it is real and who am I to doubt him. He has a photo of the evidence after all.


More Memory Meanderings – IOPS and Form Factors

mwidlake | Jul 19, 2010 07:44 +0000

I had a few comments when I posted on solid state memory last week and I also had a couple of interesting email discussions with people.

I seriously failed to make much of one of the key advantages of solid-state storage over disk storage, which is the far greater capacity for input/output operations per second (IOPS), a point picked up by Neil Chandler. Like many people, I have had discussions with the storage guys about why I think the storage is terribly slow and they think it is fast. They look at the total throughput from the storage to the server and tell me it is fine. It is not great, they say, but it is {let’s say for this example} passing 440MB a second over to the server. That is respectable and I should stop complaining.

The problem is, they are just looking at throughput, which seems to be the main metric they are concerned about after acreage. This is probably not really their fault, it is the way the vendors approach things too. However, my database is just concerned with creating, fetching, and altering records and it does it as input/output operations. Let us say a disk can manage 80 IOPS (which allows an average 12.5 ms to both seek to the record and also read the data. Even many modern 7,200 rpm discs struggle to average less than 12ms seek time). We have 130 disks in this example storage array and there is no overhead from any sort of RAID or any bottleneck in passing the data back to the server. {This is of course utterly unbelievable, but if I have been a little harsh in not stating the discs can manage 8ms seek time, ignoring the RAID/HBA/network cost covers that}. Each disc is a “small” one of 500GB. They bought cheap disks to give us as many MB/£ as they could {10,000 and 15,000 rpm disks will manage 120 and 160 IOPS but cost more per MB}.
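The link between service time and IOPS is just a reciprocal, which can be sketched in a few lines of Python. The function names are invented for illustration, and the model assumes a disk handles one request at a time with no queuing or caching.

```python
def iops_from_service_time(service_time_ms: float) -> float:
    """IOs a single disk completes per second if each takes service_time_ms."""
    return 1000.0 / service_time_ms


def service_time_from_iops(iops: float) -> float:
    """Average milliseconds available per IO at a given IOPS rate."""
    return 1000.0 / iops


# The 12.5 ms seek-plus-read figure from the text.
print(iops_from_service_time(12.5))

# The 10,000 and 15,000 rpm figures, turned back into milliseconds per IO.
for rate in (120, 160):
    print(f"{rate} IOPS allows {service_time_from_iops(rate):.2f} ms per IO")
```

On this model the "harsh" 12.5 ms disk gives 80 IOPS, while a disk averaging 8 ms per IO would manage 125.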

Four sessions on my theoretical database are doing full table scans, 1MB of data per IO {Oracle’s usual max on 10.2}, each session receiving 100MB of data a second, so 400MB in total. 5 discs {5*80 IOPS*1MB} could supply that level of IOPS. It is a perfect database world and there are no blocks in the cache already for these scans to interrupt the multi-block reads.

However, my system is primarily an OLTP system and the other IO is records being read via index lookups and single block reads or writes.

Each IOP reads the minimum for the database, which is a block. A block is 4k. Oracle can’t read a bit of a block.

Thus the 40MB of other data being transferred from (or to) the storage is single block reads of 4k. 10,000 of them. I will need 10,000/80 disks to support that level of IO. That is 125 discs, running flat out.

So, I am using all my 130 discs and 96% of them are serving 40MB of requests and 4% are serving 400MB of requests. As you can see, as an OLTP database I do not care about acreage or throughput. I want IOPS. I need all those spindles to give me the IOPS I need.
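The split of the array can be reproduced with a short Python sketch. The figures come from the example above; the function name is invented, and the code assumes 1MB = 1,000KB, which is what the round numbers in the text imply.

```python
DISK_IOPS = 80      # per-disk figure used in the example
TOTAL_DISKS = 130   # size of the example array


def disks_needed(mb_per_second: int, io_size_kb: int) -> int:
    """Disks required to serve a stream of fixed-size IOs.

    Assumes 1MB = 1,000KB, matching the round numbers in the text.
    """
    ios_per_second = mb_per_second * 1000 // io_size_kb
    # Ceiling division: part of a disk still means a whole disk.
    return -(-ios_per_second // DISK_IOPS)


scan_disks = disks_needed(400, 1000)  # four full table scans, 1MB per IO
oltp_disks = disks_needed(40, 4)      # single-block reads, 4k per IO

print(f"scans: {scan_disks} disks, OLTP: {oltp_disks} disks")
print(f"OLTP share of the array: {oltp_disks / TOTAL_DISKS:.0%}")
```

Ten times less data, twenty-five times more disks: the cost is driven by the number of IOs, not the megabytes.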

What does the 40MB of requests actually equate to? Let us say our indexes are small and efficient and have a height of 3 (b-level of 2), so root node, one level of branch nodes and then the leaf nodes. To get a row you need to read the root node, branch node, leaf node and then the table block. 4 IOs. So those 10,000 IOPS are allowing us to read or write 10,000/4 records a second or 2,500 records.
You can read 2,500 records a second.

Sounds a lot? Well, let us say you are pulling up customer records onto a screen and the main page pulls data from 3 main tables (customer, address, account_summary) and translates 6 fields via lookups. I’ll be kind and say the lookups are tiny and Oracle just reads the block or blocks of the table with one IO. So that is 9 IOs for the customer screen, so if our 40MB OLTP IO was all for looking up customers then you could show just under 280 customers a second, across all users of your database. If you want to pull up the first screen of the orders summary, each screen record derived from 2 underlying main tables and again half a dozen lookups, but now with 10 records per summary page – that is 80 IOs for the page. Looking at a customer and their order summary you are down to under thirty a second across your whole organisation and doing nothing else.
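Here is the same back-of-envelope sum as a Python sketch. It follows the arithmetic in the text, which charges each table read or lookup against the 2,500 records-a-second budget; all names are invented for illustration.

```python
PHYSICAL_IOPS = 10_000   # single-block IOs per second from the example
INDEX_HEIGHT = 3         # root, branch, leaf
IOS_PER_RECORD = INDEX_HEIGHT + 1  # plus the table block itself

records_per_second = PHYSICAL_IOPS / IOS_PER_RECORD

# Fetches per screen, counted the way the text counts them.
customer_screen = 3 + 6            # 3 main tables + 6 lookups
order_summary = 10 * (2 + 6)       # 10 rows, each 2 tables + 6 lookups


def screens_per_second(fetches_per_screen: int) -> float:
    """Screens the whole system can paint per second, doing nothing else."""
    return records_per_second / fetches_per_screen


print(f"records/s: {records_per_second:.0f}")
print(f"customer screens/s: {screens_per_second(customer_screen):.1f}")
print(f"customer + orders/s: "
      f"{screens_per_second(customer_screen + order_summary):.1f}")
```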

You get the idea. 10,000 IOPS, or 2,500 records a second, is tiny. Especially as those 130 500GB disks give you 65TB of space to host your database on. Yes, it is potentially a big database.

The only way any of this works is due to the buffer cache. If you have a very healthy buffer cache hit ratio of 99% then you can see that your 2,500 records of physical IO coming in and out of the storage sub-system is actually supporting 250,000 logical-and-physical IOPS. {And in reality, many sites now buffer at the application layer too}.
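A quick Python sketch shows how sensitive that multiplier is to the hit ratio. The function name is invented, and the hit ratio is passed as a whole-number percentage purely to keep the arithmetic exact.

```python
def logical_rate(physical_per_second: float, hit_ratio_percent: int) -> float:
    """Total (cached + physical) reads supported by a given physical rate.

    If hit_ratio_percent of reads are served from the cache, only the
    remaining misses reach the disks.
    """
    miss_percent = 100 - hit_ratio_percent
    if miss_percent <= 0:
        raise ValueError("a 100% hit ratio needs no physical IO at all")
    return physical_per_second * 100 / miss_percent


for ratio in (90, 95, 99):
    print(f"{ratio}% hit ratio: {logical_rate(2500, ratio):,.0f} reads/s")
```

Dropping from a 99% to a 95% hit ratio cuts what the same storage can support by a factor of five.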

Using Solid State Storage would potentially give you a huge boost in performance for your OLTP system, even if the new technology was used to simply replicate disk storage.

I think you can tell that storage vendors are very aware of this issue as seek time and IOPS are not metrics that tend to jump out of the literature for disk storage. In fact, often they are not mentioned at all. I have just been looking at some modern sales literature and white papers on storage from a couple of vendors and they do not even mention IOPS – but they happily quote acreage and maximum transfer rates. That is, until you get to information on Solid State Discs. Now, because the vendor can say good things about the situation, the information is there. On one HP white paper the figures given are:

                        Modern super-fast    Top-end
                        SAS disk drive       Solid State Disk
Sustained write         150MB/s              180MB/s
Sustained read          90MB/s               180MB/s
Random write (IOPS)     285                  5,000+
Random read (IOPS)      340                  20,000+

More and more these days, as a DBA you do not need or want to state your storage requirements in terms of acreage or maximum throughput; you will get those for free, so long as you state your IOPS requirements. Just say “I need 5,000 IOPS” and let the storage expert find the cheapest, smallest disks they can to provide it. You will have TBs of space.
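As a rough illustration of sizing by IOPS, the Python sketch below works out the spindle count for a 5,000 IOPS requirement. It uses the per-disk figures quoted earlier and the 500GB size of the cheap disks in the example; the names are invented, and RAID overhead is ignored as before.

```python
import math

# Per-disk IOPS figures quoted earlier for each spindle speed.
DISK_IOPS = {"7,200 rpm": 80, "10,000 rpm": 120, "15,000 rpm": 160}
CHEAP_DISK_GB = 500  # size of the 7,200 rpm disks in the example


def spindles_for(required_iops: int, iops_per_disk: int) -> int:
    """Whole disks needed to deliver required_iops, ignoring RAID overhead."""
    return math.ceil(required_iops / iops_per_disk)


required = 5000
for speed, iops in DISK_IOPS.items():
    print(f"{speed}: {spindles_for(required, iops)} disks")

cheap_disks = spindles_for(required, DISK_IOPS["7,200 rpm"])
capacity_tb = cheap_disks * CHEAP_DISK_GB / 1000
print(f"{cheap_disks} cheap disks bring {capacity_tb} TB along for free")
```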

With solid-state storage you would not need to over-specify storage acreage to get the IOPS, and this is why I said last week that you do not need solid state storage to match the capacity of current disks for this storage to take over. We would be back to the old situation where you buy so many cheap, small units to get the volume that IOPS are almost an accidental by-product. With 1GB discs you were always getting a bulk-buy discount :-)

I said that SSD would boost performance even if you used the technology to replicate the current disk storage. By this I mean that you get a chunk of solid-state disk with a SATA or SAS interface in a 3.5 inch format block and plug it in where a physical disk was plugged in, still sending chunks of 4k or 8k over the network to the Block Buffer Cache. But does Oracle want to stick with the current block paradigm for requesting information and holding data in the block buffer cache? After all, why pass over and hold in memory a block of data when all the user wanted was a specific record? It might be better to hold specific records. I suspect that Oracle will stick with the block-based structure for a while yet as it is so established and key to the kernel, but I would not be at all surprised if something is being developed with Exadata in mind where data sets/records are buffered and this could be used for data coming from solid state memory: a second cache which, when using Exadata or solid-state memory, holds single records. {I might come back to this in a later blog, this one is already getting bloated}.

This leads on to the physical side of solid-state discs. They currently conform to the 3.5” or 2.5” hard disc form factor but there is no need for them to do so. One friend commented that, with USB memory sticks, you could stick a female port on the back of a memory stick and a joint and you could just daisy-chain the USB sticks into each other, as a long snake. And then decorate your desk with them. Your storage could be looped around the ceiling as bunting. Being serious, though, with solid state storage you could have racks or rows of chips anywhere in the server box. In something like a laptop the storage could be an array 2mm high across the bottom of the chassis. For the server room you could have a 1u “server” and inside it a forest of chips mounted vertically, like row after row of teeth, with a simple fan at front and back to cool the teeth (if needed at all). And, as I said last time, with the solid state being so much smaller and no need to keep to the old hard disk format, you could squeeze a hell of a lot of storage into a standard server box.

If you pulled the storage locally into your server, you would be back to the world of localised storage, but then LANs and WANs are so much faster now that if you had 10TB of storage local to your server, you could probably share it with other machines in the network relatively easily and yet have it available to the local server with as many and as fat a set of internal interfaces as you could get your provider to manage.

I’m going to, at long last, wrap up this current instalment on my thoughts with a business one. I am convinced that soon solid-state storage is going to be so far superior a proposition to traditional disks that demand will explode. And so it won’t get cheaper. I’m wondering if manufacturers will hit a point where they can sell as much as they can easily make and so hold the price higher. After all, what was the argument for Compact Discs to cost twice as much to produce as old cassette tapes, even when they had been available for 5 years? What you can get away with charging for it.