Thursday, October 20, 2011

Fast upgrades, with zones

I've been working with a sub-optimal zones setup for a few years. The zones are located on SAN and are movable between hosts, which is good, but they also live on a filesystem which is neither ZFS nor UFS. This has made upgrades and patching terribly hard and slow, especially with more than 20 zones per host. Previously, all local zones had to be upgraded at the same time as the global zone, and they all had to be down for the entire operation (sometimes over 8 hours).

After testing upgrade-on-attach for a while and combining it with turbo-charged SVR4 package installation, I now have a solution which brings the downtime to under one hour for an entire upgrade, including local zones.

Everything would have been even easier if we had all root filesystems on ZFS. Since migrating several terabytes of data at the same time as the upgrade was not an option, this was a good solution:
  • Detach all local zones
  • Add the Live Upgrade and turbo-charge patches
    (119254-70, 121428-13, 121430-40, 124630-28 or later; included in S10u8)
  • lucreate to a ZFS rpool
  • luupgrade to S10U10 and add additional patches
  • luactivate and reboot
  • Attach all zones in parallel with zoneadm -z <zone> attach -U
This solution depends on good I/O, perhaps even a separate disk/LUN for every zone root, plus sufficient CPU resources on the system.
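The last step, attaching all zones in parallel, can be sketched roughly like this. It is only an illustration under a few assumptions: the zone names are made up, the worker count is arbitrary, and a Python 3 interpreter is assumed to be available on the host (stock Solaris 10 ships an older Python). By default it only prints the commands instead of running them.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def attach_command(zone):
    """Build the upgrade-on-attach command for one zone."""
    return ["zoneadm", "-z", zone, "attach", "-U"]


def attach_zone(zone, dry_run=True):
    """Attach one zone and return (zone, exit status).

    In dry-run mode the command is only printed and reported as successful.
    """
    cmd = attach_command(zone)
    if dry_run:
        print(" ".join(cmd))
        return zone, 0
    return zone, subprocess.run(cmd).returncode


def attach_all(zones, workers=4, dry_run=True):
    """Attach the given zones in parallel and return the names that failed."""
    if not zones:
        return []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda z: attach_zone(z, dry_run), zones))
    return [zone for zone, status in results if status != 0]


if __name__ == "__main__":
    # Hypothetical zone names, for illustration only.
    failed = attach_all(["zone01", "zone02", "zone03"], workers=3)
    print("failed:", failed)
```

How many workers make sense depends on the I/O and CPU headroom mentioned above, so the number is something to tune rather than a recommendation.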

There you go, Solaris 10 8/11 is ready for the SPARC T4!

Patches for "Turbo-Charging SVR4 Package Install" are now available

Tuesday, October 18, 2011

Data corruption

Today I had an interesting experience of how bad things are on platforms that lack ZFS. For an upgrade I was dumb enough to store an ISO image on a non-ZFS filesystem for a while. Later, when performing upgrade tests, I began to get all sorts of strange errors from Live Upgrade.

The image had been corrupted on the disks or somewhere between the platters and memory. Until I verified it with md5, no layer of the storage system had detected the corruption.
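For anyone who wants to script that kind of check, here is a minimal sketch of comparing an image against a published MD5 sum. The function names are my own, and the file path and expected checksum are whatever you have at hand; this is not the exact procedure I used.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """True if the file's MD5 equals the published checksum (case-insensitive)."""
    return file_md5(path) == expected_hex.strip().lower()
```

A mismatch only says that the bytes read back differ from what was published; it does not say which layer damaged them.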

I would normally never store anything important on a filesystem which is unable to protect data, and this time I was lucky since I found the corruption after a day instead of after a few months.

It feels good to know that even my data at home is protected by ZFS.

Wednesday, October 12, 2011

SPARC presentations from OOW

The presentations from OOW are now online, and two are very interesting from a SPARC perspective. One of them is a somewhat deeper technical review of the new SPARC T4. The second states that the next-generation M-series systems are currently being tested, and that testing of the 28nm 16-core T5 should start this October; that processor has been marked as being delivered "early".

Since there should be few differences between a T4 and a T5 besides the 28nm shrink, double the cores and possibly a frequency bump, it seems possible to shorten the time to market for that design. (By few differences I mean that the core itself is ready.)

Next Generation SPARC Processor, An In-Depth Technical Review
SPARC Strategy

Thursday, October 6, 2011

A new dawn for SPARC

I think the latest news and releases surrounding SPARC change the economics quite a bit. You now get virtualization with LDOMs and Zones, Solaris, and management with OpCenter included in your support contract.

This comes at the same time as the release of the SPARC T4, which is suitable for a wide range of workloads, unlike previous generations of the T-series. The new T4 systems are also much cheaper than M-series systems with the same amount of compute power.

The included encryption also seems to be the fastest currently available in a common processor, so you can get the benefit of encrypting your filesystem/database/network with a very small performance penalty.

Compared to a few months ago, the list of good news is quite impressive:
  • SPARC T4 with 5x the single thread performance of SPARC T3
  • 2-3x faster encryption compared to T3 (which was already fast)
  • Live migration of LDOMs
  • OpCenter included in support contract
  • Live migration between T2, T3 and T4 in the near future
  • T4 supports 256GB memory per socket (1TB for the 5U T4-4)
  • An engineered system is available, the SPARC SuperCluster
  • Solaris 11 will be released shortly
Compared to other virtualization technologies, Solaris Zones have no overhead, and Oracle VM for SPARC (LDOMs) has no overhead for CPU or memory. Some presentations at OOW claimed that zones had 4x lower latency than KVM and 15x lower overhead than VMware.

The next generation of T-series processors is also being developed and should arrive next year. It will double the number of cores to 16 per CPU. A possible T5-4 would have 64 cores and 512 threads in one 5U enclosure.
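The numbers for such a system are simple multiplication. A tiny sketch, assuming a hypothetical T5-4 has four sockets like the T4-4 and keeps the T4's 8 threads per core (neither is confirmed for the T5):

```python
def system_totals(sockets, cores_per_socket, threads_per_core):
    """Return (total cores, total hardware threads) for one system."""
    cores = sockets * cores_per_socket
    return cores, cores * threads_per_core


# Assumed: 4 sockets, 16 cores per socket, 8 threads per core.
print(system_totals(4, 16, 8))  # (64, 512)
```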

I know there is a lot of talk about supporting the whole stack, but you will have one single vendor to blame for problems with the server, virtualization and management. Perhaps more importantly it will not cost more than the hardware itself and the support contract.

More news from OOW 2011

Another quick update on things I've found out so far; I will write more when I find some spare time:
  • There will be a Solaris 11 release event on the 9th of November
  • The final Solaris 11 release should be based on build snv_175
  • The next version of OpCenter (12C) will be able to manage existing zones (brown field)
  • The next update of Solaris 10 (in 2012) will be the last Solaris 10 update
  • Solaris developers claimed that part of the reason for not supporting older sun4u machines was the way cache was handled on them, which was not optimal.
  • While Solaris 10 supports the SPARC T4 you will get optimal performance using Solaris 11 since some changes could not be backported.

Wednesday, October 5, 2011

First days of OOW

A quick summary of the most interesting Solaris/SPARC-related news from the first days of Oracle Open World.

  • Solaris 11 is slightly ahead of schedule; it was supposed to be released by the end of the year.
  • OpCenter is now included in your support contract for no additional cost, including virtualization!
  • A new Oracle VM for SPARC (3.0) will be released that supports live migration between processors with different clock frequencies and even between T2 and T3.
  • Future releases of Oracle VM for SPARC will try to remove limitations, for example by enabling live migration while using dedicated PCI hardware in guest domains.
  • Some work on Solaris 12 has already started
  • Solaris 11 will have more focus on virtualization/cloud
  • SPARC SuperCluster will support Exadata instances (one per T4-4)

Now I'm off to a new session, a deep technical review of the SPARC T4.