Virtual Alpha system performance

A computer system has many aspects whose performance can matter: CPU speed, disk read and write speed, network throughput, and so on. A virtual Alpha system has the same performance parameters. However, these parameters are usually not uniformly slower or faster on the virtual system than on the real system. For instance, the CPU of the virtual system can be faster than that of the real one, while the network is slower.

Therefore, when reasoning about the performance of an application, one must specify which performance aspects matter for that application. For this reason some applications are called CPU bound and others IO (disk or network) bound.
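A rough way to tell the two kinds of workload apart is to compare the CPU time a task consumes with the wall-clock time it takes. The Python sketch below illustrates the idea on two stand-in workloads; the function names `cpu_fraction`, `busy_loop` and `mostly_waiting` are hypothetical helpers introduced here, and the sketch is not part of AlphaVM or of any benchmark suite.

```python
import time


def cpu_fraction(workload):
    """Return CPU time divided by wall-clock time for one call of workload()."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    workload()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used / wall_used if wall_used > 0 else 0.0


def busy_loop():
    """Stand-in for a CPU bound task: pure integer arithmetic."""
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


def mostly_waiting():
    """Stand-in for an IO bound task: the sleep models waiting on disk or network."""
    time.sleep(0.2)


if __name__ == "__main__":
    print("busy_loop      CPU fraction:", round(cpu_fraction(busy_loop), 2))
    print("mostly_waiting CPU fraction:", round(cpu_fraction(mostly_waiting), 2))
```

A fraction close to 1 suggests the task is CPU bound, while a fraction close to 0 suggests it spends most of its time waiting on IO. Real applications usually fall somewhere in between.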

In many cases the performance is not really important at all. If the application is usually idle and sometimes does short computations and data transfers, the performance will hardly be an issue.

Please check the benchmark page for the comparison of various aspects of AlphaVM and real Alpha performance.

CPU performance.

CPU bound tasks are tasks whose performance depends almost solely on the CPU performance. Typical applications that require a lot of CPU power:

  • Scientific computations (like numerically solving differential equations)
  • Image processing (ray tracing)
  • Blockchain hash computation
  • Compression/decompression (although the IO speed can also be important).
  • Compilation (some disk IO is usually also involved)
  • Database query processing (some disk IO is usually also involved)

The virtual CPU emulates the Alpha CPU instructions. It makes sense to distinguish workloads that use different instruction subsets, because the emulation complexity differs between them.

  • Integer instructions. Many workloads use just integer arithmetic. Integer instructions are usually less complicated to emulate than the floating point instructions. AlphaVM-Pro for most workloads performs faster than any real Alpha CPU.
  • IEEE floating-point instructions. The host CPU (Intel Xeon) implements IEEE floating-point instructions. Therefore the Alpha IEEE instructions are emulated using the host IEEE instructions, which makes the emulation relatively straightforward and fast.
  • VAX floating-point instructions. The host CPU does not have VAX floating-point instructions. Therefore the emulation is complicated. The performance is relatively slow.

Many workloads are actually a mixture.

How the Alpha CPU emulator is implemented matters a great deal for performance.

  • Interpretation. The emulator interprets Alpha instructions read from Alpha memory in a loop. This is a simple and straightforward implementation, which is relatively slow. In AlphaVM this implementation is called the Basic CPU.
  • Just-in-time compilation. The emulator compiles chunks of Alpha code to native code of the host CPU and then executes the compiled code. For repeatedly executed code this is much faster than interpretation. For most CPU bound workloads the AlphaVM JIT3 CPU is a factor of 5-10 faster than the Basic CPU.

IO bound workloads

The performance of data transfer tasks is usually bound by the disk or network performance. For some tasks the performance almost does not depend on the CPU. For some tasks the CPU still limits the performance of the protocol stack in the transfer chain. Examples:

  • File copy. It hardly depends on the CPU.
  • FTP. The network stack uses a lot of CPU power at high network speeds. For 1Gbit and even for 100Mbit networks the CPU performance is important.
  • Database query is a complicated mixture of CPU and IO workload.

Emulated disks are usually faster than the disks of the real system because:

  • Modern disks are faster (think of SSD).
  • The container file caching can be used.

The network can be faster if the real system had slower Ethernet than the host system. For instance, your AlphaServer had a 10Mbit Ethernet card, while the host has 1Gbit NICs.

However, the network performance is slowed down by the emulation overhead. This overhead comes from the fact that the emulator adds extra stages to the data transfer path between the application and the wire. You can think of it as an extra switch in the path (like a virtual switch).
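The effect of an extra stage in the path can be illustrated with a simple pipeline model: per-packet latencies add up across stages, while sustained throughput is capped by the slowest stage. The Python sketch below is only a model under that assumption; the stage names and every number in it are made-up placeholders, not AlphaVM measurements.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    latency_us: float       # time one packet spends in this stage
    throughput_mbit: float  # maximum rate this stage can sustain


def path_latency_us(stages):
    """Latencies of consecutive stages add up."""
    return sum(s.latency_us for s in stages)


def path_throughput_mbit(stages):
    """The slowest stage caps the throughput of the whole path."""
    return min(s.throughput_mbit for s in stages)


# Placeholder figures, chosen only to make the example readable.
real_path = [
    Stage("OS network stack", 20.0, 800.0),
    Stage("NIC and wire", 10.0, 1000.0),
]
emulated_path = [
    Stage("OS network stack", 20.0, 800.0),
    Stage("emulator stage (hypothetical)", 30.0, 600.0),
    Stage("NIC and wire", 10.0, 1000.0),
]

if __name__ == "__main__":
    for label, path in (("real", real_path), ("emulated", emulated_path)):
        print(label, path_latency_us(path), "us,", path_throughput_mbit(path), "Mbit/s")
```

In this model the extra stage always adds latency, and it lowers throughput only when it becomes the slowest stage in the path.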

The performance with respect to the host system.

Above we discussed the performance comparison between a real Alpha system and a virtual Alpha system (AlphaVM). One can also try to compare certain performance measurements between the guest virtual Alpha system and the hosting system. The virtual system is always significantly slower in this comparison. This is because the guest CPU is emulated, which makes it much slower than the host CPU. This CPU slowness affects all benchmarks. Moreover, all IO paths are simply longer for the guest system, which causes extra overhead.
