Will server hardware perform faster with transactional memory?
For many Java applications doing list processing, having transactional memory built right into Intel's commodity CPUs will result in greater server hardware performance.
Hardware transactional memory isn't a new thing. It's a feature that has been around for quite a while on a number of different microchip architectures found in expensive lines of server hardware. But, like most hardware aimed at high-capacity computing, those systems could hardly be considered accessible to the masses.
However, the elitist nature of hardware transactional memory (HTM) computing is now a thing of the past, as 2016 saw Intel release a line of affordable processors -- namely, its i3, i5 and i7 Haswell products -- with the transactional memory technology baked right into them. "It has seen the light of day in some more exotic systems before, but now, regular x86 chips have this feature," said Gil Tene, CTO at Azul Systems Inc., based in Sunnyvale, Calif., at QCon 2016 in New York City.
No good technologist would ever dismiss a new server hardware feature, especially one that's embedded right into the central processing unit, but the question remains as to just how much of an effect it will have on existing applications, and which applications will benefit the most. To understand why some systems will benefit greatly from HTM and others might not, it's important to understand exactly how hardware transactional memory features work.
Dampening the effect of data contention
The conventional way to make an enterprise system run faster is to increase the number of CPUs installed on the server hosting it. Whether we're counting physical CPUs or cores per system, machines with four, eight, 16 or 32 processors are not uncommon, and as hardware prices continue to drop, they might even be considered affordable. But no matter how many cores a system might have, software locks and data contention can slow a powerful system down to a crawl.
In both Java programs and the server hardware that hosts them, locking is ubiquitous. Sometimes, a lock occurs explicitly when a software developer decorates a method or a block of code with the synchronized keyword. Sometimes, the locks are inconspicuous, as when a utility class pulled from Java's collection API puts a lock around an entire data structure, such as a map or a list. When shared data sits behind a software lock, program execution has to take place serially, and the concept of concurrency goes out the window.
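To make the two flavors of locking concrete, here is a minimal sketch; the class and method names are purely illustrative and not from the article. The first method takes an explicit lock via the synchronized keyword, while the second relies on the lock hidden inside Collections.synchronizedList, which guards the entire list on every call.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class LockingExamples {

    private final List<String> orders = new ArrayList<>();

    // Explicit lock: the synchronized keyword serializes every call to this method.
    public synchronized void addOrder(String order) {
        orders.add(order);
    }

    // Implicit lock: Collections.synchronizedList wraps the whole list in a single
    // monitor, so readers and writers block one another even when they touch
    // completely unrelated elements.
    private final List<String> auditLog =
            Collections.synchronizedList(new ArrayList<>());

    public void logEvent(String event) {
        auditLog.add(event); // acquires the wrapper's lock for the entire list
    }
}
```

In both cases, only one thread at a time can make progress against the guarded structure, no matter how many cores the server has.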
Parallelizing otherwise serial data access
The promise of hardware transactional memory is that the processor can make more intelligent decisions about locked data structures. Rather than forcing serial access, the hardware can keep track of the parts of the data structure that must be accessed atomically and allow other CPUs to access the data in parallel, so long as the atomicity of the data being used within the transaction is not breached. As a result, threads that might otherwise have been blocked by a lock can continue to run in parallel, allowing systems with multiple processors to come increasingly closer to achieving linear scalability. "When you map HTM to things like lock regions, code that normally grabs a lock and serializes execution can instead be run concurrently without actually serializing the lock," Tene said.
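The kind of code that stands to benefit looks something like the hypothetical sketch below: a coarse lock around a shared map, where concurrent threads usually update different keys. On a conventional CPU the synchronized block serializes every update; a runtime that uses hardware transactional memory to elide the lock could let non-conflicting updates proceed speculatively in parallel, falling back to the real lock only when two transactions actually touch the same entry. Whether that happens in practice depends on the JVM and hardware in use, not on anything in this code.

```java
import java.util.HashMap;
import java.util.Map;

public class ElisionCandidate {

    private final Map<String, Integer> counts = new HashMap<>();
    private final Object lock = new Object();

    // Every caller grabs the same coarse lock, but callers usually update
    // different keys. With HTM-based lock elision, updates to distinct keys can
    // run concurrently as speculative transactions; only a genuine conflict on
    // the same entry forces the hardware to abort and take the lock for real.
    public void increment(String key) {
        synchronized (lock) {
            counts.merge(key, 1, Integer::sum);
        }
    }
}
```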
There are limitations, of course, with the granularity of the data being accessed chief among them. If 10 separate threads are trying to manipulate the exact same piece of data, nine of those threads are going to have to wait. But in most enterprise systems where list-processing operations are the rate-determining step, the benefits of using hardware transactional memory should be palpable.
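The worst case for HTM is the opposite pattern, sketched below with an illustrative class name: every thread writes the exact same field, so the speculative transactions always conflict, the hardware aborts them, and execution degrades back to the serial behavior of the plain lock.

```java
public class ContendedCounter {

    private long total = 0;

    // All threads update the same field. Even with hardware transactional memory,
    // the transactions conflict on this one memory location, so they abort and
    // fall back to the lock -- execution is effectively serial again.
    public synchronized void add(long amount) {
        total += amount;
    }
}
```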
Would your server hardware benefit from hardware transactional memory? Let us know.
Next Steps
What to do with the garbage collection in JVM
Approaching performance testing
Embrace HTTP 2 for app performance improvement