OT: Things Fashion Stores and Designers just don’t Get
I have just returned from a trip to Milan, my lovely wife traveling with me. And after a full weekend there, I feel like ranting again. For my trusted readers, if you expect something technical from this, please stop reading now.
Curious Partition Function Behaviour
Just another short blog today describing a curious issue I found with a query plan this week and a “workaround”.
In our core system, we have a table with two partitions. One partition contains all the work that “has been done” (which has the column WorkItem set to -1) and the other the “work in progress” (with WorkItem set to different values, all > -1).
The reason we have created just two partitions for this table (which is a heap) is that the items that are “work in progress” are often scanned, yet the work that has been done (WorkItem = -1) is the vast majority of the table. This “mini partitioning” is a nice design pattern I often apply to skewed distributions like this one. It provides a significant performance boost on table scans. But this week I saw an oddity I have not run into before.
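To make the “mini partitioning” idea concrete, here is a minimal sketch in Python rather than T-SQL. It only simulates the routing of rows into two partitions split at WorkItem = -1. The names `partition_for` and `build_partitions` and the sample data are assumptions made up for illustration, not the actual partition function or schema.

```python
from collections import defaultdict

DONE = -1  # WorkItem value that marks finished work, as described above


def partition_for(work_item: int) -> str:
    """Hypothetical partition function with a single boundary at WorkItem = -1."""
    return "done" if work_item == DONE else "in_progress"


def build_partitions(rows):
    """Group rows (dicts with a 'WorkItem' key) by their partition."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_for(row["WorkItem"])].append(row)
    return partitions


# Skewed sample data (made up): most rows are done, only a few are in progress.
rows = [{"id": i, "WorkItem": DONE} for i in range(1000)]
rows += [{"id": 1000 + i, "WorkItem": i} for i in range(5)]

partitions = build_partitions(rows)

# A scan for work in progress only touches the small partition.
in_progress = partitions["in_progress"]
print(len(in_progress), "of", len(rows), "rows scanned")
```

The point of the pattern is that the frequently scanned “work in progress” rows live apart from the large, rarely scanned remainder, so a scan of the small partition never has to read the bulk of the table.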
Default Configuration of SQL Server (and query hints)
Throughout the years, I have become convinced that the default settings used in SQL Server are often wrong for systems that need scale or are properly managed. There is a series of settings I find myself consistently changing when I audit or install a new server. They are “my defaults”.
I thought I would share those here. Bear in mind that these are settings that assume a certain level of rational behaviour in the organisation you find yourself in. If you are working in a bank, they may not apply to you.
Clustered Indexes vs. Heaps
At Stack Overflow the other day, I once again found myself trying to debunk a lot of the “revealed wisdom” in the SQL Server community. You can read the discussion in the post Indexing a PK GUID in SQL Server 2012. However, this post is not about GUIDs or sequential keys, which I have written about elsewhere. It is about clustered indexes and the love affair that SQL Server DBAs seem to have with them.
Synchronisation in .NET– Part 4: Partitioned Data Structures
In this final instalment of the synchronisation series, we will look at fully scalable solutions to the problem first stated in Part 1 – adding monitoring that is scalable and minimally intrusive.
Thus far, we have seen how there is an upper limit on how fast you can access cache lines shared between multiple cores. We have tried different synchronisation primitives to get the best possible scale.
Throughout this series, Henk van der Valk has generously lent me his 4 socket machine and been my trusted lab manager and reviewer. Without his help, this blog series would not have been possible.
And now, as is tradition, we are going to show you how to make this thing scale.
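The excerpt does not show the actual implementation, so the following is only a rough sketch of what a partitioned data structure can look like, written in Python rather than .NET. The class `PartitionedCounter` and its methods are hypothetical names. The idea illustrated is that each worker writes to its own slot and only readers combine the slots. A Python list says nothing about cache line layout, so this shows the structure, not the performance.

```python
import threading


class PartitionedCounter:
    """Hypothetical counter with one slot per worker (illustration only)."""

    def __init__(self, partitions: int):
        self._slots = [0] * partitions

    def add(self, partition: int, amount: int = 1) -> None:
        # Each worker only ever writes to its own slot, so writers never
        # compete with each other for the same element.
        self._slots[partition] += amount

    def total(self) -> int:
        # Readers pay the cost instead: they sum over all slots.
        return sum(self._slots)


def worker(counter: PartitionedCounter, partition: int, n: int) -> None:
    for _ in range(n):
        counter.add(partition)


counter = PartitionedCounter(partitions=4)
threads = [
    threading.Thread(target=worker, args=(counter, p, 10_000))
    for p in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.total())
```

The trade-off in this design is that writes become cheap and independent while reads have to visit every partition, which suits monitoring data that is written constantly and read rarely.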
Synchronisation in .NET– Part 3: Spin Locks and Interlocks/Atomics
In the previous instalments (Part 1 and Part 2) of this series, we have drawn some conclusions about both .NET itself and CPU architectures. Here is what we know so far:
- When there is contention on a single cache line, the lock() statement scales very poorly and you get negative scale the moment you leave a single CPU core.
- The scale takes a further dip once you leave a single CPU socket.
- Even when we remove the lock() and do thread unsafe operations, scalability is still poor
- Going from a class to a padded struct gives a scale boost, though not enough to get linear scale
- The maximum theoretical scale we can get with the current technique is around 90K operations/ms.
In this blog entry, I will explore other synchronisation primitives to make the implementation safe again, namely the spinlock and Interlocks. As a reminder, we are still running the test on a 4 socket machine with 8 cores on each socket with hyper threading enabled (for a total of 16 logical cores on each socket).
Synchronisation in .NET– Part 2: Unsafe Data Structures and Padding
In the previous blog post we saw how the lock() statement in .NET scales very poorly when there is contention on a data structure. It was clear that a performance logging framework that relies on an array with a lock on each member to store data will not scale.
Today, we will try to quantify just how much performance we should expect to get from the data structure if we somehow solve locking. We will also see how the underlying hardware primitives bubble up through the .NET framework and break the pretty object oriented abstraction you might be used to.
Because we have already proven that ConcurrentDictionary adds too much overhead, we will focus on arrays as the backing store for the data structure in all future implementations.
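For readers who have not seen the earlier post, the design described above (an array with a lock on each member) can be sketched roughly as follows. This is Python, not the .NET code from the series, and `LockedSlotArray` is a hypothetical name. It shows the shape of the data structure and why every writer to the same slot has to queue on the same lock. It makes no claim about the measured numbers.

```python
import threading


class LockedSlotArray:
    """Hypothetical array-backed store with one lock per element."""

    def __init__(self, size: int):
        self._values = [0] * size
        self._locks = [threading.Lock() for _ in range(size)]

    def add(self, index: int, amount: int) -> None:
        # All threads updating the same index serialise on this lock.
        with self._locks[index]:
            self._values[index] += amount

    def read(self, index: int) -> int:
        with self._locks[index]:
            return self._values[index]


def record(stats: LockedSlotArray, index: int, n: int) -> None:
    for _ in range(n):
        stats.add(index, 1)


stats = LockedSlotArray(size=4)

# Eight threads all hit slot 0, the contended case discussed above.
threads = [
    threading.Thread(target=record, args=(stats, 0, 5_000))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(stats.read(0))
```

The result is correct, but every update to the hot slot waits for the others, which is the behaviour the series sets out to remove.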
Synchronisation in .NET– Part 1: lock(), Dictionaries and Arrays
As part of our tuning efforts at Livedrive, I ran into a deceptively simple problem that beautifully illustrates some of the scale principles I have been teaching to the SQL Server community for years.
In this series of blog entries, I will use what appears to be a simple .NET class to explore how modern CPU architectures handle high speed synchronisation. In the first part of the series, I set the stage and explore the .NET lock() statement as a way of coordinating access to data.
My Final SQL Server Presentations (until further Notice)
As my regular readers have noticed, activity on my blog has slowed down lately. My new job at Livedrive is keeping me very busy and excited. It’s the opportunity I have been looking for: right in the middle of the Open Source vs. Microsoft cloud battle (and in a hectic development environment). I am greatly enjoying myself in this space.
However, I have precious little time to blog about SQL Server. And quite frankly, by now I feel there isn’t much more left for me to say on this subject. The time has come for me to move on to other subject areas and master new skills.
How will this affect my public speaking appearances?
MySQL – First Impressions
In my new job as the CTO of Livedrive, I have the pleasure of working with both Microsoft SQL Server and MySQL. We have a rather nice real estate with tens of petabytes online which keeps us entertained with scale challenges.
Having spent some time with MySQL lately, and being an old SQL Server user, I thought it might be interesting to share some of my early experiences. The good, the bad and the ugly.