Home Blog

Looking Forward 100 Months (T-SQL Tuesday #100): The Roundup


tsqltuesdayGreat news for those of us who plan to stick around the data realm for a few more years: Things are looking pretty good in 2026!

I asked T-SQL Tuesday #100 participants to take a speculative view on the world 100 months from now. 32 people weighed in—in some form or another—and produced a set of extremely interesting and diverse posts. Despite their far-ranging nature, the various posts converged on similar themes in many cases:

  • What is the DBA career path for the next eight years?
  • What will be the longer-term impact of the cloud and other infrastructure advancements?
  • Where will Machine Learning and Artificial Intelligence technologies take us?
  • Things will change, but how much will stay the same?

I thoroughly enjoyed reading each and every one of the posts, and I’d like to extend a sincere thank you to everyone who participated. And without further ado, I bring you the roundup. I’ve grouped the posts based on the general themes I noted above.

Your Future Career

Unless you’re planning to retire later this year or that lottery ticket finally hit, it’s important to keep yourself in the know about what to expect next from your career. Ours is an ever-changing field and T-SQL Tuesday #100 participants expect a fairly turbulent eight years. Don’t worry, though; it will be a great ride for those who manage to hold on.

khairul-arifin-387515-unsplashWarwick Rudd focused on a future he believes will result in a large-scale DBA skills shortage—but thanks to improved automation, only a shortage of advanced DBA skills. Warwick mentions that  “by the time T-SQL Tuesday 200 arrives, the baseline for the DBA is going to require the knowledge and skill sets of someone in the position now with quite a bit of knowledge. Nearly all of those base level tasks that allow junior positions to be filled will be being completed by automation and as such the requirement for a more in-depth knowledge of the product stack will be required."

Brent Ozar’s post could fit into multiple categories—it’s full of various predictions—but my favorite of his insights had to do with career path. Brent sees “data safety” as a big standalone future business (one that, in my mind, can’t come soon enough). He sees DBAs as becoming “reliability engineers,” which is an interesting shift in mindset. And on the non-career front, he sees Microsoft allowing other vendors to sell SQL Server serverless offerings. This last one I think is absolutely insane but also incredibly intriguing. We shall see.

Glenda Gable took a science fiction approach. Her concern about the current state of office affairs is that sometimes people’s words are misconstrued. Communication is tough. She sees a future in which an AI device will help you understand people’s actual intentions, so as to avoid those strange and sometimes career-limiting interactions.

Hugo Kornelis makes a number of interesting predictions across numerous areas. Among other things he predicts over-application of machine learning, followed by an extreme pullback—something I agree is bound to happen. But his most interesting prediction is the future of “performance tuning” as a career. Imagine that your life and daily plans were subject to an optimizer. Imagine if that optimizer made and error and, e.g., sent you out to buy ice cream, leaving it sitting in the hot car for three hours while you run other errands. You’d need a Life Plan tuner. And that’s the career Hugo has in mind for himself. I’m intrigued!

Rob Farley is the Old Faithful of T-SQL Tuesday, having never yet missed a single month. In his T-SQL Tuesday #100 post he goes a bit introspective, asking what’s really important in your career. He comes to the conclusion that machines are going to greatly outpace humans in solving the tech problems—but even the best metal mind can’t tackle business issues. So that’s where we should focus if we want any chance of future career success.

Eugene Meidinger notices that developer ramp-up time is ever increasing; the more complex our technology solutions become, the longer it takes to learn how to use them. Meanwhile, developer job duration seems to continually drop. What happens when the two meet and what, especially, happens when developers start leaving jobs before they’re even properly trained up?

Hamish Watson is an incredibly optimistic sort of person. He sees a bright future in which "all Data Professionals will have an appreciation and understanding of how to automate the delivery of value to the end user." He also predicts that Microsoft will greatly improve SSDT, and that companies will actually want to invest in data cleanup initiatives. All of which sounds great to me.

My own post was, I will admit, inspired by a debate had over Twitter. But this is a topic I’ve been thinking about a lot. I’m not the first to say it, I certainly won’t be the last, and I’m probably not wholly correct. But I’m calling it and in 2026 the DBA is dead.

Cloud and Infrastructure Changes

The public cloud is really nothing new, but its rapid adoption represents a sea change in the way we see infrastructure and overall application architecture. Change is not going to stop or even slow down in the coming years; quite the opposite. So where will we be in 2026?

pawel-nolbert-284225-unsplashShane O'Neill brings us a few journal entries from the world of T-SQL Tuesday #200. He predicts few remaining on-premise database instances and complete virtualization of just about everything—including the PASS Summit and a virtual round of karaoke. I hope the drinks are non-virtualized, at least?

Deborah Melkin talks about future trends in data volumes, data analysis, and data protection. She brings up the question of “where does the data live?” A decision path that was, only recently, pretty much a single node, has blossomed into a small tree. "Are we storing the databases locally or in the cloud? If you’re hosting locally, are hosting the databases directly on a server or are you hosting VMs or using containers? And which operating system do you want, Windows or Linux?" What sort of structure will this tree have in eight more years?

Steve Jones will, in 2026, be 60—and officially the Old Man of SQL Server (version 23). He predicts more cloud, broader processor and platform support, and a revamped and actually working version of replication. He also predicts that SQL Server will support multiple first-class query APIs for different languages and paradigms. Which seems to me a lot more probable than any new replication fixes!

Robert Davis sees himself as semi-retired by 2026. I wish he’d informed us of how he plans to accomplish that one; I don’t see any early exit in most of our futures. But I digress. He predicts a new feature, Scalability Groups, which will be an enhanced Availability Group concept that will effectively support writeable secondaries for full read and write scale out. Cool idea and totally foreseeable. Plus, merge replication will be officially deprecated. We can only hope.

Kendra Little’s future vision centers around a new core infrastructural concept, the BAGI. Big Awesome Giant Infrastructure. These are huge memory-based compute clusters. And because everything is in memory, you pay for that memory—by the gram of weight that your data consumes on the chips.

ML, AI, and Automation of Everything

Machine Learning is changing the world. But is it doing so for the better? With all of that power comes a lot of responsibility. And what will it bring to the table when leveraged within the database frameworks of today?

franck-veschi-517860-unsplashBert Wagner imagines next-level, ML-driven automatic plan tuning capabilities. This future only makes sense. Among other predictions, Bert sees the "ability to choose an appropriate execution plan based on historical time-of-day server usage."

Arthur Daniels sees a similar future, but perhaps one that’s not quite as positive. What do you do when the AI-based query store starts auto-forcing the wrong plans? You’re forced to dig in to the open-sourced query optimizer and figure out which special trace flag is needed to shut the thing down.

Doug Purnell’s future vision is a bit more positive. He foresees PowerBI-based AI-generated notifications for "transactions, indexing, backups, DR/HA, and security tuning." All of which, coupled with PowerShell, will allow for seamless and totally online production migrations. Sounds pretty cool!

Eric Blinn wonders who will actually be running that PowerShell code. Will it be you, or your virtual DBA, Victor? It’s unfortunate if you don’t want to work with Victor; you’ve been financially forced to do so by Microsoft. Victor solves problems faster and better than you ever could. Can this possibly end well?

David Fowler takes us even further along the same path—and back, to the heady days the ‘80s. Skynet has become self-aware. And SQL Server is hosted therein. That terminator looks awfully familiar; does anyone know where I can find John Connor?

Randolph West’s 2026 world is one of drones everywhere, SQL Server running in Raspberry Pi containers, and special glasses that allow you to blink to accept calls. The elevator in your office building will use a predictive model to figure out which floor to drop you on without your having to push a button. But what happens when the batteries die on your fancy phone-enabled glasses?

Ginger Grant is concerned about the accuracy of Machine Learning. She predicts more automation, more cloud, more managed services…but an overall contraction when ML models start to get too many things wrong. She also issues a call to action: If you want to get involved in this world, learn Python. (And possibly don't bother with R at all.)

Riley Major provides a general description of ML and discusses his projections of SQL Server including ML and AI as first-class citizens, right there in the SELECT statement.

The More Things Change…

IT is trend-driven and amazingly cyclical. Eight years is a long time, but is it really that long a time? Maybe things won’t be so different after all.

ajeet-mestry-429216-unsplashKoen Verbeek considers BI in 2026. The key here is evolution, not revolution. He foresees ML-driven ETL and the need for ever-greater data protection. But decent data models will still be needed, self-service will still be a pipe dream, and at the end of the day there will still be data warehouses. Koen notes that "it’s totally cool what your machine learning model can do and how it will predict everything our customers will need, but the CFO [will still need] that financial report." And business users will still want everything in Excel. Undoubtedly.

Matthew McGiffen notices that a lot of companies are always several versions behind whatever is current. So his prediction for 2026? “DBAs will finally have the go-ahead from their companies to upgrade to the versions of SQL that are current now and start using this stuff!"

Jeff Mlakar makes various predictions about how little things will change. Security will still be awful, semicolons still won’t be required in T-SQL, and all of the currently deprecated features will still be there and still marked for deprecation. Sounds about right!

James McGillivray believes that data professionals will still be in high demand in 2026. None the less, they won’t be able to fix much. Most organizations still won’t focus on data quality, and master data will still not be in great shape. We’ll have a lot more tools at our disposal, but the core aspects of the job really won’t change.

Kenneth Fisher’s 2026 involves SQL Server releases once every two weeks. This is necessary, apparently, to keep up with the speed of business. Gone are DevOps—replaced, per Kenneth, by BizOps! What about our beloved Profiler? In 2026 it’s still right there when you need it.

Nate Johnson sees relational databases as still alive, albeit mostly in the cloud. Data quality will still be a huge issue and just about everything will be columnstore-based. Oh, and sp_whoisactive will be built in to SQL Server. Fingers crossed, Nate!

Kevin Hill reveals a nice surprise about auto shrink. Finally.

Other Topics and Creativity

The future is a wide and varied place, and naturally not everything fit neatly into the four major buckets. In reality, it’s almost certain that nothing will.

ricardo-viana-105232-unsplashKennie Pontoppidan imagines massive change in the next eight years. He sees an evolution of databases well beyond what was imagined back in the System R days and foresees quantum databases, built-in data classification, multi-language support, and most importantly direct data distribution lines that involve virtually all devices. Kennie believes that "the concept of [the] database will be diluted; we will be talking [instead] about data swarms."

Amy Herold, on the other hand, predicts a future of strange and horrible decisions made by Microsoft. The company will, in Amy’s mind, replace Entity Framework with NHibernate, get rid of Agent in favor of ActiveBatch, and force us all to write our queries in some as-yet-unimagined JSON-based format. Shudder.

Todd Kleinhans has been reading up on DNA-based storage. It’s super-high density, but read speeds are abysmal. No worries, DBA friends! For in 2026 we’ll see the advent of the non-clustered DNA index…

Taiob Ali is concerned about the future implications of “digital dementia,” a condition that involves "deterioration of cognitive abilities due to overuse of technology." He suggests reducing screen time, making an effort to be social (IRL!), and actually using your brain on occasion.

Bob Pusateri was that kid in elementary school who refused to color inside the lines. And maybe, on rare occasion, ate a glue stick. He decided to forego the future altogether and instead wrote about last month’s topic. He certainly does have a unique hobby!

That’s that for T-SQL Tuesday #100. Thanks for reading and thanks again to the participants for the great posts and the insight. It was a great pleasure to host.

If you would like to host a future T-SQL Tuesday, please contact me via the form on this site. I’d love to get you onto the calendar. Let’s help T-SQL Tuesday actually make it another 100 months and see how many of these predictions come true.

T-SQL Tuesday #200 (v1.0): Last of the DBAs


tsqltuesday8:22 a.m. It was a dark and stormy night.

Well, not so much, though Mike’s excruciating hangover was making him wish otherwise. The early morning sun shone brilliantly in through the office window, punctuating the headache that even four ibuprofen pills simply refused to touch. Sure, that hangover “miracle cure” medicine had come out last year, and everyone seemed to love it, but for Mike newer wasn’t always better.

After taking a cautious sip of water, he carefully typed his password into the Windows 11 login screen, watched as his desktop sprang to life, and got ready to start his day.

One thing Mike’s hangover had going for it? The office was quiet. Like, really quiet. Mike was the only one who’d bothered to trek into the physical space this Tuesday morning. He thought fondly about the earlier, and somewhat more social, days of the firm. 2008, when he was hired as the first DBA. 2013, when they used to regularly pull all-night rotations while migrating all of the databases into Availability Groups. Even as recently as 2020, his team still regularly got together for cloud migration and support planning meetings. These days, though, that just didn’t happen. Fuel prices had shot through the roof after the crisis of 2022, and Full Reality headsets made virtual meetings feel almost just like they used to, you know, in the real world.

The only thing missing was the donuts. Mike really missed the donuts.

Logging in to SQL Operations Studio, Mike checked his few remaining Agent jobs and was relieved to see that no failures had occurred the evening before. The firm’s final remaining Kimball Method data warehouse wasn’t used very often, but still ran a few key quarterly reports. Mike, as Lead DBA, was responsible for keeping it up and running, and for fixing the occasional blip.

pexels-photo-48566Sitting back in his chair, Mike decided to close his eyes just for a bit. Why not? There wasn’t much left to do. He vaguely remembered being stressed some days, earlier in his career, but that all seemed like a dream now. He used to like that stress. That feeling of being needed, being important. But these days his once-thriving team of nine DBAs – five in the home office and four in other parts of the world – was not such a team at all. Mike was still Lead DBA by title, but he had only himself to lead.

The beginning of the end for Mike’s DBA group was some time around 2019, when the executive board made a wholesale decision to start moving everything into what was then referred to as the “public cloud.” Some of his direct reports submitted their resignations letters in the following weeks. A few held on for a couple of years, helping with the migration work. But in the end, most had gone elsewhere in the firm, and were working closer to the business than ever—and, Mike figured, probably making a lot more money.

Kelly, one of Mike’s sharpest Senior DBAs back in the late 2010s, was the biggest success story. She had really embraced all of the new technical challenges, showing up at the office with Hadoop and MongoDB books as early as 2016. As soon as the cloud move was announced, she volunteered to spearhead it, and management had taken her up on the offer. She was now in charge of the company’s small team of Data Engineers: six people who, Mike had to admit, were getting a lot more work done, a lot faster, than his nine ever could have dreamed of. They didn’t have to deal with storage devices, network switches, and all of the other hardware challenges Mike used to love to play with. They developed Data Flow and Visualization packages, helped the business implement a solid Data Studies practice, and a couple of the Engineers had created some very cool learning algorithms that produced results faster and more cleanly than any of the firm’s competitors. Well, so Mike had heard. He didn’t really understand a thing they did.

9:43 a.m. Still a bit too early for even a really early lunch. Hoping to find something interesting to do with the rest of the morning, Mike poked around some DMVs. Maybe he’d find an index to optimize or something? A few end-user queries had run overnight, and one had even hit a suboptimal plan. But Query Adaptation Services, the killer feature that had had everyone really excited in 2020, had done its job. Glancing at the new execution plan, Mike couldn’t help but envy the artificial intelligence that Microsoft had baked into the Adaptation engine. There was no way he ever would have thought to hint the query to get a plan like that.

10:32 a.m. No emails since yesterday. Nothing left to check. Mike vaguely wondered if he should browse around, maybe try to find a book on Python 4? Too tiring to think about starting over. Might as well go for a walk and take a really long lunch break…

It’s 2026. The DBA is dead, dead, dead.

Yes, that’s right. The DBA is gone. And good riddance to the DBA.

That doesn’t mean that you are dead. That doesn’t mean that your career is dead. Quite the opposite. Data is going nowhere.

You are a data professional. A data engineer. A data specialist. You are smart, creative, and agile. You provide business value. You are able to adapt to changing times and changing tides.

The term DBA is a relic, a leftover from another age. An age of expensive infrastructure, high latency networks, and server time. Lots of server time on lots of servers. Servers that had to be administrated.

DBA represents slow-moving, rigid IT policies that aren’t in line with the immediate needs of the business. DBA represents everything that used to be true of data, and nothing that still is.

It’s 2018. Are you still a DBA?

This month’s T-SQL Tuesday challenge, hosted by me, was to envision the world 100 months in the future.

I see a future very bright for data professionals. We’ve seen an upward trend in data volumes, and that’s not stopping. We’ve seen an upward trend in requests by the business for more insights, faster. That’s not going anywhere. We’ve seen a move towards newer, faster, better, more self-driving technologies. And most importantly, we’ve seen a move away from traditional infrastructure and infrastructure roles. Databases of the future will not need “administration.” They’ll need many, many other tasks. In many cases, there won’t even be a “database” in the way we think of them today. And these are all good things.

Being a data professional doesn’t even remotely mean that you have to be a “DBA.” It never did, yet somehow the two have become conflated. DBA friends, I want you all to wildly succeed. I want you to succeed with me—in our DBA-less world of the future. The DBA is dead. Long live the Data Professional.

See you in 2026.

Invitation: T-SQL Tuesday #100 – Looking Forward 100 Months

image Just over eight years ago I decided to try an experiment: Set up a roaming blog event for the SQL Server world. I was hopeful that it would catch on; I even prefixed the number one with two zeroes in anticipation of someday, maybe, hopefully filling in all three spaces. But honestly, 100 months? That’s a long, long time, especially in our industry.It’s been quite a ride – 100 posts over 101 months (we actually skipped one month the first year, due to not wanting to overlap with the PASS Summit). Some months, especially recent ones, have been immensely popular, generating 50 or more posts. A couple of months have been a bit more touch and go. But overall I’d call the event quite successful.tsqltuesdayAlong the way we got an official logo, Steve Jones created a fantastic archive web site, and the event has not only help created a tremendous number of blog posts, it’s also been cited as inspiration by many new bloggers to whom I’ve spoken. In short, I’m really proud of what I put in motion, and what we—this means you if you’re a blogger—have been able to do together. Thank you for 100 great months! Let’s look backward…This is, of course, the invitation for month #100, so let’s get down to business. As I prepped for writing this invitation, I went through the archive and attempted to categorize all 99 prior topics. What immediately became apparent is that our community is extremely concerned, above all else, with professional development. 24 of the 99 topics we’ve seen so far fall into that bucket. For what was initially conceived as a mainly technical series, that’s not quite what I expected. But given the general maturity of the community and data professionals in general, it’s also not too surprising. Things often tend to change slowly in our database world, but we’re always looking for the next big move for our careers.The last few years has felt like quite a shift with regard to this last point. We’ve seen rapid adoption of such new ideas as public cloud, machine learning, and so on. T-SQL Tuesday topics have also followed these trends; around 30% of the topics over the past two years have been focused on these newer paradigms. When I started T-SQL Tuesday in 2009, no one was talking much about any of these things. We were still really excited about SSDs, just getting into the stride with multicore, and wondering about what cool stuff we might get in SQL Server 2008 R2. (Did we ever answer that final question?) Let’s look forward…Anyone who has been in IT for more than 10 minutes knows how the industry works. IT loves its trends, but things tend to be very cyclical. Our current crop of “cloud” technologies are nothing new at the heart of things. The cloud, a wise person once said, is merely someone else’s server. And we’ve seen prior iterations of these same hardware sharing ideas going all the way back to the 1960s. Machine Learning, likewise, has been around in other guises for several years. Columnstore ideas are decades old even if they’re new-ish in the Microsoft space, and even in-memory has a rather deeper history than many realize. None of these “new” technologies about which we’re so excited are actually new, even if they’re newer (and hopefully better) implementations.It is possible, of course, that a more rapid cycle could break the trend. It is possible that newer technologies will be truly new. Or perhaps the future will simply bring more of the same, a continuation of the everlasting IT sine wave.Your mission for month #100 is to put on your speculative shades and forecast the bright future ahead. Tell us what the world will be like when T-SQL Tuesday #200 comes to pass. If you’d like to describe your vision and back it up with points, trends, data, or whatever, that’s fine. If you’d like to have a bit more fun and maybe go slightly more science fiction than science, come up with a potential topic for the month and create a technical post that describes an exciting new feature or technology to your audience of June 2026. (Did I do the math properly there?) The RulesT-SQL Tuesday would be nothing without some rules, and a while back I codified them. The important notes with regard to Month 100:
  • Your post must go live next Tuesday, March 13, between 00:00:00 and 23:59:59 in your own time zone. Or whatever time zone you happen to be in. Or you can borrow a friend’s time zone. Just make sure it’s Tuesday somewhere, okay?
  • Your post must include the T-SQL Tuesday logo as seen above. And said logo must link back to this invitation post.
  • Make absolutely sure to let me know in the comments section below after you post, so that I can include you in the roundup.
  • If you’re on Twitter, tweet about your post using #TSQL2sday. (Now that we have 280 characters, I’m not sure that 3 character savings was quite worth it…)
  • If you’re interested in hosting a future month, contact me.
 That’s it! Any questions? Let me know. Otherwise, go forth, think, write, and enjoy yourself. See you in another eight years.

The Career Influence of Usenet Strangers (T-SQL Tuesday #096)

T-SQL TuesdayA freshly minted Computer Science graduate at the tail end of the dot-com era, I had three basic requirements for my first job: (1) money; (2) money; (3) more money. (I was going to be a millionaire, don't you know!) So I did pretty much the same thing as all of my friends: got myself lots of job offers, played them against each other as much as possible, and then accepted the one with the highest salary.Not a great tactic. I turned down at least two jobs that would have, in retrospect, put me on a much more stable initial path. The one I took? A gig with an ill-conceived startup, as sole developer, writing internal apps. I was told by my manager, a networking guy, to create the front end in ASP and use either Access or SQL Server as a backing data store. Why? Because they had a licenses for that stuff through the company's MSDN subscription. Never mind the fact that I'd just spent four years of my life learning all about C++, algorithms, data structures, and the like, and had never even heard of ASP. (Luckily I knew the dangers of Access. Because who wouldn't?)I had nothing even remotely resembling a mentor. No one to talk to about my work. I didn't even get to meet my end users; I was given requirements concocted in meetings I was specifically told not to attend. So I rolled up my sleeves, made best friends with a search engine, and got my first taste of the technical community at large. I spent my days trying, failing, experimenting, and most of all reading and re-reading the words of wisdom written by my four virtual companions. And somehow I managed to cobble together a pretty decent web app. (That no one used. But that's another story for another day.)Eventually one day I hit a problem that I just couldn't find a solution for even with the best of my Google skills. And somehow in my quest for knowledge I stumbled upon the Microsoft Usenet server. You see, prior to all of these newfangled web forums people used to send messages to one another using standalone programs called News Readers. (One might argue -- and I do -- that this technology was much nicer for overall communication and created a better community feeling than anything we have today. But I digress.)Into the newsgroups I went, and I discovered an open and welcoming environment that existed solely to help all comers. Ask a question, get an answer in minutes from someone ten times smarter than you. Amazing. One guy in particular stood out in those days, Microsoft Scripting MVP Brent Ashley. Brent was The Man on the scripting and ASP newsgroups and I briefly wanted to be like Brent. So I stuck around for a little while and tried to answer some questions. Mostly not very well. And then I went off and did other things.Time passed, I was laid off by a couple of crappy startups, my career shifted, and I became a focused data engineer. And once again one day I found myself at a loss. Remembering the help I'd received earlier, I went back to the newsgroups -- this time hitting the SQL Server section. And the SQL Server world totally blew me away! Unlike the scripting newsgroups, which were fairly quiety, in the SQL Server arena there was a tremendous amount of activity. And lots and lots of very smart people to learn from. Even better, many of the questions at the time were complex query puzzles. The kinds of things that, as a CS kind of guy, were able to both touch upon my background and pique my interest. I asked a few questions and received great answers. And fortuitously I decided, once again, to try to answer a few on my own...In the SQL Server newsgroups I found the very closest thing to actual mentors that I've had in my career to date. The regulars in the groups showed me the value of learning through helping others, helped me experience the joy of engaging in deep technical debate, and guided me (very patiently) toward being a far better SQL Server practitioner than I otherwise ever could have been.The end result of my newsgroup visits? I was able to become an MVP. Which prompted me to start speaking in public. Which helped me contribute to my first book. Which gave me the cred I needed to become an independent consultant. Which allowed me to get involved in all sorts of interesting projects and write and speak more. And create a monitoring stored procedure. And meet lots of cool friends in the global SQL Server community... And on and on and on.So it is quite fair to say that I am deeply, deeply in debt to the various newsgroup regulars -- especially on the .programming group -- who I looked up to and learned so much from back in those days. (And many of whom I continue to learn from even now!) In no particular order:
Steve Kass Itzik Ben-Gan Anith Sen Tibor Karazsi Erland Sommarskog Joe Celko Aaron Bertrand Kalen Delaney Andy Kelly
Thank you all. You truly shaped me and my career more than you will ever know.

Playing the Third-Party Recruiter Game (T-SQL Tuesday #093)

T-SQL TuesdayFinding a new job can be a long and stressful proposition. This holds true even for skilled candidates. And even for skilled candidates in an atmosphere like we have in 2017—in which everyone who is employable is, for the most part, employed, and IT positions often stay open for months due to a lack of qualified people.This is a great time to look for a new job, but you still need to put in the work of looking and interviewing. This month’s T-SQL Tuesday is hosted by Kendra Little, and the topic is job interviewing. This is a bit of a departure from the usual T-SQL Tuesday technical topics, but a perfect one in my opinion: Even those of us who think we are in our perfect, forever, dream jobs, will almost certainly have to sit on the interviewer side of the table during some of those dreamlike workdays. And when that bubble pops, the same people will get to sit on the interviewee side. It’s always worthwhile to keep your skills and résumé up to date.When I saw the topic I thought back to my own experiences interviewing and being interviewed: Early in my career I was a bit of a job hopper; I have been a hiring manager on a few occasions; and I have spent several years consulting and contracting. In all of these cases, that’s meant lots of interviewing, and so I started work on a comprehensive guide. Alas, that turned out to be too much to complete for today, so I’ve decided to present what I feel is one of the most important pieces.Third-party recruiters. External firms that employers work with to get the word out on positions and find candidates willing to have a chat. In exchange, the recruitment firm gets a nice reward: usually 20% of the salary for each candidate who gets hired. This setup seems simple enough, but it’s full of pitfalls for the candidate and in my opinion is one of the most important aspects of job hunting in our current era.When I started my career in the late ‘90s, recruiting firms were certainly common, but when I had my résumé up on a job board I would much more often hear directly from employers. Times have changed and these days, my LinkedIn profile is messaged daily by recruiters, and only very rarely by direct employers. Just like every other service, the job of finding new employees has been almost fully outsourced. And just like much outsourced labor, the work is often shoddy and of ridiculously low quality.Don’t get me wrong. There are some good recruiters out there. I can name at least three of them. The majority of them, however, know nothing about the job for which they’re recruiting, nothing about how to sell people to the employers they’re supposedly working for, and nothing about you. Nor do they care.In the ‘90s I could (and did) simply ignore third-party recruiter requests. But today, that’s no longer a great option. Many times, unless you have a direct contact at a company, recruiters are the only possible way of getting in the front door. This means that you must understand how and when to leverage these people.At the same time that recruiting has become more and more popular among employers, it has also become a good way for recent college graduates with no career prospects to attempt to make a quick living. These low-end recruiters are paper pushers. They receive a job description from their higher-ups, spam it around, and hope that something sticks.These low-end recruiters are to be avoided at all costs. They will mess up your job search.Third-party recruitment follows a basic pattern, and we’ll get in to the ins and outs of each step.
  • Job description dissemination, in which the recruiter sends out information in hopes of collecting résumés.
  • Résumé collection, in which the recruiter collects the results of the first phase.
  • Résumé submission, in which the recruiter sends your information to an employer.
  • Employer contact, in which the recruiter helps the employer decide whether to move forward with a given candidate, at various stages of the process.
  • Interview prep, in which the recruiter should help you get ready to talk to the employer.
  • Salary negotiation, in which the recruiter should help you get the best possible deal.
These steps are all simple enough, but the basic question to keep in mind is: Who is the recruiter working for and what is his or her goal?Recruiters, for the most part, get paid only once a candidate is hired. And just like anyone else, recruiters want to maintain positive cash flow. This often means getting as many candidates as possible in the door and attempting to rush the process through to get one hired, so that the recruiter can move on to the next victimperson.The sticky part of this situation, for you as a job seeker, is the résumé submission phase. Most recruiter contracts state that employers must pay, upon candidate hire, the first recruiter who submitted the candidate. This has some interesting caveats, and I now present a true story to illustrate:
A while back I was contacted by a recruiter who sent me a very interesting job description. I decided to move forward, sent the recruiter my résumé and heard nothing back. Repeated e-mails were never answered, and that was the end of it. Later, another recruiter, with whom I already had a relationship, sent me the same job. I told him that I was very interested, but thought that there was a chance I had already been submitted. The second recruiter checked with the hiring manager, who said he’d never seen my résumé. He saw it that day, liked it, and we did a phone screen the next afternoon.Just as I was getting prepped for an in-person round I received a phone call from the original recruiter. It seems his friend in HR at the employer had tipped him off that I was coming in for an interview. What had happened to my original submission? I found out later that around the time of my submission the employer had decided to stop working with that recruiter’s firm due to its recruiters being unprofessional and unreliable. But the employer was still contractually bound by that first submission clause. At this point there were two recruiting firms involved, both expecting payouts if I was hired, and a strange contractual gray area given that I’d interviewed only after that second submission. So the employer did the only thing it could do: It stopped talking to me. Game over, for 18 months until that initial contract expired. By which time I was doing something else and the employer had hired some other person.
What is the moral of this story? You must be incredibly cautious when allowing someone to represent you. Looking back, all of the red flags were there. I just didn't notice them.Let’s get back to those phases listed above, starting with résumé submission—the first point at which the recruiter talks to the employer on your behalf. It’s interesting to compare the typical low-end recruiter with the ideal one that you actually want to work with.
  • Résumé submission. A shoddy recruiter will simply email your résumé. A decent recruiter will call first, sell you a bit, and then send the résumé. Ideally after working with you on it a bit. These moves help keep you from getting burned.
  • Employer contact, a.k.a. selling you. A shoddy recruiter won’t bother following up after submitting your résumé. A good recruiter knows that hiring managers sometimes get busy, and everyone hates reading résumés anyway. So the good recruiter can and will push a bit. “Did you look in detail at Sara’s resume? She would really be a great fit for the role.” Work with the bad recruiter who doesn’t follow up? Once again, you’re burned, even if the employer never officially closes the loop.
  • Interview prep. A bad recruiter will simply give you a time and place. A good recruiter, on the other hand, will give you details (or help you find out): who you will talk to, what kinds of questions to expect, how long you’ll be in the interview, and so on. (All important information, but for another post.)
  • Negotiation. Most salary negotiations, when a third-party recruiter is involved, go through the recruiter. Do you want some loser negotiating for you, trying to end things as quickly as possible? Or do you want someone who will play the game a bit in hopes that both of you will profit?
Remember at all times who the recruiter is working for. It may not be you. As a result of these rather nasty career pitfalls, it is absolutely imperative that you work only with better recruiters. Better recruiters will establish relationships. They’ll get to know both the employers for whom they regularly recruit and a set of candidates, and will be able to make more intelligent placements. Remember that recruiting can be incredibly profitable; so there are good reasons for someone who is good at recruiting to stay in the industry for a long time. Veteran recruiters are ones you want to bet your next career move on.If you hear from a recruiter with an interesting job description, the very first thing you should do is verify that this is someone you want to talk to. Go look them up on LinkedIn. Check their job history. Do they have a lot of contacts? You’re a professional and you want to be represented by a professional. If the recruiter has been working in the recruitment field for less than five years, or has a suspiciously low number of contacts, give them a pass to avoid being burned.Recruiter meet the mark? Job sound at least marginally interesting? Ask them for a phone chat to discuss in more detail. The third-party recruiter is not the employer, so you should be extra-candid in your questioning. Take this opportunity to not only find out about the job, but also to interview the recruiter a bit more. Ask the recruiter if he or she has personally spoken with the hiring manager. Ask the recruiter if he or she has personally previously placed candidates at this company, and how the interview process works. Again, you want that personal and professional touch. You also can and should ask the recruiter for feedback on your résumé. A recruiter who has been in the industry for a long time has seen a lot of candidates, both good and bad, and should be able to help you smooth out rough spots. Remember that you’re the recruiter’s product—and a valuable one at that. Take every advantage you can get.If everything works out, have the recruiter send your résumé to the employer. Then sit back and wait. If you don’t hear back within a couple of days, send an email. The recruiter should be on top of things, and this is the final test. If you still don’t hear back from them—even if it’s just “I’m still waiting on the employer,” then write back and tell the recruiter you’re no longer interested. Ask if they can withdraw your résumé submission. They’ll never say yes, but there is a small chance this request will wake them up. I’ve learned that hard way that if a recruiter shows lack of interest early in the process, the same attitude will persist later. And unfortunately this may land you in one of those nasty states I already alluded to.Third-party recruiting has become much more prevalent in recent years, but it doesn’t necessarily have to be a bad thing. Take control and by all means don’t be afraid to say “no” in order to keep yourself out of a bad situation. And never, ever, let a recruiter do or say anything on your behalf without your explicit permission.Best of luck, and happy interviewing!

The SQL Hall of Shame

SQL Server is a big product, with lots of features. Many of the features—hopefully the vast majority of them—are solid and allow us to build trustworthy, secure, and reasonably usable database applications. But nearly everyone has stumbled into a few, shall we say, “darker areas” of the product now and again. Strange and seemingly nonsensical enhancements have come and gone over the years, and we’re all certainly aware of the Microsoft tendency to occasionally introduce amazing-looking but almost unworkable v1 features that never get another glance from the product team prior to their deprecation years later.The Most Useless Feature in SQL ServerThe other day Klaus Aschenbrenner brought this topic to the forefront of my mind (or at least my Twitter stream) with his bold proclamation that the generally lauded “in-memory OLTP” is the “most useless feature in SQL Server.”Personally I have yet to see an especially interesting and non-trivial use case for the feature (and no, user session state management is certainly not it), but most useless? Surely, I thought, that prize has to go instead to English Query? Dejan Sarka quickly chimed in and recommended Data Quality Services. It’s tough to argue with that one, but then Brent Ozar mentioned auto-shrink…It was at this point that I realized just how many candidates there are for “most useless” things lying around the product. So I decided to create my own tweet. I asked for the most useless feature, anytime between version 7.0 (which I would call the beginning of SQL Server’s “modern era”) and now. I received quite a few suggestions, and so I have decided to catalog them here—along with a bit of personal commentary on each one.The list that follows is mostly unordered and culled straight from what I received on Twitter. Hopefully I haven’t missed anything due to Twitter’s weird threading and “priority” mechanisms. And please let me know in the comments if your favorite useless feature is missing, or you’d like to add a comment and/or argument about one of these. Perhaps we can find some way to turn these dark and ugly corners into things of beauty? Well, we shall see…So without further ado, the Most Useless SQL Server Features, per my Twitter followers, as of June 2017!In-Memory OLTP. The one that started it all. Useless? I guess it depends on where you draw the line between “OLTP” and “memory optimized.” The latter is, I think, demonstrably useful in certain cases. For example, improving the performance of temporary tables. But as for the full-blown OLTP use case, as well as the various native optimized options? Very few databases suffer from the exact types of latch contention for which this feature was created, and there are much better and easier ways to get even better performance gains than are offered by native optimization. This feature seems to be getting continued attention from Microsoft, so it may well prove to be the foundation for some very useful things in coming releases. I am going to reserve judgement for now, whether or not Mr. Aschenbrenner agrees.English Query. My number one vote. This was an attempt at bringing “natural language” capabilities into the product. I believe it was introduced in SQL Server 7.0. I’ve never talked to anyone who used it in a production environment. Personally I never even installed it. I don’t recall hearing that it was deprecated, but it disappeared at some point, maybe in SQL Server 2005? No one cried. I guarantee it.Data Quality Services (DQS). This one was so promising on the surface, and something that is absolutely needed in many data warehouse projects. A centralized business rules and data verification platform can really help reduce complexity. But DQS never made it past the initial stage. It was (is?) effectively a collection of valid text strings—and that’s it. No real “rules,” no real workflow. And abysmal performance. Abysmal. An excellent suggestion for this list to be sure.Master Data Services (MDS). I remember when Microsoft acquired this tool. I was working on a large data warehouse project. We watched an online demo and we’re blown away. We decided right away to use it the moment Microsoft released its version. But then that didn’t happen. For a long, long time. Two years later, as I recall, it finally hit the street. And while I never personally tried it, from what I heard it was a complete disaster of bugs and faulty workflow. I’ve heard that it has gotten marginally better since then, but I think the ship has sailed. Microsoft lost the trust of its end users and this product is forever doomed for inclusion in lists like this one.Auto-Shrink. “Stop my file from growing and taking over the disk!” Sounds great, but is it? If you understand even a little bit about SQL Server architecture you understand why that file is growing. And sometimes you really do need to shrink a file, but if you look at the internals of shrink you know that it can cause fragmentation. So shrink is something you want to avoid, not automate. Auto-shrink is a feature put into the product for users who don’t want to deal with the product. It can only cause you more problems. Useless? Sure, I’ll go there.Auto-Close. This option is one that few people know what to do with. It’s there for a reason—it helps large instances, with a tremendous number of databases, make better use of system resources when not all of those databases need to be used at the same time. Most shops shouldn’t turn it on. But useless? I’d say no.Auto-Grow. When creating a file in SQL Server you have various options. You can pre-allocate as much space as you think you’ll need and hope to fill it up later. Or you can allocate the minimum amount of space and let the file grow itself, or not over time. If you only pre-allocate, you risk not having enough space there when a spike occurs and you need it. If you let your files grow too often, you’ll introduce fragmentation. If you grow your files by large percentages, you can wind up consuming way too much disk space. If you combine the best of all of these—pre-allocating using metrics and allowing for incremental growth as a safety net—you’ll be in pretty good shape. All of which is a long way to say that I have no idea why this feature was suggested; it is, in my book, anything but useless. I think about it and consider its configuration every single time I create a database file. You should, too.Notification Services (SSNS). This feature was a framework designed to help generate emails, reports, and other forms of “notifications” when data changed certain ways. (Example: “Let me know when the stock price hits $40/share.”) Added sometime in the early 00s, it was more or less doomed to failure from the get go: It was driven by a very confusing XML-based interface, the documentation was beyond horrible, and the book authoring community largely ignored it (with the exception of a few pages of high-level coverage here and there, and a book written by the guy who created the feature for Microsoft). If you had the patience to muddle through the mess (as I did!) it was possible to get the thing running and doing some fairly decent things. But the learning curve was so steep, and the general situation so poor, that Microsoft didn’t even bother keeping v1 around for more than a single release. The plug was pulled and Notification Services was deprecated prior to SQL Server 2008. Probably for the best.Query Notifications. Entirely unrelated to, but often confused with Notification Services. This is a SQL Server and ADO.NET feature that allows applications to receive events when the results of a SQL Server query might change due to data modifications. This feature was released as part of SQL Server 2005 and would be hugely useful in so many cases except that it has proven itself time and again to be entirely undependable and nearly impossible to debug. And Microsoft has, as far as I can tell, put exactly $0.00 of investment into improving the situation. I am actually getting angry as I type, because there are so very many excellent use cases for this thing that has been left out there in its piece of junk state—but I can’t use it in any but the simplest of scenarios due to the risk factor. Thanks, Microsoft.Buffer Pool Extension (BPE). A feature that received a lot of attention in SQL Server 2014. Well there wasn’t much else in SQL Server 2014 to give attention to. What exactly is the use case for this thing? It’s not entirely clear, but the combination of a fast local SSD with really slow primary storage isn’t especially common. On Twitter Joey D’Antoni suggested some Azure use case, but I felt like he was pushing for it. Useless? Probably for most of us.Management Data Warehouse (MDW) / Data Collector. Everyone who does anything serious with SQL Server buys or builds a monitoring tool. That’s not cheap. Microsoft could provide huge value to its users by shipping a bunch of monitoring and collection capabilities with the product! At least, that was the idea back in 2008. MDW is a classic example of a great idea that got rushed into the product as version 1.0 and then never touched again. It is famously unable to be uninstalled once you turn it on. The data collector is famously limited in what it can do unless you exercise various arcane hacks that are only documented on long-dead blogs written by people who no longer work for Microsoft. And the reporting end is almost nonexistent. Why the ball was dropped, I’ll never know. But in the meantime, I don’t plan to bother turning this thing on.Lightweight Pooling / Fiber Mode. This rather arcane feature was added to help very marginally improve performance of very specific workloads. From Day One it shipped with a warning: Don’t turn this on. It breaks various SQL Server features which, apparently, aren’t needed for that very specific workload. I don’t recall the exact specifics around when or why you’d use this, but I never have. No one I’ve talked to ever has. I’m not even sure it still works, but I won’t be trying it out. I wonder if the handful of customers for whom this was created actually benefitted? If that’s you, let us know!JSON. The long-awaited follow-up to XML support has finally arrived in SQL Server 2016, and right away someone votes it as most useless feature ever. Tough crowd. Personally I’m on the fence here. I don’t see much reason to store data in a “relational” database in JSON format (or XML for that matter). I do see some very limited use case around rendering data to JSON, but I would question why the app can’t do that? Same on the shredding side. Looking back at the XML features, I have mainly used them to query Dynamic Management Views; and in that case I only did that because I had to, due to the fact that those views expose so much data in XML format! I also heavily use XML for concatenating strings. Not so much actual XML work, though. I assume the same story will hold true for JSON. A few people here and there will use it. Most of us won’t, until Microsoft starts converting all of the system objects over. (Please, no!)SQL Server Management Studio (SSMS). I live in SSMS all day long. Aside from a few crashes and the occasional performance issue I’ve had no problem with it, ever, and I felt that it was a huge upgrade from the Query Analyzer and Enterprise Manager days. Sorry, whomever suggested SSMS for this list, but I am absolutely a fan. I’ve tried a few third-party management tools and they don’t even begin to come close. SSMS, useless? Not at all.Connect to SSIS from SQL Server Management Studio. I don’t even know what to say here. I saw this option once or twice, tried it, didn’t see anything that pertained to me, and never really looked into it. Maybe Andy Leonard or some other SSIS specialist can fill us in on the point of this thing. I can’t vote one way or another because I completely lack the context.DROP DATABASE IF EXISTS. This one was suggested by Greg Low, who explained that it fails if anyone is connected to the database. And disconnecting them first requires checking if the database exists. Well that makes way too much sense. Useless.Columnsets (and Sparse Columns in general). I remember being really excited when these features shipped, and I couldn’t wait for them to get some enhancements. (I asked for columnset data type information; my request was denied.) For those who don’t recall, Sparse Columns is supposed to be a way to model really wide tables in which most of the attributes are NULL for any given row, and columnsets is a way to bring back some of the sparse data in an XML format. And these would have been interesting features except Sparse Columns requires making a DDL change on the target table, and it’s really confusing as to when you should or should not use it and how much space you’ll save and so most people just stick with the tried-and-true Entity-Attribute-Value approach. And then there is the manageability problem of maintaining a table with 20,000 columns. Yeah, no thank you. Columnsets were sort of just tacked on to the whole thing and maybe if you could create more than one set and name them or something like that it would be cool, but no, you could only have one, and it has some sort of weird limitations that I don’t remember right now because I’ve never looked at it again and don’t plan on it. I’ll mark this entire feature set, sadly, as useless. But it could have been so great. Version 1.0 For Life strikes again.SQLCLR. If you’ve been following me for any amount of time you know I’m a fan. It has its quirks, it’s still pretty much a v1 thing after 12 years, but I’ve gotten plenty of great use out of it. So has Microsoft. Various SQL Server features leverage it internally. Needs love, but hardly useless from where I sit.Utility Control Point (UCP). I remember hearing about this and reading up on it when it was released. (2008 R2?) But I can’t say much beyond that. I just looked at the documentation and I’m not entirely sure what this thing is supposed to be. I’m not going to spend any more time on it. Does anyone have a counter argument for why we shouldn’t call this useless?Raw Partitions. I am not a storage guy, but I guess this can help improve disk I/O if you’re willing to take on some risk. That seems like a bad tradeoff for most databases. ‘nuff said.Uninstall. I’ve lost count of the number of times I’ve installed SQL Server over the years, but I’ve probably only uninstalled a few. And I can’t say it was ever a simple experience. SQL Server tends to leave fragments all over the place, like many Microsoft products, and uninstall is anything but clean. The best way to do it is to simply pave away. So I have to agree with whomever posted this suggestion on Twitter. Uninstall, you’re pretty useless.Service Broker (SSB). Ah, Service Broker. I have a real love-hate relationship with this feature. I’ve used it on lots of projects. It works well, when it works. It’s a pain to debug when it doesn’t work. Its architecture is ridiculous and beyond overbaked. (Conversations and dialogs, really? All we want is a simple queue! And the fact that you can create fake conversations to make it go faster? I can’t even…) But when it comes down to it, having some kind of queue in the database is better than not having any kind of queue in the database. It’s solid, transactional, and after a bit of configuration it usually gets the job done.Not Freeing Allocated Memory Except Under Pressure. SQL Server allocates as much memory as it needs, up to however much you tell it to use. (The memory configuration amount is even referred to as a “target.”) It uses all of that memory to store your data and query plans so that it doesn’t have to re-retrieve it off of slow disks or re-compute them, respectively. And that is a good thing. Why should it prematurely free that memory? If you don’t want it using as much memory, tell it to use less memory. Useless? No. This is absolutely a feature and a correctly implemented one at that. Sorry, Twitter person.Database Engine Tuning Advisor (née Index Tuning Wizard). Back when I was a SQL Server newbie I needed to make things fast. The only problem is that I had no idea how to make things fast. So I clicked around and there was this wizard that said it would analyze my workload and tell me what to do. How cool is that? I tried it and it suggested all sorts of interesting things, which I did. And then a year later I knew a bit more and was kind of embarrassed that I’d created some of these ridiculous artifacts. And then a few years later Microsoft gave the thing its kind of downgraded new name—it’s no longer a full-on Wizard, now it’s merely an Advisor. But here’s the thing: I want my advisors to give me good advice, not wreck my database. So I’ll go elsewhere for advice, like for example my brain. Thanks.DBCC PINTABLE. The guy who doesn’t like SQL Server not freeing allocated memory would lose his mind over this one. A long-gone feature at this point, it specifically marked tables such that their data would be infinitely held in memory. (I can’t recall what was supposed to happen under pressure.) As it turns out this is pretty useless given the way SQL Server’s least-recently used cache aging algorithm works. So yeah, useless.Virtual Interface Adaptor (VIA) Network Protocols. I don’t even know what this is, I just wrote commentary on over 20 other things, and I’m too tired to read up. So sorry, Gianluca Sartori. Tell us all about it in the comments, if you would.Mirrored Backups. Another suggestion by signore Sartori. I have no feeling on this one way or another, but it never seemed like taking a backup and then copying it was all that much of an issue so I don’t really see the point.SQL_VARIANT. Long ago I did some work in Visual Basic and quickly became annoyed with the fact that I never knew what type my variables were, or even whether they had been properly initialized. Welcome to the joys of variable data type systems. I was able to work around this somewhat by using Option Explicit, but now let’s think about the difference between your application—which needs to deliver data to your users—and your database—which is expected to store, master, and protect that data. Using a variable data type in the application is annoying. Using it in the database? That’s downright scary. That means you don’t know your data. That said, this data type has been useful over the years in very special cases, such as functions that need to be able to seamlessly return various different types of data. As a matter of fact, Microsoft used SQL_VARIANT in SQL Server 2016 as part of the context data enhancement. So useless? No. But use with extreme prejudice.So that’s that. The complete list from Twitter, annotated. Got more? Want to argue, agree, or otherwise? Comment away.

The Guru (T-SQL Tuesday #089)

T-SQL TuesdayI became a consultant a bit earlier in my career than was probably wise. The going was rough at first, with periods of feast and periods of famine. Although I had several clients, I didn’t understand how to sell services or schedule my workload, and so I’d wind up either doing huge amounts of work for a few weeks, or absolutely nothing some other weeks. The topic of this month’s T-SQL Tuesday is “Database WTFs” and it was during one of my periods of relative inactivity that I met Jonathan at a local user group meeting, which resulted in one of the more memorable WTF situations of my career.Jonathan was a few years older than I was. Most of the user group attendees wore jeans and sneakers. Jonathan had on a nice looking suit and shiny shoes. He had a smooth demeanor and introduced himself as “a consultant and SQL guru.” Literally, with that exact phrasing. We struck up a conversation and he told me about all of the amazing work he was doing with various customers. I was extremely impressed. Jonathan represented the consultant I thought I wanted to be, a bit further down the road.I told Jonathan about my struggles with finding steady work, and he immediately offered to introduce me to a company he’d spoken with, CorpX, Inc. He explained that they were a growing company and needed lots of performance tuning help. He was just too busy on other things to give them a hand, but maybe I could. I eagerly agreed, and he did an e-mail introduction the next day.I don’t remember the initial conversation with CorpX, nor whether there was any kind of interview, but we signed a three month agreement and I told my customers I wouldn’t be available for a while. Before I knew it I was nervously walking in to a strange office for my first day on the job. Once in the company’s space, the receptionist told me she wasn’t expecting me, but that I could sit down and wait while she sorted things out. She offered me a magazine, but only after 20 minutes of waiting did I actually pick it up and start flipping through the pages. The receptionist told me that she thought I was supposed to meet with the CTO, but he was busy, so I should keep waiting. An hour later I’d read the magazine cover-to-cover, so I picked up a second one.At some point, a few magazines in, a guy stopped by to chat with the receptionist. He was a some sort of IT person, and she asked him if he knew when the CTO would be available. “Oh, he went for coffee a while ago.” However, the guy had heard that I was supposed to be there that day, and was able to lead me to my space: A cubicle in a dark corner of the basement. Just like Milton’s. He left me there to stew, but it didn’t last long. I had no login credentials, nothing to do, and was tired of wasting my day. So I tracked him down again and asked him to introduce me to the DBA.The DBA was a red-faced guy named Dave, and he seemed to be in a state of panic. He spoke quickly and told me he didn’t have time to deal with me; he was in the middle of something way too important. His large office was perfectly clean and ordered, and I clearly recall the three huge monitors set up on his desk (well, huge for the time). A complex ER diagram covered all available screen real estate. I told Dave that I could possibly help out with his SQL Server problem, and he replied that there was nothing for me to do. He deeply involved in designing the next version of the database. He couldn’t possibly think about anything else. He asked me to leave his office.Vintage Baseball BatsIt was around noon, so I left for a while to get lunch. When I returned to the office, this time I was marched directly to the CTO. “Where have you been? I’ve been waiting for you.” The CTO was a middle-aged guy with a beer gut and, apparently, an extreme baseball fetish. His office was filled with baseballs, helmets, jerseys, and bats. He absentmindedly played with a bat as he talked to me, lifting it and bringing it down on his desk when he wanted to emphasize a point. “We hired you as a pinch hitter. We retired our SQL Server team a few weeks ago, because I finally came to the realization that SQL Server can’t meet our performance demands. It’s like the database farm team, you know? We’re moving to Oracle in six months. That’s a real database! Until then, though, we need to play hard and score one for our customers, and that’s where you come in. You’ll turn our screwballs into fast balls. Well, as fast as that crappy thing can go, am I right?”I asked him if he was sure that SQL Server was at fault. Even though I was still fairly young, I had already worked on some pretty big projects. And I had a number of very experienced friends, including some well known MVPs who talked about their work on gigantic systems. “Yes, we’re sure. We had Jonathan in here, and he is a SQL guru. He told us that SQL Server simply does not scale.”Part of me wanted to get up and run, but I was intrigued. I decided to treat it as a challenge. I would prove him wrong and save the company millions on its Oracle transition. Maybe I’d score a steady long-term client in the process. I asked for access to a test system and to the source control repository, and the CTO gave me a blank look. “You’re here to performance tune our production database.” There was no need for a test environment, he told me, and he wasn’t aware of any database code in source control. He handed me a printout of a network login and password and told me that I had plenty of access to “work some magic.” I asked him which part of the application to focus on, and he told me to work on any piece I felt like; they were all equally slow. He concluded our discourse by saying that he was a very busy person and as a consultant I was expected to figure things out for myself and not bother him or his staff with details. I was to come back and talk to him when I had made some progress, and not before then.Back in my basement hideout, I spent the next couple of hours exploring the network and figuring out which server to connect to. The CTO was right; I did have enough access. I was sysadmin on the production SQL Server and had full admin access to the app server. I logged in to the app and with the help of a Profiler trace managed to figure out one of the main slow stored procedure calls that occurred any time someone saved a change via the user interface.Pasting the procedure call into SSMS, I turned on Actual Execution Plan, hit F5, and got ready to see indications of a few missing indexes. I was ready to walk back upstairs, gloat to the CTO, and ask for a better workspace so I could continue to help. What I didn’t expect was what actually came back: Not one execution plan, or two, or three, but hundreds and hundreds. The scroll bar become progressively smaller as time clicked by and the elapsed counter did not stop running. All I’d done in the application was change the name of a single field. What was going on?RecursionI opened the stored procedure and it was a simple update against a single table. But after poking around in Object Explorer I discovered that the table had four different insert and update triggers. Each of the triggers created a cursor over the inserted table, and for each row therein called one or two stored procedures. Each of these stored procedures was seemingly more complex than the last, with increasing amounts of dynamic SQL and nested cursors. And each did updates or inserts to at least one, and usually several tables. As I chased down those tables I saw the same pattern again and again: More triggers on more tables, with more cursors and more stored procedure calls, with more dynamic SQL and more cursors and more stored procedure calls. I began recursively searching for the bottom of it all, but gave up once I hit 9 or 10 levels. The tables were named with some sort of scheme involving various abbreviations, but with no documentation or support, I had no clue what was going on.Getting concerned about my ability to unwind this mess, I looked around the database and discovered that the same pattern was used on every single table. I ran some counts and found around 400 tables, 1200 triggers, and several thousand stored procedures and user-defined functions. Not to mention a massive number of views, all extraordinarily nested.The best part? Every single module in the database had a dated copyright notice, containing my friend Jonathan’s name. He had apparently built the entire thing himself over the prior four years, no doubt billing some truly obscene number of hours. And then he had gone ahead and signed his work. To this day I still wonder about the legal implications of that copyright notice. What does it mean to copyright a work for hire that contains logic specific to someone else’s business? And why did CorpX allow this in its code? Over all of that time did no one else ever look at the database?I wandered upstairs to ask the CTO what he expected me to do with such a massive mess, but the office was empty. Sitting in the basement, I’d lost track of time, and had spent four or five hours just researching that one initial bit of code path. It was late in the evening, I’d fixed nothing, and I knew that I wouldn’t be able to work any magic. I realized that the guru had already tricked us all, and I decided to cut my losses. I packed up my things, logged out, and left the building. I emailed the CTO that night and politely told him that the engagement wouldn’t work out. I never billed for my day, and he never replied to my note. I looked up Dave, the DBA, on LinkedIn a few months later. He was now “database architect” for a different company. Years later I met someone else who worked for CorpX and found out that they were still, miraculously, in business. Perhaps in the end they did find a magician; I was told that they were still running SQL Server.Copyright JonathanI ran into Jonathan one other time, and he asked me why I never thanked him for getting me into such a sweet gig. I merely shook my head.(Yes, this is a true story. Well, mostly. I changed some names and details to obscure things a bit. But this is pretty much how things happened. I learned a lot that day!)

Solving the Net Changes Problem with Temporal Tables (T-SQL Tuesday #087)

SQL Server 2008 was perhaps not the meatiest of SQL Server releases, but it did have one especially promising feature: Data change detection. Or maybe two features, depending on how you look at things. I’m talking, of course, about the almost confusingly similarly named Change Data Capture and Change Tracking.Change Data Capture was the big gun that promised to solve everyone’s change tracking woes (well at least everyone running Enterprise Edition). In practice it was a heavy, overwrought hack of a feature. Built on the foundations of replication—a feature designed to move data, not record and store changes to it—the entire system was quirky and problematic. Users had to become familiar with odd APIs, low-level transaction identifiers, and the occasional call to sp_repldone for that wonderful moment when the whole thing crashed and burned. But enough about this feature. If you want to know more about it and its future in SQL Server, look it up in the Azure SQL Database documentation.Change Tracking, on the other hand, was designed to solve only one problem. It did so using a totally integrated approach, with changes recorded as part of the write transaction, first-class T-SQL syntax, and a very simple and well thought out API. The Change Tracking problem? Net changes. Simply put: First give me all of the rows. Then let me periodically ask what’s changed since last time I asked. When I ask, tell me how to synchronize my copy, cache, app, or whatever is on the other end, to get it into the same state as the host data set.T-SQL TuesdayFast-forward to SQL Server 2016, and now we have Temporal Tables. This feature is, in many ways, the best of both worlds when it comes to change detection. Temporal Tables is as well integrated into the query engine—and the transaction—as Change Tracking, but captures all changes, much like Change Data Capture. It has a clean and simple interface, and while it’s definitely a v1 technology it seems like a much more solid foundation than either of the SQL Server 2008 features ever did.This month’s T-SQL Tuesday, hosted by Matt Gordon, asks for solutions to old problems using “shiny new toys.” I wondered—could I solve the Change Tracking net changes problem using Temporal Tables? Unlike its predecessors, net changes is a major use case that Temporal Tables simply doesn’t do right out of the box. Turning on both Temporal Tables and Change Tracking side-by-side is an option, but that seems like overkill. One change detection technology should be enough…or is it?Thinking through the problem, it quickly became apparent to me that once you have all changes—as we have with Temporal Tables—getting to a net scenario is not especially difficult. Given a primary key and a time interval over which we’re concerned about changes,  there are of only four possible outcomes:
  • A new row for the key was inserted
  • The row was updated
  • The row was deleted
  • Nothing happened
For now we can ignore the final outcome and focus on the first three, but of course any proper solution must ensure that the negative case is taken into account so as to not send false changes.For now, consider the following set of propositions, given that we’re asking at time Y for all changes since a prior time X.
  • INSERT: The key did not exist at time X but does exist at time Y.
  • DELETE: The key existed at time X but does not exist at time Y.
  • UPDATE: The key existed at both time X and at time Y, and at least one change occurred between time X and time Y.
Given these assumptions, we can begin work on a temporal queries that return the necessary rows. Solving for these conditions will require all rows that were active as of time X, rows that were (or are) active at time Y, and for the final case, all rows that were active in-between times X and Y. Since this is a range-based scenario, our best Temporal predication option will be either FROM or BETWEEN. The difference between these two is subtle: FROM uses an open interval (non-inclusive at both endpoints), whereas BETWEEN uses a half-open interval, inclusive on the end date. Given the choice in the scenario, BETWEEN makes more sense, as we can take advantage of the inclusive endpoint to avoid dropping a badly-timed row. But more on that in a moment.To begin solving for net changes, we must first get all changes over the time interval in question:[sql]SELECT t.pk, t.[other columns], t.valid_start_time, t.valid_end_time FROM [table] FOR SYSTEM_TIME BETWEEN @start_time AND @end_time AS t [/sql]This query will return all rows with a valid_end_time greater than the passed-in @start_time and a valid_start_time less than or equal to the passed-in @end_time. With that in mind, we can start putting together some expressions for each of the operations in question.First step in tackling the insert: Did the key only come into existence after @start_time? The following expression finds out by testing the minimum valid_start_time per key; if it’s after the passed-in @start_time, we know it’s a new key.[sql]CASE WHEN @start_time < MIN(t.valid_start_time) OVER (PARTITION BY t.pk) THEN 1 ELSE 0 END AS is_new_key [/sql]A key could have been both inserted and deleted in the interval between @start_time and @end_time—or simply deleted altogether—so a second expression is necessary to determine whether the row still exists at @end_time. The expression below accomplishes this by checking the @end_time against the maximum valid_end_time per key. If the @end_time is greater than or equal to the maximum valid_end_time then the key must have been deleted. (This is where we’re taking advantage of the inclusive endpoint used by the BETWEEN predicate.)[sql] CASE WHEN @end_time >= MAX(t.valid_end_time) OVER (PARTITION BY t.pk) THEN 1 ELSE 0 END AS is_deleted_key [/sql]The final case is that of update operations. A ranged Temporal predicate will return at least one row per key that was active during the range. If the query returns more than one row for a given key, some change must have occurred. A simple COUNT will suffice in this case.[sql]CASE WHEN 1 < COUNT(*) OVER (PARTITION BY t.pk) THEN 1 ELSE 0 END AS is_changed_key [/sql]Putting this all together, we can come up with a general template query to answer net changes style questions. To make this work I put the above expressions into a derived table, to which I added a ROW_NUMBER. The numbering is used so that we get only a single version for each row – the most recent one. (Consider cases where a row was updated multiple times during the interval.) The query below uses the derived table [x], and applies a WHERE clause that filters the set based on the expressions, in order to output only rows we’re interested in: new, deleted, or changed rows, except in cases where a row was both inserted and deleted in the input range. The CASE expression in the outer SELECT list is used to determine what type of operation will have to be done on the synchronizing side in order to bring the two sets into parity.[sql] SELECT x.*, CASE WHEN x.is_new_key = 1 THEN 'INSERT' WHEN x.is_deleted_key = 1 THEN 'DELETE' ELSE 'UPDATE' END AS change_operation FROM ( SELECT t.pk, t.[other columns], t.valid_start_time, t.valid_end_time, CASE WHEN @start_time < MIN(t.valid_start_time) OVER (PARTITION BY t.pk) THEN 1 ELSE 0 END AS is_new_key, CASE WHEN @end_time >= MAX(t.valid_end_time) OVER (PARTITION BY t.pk) THEN 1 ELSE 0 END AS is_deleted_key, CASE WHEN 1 < COUNT(*) OVER (PARTITION BY t.pk) THEN 1 ELSE 0 END AS is_changed_key, ROW_NUMBER() OVER (PARTITION BY t.pk ORDER BY t.valid_end_time DESC) AS rn FROM [table] FOR SYSTEM_TIME BETWEEN @start_time AND @end_time AS t ) AS x WHERE x.rn = 1 AND ( x.is_new_key = 1 OR x.is_changed_key = 1 OR x.is_deleted_key = 1 ) AND NOT ( x.is_new_key = 1 AND x.is_deleted_key = 1 ) [/sql]And in theory, that’s that. This query does in fact solve for net changes. But the devil’s in the details, and there is a subtle bug in the above query: What happens if the passed in @end_time happens to fall at exactly the same time that an update occurred on one of my rows? The answer is that the MAX(t.valid_end_time) for that key will be equal to the passed in @end_time, and so the logic above will consider the update to be a delete. Clearly not desirable behavior!The solution? In an update operation, Temporal Tables logs the new row with the same valid start time as the old row’s valid end time. Therefore, to figure out whether an update occurred, we must look forward exactly one row from our actual intended end the. To accomplish this, the following modified version of the query creates a new search end time by bumping the input end date up by 100ns (the finest resolution supported by DATETIME2).[sql] DECLARE @search_end_time DATETIME2 = DATEADD(NANOSECOND, 100, @end_time)SELECT x.*, CASE WHEN x.is_new_key = 1 THEN 'INSERT' WHEN x.is_deleted_key = 1 THEN 'DELETE' ELSE 'UPDATE' END AS change_operation FROM ( SELECT y.*, CASE WHEN 1 < COUNT(*) OVER (PARTITION BY y.pk) THEN 1 ELSE 0 END AS is_changed_key, ROW_NUMBER() OVER (PARTITION BY y.pk ORDER BY y.valid_end_time DESC) AS rn FROM ( SELECT t.pk, t.[other columns] t.valid_start_time, t.valid_end_time, CASE WHEN @start_time < MIN(t.valid_start_time) OVER (PARTITION BY t.pk) THEN 1 ELSE 0 END AS is_new_key, CASE WHEN @end_time >= MAX(t.valid_end_time) OVER (PARTITION BY t.pk) THEN 1 ELSE 0 END AS is_deleted_key FROM [table] FOR SYSTEM_TIME BETWEEN @start_time AND @search_end_time AS t ) AS y WHERE y.valid_start_time <= @end_time ) AS x WHERE x.rn = 1 AND ( x.is_new_key = 1 OR x.is_changed_key = 1 OR x.is_deleted_key = 1 ) AND NOT ( x.is_new_key = 1 AND x.is_deleted_key = 1 ) [/sql]This query uses @search_end_time in the Temporal predicate, but then does all of the comparison work using the original @end_time. For the is_new_key check, this doesn’t matter, as we’re only comparing start times. For the is_deleted_key check it makes all the difference in the world, as an update done at exactly @end_time will populate a new row which will increase the value of MAX(t.valid_end_time). Because the extended search predicate can return an extra row, the is_changed_key and rn expressions had to be moved to an outer table expression filtered by the original @end_time. If an update occurs exactly at @end_time, we don’t want it to trigger a delete operation now, but we also don’t want to actually see it until next time we synchronize.That bug solved, we now must consider a second, slightly more important unsolved problem: What do you use for the @start_date and @end_date?Naïvely speaking we should be able to ask for a base set of rows by using some long-past date as the @start_date—say, 1900-01-01—and the current SYSUTCDATETIME() as the @end_date. Then we should be able to pass back that same @end_date next time as the start date, and so on and so forth. But that approach will open you to a major issue; Temporal Tables simply wasn’t designed for this.Backing up a bit: Change Tracking, which was in fact designed for answering net changes questions, works by placing a surrogate internal transaction identifier on each row touched during a transaction, and then exposing an external transaction identifier that corresponds to the commit time of the transaction. It does this by using an external “transactional commit table” in conjunction with the main changes table. But Temporal Tables uses an entirely different system, wherein there is no external table, and each row is populated with its one-and-only, both internal and external identifier, as it is being written. And every row touched by a given transaction gets the same exact time: The start time of the transaction.So what’s does this mean when we’re talking about net changes? Pretend for a moment that you start a transaction, jump on a plane, go hang out on the beach in Jamaica for a week (highly recommended), and then return to your office (lightly sunburned), update a table of your choosing, and commit the transaction. The time stamped on the row will correspond to the start of the transaction—a week ago, before you ever boarded the plane. Meanwhile, your synchronization process has been running regularly, let’s say once an hour, and it thinks it’s lagged by only an hour. It’s never going to ask for rows that changed a week ago. This is referred to in change detection parlance as a lost update.Does this mean we’re completely out of luck? Of course not—but it does mean that a complete solution will have to consider the transactional state of the system. And you’ll have to monitor to make sure that no one starts a transaction and leaves it hanging around for a week. (You should probably be doing that anyway.)The key to my full solution is the sys.dm_tran_active_transactions DMV. This view contains one row per current transaction—either explicit or implicit—and a handy column called transaction_type which categorizes the transactions as read, read/write, system, or distributed. It also includes a transaction start time, which can be used to figure out which times we might not want to touch just yet, if some transaction is still outstanding from then.The following expression uses the DMV and is designed to output a safe @end_date, that will avoid lost updates. It starts by asking for the minimum start time for any active read/write transactions (transaction_type 1). If there are none, it uses the current SYSUTCDATETIME instead. Just to be on the extra safe side—in case there was a delay in processing due to sitting on a long runnable queue, for example—the expression pulls five seconds off of the end time. Note that this code also needs to be able to covert the local time as reported by the DMV into UTC as required by our Temporal predicate. There is currently no reasonable way to get the local time zone in T-SQL (there is at least one hack involving a registry read, which I do not recommend), so you’ll have to replace the hardcoded Eastern Standard Time with whatever time zone is appropriate for your server.[sql] CONVERT ( DATETIME2(0), DATEADD ( SECOND, -5, ISNULL ( ( SELECT CONVERT(DATETIMEOFFSET, MIN(transaction_begin_time)) AT TIME ZONE 'Eastern Standard Time' AT TIME ZONE 'UTC' FROM sys.dm_tran_active_transactions WHERE transaction_type = 1 ), SYSUTCDATETIME() ) ) ) [/sql]

The final operation I’ve added to the end date expression is to truncate the fractional (sub-second) precision off of the DATETIME2. The reason for this is that it’s important that subsequent change requests use the exact same start time as the prior end time. If you’re storing the value from your last synchronization in a system that has slightly different date rules than SQL Server, you may lose some precision and wind up with a rounding error. So better, in my opinion, to delay changes by up to one additional second, by ensuring that any possible date/time data type will suffice.

Once all of this is in place on your server, pointing at your table, usage is pretty simple:

  • Set a start date of 1900-01-01.
  • Set an end date using the expression above.
  • Ask for “changes.” That’s your input set. (Similar to Charge Tracking’s VERSION.)
  • Record the end date, and use it as the start date next time. The same date functions as both end date one time and start time the next because of the BETWEEN predicate’s half-open interval; if a delete operation occurred at exactly the end date, it will be included in that synchronization due to the end date being inclusive, but it will not be included in the next synchronization because the start time comparison is non-inclusive.
  • Re-populate the end date each time using the expression. It may be worthwhile to log the start and end dates so that you can detect conditions where the new end date is not much greater than the start date. This would indicate either that you might want to increase your polling interval or figure out why user transactions are running for a long time.
  • Repeat as needed. Remember that every cache is effectively stale the moment you’re done synchronizing it, so you must strike a balance between synchronized enough and not putting excessive stress on the host system trying to keep things up to date. And given the safety measures I’ve taken in the above expression, if you use this system and ask for changes more often than once every six seconds, you’ll tend to not get very far.

That’s it—net changes from Temporal Tables. Not the simplest exercise in the world, and maybe not quite as perfect as we’d like, but completely workable none the less. And of course, it’s always fun to put shiny new toys completely through their paces.

Enjoy, and happy synchronizations!

SQL Server 2016 “Hidden Gems” – Now on a YouTube Near You

Sad that you missed last week's SQL Server 2016 "Hidden Gems" GroupBy session?Unlikely, perhaps...but no worries either way! It's now available for free, on YouTube:https://www.youtube.com/watch?v=P4608MNM-QUEnjoy!

SQL Server 2016 “Hidden Gems” Resources

Today I was honored to present my SQL Server 2016 “Hidden Gems” talk for the GroupBy online conference.The slide deck and demos are attached to this post.Following are some links to help you in further exploration of the various features I discussed:STRING_SPLIThttps://msdn.microsoft.com/en-us/library/mt684588.aspxAaron Bertrand’s performance tests: https://sqlperformance.com/2016/03/sql-server-2016/string-splitSession Contexthttps://msdn.microsoft.com/en-us/library/mt605113.aspxTime Zone Supportsys.time_zone_info: https://msdn.microsoft.com/en-us/library/mt612790.aspxAT TIME ZONE: https://msdn.microsoft.com/en-us/library/mt612795.aspxHASHBYTEShttps://msdn.microsoft.com/en-us/library/ms174415.aspxSession Wait Statshttps://msdn.microsoft.com/en-us/library/mt282433.aspxLive Query Statisticshttps://msdn.microsoft.com/en-us/library/dn831878.aspxAbility to attach to other sessions' Live Queries! https://blogs.msdn.microsoft.com/sql_server_team/query-progress-anytime-anywhere/Actual Query Plan Data Enhancementshttps://blogs.msdn.microsoft.com/sql_server_team/added-per-operator-level-performance-stats-for-query-processing/INPUTBUFFER DMVhttps://msdn.microsoft.com/en-us/library/mt652096.aspxDatabase Scoped Configurationhttps://blogs.msdn.microsoft.com/sqlserverstorageengine/2016/05/02/database-scoped-configuration/Increased Number of Index Keyshttps://msdn.microsoft.com/en-us/library/ms143432.aspx#EngineIncreased Index Key Sizehttps://blogs.msdn.microsoft.com/sqlserverstorageengine/2016/04/26/increased-nonclustered-index-key-size-with-sql-server-2016/COMPRESShttps://msdn.microsoft.com/en-us/library/mt622775.aspxColumnstoreToo much stuff to list! Go google it.File Attachment: machanic_sql_2016_hidden_gems.zip