Retrieving a return value from a stored procedure using the Entity Framework DbContext.Database class

I was trying to figure out how to call a stored procedure and then retrieve the return value using Entity Framework in .NET. Entity Framework is designed to be an ORM layer, so it doesn't have particularly strong support for stored procs. My problem seemed to be fairly common, as I found a decent number of results on Google. It seems that lots of shops that use Entity Framework still require the use of procs, perhaps because they don't trust the performance or scalability of Entity Framework, or because a particular procedure encapsulates some complex and highly optimized SQL logic.

Whatever the reason, the code had to call a stored procedure and retrieve its return value. The stored procedure in question could not be modified to use an output parameter instead of a return value. This also seemed to be a common constraint: typically the proc is old legacy TSQL, and changing it would require more bureaucracy than it's worth.

So there are a couple of ways to solve the problem. Probably the simplest and easiest is to not use Entity Framework at all and just use plain old ADO.NET instead. However, in my case, everything else in the solution was already using Entity Framework, so I wanted the code to be consistent. After doing some investigation and testing, I finally figured it out. The trick is to set the direction of the SqlParameter to ParameterDirection.Output. This is what tripped me up initially, because in ADO.NET you would declare the SqlParameter with direction ParameterDirection.ReturnValue. Another interesting thing to note is that Database.ExecuteSqlCommand returns an integer, but this int does not correspond to the return value of the stored procedure. The MSDN documentation seemed to indicate otherwise; it states that the return value is: “The result returned by the database after executing the command.” When I tried storing the result, I got back -1; presumably this is the rows-affected count (ExecuteSqlCommand behaves like ExecuteNonQuery, which returns -1 when the procedure runs with SET NOCOUNT ON) rather than the proc's return value.

Everything I found online consisted of just code snippets, so here is a complete code sample that deletes a row from a database table and checks the return value to see if it was successful.

public bool DeleteRowFromDB(int rowId)
{
    bool success = false;
    var retVal = new SqlParameter("@Success", SqlDbType.Int) { Direction = ParameterDirection.Output };

    object[] parameters =
    {
         new SqlParameter("@RowID", SqlDbType.BigInt) { Value = rowId}
        ,retVal
    };

    string command = "exec @Success = dbo.spDeleteTableRow @RowID";

    //the result is -1 regardless of what the stored procedure returns
    //note that ExecuteSqlCommand maps to sp_executesql in SQL Server 
    //which will append the type and value for @RowID  
    int result = _DBContext.Database.ExecuteSqlCommand(command, parameters);

    if ((int)retVal.Value == 1)
    {
        success = true;
    }

    return success;
}
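
For reference, here is a minimal sketch of the kind of stored procedure this code expects. The actual legacy proc isn't something I can reproduce, so the procedure body and the table it touches are entirely hypothetical; the only important part is that success is communicated through a RETURN value rather than an output parameter:

CREATE PROCEDURE dbo.spDeleteTableRow
    @RowID bigint
AS
BEGIN
    SET NOCOUNT ON

    --dbo.SomeTable is a hypothetical table
    DELETE FROM dbo.SomeTable
     WHERE RowID = @RowID

    --return 1 on success, 0 if no row was deleted
    IF @@ROWCOUNT > 0
        RETURN 1

    RETURN 0
END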

 

Finding crap in the SQL Server database

Anyone who has ever maintained an old legacy code base has likely found themselves working with a decade-old database schema that now spans hundreds of tables, each of which contains dozens of columns. Trying to locate a particular column or table in the db can be particularly difficult, especially if the only clue you have is a vague inkling of what said table/column might be named. Luckily, SQL Server provides several system catalog views that hold metadata that can help you find the proverbial needle in the haystack.

We can query sys.tables and sys.columns like so:

--look for a particular table
SELECT t.name AS table_name
FROM sys.tables AS t
WHERE t.name LIKE '%needle%'

--look for a particular column
SELECT t.name AS table_name,
c.name AS column_name
FROM sys.tables AS t
JOIN sys.columns c
ON t.OBJECT_ID = c.OBJECT_ID
WHERE c.name LIKE '%needle%'
ORDER BY table_name;

The INFORMATION_SCHEMA.COLUMNS view lets you accomplish the same thing. It shares a lot of the same information as sys.columns (though it does not contain the object_id), and it already includes the table name, so you don't have to do a join. This is good if you are lazy and dislike typing more than necessary.

SELECT TABLE_NAME, COLUMN_NAME
FROM Information_Schema.Columns
WHERE COLUMN_NAME LIKE '%needle%'
ORDER BY TABLE_NAME;

These queries provide a good starting point for exploring the database and all the metadata available to you. Of course, if you're too lazy to do any typing at all, you can simply open SQL Server Management Studio, click on “View” in the menu, and select Object Explorer Details. From there you can do a search. It's important to note that this searches EVERYTHING: views, stored procedures, triggers, tables, columns, and everything else that is an object in SQL Server.

If you run the profiler, you can see that it is indeed querying all the system tables, including sys.database_principals, sys.asymmetric_keys, and sys.all_objects.

Oftentimes this UI search will get the job done, although it can produce a lot of noise.
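
If you'd rather stay in a query window, you can get a rough approximation of that UI search (restricted to object names, so it won't catch columns) with a simple query against sys.all_objects:

--approximate the Object Explorer Details search, by object name only
SELECT o.name, o.type_desc
FROM sys.all_objects AS o
WHERE o.name LIKE '%needle%'
ORDER BY o.type_desc, o.name;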

TSQL PIVOT

The TSQL PIVOT command is an operation that essentially rotates a set of table data: the end result is a transform that converts rows into columns. One common usage of PIVOT is to normalize denormalized data. This may seem rather abstract and confusing at first, so here's a simple example.
ACME Corp has an optional end user survey that allows them to track customer satisfaction. The survey asks users to rate the company on a scale of 1-5 across a variety of criteria. This data is stored in the database as a set of denormalized key/value pairs:

CREATE TABLE SURVEY_RESULTS (
   RESULT_ID int PRIMARY KEY IDENTITY(1,1)
, [USER_ID] int
, QUESTION_ID int
, SURVEY_SCORE int
)

The SURVEY_RESULTS table has a foreign key, QUESTION_ID, that maps to the QUESTION_ID column in the SURVEY_QUESTIONS table. The SURVEY_QUESTIONS table stores the actual survey questions themselves (in a real-world scenario, SURVEY_QUESTION would store a localization id that points to some resource file instead of holding the actual text of the question):

CREATE TABLE SURVEY_QUESTIONS (
   QUESTION_ID int PRIMARY KEY
 , SURVEY_QUESTION varchar(50)
)


ALTER TABLE SURVEY_RESULTS
	ADD CONSTRAINT fk_SURVEY_RESULTS_QUESTION_ID_SURVEY_QUESTIONS_QUESTION_ID FOREIGN KEY (QUESTION_ID)
REFERENCES SURVEY_QUESTIONS (QUESTION_ID)

INSERT 
  INTO SURVEY_QUESTIONS WITH (ROWLOCK)
       (QUESTION_ID, SURVEY_QUESTION)
VALUES (1, 'Rate our service')
      ,(2, 'Rate your customer satisfaction')
      ,(3, 'Rate our user interface')

INSERT
  INTO SURVEY_RESULTS WITH (ROWLOCK)
       ([USER_ID], QUESTION_ID, SURVEY_SCORE)
VALUES (1, 1, 5)
      ,(1, 2, 4)
      ,(1, 3, 5)
      ,(2, 1, 4)
      ,(2, 2, 3)
      ,(2, 3, 2)
      ,(3, 1, 3)
      ,(3, 2, 2)
      ,(3, 3, 1)

This design allows flexibility in adding and removing questions. Using a normalized table with a column for each question would mean that every time the survey changed, entire columns would need to be added or removed. On the other hand, a normalized table makes it much easier to visualize the survey results, so it would be helpful to create a VIEW on the SURVEY_RESULTS table that converts the key/value rows into columns.
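
As an aside, here is what that flexibility looks like in practice. Adding a hypothetical fourth question (not used in the examples that follow) is a one-row INSERT rather than an ALTER TABLE:

INSERT
  INTO SURVEY_QUESTIONS WITH (ROWLOCK)
       (QUESTION_ID, SURVEY_QUESTION)
VALUES (4, 'Rate our pricing')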

SELECT * FROM SURVEY_RESULTS

Gives us the following, which is hard to interpret at a glance due to its denormalized nature:

 
RESULT_ID   USER_ID     QUESTION_ID SURVEY_SCORE
----------- ----------- ----------- ------------
1           1           1           5
2           1           2           4
3           1           3           5
4           2           1           4
5           2           2           3
6           2           3           2
7           3           1           3
8           3           2           2
9           3           3           1 

What we’d like to see is one row for each user, with the score for each question as its own column in that row. This can be achieved using PIVOT (in the real world, we'd create a view for easy at-a-glance access to this data; a sketch of one appears after the query walkthrough below):

SELECT  
USER_ID, [1] AS 'Service score', [2] AS 'Customer satisfaction', [3] AS 'User Interface'
FROM
(SELECT [USER_ID], QUESTION_ID, SURVEY_SCORE
    FROM SURVEY_RESULTS) AS InputTable
PIVOT
(
AVG(SURVEY_SCORE)
FOR QUESTION_ID IN ([1], [2], [3])
) AS PivotTable;

This gives us the desired result set:


USER_ID     Service score Customer satisfaction User Interface
----------- ------------- --------------------- --------------
1           5             4                     5
2           4             3                     2
3           3             2                     1

Let’s take a closer look at this query. The first thing that we need to do is specify what data we are pivoting. This is done in the FROM clause. As you can see, this is just a simple table expression that selects three columns from the SURVEY_RESULTS table (RESULT_ID is deliberately omitted, for reasons explained below). For brevity’s sake, I will refer to this table expression by its alias, InputTable.

Now that we know what we are pivoting, we need to identify the pivot column and the set of values to pivot on. This is done in the FOR clause. In our example, we are pivoting on the QUESTION_ID column, and the set of values we are interested in are 1, 2, and 3, which correspond to the questions about service, customer satisfaction, and the user interface. For each value in this set, all rows in InputTable that have a matching value in the specified pivot column will be combined together based on an aggregation function. In our case, that would be the average survey score.

The astute reader will point out that applying an aggregation function implies that there is some sort of grouping being done. There is indeed an implicit grouping based on any column appearing inside InputTable that is not directly referenced inside of the PIVOT statement. In our case, USER_ID is mentioned inside the InputTable but it is not referenced inside the PIVOT statement, so the final result will be grouped based on USER_ID. The query will generate a row for each user in the final result set, along with three columns corresponding to the average score for each question. These are referenced in the outer SELECT statement and are based on the set of values provided to PIVOT. In our case, these are [1], [2], and [3]. Of course, since our sample data only has one answer from a given user per question, the aggregation function doesn’t really matter.
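
Since this is exactly the kind of query you'd want at-a-glance access to, wrapping it in a view (as suggested earlier) is straightforward. Here is a minimal sketch; the view name is my own invention:

CREATE VIEW dbo.SURVEY_SCORES_BY_USER
AS
SELECT
USER_ID, [1] AS [Service score], [2] AS [Customer satisfaction], [3] AS [User Interface]
FROM
(SELECT [USER_ID], QUESTION_ID, SURVEY_SCORE
    FROM SURVEY_RESULTS) AS InputTable
PIVOT
(
AVG(SURVEY_SCORE)
FOR QUESTION_ID IN ([1], [2], [3])
) AS PivotTable;

A plain SELECT * FROM dbo.SURVEY_SCORES_BY_USER then returns the same result set shown above.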

Let’s suppose we want to average the scores across all users. In that case, we simply modify the query slightly and remove USER_ID from the InputTable (note that we also cast SURVEY_SCORE as a decimal, so the average does not get rounded to the nearest integer). This causes the entire result set to be implicitly grouped, which is exactly what we want:

SELECT  
 [1] AS 'Service score', [2] AS 'Customer satisfaction', [3] AS 'User Interface'
FROM
(SELECT QUESTION_ID, CAST(SURVEY_SCORE AS DECIMAL) AS SURVEY_SCORE
    FROM SURVEY_RESULTS) AS InputTable
PIVOT
(
AVG(SURVEY_SCORE)
FOR QUESTION_ID IN ([1], [2], [3])
) AS PivotTable;

As expected, this query returns exactly one row with the average score for each question appearing in its own column:

 
Service score                           Customer satisfaction                   User Interface
--------------------------------------- --------------------------------------- ---------------------------------------
4.000000                                3.000000                                2.666666

 

And that covers the PIVOT statement in a nutshell. What about UNPIVOT? As you can probably guess, the UNPIVOT command essentially rotates the data in the opposite direction: it takes a set of normalized table data and denormalizes it, converting columns back into rows. Its syntax is similar; a quick sketch follows, and further experimentation is left as an exercise to the reader.
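
Here is a minimal sketch, assuming a hypothetical wide table named WIDE_SURVEY_RESULTS that has a USER_ID column plus one int score column per question (named [1], [2], and [3], mirroring the pivoted output above):

SELECT USER_ID, QUESTION_ID, SURVEY_SCORE
FROM WIDE_SURVEY_RESULTS
UNPIVOT
(
SURVEY_SCORE
FOR QUESTION_ID IN ([1], [2], [3])
) AS UnpivotTable;

This turns each user's single wide row back into three key/value rows, one per question.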

MCTS EXAM 70-433: Microsoft SQL Server 2008 Database Development

This year I decided to get a Microsoft SQL Server 2008 Database Developer Certification (MCTS EXAM 70-433). While I was doing consulting work at Avanade, getting at least one new certification was an annual requirement. I ended up changing jobs before I could do so, but decided to go ahead and get one anyway. I felt like my computer studies were in a rut; I did a lot of reading, but not enough coding. I figured that studying for a certification would be one good way of motivating myself to take a more hands on approach to my ongoing education. Writing articles for this website would be another. Combining the two resulted in this article about my experience with the exam.

I have never gotten a certification in anything before, so I was going into this with no idea what to expect. I ended up being really surprised. My overall take on this is that certifications are not a very good way to assess a candidate’s knowledge. It does seem like a good business to be in, for the software companies that offer the certifications, the testing companies that administer the tests, and the educational companies that offer preparation courses/guides. Despite business being good, there are definitely lots of improvements that can be made to make it a better experience for the test taker. Unfortunately, I have a sneaking suspicion that all parties involved feel that things are “good enough”, and feel that ironing out some of the kinks will yield diminishing returns in terms of profit. All this may come off sounding cynical, so I do want to state that I am happy to have gotten my database certification, as there is nothing more rewarding than setting a goal and achieving it. The process of studying for the exam has drastically improved my knowledge of SQL Server, and for that I am grateful.

The sign up process, on the other hand, could use some streamlining. There are too many clicks involved, and information about the test is hard to come by. What I had expected was a prominently displayed sign up link on the Microsoft Learning homepage, where I could type in the name of the test I wanted to take (which would populate an AJAX autocomplete dropdown), enter some registration information, and sign up for the test. Instead, I had to do an old fashioned search on the home page. I eventually ended up here: http://www.microsoft.com/learning/en/us/Exam.aspx?ID=70-433. Surprisingly, the sign up link is located off to the side, as though it were an afterthought. I’m not sure why they didn’t put this front and center; after all, isn’t the whole point of these exams to charge users cold hard cash to take them? Hiding the sign up link kind of defeats the purpose. Of course, one of the things that threw me off was that the testing is handled by a third party company, Prometric, so the hyperlink is actually an image of their company logo. As a first time test taker, I would not have known to look for this.

After clicking the link, I was redirected to the Prometric website. Once there, I ended up having to re-select which exam I wanted to take. Technically, it would not have been difficult for the hyperlink on the Microsoft website to contain a test id in the query string, and then have the Prometric website do some parsing and prepopulating of the test information on their end. This is one of those examples of the experience being “good enough” and not worth the effort to fix. The sign up process was pretty straightforward but did involve a lot of clicking and tedious filling in of webforms. After a few minutes of selecting a time and test location and filling out my credit card information, I finally managed to complete the registration process. The test costs about two hundred dollars. It's roughly what I expected, but it would have been nice to see the pricing information earlier on in the process instead of all the way at the end. One good thing to note is that Prometric has the infrastructure in place to offer multiple testing centers at a variety of locations and business hours, and as a result provides convenient and flexible scheduling.

I chose Bellevue Community College as my test center, but was confused because my registration confirmation did not tell me where exactly on campus the test would be. I tried to contact Prometric, but there were oddly no phone numbers or email addresses listed on their website. There was a Contact Us link, where I managed to fill out a webform with my inquiry about the test location. Of course, I never got a response. Perhaps this is their way of keeping costs low by not having to devote resources to responding to pesky customers. Luckily, my confirmation email did provide information about the test center, which included a phone number. I had first thought that this was the number for the college itself and was worried that the front desk would have no clue what I was talking about when I asked them where my Microsoft Certification exam would be located. It turns out the number provided was actually for the Bellevue College Testing Center, which is its own separate entity inside the school. The woman on the phone was able to provide me with all the relevant information I needed.

Try as I might, however, I could not find any official information about the format of the exam. This was conspicuously missing from the Microsoft Learning website, the Prometric website, and even the official MCTS 70-433 Training Kit. Granted, the training kit came with a CD-ROM that contained practice exams, but it was not altogether clear to me that these practice exams would be in the same format as the real test. I had expected there to be at least some “free response” type questions where I’d be given some sort of real world problem and would need to come up with a good database schema, write some queries against it, and perhaps come up with some intelligent choices for indexes. In other words, I expected the exam to contain questions similar to the ones I’d be asked in a real interview. Nope! It wasn’t until I took the test itself that I found out it would be done via testing software on a computer and consisted of roughly 55 multiple choice questions1, with a little over two and a half hours to complete them. Belongings would be stowed away in a locker, and an erasable mat with markers would be provided. 700 points would be considered a passing score. I found it funny that they’d mention this, yet give no indication of how many points each question was worth.

I have mixed feelings about the official Microsoft Training Kit. It consists of a 422 page preparation guide, a CD-ROM that contains practice tests, and a discount code for the exam. The book was not too helpful. It sacrifices depth for breadth, giving each topic only cursory coverage. This became quite apparent after taking a few tests on the CD; many of the questions are not covered by the book at all. This is not so much the fault of the authors, as the book would need to be thousands of pages long in order to be sufficient. As a result, this is also not the book you want to read to learn Microsoft SQL Server. It does, however, serve as a good jumping off point: numerous other publications do deep dives on the topics discussed. For example, Inside Microsoft SQL Server 2008: T-SQL Querying by Itzik Ben-Gan does a great job teaching TSQL querying fundamentals, query tuning, and execution plans (it also happens to be one of my favorite books, one that I found myself referencing constantly during my studies).

My advice to any remotely competent database developer would be to focus on the practice exams, using the preparation guide as a reference and supplementing it with other books on SQL Server as necessary. The testing software provides a Certification Mode that simulates an actual exam, a Study Mode that lets you answer questions at your leisure, and a Custom Mode that lets you control various settings, such as which categories of questions to ask and whether or not to include questions answered in previous test attempts. The software is written by MeasureUp, which is apparently a company that provides practice tests for Oracle, Cisco, Microsoft, CompTIA and other such certifications. It is surprisingly solid, fast, easy on the eyes, comprehensive, and (most importantly) does not crash or cause blue screens of death. The UI is intuitive, which is what I’d expect for a straightforward multiple choice test. Unfortunately, when it came time to take the actual test at Bellevue Community College, the software was written by Prometric. It was a lot slower and the graphics were dated and ugly, but at least it didn’t crash. I’m not sure why Prometric didn’t just contract out the test software to MeasureUp, because MeasureUp had the far more polished product.

The test questions fall under a few general categories, which are outlined in the preparation guide. I won’t rehash them all here, but the bulk of it consists of things an intermediate level database developer would have done throughout their career: creating and managing objects in a database schema and writing queries. The former involves knowing how to create tables, views, partitions, indexes, triggers, stored procs, functions, and CLR objects, and managing the permissions associated with these objects. The latter involves understanding CTEs, recursive CTEs, pivoting, unpivoting, subqueries, joins, merges, groups, ranking functions (ROW_NUMBER, DENSE_RANK, NTILE, etc.), cubes, and rollups. You are also expected to know how to manipulate the XML data type, so be sure to brush up on your XPath and XQuery.

One of the test categories, SQL Server Components, assesses your knowledge of miscellaneous SQL Server components, including PowerShell, full text indexes, Database Mail, and the Service Broker. Unfortunately, I felt that this category of questions was written in such a way that they ended up feeling more like trivia questions than an actual test of understanding. For example, one question asks you how to enable message forwarding in the Service Broker. In the real world, you’d just use Google to look up the command and its parameters; MSDN provides incredibly detailed documentation on SQL Server. In the testing center, however, you are not allowed to use Google, and this artificial constraint turns it into a test of how well you can do rote memorization instead.

Of course, the problem stems from the multiple choice format. It lends itself to forced cramming of knowledge with little understanding or retention. I remember how my computer science finals were typically all open book and involved writing proofs; there was just no way to BS your way through like you can with multiple choice. Any good test taker can make educated guesses, eliminate incorrect answers, and even utilize information derived from previous questions. With all these drawbacks, why don’t they make these tests better? Well, as I said before, testing-for-profit is a lucrative business, and multiple choice exams make sense from a business perspective. You don’t need a person who understands the test material to grade the test, you just need a computer that can run some software. This makes the tests easy to grade and administer: write once, get paid multiple times. This scales well, keeps costs low, and keeps profits high.

All this means that a certification does not really mean much in and of itself. So is it even worth doing? Sure, you get a nice and shiny certificate that’s signed by Steve Ballmer2, but so what? Well, for one thing, many consulting/contracting companies such as Avanade and Accenture do put a heavy emphasis on certifications, as certifications typically allow them to bill more for their headcount. Thus, lots of certs listed on your resume will make you look good when applying to these companies. Also, if you’re going to put in the effort to keep your tech education up to date, paying a few hundred dollars for some sort of proof is better than nothing. It certainly won’t hurt. The certification is really for one’s personal edification; you only get out what you put into your studies. If you focus on cramming to pass the test rather than learning the material, then it’s not going to do much good. When it comes time for the actual job interview, it will immediately be obvious which approach you took.

1 Some questions aren’t multiple choice and involve slightly more user interaction. These were the closest thing to a free response question that I could find. For example, one question might ask you for the correct TSQL statement needed to create a DML trigger that inserts into a history table. It provides you with a list of randomly ordered words (not all of which are needed), which you then need to assemble into the correct statement.

2 You have to fork out an additional 20 bucks for a hard copy of the certification.

Quick and dirty table import hack for SQL Server

Every so often, I’ve needed to import data from a QA/Staging database table onto my local machine to help troubleshoot a hard to repro bug. Typically, I only need to import a few tables’ worth of data, so creating a backup of the QA database and restoring it on my local machine is overkill. Here’s a simple trick to do so quickly and painlessly, without needing to click through the tedious export/import wizard or open up a command line to run BCP (and invariably having to look up the BCP syntax).

Open up SQL Server Management Studio and connect to the QA database in question. Open a new query window and run SELECT * FROM [TableName]. Click on the results pane, and then press Ctrl-A, Ctrl-C to select everything and copy the data. Shown as an example here is the AdventureWorks.Person.Address table (obviously truncated, as it contains almost 20,000 rows):

Disconnect from the QA database and reconnect to your local database. Open up a new query window and run TRUNCATE TABLE [TableName] to wipe out the existing data. You can back this data up first if necessary, but since it lives on your local machine, chances are you won’t really need to. If any columns have the identity property, things get trickier, since the edit grid won’t let you supply explicit values for an identity column. Your best bet is to right-click the table in the object explorer and choose Script Table As -> CREATE To -> New Query Window. From there, edit the automatically generated script by removing any IDENTITY properties (along with any foreign key constraints). DROP the table, and then run the creation script to recreate the table sans any identity columns.
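
To illustrate, here is the before-and-after for a made-up table (the script SSMS generates for a real table will be much longer, with all of your columns and constraints included):

--what SSMS generates (abbreviated, hypothetical table):
CREATE TABLE dbo.Customers (
    CustomerID int IDENTITY(1,1) NOT NULL PRIMARY KEY
  , CustomerName varchar(100) NOT NULL
)

--what you want to run instead, with the IDENTITY property removed:
CREATE TABLE dbo.Customers (
    CustomerID int NOT NULL PRIMARY KEY
  , CustomerName varchar(100) NOT NULL
)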

Now, navigate to the table using the Object Explorer and right click the table you want to import the data into. Select Edit Table:

Right click the arrow to the right of the first row and select Paste:

The rows will now show up. You may now close the window (the data is saved automatically):

Yes, it’s really that easy. SQL Server is even smart enough to preserve the NULL values in varchar columns (as opposed to treating them as a string with the value “NULL”). Note that you can also use this same method to back up and restore your data by using an Excel spreadsheet as an intermediary backup file. Select all the data in the table and copy it as before. This time, paste it into an Excel spreadsheet. Excel will automatically detect and preserve the columns:

Later when you want to restore your original data, select and copy all the data from the Excel spreadsheet and paste it back into the table using the “Edit Table” method outlined above.

Obviously, this operation gets slower the more data there is to copy. After a certain point, it’d make more sense to use a more robust method, such as BCP. However, when it comes to programming, laziness is a virtue, and this is one of the fastest and easiest ways to import data.

TSQL query the missing gaps in a table

I’m currently studying to get a Microsoft SQL Server 2008 Database Development Certification and am going through the official $70, 400+ page training kit. It covers a wide variety of topics, one of them being how to find the missing gaps in a table. This actually turns out to be a good interview question that can test whether or not a candidate understands how to write more complex queries that go beyond simple SELECT statements.

The problem itself is relatively straightforward. Imagine, for example, that you have an identity column on a table. You might expect to see a consecutive series of integers when you select that column from the table. However, in a production environment, you’d typically see “missing” values. This can be due to a number of reasons, the most obvious one being that rows can get deleted. A more subtle cause is that rollbacks do NOT reset the identity seed value on a column, so any failed inserts will cause the seed value to increment without a corresponding row. Regardless of the cause, let’s say you would like to identify all the gaps in the table.
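
As a quick demonstration of the rollback behavior just mentioned (the table here is my own example):

CREATE TABLE IdentityDemo (ID int IDENTITY(1,1), Val int)

INSERT INTO IdentityDemo (Val) VALUES (1) --gets ID 1

BEGIN TRAN
INSERT INTO IdentityDemo (Val) VALUES (2) --gets ID 2...
ROLLBACK                                  --...but the row is removed

INSERT INTO IdentityDemo (Val) VALUES (3) --gets ID 3, leaving a gap at 2

SELECT ID, Val FROM IdentityDemo          --returns IDs 1 and 3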

Consider the following table:

CREATE TABLE TestTable (SomeID INT NOT NULL PRIMARY KEY)

INSERT INTO TestTable (SomeID) VALUES (1)
INSERT INTO TestTable (SomeID) VALUES (2)
INSERT INTO TestTable (SomeID) VALUES (3)
INSERT INTO TestTable (SomeID) VALUES (10)
INSERT INTO TestTable (SomeID) VALUES (11)
INSERT INTO TestTable (SomeID) VALUES (12)
INSERT INTO TestTable (SomeID) VALUES (90)
INSERT INTO TestTable (SomeID) VALUES (91)
INSERT INTO TestTable (SomeID) VALUES (92)

The missing gaps are 4-9, and 13-89. How can you write a query that correctly returns these values?

A good approach is to solve the problem by breaking it down into smaller pieces. The first piece of the puzzle is to figure out where each missing gap begins. We know that there is a gap if for a given value x in the table, there doesn’t exist an entry x + 1 in the table. To use a concrete example from our sample table: There is no gap after 1, because 2 exists in the table. However, there is a gap after 3, because 4 does not exist in the table. So we translate this into SQL: Select all values of SomeID where there does not exist another SomeID value that is equal to the current SomeID value plus one. This will require a correlated subquery where the inner query references the outer query’s SomeID value:

SELECT a.SomeID AS StartGap
  FROM TestTable a 
 WHERE NOT EXISTS (SELECT * FROM TestTable b WHERE b.SomeID = a.SomeID + 1)

This returns:

StartGap
---------
3
12
92

There are two things to note here. First, we need to add 1 to the start value in order to get the actual starting value of the gap. Second, 92 is returned because it is the largest entry in the table. This isn’t a gap; it’s just the last value in the table. So we need to filter out the maximum value of SomeID from the table. The modified query looks like this:

SELECT a.SomeID + 1 AS StartGap
  FROM TestTable a 
WHERE NOT EXISTS (SELECT SomeID FROM TestTable b WHERE b.SomeID = a.SomeID + 1)
   AND a.SomeID < (SELECT MAX(SomeID) FROM TestTable)

This gives us

StartGap
--------
4
13

Now that we know where each gap begins, we can tackle the task of figuring out where the gap ends. Given the start gap value, the end gap value is equal to the next largest number in the table minus one. For example, there is a gap that begins at 4. The next largest number that exists in the table is 10, so the gap ends at 9. Translated into SQL: given the current value of SomeID, we want to select all the values from the table greater than SomeID, return the minimum, and subtract one.

Again, we will use a correlated subquery to reference the current value of SomeID:

SELECT a.SomeID + 1 AS StartGap
	  ,(SELECT MIN(SomeID) - 1 FROM TestTable c WHERE c.SomeID > a.SomeID) AS EndGap
  FROM TestTable a 
WHERE NOT EXISTS (SELECT * FROM TestTable b WHERE b.SomeID = a.SomeID + 1)
   AND a.SomeID < (SELECT MAX(SomeID) FROM TestTable )

This gives us the correct result of

StartGap    EndGap
----------- -----------
4           9
13          89

Tada!
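
Incidentally, here is a sketch of one alternative formulation (Ben-Gan presents several approaches to this problem; this particular phrasing is my own): number the rows with ROW_NUMBER, join each row to its successor, and report a gap wherever consecutive values differ by more than one:

WITH NumberedRows AS
(
SELECT SomeID, ROW_NUMBER() OVER (ORDER BY SomeID) AS rn
  FROM TestTable
)
SELECT cur.SomeID + 1 AS StartGap
      ,nxt.SomeID - 1 AS EndGap
  FROM NumberedRows cur
  JOIN NumberedRows nxt
    ON nxt.rn = cur.rn + 1
 WHERE nxt.SomeID - cur.SomeID > 1

This returns the same two gaps as before.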

As a side note, the Microsoft SQL Server 2008 – Database Development Training Kit provides two solutions to this problem, but it doesn’t bother explaining them at all; quite disappointingly, it merely provides the answers. I had remembered reading about this problem before in a different book, Inside Microsoft SQL Server 2008: T-SQL Querying by Itzik Ben-Gan, so I referred to it instead. The author does a deep dive on this topic and its related problems, providing multiple solutions and detailed explanations. I was fortunate enough to attend a Microsoft DevConnections conference many years ago where Itzik Ben-Gan ran a number of T-SQL workshops, covering the missing gaps problem in addition to many other tips and tricks. For me, the most impressive part was not the clear and concise way he managed to educate everyone, but that he could take a seemingly dry topic such as TSQL and actually make it interesting and even humorous. Naturally, I bought his book and am glad I did; after all, I’m still referencing it years later. I’d highly recommend it to anyone hoping to either get a SQL certification or just improve their TSQL skills. And no, I am not getting paid to plug the book (I wish). The Training Kit, on the other hand, is quite comprehensive in the list of topics it covers, but it sacrifices depth for breadth. It’s a good starting point and should be used more as a baseline checklist of SQL topics you should understand.