Peter van Ooijen over at CodeBetter.com posted in his blog about some observations he had when working with stored procedures in a recent project. What I found to be interesting about his post was his comment that a stored procedure can be, “a view with parameters.” I’ve run into this assertion before, and it’s something I think needs some clarification for a lot of developers. I do not feel that there is any real similarity between stored procedures and views — they are entirely different types of objects in an SQL database, and should not be considered forms of one another in any way.
Following is an edited version of the response I left in Peter’s blog; I thought it warranted its own post:
Stored procedures are not — and never can be — “parameterized views”. A view in an SQL database can be treated the same as a table in virtually every context. Consider:
SELECT * FROM Tbl
vs.
SELECT * FROM View
One of the great things about working with views and tables is that the person querying the database does not need to know whether the base object being queried is a view or a table. For the sake of writing SQL queries, they are one and the same. Both a view and a table have well-defined columns, with well-defined datatypes.
These assertions cannot be made for a stored procedure as compared with a view. A stored procedure is related to a view only in as much as both are defined using SQL syntax. But beyond there, the two diverge into completely different types of entities. First of all, consider:
SELECT * FROM StoredProcedure
This will not work, and will only result in an “invalid object name” exception. The reason? Stored procedures expose no explicit output contract. Thanks to conditional branching, dynamic SQL, and SELECT *, a stored procedure can output vastly different results beween invocations, or based on different input parameters. It is quite possible to code a stored procedure that will output no result sets for one set of input parameters, two result sets for another, and four for another. Or, it’s possible to change the returned result sets, e.g. by outputting different column names or datatypes. Please note, this is an extremely poor (and very dangerous) coding habit to get into — but the point is, it is impossible to verify the output of a stored procedure for a given set of input parameters without running it.
Furthermore, a stored procedure “late binds” to the base objects being queried. This adds to the difficulty in verifying the output of a stored procedure, and is why you can create the following stored procedure without getting an exception (until you try to run it, of course):
CREATE PROC XYZ AS SELECT * FROM ThisTableDoesNotExist GO
These stored procedure behaviors are in stark contrast to the way views work. Views provide a couple of means of verification:
- The output columns/data types can be verified, and bound to, before actually querying the view
- A view can be “schema bound”, meaning that the underlying base tables (or other views) which the view is based on cannot be changed, schema-wise, unless the view is dropped.
For the first point, simply query the INFORMATION_SCHEMA.COLUMNS or sys.Columns views, and column information can be determined for a view without having to query it.
The second point adds to the first in a few ways. Schema binding brings to views a certain sense of “early binding,” which as I mentioned is missing in stored procedures. Although no view can be created if one of its base objects does not exist, schema binding takes it one step further and guarantees that the base objects used to create the view must exist, and must not be changed, for as long as the view is present in the database. This means that if a schemabound view is created that outputs a certain set of columns with certain datatypes, it is guaranteed to do so for as long as it exists in the database — in other words, its contract is bound to the schema, and changes to other objects cannot affect it. This is a powerful guarantee, which stored procedures fail to make.
So now the question is, if a stored procedures isn’t a parameterized view then what is? The answer, as of SQL Server 2000 (and continuing in 2005), is the table-valued UDF. A table-valued UDF is parameterized, has an explicit and verifyable output contract*, and can be schema bound. If you are looking to implement a solution that makes use of a form of parameterized views, stored procedures are probably not the right choice. I think that table-valued UDFs are quite underused and deserve a second (or first!) look from many T-SQL developers who may have glossed over them in the past.
* Note: Unlike for a view, the column list for a table-valued UDF cannot be queried from the INFORMATION_SCHEMA.COLUMNS table. The column list is, however, available from sys.Columns.
Adam,
As I have mentioned to you before, I love this blog post. Whenever I am arguing with someone about the use of sprocs I just point them staight at this and pretty soon they are building UDFs.
Just wanna say thanks (again) for writing this.
-Jamie
Thanks a lot. I’ve been trying to solve a SSIS problem the whole day, and this has solved it. I’ve changed my stored proc to a function and now things are working.
(I am just confused why in other SSIS packages the stored procs worked…)
Hi Adam,
I have to agree with you that UDFs are a better solution for the OUTPUT contract. But, I’d like to mention that this kind of discussion always opens that can of worms: should you or should you not use SPs? That’s where the discussion goes retarded(no, not parochial, retarded). Working with big databases will always force you to use SPs, but once you agreed that SPs are wrong…
Max
TVF should always be used if you are simply reading from tables (and not doing inserts, deletes, updates).
There are other advantages with TVF: Avoiding code duplication and minimizing network traffic.
Take for example, you have two apps, app1 and app2. Both apps queries against the same table but app1 only needs columns 1-5; app2 needs columns 2, 3 and 6-10.
With SPs, you either have to either write two SP that does essentially the same thing but select different columns or modify the SP signature and add conditional logic to format the output.
For each app, you can simply select the columns you need from the TVF. The TVF will return only the columns you want. No code duplication, no extra columns/row returned.
Thanks, I was thinking this was the case but your post summed it all up nice and clearly.
Just to throw this in, as of SQL Server 2005 I think INFORMATION_SCHEMA.ROUTINE_COLUMNS contains info about all the columns returned in TVFs.
Thanks for writing this, helped me one day and today I referred someone to it to fix his problem.
Comments are closed.