Stupid Question 51: Stored procedures or not? And does size and/or complexity of app/data matter?

[To celebrate my first year of programming I will ask a ‘stupid’ questions daily on my blog for a year, to make sure I learn at least 365 new things during my second year as a developer]

Stored procedure or not?

This question was brought up today as I talked with a dev that likes stored procedures. I like SQL (and I also like NoSQL just to be clear), but I’ve never been comfortable with stored procedures as I (and yes I might be wrong) I don’t like separating the logic down to the database layer - adding what I consider a second data access layer- and I don’t like creating procedures ‘beforehand’. I’m a girl, I tend to change my mind. One minute I want red hair, other minute I want red hair with blue stripes. I want to be able to fix my hair at home, and not have to run to the hairdresser to get it done. My hair will definite constantly change as I ’evolve’.
As for the performance gain, I’ve been told that the performance win isn’t that big anymore (and how early on should I optimize anyway?), and as for SQL injection I always use parameterized queries (who doesn’t today?);

This might be an utterly silly thought/reasoning, and I am guessing that there’s a big It Depends attached to this question. Nonetheless, I want to, as always, know what it depends on - if you think it depends. After all, I would love to be wrong and learn something new today :)

So: Does it depend or is there a right or wrong? And does the size of the application complexity of the data change this?

Comments

Leave a comment below, or by email.

Steve Smith

9/24/2012 12:20:58 PM

In my experience, stored procedures are great when you have a complex query, and/or a query that requires special attention for performance or other reasons.  These days, most applications can have most of their CRUD operations done via an ORM tool like Entity Framework or similar.  It will depend on the application, but probably 90% or more of the data access can easily be done in this fashion.  This frees up your database expert to only have to focus on the 10% or less of your application's data access that is actually challenging and interesting.  Also, you may not know which 10% of your application will need that attention until you write it and people start using it and you see where the slow parts are (or you performance/load test it to find out), so as you note, don't prematurely optimize.  I've also found that ORM tools generally are not well-suited to reports and reporting applications; frequently you can map a given report's data needs to a sproc quite effectively, and they frequently require the kind of tuning I mention here.

Jimmy Engström

9/24/2012 12:42:08 PM

I like Entity Framework (and Linq) and I like to have all my logic in code.
The main argument for me is that if I do something wrong or the database structure has changed the compiler tells me.
When using a store procedures Sql Manager won't tell me what stored procedures a structure change might break.(I have heard that it might be available in SQL 2012?).
The arguments for stored procedures (that I have heard) is they are faster (as you mention, not really true any more), easier to change without deploying new code (Wait what? then it is just as easy to break something!).

Wesley Wilson

9/24/2012 12:42:16 PM

With stored procedures you get the benefit of a cached query plan, which typically helps performance, because SQL Server doesn't have to figure out each time what the best way to run the query is. Sometimes of course you can get a bad query plan cached, but generally it helps performance.

If you are writing your own SQL (as opposed to Entity Framework or something else generating it), it can be cleaner to keep the SQL separated into a stored procedure. Keeping it in code can get really messy if you aren't careful to format it in a readable manner.

Also, if you do SQL in code, and then you drop or rename a column from the underlying table, you have to search your code for references instead of doing a dependency check in the database.

Yoshida Hiro

9/24/2012 12:46:23 PM

I love stored procedures and I think they are a necessity for large or enterprise level applications.
They make getting and setting data secure. I dont suggest ORM's for anything except a small qwick and dirty application. I have seen linq and EF halt a database because of poorly designed queries on the db.
plus linq seems to hold the database connection open for far too long.

Sérgio S

9/24/2012 12:55:45 PM

For a "couple" of years now, I've only written SPs in 2 situations: when the DBAs make me (data access policies for example) or when there's a clear advantage in doing the work in a SP (a complex data "cleaning"/sanitization/aggregation job that's run periodically for instance). Most of the time, I'll use an ORM or micro-ORM.

James Curran

9/24/2012 1:40:13 PM

A frequent justification for stored procedures is that they are "precompiled" by the database, and are threfore faster. This is not true. (at least in regard to Microsoft SQL Server. Probably the same for Oracle et al, but I don't know for sure). Any SQL command executed is compiled & cached. If a character-for-character identical command comes in, the cached execution plan is reused. Doesn't matter if it's a stored procedure or not. Now, getting that char-for-char identical query would pretty much require you to use parameters in the query, which stored procedures pretty much require anyway. So, if you build SQL statement with the options embedded (e.g. "SELECT * FROM tbl WHERE name LIKE '" + query + "?'"), you'll be penalizing it vs a stored procedure, but building a command like that is bad for a whole bunch of other reasons, so just don't do it. Always use parameters with all SQL.

So, like Steve, I only use stored procedures for particularly involved SQL work. (On my site, NJTheater,com, you can do a "Six-Degrees of Separation" query between any two actors -- that's handled by one of the nastiest stored procedures I've ever written). From simple queries, I use Linq2Sql, because (like you) I like having my Data Queries right in my code (where it can be easily & effectively stored in source control).

Another reason I like using direct sql instead of stored procedure, is that when you start using SPs for your data access, you start funneling everything through a very limited number of them. e.g. Say your db has a Person table, and you have three different task which each require only one or two columns from that table (but each needs a different set). You're not going to write three different GetPersonById SPs -- you're going to create just one, which inefficiently returns every column. Similarly, you'll write one UpdatePerson SP, which takes a dozen parameters to update every column --- even if (as most likely) only one or two changed.

James Curran

9/24/2012 1:45:54 PM

Reply to: Yoshida Hiro

I can design a bad query and put it in a Store Procedure just as easily as I can do it in Linq.  That a logic problem, not a technology problem.

James Curran

9/24/2012 1:46:53 PM

Reply to: Wesley Wilson

MS SQL Server caches direct SQL commands as well as stored procs.  There is no advantage there.

Joep Beusenberg

9/24/2012 1:52:39 PM

I like to use Linq far better than SPs. My data access layers expose IQueryable whenever possible.
If used correctly and combining those partial queries in my business layers result in much more efficient data calls. Unneeded data will not be ferched.
An if you look into the queries Linq manages to produce... Phew! I could never have thought up such complex queries without having to resort to temporary tables and stuff like that. It's sometimes phenomenal. And as James said before: these also will be cached by the server. And it's far more efficient to tha one huge cached query than dozens of cached SPs.

On the other hand you have to profile your Linq queries very well, becouse it's very easy to create requests that can't be translated to SQL, and they will result in many many many calls to the data tier if not resolved.

Roman Mueller

9/24/2012 1:56:13 PM

I am coming from a LOB background and still develop stuff in that field. I normally use stored procedures where transactions involve multiple tables with only limited amount of input or when the transactions require decisions being made based on the data deeper inside the transaction. Basically hiding the complexity of these transactions from the rest of the application. In the past it also was a question of performance (I am talking about the times when you didn't have parameterized queries, and all the flexibility came from cleverly combining the criteria into a string).
With less complex applications I don't use them, as they also add another level for the debugging process.

Mark

9/25/2012 5:30:09 AM

Another advantage of SP's is the ability to limit access via them. Many DBA's like SP's as it avoids the need to allow direct access to tables.

Oggy Shopov

9/25/2012 5:33:38 AM

I think the size, complexity, architecture, and purpose of the application and its data, matter when making those decisions. No one approach is perfect for all scenarios. For example the location of the database, relative to the application is an important consideration to make. If the app server is separated from the database server, this means network traffic between the two must be taken into consideration. Let's say you need to perform an operation on a large amount of records and then update a table with the new values. Using the entity framework in this case would mean that all records will have to travel back and forth between the db server and the app server. This introduces network traffic cost and the resulting latency.

SP-level security is another advantage in favor of stored procedures. Access to important business data operations can be controled in a more reliable manner at the database level.

Another good example is an enterprise environment where you would have multiple types of clients connecting to the same databse. If the different clients are not running on the same technology/framework, then it would be impossible to reuse business logic if it is layered within the application. Think about an organization with a legacy non-.NET CRM, which is now mounting an ecommerce platform to it. It would be more efficient to centralize certain procedures at the database level, where the different clients can reuse it.

So, to me considering a combination of the two approaches is the right way to go. Use a technology where it makes sense to be used and not just because it is trendy :)

Pacheco

9/25/2012 6:24:47 AM

Reply to: Yoshida Hiro

Curiously, one of the largest Enterprise applications in all Americas, dealing with crucial data from health care systems (and a small mistake could mean the death of a human being) uses all-ORM database and not a single SP. 

If someone would call such a system a "quick and dirt" application, I think every system in the world could be called as such.

Nathan

9/25/2012 6:34:57 AM

You shouldn't think of sp's as a 2nd data access layer, the best ones are simply an abstraction over the underlying data and whilst they can be used for simplifying complex data that can be done equally well with views. 

Sp's are useful, IMHO, is performing complex operations against the data where you may need to create temp tables and manipulate data which is all done on the server without any need to go back to the client and because you can debug sp's it is easy to isolate and check it does what you expect, without a decent profiler like EFProf you could easily struggle to do this through an ORM.

Where sp's really shine are security, you can allow a user to have execute only access to an sp to carry out whatever actions you require without the user needing explicit permisions on the tables being read/updated/deleted/inserted.  The security may not matter to every developer but certain environments need this, plus there is nothing to stop you using an sp from an ORM.

Whilst ORM's are a great tool I firmly believe as a developer you need to understand what your db can do for you as you may be "reinventing the wheel" trying to do something client side that the server can easily accomplish.

Developer Art

9/25/2012 6:49:46 AM

Your analogy about styling hair at home vs. running to a hairdresser is a bit far-fetched.

Whether you're writing data access code in your application or writing a stored procedure - it's still just a piece of text. Why would it matter where it is located? You simply press ALT+TAB to switch to the database window and you continue typing the code there.

And if you feel like the database shouldn't contain any logic - understand that the database is an integral part of your application. Without it your little app doesn't exist. And the datatypes, references between tables and constraints - it's the same "logic", just under a different sauce.

Sean Kearon

9/26/2012 12:28:34 AM

I think the hairdresser analogy is a very good one here.  It takes a lot more time and effort to go to the hairdresser as opposed to doing it yourself.  

Delivering value to the customer is what we as developers do.  Whatever makes it easier for you to write, test, debug, deploy and maintain your apps for their lifetime is always the path to take.  

Compared to more developer friendly technologies like ORM or NoSQL, stored procedures (when used as a general approach to data access) take more time to write, test, debug, deploy and maintain.  This will slows you down and reduces the value you can deliver.

Sure, there's a time to use stored procedures, but focusing on using a more developer friendly technology will increase your ability to deliver delivering good, testable apps that can be maintained over time.  Modern ORMs like EF and NH deliver that in bucket loads.  In the NoSQL world, RavenDB delivers a superb developer experience.

Dennis van der Stelt

9/26/2012 10:33:10 AM

Good question and it always depends. Unfrotunatly ;)

But I really hate Stored Procedures. Simply because so many developers store logic in there. And as I can't wrap my head around large SQL queries (who can when they're 500 lines long?) I tend hate them more. And the fact that so much logic is hidden behind statements, makes them extremely hard to convert to C#.

That doesn't change that when you work with extreme large sets of data at the same time, SQL is invaluable. C# just can't handle that large amount of data properly. Easy.

Some will say security is important. I don't really get that. Spend proper time on security in other layers (firewall, windows, code, users, etc) and while you're at it, you might also protect tables and/or views.

The performance part. It's most of the time the execution plan that matters. This is cached and will happen for both queries executed from code, as for stored procedures. But!!! People tend to use 'IF' statements in Stored Procedures. When the outcome of an if-statement is different, the entire execution plan is thrown away and regenerated. This will probably never happen with sql executed from code, as you don't use if-statements in it. :)

Cool Dev

9/26/2012 10:57:28 AM

Stored Procedures all the way!  If data requirements change in the future, you can simply edit the stored procs to reflect the new requirements WITHOUT having to recompile and re-deploy any DLLs.

mar

9/20/2013 10:31:00 AM

Reply to: Developer Art

I agree. If you know how to do both. It is just as easy to do it in a stored procedure.  Also, it depends on who you are doing the work for.  I have been on projects where the request was to put everything on the front-end, and in other projects, everything had to be on the server.

mar

9/20/2013 10:32:12 AM

Reply to: Sean Kearon

If you know waht you are doing it does not slow you down.

Last modified on 2012-09-23