Programmatical Database Optimisation…
The application that I am programming at the moment dictates that an object can have many participants. For improved usability the owner of the object needs to be able to add and remove participants en massé.
Creating this functionality is simple but ensuring that it is done efficiently requires some further thought.
A = List of Users who currently have access to the object
B = List of Users that should have access to the object after updating
B can contain none, some or all of A.
Method 1
The easiest thing to do when updating the list of participants is to delete all current participants (A) and grant access to all new participants (B). In doing this we would be executing two, possibly bulky, of an RDBMS’s most time intensive SQL statements :
- DELETE FROM X WHERE Y in (A,B,C) and D=E
- INSERT INTO X VALUES (A, B, C)
Method 2
Another method of setting the new participants would be to only update what has changed in the list. Doing this incurs an additional SELECT statement in order to determine what needs to be added and removed :
- SELECT * FROM X WHERE Y=Z
- DELETE FROM X WHERE Y in (A,B,C)
- INSERT INTO X VALUES (A, B, C)
The question now is :
Which method is the most database efficient?
Method 1 gets the job done in two SQL statements but method 2 can possibly reduce the number of inserts and deletes necessary while at the same time adding an additional SELECT statement.
To emphasize this issue further – consider the following :
In my application an object (A) can contain child-objects (B). Each child-object can have an individual list of participants – this list is a subset of those participants in the parent object.
If a user is removed from the participant list of the parent – he/she should no longer be a participant in any of its child objects – and hence should be removed.
In doing this the same situation arises above :
Is it best to do a blanket DELETE statement across the participants table for the parent object and each of its child objects for the user ID?
OR
Execute a SELECT statement to find the list of objects and child objects that the user has access to and then DELETE as necessary?
Which method in these scenarios is the most efficient?
Any thoughts?
Answers on a postcard….
Categories: ALM, Programming
Whoah,
Thats a little heavy!
I suppose if you do the SELECT check first – then you dont have to commit to a DELETE if you dont need to – -but then – - if the delete is necessary you have increased the workload to remove the participant(s).
Hi Kaya,
Yeah I have it in my mind that I should do the check (SELECT) first before only deleting the necessary rows. I wrote this blog post as a sort of “Talk out loud” / “I hope someone can help”.
Anyone else have an opinion on this?
[...] Programmatical Database Optimisation… [...]
You don’t list using an if statement in your query.
If count(*) from B>1 begin
delete from a where a.id in (select id from b)
end
Hi Margaret,
Sorry – the check is implied. I was just listing the basic SQL.
How do you tackle this scenario?
I read a lot of interesting content here. Probably you spend a lot of time writing, i know
how to save you a lot of work, there is an online tool that
creates readable, google friendly articles in seconds, just type in google – laranitas free content