You are hereBlogs / rahul's blog / Drupal 7 - Creating arbitrary ANDed/ORed database queries using the DBTNG abstraction layer

Drupal 7 - Creating arbitrary ANDed/ORed database queries using the DBTNG abstraction layer


rahul's picture

By rahul - Posted on 19 June 2011

While working on the 7.x port for my Document module for Drupal day before yesterday, I had a pretty anxious moment. As you would know, the new DBTNG database abstraction layer in Drupal 7 fundamentally changes the way your module code interacts with the db. Now instead of writing raw Sql queries, you use a set of high level objects to specify the various components of your query using a series of methods which is then translated to Sql by the DBTNG layer and executed against the database. Sample this for example:

 

db_select('node', 'n')
  ->fields('n')
  ->condition('status', 1)
  ->condition('uid', 2)
  ->execute();


The above DBTNG query is roughly equivalent to the following in Drupal 6:

db_query('SELECT n.* FROM node n WHERE status = 1 AND uid = 2');

It should not be too difficult to co-relate the components and the way a DBTNG query is constructed by comparing it to traditional way of writing the raw Sql query.

Coming back to my original point, in the document module's search page, the Drupal 6 version created a pretty complex query with dynamic conditions (see lines 113-146 here for details on the query generated). The conditions were arbitrarily ANDed/ORed together based on user input in the search form. And when I came to porting this part of the module for Drupal 7, for a moment I was practically numb; how do I represent an arbitrary series of AND and OR Sql conditions using DBTNG in Drupal 7.

If you notice the above DBTNG example carefully, you would see that I added 2 conditions without specifying how to combine them and DBTNG ANDed them by default. You can pass a third parameter to condition() method specifying the operator for the condition (=, <>, <, LIKE etc.) but no way how to tell DBTNG to OR the conditions and not AND them.

At one point, I thought of dumping the DBTNG abstraction layer for this part of the module's Drupal 7 port, and use the good old db_query method. But that would have meant that I would have had to manually add logic for paginating/sorting results as the output of this query was being formatted and displayed as a table to the user (see this for details on what I mean).

So, I decided to hang-on and do a bit more research. As I kept googling to figure out something useful, I came across an issue page on drupal.org that had some code with methods called db_and and db_or in them. I saw a ray of hope and the next couple of google searches took me to this and this page. The solution was now there for the taking.

The DBTNG layer provides these 2 (rather three, the third one being db_xor) methods which can be used to create arbitrarily ANDed/ORed (even XORed) queries. Each of these methods returns a DatabaseCondition object to which you can add conditions which are combined together with the operation reflected by the name of these methods. For example:

 

$ored = db_or();
$ored
  ->condition('uid', 1)
  ->condition('uid', 2)


We added 2 conditions to the return object of db_or and the conditions would be ORed together. Because the return value of db_or (and other 2 methods) is a DatabaseCondition object, you can pass it to a DatabaseStatement's condition method. Here's a compete example to fetch all nodes created by user 1 or 2:

 

$ored = db_or();
$ored
  ->condition('uid', 1)
  ->condition('uid', 2);

db_select('node', 'n')
  ->fields('n')
  ->condition($ored)
  ->execute();


Can you see what we did here. We created a DatabaseCondition object that ORed 2 conditions and then passed that to the condition method of SelectQuery object. This is equivalent to:

 

db_query('SELECT n.* FROM node n WHERE uid = 1 OR uid = 2');


Let us increase the complexity a bit. Let us now fetch all nodes whose status is published and uid is 1 or 2 (a combination of ANDed/ORed conditions). Here's the DBTNG code:

 

$ored = db_or();
$ored
  ->condition('uid', 1)
  ->condition('uid', 2);

$anded = db_and();
$anded
  ->condition('status', 1)
  ->condition($ored);

db_select('node', 'n')
  ->fields('n')
  ->condition($anded)
  ->execute();

This is equivalent to:

 

db_query('SELECT n.* FROM node n WHERE status = 1 AND (uid = 1 OR uid = 2)');


We first ORed the 2 conditions and then ANDed the result with a third condition. The power and flexibility of the DBTNG approach comes from the fact that you can pass either a discreet condition or a DatabaseCondition object to the condition method.

As one last example, let us fetch all nodes where title contains 'test' and user is 1 or 2 and status is published.

 

$ored = db_or();
$ored
  ->condition('uid', 1)
  ->condition('uid', 2);

$anded = db_and();
$anded
  ->condition('status', 1)
  ->condition($ored);

db_select('node', 'n')
  ->fields('n')
  ->condition($anded)
  ->condition('title', '%test%', 'LIKE')
  ->execute();


You would notice I have now added a raw condition directly on SelectQuery object because it ANDs the conditions by default. I could well have added the new condition to the $anded object and it would have produced the same result.

Now as a completely last example, here's the code from Drupal 7 port of the Document module that creates arbitraty ANDed/ORed conditions based on user input:

 

function document_perform_search($searchFields, $searchText, $searchYear = NULL, $searchDocType = NULL) {
  $conditions = db_and();
  $conditions->condition('n.status', DOCUMENT_STATUS_PUBLISHED);
  
  switch ($searchFields) {
    case 0:
      $conditions->condition('author', '%' . $searchText . '%', 'LIKE');
      break;
    case 1:
      $conditions->condition('keywords', '%' . $searchText . '%', 'LIKE');
      break;
    case 2:
      $or = db_or();
      $or->condition('keywords', '%' . $searchText . '%', 'LIKE')
          ->condition('author', '%' . $searchText . '%', 'LIKE');
      $conditions->condition($or);
      break;
    case 3:
      break;
    default:
      die('Invalid Input');
  }

  if (!empty($searchYear)) {
    $conditions->condition('publish_year', $searchYear);
  }

  if (!empty($searchDocType) > 0) {
    $conditions->condition('d.type', '%' . $searchDocType . '%', 'LIKE');
  }

//...More code here


  $query = db_select('node', 'n');
  $query->join('document', 'd', 'n.vid = d.vid');
  $query
        ->condition($conditions)
        ->extend('PagerDefault') 	//Pager Extender
          ->limit(10)				//10 results per page
        ->extend('TableSort')		//Sorting Extender
          ->orderByHeader($headers)//Field to sort on is picked from $header
        ->fields ('n')
        ->fields('d');
  $results = $query
        ->execute();

//...More code here


Once you get a hang of DBTNG and how to do such things with the library, you actually start to like it (I have started liking it definitely).

If you think you have a query you want to express using DBTNG but are not able to figure out the representation, post a comment below and I would try having a shot at it.

 

Hey Rahul,

Thanks for this article. Your table Pager and this article was very much helpful for me... I had a small question though, can we use DISTINCT ON on db_select returned object as a way to remove duplicates. All I found was distinct() which applies to entire row. I am using drupal 7 for my site which is a kind of front end for a large database...

rahul's picture

Hi, yes you can use distinct() directly on an object returned by db_select() and it would return only unique rows.

I suppose he was asking how to use distinct only to a specific column, not whole row. That's what I'm trying to figure out also..

what I did to solve my issue was when a SelectQuery object is created, it can be extended with an where clause and also the distinct() so you can modify your query such that, distinct can be applied on column rather than row in case you need to do so. Might be some easy other in DBTNG to do so. Expert opinion needed :D

rahul's picture

Whenever you are stuck with something that DBTNG does not seem to provide out of the box, there's the good old db_query method :-)

Agree, but if you have to use Drupal pager, which I found can be applied only on SelectQuery object you need to use db_select. For the rest db_query like you said always helps. 

rahul's picture

While Pager and TableSort extenders are certainly useful, you need to understand that they pick the required information for paging and sorting respectively from query-string parameters. So if you are in a situation where you find DBTNG short of what you need, you have multiple options on how to accomplish your task:

  1. Use db_query as mentioned, for Paging and sorting, collect the parameters manually from query string and append them approrpiately to your query. I do not think that should take more than a few lines of code to do so.
    You can see this example to note how I manually send the paging and sorting parameters to the server in an Ajax request. You can similarly collect the same on the server and modify your query.
    You can have a look at the code of these extenders to see how they go on doing their task. Here's a function from PagerDefault:

    function pager_find_page($element = 0) {
      $page = isset($_GET['page']) ? $_GET['page'] : '';
      $page_array = explode(',', $page);
      if (!isset($page_array[$element])) {
        $page_array[$element] = 0;
      }
      return (int) $page_array[$element];
    }
  2. Use addExpression to add any custom expression to the Select list.
  3. Construct your query using db_select, then use toString() magic method to get the query and modify the same for your customizations.

I used the following code based on your post.

$query = db_select('node', 'n');

$query->condition('n.status' ,'1','=') 

->condition('n.title', 'a','like');
                 $query->fields('n', array ('title','nid' ) ) ;
                     $query=$query->extend('PagerDefault')     
                    ->limit(10)  
                     ->extend('TableSort')    ;      

echo $query->__toString();exit;

But the output was "SELECT n.title AS title, n.nid AS nid FROM {node} n WHERE (n.status = :db_condition_placeholder_0) AND (n.title LIKE :db_condition_placeholder_1 ESCAPE '\\')".

Why was ' ESCAPE '\\'' created and how can I remove ' ESCAPE '\\'' from the output

rahul's picture

Does the query generates an error when executed using DBTNG? My guess is ESCAPE is a placeholder used internally by DBTNG for string parameters and should be replaced appropriately before actual query execution as long as execution happens through DBTNG.

That code results in an error while executing. Also that was the reason why I used echo $query->__toString(); to get the query and run it in mysql.

rahul's picture

Can you try removing the like comparison operator and then run the query?

Removing the 'LIKE' comparison operator removes the "ESCAPE '\\'". But I need 'LIKE' operator in the code

rahul's picture

Can you try putting '%' around the values for comparion (e.g. %a%)? Also try a% and %a.

Using '%a', or 'a%' or '%a%' does not bring any change. "ESCAPE '\\'" is still there.

rahul's picture

Hmmm... I do not have time to research at this point, but I would have stepped through Drupal code to see what makes it behave like that.

I think this should resolve the weird escaping, doing LIKE with dbtng. It's a function called db_like.

http://api.drupal.org/api/drupal/includes!database!database.inc/function/db_like/7

rahul's picture

Thanks for sharing that Garrett!!

Thanks for the post! A lot of the newer features like db_or are hard to find at this stage in D7's life, so stumbling upon posts like this at the top of Google are huge time savers. Thanks again.

rahul's picture

Thanks for the feedback csdco. Yes, documentation for D7 needs to improve a lot, there are lots of hidden gems that are not just obvious from the surface.

Thank you so much for your help. This has saved me so much time.

Here is one that has a nested select that I am trying to re-write. Can you help with this one.

$result = db_query_range(db_rewrite_sql("SELECT n.nid, n.title FROM {node} n LEFT JOIN {term_node} tn ON tn.nid = n.nid ".$join
        ." WHERE n.type IN ('industry_news', 'rapidtm_news', 'blog', 'forum', 'webinars', 'events') "
        ."AND LOWER(n.title) LIKE LOWER('%%%s%%') AND n.status='1' "
        ."AND n.vid IN ( "
            ."SELECT tn.vid "
            ."FROM term_node tn "
            ."LEFT JOIN term_hierarchy th on th.tid = tn.tid "
            ."WHERE th.parent = %d"
        ." ) "
        .$where." "
        ."ORDER BY n.created DESC", 'n', 'nid'), $string, $tid, $lowerLim, $upperLim);

rahul's picture

Hi Chris, I am running short on time, so won't be able to give it a try. But I personally would have continued using db_query_range for such complex scenarios. However if you would insist on using the DBTNG layer, then you would want to have a look at SelectQuery->addTag and hook_query_alter to figure out a way to manipulate your query with the nested select:
http://api.drupal.org/api/drupal/includes%21database%21select.inc/function/SelectQuery%3A%3AaddTag/7
http://api.drupal.org/api/drupal/modules%21system%21system.api.php/function/hook_query_alter/7

Rahul,  I will take a look at SelectQuery and hook_query_alter. I will post back the results.

Thanks,
Chris

rahul's picture

Hi Chris, you might also want to have a look at PagerDefault and TableSort extenders to see how they extend and change the query on the fly.

Hi Chris, can you please help me out.

MSSQL Query:

"select distinct duration,sid
           case
               when (substring(duration,1,7)=('JAN-APR')) then SUBSTRING(duration,9,4)+'0'
               when (substring(duration,1,7)=('MAY-NOV')) then SUBSTRING(duration,9,4)+'1'
           end
           from RegisteredCourses where rollno='$prollno' order by sid desc";

How can i replace the above mssql query with drupal 7? please help me. Thanks

rahul's picture

Hi Jack, I don't think there's any way to do this apart from writing a complex extender for DBTNG. I believe a much more efficient way of executing this query would be through db_query or db_query_range.

This is very clear and useful information. I'm using it now to fetch complex autocomplete data, which improves the user interface of my application so much! Thanks!!!

For simple SELECTs in code the recomended approach is db_query(). The only reason to use the db_select(), db_delete(), db_* ... set of functions is if your query is to be manipulated by another module via hook_query_alter() for example, or another hook.

db_select() adds the overhead of creating an alterable query object. If your module is self contained and just doing some simple SELECT logic nobody else cares about then you should stick with db_query. My point is that what drupal API you use to query is dependant on what your query is supposed to do within drupal.

rahul's picture

Hi David, thanks for your comment. I believe structured queries (i.e. those executed via db_select/db_delete etc) are preferable in D7, especially because of their intuitiveness and maintainability. Consider for example a SELECT operation with join conditions and multiple where clauses. It becomes extremely clumsy to respresent such queries as a string in PHP especially considering you have to take care of the ordering of arguments with their placeholders. And if you add/remove a condition later, you need to take care to maintain ordinality of the argument with its placeholder (and its easy to overlook this for non-trivial modules is something I can vouch for with past experience :))

Now consider a INSERT statement with more than let's say 25 columns. Would you really like to use db_query and would rather maintain the ordinality of query arguments with their placeholders manually or would you like to use db_insert and forget about the order in which you specify columns.

If you are really concerned about the overhead of db_select etc, you probably shouldn't be using Drupal. Drupal is meant to be extensible and each Drupal page rendering executes hundreds of hooks on any non-trivial production site. The performance differential of a db_select vs db_query would be negligible for most sites and I would consider it pre-mature optimization to use db_query unless usage data from your site justifies such optimizations. I would rather embrace the structured definition provided by db_select over minimal performance gains of db_query unless site usage needs necessitate such in-depth optimization.

SELECT SUM( f2.field_funds_available_value ) FROM field_data_field_grant_application_deadline AS f1, field_data_field_funds_available AS f2 WHERE f1.entity_id = f2.entity_id AND f1.field_grant_application_deadline_value >= NOW( ) AND f1.field_grant_application_deadline_value <= DATE_ADD( NOW( ) , INTERVAL 8 MONTH )

Post new comment

The content of this field is kept private and will not be shown publicly.
Type the characters you see in this picture. (verify using audio)
Type the characters you see in the picture above; if you can't read them, submit the form and a new image will be generated. Not case sensitive.

Recent comments