July 13th, 2010

Extending WordPress search with Sphinx (Part II)

Once the Sphinx server is set up, we can begin the WordPress integration. The first thing we need to do is override the query_vars in WP_Query before it runs its own search. We’ll do this by hooking an action onto ‘parse_query’ with the code below.

function my_sphinx_parse_query(&$wp_query) {
  if($wp_query->is_search && class_exists('SphinxClient')) {
    $results = my_sphinx_search_posts($wp_query->query_vars);
    if($results) {
      $matching_ids = array();
      if(intval($results['total']) > 0 ) {
        foreach($results['matches'] as $result) {
          $matching_ids[] = intval($result['attrs']['post_id']);
        }
      } else {
        $matching_ids[] = -1;
      }
      //clear the search query var so posts aren't filtered based on the search
      $wp_query->query_vars['sphinx_search_term'] = $wp_query->query_vars['s'];
      unset($wp_query->query_vars['s']);
      if(isset($wp_query->query_vars['paged'])) {
        //set our own copy of paged so that wordpress doesn't try to page a query already limiting posts
        $wp_query->query_vars['sphinx_paged'] = $wp_query->query_vars['paged'];
        unset($wp_query->query_vars['paged']);
      }
      $wp_query->query_vars['post__in'] = $matching_ids;
      $wp_query->query_vars['sphinx_num_matches'] = intval($results['total']);
    }
  }
}
add_action('parse_query', 'my_sphinx_parse_query', 10, 1);

In the above, we first verify that the current $wp_query object is trying to run a search and that the Sphinx extension for PHP is installed on the server. If both are true, it runs the my_sphinx_search_posts() function to get the post IDs matching our search terms; we’ll get into that function in a bit. If Sphinx finds nothing, we pass a single ID of -1 so that no posts match. To prevent the WP_Query->get_posts() function from further filtering the posts that Sphinx has already selected, we unset the ‘s’ and ‘paged’ query_vars, but we save them for later use by storing them in the query_vars array under different names, ‘sphinx_search_term’ and ‘sphinx_paged’ respectively. Note also that we save the total number of matches Sphinx found under ‘sphinx_num_matches’ in the query_vars as well.
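To make the ID-gathering step easier to follow on its own, here is a minimal sketch of it as a standalone function. The helper name my_sphinx_extract_post_ids() and the sample result array are hypothetical; the sample only mimics the shape the code above reads (a 'total' count plus 'matches' that each carry a 'post_id' attribute). The -1 placeholder is presumably there because WordPress ignores an empty ‘post__in’ list rather than returning nothing.

```php
<?php
// Hypothetical helper, not part of the plugin: turns a Sphinx result
// array into a list suitable for the 'post__in' query var.
function my_sphinx_extract_post_ids($results) {
  $ids = array();
  if (is_array($results) && isset($results['total'])
      && intval($results['total']) > 0 && !empty($results['matches'])) {
    foreach ($results['matches'] as $match) {
      $ids[] = intval($match['attrs']['post_id']);
    }
  }
  // No real post has the ID -1, so an empty result stays empty.
  return $ids ? $ids : array(-1);
}

// Made-up sample data in the shape the code above expects.
$sample = array(
  'total' => '2',
  'matches' => array(
    101 => array('weight' => 2, 'attrs' => array('post_id' => '42')),
    102 => array('weight' => 1, 'attrs' => array('post_id' => '7')),
  ),
);

$ids  = my_sphinx_extract_post_ids($sample);
$none = my_sphinx_extract_post_ids(array('total' => 0));
```

Here $ids holds the two post IDs as integers, and $none holds only the -1 placeholder.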

Now let’s dig into the my_sphinx_search_posts() function.

function my_sphinx_search_posts($args) {
  $defaults = array(
    'search_using' => 'any',
    'paged' => 1,
    'posts_per_page' => 0,
    'showposts' => 0
  );
  $args = wp_parse_args($args, $defaults);
  $sphinx = new SphinxClient();
  $sphinx->setServer('127.0.0.1', 9312);

  $search = $args['s'];
  switch($args['search_using']) {
    case 'all':
      $sphinx->setMatchMode(SPH_MATCH_ALL);
      break;
    case 'exact':
      $sphinx->setMatchMode(SPH_MATCH_PHRASE);
      break;
    default:
      $sphinx->setMatchMode(SPH_MATCH_ANY);
  }

  $page = isset($args['paged']) && (intval($args['paged']) > 0) ? intval($args['paged']) : 1;
  $per_page = max(array($args['posts_per_page'], $args['showposts']));
  if($per_page < 1) {
    $per_page = get_option('posts_per_page');
  }
 
  $sphinx->setLimits(($page - 1) * $per_page, $per_page);
  $sphinx->setMaxQueryTime(30); // Sphinx takes this limit in milliseconds
  $result = $sphinx->query($search);
  return $result;
}

The function above is just a wrapper around the Sphinx search. It accepts an array of arguments that mirrors the query_vars array WP_Query uses during a search. It uses the search-related query_vars to build the arguments it sends to Sphinx, then returns the result.
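The paging arithmetic is the part of the wrapper most likely to trip people up, so here is a small sketch of just that piece. The function name my_sphinx_limits() is hypothetical, and the default page size is passed in as a parameter (standing in for the get_option('posts_per_page') call) so the snippet runs outside WordPress.

```php
<?php
// Hypothetical helper: returns array(offset, limit) for setLimits().
// $default_per_page stands in for get_option('posts_per_page').
function my_sphinx_limits($args, $default_per_page) {
  // Page numbers start at 1; anything missing or invalid means page 1.
  $page = (isset($args['paged']) && intval($args['paged']) > 0)
    ? intval($args['paged'])
    : 1;

  // Use whichever of the two page-size vars is larger.
  $per_page = 0;
  foreach (array('posts_per_page', 'showposts') as $key) {
    if (isset($args[$key])) {
      $per_page = max($per_page, intval($args[$key]));
    }
  }
  if ($per_page < 1) {
    $per_page = intval($default_per_page);
  }

  return array(($page - 1) * $per_page, $per_page);
}

// Page 3 at 10 posts per page skips the first 20 matches.
list($offset, $limit) = my_sphinx_limits(
  array('paged' => 3, 'posts_per_page' => 10),
  10
);
```

With these inputs $offset is 20 and $limit is 10, which are the two values the wrapper hands to setLimits().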

Now that we’ve searched against Sphinx and have gotten WP_Query to return the posts based on the Sphinx result, we need to reset the query_vars that we modified earlier so that WordPress continues to behave as if it ran the search itself. We’re going to do this by adding a filter to ‘found_posts’.

function my_sphinx_search_filter_found_posts($found_posts, &$wp_query = null) {
  if(!is_null($wp_query)) {
    if(isset($wp_query->query_vars['sphinx_num_matches'])) {
      $found_posts = intval($wp_query->query_vars['sphinx_num_matches']);
    }
    if(isset($wp_query->query_vars['sphinx_search_term'])) {
      $wp_query->query_vars['s'] = $wp_query->query_vars['sphinx_search_term'];
    }
    if(isset($wp_query->query_vars['sphinx_paged'])) {
      $wp_query->query_vars['paged'] = $wp_query->query_vars['sphinx_paged'];
    }
  }
 
  return $found_posts;
}
add_filter('found_posts', 'my_sphinx_search_filter_found_posts', 10, 2);
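If the hide-and-restore dance is hard to picture, the sketch below runs the same round trip on a plain array that stands in for $wp_query->query_vars. Both helper names are hypothetical and exist only for this illustration; the plugin does this work inside the two hooks shown above.

```php
<?php
// Hypothetical helpers working on a plain array instead of
// $wp_query->query_vars, so the round trip can run outside WordPress.

// Mirrors what the 'parse_query' action does before the query runs.
function my_sphinx_hide_vars(array $vars) {
  $vars['sphinx_search_term'] = $vars['s'];
  unset($vars['s']);
  if (isset($vars['paged'])) {
    $vars['sphinx_paged'] = $vars['paged'];
    unset($vars['paged']);
  }
  return $vars;
}

// Mirrors what the 'found_posts' filter does after the query runs.
function my_sphinx_restore_vars(array $vars) {
  if (isset($vars['sphinx_search_term'])) {
    $vars['s'] = $vars['sphinx_search_term'];
  }
  if (isset($vars['sphinx_paged'])) {
    $vars['paged'] = $vars['sphinx_paged'];
  }
  return $vars;
}

$before = array('s' => 'sphinx', 'paged' => 2);
$hidden = my_sphinx_hide_vars($before);     // 's' and 'paged' are gone
$after  = my_sphinx_restore_vars($hidden);  // both are back
```

After the round trip, ‘s’ and ‘paged’ hold their original values again, which is what lets templates keep reading them as if WordPress had run the search itself.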

That’s it. We’ve set up WordPress to use Sphinx as its search engine, while still letting it degrade gracefully to WordPress’ default search if Sphinx isn’t available.

I have grouped this code together as a stand-alone plugin that can be downloaded and used on your own site: WP Sphinx Search

About the Author
Michael Pretty is an application developer for the Voce Connect Platforms team with a background in developing for PHP, MySQL, WordPress and a handful of other environments. Follow him on Twitter @prettyboymp

Filed in Development, Programming, Search, WordPress

2 Responses to “Extending WordPress search with Sphinx (Part II)”

Sarah on September 1st, 2010 at 11:20 am

The Sphinx search is installed fine but the wordpress plugin is showing the following error:

Fatal error: Class ‘SphinxClient’ not found in /home/klsr/public_html/topics/wp-content/plugins/wp-sphinx-search/wp-sphinx-search.php on line 121

Can you please let me know how to fix it?

Michael Pretty (prettyboymp) on September 8th, 2010 at 8:44 am

Sarah, try upgrading to the latest version of the plugin. The initial version required the Sphinx PECL extension to be installed for PHP. I’ve updated the plugin so that it will fall back on the PHP class version of the extension if it isn’t installed. This should take care of your problem.