Two weeks ago, in Sphinx: Search Outside the Box I took a high-level look at why Sphinx is a great choice for full text indexing large data sets. Last week, in Sphinx: Getting Practical we dove into setting up Sphinx and building a simple index and querying it from the command line.
Now it’s time to get serious and look at writing some simple code that can query a running Sphinx index and take advantage of its growing number of advanced query features. The Sphinx Documentation is obviously the definitive reference, but I hope to show just enough sample code that you realize how easy it is to start talking to a Sphinx server.
API Basics
There are Sphinx clients available in most popular languages. If you look in the api subdirectory of the source tree, you’ll find Ruby, Java, PHP, Python, and C (libsphinxclient). There’s a Perl module available (Sphinx-Search on CPAN) too. And if none of those are sufficient, the latest versions of Sphinx even support SQL-like queries issued via the MySQL protocol, on whichever TCP port you configure searchd to listen on (9306 by convention, rather than MySQL’s own 3306). Talk about an easy migration path from MySQL full-text!
Obviously the syntax differs from language to language, but the general approach for querying Sphinx is similar in all of them:
- create a Sphinx client object
- set query options
- set query
- connect to Sphinx server (if not connected)
- send query
- receive results
- close connection
Note that it’s possible to batch queries and send several at once. Doing so lets Sphinx work more efficiently when the queries share work that only needs to be done once. That’s rarely needed in traditional web deployments, but it can be useful in offline processing. Also, newer Sphinx releases support persistent connections. Not only do they reduce the fork() overhead (which can be substantial!) on the server side, they also reduce the TCP overhead and allow for higher throughput in high-volume situations. As a result, the connect and close steps (the fourth and seventh above) may not always apply.
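As a rough illustration of batching over a persistent connection, here is a minimal sketch. It assumes $cl is a SphinxClient from sphinxapi.php that has already been pointed at a server with SetServer(), and that your release is new enough to provide Open() and Close(). The helper name run_batch() is made up for this example.

```php
<?php
// Sketch only: run_batch() is a hypothetical helper, not part of sphinxapi.php.
// $cl is assumed to be a configured SphinxClient; $queries is a list of strings.
function run_batch($cl, $queries, $index)
{
    // Ask for a persistent connection. Open() returns false on failure,
    // in which case the client falls back to connecting per request.
    $persistent = $cl->Open();

    // Queue each query instead of sending it right away.
    foreach ($queries as $q) {
        $cl->AddQuery($q, $index);
    }

    // Send the whole batch in one round trip. The return value is an array
    // of result sets (one per queued query), or false on a general failure.
    $results = $cl->RunQueries();

    if ($persistent) {
        $cl->Close();
    }
    return $results;
}
```

Each entry in the returned array has the same shape as a single Query() result, so the result-handling code shown later applies to every element in turn.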
Let’s look at a simple PHP code example that connects to the Sphinx server running on localhost, searches for all documents that contain the phrase “hello world”, and sorts them by size.
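require("sphinxapi.php");
$host = "localhost";
$port = 9312; // searchd's default API port in recent releases (3306 is MySQL's port)
$index = "index1"; // put your index name here
$query = "hello world";
$cl = new SphinxClient();
$cl->SetServer($host, $port);
$cl->SetMatchMode(SPH_MATCH_PHRASE); // match the words as an exact phrase
$cl->SetSortMode(SPH_SORT_ATTR_DESC, "size"); // largest documents first
$res = $cl->Query($query, $index);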
Once the results come back, you simply check for success and matches, printing out the document ids.
if ($res === false) {
    print "failure: " . $cl->GetLastError() . "\n";
}
else {
    print "retrieved $res[total] of $res[total_found] matches in $res[time] seconds\n";
    // With the default result layout, matches are keyed by document id.
    if (!empty($res["matches"])) {
        foreach ($res["matches"] as $docid => $docinfo) {
            print "$docid\n";
        }
    }
}
That code makes use of the single file sphinxapi.php, which is the PHP client API that’s shipped as part of every Sphinx release. In fact, the test suite used to validate new releases uses the PHP API heavily, so you can probably find example code to do just about anything you’d need.
As you can see, it follows the process outlined above. After a few variables are defined, we create a new Sphinx client object ($cl), set a few options, and then fire off the query. Iterating over the results is also very straightforward. The example above is intentionally short; it’s actually possible to retrieve some metadata (namely, the attributes) for each of the matched documents in the result set too.
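To show what that per-document metadata looks like, here is a small sketch that walks a result array and formats each match together with its attributes. It assumes the default result layout (SetArrayResult() not enabled), where matches are keyed by document id and each entry carries "weight" and "attrs". The helper name describe_matches() is invented for this example.

```php
<?php
// Sketch only: describe_matches() is a hypothetical helper.
// $res is assumed to be the array returned by SphinxClient::Query().
function describe_matches($res)
{
    $lines = array();
    if (empty($res["matches"])) {
        return $lines; // no matches, nothing to describe
    }
    foreach ($res["matches"] as $docid => $docinfo) {
        $attrs = isset($docinfo["attrs"]) ? $docinfo["attrs"] : array();
        $parts = array();
        foreach ($attrs as $name => $value) {
            // Multi-value attributes come back as arrays; join them for display.
            $parts[] = $name . "=" . (is_array($value) ? implode(",", $value) : $value);
        }
        $lines[] = "doc $docid (weight " . $docinfo["weight"] . "): " . implode(", ", $parts);
    }
    return $lines;
}
```

Printing the returned lines one per row gives a quick way to check that the attributes you declared in the index definition are actually coming back.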
Building on that simple foundation, there’s a lot more we can do.
Matching Modes
In the example code we used a call to SetMatchMode(), passing SPH_MATCH_PHRASE. That told Sphinx we wanted a phrase match: that is, find “hello” and “world” used together as an exact phrase. There are several other matching modes available.
- SPH_MATCH_ALL: find documents that contain all of the query terms
- SPH_MATCH_ANY: find documents that contain any of the query terms
- SPH_MATCH_BOOLEAN: allow AND (&), OR (|), and negation (-term) expressions plus grouping using parentheses
- SPH_MATCH_EXTENDED2: support queries using Sphinx’s more complex query language
- SPH_MATCH_FULLSCAN: search all documents, applying any specified filters and grouping
Between Boolean and extended2 (which replaces the original “extended” mode), you can construct queries complex enough for just about any circumstance.
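For a taste of the extended query language, the sketch below restricts one term to a field, adds a proximity search, and excludes a word. The field names title and body are assumptions about your index, and search_extended() is a made-up helper; $cl is assumed to be a configured SphinxClient.

```php
<?php
// Sketch only: search_extended() is a hypothetical helper, and the
// "title" and "body" fields are assumed to exist in your index.
function search_extended($cl, $index)
{
    $cl->SetMatchMode(SPH_MATCH_EXTENDED2);

    // @title hello         : "hello" must appear in the title field
    // @body "hello world"~5: both words within 5 words of each other in body
    // -goodbye             : skip documents that contain "goodbye"
    $query = '@title hello @body "hello world"~5 -goodbye';

    return $cl->Query($query, $index);
}
```

The return value is the usual result array (or false on failure), so it can be handled exactly like the phrase-match example.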
Sorting Modes
Sphinx allows you to choose from several sorting modes that affect the order in which results are returned but not which documents match the query.
- SPH_SORT_RELEVANCE: Sphinx default, sort from most relevant to least based on word frequency
- SPH_SORT_ATTR_ASC: sort in ascending order based on an attribute
- SPH_SORT_ATTR_DESC: sort in descending order based on an attribute
- SPH_SORT_TIME_SEGMENTS: group by “time segment”, then sort by relevance within the groups
- SPH_SORT_EXTENDED: configure sorting based on multiple attributes, each of which can be in ascending or descending order
- SPH_SORT_EXPR: sort based on an arbitrary mathematical expression
To make this more concrete, consider this call:
$cl->SetSortMode(SPH_SORT_ATTR_DESC, "size");
That asks Sphinx to sort the documents from largest to smallest (based on the size attribute included in the earlier index definition).
In “extended” mode, you can use the attributes defined for your index as well as some of Sphinx’s internal attributes.
$cl->SetSortMode(SPH_SORT_EXTENDED, "size ASC, @id DESC");
That tells Sphinx to sort in ascending order by size and then, in the case of a tie, in descending order by document id. Extended mode is very powerful, especially if you have numerous attributes on which to sort (price, weight, date added, etc.).
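The two remaining modes can be sketched the same way. The attribute date_added below is hypothetical (it would need to be a timestamp attribute in your index), size is the attribute from the earlier examples, and both helper names are invented. $cl is again assumed to be a configured SphinxClient.

```php
<?php
// Sketch only: both helpers are hypothetical wrappers around SetSortMode().

// Group results into time segments (last hour, last day, and so on) using a
// timestamp attribute, then rank by relevance inside each segment.
function sort_by_recency($cl)
{
    $cl->SetSortMode(SPH_SORT_TIME_SEGMENTS, "date_added"); // assumed attribute
}

// Sort by an arithmetic expression that blends relevance with an attribute.
// Expression sorting returns the highest computed values first.
function sort_by_blend($cl)
{
    $cl->SetSortMode(SPH_SORT_EXPR, "@weight + ln(size + 1) * 0.1");
}
```

The weighting in the expression is arbitrary; treat it as a starting point to tune against your own data.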
Filtering
In addition to full-text search capabilities, Sphinx lets you use numeric attributes to refine a search. For example, in building a product search, you may want to find all products whose price is less than $500. Or maybe find all those that fall between $50 and $75. To do this, you’ll want to call SetFilter(), SetFilterRange(), or SetFloatFilterRange(). All three filtering functions allow you to specify either an inclusive or exclusive filter.
SetFilter() can find documents whose attributes match one or more values, or exclude those documents that match one or more values.
$cl->SetFilter("price", array(100), false); // find $100 items
$cl->SetFilter("price", array(50, 75), true); // exclude $50 and $75 items
Similarly, we can use SetFilterRange() to find or exclude a range of integer values (use SetFloatFilterRange() for non-integer values).
$cl->SetFilterRange("price", 50, 100, 0); // find items priced between $50 and $100
$cl->SetFilterRange("price", 50, 100, 1); // exclude items priced between $50 and $100
Between filters on attributes and the extended query language, you can handle a surprising array of query types without having to write a lot of custom code.
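Filters stack: every filter you set must pass for a document to be returned. The sketch below combines a float range with a value filter and an exclusion. The attribute names price_usd, category_id, and discontinued are assumptions about your index, and apply_product_filters() is a made-up helper. Note that the PHP client expects real floats for the float range bounds.

```php
<?php
// Sketch only: apply_product_filters() is a hypothetical helper and all
// three attribute names are assumed, not taken from a real index.
function apply_product_filters($cl)
{
    // Start from a clean slate so filters from earlier queries don't linger.
    $cl->ResetFilters();

    // Keep items priced from 49.99 to 74.99; the bounds must be PHP floats.
    $cl->SetFloatFilterRange("price_usd", 49.99, 74.99);

    // Keep only items in category 3 or category 7.
    $cl->SetFilter("category_id", array(3, 7));

    // Drop anything flagged as discontinued (third argument: exclude).
    $cl->SetFilter("discontinued", array(1), true);
}
```

Call it before Query(), and call ResetFilters() again if the same client object is reused for an unrelated search.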
Geography
A special case of filtering and sorting based on attributes is geo-distance. If you have geocoded data, such as houses for sale or the locations of restaurants, you can add latitude and longitude attributes to your index and take advantage of Sphinx’s built-in support. In the SPH_SORT_EXPR sorting mode, you can use the built-in GEODIST() function to compute the distance between two points of latitude and longitude. But it’s easier to use the SetGeoAnchor() call to tell Sphinx what the latitude and longitude attributes are called in your index and specify an “anchor” point from which distances will be computed.
$cl->SetGeoAnchor("lat", "lon", $latitude, $longitude);
Once that is done, you can use the magic attribute @geodist in both filters and sorting. That would allow you to, say, find all pizza places within a 5 mile radius of a given point and then sort the result set based on that distance.
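Here is one way the pizza search might look. This sketch assumes the index stores lat and lon as float attributes in radians (Sphinx works in radians and reports @geodist in meters), that $cl is a configured SphinxClient, and that the helper name nearby() is just for illustration.

```php
<?php
// Sketch only: nearby() is a hypothetical helper. The "lat" and "lon"
// attributes are assumed to be floats stored in radians.
function nearby($cl, $query, $index, $latDeg, $lonDeg, $radiusMiles)
{
    // The anchor point must be given in radians, as floats.
    $cl->SetGeoAnchor("lat", "lon", deg2rad($latDeg), deg2rad($lonDeg));

    // @geodist is measured in meters, so convert the radius from miles.
    $cl->SetFloatFilterRange("@geodist", 0.0, $radiusMiles * 1609.344);

    // Closest results first.
    $cl->SetSortMode(SPH_SORT_EXTENDED, "@geodist ASC");

    return $cl->Query($query, $index);
}
```

A call such as nearby($cl, "pizza", "index1", 37.77, -122.42, 5) would then return matches within roughly five miles of that point, nearest first. The coordinates here are only sample values.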
Conclusion
Hopefully this has provided you with some ideas for the types of tweaking you can do behind the scenes to make Sphinx search just the way you expect (and need) it to. In addition to everything we’ve seen so far, Sphinx can also perform more complex grouping of results and it can also build “excerpts” of matched documents on the fly to show context (much like Google does). As always, it’s best to check the documentation for complete descriptions of the options as well as any gotchas or hints.
Happy searching!
Comments on "Sphinx: Queries and APIs"
The port you’re using for connecting (3306) is the default for mysql, not sphinx.