Web application development in App Engine is still a new thing especially for me. While most of us are coming from the SQL type database, one of the most challenging work is to perform search.
Unlike SQL, you can't do START WITH or CONTAINS search, which I usually do in SQL for much user friendly search. There's also a limited inequality search you can do. Good thing is that there's a number of solution which could solve these shortcomings like the Searchable Model and Simple Full Text Search by Bill Katz.
Using the Searchable Model is really nice for short StringProperty you want to search on so you can be sure that you won't reach the 5,000 index entries cap. While is doesn't really imitates what the SQL START WITH and CONTAINS do, it is much better than the filter option provided. By default the Searchable Model index all the StringProperty you have in your model. To keep your Entries small and quick, you can specify which property gets index.
class Person(search.SearchableModel): first_name = db.SringProperty() last_name = db.SringProperty() @classmethod def SearchableProperties(cls): return [['first_name'], ['last_name']]
This will make a index on first_name and last_name separately, and you use it this way
query = Person.all().search('john', properties=['first_name']).order('frist_name') query = Person.all().search('smith', properties=['last_name']).order('last_name')
You can also declare multiple fields to be in one index to perform search on multiple property search.
@classmethod def SearchableProperties(cls): return [['first_name', 'last_name']]
And use it like this.
query = Person.all().search('john', properties=['first_name', 'last_name']).order('frist_name')
Using the Simple Full Text Search by Bill Katz is best for searching TextProperty, to do what we did with Searchable Models, will be like this
class Person(db.Model, Searchable): first_name = db.SringProperty() last_name = db.SringProperty() INDEX_ONLY = ['first_name', 'last_name']
And performing search could be as easy as
query = Person.search('john')
While both implementation seems to very similar here's a simple comparision showing how different they are.
| Searchable Model | Simple Full Text Search | |
| Stemming | No | Yes |
| Multi-word | Some how | Yes |
| Result | Returns a query object wherein you can perform additional operation like filter and order | Returns a list of match entities which you can't perform any filter or order, unless you're going to manipulate the list returned. |
| Implementation | Adds a StringListProperty to your model (__searchable_text_index). This kind of implementation bounds you to the 5,000 index entries cap. | Using another Model (StemmingIndex) to store the indexs of your model and links them by the Parent-child relationship. This approach is nice without modifying your model plus you're not bound to the 5,000 index entries cap. |
| Update | When you update you entity, the index also gets updated. | After updating your model, you have to manually update the index by performing a ancestor search on the StemmingIndex, delete it and create a new one to make the search perform right. |
While there are still some limitation on these solutions, I still think they're great solution for now. Good news is we might have a Full Text Search API coming.
