	Searchdaimon ES	SearchBlox	mnoGoSearch	IBM OmniFind Yahoo! Edition	Google Mini	MS Search Server	Thunderstone	Constellio	Amazon CloudSearch	Google Site Search
Version	Auto updating	V 6.4 Build 2	3.3.12	8.4.2	4.6.4	2010 Express	App 6.01, script 8.0	1.2.1	–	–
Search page	Open	Open	Open	Open	Open	Open	Open	Open		Open
Indexed documents[1]	48 801 (100%)	34 900 (72%)	43 859 (90%)	36 464 (75%)	47 135 (97%)	38 369 (79%)	45 141 (92%)	43 685 (89%)	48 737 (99.8%)	73 900 (151%)
Index size	1.1G	3.8G	1.2G	2.9G	–	2.1G[2]	3.3G	9.6G	–	–

Collection refiltering	X	X							X
Misleading result count[3]					X	X			X
Administration	Web GUI	Web GUI	Command line only	Web GUI	Web GUI	Web GUI	Web GUI	Web GUI	Web GUI, some command line	Web GUI
Platform	Virtual appliance, hardware appliance	Windows and Linux	Windows and Linux	Windows and Linux	Hardware appliance	Windows	Virtual appliance, hardware appliance	Linux	Amazon’s own cloud infrastructure	Google’s own cloud infrastructure
Cost	Free open source version with community support only. Version with full support from $1 999 to $15 000 depending on the number of users and hardware options	Free for the first 10 000 documents. Then $5 000 per server per year for a more advanced version with unlimited documents	Free for Linux. Windows version from $99 to $19 850 depending on the underlying database technology	Free	From $2 990 to $9 990 depending on the number of documents	Free	From $990 depending on the number of documents and hardware options	Free	Different search servers at $86.40, $345.60 and $489.60 per month, depending on data size and query load. You may need several in parallel if you have a lot of data or many users. In addition there are data transfer, query count and document updating fees	$100 to $2 000+ per year depending on the number of queries and the on-demand index quota
Max documents	No hard limit	10 000 for free version. No hard limit for paid version	No hard limit	500 000	From 50 000 to 300 000 depending on license	No hard limit	Depending on license	No hard limit	Has a limit, but no numbers have been published	Unknown
Underlying search technology	Proprietary[4]	Lucene/Solr	SQL server	Lucene/Solr	Proprietary[4]	SQL server	Proprietary	Lucene/Solr, SQL server	Proprietary, based on Amazon A9	Proprietary, based on Google.com
Review	Searchdaimon ES review	SearchBlox review	mnoGoSearch review	IBM OmniFind Yahoo! Edition review	Google Mini review	Microsoft Search Server Express 2010 review	Thunderstone review	Constellio review		Google site search review

htdig

http://www.htdig.org/
We plan to add htdig to Open Test Search soon. Htdig is an open source search engine mostly used for websites and intranets. It is a bit outdated, with the latest release from 2004.

It is easy to install. On CentOS you only need to run “yum install htdig htdig-web”. Unfortunately you have to download and build third-party programs to convert common document formats such as .doc, .pdf and .xls.

Notes

[1] Indexed documents

There are 48 811 documents in total in the two test collections. Some search engines ignore documents that they don’t have a data converter for. Ignoring such documents means you can’t search for the file names of images and other non-text content. Other engines index the file name and/or metadata.
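The percentages in the "Indexed documents" row look like each engine's count divided by the 48 811 documents in the collections. That is inferred from the numbers, not stated on the page. A minimal Python sketch of the arithmetic follows; the function name `coverage_percent` is illustrative only:

```python
TOTAL_DOCUMENTS = 48_811  # total documents in the two test collections


def coverage_percent(indexed: int, total: int = TOTAL_DOCUMENTS) -> float:
    """Return the indexed document count as a percentage of the collection size."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * indexed / total


# Indexed counts copied from the comparison table above.
indexed_counts = {
    "SearchBlox": 34_900,
    "Google Mini": 47_135,
    "Amazon CloudSearch": 48_737,
    "Google Site Search": 73_900,
}

for engine, count in indexed_counts.items():
    print(f"{engine}: {coverage_percent(count):.1f}%")
```

A value above 100% means the engine reports more indexed documents than there are files in the collections.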

There are also some documents with special file names that can be safely ignored, typically those whose file name starts with “~” or has “#” in the name (the # character has a special meaning when used in a URL).
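The file name rule above can be sketched as a small Python predicate. This is only an illustration of the two patterns mentioned (a leading “~” or a “#” anywhere in the name); the function name `is_ignorable` and the sample paths are made up for the example:

```python
from pathlib import PurePosixPath


def is_ignorable(path: str) -> bool:
    """Return True if the file name starts with '~' or contains '#'."""
    name = PurePosixPath(path).name  # look at the file name only, not the folders
    return name.startswith("~") or "#" in name


# Hypothetical sample paths, not taken from the test collections.
for sample in ["docs/~$budget.doc", "notes/#draft#.txt", "reports/q1.pdf"]:
    print(sample, is_ignorable(sample))
```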

[2] Estimated size

The search server doesn’t reveal disk usage in its GUI. This number is based on the size of the C:\Program Files\Microsoft Office Servers\14.0\Data\MSSQL10.SHAREPOINT folder.
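One way to get such an estimate is to sum the sizes of all files under the folder. The Python sketch below shows the idea; it is not necessarily how the number in the table was measured, and the helper names are illustrative:

```python
import os


def folder_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files below root, in bytes."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            try:
                total += os.path.getsize(full_path)
            except OSError:
                # Skip files that disappear or cannot be read while walking.
                continue
    return total


def to_gib(size_bytes: int) -> float:
    """Convert bytes to gibibytes (1 GiB = 1024**3 bytes)."""
    return size_bytes / 1024**3


if __name__ == "__main__":
    data_dir = r"C:\Program Files\Microsoft Office Servers\14.0\Data\MSSQL10.SHAREPOINT"
    # On a machine without this folder, os.walk yields nothing and the result is 0.
    print(f"{to_gib(folder_size_bytes(data_dir)):.1f}G")
```

Note that the folder may hold more than the search index alone, which is why the figure is only an estimate.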

[3] Misleading result count

Some search engines don’t show the correct number of documents found. Instead they try to estimate how many there may be. For example, the Google Mini says it has found 134 000 documents containing enron, but there are only ~50 000 documents in the data set.

[4] Proprietary search technology

Neither Searchdaimon nor Google states what technology they use under the hood, but it is assumed to be some kind of inverted index, probably written in C or C++.
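For readers unfamiliar with the term, an inverted index maps each word to the set of documents that contain it. The Python sketch below is a toy illustration of the general idea only. It says nothing about how Searchdaimon or Google actually implement their engines, and the sample documents and function names are made up:

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map every term to the set of document IDs that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Return the IDs of documents containing all query terms (AND search)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))  # copy so the index is not mutated
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents.
docs = {
    1: "Enron energy trading",
    2: "Energy prices rise",
    3: "Enron email archive",
}
index = build_index(docs)
print(sorted(search(index, "enron")))         # [1, 3]
print(sorted(search(index, "enron energy")))  # [1]
```

With this structure the exact hit count for a query is simply the size of the returned set.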

Open Test Search [BETA]