SearchUnit Documentation: Release Notes

8.2.0

Improvements:
-Add the ability to set the filter load level in JavaScript by setting keyotiSearch.filterLoadLevel to DocId, DocIdUrlCategories or Everything (see help page Examples/JS Usage/Filtering Results).
-Security: upgrade embedded jQuery to v3.6.4 and jQuery UI to v1.13.2 due to vulnerabilities in earlier versions.

8.1.1

Improvements:
-Fix bug in 8.1.0 where the result summary is occasionally blank or garbled.
-Add CrawlableDomains configuration property, which can be used to specify other domains/hosts that can be crawled if there are links to them.

8.1.0

Improvements:
-Fix rare issue with reading past EOF (end of file) during searching
-Support .dotm, .dotx, .potm, .potx file types
-Find plain text URLs when crawling non-HTML files (eg. .docx, .pdf etc)
-Fix template issue
-Add xlsm, dotm, dotx file types
-Allow autocomplete to be disabled using new property keyotiSearchAutocomplete.sew_enabled = false
-Fix stack overflow that can occur in parser with very specific text
-Default to last used index directory in Index Manager Tool

8.0.0

Breaking Changes
-As 7.0.1: Indexing tools are now built against .NET 4.5 for TLS 1.2 support. Please ensure .NET 4.5 or higher runtime is installed on any machine performing indexing. Please recompile any plug-in projects with .NET framework 4.5.
-Stop indexing formulas in XLSX files.
-Update user agent to "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36"

Improvements:
-Include .NET 5 based DLLs and Index Manager tool
-Fix phrase searches that include hyphenated words.
-Add ResultNumber to ResultItem so that results can be numbered in the ResultItem template.
-Fix issue with @ symbol in indexer login password.
-Bug fixes

7.0.1

Breaking Changes
-Indexing tools are now built against .NET 4.5 for TLS 1.2 support. Please ensure .NET 4.5 or higher runtime is installed on any machine performing indexing.

Improvements:
-Fix issue with TLS 1.2 support.
-Fix issue with 2 page search going into an infinite loop.
-Add ability to process results in JavaScript: set keyotiSearch.onResultsObtained to a function that will receive the results.
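
As a rough sketch of the results hook described in the last item, the snippet below assigns a function to keyotiSearch.onResultsObtained. Only that property name comes from these notes. The stand-in keyotiSearch object, the receivedResults array and the single "results" argument are assumptions made so the snippet runs on its own; the help documentation is the place to check what the function actually receives.

```javascript
// Stand-in so the snippet runs on its own; on a real page the SearchUnit
// script is assumed to have created the global keyotiSearch object already.
var keyotiSearch = globalThis.keyotiSearch || {};

// Hypothetical handler: the shape of "results" is not described in these
// notes, so it is only stored for later inspection, never unpacked.
var receivedResults = [];
keyotiSearch.onResultsObtained = function (results) {
  receivedResults.push(results);
};
```

Storing the raw value first is a cautious way to discover its structure (for example in the browser console) before writing code that depends on particular fields.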

7.0.0

Breaking Changes
-None

Improvements:
-Geolocation filtering support added.
-Web API exposure, documentation added.
-Upgrade HTTP links to HTTPS when the crawler start URL is HTTPS based. See UpgradeCrawledURLsToHTTPS in configuration to disable.
-Result summaries no longer include the TITLE tag from the document. This improves result quality because repetitive titles no longer fill up the results (requires reindexing).
-Upgraded index manager tool to use .NET 4.6, which will support TLS 1.2 for HTTPS connections.
-Fix bug in lemma generator that would create incorrect lemmas under certain conditions.
-Add Keyoti_Robots meta tag support, so you can have one ROBOTS style meta tag specifically for SearchUnit.
-Fix issue with ligature characters in page titles.
-Allow programmatically set titles to be searchable.
-Remove document filename and document title from dynamic result summaries.
-Index additional content programmatically using new method overload AddDocument(Document document, string additionTextToIndex).
-Index meta description and keywords.
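
To illustrate the HTTP to HTTPS link upgrade mentioned in the list above, here is a minimal JavaScript sketch of the idea. It is not SearchUnit's crawler code (the crawler is a .NET component), the function name upgradeLink is hypothetical, and the notes do not say whether links to other hosts are upgraded too; this sketch simply upgrades every http link when the start URL uses https.

```javascript
// Illustrative only: mirrors the described behavior, not the real crawler.
function upgradeLink(link, startUrl) {
  const start = new URL(startUrl);
  const target = new URL(link, start); // also resolves relative links
  if (start.protocol === "https:" && target.protocol === "http:") {
    target.protocol = "https:";
  }
  return target.href;
}

// With an HTTPS start URL the link is upgraded:
// upgradeLink("http://example.com/page", "https://example.com/")
//   returns "https://example.com/page"
// With an HTTP start URL the link is left alone.
```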

6.1.0

Breaking Changes
-Documents are now added to the index with the URL provided in the server's response, which may be different from the requested URL. Previously, requested URLs were stored in the index.

Improvements:
-The SearchUnit.js JavaScript file now contains no lines over 3000 characters long (this is to help certain other developer tools that might not handle long lines).
-Fix bug with highlighter when keywords start with a symbol.
-Fix incorrect link generation during crawling when FriendlyURLs is enabled.
-Improve redirected URL handling.
-DocumentText class now includes Uri property.
-Fix Windows Service timing being affected by manually triggered Service runs.
-Fix bug in autocomplete when keyotiSearchAutocomplete.options.sew_runSearchOnSelect = false
-Improve support for PDFs using a specific filter stream.
-Improve handling of redirected pages.

6.0.0

Breaking Changes
.NET 1.1 support has ended and the .NET 1 DLLs have been removed. No breaking changes from v2012.2 or v2012.3. If you are upgrading from a version older than 2012.2, please see the 2012.2 breaking changes below.

Improvements:
-Add new AJAX based user interface, and support MVC, Razor, AJAX etc.
-Add custom data filter controls and sorting.
-Add search cloud control.
-Minor bug fixes.

This product was previously known as "Search for ASP.NET", and has been renamed to "SearchUnit". The name change is purely superficial. The previous release was "Search for ASP.NET 2012"; this release is "SearchUnit v6".

2012.3.0

Breaking Changes
If you are upgrading from a version other than 2012.2, please see the 2012.2 breaking changes below.

Improvements:
-Fix issue with xlsx parsing.
-PDF title selection improvements.
-Ignore diacritics (accented characters) in queries (eg. café matches cafe and vice-versa).
-When reimporting 'All' indexable sources, do not stop because one source had too many server errors.
-Add more central events (plug-ins).
-Minor bug fixes and API improvements (from customer feedback).
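
The diacritic item above can be pictured with a short JavaScript sketch. This is one common way to ignore accents (Unicode NFD decomposition followed by removal of combining marks) and is offered as an assumption about the general technique, not as SearchUnit's actual implementation. The function names are hypothetical.

```javascript
// Split accented letters into base letter + combining mark, then drop the marks.
function foldDiacritics(text) {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// Compare two words as a diacritic-insensitive (and case-insensitive) match.
function matchesIgnoringDiacritics(queryWord, indexedWord) {
  return (
    foldDiacritics(queryWord).toLowerCase() ===
    foldDiacritics(indexedWord).toLowerCase()
  );
}

// foldDiacritics("caf\u00e9") returns "cafe", so "cafe" and "café" match
// in either direction.
```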

2012.2.0

Breaking Changes
-The underscore '_' character is now treated as non-word-breaking. If it appears inside a word (eg. this_variable), it will not break the word apart (ie. into 'this' and 'variable'). This is useful for emails, for example, making it possible to search for my_email@host.com. To revert to the old behavior, edit the configuration setting WordNonBreakingCharacters, remove _ from the character list, and then add it to the WordBreakingCharacters list. Requires a reindex to take effect.
-Hyphen '-', forward slash '/' and back slash '\' are now considered word-breaking. If they appear inside a word, they will break the word apart (eg. "\program files\myapp" would be findable with the query "myapp" but not with "\program files\myapp"). Hyphens are treated specially and will match whether they are breaking or non-breaking. To revert to the old behavior, edit the configuration setting WordBreakingCharacters, remove /\- from the character list, and then add them to the WordNonBreakingCharacters list. Requires a reindex to take effect.
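
The effect of the two character lists can be pictured with a small JavaScript tokenizer. This is a sketch of the general idea only: the function names are hypothetical, whitespace is assumed to always separate words, and the special matching of hyphens described above is not modelled.

```javascript
// Escape characters that have a special meaning inside a regex character class.
function escapeForCharClass(chars) {
  return chars.replace(/[\\\]\[^-]/g, "\\$&");
}

// Split text into words; breakingChars lists extra characters that break words.
function tokenize(text, breakingChars) {
  const separator = new RegExp("[\\s" + escapeForCharClass(breakingChars) + "]+");
  return text.split(separator).filter((word) => word.length > 0);
}

const newBreaking = "/\\-"; // hyphen, forward slash and back slash break words
const oldBreaking = "_";    // older behavior: underscore breaks words

// tokenize("this_variable", newBreaking) returns ["this_variable"]
// tokenize("this_variable", oldBreaking) returns ["this", "variable"]
// tokenize("\\program files\\myapp", newBreaking)
//   returns ["program", "files", "myapp"]
```

Because indexed words are produced at indexing time, changing either list only affects documents indexed afterwards, which is why a reindex is needed.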

Improvements:
-Allow keywords to match parts of filenames that are split by hyphens or underscores. Eg. match "key1" to "my-key1-file.pdf".
-Change "_all" queries (which return all documents authorized for user) to only return documents within specified location and content categories.
-Minor bug fixes and API improvements (from customer feedback).

2012.1.0

Breaking Changes
-Support has been added for the 'robots' meta tag. The indexer will now obey these meta tags; to change this behavior, set RespectsRobotsMetaTags to false in Configuration.

Improvements:
-Update an indexed document URL when a server redirects to a new URL.
-BoostFactorTagName in Configuration can be used to change the boost factor tag name that the indexer looks for.
-Added login URL parameter (under More Options in the import form) for file-system based imports. This URL is hit before a filesystem import is started, allowing for the search user to be logged in (see the authentication section of the help for more information on log ins).
-Smaller bug fixes.

2012.0.1

Improvements:
-Fix bug in lemmas when certain digit based queries are used.
-Fix bug in index management tool so that security, content and location category changes are effective immediately instead of when the index is next closed (eg closing the tool or indexing a source).

2012.0.0 (v5)

Breaking Changes:
-The index format for 2012 is different to 2010 and earlier. To convert a 2010 index to 2012 format, open the index with the Index Management Tool and it will be converted for you. v3 or older indexes cannot be converted and should be generated from scratch.
-The default value of the Configuration.URLCaseSensitive property has been changed from true to false (because this is appropriate for the majority of users). As a side effect, URLs are stored in all lowercase, so if you compare your own strings against our stored URLs, make sure your string is also lowercase before comparison. Eg. resultURL.Contains("News") will ALWAYS be false if Configuration.URLCaseSensitive==false, and should be changed to resultURL.Contains("news").
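
The same pitfall applies to client-side code that inspects result URLs. The sketch below shows the comparison in JavaScript (the note's own example is .NET); the URL value and the helper name urlContains are hypothetical.

```javascript
// storedUrl stands for a URL as returned from an index built with
// Configuration.URLCaseSensitive set to false, i.e. already lowercased.
const storedUrl = "http://www.example.com/news/article1.aspx"; // hypothetical

// Lowercase your own string before comparing it with a stored URL.
function urlContains(url, fragment) {
  return url.includes(fragment.toLowerCase());
}

// storedUrl.includes("News") is false, because the stored URL is lowercase.
// urlContains(storedUrl, "News") is true.
```
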
Improvements:
-Addition of document security groups to allow user to see only results that they have access to
-Search speed improvements for multiple keywords
-New, more powerful index management tool UI
-Popular search auto-complete added (to existing lexicon based auto-complete)
-SearchBox watermark added
-New (plug-in) central events added
-Indexing of ZIP files added
-Added "ImpliedLogicOperator" to configuration to enable changing default search operator from AND to OR
-It is now possible to see all documents (that the user has permission to see): use the query _all, or programmatically call searchAgent.Search(searchAgent.AllDocumentReturnQuery, 1, 10);
-Auto-complete suggestions can be easily customized using a plug-in, or in JavaScript on the client
-Result preview text can be easily customized using a plug-in, or in JavaScript on the client
-Stemming/Lemma support increased to 9 languages
-Added AsynchronousQueue class (.NET 2+) to support asynchronous adding of documents to the index
-Windows Service can now email logs of its actions
-Numerous smaller API enhancements
-New publishing feature in index manager with ability to auto rebuild indexes and deploy with FTP or file copy (ideal for shared hosting users)

2010.1.0

Breaking Changes:
-To speed up searching, some result data is not loaded until a later stage. This only affects developers with custom 'filter' code (i.e. an override of the AddResultItemToResults method). To revert to the old behavior, set the FilterLoadLevel property in SearchResult to Keyoti.SearchEngine.Search.FilterLoadLevel.DocIdUrlCategories

-Web Admin & Windows Service and Plugins: We have improved these by including separate installers for .NET versions 1, 2 (3, 3.5) and 4. In previous versions a single .NET 1 based installer was used, with the side effect that any plug-in DLLs used had to be compiled against our .NET 1 main DLLs. Therefore, if you use a plug-in DLL and choose to install the admin MSI for .NET 2 (3, 3.5) or 4, you will need to also change the plug-in to reference our corresponding main DLLs for that .NET version. Please see the help section on plug-ins for more information or email support.

-.NET 4 users: In ASP.NET 4, the HtmlEncode and UrlEncode methods of the HttpUtility and HttpServerUtility classes have been updated to encode the single quotation mark character (') as &#39; and %27 respectively. Due to this, there is a possibility that an index created with .NET 2 will not be fully compatible with .NET 4 (if URLs have ' chars in them). The safest thing to do is recreate the index from scratch. If uncertain if this affects you, please email support at keyoti.com.
Improvements:
-Inclusion of .NET 4 DLLs and VS 2010 demos
-Inclusion of Web Admin / Windows Service versions for .NET 1, 2 (3, 3.5) and 4 (see breaking changes above)
-Faster searching (with larger indexes)
-Configuration.ImpliedLogicOperator can now be used to change the default operator that is assumed between query words from AND to OR
-Insertion of "..." in the result summary when more than one snippet of text is used. "..." can be changed in Configuration.SummaryTextSplitter
Bug Fixes:
-HTML tags in titles were not escaped
-Some ignore regions were not always correctly ignored
-Fix a highlighting issue with complex queries

2010.0.0 (v4)

Improvements:
-Vastly improved indexing performance, especially with larger indexes (50K docs up)
-Faster searching
-New indexing methodology
-Stop lists
-API improvements
-SearchBox options generates option controls automatically
-Web admin application now uses AJAX based updates on the import/optimize form
-Additional Configuration settings added (Eg. DbConnectionTimeout)

Please see the upgrade section for more details on changes.
Compatibility:

-Not backwards compatible with earlier versions, please see the upgrade section.

3.1.1

Improvements:
-Substantial improvement in PDF indexing performance (CPU and memory usage)
-Fixed bugs relating to image based search buttons
Compatibility:

-Backwards compatible with 3.1.0, please see compatibility notes for 3.1.0 as well.

3.1.0

Improvements:
-Improved HTML output standards compliance.
-Configuration.MaxDocumentSizeToIndex has been added to allow very large documents to be skipped during indexing. By default it is set to 30MB which means that files larger than this will not be indexed.
Compatibility:

-By customer demand, the controls: SearchResults, SearchBoxOptions, and FeaturedResults are wrapped by "DIV" tags, instead of "SPAN" tags. The SearchBox control is still wrapped by "SPAN" tags, but does not use a table for layout. The templates are not wrapped by any tag. These changes have been made to make the rendered output standards compliant, however it is _possible_ that it will affect appearance under your application - therefore it is recommended that the appearance is checked after upgrading.

-Due to potential performance impacts incurred by Authenticode signing assemblies (see http://blogs.msdn.com/shawnfa/archive/2005/12/13/502779.aspx), the product DLLs are no longer Authenticode signed by default. If you prefer to use the Authenticode signed versions, please obtain them here http://keyoti.com/downloads/Search_ASP.NET-Setup-3-1-0-AuthenticodeSignedDLLs.zip

3.0.0

Improvements:
-Added wildcard search support, eg. "excel*" or "*ment" or "*read*".
-File-system based document importing
-Plug-in architecture added
-Simplified index configuration
-New international spelling dictionaries
-Improved search performance
-Improved performance when indexing large numbers of documents (over 10,000)
-Simpler customizability (event based)
-Support for multiple simultaneous indexes
-Autocomplete and result preview AJAX functionality for Controls
-Support for more document types (added .xls, .ppt, ODF, OOXML)
-Improved robustness
-Chinese / Japanese / Korean language support
-Easier single page search/results support
-Meta tag based location/category assignment
-Plug-in DLL based location/category assignment
-Added option for spelling suggestions to be taken from search lexicon, giving better suggestions. See SearchSuggestions.SpellingSuggestionSource
-Added FoundSpellingError event to SearchSuggestions to enable API user to add-to/alter prescribed suggestions.
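
The wildcard forms listed at the top of this section ("excel*", "*ment", "*read*") can be read as prefix, suffix and substring matches on a word. A minimal JavaScript sketch of that reading follows; it translates a wildcard pattern into a regular expression and is an illustration only, not how the search engine evaluates wildcards internally. The function name is hypothetical.

```javascript
// Turn a wildcard pattern into a RegExp where * matches any run of characters.
function wildcardToRegExp(pattern) {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp("^" + escaped + "$", "i");
}

// wildcardToRegExp("excel*").test("excellent")  is true  (prefix match)
// wildcardToRegExp("*ment").test("document")    is true  (suffix match)
// wildcardToRegExp("*read*").test("thread")     is true  (substring match)
// wildcardToRegExp("excel*").test("unexcelled") is false
```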

Compatibility:

Is NOT backwards compatible with v2, please see the upgrade section

Search Pro History

2.2.0b

Improvements:
-Fixed issue with some PDF document encoding of accented chars.
-Added option for spelling suggestions to be taken from search lexicon, giving better suggestions. See SearchSuggestions.SpellingSuggestionSource
-Added FoundSpellingError event to SearchSuggestions to enable API user to add-to/alter prescribed suggestions.
-Added multilingual spelling dictionary
Compatibility:

Is backwards compatible with 2.1.0

2.1.0

Improvements:
-PDF document titles now read from PDF meta data
-Optimized build process memory usage to a scalable flat profile (lower overall usage)
-Optimized search speed in large (>1000000 occurrence) indexes

Search Lite History

2.0.0

Improvements:
-New FeaturedResults control added.
-Crawl/build log window added showing web-site errors.
-Build index operation memory reduced.
-Crawl depth setting added.
-Unhyphenated word forms also searched. (eg. "web-site" now yields results containing "web site").
Compatibility:
Is backwards compatible with 1.3.0

1.3.0

Improvements:
-Ignore region properties added (IgnoreBlockBeginPattern/IgnoreBlockEndPattern), enabling menus etc to be omitted from the index.
-When content encoding not returned by server, encoding is now read from meta tags.
-Documents are now not *reindexed* if 'modified date' is older than the index (also see IgnoreLastModifiedDate).
-Support for (and example of) merging disparate result DataSets added.
-Visual Studio 2005 support added.
-Search speed improvements.
-Http compressed page support added (gzip/deflate).
-Summary word highlighting HTML element property added.
-ReaderExceptionOccurredEvent added to enable detection of broken web-site links.
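
To picture what the ignore region properties (IgnoreBlockBeginPattern/IgnoreBlockEndPattern) achieve, the JavaScript sketch below removes everything between a begin marker and an end marker before the text would be indexed. It assumes the patterns are plain marker strings, which these notes do not confirm; the marker values and the function name are hypothetical.

```javascript
// Remove every region that starts at beginMarker and ends at endMarker.
function stripIgnoredRegions(html, beginMarker, endMarker) {
  if (!beginMarker || !endMarker) return html; // nothing to ignore
  let result = "";
  let position = 0;
  while (position < html.length) {
    const begin = html.indexOf(beginMarker, position);
    if (begin === -1) {
      result += html.slice(position); // no more regions: keep the rest
      break;
    }
    result += html.slice(position, begin);
    const end = html.indexOf(endMarker, begin + beginMarker.length);
    if (end === -1) break; // unterminated region: drop the remainder
    position = end + endMarker.length;
  }
  return result;
}

// stripIgnoredRegions("<p>Keep</p><!--B--><ul>Menu</ul><!--E--><p>Also keep</p>",
//                     "<!--B-->", "<!--E-->")
//   returns "<p>Keep</p><p>Also keep</p>"
```
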
Fixes:
-Fixed SearchResultURL redirection when result page URL already contains a query string.
Compatibility:
Due to the addition of a new index file in the index directory, it is advisable that you rebuild your indexes before deploying this version. If you do not rebuild the indexes the file will be created automatically (however this may fail if the ASPNET process has insufficient permissions in the index directory). Please see this KB article for details; http://keyoti.com/kb/Default.aspx?ToDo=view&questId=102&catId=54

1.2.0

Improvements:
-Added support for user pressing 'Enter' key inside search box.
-Added properties to control next/prev page link texts.
-SearchResult ItemCreated events added.
-Frames web-site support added (crawling and indexing).
-Added PathMatchesToBeIncluded to allow only URLs with matching substrings to be indexed.
-Property added to handle error page redirects.
-Crawl start URL is persisted by Index Management tool for added convenience.
Compatibility:
Is backwards compatible with 1.1.0

1.1.0

Improvements:
-Forms Authentication support, enabling the crawler and indexer to work with web applications that require a user to logon - please see Help documentation.
-Windows Integrated Authentication support - enabling the crawler and indexer to crawl web applications where Integrated Authentication is enabled under IIS.
-Crawler now warns users if the start URL is invalid or cannot be read (404 etc).
-Support for international encodings in addition to Unicode/UTF-8.
-Additional API exposure allows subclasses of SearchAgent to be used.
-Support for non English search expression operators (eg. "UND" and "ODER" for "AND" and "OR").
Fixes:
-Issue fixed in crawler making success rate higher.
Compatibility:
Is backwards compatible with 1.0.0

1.0.0

-Initial release.