Typically the SearchUnit indexer indexes content from sources such as the web (by crawling), databases, and the filesystem. However, it cannot automatically merge meta-data held in a database with the documents that the meta-data describes.
To do that we need to write a little code that pulls the data together and applies it as needed.
There is a complete project here for MVC; however, the approach applies equally to Web Forms or other non-web projects. In the project there are two areas to focus on: "SearchIndexController.cs", which does the indexing (for brevity the controller contains the actual logic), and Views\SearchIndex\Index.cshtml, which holds the view. Everything else in the project is regular plumbing code.
Also, for simplicity this project indexes the documents every time the page loads. In reality you would probably want to index when documents are added or changed. If you do index from a web page, it is better to index the documents asynchronously, so that the page request is not held up while indexing runs.
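As a rough sketch of the asynchronous approach, the helper below runs an indexing delegate on a worker thread and skips the request if a run is already in progress. BackgroundIndexer and TryStart are hypothetical names, not part of SearchUnit or the sample project. Task.Run is used only to keep the example self-contained; in an ASP.NET application a host-aware mechanism such as HostingEnvironment.QueueBackgroundWorkItem may be a better fit.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of SearchUnit): runs indexing work in the
// background and prevents two indexing runs from overlapping.
public static class BackgroundIndexer
{
    // 0 = idle, 1 = an indexing run is in progress.
    private static int running;

    // Starts indexWork on a worker thread unless a run is already in progress.
    // Returns the running task, or null when the request was skipped.
    public static Task TryStart(Action indexWork)
    {
        if (indexWork == null) throw new ArgumentNullException("indexWork");

        // Atomically flip the flag from 0 to 1; any other value means busy.
        if (Interlocked.CompareExchange(ref running, 1, 0) != 0)
            return null;

        return Task.Run(() =>
        {
            try { indexWork(); }
            finally { Interlocked.Exchange(ref running, 0); } // always release the flag
        });
    }
}
```

From the controller this could be called as BackgroundIndexer.TryStart(() => IndexDocuments()), where IndexDocuments is a hypothetical method holding the indexing loop shown later in this article.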
Indexing using meta-data
In this example a simple List<> is used to hold the
meta-data, but it could also have come from a database.
public ActionResult Index()
{
    // Build the application's base URL. Trim any trailing "/" so the
    // path appended below is not joined with a double slash.
    string appURL = string.Format("{0}://{1}{2}", Request.Url.Scheme,
        Request.Url.Authority, Url.Content("~")).TrimEnd('/');

    /* NOTE this sample depends upon external URLs, please double check
       the URLs are accessible before attempting to index them. */

    // Create some meta data that we want to index - typically this would come from a database.
    var documentsToIndex = new List<DocumentMetaData>();
    documentsToIndex.Add(new DocumentMetaData {
        Author = "Crowd",
        Url = "https://en.wikipedia.org/wiki/John_Smith",
        Type = "Web page",
        Description = "Wikipedia page about John Smith",
        Title = "Wikipedia article about John Smith" });
    documentsToIndex.Add(new DocumentMetaData {
        Author = "Unknown Author",
        Url = appURL + "/docs/1.pdf",
        Type = "PDF",
        Description = "PDF about the explorer, John Smith",
        Title = "PDF about John Smith" });
    // Create a configuration object using the index directory path
    // where we want to store the index files.
    var config = new Keyoti.SearchEngine.Configuration {
        IndexDirectory = System.Web.Hosting.HostingEnvironment.MapPath("~/App_Data/Index") };

    // Force the documents to be reindexed even if they haven't changed, just for testing.
    config.IgnoreLastModifiedDate = true;
    config.UseFileSizeToIdentifyChange = false;

    DocumentIndex documentIndex = null;
    try
    {
        documentIndex = new DocumentIndex(config);

        // Iterate our meta data, and index the documents
        foreach (DocumentMetaData dm in documentsToIndex)
        {
            var doc = new Keyoti.SearchEngine.Documents.Document(dm.Url, config);
            var dt = doc.ReadText();

            // Add our meta data to the indexed content
            dt.AppendText(dm.Author + " ", config);
            dt.AppendText(dm.Description + " ", config);
            doc.Title = dm.Title;

            // Add extra data to the CustomData so we can use it as the results are generated
            doc.CustomData = "Title=" + dm.Title + "&Type=" + dm.Type;
            documentIndex.AddDocument(doc);
        }
    }
    finally
    {
        // Always close the index, even if indexing a document fails
        if (documentIndex != null)
            documentIndex.Close();
    }
    return View();
}
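The DocumentMetaData type used in the controller is not listed in this article. A minimal sketch of what it might look like, assuming plain string properties named after the object initializers in the controller code (the actual class in the sample project may differ):

```csharp
// Assumed shape of the meta-data record. The property names are taken from
// the object initializers in the controller; the types are a guess.
public class DocumentMetaData
{
    public string Author { get; set; }      // appended to the indexed text
    public string Url { get; set; }         // location of the document to index
    public string Type { get; set; }        // file type shown in the results
    public string Description { get; set; } // appended to the indexed text
    public string Title { get; set; }       // overrides the document's own title
}
```

When the meta-data comes from a database, each row would be mapped to one of these objects before the indexing loop runs.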
In the controller code above, the meta-data is appended to the indexed text using:
    var doc = new Keyoti.SearchEngine.Documents.Document(dm.Url, config);
    var dt = doc.ReadText();
    // Add our meta data to the indexed content
    dt.AppendText(dm.Author + " ", config);
    dt.AppendText(dm.Description + " ", config);
We also want to use the title from the meta-data:
    doc.Title = dm.Title;
By setting the .CustomData property to a formatted string (based on URL encoding), we can pull out the file type (Type) as the results are shown:
    doc.CustomData = "Title=" + dm.Title + "&Type=" + dm.Type;
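Plain concatenation works for the sample values, but a title containing "&" or "=" would be split into the wrong fields. One way to guard against that is to escape each key and value before joining them. The sketch below uses only standard .NET calls; CustomDataFormatter is a hypothetical helper, and it assumes SearchUnit decodes escaped values in the usual URL-encoding way, which should be verified before relying on it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of SearchUnit) for building and reading
// "key=value&key=value" strings with escaped keys and values.
public static class CustomDataFormatter
{
    // Escapes each key and value so '&' and '=' inside them
    // cannot be mistaken for separators.
    public static string Build(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value ?? "")));
    }

    // Reverses Build: splits on '&', then on the first '=', and unescapes.
    public static Dictionary<string, string> Parse(string customData)
    {
        var result = new Dictionary<string, string>();
        if (string.IsNullOrEmpty(customData)) return result;

        foreach (var pair in customData.Split('&'))
        {
            var parts = pair.Split(new[] { '=' }, 2);
            var key = Uri.UnescapeDataString(parts[0]);
            var value = parts.Length > 1 ? Uri.UnescapeDataString(parts[1]) : "";
            result[key] = value;
        }
        return result;
    }
}
```

In the indexing loop this could replace the concatenation, for example doc.CustomData = CustomDataFormatter.Build(new Dictionary<string, string> { { "Title", dm.Title }, { "Type", dm.Type } });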
To show the results, some templating is required in order to show the file type:
<div id="sew_searchResultControl">
    <div id="sew_resultHeader">
    </div>
    <div id="sew_resultList">
        <div id="sew_resultItemTEMPLATE" class="sew_resultItem">
            <span class="sew_resultItemLink"><a href="${UriStringWithKeywords}">${Title}</a></span>
            <span class="sew_resultItemSummary">${Summary}</span>
            <span class="sew_previewResultWrapper">
                <img alt="Click to preview the document text"
                     src="/Keyoti_SearchEngine_Web_Common/ResultPreview_Expander_Closed.png"
                     onclick="keyotiSearchResultPreviewer.toggleResultPreview(this,
                         '${UriStringAsStored}',
                         '/Keyoti_SearchEngine_Web_Common/ResultPreview_Expander_Closed.png',
                         '/Keyoti_SearchEngine_Web_Common/ResultPreview_Expander_Opened.png')"
                />
                <span class="sew_previewResultContent">Loading document...</span>
            </span>
            <div style="clear:both; height:1px;"></div>
            <span class="sew_resultItemURL">${UriString}</span>
            <span class="sew_location">${Location}</span>
            <span class="sew_location">${Content}</span>
            <span class="${CustomDataDictionary.TypeDisplayClass}">File type: ${CustomDataDictionary.Type}</span>
        </div>
    </div>
    <div id="sew_resultFooter"></div>
</div>