|
Rank: Advanced Member
Groups: Registered
Joined: 9/1/2010 Posts: 136
|
In my DispatcherAction event handler, for the CalculateWordRelevancies action, I need to be able to identify different words in the word collection as having differing weights. Presently I am doing it with the code shown below, where I am using several different markers before and after sections of data, and all words between a pair of markers get the same weight, based on the type of markers. Presumably these markers will never be search words. Is there a better way to delimit the data or to accomplish what I'm trying to do? Thanks! Here is a snippet of my code:
Code:
Dim AuthorStartMark As String = "##AS##"
Dim AuthorEndMark As String = "##AE##"
Dim LanguageStartMark As String = "##LS##"
Dim LanguageEndMark As String = "##LE##"
Dim TitleStartMark As String = "##TS##"
Dim TitleEndMark As String = "##TE##"

Dim ProdConfig As New Configuration
ProdConfig.EventHandlerAssemblyPath = IndexPluginFilePath
ProdConfig.IndexDirectory = IndexFolderPath
ProdConfig.Logging = True

Dim ConfigMgr As ConfigurationManager = New ConfigurationManager(IndexFolderPath)
ConfigMgr.SaveSettings(ProdConfig)

Dim ProdIndex As New Index.DocumentIndex(ProdConfig)

'Loop on products to be indexed
For I = 0 To NumProducts - 1
    TextStr = ""
    TextStr &= TitleStartMark & " " & TitleStr & " " & TitleEndMark & " "
    TextStr &= AuthorStartMark & " " & AuthorStr & " " & AuthorEndMark & " "
    TextStr &= LanguageStartMark & " " & LanguageStr & " " & LanguageEndMark & " "
    TextStr &= CoverStr & " "
    TextStr &= OtherFeatures
    SummaryStr = CatalogText
    CustomData = ProductID.ToString
    ProdIndex.AddDocument(New Documents.PreloadedDocument(New Uri(NavigateUrl), TitleStr, TextStr, SummaryStr, Nothing, Nothing, Nothing, CustomData, ProdConfig))
Next I

ProdIndex.Optimize()
ProdIndex.Close()
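The marker-walking logic described in this post (every word between a start and end marker gets that section's weight) can be sketched independently of the search library. This is only an illustration under stated assumptions: it works on a plain list of strings rather than the word collection type the handler actually receives, the weight values are made up, and it assumes each marker reaches the handler as a single intact word. The names MarkerWeightSketch and AssignWeights are hypothetical.

```vbnet
Imports System
Imports System.Collections.Generic

Module MarkerWeightSketch

    ' Hypothetical weights per start marker; the real values are up to you.
    ' Case-insensitive lookup, in case the indexer changes the case of words.
    Private ReadOnly SectionWeights As New Dictionary(Of String, Double)(StringComparer.OrdinalIgnoreCase) From {
        {"##TS##", 3.0},
        {"##AS##", 2.0},
        {"##LS##", 1.5}
    }

    ' Any end marker switches back to the default weight.
    Private ReadOnly EndMarks As New HashSet(Of String)(StringComparer.OrdinalIgnoreCase) From {
        "##TE##", "##AE##", "##LE##"
    }

    Private Const DefaultWeight As Double = 1.0

    ' Returns one weight per input word. Marker words themselves get 0
    ' so they never contribute to relevance.
    Function AssignWeights(words As IList(Of String)) As Double()
        Dim weights(words.Count - 1) As Double
        Dim current As Double = DefaultWeight

        For i As Integer = 0 To words.Count - 1
            Dim sectionWeight As Double
            If SectionWeights.TryGetValue(words(i), sectionWeight) Then
                current = sectionWeight      ' entering a weighted section
                weights(i) = 0.0
            ElseIf EndMarks.Contains(words(i)) Then
                current = DefaultWeight      ' leaving the section
                weights(i) = 0.0
            Else
                weights(i) = current         ' ordinary word
            End If
        Next

        Return weights
    End Function

    Sub Main()
        Dim words As String() = "##TS## Moby Dick ##TE## ##AS## Herman Melville ##AE## hardcover".Split(" "c)
        Dim weights As Double() = AssignWeights(words)
        For i As Integer = 0 To words.Length - 1
            Console.WriteLine(words(i) & " -> " & weights(i).ToString())
        Next
    End Sub

End Module
```

In the real handler, the computed weight would be applied to whatever per-word relevance value the CalculateWordRelevancies action exposes; that part is not shown because the handler's types do not appear in this thread.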
|
|
Rank: Advanced Member
Groups: Administrators, Registered
Joined: 8/13/2004 Posts: 2,669 Location: Canada
|
In the current version, the recommended way is section weight boosting: https://keyoti.com/produ...%20Section%20Weight.htm
The boost markers described there are HTML comments. If you're not using the current version, then what you have is probably fine. If you're concerned about people accidentally searching for ##TE##, then probably the simplest way to avoid that is to make the markers longer, e.g. ##WordWeightMarker_AS##, which is even less likely to be searched for by accident.
Jim -your feedback is helpful to other users, thank you!
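One way to keep longer markers consistent is to build each start/end pair from a shared prefix instead of typing six separate string literals. The following is a minimal sketch, assuming the ##WordWeightMarker_AS## naming from this reply, where the first letter is the section code and the trailing S or E means start or end. The module and function names are hypothetical, and whether the underscore and mixed case survive the indexer's tokenization is an assumption to verify.

```vbnet
Imports System

Module WeightMarkerNames

    ' Shared prefix and suffix, following the ##WordWeightMarker_AS## example.
    Private Const MarkerPrefix As String = "##WordWeightMarker_"
    Private Const MarkerSuffix As String = "##"

    ' sectionCode is a short code such as "A" (author), "L" (language) or "T" (title).
    Function StartMark(sectionCode As String) As String
        Return MarkerPrefix & sectionCode & "S" & MarkerSuffix
    End Function

    Function EndMark(sectionCode As String) As String
        Return MarkerPrefix & sectionCode & "E" & MarkerSuffix
    End Function

    ' Wraps a piece of text in its start and end markers, space separated,
    ' in the same layout used when building TextStr.
    Function Wrap(sectionCode As String, text As String) As String
        Return StartMark(sectionCode) & " " & text & " " & EndMark(sectionCode) & " "
    End Function

    Sub Main()
        Console.WriteLine(Wrap("A", "Herman Melville"))
    End Sub

End Module
```

With helpers like these, the indexing loop could append Wrap("T", TitleStr), Wrap("A", AuthorStr) and Wrap("L", LanguageStr) to TextStr, and the event handler would compare words against the same StartMark and EndMark values.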
|
|
Rank: Advanced Member
Groups: Registered
Joined: 9/1/2010 Posts: 136
|
Thanks, Jim! That's helpful. In my case, the words to be indexed are coming entirely from my string variable TextStr, not from an HTML or XML file. Still, could I format TextStr as an HTML snippet, with a "<p>" tag at the beginning and a "</p>" tag at the end, and thereby use the HTML comment method of embedding different weighting factors?
Dan
|
|
Rank: Advanced Member
Groups: Administrators, Registered
Joined: 8/13/2004 Posts: 2,669 Location: Canada
|
Sorry Dan, I didn't look at your code closely enough to see you were using PreloadedDocument. Unfortunately the boost comments I mentioned won't work with it. Using your existing method is the way to go.
Jim -your feedback is helpful to other users, thank you!
|
|