Identify word weights to CalculateWordRelevancies - SearchUnit

DMacy

#1 Posted : Friday, June 9, 2017 8:21:15 PM

Rank: Advanced Member

Groups: Registered

Joined: 9/1/2010
Posts: 136

In my DispatcherAction event handler, for the CalculateWordRelevancies action, I need to assign different weights to different words in the word collection. At present I do this with the code shown below: I wrap sections of data in pairs of marker strings, and every word between a pair of markers gets the same weight, determined by the marker type. Presumably these markers will never be search words. Is there a better way to delimit the data or to accomplish what I'm trying to do?

Thanks!

Here is a snippet of my code:

Code:

Dim AuthorStartMark As String = "##AS##"
Dim AuthorEndMark As String = "##AE##"
Dim LanguageStartMark As String = "##LS##"
Dim LanguageEndMark As String = "##LE##"
Dim TitleStartMark As String = "##TS##"
Dim TitleEndMark As String = "##TE##"

Dim ProdConfig As New Configuration
ProdConfig.EventHandlerAssemblyPath = IndexPluginFilePath
ProdConfig.IndexDirectory = IndexFolderPath
ProdConfig.Logging = True

Dim ConfigMgr As ConfigurationManager = New ConfigurationManager(IndexFolderPath)
ConfigMgr.SaveSettings(ProdConfig)

Dim ProdIndex As New Index.DocumentIndex(ProdConfig)

'Loop on products to be indexed
For I = 0 To NumProducts - 1

    TextStr = ""
    TextStr &= TitleStartMark & " " & TitleStr & " " & TitleEndMark & " "
    TextStr &= AuthorStartMark & " " & AuthorStr & " " & AuthorEndMark & " "
    TextStr &= LanguageStartMark & " " & LanguageStr & " " & LanguageEndMark & " "
    TextStr &= CoverStr & " "
    TextStr &= OtherFeatures

    SummaryStr = CatalogText
    CustomData = ProductID.ToString

    ProdIndex.AddDocument(New Documents.PreloadedDocument(New Uri(NavigateUrl), TitleStr, TextStr, SummaryStr, Nothing, Nothing, Nothing, CustomData, ProdConfig))

Next I

ProdIndex.Optimize()
ProdIndex.Close()
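
The handler side is not shown above, so here is a minimal sketch of the marker-walking logic described in the question, assuming the words arrive in document order as plain strings. It deliberately avoids SearchUnit types, whose members are not shown in this thread. The names `AssignWeights` and `Sections`, the weight values, and the case-insensitive marker matching are all assumptions for illustration, not part of the original code.

```vbnet
Imports System
Imports System.Collections.Generic

Module MarkerWeightSketch

    ' Weight for words outside any marked section (placeholder value).
    Private Const DefaultWeight As Double = 1.0

    ' Maps each start marker to its end marker and the weight for the words in between.
    ' The weights are hypothetical. Matching ignores case in case the indexer changes it.
    Private ReadOnly Sections As New Dictionary(Of String, KeyValuePair(Of String, Double))(StringComparer.OrdinalIgnoreCase) From {
        {"##TS##", New KeyValuePair(Of String, Double)("##TE##", 5.0)},
        {"##AS##", New KeyValuePair(Of String, Double)("##AE##", 3.0)},
        {"##LS##", New KeyValuePair(Of String, Double)("##LE##", 2.0)}
    }

    ' Returns one weight per input word.
    ' Marker words themselves get weight 0 so they do not contribute to relevance.
    Function AssignWeights(words As IList(Of String)) As List(Of Double)
        Dim weights As New List(Of Double)(words.Count)
        Dim currentEnd As String = Nothing          ' end marker we are waiting for, if any
        Dim currentWeight As Double = DefaultWeight

        For Each word As String In words
            Dim found As KeyValuePair(Of String, Double) = Nothing
            If currentEnd Is Nothing AndAlso Sections.TryGetValue(word, found) Then
                ' Entering a marked section.
                currentEnd = found.Key
                currentWeight = found.Value
                weights.Add(0.0)
            ElseIf currentEnd IsNot Nothing AndAlso String.Equals(word, currentEnd, StringComparison.OrdinalIgnoreCase) Then
                ' Leaving the current section.
                currentEnd = Nothing
                currentWeight = DefaultWeight
                weights.Add(0.0)
            Else
                ' Ordinary word: takes the weight of the section it is in.
                weights.Add(currentWeight)
            End If
        Next

        Return weights
    End Function

    Sub Main()
        Dim words As String() = {"##TS##", "moby", "dick", "##TE##", "##AS##", "melville", "##AE##", "hardcover"}
        Dim weights As List(Of Double) = AssignWeights(words)
        For i As Integer = 0 To words.Length - 1
            Console.WriteLine(words(i) & " -> " & weights(i).ToString())
        Next
    End Sub

End Module
```

Keeping the marker table in one dictionary means adding a new section type is a one-line change. How the resulting weights are written back in the CalculateWordRelevancies action depends on the SearchUnit API and is left out here.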

Jim

#2 Posted : Sunday, June 11, 2017 5:29:29 AM

Rank: Advanced Member

Groups: Administrators, Registered

Joined: 8/13/2004
Posts: 2,669
Location: Canada

In the current version, the recommended way is section weight boosting: https://keyoti.com/produ...%20Section%20Weight.htm
The boosts are written as HTML comments in the document.

If you're not using the current version, then what you have is probably fine. If you're concerned about people accidentally searching for ##TE##, probably the simplest way to avoid that is to make the markers longer, e.g. ##WordWeightMarker_AS##, which is even less likely to be searched for by accident.
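
As a rough sketch of the longer-marker idea, the markers could be built from a shared prefix so they stay consistent between indexing and the event handler. The names `MarkerPrefix`, `MakeMarker`, and `WrapSection` are hypothetical helpers, not part of SearchUnit.

```vbnet
Imports System

Module MarkerNamingSketch

    ' Hypothetical prefix; any long, unlikely string would do.
    Private Const MarkerPrefix As String = "##WordWeightMarker_"

    ' Builds a marker such as ##WordWeightMarker_AS## from a short code like "AS".
    Function MakeMarker(code As String) As String
        Return MarkerPrefix & code & "##"
    End Function

    ' Wraps a section's text in its start and end markers,
    ' space-separated in the same way as the indexing loop in the first post.
    Function WrapSection(sectionCode As String, text As String) As String
        Return MakeMarker(sectionCode & "S") & " " & text & " " & MakeMarker(sectionCode & "E") & " "
    End Function

    Sub Main()
        ' Prints: ##WordWeightMarker_TS## Moby Dick ##WordWeightMarker_TE##
        Console.WriteLine(WrapSection("T", "Moby Dick"))
    End Sub

End Module
```

With helpers like these, a line such as `TextStr &= WrapSection("T", TitleStr)` would replace the hand-concatenated marker strings, and the handler would compare against `MakeMarker("TS")` and `MakeMarker("TE")`.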

Jim

DMacy

#3 Posted : Monday, June 12, 2017 2:49:43 PM

Thanks, Jim! That's helpful. In my case, the words to be indexed are coming entirely from my string variable TextStr, not from an HTML or XML file. Still, could I format TextStr as an HTML snippet, with a "<p>" tag at the beginning and a "</p>" tag at the end, and thereby use the HTML comment method of embedding different weighting factors?

Dan

Jim

#4 Posted : Tuesday, June 13, 2017 4:57:17 AM

Sorry Dan, I didn't look at your code closely enough to see you were using PreloadedDocument. Unfortunately the boost comments I mentioned won't work with it.
Using your existing method is the way to go.
Jim

DMacy

#5 Posted : Tuesday, June 13, 2017 11:44:57 AM

Okay. Thanks!

Dan
