
Event Handler to Boost Weight of a Document - SearchUnit - Forum

danielvbdash
#1 Posted : Wednesday, January 14, 2015 10:53:54 PM
Rank: Member

Groups: Registered

Joined: 1/14/2015
Posts: 12
Hi everyone

I followed this article but had no success: http://keyoti.com/products/search/dotNetWeb/HtmlHelp2012/UserGuide/Examples/Boosting%20Page%20&%20Section%20Weight.htm

I created 2 files:
./important/a.html
./b.html

a.html content:
holiday 2015 test

b.html content:
holiday 2015 test

I created a project using the Plug-in 'Create a Project' option.

Per the User Guide example, a.html should appear as the first result, but it does not.

My code:
Code:

        public void dispatcher_Action(object sender, ActionEventArgs e)
        {
            //Log everything - comment this line after debugging to optimize speed.
            Keyoti.SearchEngine.DataAccess.Log.WriteLogEntry("Plug-in Template Project", e.ActionData.Name.ToString(), conf);
            Document document;
            if (e.ActionData.Name == ActionName.CalculateWordRelevancies)
            {
                System.Collections.ArrayList words = e.ActionData.Data as System.Collections.ArrayList;
                document = sender as Document;
                if (document.Uri.AbsolutePath.Contains("/important/"))
                {
                    foreach (Word word2 in words)
                    {
                        word2.Weight *= 2; //boost the weight of all words by 2, if they're in the /important/ subdir.
                    }
                }
            }
        }
Jim
#2 Posted : Thursday, January 15, 2015 1:12:18 AM
Rank: Advanced Member

Groups: Administrators, Registered

Joined: 8/13/2004
Posts: 2,667
Location: Canada
Hi, if you enable logging, you should see (after indexing) a file called "Plug-in Template Project.txt" in the index directory. Do you see it, and if so, what does it contain?

If you don't see it, the plugin is not working at all. In that case, please look for a file called CentralEventHandler.txt and send its contents (the last lines will probably include an error).

If you do see "Plug-in Template Project.txt", then I would add extra log lines in that plugin method, to output things like document.Uri.AbsolutePath.

Also, to debug, you can open your plugin project and use the VS Debug menu to attach to the indexer process. Put a breakpoint in your method (above) and it should be hit during indexing; you can then step through and see what is going on.
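
One thing worth checking when logging document.Uri.AbsolutePath is that string.Contains is case-sensitive, so a folder named "Important" would not match "/important/". The following is only a standalone sketch of that check using System.Uri; the helper name IsUnderFolder and the example URLs are assumptions for illustration and are not part of the Keyoti API.

```csharp
using System;

public static class PathCheckSketch
{
    // Hypothetical helper (not part of the Keyoti API): returns true when the
    // URI path contains the given folder segment, ignoring case.
    public static bool IsUnderFolder(Uri uri, string folderSegment)
    {
        return uri.AbsolutePath.IndexOf(folderSegment, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    public static void Main()
    {
        // Example URLs are assumptions for illustration only.
        Uri a = new Uri("http://localhost/site/important/a.html");
        Uri b = new Uri("http://localhost/site/b.html");
        Uri c = new Uri("http://localhost/site/Important/c.html");

        Console.WriteLine(a.AbsolutePath);                          // /site/important/a.html

        // The case-sensitive check used in the plugin code above.
        Console.WriteLine(a.AbsolutePath.Contains("/important/"));  // True
        Console.WriteLine(c.AbsolutePath.Contains("/important/"));  // False (capital I)

        // The case-insensitive variant.
        Console.WriteLine(IsUnderFolder(c, "/important/"));         // True
        Console.WriteLine(IsUnderFolder(b, "/important/"));         // False
    }
}
```

If the logged path differs from "/important/" only in letter case, the case-insensitive comparison would be the safer choice.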



-your feedback is helpful to other users, thank you!



danielvbdash
#3 Posted : Thursday, January 15, 2015 11:43:01 AM
Hey Jim, thanks.

But I am still having problems with this.
Here is my code:

Code:

    public class ExternalEventHandler
    {
        IEventDispatcher dispatcher;
        Configuration conf;

        public ExternalEventHandler(IEventDispatcher dispatcher, Configuration conf)
        {
            Keyoti.SearchEngine.DataAccess.Log.WriteLogEntry("Plug-in Template Project", "Initialized 2", conf);
            dispatcher.Action += new ActionEventHandler(dispatcher_Action);
            dispatcher.NeedObject += new NeedObjectEventHandler(dispatcher_NeedObject);
            this.dispatcher = dispatcher;
            this.conf = conf;
        }

        public void DetachHandlers()
        {
            if (dispatcher != null)
            {
                dispatcher.Action -= new ActionEventHandler(dispatcher_Action);
                dispatcher.NeedObject -= new NeedObjectEventHandler(dispatcher_NeedObject);
            }
        }

        public void dispatcher_Action(object sender, ActionEventArgs e)
        {

            Keyoti.SearchEngine.DataAccess.Log.WriteLogEntry("Plug-in Template Project", "Action name: " +e.ActionData.Name.ToString(), conf);
            Document document;
            if (e.ActionData.Name == ActionName.CalculateWordRelevancies)
            {
                Keyoti.SearchEngine.DataAccess.Log.WriteLogEntry("Plug-in Template Project", "entered IF", conf);
                System.Collections.ArrayList words = e.ActionData.Data as System.Collections.ArrayList;
                document = sender as Document;
                Keyoti.SearchEngine.DataAccess.Log.WriteLogEntry("Plug-in Template Project", document.Uri.AbsolutePath, conf);
                if (document.Uri.AbsolutePath.Contains("/important/"))
                {
                    Keyoti.SearchEngine.DataAccess.Log.WriteLogEntry("Plug-in Template Project", document.Title, conf);
                    foreach (Word word2 in words)
                    {
                        Keyoti.SearchEngine.DataAccess.Log.WriteLogEntry("Plug-in Template Project", "before: " +word2.Weight.ToString(), conf);
                        word2.Weight *= 2; //boost the weight of all words by 2, if they're in the /important/ subdir.
                        Keyoti.SearchEngine.DataAccess.Log.WriteLogEntry("Plug-in Template Project", "after: " + word2.Weight.ToString(), conf);
                    }
                }
            }
        }

        public void dispatcher_NeedObject(object sender, NeedObjectEventArgs e)
        {
            //leave blank unless needed, will use default objects.
        }
    }


Here is my "Plug-in Template Project.txt" after re-importing the source:


01/15/2015 08:24 Initialized
01/15/2015 08:25 GetIndexableSourceRecords
01/15/2015 08:25 ImportStarted
01/15/2015 08:25 GetIndexableSourceRecords
01/15/2015 08:25 RequestingUri
01/15/2015 08:25 ResponseFromServerReceived
01/15/2015 08:25 RequestingUri
01/15/2015 08:25 ResponseFromServerReceived
01/15/2015 08:25 UseDocumentEncoding
01/15/2015 08:25 ReadingText
01/15/2015 08:25 CalculateWordRelevancies


I believe it should be like this:


01/15/2015 08:24 Initialized 2
01/15/2015 08:25 Action Name: GetIndexableSourceRecords
01/15/2015 08:25 Action Name: ImportStarted
01/15/2015 08:25 Action Name: GetIndexableSourceRecords
01/15/2015 08:25 Action Name: RequestingUri
01/15/2015 08:25 Action Name: ResponseFromServerReceived
01/15/2015 08:25 Action Name: RequestingUri
01/15/2015 08:25 Action Name: ResponseFromServerReceived
01/15/2015 08:25 Action Name: UseDocumentEncoding
01/15/2015 08:25 Action Name: ReadingText
01/15/2015 08:25 Action Name: CalculateWordRelevancies


I noticed that sometimes my new code is being logged:

1/15/2015 08:18 ImportFinished
01/15/2015 08:19 QueryExpressionGroupCreated
01/15/2015 08:19 GetWordVariations
01/15/2015 08:19 QueryExpressionGroupCreated
01/15/2015 08:19 GetWordVariations
01/15/2015 08:19 QueryExpressionGroupCreated
01/15/2015 08:19 GetWordVariations
01/15/2015 08:19 QueryExpressionGroupCreated
01/15/2015 08:19 GetWordVariations
01/15/2015 08:19 Initialized 2
01/15/2015 08:19 Action name: QueryExpressionGroupCreated
01/15/2015 08:19 Action name: GetWordVariations
01/15/2015 08:19 Action name: GetIndexableSourceRecords
01/15/2015 08:19 Action name: ResultItemsFinalized


How can I add images to a post here?

Thank you in advance
Jim
#4 Posted : Thursday, January 15, 2015 2:07:00 PM
You could use http://imgur.com and post a link to the img.

Can you create a fresh folder (index directory) and copy all .XML files and the plugin DLL (if it is in your current index dir) to the new folder? Then run the indexer import.

Could you also email me all of the .txt files from the index dir, via support at keyoti.com?

The reason I ask is that what you've posted in the logs is very odd. I can't imagine how you would get "Initialized" instead of "Initialized 2" unless the plugin was somehow switched to the old plugin version. What is odder is that it happens AFTER the log of "Initialized 2".

Thanks
Jim



danielvbdash
#5 Posted : Sunday, January 18, 2015 7:54:06 PM
Hello Jim,

I tried creating a fresh folder for the index directory, copied all the .XML files and then ran the Re-import indexer, but it is still not working.

I sent you all the .txt files and the VS project by e-mail. All files are from this fresh folder.

My Keyoti version is 2012.5.14.423.

Thanks in advance.
Jim
#6 Posted : Monday, January 19, 2015 12:44:44 AM
Thanks. It's still logging "Initialized", so I think the only explanation is that it is using an older plugin DLL.

-Are you sure the one at ....IndexDirectory\Plugins\Ranking\CS-PluginTemplate.dll is up to date? Full path is in "CentralEventDispatcher.txt" log file.

-The assembly version is set to [assembly: AssemblyVersion("1.0.0.0")] in AssemblyInfo. Have you tried setting a new version number?

-Do you have the plugin DLL in your GAC too? If so, you must change the version number, otherwise it will just use the older version.
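
To confirm which build of the plugin DLL is actually sitting in the index directory, a small console check along these lines may help. It is only a sketch: the class and method names are hypothetical, and it relies solely on the standard .NET calls AssemblyName.GetAssemblyName and File.GetLastWriteTime, not on anything in the Keyoti API. Pass the full DLL path as the first argument; with no argument it inspects its own assembly just to show the output format.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class PluginVersionCheck
{
    // Hypothetical helper: reads the assembly name and version from the file
    // metadata without loading the assembly into the current process.
    public static string Describe(string dllPath)
    {
        AssemblyName name = AssemblyName.GetAssemblyName(dllPath);
        DateTime modified = File.GetLastWriteTime(dllPath);
        return name.Name + " version " + name.Version + ", last written " + modified.ToString("u");
    }

    public static void Main(string[] args)
    {
        // With no argument, fall back to this program's own assembly.
        string path = args.Length > 0
            ? args[0]
            : typeof(PluginVersionCheck).Assembly.Location;

        Console.WriteLine(Describe(path));
    }
}
```

If the version printed for the DLL in the index directory still matches an older build, or the last-written time predates your latest compile, the indexer is most likely picking up a stale copy.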

Thanks
Jim





danielvbdash
#7 Posted : Monday, January 19, 2015 8:02:48 AM
I will try your suggestions, Jim.

However, if you search for 'Initialized 2' in the "Plug-in Template Project.txt" file, you will find some occurrences, and they appear after the import has finished.

Can you confirm that?
Jim
#8 Posted : Monday, January 19, 2015 2:58:48 PM
Not in the log you sent yesterday. There are only entries for "Initialized"; I just double-checked.



danielvbdash
#9 Posted : Wednesday, January 21, 2015 10:22:15 PM
Hi Jim,

You were right, the problem was the Assembly version.
Now the plugin seems to be running pretty well, but... My documents are still not appearing as first results.

I sent you an e-mail with the files attached.

Thanks.
Jim
#10 Posted : Thursday, January 22, 2015 12:07:29 AM
You are doubling the weight of words in docs under the /important/ path, but the PDF or other docs can still rank higher if the word appears in them more than twice as often.

Maybe try word2.Weight *= 20; instead.

The test search form in the index manager will show you the weights of each document.
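
As a rough illustration of why a factor of 2 may not be enough, here is a toy calculation. It assumes, purely for illustration, that a document's score for a word is occurrences times weight times boost; this is NOT the real ranking formula used by the search engine, and the names and numbers are made up.

```csharp
using System;

public static class BoostArithmetic
{
    // Toy model (an assumption, not the engine's formula):
    // score = occurrences * baseWeight * boost.
    public static double Score(int occurrences, double baseWeight, double boost)
    {
        return occurrences * baseWeight * boost;
    }

    public static void Main()
    {
        // Document under /important/ with the word once, boosted x2.
        double boostedBy2 = Score(1, 1.0, 2.0);
        // Unboosted document where the word appears five times.
        double other = Score(5, 1.0, 1.0);
        // Same important document, boosted x20 instead.
        double boostedBy20 = Score(1, 1.0, 20.0);

        Console.WriteLine(boostedBy2 > other);   // False: 2 is less than 5
        Console.WriteLine(boostedBy20 > other);  // True: 20 is greater than 5
    }
}
```

Under this simplified model, the boost only wins once it exceeds the ratio of occurrences between the competing documents.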

BTW can you index things that you do not have to white out, so I can see what words are in the docs?

Thanks
Jim



danielvbdash
#11 Posted : Thursday, January 22, 2015 12:14:40 PM
Very good Jim!

It worked :)

Many thanks for the support!