Error during re-import process. - SearchUnit - Forum


#1 Posted : Sunday, May 27, 2018 2:56:54 PM
Rank: Newbie

Groups: Registered

Joined: 5/27/2018
Posts: 1
Location: London
Hi, my client has just found that this error has been happening for a month or so during the overnight re-import. It had been fine for ages before that.

(In \bin there's Keyoti.RapidSpellWeb.ASP.NETv2.dll with version and other dlls such as Keyoti4.SearchEngine.Core.dll with version 2012.5.14.423.)

Here's the error reported in KEYOTI_WINDOWS_SERVICE.txt:

05/26/2018 00:00 Reimport started
05/26/2018 02:21 Exception caught in the import process: Invalid Unicode code point found at index 3.
Parameter name: strInput StackTrace: at Keyoti.SearchEngine.Index.DocumentIndex.a(Document A_0, a A_1)
at Keyoti.SearchEngine.Index.DocumentIndex.a(Object A_0, AddDocumentEventArgs A_1)
at Keyoti.SearchEngine.Index.IndexableSources.IndexableSource.OnAddedDocument(AddDocumentEventArgs e)
at Keyoti.SearchEngine.Index.IndexableSources.FileSysDocumentStore.a(IndexableSourceRecord A_0, ArrayList A_1, String A_2)
at Keyoti.SearchEngine.Index.IndexableSources.FileSysDocumentStore.a(Boolean A_0, ArrayList A_1, ArrayList A_2, String A_3, ArrayList A_4, ArrayList A_5, String A_6, Boolean A_7, IndexableSourceRecord A_8, ArrayList A_9)
at Keyoti.SearchEngine.Index.IndexableSources.FileSysDocumentStore.a(Boolean A_0, ArrayList A_1, ArrayList A_2, String A_3, ArrayList A_4, ArrayList A_5, String A_6, Boolean A_7, IndexableSourceRecord A_8, ArrayList A_9)
at Keyoti.SearchEngine.Index.IndexableSources.FileSysDocumentStore.Import(Boolean isRecursive, ArrayList targetMatches, ArrayList noRecurseMatches, String loginURL, IndexableSourceRecord record)
at Keyoti.SearchEngine.Index.IndexableSources.FileSystemFolderBasedSource.Import()
at Keyoti.SearchEngine.Index.DocumentIndex.Import(IndexableSourceRecord indexableSourceRecord, IIndexableSource dataSource)
at Keyoti.SearchEngine.WindowsService.Task.a(Object A_0, ElapsedEventArgs A_1)
05/26/2018 02:21 Caused by reader threshold: False
05/26/2018 17:04 Service task (#0) started on index directory: E:\IndexDirectory
05/26/2018 17:04 Start hour is set for 0 in configuration.xml
05/26/2018 17:04 The first crawl/build should be performed after 415.547723733333 minutes

Any idea what might be going on?
#2 Posted : Monday, May 28, 2018 5:34:59 PM
Rank: Advanced Member

Groups: Administrators, Registered

Joined: 8/13/2004
Posts: 2,669
Location: Canada
Hi John, this means that it's trying to index a file which contains an invalid sequence of bytes (invalid with regard to Unicode encoding).

It shouldn't be stopping the indexing, though; it will just move past that file.
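
To illustrate what an "invalid sequence of bytes" looks like in practice, here is a rough Python sketch (not part of SearchUnit) that walks a folder and reports files that fail strict decoding. It assumes the content is meant to be UTF-8, which may not match how the indexer detects encodings; the function names and the folder you point it at are hypothetical.

```python
from pathlib import Path


def first_invalid_offset(data: bytes, encoding: str = "utf-8"):
    """Return the byte offset of the first undecodable sequence, or None."""
    try:
        data.decode(encoding, errors="strict")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


def find_undecodable_files(root, pattern="*", encoding="utf-8"):
    """Yield (path, offset) for files under root that fail strict decoding.

    Assumption: every matched file is expected to be text in `encoding`.
    Binary formats will be reported too, so narrow `pattern` as needed.
    """
    for path in sorted(Path(root).rglob(pattern)):
        if path.is_file():
            offset = first_invalid_offset(path.read_bytes(), encoding)
            if offset is not None:
                yield path, offset
```

Calling `list(find_undecodable_files(folder, "*.htm*"))` on the folder being indexed would give a shortlist of suspects. The offset is a byte position, so it will not necessarily equal the "index 3" in the exception, which is more likely a character position within a string.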

If you want to dig deeper, the first thing to do is figure out which file the problem is occurring in. You could enable logging in the configuration (if it isn't already enabled), let it try to index again, and then collect all the .txt files from E:\IndexDirectory. Document.txt or one of the *Parser.txt log files will probably contain an exception similar to the one above, and from that you can determine which file has the problem.
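
If there are a lot of log files, something like the following Python sketch could save some scrolling. It searches every .txt file in a folder for the exception text and keeps the few lines before each hit. The log layout is an assumption here: the preceding lines may or may not name the document being processed, and `find_log_matches` is just an illustrative helper.

```python
from pathlib import Path


def find_log_matches(log_dir, needle="Invalid Unicode code point", context=3):
    """Return (log file name, line number, preceding lines) for each match."""
    matches = []
    for log_path in sorted(Path(log_dir).glob("*.txt")):
        # errors="replace" so a damaged log file cannot stop the scan.
        text = log_path.read_text(encoding="utf-8", errors="replace")
        lines = text.splitlines()
        for i, line in enumerate(lines):
            if needle in line:
                before = lines[max(0, i - context):i]
                matches.append((log_path.name, i + 1, before))
    return matches
```

Running `find_log_matches(r"E:\IndexDirectory")` after the next re-import and printing each tuple should show which log recorded the exception and what was logged just before it.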

-your feedback is helpful to other users, thank you!


Copyright © 2002- Keyoti Inc.