Keyoti SearchUnit API Docs
GenericIFilterParser Class
Namespace: Keyoti.SearchEngine.Documents
Keyoti SearchUnit v6
A generic, IFilter-based document parser that can read text from any document type for which an IFilter is installed on the system.
Declaration Syntax
C#
public class GenericIFilterParser : TxtDocumentParser
Visual Basic
Public Class GenericIFilterParser
	Inherits TxtDocumentParser
Visual C++
public ref class GenericIFilterParser : public TxtDocumentParser
F#
type GenericIFilterParser =
    class
        inherit TxtDocumentParser
    end
Members
Member / Description
GenericIFilterParser(Configuration, String)
Initializes a new instance of the GenericIFilterParser class.

Configuration
Gets the instance of the Configuration class that holds the settings to be used.
(Inherited from Parser.)
CopyStream(Stream) (Inherited from Parser.)
Encoding
The character encoding used in the document Stream, if applicable.
(Inherited from Parser.)
Equals(Object) (Inherited from Object.)
Finalize() (Inherited from Object.)
GetFilenameFooter(Uri)
Creates a footer with filename info from the Uri
(Inherited from Parser.)
GetHashCode() (Inherited from Object.)
GetNextWord(String)
Returns the next 'word' in rawBody. The method is iterative: each subsequent call moves on to the following word.
(Inherited from Parser.)
GetType() (Inherited from Object.)
GetWordsInUri(Uri)
Returns the words found in the Uri, as strings in an ArrayList.
(Inherited from Parser.)
IsCurrentWordInTitle()
Returns whether the word last returned by GetNextWord is part of the title.
(Inherited from Parser.)
IsInIgnoredRegion(ArrayList)
Determines whether the current word (at wordStart) is in an ignored region.
(Inherited from Parser.)
IsStreamNeeded() Obsolete.
Indicates whether the parser needs a stream to be passed to it in order to perform a ReadText or ReadLinks operation.
(Inherited from Parser.)
MemberwiseClone() (Inherited from Object.)
ParseWords(String, ArrayList, WordCollection, StringBuilder, ArrayList)
Parses rawBody into discrete Word objects and places them in readDocumentWords.
(Inherited from Parser.)
PreprocessBreakChunk(String)
Applies any required processing to a chunk of text that typically forms either a word or whitespace block.
(Inherited from Parser.)
ProcessWordsToFinalIndexedList(WordCollection, Boolean)
Processes the list of all words found in the document and returns the list that should be indexed.
(Inherited from Parser.)
ProcessWordsToFinalIndexedList(WordCollection, Boolean, ArrayList)
Processes the list of all words found in the document and returns the list that should be indexed.
(Inherited from Parser.)
Read(Stream, Uri, Encoding)
Reads a document and returns an object holding its text and any links.
(Overrides TxtDocumentParser.Read(Stream, Uri, Encoding).)
Read(String, Uri, Encoding)
Reads the document.
(Inherited from TxtDocumentParser.)
ReadLinks(Stream, Encoding) Obsolete.
Reads links to other pages.
(Inherited from Parser.)
ReadText(Stream, Uri, Encoding) Obsolete.
Reads text and returns the list of words and the title.
(Inherited from Parser.)
ResetWordPointers()
Resets the current word being processed.
(Inherited from Parser.)
ToString() (Inherited from Object.)
TruncateWordWithRepeatedChar(String)
Removes repeated non-letters from word.
(Inherited from Parser.)
WordEnd
The current word's end.
(Inherited from Parser.)
WordStart
The current word's start.
(Inherited from Parser.)
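Example (illustrative sketch)
The member list describes GetNextWord as iterative, with WordStart, WordEnd and ResetWordPointers tracking the current position. The standalone C# sketch below shows one way such a contract could behave. It does not use or reproduce the Keyoti implementation: the class name WordScannerSketch, the letter-or-digit definition of a word, the exclusive WordEnd index and the null return at the end of the text are all assumptions made for illustration.

```csharp
using System;

// Hypothetical stand-in, NOT the Keyoti Parser class. It only mimics the
// documented GetNextWord / WordStart / WordEnd / ResetWordPointers contract.
public class WordScannerSketch
{
    // Index of the first character of the word last returned.
    public int WordStart { get; private set; }

    // Index just past the last character of the word last returned
    // (exclusive end is an assumption of this sketch).
    public int WordEnd { get; private set; }

    // Moves the scanner back to the start of the text.
    public void ResetWordPointers()
    {
        WordStart = 0;
        WordEnd = 0;
    }

    // Returns the next run of letters or digits in rawBody, or null when
    // no words remain (the null return is an assumption of this sketch).
    public string GetNextWord(string rawBody)
    {
        int i = WordEnd;

        // Skip any separators that follow the previous word.
        while (i < rawBody.Length && !char.IsLetterOrDigit(rawBody[i])) i++;
        if (i >= rawBody.Length) return null;

        WordStart = i;

        // Consume the word itself.
        while (i < rawBody.Length && char.IsLetterOrDigit(rawBody[i])) i++;
        WordEnd = i;

        return rawBody.Substring(WordStart, WordEnd - WordStart);
    }
}

public static class Program
{
    public static void Main()
    {
        var scanner = new WordScannerSketch();
        string text = "Quarterly sales, 2015";
        string word;

        // Each call continues from where the previous one stopped.
        while ((word = scanner.GetNextWord(text)) != null)
        {
            Console.WriteLine($"{word} [{scanner.WordStart}, {scanner.WordEnd})");
        }
    }
}
```

Because the position is held in the scanner rather than passed in, a caller that wants to rescan the same text would call ResetWordPointers first.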
Remarks
Used for .XLS (Excel) and .PPT (PowerPoint) files.
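Example (usage sketch)
The following C# sketch shows how the constructor and the Read(Stream, Uri, Encoding) override listed above might be called for such a file. It relies only on the signatures shown on this page and needs a reference to the Keyoti4.SearchEngine.Core assembly to compile. Several points are assumptions: that Configuration lives in the Keyoti.SearchEngine namespace, that the caller already has a Configuration instance, and that a suitable value exists for the constructor's String parameter, whose meaning this page does not document (it appears here under the placeholder name parserArgument). The result is returned as object because the page does not name the return type, and the helper class name IFilterReadSketch is hypothetical.

```csharp
using System;
using System.IO;
using System.Text;
using Keyoti.SearchEngine;            // assumed namespace of Configuration
using Keyoti.SearchEngine.Documents;  // namespace of GenericIFilterParser

public static class IFilterReadSketch
{
    // configuration:  supplied by the caller; how it is created is not covered here.
    // parserArgument: placeholder for the constructor's undocumented String parameter.
    // path:           location of the document to parse, for example an .xls file.
    public static object ReadDocument(Configuration configuration, string parserArgument, string path)
    {
        var parser = new GenericIFilterParser(configuration, parserArgument);
        string fullPath = Path.GetFullPath(path);

        using (Stream stream = File.OpenRead(fullPath))
        {
            // Read(Stream, Uri, Encoding) is the overload this class overrides.
            // The encoding is described as used only "if applicable"; UTF-8 is
            // an arbitrary choice for this sketch.
            return parser.Read(stream, new Uri(fullPath), Encoding.UTF8);
        }
    }
}
```

Since the parser depends on an IFilter being installed for the document type, a call like this would presumably only produce text on a system where the matching IFilter is present.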
Inheritance Hierarchy
Object
Parser
 TxtDocumentParser
  GenericIFilterParser

Assembly: Keyoti4.SearchEngine.Core (Module: Keyoti4.SearchEngine.Core.dll) Version: 2015.6.15.120