Crawls all URLs in crawlUriList and adds the URLs it finds to the discoveredURLs list, whether or not they are already known to the document index. URLs that are themselves in crawlUriList are not added.
Declaration Syntax
C# | Visual Basic | Visual C++ | F#
public virtual void DiscoverLinkedURLs(ArrayList crawlUriList, ArrayList discoveredURLs, ArrayList pathMatchesToBeIgnored, ArrayList pathMatchesToBeIncluded, ArrayList existingDocumentURLs, bool indexNow, ArrayList addedURLs, WebSiteSpider.DiscoveredDocumentHandler discoveredDocumentHandler)
Public Overridable Sub DiscoverLinkedURLs(crawlUriList As ArrayList, discoveredURLs As ArrayList, pathMatchesToBeIgnored As ArrayList, pathMatchesToBeIncluded As ArrayList, existingDocumentURLs As ArrayList, indexNow As Boolean, addedURLs As ArrayList, discoveredDocumentHandler As WebSiteSpider.DiscoveredDocumentHandler)
public: virtual void DiscoverLinkedURLs(ArrayList^ crawlUriList, ArrayList^ discoveredURLs, ArrayList^ pathMatchesToBeIgnored, ArrayList^ pathMatchesToBeIncluded, ArrayList^ existingDocumentURLs, bool indexNow, ArrayList^ addedURLs, WebSiteSpider::DiscoveredDocumentHandler^ discoveredDocumentHandler)
abstract DiscoverLinkedURLs : crawlUriList : ArrayList * discoveredURLs : ArrayList * pathMatchesToBeIgnored : ArrayList * pathMatchesToBeIncluded : ArrayList * existingDocumentURLs : ArrayList * indexNow : bool * addedURLs : ArrayList * discoveredDocumentHandler : WebSiteSpider.DiscoveredDocumentHandler -> unit
override DiscoverLinkedURLs : crawlUriList : ArrayList * discoveredURLs : ArrayList * pathMatchesToBeIgnored : ArrayList * pathMatchesToBeIncluded : ArrayList * existingDocumentURLs : ArrayList * indexNow : bool * addedURLs : ArrayList * discoveredDocumentHandler : WebSiteSpider.DiscoveredDocumentHandler -> unit
Parameters
- crawlUriList (ArrayList)
- List of string objects specifying the URIs to be crawled.
- discoveredURLs (ArrayList)
- ArrayList of string objects with new URLs found through crawling
- pathMatchesToBeIgnored (ArrayList)
- Strings used to decide whether a file should be ignored, based on its URL path.
- pathMatchesToBeIncluded (ArrayList)
- Strings used to decide whether a file should be included, based on its URL path. When specified, only matching files are included.
- existingDocumentURLs (ArrayList)
- List of URLs of existing documents in the index
- indexNow (Boolean)
- Whether to index these documents now.
- addedURLs (ArrayList)
- ArrayList that receives the URLs that are added.
- discoveredDocumentHandler (WebSiteSpider.DiscoveredDocumentHandler)
- Delegate to be informed as documents are found
Remarks
Unlike Crawl, this method doesn't add new URLs to the 'database'.
Crawl uses this method, so overriding it will also affect Crawl.
This override is most suitable for polling discoveredURLs from another thread, to track progress.
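The polling pattern mentioned above could look roughly like the following. This is a minimal sketch under stated assumptions, not the library's documented usage: DiscoveryProgressSketch, RunAndPoll, and the fakeDiscover stand-in are hypothetical names introduced only for illustration, and the discovery call is passed in as a delegate so that the example runs without the Keyoti assembly. Wrapping the list with ArrayList.Synchronized is an assumption about how to read Count safely while another thread appends; whether the spider needs or tolerates the wrapper is not stated on this page.

```csharp
using System;
using System.Collections;
using System.Threading;

public static class DiscoveryProgressSketch
{
    // Runs 'discover' on a worker thread and polls the shared list's Count
    // from the calling thread until the worker finishes.
    // Returns the final number of discovered URLs.
    public static int RunAndPoll(Action<ArrayList> discover,
                                 Action<int> reportProgress,
                                 int pollMilliseconds)
    {
        // Assumption: a synchronized wrapper makes concurrent Add/Count safe.
        ArrayList discoveredURLs = ArrayList.Synchronized(new ArrayList());

        Thread worker = new Thread(() => discover(discoveredURLs));
        worker.Start();

        // Join(int) returns false while the worker is still running.
        while (!worker.Join(pollMilliseconds))
        {
            reportProgress(discoveredURLs.Count);
        }

        reportProgress(discoveredURLs.Count);
        return discoveredURLs.Count;
    }

    public static void Main()
    {
        // Hypothetical stand-in for the real call. With an actual spider
        // instance it would be something like:
        //   list => spider.DiscoverLinkedURLs(crawlUriList, list,
        //               pathMatchesToBeIgnored, pathMatchesToBeIncluded,
        //               existingDocumentURLs, false, addedURLs, handler)
        Action<ArrayList> fakeDiscover = list =>
        {
            for (int i = 0; i < 5; i++)
            {
                Thread.Sleep(20);
                list.Add("http://example.com/page" + i);
            }
        };

        int total = RunAndPoll(fakeDiscover,
                               n => Console.WriteLine("Discovered so far: " + n),
                               30);
        Console.WriteLine("Total discovered: " + total);
    }
}
```

The progress lines depend on timing, so their number and values vary from run to run. Only the final count is deterministic here.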
Assembly: Keyoti4.SearchEngine.Core (Module: Keyoti4.SearchEngine.Core.dll) Version: 2015.6.15.120