
Class EventArgs


object --+
         |
        Crawler.EventArgs
Known Subclasses:
CrawlEventArgs, UriIncludeEventArgs

Ruya's Crawler provides an event-based callback mechanism during a crawl, giving clients more control over which urls are crawled. Event handlers receive an instance of this object and communicate back to the crawler through it.

Example:
  # Client side event handler
  def beforecrawl(caller, eventargs):
     # Some process
     # ...

     # Url was already crawled (might be determined with a simple dictionary cache)
     eventargs.ignore= True  # Request Ruya to ignore this url during crawl

     # ...

  def aftercrawl(caller, eventargs):
     # Some process
     # ...

     # Some error occurred during saving crawled data (might be a file or database), abort further crawling
     eventargs.cancel= True       # Cancel crawling completely

     # ...



See Also: Crawler.bind

Instance Methods
  __init__(self, level=0, args=[])
Constructor. Returns None.

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__

Instance Variables
  level
Crawl level on which the event was raised.
  args
Additional arguments to be passed back to event handler.
  cancel
Flag that clients can set to True to cancel the entire crawl process. Use with caution.
  ignore
Flag that clients can set to True to skip a url and continue with the next one.
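
The difference between the two flags can be pictured with a minimal sketch of a crawl loop that honors them. StandInEventArgs and crawl below are hypothetical stand-ins written for illustration, not Ruya's implementation; only the attribute names level, args, cancel and ignore come from this page, and passing the url as args[0] is an assumption.

```python
class StandInEventArgs(object):
    """Hypothetical stand-in mirroring the documented attributes."""

    def __init__(self, level=0, args=None):
        self.level = level
        self.args = [] if args is None else args
        self.cancel = False  # a handler sets this to True to stop the whole crawl
        self.ignore = False  # a handler sets this to True to skip the current url


def crawl(urls, beforecrawl, level=0):
    """Hypothetical loop: returns the urls that were actually crawled."""
    crawled = []
    for url in urls:
        # Assumption: the url is handed to the handler as args[0]
        eventargs = StandInEventArgs(level=level, args=[url])
        beforecrawl(None, eventargs)
        if eventargs.cancel:
            break      # abort all remaining urls
        if eventargs.ignore:
            continue   # skip this url and move on to the next one
        crawled.append(url)
    return crawled


seen = {"http://example.com/a"}


def skip_seen(caller, eventargs):
    # Ignore urls that are already in the cache
    if eventargs.args[0] in seen:
        eventargs.ignore = True


result = crawl(["http://example.com/a", "http://example.com/b"], skip_seen)
```

In this sketch ignore affects a single url, while cancel ends the loop, which is why the latter should be used with caution.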

Properties

Inherited from object: __class__

Method Details

__init__(self, level=0, args=[])
(Constructor)

Constructor.
Parameters:
  • level (number) - Crawl level on which the event was raised.
  • args (list) - Additional arguments to be passed back to event handler.
Returns: None
Overrides: object.__init__
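
The signature uses a list as the default value for args. This page does not say whether Ruya's constructor copies that list, so the following is only a sketch of general Python behavior using two hypothetical classes: a default list is created once, and a constructor that stores it directly shares it across every instance built without an explicit args.

```python
class SharedDefault(object):
    """Hypothetical class that stores the default list object directly."""

    def __init__(self, level=0, args=[]):
        self.level = level
        self.args = args  # the same list object for every default call


class FreshDefault(object):
    """Hypothetical alternative that gives each instance its own list."""

    def __init__(self, level=0, args=None):
        self.level = level
        self.args = [] if args is None else list(args)


a = SharedDefault()
b = SharedDefault()
a.args.append("x")   # also visible through b.args, since both share one list

c = FreshDefault()
d = FreshDefault()
c.args.append("x")   # d.args is unaffected
```

Passing an explicit list for args when constructing event objects, or treating args as read-only inside handlers, avoids depending on which of the two behaviors applies.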