Module ruya :: Class CrawlScope

Class CrawlScope

object --+
         |
        CrawlScope

Ruya's configuration object that determines which scope is used when crawling a website.

Instance Methods

Inherited from object: __delattr__, __getattribute__, __hash__, __init__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__

Class Methods

boolean

isvalidscope(cls, scope)
Checks whether the scope is valid, i.e. one of the allowed scopes defined in CrawlScope.

Class Variables

SCOPE_ALL = 100000
NOT SUPPORTED - Maybe next version?

SCOPE_HOST = 100001
For the URL http://domain.ext/support/index.htm, crawl only pages under the host http://domain.ext/

SCOPE_DOMAIN = 100002
For the URL http://domain.ext/support/index.htm, crawl pages from the domain domain.ext and its sub-domains: http://domain.ext/, http://second.domain.ext/, http://third.domain.ext/

SCOPE_PATH = 100003
For the URL http://domain.ext/support/index.html, crawl only pages under the folder http://domain.ext/support/
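
The scope descriptions above can be read as URL-matching rules. The following is a minimal sketch of that reading in Python 3, not Ruya's actual implementation: the helper name in_scope is hypothetical, only the constant values are taken from this page, and the SCOPE_DOMAIN branch assumes the start URL's host is already the bare domain (domain.ext). SCOPE_ALL is left out because it is documented as not supported.

```python
from urllib.parse import urlsplit

# Constant values copied from the class variables listed above.
SCOPE_HOST = 100001
SCOPE_DOMAIN = 100002
SCOPE_PATH = 100003


def in_scope(start_url, candidate_url, scope):
    """Hypothetical helper: is candidate_url inside the scope implied by start_url?"""
    start = urlsplit(start_url)
    cand = urlsplit(candidate_url)
    start_host = (start.hostname or "").lower()
    cand_host = (cand.hostname or "").lower()

    if scope == SCOPE_HOST:
        # Same host only; sub-domains are excluded.
        return cand_host == start_host
    if scope == SCOPE_DOMAIN:
        # The start host itself or any sub-domain of it
        # (assumes start_host is the bare domain, e.g. domain.ext).
        return cand_host == start_host or cand_host.endswith("." + start_host)
    if scope == SCOPE_PATH:
        # Same host, and the path must sit under the start URL's folder.
        folder = start.path.rsplit("/", 1)[0] + "/"
        return cand_host == start_host and cand.path.startswith(folder)
    raise ValueError("unsupported scope: %r" % (scope,))
```

Under these assumptions, in_scope("http://domain.ext/support/index.htm", "http://second.domain.ext/", SCOPE_DOMAIN) accepts the sub-domain, while the same call with SCOPE_HOST rejects it.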

Properties

Inherited from object: __class__

Method Details

isvalidscope(cls, scope)
Class Method

Checks whether the scope is valid, i.e. one of the allowed scopes defined in CrawlScope.

Parameters:

scope (number) - The crawl scope value to check.

Returns: boolean

True if the crawl scope is valid, False otherwise.
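
As an illustration of how such a check might be called, here is a small stand-in class, not the real ruya.CrawlScope. The constant values and the method signature come from this page; the body of isvalidscope is an assumption, in particular the choice to reject SCOPE_ALL because it is documented as not supported. The real method may accept a different set.

```python
class CrawlScope(object):
    """Stand-in for ruya.CrawlScope, for illustration only."""

    SCOPE_ALL = 100000      # documented as not supported
    SCOPE_HOST = 100001
    SCOPE_DOMAIN = 100002
    SCOPE_PATH = 100003

    @classmethod
    def isvalidscope(cls, scope):
        # Assumption: only the supported scopes count as valid.
        return scope in (cls.SCOPE_HOST, cls.SCOPE_DOMAIN, cls.SCOPE_PATH)


# Validate a scope value before using it in a crawl configuration.
requested = CrawlScope.SCOPE_PATH
if not CrawlScope.isvalidscope(requested):
    raise ValueError("invalid crawl scope: %r" % (requested,))
```

Because isvalidscope is a class method, it can be called on the class itself without creating an instance.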
