object --+
         |
        Uri

Method summary (return types as listed): None, Uri, boolean, boolean, boolean, boolean, str, str, boolean, boolean

url: The url with its query string removed.
hash: SHA hash for the url.
parts: Returns a tuple consisting of various parts of a url.
domainurl: Returns the domain found after analyzing the url.
robotstxturl: Returns the robots.txt path for a url.
domains: Returns valid domains found after analyzing the url.
hashes: Returns valid SHA hashes for the url string.

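As an illustration of the url entry above, the following is a minimal sketch of stripping the query string from a url with the standard library. The helper name strip_query is hypothetical and not part of the Uri class, and keeping the fragment is an assumption; the documentation only says the query string is removed.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Return url without its query string (hypothetical helper)."""
    scheme, netloc, path, _query, fragment = urlsplit(url)
    # Assumption: only the query component is dropped; the fragment is kept.
    return urlunsplit((scheme, netloc, path, "", fragment))


print(strip_query("http://domain.ext/page?a=1&b=2"))
```
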

See Also: urlparse

See Also: issamedomain
Note: A sub-domain is detected simply by checking whether example.domain.ext ends in domain.ext.

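The sub-domain rule in the note above can be sketched as a plain suffix test. The function name is_subdomain and its two-argument signature are assumptions made for illustration; the actual method signature is not shown in this page.

```python
def is_subdomain(host: str, domain: str) -> bool:
    """Suffix test as described in the note (hypothetical helper)."""
    # Assumption: comparison is case-insensitive, since host names are.
    return host.lower().endswith(domain.lower())


print(is_subdomain("example.domain.ext", "domain.ext"))
print(is_subdomain("example.other.ext", "domain.ext"))
# Caveat of a bare suffix test: an unrelated host that merely ends
# with the same characters also matches.
print(is_subdomain("notdomain.ext", "domain.ext"))
```

Because the test is a bare suffix match, a stricter variant might require the suffix to be preceded by a dot; that refinement is not described in the source.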
parts
Returns a tuple consisting of various parts of a url.
See Also: urlparse

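Since parts points to urlparse, the tuple it returns plausibly resembles what the standard library produces. The sketch below uses Python 3's urllib.parse.urlparse (the module was named urlparse in Python 2, which this page predates); whether Uri returns exactly these six fields is an assumption.

```python
from urllib.parse import urlparse

# urlparse returns a named tuple:
# (scheme, netloc, path, params, query, fragment)
parts = tuple(urlparse("http://www.domain.ext/path/page.html?q=1"))
print(parts)
```
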
domainurl
Returns the domain found after analyzing the url.

robotstxturl
Returns the robots.txt path for a url. Usually, a site such as http://domain.ext/ places robots.txt in its root, at http://domain.ext/robots.txt.

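A minimal sketch of deriving the robots.txt location described above, assuming it is always placed at the root of the url's scheme and host. The helper name robots_txt_url is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(url: str) -> str:
    """Build the root-level robots.txt url for a given url (hypothetical helper)."""
    scheme, netloc, *_rest = urlsplit(url)
    # Keep scheme and host; replace path, query and fragment.
    return urlunsplit((scheme, netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://domain.ext/a/b.html?x=1"))
```
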
domains
Returns valid domains found after analyzing the url. http://www.domain.ext/ and http://domain.ext/ both point to the same domain, domain.ext, so they must be treated as the same. This property helps the crawler determine whether two urls belong to the same domain.

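One way the www handling described for domains could work is sketched below. Both helper names are hypothetical, and returning the host plus its www-stripped form is an assumption about what "valid domains" means here.

```python
from urllib.parse import urlsplit


def candidate_domains(url: str) -> tuple:
    """Return the host and, if it starts with www., the bare domain too."""
    host = (urlsplit(url).hostname or "").lower()
    if host.startswith("www."):
        return (host, host[4:])
    return (host,)


def same_domain(url_a: str, url_b: str) -> bool:
    """Two urls match if their candidate domains overlap."""
    return bool(set(candidate_domains(url_a)) & set(candidate_domains(url_b)))


print(candidate_domains("http://www.domain.ext/"))
print(same_domain("http://www.domain.ext/", "http://domain.ext/"))
```
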
hashes
Returns valid SHA hashes for the url string. If the url's domain starts with www, two different hashes are returned, because http://www.domain.ext/ and http://domain.ext/ both point to the same domain, domain.ext.

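The behaviour described for hashes might be sketched as follows. The page does not say which SHA variant is used, so SHA-1 via hashlib is an assumption, as are the helper name url_hashes and the choice to hash the full url string with and without the www prefix.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def url_hashes(url: str) -> list:
    """Return one hash per url variant (hypothetical helper)."""
    parts = urlsplit(url)
    variants = [url]
    if parts.netloc.lower().startswith("www."):
        # Add the same url with the leading "www." removed from the host.
        variants.append(urlunsplit(parts._replace(netloc=parts.netloc[4:])))
    # Assumption: SHA-1 over the UTF-8 encoded url string.
    return [hashlib.sha1(v.encode("utf-8")).hexdigest() for v in variants]


print(len(url_hashes("http://www.domain.ext/")))
print(len(url_hashes("http://domain.ext/")))
```

With this scheme, the second hash of the www url equals the only hash of the bare url, which lets a crawler recognise both forms as already seen.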
Generated by Epydoc 3.0beta1 on Sun May 06 20:47:05 2007 (http://epydoc.sourceforge.net)