object --+
         |
         Document
DocumentError Ruya's document error object; represents an error that occurred during the crawl of a Document.
headers HTTP headers for this document.
_uri HTTP url of this document.
title Title of this document obtained from the <title> tag.
description Description of this document obtained from the <meta name=description...> tag.
keywords Keywords of this document obtained from the <meta name=keywords...> tag.
lastmodified Last-Modified header for this document; can be used to avoid recrawling the document if its contents have not changed.
etag ETag header for this document; can be used to avoid recrawling the document if its contents have not changed.
httpstatus HTTP status obtained while crawling this document.
httpreason HTTP reason obtained while crawling this document.
contenttype Content-Type header for this document.
contentencoding Content-Encoding header for this document.
_zippedcontent gzipped contents for this document.
_isZipped Internal flag recording whether the gzip operation has already been done for the plain contents.
_bzippedcontent bz2 archived contents for this document.
_isBzipped Internal flag recording whether the bz2 archive operation has already been done for the plain contents.
_plaincontent Plain contents for this document.
links All crawlable links found in this document.
redirecturi Actual url of this document if this document was redirected from uri.
redirects Number of times this document was redirected from uri.
redirecturis All redirected urls (matching redirects) which were crawled while crawling document uri.
error DocumentError object if an error occurred during the crawl of this document.
_cleandata Regular expression to match newlines.
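The lastmodified and etag entries say they can be used to avoid recrawling an unchanged document. One common way to do that is an HTTP conditional request, sketched below. The helper names `conditional_headers` and `needs_recrawl` are hypothetical and are not part of Ruya; the sketch only assumes the stored values are the raw header strings from the previous crawl.

```python
def conditional_headers(lastmodified=None, etag=None):
    """Build request headers for a conditional GET from stored validators."""
    headers = {}
    if lastmodified:
        # Server replies 304 if the resource is unchanged since this date.
        headers["If-Modified-Since"] = lastmodified
    if etag:
        # Server replies 304 if its current entity tag matches this one.
        headers["If-None-Match"] = etag
    return headers


def needs_recrawl(httpstatus):
    """304 Not Modified means the stored copy is still current."""
    return httpstatus != 304


# Example: validators saved from an earlier crawl of the same uri.
headers = conditional_headers("Sun, 06 May 2007 20:47:05 GMT", '"abc123"')
```

A crawler would send these headers with its next request for the same uri and skip reprocessing when `needs_recrawl` returns False.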
uri Returns the url for this document.
normalizedlinks Returns all links from this document converted to absolute links with reference to the document's uri.
zippedcontent Returns gzipped content for this document.
bzippedcontent Returns bz2 archived contents for this document.
plaincontent Returns the plain html content for this document.
hash Returns the SHA hash for plain contents of this document.
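To illustrate what normalizedlinks describes, here is a minimal sketch of resolving links against a document's uri with the standard library. It is not Ruya's implementation; `normalize_links` and the example urls are assumptions made for illustration.

```python
from urllib.parse import urljoin


def normalize_links(uri, links):
    """Resolve each link against the document uri to get absolute urls."""
    return [urljoin(uri, link) for link in links]


# Relative, parent-relative, root-relative and already-absolute links.
absolute = normalize_links(
    "http://example.com/a/b.html",
    ["d.html", "../c.html", "/top", "http://other.example/x"],
)
print(absolute)
```

Links that are already absolute pass through unchanged, so the same call can be applied to every entry in links.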
Note: The content is gzipped with the maximum compression level of 9. See Also: gzip
Note: The content is unzipped assuming the compression level of 9. See Also: gzip
Note: The content is bz2 archived with the maximum compression level of 9. See Also: bz2
See Also: bz2
Note: Empty lines are removed from the plain contents.
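The notes above can be sketched with Python's gzip and bz2 modules. This is an illustration under stated assumptions, not Ruya's code: the pattern `_EMPTY_LINE` and the three helper functions are hypothetical, and UTF-8 is assumed as the text encoding.

```python
import bz2
import gzip
import re

# Hypothetical pattern: a line that is empty or holds only spaces/tabs.
_EMPTY_LINE = re.compile(r"^[ \t]*\r?\n", re.MULTILINE)


def strip_empty_lines(text):
    """Remove empty lines from the plain contents."""
    return _EMPTY_LINE.sub("", text)


def gzip_content(text):
    """gzip the text at the maximum compression level (9)."""
    return gzip.compress(text.encode("utf-8"), compresslevel=9)


def bz2_content(text):
    """bz2-archive the text at the maximum compression level (9)."""
    return bz2.compress(text.encode("utf-8"), compresslevel=9)


plain = strip_empty_lines("<html>\n\n  \n<body>Ruya</body>\n</html>\n")
zipped = gzip_content(plain)
bzipped = bz2_content(plain)

# Both archives decompress back to the cleaned text.
restored_from_gzip = gzip.decompress(zipped).decode("utf-8")
restored_from_bz2 = bz2.decompress(bzipped).decode("utf-8")
```

Level 9 trades compression time for the smallest output, which suits stored crawl results that are written once and read rarely.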
uri Returns the url for this document.
normalizedlinks Returns all links from this document converted to absolute links with reference to the document's uri.
zippedcontent Returns gzipped content for this document.
Note: The content is gzipped with the maximum compression level of 9. See Also: gzip
bzippedcontent Returns bz2 archived contents for this document.
Note: The content is bz2 archived with the maximum compression level of 9. See Also: bz2
plaincontent Returns the plain html content for this document.
hash Returns the SHA hash for plain contents of this document.
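As a rough sketch of what hash might compute, the snippet below digests the plain contents with hashlib. The page says only "SHA hash", so the choice of SHA-1, the UTF-8 encoding, and the names `content_hash` and `has_changed` are all assumptions for illustration.

```python
import hashlib


def content_hash(plaincontent):
    """Hex digest of the plain contents (SHA-1 is an assumption here)."""
    return hashlib.sha1(plaincontent.encode("utf-8")).hexdigest()


def has_changed(old_hash, new_plaincontent):
    """Compare a stored digest with the digest of freshly crawled content."""
    return old_hash != content_hash(new_plaincontent)


stored = content_hash("<html><body>Ruya</body></html>")
```

Comparing digests like this is one possible way to detect changed content between crawls when the server sends neither a Last-Modified nor an ETag header.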
Generated by Epydoc 3.0beta1 on Sun May 06 20:47:05 2007 | http://epydoc.sourceforge.net