| 1 | | | | | package LWP; |
| 2 | | | | | |
| 3 | 1 | 2µs | | | $VERSION = "5.834"; |
| 4 | | | | | sub Version { $VERSION; } |
| 5 | | | | | |
| 6 | 1 | 31µs | | | require 5.005; |
| 7 | 1 | 900ns | | | require LWP::UserAgent; # this should load everything you need |
| 8 | | | | | |
| 9 | 1 | 7µs | | | 1; |
| 10 | | | | | |
| 11 | | | | | __END__ |
| 12 | | | | | |
| 13 | | | | | =head1 NAME |
| 14 | | | | | |
| 15 | | | | | LWP - The World-Wide Web library for Perl |
| 16 | | | | | |
| 17 | | | | | =head1 SYNOPSIS |
| 18 | | | | | |
| 19 | | | | | use LWP; |
| 20 | | | | | print "This is libwww-perl-$LWP::VERSION\n"; |
| 21 | | | | | |
| 22 | | | | | |
| 23 | | | | | =head1 DESCRIPTION |
| 24 | | | | | |
| 25 | | | | | The libwww-perl collection is a set of Perl modules which provides a |
| 26 | | | | | simple and consistent application programming interface (API) to the |
| 27 | | | | | World-Wide Web. The main focus of the library is to provide classes |
| 28 | | | | | and functions that allow you to write WWW clients. The library also |
| 29 | | | | | contains modules that are of more general use and even classes that |
| 30 | | | | | help you implement simple HTTP servers. |
| 31 | | | | | |
| 32 | | | | | Most modules in this library provide an object oriented API. The user |
| 33 | | | | | agent, requests sent and responses received from the WWW server are |
| 34 | | | | | all represented by objects. This makes a simple and powerful |
| 35 | | | | | interface to these services. The interface is easy to extend |
| 36 | | | | | and customize for your own needs. |
| 37 | | | | | |
| 38 | | | | | The main features of the library are: |
| 39 | | | | | |
| 40 | | | | | =over 3 |
| 41 | | | | | |
| 42 | | | | | =item * |
| 43 | | | | | |
| 44 | | | | | Contains various reusable components (modules) that can be |
| 45 | | | | | used separately or together. |
| 46 | | | | | |
| 47 | | | | | =item * |
| 48 | | | | | |
| 49 | | | | | Provides an object oriented model of HTTP-style communication. Within |
| 50 | | | | | this framework we currently support access to http, https, gopher, ftp, news, |
| 51 | | | | | file, and mailto resources. |
| 52 | | | | | |
| 53 | | | | | =item * |
| 54 | | | | | |
| 55 | | | | | Provides a full object oriented interface or |
| 56 | | | | | a very simple procedural interface. |
| 57 | | | | | |
| 58 | | | | | =item * |
| 59 | | | | | |
| 60 | | | | | Supports the basic and digest authorization schemes. |
| 61 | | | | | |
| 62 | | | | | =item * |
| 63 | | | | | |
| 64 | | | | | Supports transparent redirect handling. |
| 65 | | | | | |
| 66 | | | | | =item * |
| 67 | | | | | |
| 68 | | | | | Supports access through proxy servers. |
| 69 | | | | | |
| 70 | | | | | =item * |
| 71 | | | | | |
| 72 | | | | | Provides a parser for F<robots.txt> files and a framework for constructing robots. |
| 73 | | | | | |
| 74 | | | | | =item * |
| 75 | | | | | |
| 76 | | | | | Supports parsing of HTML forms. |
| 77 | | | | | |
| 78 | | | | | =item * |
| 79 | | | | | |
| 80 | | | | | Implements an HTTP content negotiation algorithm that can |
| 81 | | | | | be used both in protocol modules and in server scripts (like CGI |
| 82 | | | | | scripts). |
| 83 | | | | | |
| 84 | | | | | =item * |
| 85 | | | | | |
| 86 | | | | | Supports HTTP cookies. |
| 87 | | | | | |
| 88 | | | | | =item * |
| 89 | | | | | |
| 90 | | | | | Includes some simple command-line clients, for instance C<lwp-request> and C<lwp-download>. |
| 91 | | | | | |
| 92 | | | | | =back |
| 93 | | | | | |
| 94 | | | | | |
| 95 | | | | | =head1 HTTP STYLE COMMUNICATION |
| 96 | | | | | |
| 97 | | | | | |
| 98 | | | | | The libwww-perl library is based on HTTP style communication. This |
| 99 | | | | | section tries to describe what that means. |
| 100 | | | | | |
| 101 | | | | | Let us start with this quote from the HTTP specification document |
| 102 | | | | | <URL:http://www.w3.org/pub/WWW/Protocols/>: |
| 103 | | | | | |
| 104 | | | | | =over 3 |
| 105 | | | | | |
| 106 | | | | | =item |
| 107 | | | | | |
| 108 | | | | | The HTTP protocol is based on a request/response paradigm. A client |
| 109 | | | | | establishes a connection with a server and sends a request to the |
| 110 | | | | | server in the form of a request method, URI, and protocol version, |
| 111 | | | | | followed by a MIME-like message containing request modifiers, client |
| 112 | | | | | information, and possible body content. The server responds with a |
| 113 | | | | | status line, including the message's protocol version and a success or |
| 114 | | | | | error code, followed by a MIME-like message containing server |
| 115 | | | | | information, entity meta-information, and possible body content. |
| 116 | | | | | |
| 117 | | | | | =back |
| 118 | | | | | |
| 119 | | | | | What this means to libwww-perl is that communication always takes place |
| 120 | | | | | through these steps: First a I<request> object is created and |
| 121 | | | | | configured. This object is then passed to a server and we get a |
| 122 | | | | | I<response> object in return that we can examine. A request is always |
| 123 | | | | | independent of any previous requests, i.e. the service is stateless. |
| 124 | | | | | The same simple model is used for any kind of service we want to |
| 125 | | | | | access. |
| 126 | | | | | |
| 127 | | | | | For example, if we want to fetch a document from a remote file server, |
| 128 | | | | | then we send it a request that contains a name for that document and |
| 129 | | | | | the response will contain the document itself. If we access a search |
| 130 | | | | | engine, then the content of the request will contain the query |
| 131 | | | | | parameters and the response will contain the query result. If we want |
| 132 | | | | | to send a mail message to somebody then we send a request object which |
| 133 | | | | | contains our message to the mail server and the response object will |
| 134 | | | | | contain an acknowledgment that tells us that the message has been |
| 135 | | | | | accepted and will be forwarded to the recipient(s). |
| 136 | | | | | |
| 137 | | | | | It is as simple as that! |
| 138 | | | | | |
| 139 | | | | | |
| 140 | | | | | =head2 The Request Object |
| 141 | | | | | |
| 142 | | | | | The libwww-perl request object has the class name C<HTTP::Request>. |
| 143 | | | | | The fact that the class name uses C<HTTP::> as a |
| 144 | | | | | prefix only implies that we use the HTTP model of communication. It |
| 145 | | | | | does not limit the kind of services we can try to pass this I<request> |
| 146 | | | | | to. For instance, we will send C<HTTP::Request>s both to ftp and |
| 147 | | | | | gopher servers, as well as to the local file system. |
| 148 | | | | | |
| 149 | | | | | The main attributes of the request objects are: |
| 150 | | | | | |
| 151 | | | | | =over 3 |
| 152 | | | | | |
| 153 | | | | | =item * |
| 154 | | | | | |
| 155 | | | | | The B<method> is a short string that tells what kind of |
| 156 | | | | | request this is. The most common methods are B<GET>, B<PUT>, |
| 157 | | | | | B<POST> and B<HEAD>. |
| 158 | | | | | |
| 159 | | | | | =item * |
| 160 | | | | | |
| 161 | | | | | The B<uri> is a string denoting the protocol, server and |
| 162 | | | | | the name of the "document" we want to access. The B<uri> might |
| 163 | | | | | also encode various other parameters. |
| 164 | | | | | |
| 165 | | | | | =item * |
| 166 | | | | | |
| 167 | | | | | The B<headers> contain additional information about the |
| 168 | | | | | request and can also be used to describe the content. The headers |
| 169 | | | | | are a set of keyword/value pairs. |
| 170 | | | | | |
| 171 | | | | | =item * |
| 172 | | | | | |
| 173 | | | | | The B<content> is an arbitrary amount of data. |
| 174 | | | | | |
| 175 | | | | | =back |
| 176 | | | | | |
| 177 | | | | | =head2 The Response Object |
| 178 | | | | | |
| 179 | | | | | The libwww-perl response object has the class name C<HTTP::Response>. |
| 180 | | | | | The main attributes of objects of this class are: |
| 181 | | | | | |
| 182 | | | | | =over 3 |
| 183 | | | | | |
| 184 | | | | | =item * |
| 185 | | | | | |
| 186 | | | | | The B<code> is a numerical value that indicates the overall |
| 187 | | | | | outcome of the request. |
| 188 | | | | | |
| 189 | | | | | =item * |
| 190 | | | | | |
| 191 | | | | | The B<message> is a short, human readable string that |
| 192 | | | | | corresponds to the I<code>. |
| 193 | | | | | |
| 194 | | | | | =item * |
| 195 | | | | | |
| 196 | | | | | The B<headers> contain additional information about the |
| 197 | | | | | response and describe the content. |
| 198 | | | | | |
| 199 | | | | | =item * |
| 200 | | | | | |
| 201 | | | | | The B<content> is an arbitrary amount of data. |
| 202 | | | | | |
| 203 | | | | | =back |
| 204 | | | | | |
| 205 | | | | | Since we don't want to handle all possible I<code> values directly in |
| 206 | | | | | our programs, a libwww-perl response object has methods that can be |
| 207 | | | | | used to query what kind of response this is. The most commonly used |
| 208 | | | | | response classification methods are: |
| 209 | | | | | |
| 210 | | | | | =over 3 |
| 211 | | | | | |
| 212 | | | | | =item is_success() |
| 213 | | | | | |
| 214 | | | | | The request was successfully received, understood or accepted. |
| 215 | | | | | |
| 216 | | | | | =item is_error() |
| 217 | | | | | |
| 218 | | | | | The request failed. The server or the resource might not be |
| 219 | | | | | available, access to the resource might be denied or other things might |
| 220 | | | | | have failed for some reason. |
| 221 | | | | | |
| 222 | | | | | =back |
| 223 | | | | | |
| 224 | | | | | =head2 The User Agent |
| 225 | | | | | |
| 226 | | | | | Let us assume that we have created a I<request> object. What do we |
| 227 | | | | | actually do with it in order to receive a I<response>? |
| 228 | | | | | |
| 229 | | | | | The answer is that you pass it to a I<user agent> object and this |
| 230 | | | | | object takes care of all the things that need to be done |
| 231 | | | | | (like low-level communication and error handling) and returns |
| 232 | | | | | a I<response> object. The user agent represents your |
| 233 | | | | | application on the network and provides you with an interface that |
| 234 | | | | | can accept I<requests> and return I<responses>. |
| 235 | | | | | |
| 236 | | | | | The user agent is an interface layer between |
| 237 | | | | | your application code and the network. Through this interface you are |
| 238 | | | | | able to access the various servers on the network. |
| 239 | | | | | |
| 240 | | | | | The class name for the user agent is C<LWP::UserAgent>. Every |
| 241 | | | | | libwww-perl application that wants to communicate should create at |
| 242 | | | | | least one object of this class. The main method provided by this |
| 243 | | | | | object is request(). This method takes an C<HTTP::Request> object as |
| 244 | | | | | argument and (eventually) returns an C<HTTP::Response> object. |
| 245 | | | | | |
| 246 | | | | | The user agent has many other attributes that let you |
| 247 | | | | | configure how it will interact with the network and with your |
| 248 | | | | | application. |
| 249 | | | | | |
| 250 | | | | | =over 3 |
| 251 | | | | | |
| 252 | | | | | =item * |
| 253 | | | | | |
| 254 | | | | | The B<timeout> specifies how much time we give remote servers to |
| 255 | | | | | respond before the library disconnects and creates an |
| 256 | | | | | internal I<timeout> response. |
| 257 | | | | | |
| 258 | | | | | =item * |
| 259 | | | | | |
| 260 | | | | | The B<agent> specifies the name that your application should use when it |
| 261 | | | | | presents itself on the network. |
| 262 | | | | | |
| 263 | | | | | =item * |
| 264 | | | | | |
| 265 | | | | | The B<from> attribute can be set to the e-mail address of the person |
| 266 | | | | | responsible for running the application. If this is set, then the |
| 267 | | | | | address will be sent to the servers with every request. |
| 268 | | | | | |
| 269 | | | | | =item * |
| 270 | | | | | |
| 271 | | | | | The B<parse_head> specifies whether we should initialize response |
| 272 | | | | | headers from the E<lt>head> section of HTML documents. |
| 273 | | | | | |
| 274 | | | | | =item * |
| 275 | | | | | |
| 276 | | | | | The B<proxy> and B<no_proxy> attributes specify if and when to go through |
| 277 | | | | | a proxy server. <URL:http://www.w3.org/pub/WWW/Proxies/> |
| 278 | | | | | |
| 279 | | | | | =item * |
| 280 | | | | | |
| 281 | | | | | The B<credentials> provide a way to set up user names and |
| 282 | | | | | passwords needed to access certain services. |
| 283 | | | | | |
| 284 | | | | | =back |
| 285 | | | | | |
| 286 | | | | | Many applications want even more control over how they interact |
| 287 | | | | | with the network and they get this by sub-classing |
| 288 | | | | | C<LWP::UserAgent>. The library includes a |
| 289 | | | | | sub-class, C<LWP::RobotUA>, for robot applications. |
| 290 | | | | | |
| 291 | | | | | =head2 An Example |
| 292 | | | | | |
| 293 | | | | | This example shows how the user agent, a request and a response are |
| 294 | | | | | represented in actual perl code: |
| 295 | | | | | |
| 296 | | | | | # Create a user agent object |
| 297 | | | | | use LWP::UserAgent; |
| 298 | | | | | my $ua = LWP::UserAgent->new; |
| 299 | | | | | $ua->agent("MyApp/0.1 "); |
| 300 | | | | | |
| 301 | | | | | # Create a request |
| 302 | | | | | my $req = HTTP::Request->new(POST => 'http://search.cpan.org/search'); |
| 303 | | | | | $req->content_type('application/x-www-form-urlencoded'); |
| 304 | | | | | $req->content('query=libwww-perl&mode=dist'); |
| 305 | | | | | |
| 306 | | | | | # Pass request to the user agent and get a response back |
| 307 | | | | | my $res = $ua->request($req); |
| 308 | | | | | |
| 309 | | | | | # Check the outcome of the response |
| 310 | | | | | if ($res->is_success) { |
| 311 | | | | | print $res->content; |
| 312 | | | | | } |
| 313 | | | | | else { |
| 314 | | | | | print $res->status_line, "\n"; |
| 315 | | | | | } |
| 316 | | | | | |
| 317 | | | | | The $ua is created once when the application starts up. New request |
| 318 | | | | | objects should normally be created for each request sent. |
| 319 | | | | | |
| 320 | | | | | |
| 321 | | | | | =head1 NETWORK SUPPORT |
| 322 | | | | | |
| 323 | | | | | This section discusses the various protocol schemes and |
| 324 | | | | | the HTTP style methods and headers that may be used for each. |
| 325 | | | | | |
| 326 | | | | | For all requests, a "User-Agent" header is added and initialized from |
| 327 | | | | | the $ua->agent attribute before the request is handed to the network |
| 328 | | | | | layer. In the same way, a "From" header is initialized from the |
| 329 | | | | | $ua->from attribute. |
| 330 | | | | | |
| 331 | | | | | For all responses, the library adds a header called "Client-Date". |
| 332 | | | | | This header holds the time when the response was received by |
| 333 | | | | | your application. The format and semantics of the header are the |
| 334 | | | | | same as the server created "Date" header. You may also encounter other |
| 335 | | | | | "Client-XXX" headers. They are all generated by the library |
| 336 | | | | | internally and are not received from the servers. |
| 337 | | | | | |
| 338 | | | | | =head2 HTTP Requests |
| 339 | | | | | |
| 340 | | | | | HTTP requests are just handed off to an HTTP server and it |
| 341 | | | | | decides what happens. Few servers implement methods besides the usual |
| 342 | | | | | "GET", "HEAD", "POST" and "PUT", but CGI-scripts may implement |
| 343 | | | | | any method they like. |
| 344 | | | | | |
| 345 | | | | | If the server is not available then the library will generate an |
| 346 | | | | | internal error response. |
| 347 | | | | | |
| 348 | | | | | The library automatically adds a "Host" and a "Content-Length" header |
| 349 | | | | | to the HTTP request before it is sent over the network. |
| 350 | | | | | |
| 351 | | | | | For a GET request you might want to add an "If-Modified-Since" or |
| 352 | | | | | "If-None-Match" header to make the request conditional. |
| 353 | | | | | |
| 354 | | | | | For a POST request you should add the "Content-Type" header. When you |
| 355 | | | | | try to emulate HTML E<lt>FORM> handling you should usually let the value |
| 356 | | | | | of the "Content-Type" header be "application/x-www-form-urlencoded". |
| 357 | | | | | See L<lwpcook> for examples of this. |
| 358 | | | | | |
| 359 | | | | | The libwww-perl HTTP implementation currently supports the HTTP/1.1 |
| 360 | | | | | and HTTP/1.0 protocols. |
| 361 | | | | | |
| 362 | | | | | The library allows you to access a proxy server through HTTP. This |
| 363 | | | | | means that you can set up the library to forward all types of requests |
| 364 | | | | | through the HTTP protocol module. See L<LWP::UserAgent> for |
| 365 | | | | | documentation of this. |
| 366 | | | | | |
| 367 | | | | | |
| 368 | | | | | =head2 HTTPS Requests |
| 369 | | | | | |
| 370 | | | | | HTTPS requests are HTTP requests over an encrypted network connection |
| 371 | | | | | using the SSL protocol developed by Netscape. Everything about HTTP |
| 372 | | | | | requests above also applies to HTTPS requests. In addition, the library |
| 373 | | | | | will add the headers "Client-SSL-Cipher", "Client-SSL-Cert-Subject" and |
| 374 | | | | | "Client-SSL-Cert-Issuer" to the response. These headers denote the |
| 375 | | | | | encryption method used and the name of the server owner. |
| 376 | | | | | |
| 377 | | | | | The request can contain the header "If-SSL-Cert-Subject" in order to |
| 378 | | | | | make the request conditional on the content of the server certificate. |
| 379 | | | | | If the certificate subject does not match, no request is sent to the |
| 380 | | | | | server and an internally generated error response is returned. The |
| 381 | | | | | value of the "If-SSL-Cert-Subject" header is interpreted as a Perl |
| 382 | | | | | regular expression. |
| 383 | | | | | |
| 384 | | | | | |
| 385 | | | | | =head2 FTP Requests |
| 386 | | | | | |
| 387 | | | | | The library currently supports GET, HEAD and PUT requests. GET |
| 388 | | | | | retrieves a file or a directory listing from an FTP server. PUT |
| 389 | | | | | stores a file on an FTP server. |
| 390 | | | | | |
| 391 | | | | | You can specify an FTP account for servers that want this in addition |
| 392 | | | | | to user name and password. This is specified by including an "Account" |
| 393 | | | | | header in the request. |
| 394 | | | | | |
| 395 | | | | | User name/password can be specified using basic authorization or be |
| 396 | | | | | encoded in the URL. Failed logins return an UNAUTHORIZED response with |
| 397 | | | | | "WWW-Authenticate: Basic" and can be treated like basic authorization |
| 398 | | | | | for HTTP. |
| 399 | | | | | |
| 400 | | | | | The library supports ftp ASCII transfer mode by specifying the "type=a" |
| 401 | | | | | parameter in the URL. It also supports transfer of ranges for FTP transfers |
| 402 | | | | | using the "Range" header. |
| 403 | | | | | |
| 404 | | | | | Directory listings are by default returned unprocessed (as returned |
| 405 | | | | | from the ftp server) with the content media type reported to be |
| 406 | | | | | "text/ftp-dir-listing". The C<File::Listing> module provides methods |
| 407 | | | | | for parsing these directory listings. |
| 408 | | | | | |
| 409 | | | | | The ftp module is also able to convert directory listings to HTML and |
| 410 | | | | | this can be requested via the standard HTTP content negotiation |
| 411 | | | | | mechanisms (add an "Accept: text/html" header in the request if you |
| 412 | | | | | want this). |
| 413 | | | | | |
| 414 | | | | | For normal file retrievals, the "Content-Type" is guessed based on the |
| 415 | | | | | file name suffix. See L<LWP::MediaTypes>. |
| 416 | | | | | |
| 417 | | | | | The "If-Modified-Since" request header works for servers that implement |
| 418 | | | | | the MDTM command. It will probably not work for directory listings though. |
| 419 | | | | | |
| 420 | | | | | Example: |
| 421 | | | | | |
| 422 | | | | | $req = HTTP::Request->new(GET => 'ftp://me:passwd@ftp.some.where.com/'); |
| 423 | | | | | $req->header(Accept => "text/html, */*;q=0.1"); |
| 424 | | | | | |
| 425 | | | | | =head2 News Requests |
| 426 | | | | | |
| 427 | | | | | Access to the USENET News system is implemented through the NNTP |
| 428 | | | | | protocol. The name of the news server is obtained from the |
| 429 | | | | | NNTP_SERVER environment variable and defaults to "news". It is not |
| 430 | | | | | possible to specify the hostname of the NNTP server in news: URLs. |
| 431 | | | | | |
| 432 | | | | | The library supports GET and HEAD to retrieve news articles through the |
| 433 | | | | | NNTP protocol. You can also post articles to newsgroups by using |
| 434 | | | | | (surprise!) the POST method. |
| 435 | | | | | |
| 436 | | | | | GET on newsgroups is not implemented yet. |
| 437 | | | | | |
| 438 | | | | | Examples: |
| 439 | | | | | |
| 440 | | | | | $req = HTTP::Request->new(GET => 'news:abc1234@a.sn.no'); |
| 441 | | | | | |
| 442 | | | | | $req = HTTP::Request->new(POST => 'news:comp.lang.perl.test'); |
| 443 | | | | | $req->header(Subject => 'This is a test', |
| 444 | | | | | From => 'me@some.where.org'); |
| 445 | | | | | $req->content(<<EOT); |
| 446 | | | | | This is the content of the message that we are sending to |
| 447 | | | | | the world. |
| 448 | | | | | EOT |
| 449 | | | | | |
| 450 | | | | | |
| 451 | | | | | =head2 Gopher Request |
| 452 | | | | | |
| 453 | | | | | The library supports the GET and HEAD methods for gopher requests. All |
| 454 | | | | | request header values are ignored. HEAD cheats and returns a |
| 455 | | | | | response without even talking to the server. |
| 456 | | | | | |
| 457 | | | | | Gopher menus are always converted to HTML. |
| 458 | | | | | |
| 459 | | | | | The response "Content-Type" is generated from the document type |
| 460 | | | | | encoded (as the first letter) in the request URL path itself. |
| 461 | | | | | |
| 462 | | | | | Example: |
| 463 | | | | | |
| 464 | | | | | $req = HTTP::Request->new(GET => 'gopher://gopher.sn.no/'); |
| 465 | | | | | |
| 466 | | | | | |
| 467 | | | | | |
| 468 | | | | | =head2 File Request |
| 469 | | | | | |
| 470 | | | | | The library supports GET and HEAD methods for file requests. The |
| 471 | | | | | "If-Modified-Since" header is supported. All other headers are |
| 472 | | | | | ignored. The I<host> component of the file URL must be empty or set |
| 473 | | | | | to "localhost". Any other I<host> value will be treated as an error. |
| 474 | | | | | |
| 475 | | | | | Directories are always converted to an HTML document. For normal |
| 476 | | | | | files, the "Content-Type" and "Content-Encoding" in the response are |
| 477 | | | | | guessed based on the file suffix. |
| 478 | | | | | |
| 479 | | | | | Example: |
| 480 | | | | | |
| 481 | | | | | $req = HTTP::Request->new(GET => 'file:/etc/passwd'); |
| 482 | | | | | |
| 483 | | | | | |
| 484 | | | | | =head2 Mailto Request |
| 485 | | | | | |
| 486 | | | | | You can send (aka "POST") mail messages using the library. All |
| 487 | | | | | headers specified for the request are passed on to the mail system. |
| 488 | | | | | The "To" header is initialized from the mail address in the URL. |
| 489 | | | | | |
| 490 | | | | | Example: |
| 491 | | | | | |
| 492 | | | | | $req = HTTP::Request->new(POST => 'mailto:libwww@perl.org'); |
| 493 | | | | | $req->header(Subject => "subscribe"); |
| 494 | | | | | $req->content("Please subscribe me to the libwww-perl mailing list!\n"); |
| 495 | | | | | |
| 496 | | | | | =head2 CPAN Requests |
| 497 | | | | | |
| 498 | | | | | URLs with scheme C<cpan:> are redirected to a suitable CPAN |
| 499 | | | | | mirror. If you have your own local mirror of CPAN you might tell LWP |
| 500 | | | | | to use it for C<cpan:> URLs by an assignment like this: |
| 501 | | | | | |
| 502 | | | | | $LWP::Protocol::cpan::CPAN = "file:/local/CPAN/"; |
| 503 | | | | | |
| 504 | | | | | Suitable CPAN mirrors are also picked up from the configuration for |
| 505 | | | | | the CPAN.pm, so if you have used that module a suitable mirror should |
| 506 | | | | | be picked automatically. If neither of these applies, then a redirect |
| 507 | | | | | to the generic CPAN http location is issued. |
| 508 | | | | | |
| 509 | | | | | Example request to download the newest perl: |
| 510 | | | | | |
| 511 | | | | | $req = HTTP::Request->new(GET => "cpan:src/latest.tar.gz"); |
| 512 | | | | | |
| 513 | | | | | |
| 514 | | | | | =head1 OVERVIEW OF CLASSES AND PACKAGES |
| 515 | | | | | |
| 516 | | | | | This table should give you a quick overview of the classes provided by the |
| 517 | | | | | library. Indentation shows class inheritance. |
| 518 | | | | | |
| 519 | | | | | LWP::MemberMixin -- Access to member variables of Perl5 classes |
| 520 | | | | | LWP::UserAgent -- WWW user agent class |
| 521 | | | | | LWP::RobotUA -- When developing robot applications |
| 522 | | | | | LWP::Protocol -- Interface to various protocol schemes |
| 523 | | | | | LWP::Protocol::http -- http:// access |
| 524 | | | | | LWP::Protocol::file -- file:// access |
| 525 | | | | | LWP::Protocol::ftp -- ftp:// access |
| 526 | | | | | ... |
| 527 | | | | | |
| 528 | | | | | LWP::Authen::Basic -- Handle 401 and 407 responses |
| 529 | | | | | LWP::Authen::Digest |
| 530 | | | | | |
| 531 | | | | | HTTP::Headers -- MIME/RFC822 style header (used by HTTP::Message) |
| 532 | | | | | HTTP::Message -- HTTP style message |
| 533 | | | | | HTTP::Request -- HTTP request |
| 534 | | | | | HTTP::Response -- HTTP response |
| 535 | | | | | HTTP::Daemon -- An HTTP server class |
| 536 | | | | | |
| 537 | | | | | WWW::RobotRules -- Parse robots.txt files |
| 538 | | | | | WWW::RobotRules::AnyDBM_File -- Persistent RobotRules |
| 539 | | | | | |
| 540 | | | | | Net::HTTP -- Low level HTTP client |
| 541 | | | | | |
| 542 | | | | | The following modules provide various functions and definitions. |
| 543 | | | | | |
| 544 | | | | | LWP -- This file. Library version number and documentation. |
| 545 | | | | | LWP::MediaTypes -- MIME types configuration (text/html etc.) |
| 546 | | | | | LWP::Simple -- Simplified procedural interface for common functions |
| 547 | | | | | HTTP::Status -- HTTP status code (200 OK etc) |
| 548 | | | | | HTTP::Date -- Date parsing module for HTTP date formats |
| 549 | | | | | HTTP::Negotiate -- HTTP content negotiation calculation |
| 550 | | | | | File::Listing -- Parse directory listings |
| 551 | | | | | HTML::Form -- Processing for <form>s in HTML documents |
| 552 | | | | | |
| 553 | | | | | |
| 554 | | | | | =head1 MORE DOCUMENTATION |
| 555 | | | | | |
| 556 | | | | | All modules contain detailed information on the interfaces they |
| 557 | | | | | provide. The I<lwpcook> manpage is the libwww-perl cookbook that contains |
| 558 | | | | | examples of typical usage of the library. You might want to take a |
| 559 | | | | | look at how the scripts C<lwp-request>, C<lwp-rget> and C<lwp-mirror> |
| 560 | | | | | are implemented. |
| 561 | | | | | |
| 562 | | | | | =head1 ENVIRONMENT |
| 563 | | | | | |
| 564 | | | | | The following environment variables are used by LWP: |
| 565 | | | | | |
| 566 | | | | | =over |
| 567 | | | | | |
| 568 | | | | | =item HOME |
| 569 | | | | | |
| 570 | | | | | The C<LWP::MediaTypes> functions will look for the F<.media.types> and |
| 571 | | | | | F<.mime.types> files relative to your home directory. |
| 572 | | | | | |
| 573 | | | | | =item http_proxy |
| 574 | | | | | |
| 575 | | | | | =item ftp_proxy |
| 576 | | | | | |
| 577 | | | | | =item xxx_proxy |
| 578 | | | | | |
| 579 | | | | | =item no_proxy |
| 580 | | | | | |
| 581 | | | | | These environment variables can be set to enable communication through |
| 582 | | | | | a proxy server. See the description of the C<env_proxy> method in |
| 583 | | | | | L<LWP::UserAgent>. |
| 584 | | | | | |
| 585 | | | | | =item PERL_LWP_USE_HTTP_10 |
| 586 | | | | | |
| 587 | | | | | Enable the old HTTP/1.0 protocol driver instead of the new HTTP/1.1 |
| 588 | | | | | driver. You might want to set this to a TRUE value if you discover |
| 589 | | | | | that your old LWP applications fail after you installed LWP-5.60 or |
| 590 | | | | | better. |
| 591 | | | | | |
| 592 | | | | | =item PERL_HTTP_URI_CLASS |
| 593 | | | | | |
| 594 | | | | | Used to decide what URI objects to instantiate. The default is C<URI>. |
| 595 | | | | | You might want to set it to C<URI::URL> for compatibility with old times. |
| 596 | | | | | |
| 597 | | | | | =back |
| 598 | | | | | |
| 599 | | | | | =head1 AUTHORS |
| 600 | | | | | |
| 601 | | | | | LWP was made possible by contributions from Adam Newby, Albert |
| 602 | | | | | Dvornik, Alexandre Duret-Lutz, Andreas Gustafsson, Andreas König, |
| 603 | | | | | Andrew Pimlott, Andy Lester, Ben Coleman, Benjamin Low, Ben Low, Ben |
| 604 | | | | | Tilly, Blair Zajac, Bob Dalgleish, BooK, Brad Hughes, Brian |
| 605 | | | | | J. Murrell, Brian McCauley, Charles C. Fu, Charles Lane, Chris Nandor, |
| 606 | | | | | Christian Gilmore, Chris W. Unger, Craig Macdonald, Dale Couch, Dan |
| 607 | | | | | Kubb, Dave Dunkin, Dave W. Smith, David Coppit, David Dick, David |
| 608 | | | | | D. Kilzer, Doug MacEachern, Edward Avis, erik, Gary Shea, Gisle Aas, |
| 609 | | | | | Graham Barr, Gurusamy Sarathy, Hans de Graaff, Harald Joerg, Harry |
| 610 | | | | | Bochner, Hugo, Ilya Zakharevich, INOUE Yoshinari, Ivan Panchenko, Jack |
| 611 | | | | | Shirazi, James Tillman, Jan Dubois, Jared Rhine, Jim Stern, Joao |
| 612 | | | | | Lopes, John Klar, Johnny Lee, Josh Kronengold, Josh Rai, Joshua |
| 613 | | | | | Chamas, Joshua Hoblitt, Kartik Subbarao, Keiichiro Nagano, Ken |
| 614 | | | | | Williams, KONISHI Katsuhiro, Lee T Lindley, Liam Quinn, Marc Hedlund, |
| 615 | | | | | Marc Langheinrich, Mark D. Anderson, Marko Asplund, Mark Stosberg, |
| 616 | | | | | Markus B Krüger, Markus Laker, Martijn Koster, Martin Thurn, Matthew |
| 617 | | | | | Eldridge, Matthew.van.Eerde, Matt Sergeant, Michael A. Chase, Michael |
| 618 | | | | | Quaranta, Michael Thompson, Mike Schilli, Moshe Kaminsky, Nathan |
| 619 | | | | | Torkington, Nicolai Langfeldt, Norton Allen, Olly Betts, Paul |
| 620 | | | | | J. Schinder, peterm, Philip Guenther, Daniel Buenzli, Pon Hwa Lin, |
| 621 | | | | | Radoslaw Zielinski, Radu Greab, Randal L. Schwartz, Richard Chen, |
| 622 | | | | | Robin Barker, Roy Fielding, Sander van Zoest, Sean M. Burke, |
| 623 | | | | | shildreth, Slaven Rezic, Steve A Fink, Steve Hay, Steven Butler, |
| 624 | | | | | Steve_Kilbane, Takanori Ugai, Thomas Lotterer, Tim Bunce, Tom Hughes, |
| 625 | | | | | Tony Finch, Ville Skyttä, Ward Vandewege, William York, Yale Huang, |
| 626 | | | | | and Yitzchak Scott-Thoennes. |
| 627 | | | | | |
| 628 | | | | | LWP owes a lot in motivation, design, and code, to the libwww-perl |
| 629 | | | | | library for Perl4 by Roy Fielding, which included work from Alberto |
| 630 | | | | | Accomazzi, James Casey, Brooks Cutter, Martijn Koster, Oscar |
| 631 | | | | | Nierstrasz, Mel Melchner, Gertjan van Oosten, Jared Rhine, Jack |
| 632 | | | | | Shirazi, Gene Spafford, Marc VanHeyningen, Steven E. Brenner, Marion |
| 633 | | | | | Hakanson, Waldemar Kebsch, Tony Sanders, and Larry Wall; see the |
| 634 | | | | | libwww-perl-0.40 library for details. |
| 635 | | | | | |
| 636 | | | | | =head1 COPYRIGHT |
| 637 | | | | | |
| 638 | | | | | Copyright 1995-2009, Gisle Aas |
| 639 | | | | | Copyright 1995, Martijn Koster |
| 640 | | | | | |
| 641 | | | | | This library is free software; you can redistribute it and/or |
| 642 | | | | | modify it under the same terms as Perl itself. |
| 643 | | | | | |
| 644 | | | | | =head1 AVAILABILITY |
| 645 | | | | | |
| 646 | | | | | The latest version of this library is likely to be available from CPAN |
| 647 | | | | | as well as: |
| 648 | | | | | |
| 649 | | | | | http://gitorious.org/projects/libwww-perl |
| 650 | | | | | |
| 651 | | | | | The best place to discuss this code is on the <libwww@perl.org> |
| 652 | | | | | mailing list. |
| 653 | | | | | |
| 654 | | | | | =cut |