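Fetching web pages and accessing FTP servers are very common application requirements today. Search engine crawlers, analysis programs, resource retrieval programs, web services and so on all need them. Writing your own fetching library would of course be ideal, but development takes time, so using an existing open source program is the better choice: first, someone else has already written it well; second, you can get up and running very quickly; third, you can learn from the strengths of other people's code.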


libwww
Official site: http://www.w3.org/Library/
More information: http://www.w3.org/Library/User/
Platforms: Unix/Linux, Windows

Libwww is a highly modular client-side web access API written in C.


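libcurl

Official site: http://curl.haxx.se/libcurl
More features: http://curl.haxx.se/docs/features.html
Platforms: Unix/Linux, Windows


libcurl is a free, open source, client-side URL transfer library that supports FTP, FTPS, TFTP, HTTP, HTTPS, GOPHER, TELNET, DICT, FILE and LDAP. It is cross-platform (Windows, Unix, Linux and others), thread-safe, supports IPv6, and is easy to use.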


libfetch
Official site: http://libfetch.darwinports.com/
More information: http://www.freebsd.org/cgi/man.cgi?query=fetch&sektion=3
Platforms: BSD


HTTP/FTP client libraries
Source: http://curl.haxx.se/libcurl/competitors.html

Free Software and Open Source projects have a long tradition of forks and duplicate efforts. We enjoy "doing it ourselves", no matter if someone else has done something very similar already. Free/open libraries that cover parts of libcurl's features:

libcurl (MIT)

A highly portable and easy-to-use client-side URL transfer library, supporting FTP, FTPS, HTTP, HTTPS, SCP, SFTP, TELNET, DICT, FILE, TFTP and LDAP. libcurl also supports HTTPS certificates, HTTP POST, HTTP PUT, FTP uploading, Kerberos, HTTP form based upload, proxies, cookies, user+password authentication, file transfer resume, HTTP proxy tunnelling and more!

libghttp (LGPL)

Having a glance at libghttp (a GNOME HTTP library), it looks as if it works rather similarly to libcurl (for HTTP). There's no web page for this and the person whose email is mentioned in the README of the latest release I found claims he has passed the leadership of the project to "eazel". Popular choice among GNOME projects.

libwww (W3C license) comparison with libcurl

More complex and harder to use than libcurl. Includes everything from multi-threading to HTML parsing. The most notable transfer-related feature that libwww offers but libcurl does not is caching.

libferit (GPL)

C++ library "for transferring files via http, ftp, gopher, proxy server". Based on 'snarf' 2.0.9-code (formerly known as libsnarf). Quote from freshmeat: "As the author of snarf, I have to say this frightens me. Snarf's networking system is far from robust and complete. It's probably full of bugs, and although it works for maybe 85% of all current situations, I wouldn't base a library on it."

neon (LGPL)

An HTTP and WebDAV client library, with a C interface. I've mainly heard and seen people use this with WebDAV as their main interest.

(LGPL) comparison with libcurl

Part of glib (GNOME). Supports: HTTP 1.1, persistent connections, asynchronous DNS and transfers, connection cache, redirects, Basic, Digest and NTLM authentication, SSL with OpenSSL or Mozilla NSS, proxy support including SSL, SOCKS support, POST data. Probably not very portable. Lacks: cookie support, NTLM for proxies, GSS, gzip encoding, trailers in chunked responses and more.

mozilla netlib (MPL)

Handles URLs, protocols, transports for the Mozilla browser.

mozilla libxpnet (MPL)

Minimal download library targeted to be much smaller than the above mentioned netlib. HTTP and FTP support.

wget (GPL)

While not a library at all, I've been told that people sometimes extract the network code from it and base their own hacks from there.

libfetch (BSD)

Does HTTP and FTP transfers (both ways), supports file: URLs, and an API for URL parsing. The utility fetch that is built on libfetch is an integral part of the FreeBSD operating system.

HTTPFetcher (LGPL)

"a small, robust, flexible library for downloading files via HTTP using the GET method."

http-tiny (Artistic License)

"a very small C library to make http queries (GET, HEAD, PUT, DELETE, etc.) easily portable and embeddable"

XMLHTTP Object also known as IXMLHTTPRequest (part of MSXML 3.0)

(Windows) Provides client-side protocol support for communication with HTTP servers. A client computer can use the XMLHTTP object to send an arbitrary HTTP request, receive the response, and have the Microsoft XML Document Object Model (DOM) parse that response.

QHttp (GPL)

QHttp is a class in the Qt library from Troll Tech. Seems to be restricted to plain HTTP. Supports GET, POST and proxy. Asynchronous.

ftplib (GPL)

"a set of routines that implement the FTP protocol. They allow applications to create and access remote files through function calls instead of needing to fork and exec an interactive ftp client program."

ftplibpp (GPL)

A C++ library for "easy FTP client functionality. It features resuming of up- and downloads, FXP support, SSL/TLS encryption, and logging functionality."

GNU Common C++ library

Has a URLStream class. This C++ class allows you to download a file using HTTP. See demo/urlfetch.cpp in commoncpp2-1.3.19.tar.gz

HTTPClient (LGPL)

Java HTTP client library.

Jakarta Commons HttpClient (Apache License)

A Java HTTP client library written by the Jakarta project.
