This page tells you how to find and get Project Gutenberg eBooks if:
Find our RSS feed in the cache/feeds location. It is updated daily after 2 a.m. U.S. Eastern time.
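As a minimal sketch of reading such a feed, the Python below extracts the title and link of each entry from RSS 2.0 text using only the standard library. It assumes the feed follows the usual RSS 2.0 layout (item elements containing title and link); the sample feed, the function name parse_rss_items, and the example.org links are made up for illustration and are not taken from the real feed.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(rss_text):
    """Return a list of (title, link) tuples from RSS 2.0 text."""
    root = ET.fromstring(rss_text)
    items = []
    # RSS 2.0 places <item> elements under <channel>; iter() finds them all.
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        items.append((title, link))
    return items


# Hypothetical sample feed, used here instead of a network download.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>Example Book One</title>
      <link>https://example.org/ebooks/10001</link>
    </item>
    <item>
      <title>Example Book Two</title>
      <link>https://example.org/ebooks/10002</link>
    </item>
  </channel>
</rss>"""

for title, link in parse_rss_items(SAMPLE_RSS):
    print(title, link)
```

To use it with the real feed, download the file once per day at most (it is only updated daily) and pass its text to parse_rss_items.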
The “posted” list is where every new eBook is announced as it is uploaded to the Project Gutenberg servers. New books are then available for download, typically within 2 hours. The list offers a once-daily digest option and online public archives.
The Project Gutenberg collection is available from dozens of sites offering access via http/https, ftp, rsync, and a few other methods. See our listing of mirror sites to choose the location, access method, or speed. Mirrors generally do not have a friendly Web-based front end, but do have the collection. See the mirroring how-to for details.
These plain text files are updated at least monthly. They provide the basic information about each eBook and are good for searching from your own system (for example, use control-F in a Web browser or word processor). They are the accession lists for Project Gutenberg. Note that these files are not recommended for automation (that is, for use as input to generate a computerized database). Instead, use one of the catalog files mentioned below.
If GUTINDEX.ALL is too big for you or you prefer separate annual lists, you can download GUTINDEX files by year.
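For searching a downloaded index file from your own system, a small script can do the same job as control-F. The sketch below is only a case-insensitive line search, not a parser; it makes no assumption about the layout of the GUTINDEX files. The function name search_index and the sample lines are hypothetical and do not reproduce the real file format.

```python
def search_index(lines, term):
    """Return (line_number, text) pairs for lines containing term, ignoring case."""
    needle = term.casefold()
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if needle in line.casefold()
    ]


# Made-up sample lines standing in for a downloaded index file.
SAMPLE_LINES = [
    "Example Book One, by A. Author        10001\n",
    "Another Example, by B. Writer         10002\n",
    "Example Book Two, by A. Author        10003\n",
]

for number, text in search_index(SAMPLE_LINES, "a. author"):
    print(number, text)

# With a local copy (the path is an assumption), pass the open file instead:
# with open("GUTINDEX.ALL", encoding="utf-8", errors="replace") as index_file:
#     hits = search_index(index_file, "a. author")
```

This is meant for ad hoc lookups only. For building a database, the catalog files are the better input, as noted above.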
Not part of Project Gutenberg - check the laws of the country where you are before accessing or redistributing any eBooks.
You can navigate the directory/folder contents starting at /dirs; however, this is not very user-friendly.
All Project Gutenberg metadata is available in the XML/RDF format. It is updated daily (except for the legacy format mentioned below). Please use one of these files as input to a database or other tools you may be developing, instead of crawling or roboting the website.
Note that the exact same metadata is available as a per-eBook .rdf file. These are found in the cache/epub (i.e., cache/generated) directory, accessible by mirroring or by the directory/folder listings above. The large XML/RDF file is simply a concatenation of all the per-eBook metadata.
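As a rough sketch of using a per-eBook .rdf file as database input, the Python below reads the eBook number and title from RDF/XML text with the standard library. The element names (pgterms:ebook, dcterms:title), the namespace URIs, and the rdf:about form "ebooks/NNN" are assumptions about the file layout and should be checked against a real file; the sample record and the function name read_ebook_metadata are made up.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; verify them against an actual .rdf file.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcterms": "http://purl.org/dc/terms/",
    "pgterms": "http://www.gutenberg.org/2009/pgterms/",
}


def read_ebook_metadata(rdf_text):
    """Return {'id', 'title'} for the first ebook record, or None if absent."""
    root = ET.fromstring(rdf_text)
    ebook = root.find("pgterms:ebook", NS)
    if ebook is None:
        return None
    # The rdf:about attribute is assumed to end with the eBook number.
    about = ebook.get("{%s}about" % NS["rdf"], "")
    return {
        "id": about.rsplit("/", 1)[-1],
        "title": ebook.findtext("dcterms:title", default="", namespaces=NS),
    }


# Hypothetical sample record, not copied from the real catalog.
SAMPLE_RDF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:pgterms="http://www.gutenberg.org/2009/pgterms/">
  <pgterms:ebook rdf:about="ebooks/10001">
    <dcterms:title>Example Book One</dcterms:title>
  </pgterms:ebook>
</rdf:RDF>"""

print(read_ebook_metadata(SAMPLE_RDF))
```

Working from a mirrored copy of these files avoids repeated requests to the website, which is the approach the text above asks for.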
MARC is a common metadata format used by library card catalog databases. Steve Thomas of the University of Adelaide provided a Perl script to generate MARC records from the XML/RDF catalog files. Find it here: pgrdf2marc.pl. You will need to rename it and make any changes necessary to run it on your own system. This is unsupported software, provided without warranty or guarantee.
These instructions were provided to Project Gutenberg, and are listed here in the hope that they may be useful.
Kiwix is an application that lets you download a large collection and use it locally. A copy of the Project Gutenberg content was made available in November 2018, and may be updated periodically.