A Journalist's Database of Databases
by Drew Sullivan

Data on Web sites comes in many different forms. Sometimes it can be downloaded as a spreadsheet or database. Sometimes you'll have to do a little manipulation to get it into the format you need. Below are links to sites I have found useful, each marked with a symbol for its file type. Depending on the format of the data on the Web site, you'll have to handle each set differently.

ascii.gif (217 bytes) ASCII delimited or fixed-width file (easily imported into a database or spreadsheet)
table.gif (161 bytes) HTML file (can be cut and pasted into a spreadsheet or imported using Excel)
excel.gif (152 bytes) Spreadsheet file
foxpro.gif (1104 bytes) Database file
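
Fixed-width ASCII files usually need the most manipulation, because the column positions come from a separate record layout. Below is a minimal Python sketch of that step, not any agency's own method. The layout, the field names and the sample records are all made up for illustration; a real layout has to come from the data dictionary published with the file.

```python
import csv
import io

# Hypothetical record layout: (field name, start column, end column),
# 0-based with the end column exclusive. A real layout comes from the
# data dictionary that accompanies the file.
LAYOUT = [("state", 0, 2), ("county", 2, 27), ("population", 27, 36)]


def parse_fixed_width(lines, layout):
    """Slice each line into fields by column position and trim the padding."""
    rows = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        rows.append({name: line[start:end].strip() for name, start, end in layout})
    return rows


def to_csv(rows, layout):
    """Return the rows as comma-delimited text that a spreadsheet can open."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=[name for name, _, _ in layout], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Made-up sample records, padded to the widths in LAYOUT.
SAMPLE = [
    "XX" + "Alpha County".ljust(25) + "12345".rjust(9),
    "XX" + "Beta County".ljust(25) + "678".rjust(9),
]

rows = parse_fixed_width(SAMPLE, LAYOUT)
csv_text = to_csv(rows, LAYOUT)
print(csv_text)
```

Comma-delimited files skip the slicing step: Python's csv module, or a spreadsheet's import screen, can read them directly.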

US Downloadable Databases - These are databases that can be downloaded in whole or in part and imported into a spreadsheet or database.

ascii.gif (217 bytes) US Census Data Lookup The Census Bureau site allows you to download parts of the 1990 census and look at the latest Census 2000 summary data. The Census Bureau has a lot of other data files on its data access site. If you haven't worked with census data, make sure you read the instructions. You can also download TIGER data here for mapping.
ascii.gif (217 bytes) Government Information Sharing Project This site has a number of commonly used databases that can be searched. Unfortunately, the search screens limit what you can do. The site is easy to use and very well documented. Data includes the Agriculture Census, Consolidated Federal Funds data, Equal Employment Opportunity data, import/export data, school district data and economic census data. Most data can be saved and imported into a spreadsheet after a few queries.
ascii.gif (217 bytes) Federal Election Commission The FEC periodically puts up data on its site. The site is slow and often the data can't be downloaded en masse. When it can, it is usually in a comma-delimited format.
ascii.gif (217 bytes) FECInfo Election data This election site is easy to use and has all the contribution data, though much of it is only searchable through a front-end. Lots of subscriber stuff as well.
ascii.gif (217 bytes) Health Care Financing Administration Data Medicare and Medicaid statistics. Includes MEDPAR data, HCFA public use file summaries and other neat tidbits. A lot of the data has been summarized, making it a little less useful. Data is usually fixed-width, stored as zipped files.
ascii.gif (217 bytes) excel_2.gif (276 bytes) Center for Disease Control Data The National Center for Health Statistics under the CDC has a lot of nice downloadable datasets on mortality and health in its data warehouse. You can download the complete 1998 ICD-9 and 2000 ICD-9 here as well (the coding manual for cause of death used by many state and federal agencies) along with a guide to the ICD-9. Data is in Lotus 1-2-3 and ASCII formats.
table.gif (161 bytes) The National Center for Health Statistics Some specific health/disease related tables from the agency above in HTML format.
ascii.gif (217 bytes) Emergency Response Notification System (ERNS) The EPA compiles ERNS data, which covers unexpected hazardous spills. The data can be downloaded by EPA region and is in a fixed-length ASCII file. Decent documentation and data dictionaries exist as well. See below in searchable databases for the Toxic Release Inventory. Other EPA Databases are available here.
excel_2.gif (276 bytes) Bureau of Justice Statistics Tons of crime-related numbers from the gods of crime statistics. Also check their tables used in BJS reports for more data.
Office of Juvenile Justice Statistics Juvenile justice crime statistics from the US Office of Juvenile Justice and Delinquency Prevention. Much of the data is in an annoying program you have to load on your machine.
ascii.gif (217 bytes) National Archive of Criminal Justice Data An archive affiliated with the Bureau of Justice Statistics. Don't try to use this data on deadline. Site generally sucks.
ascii.gif (217 bytes) excel_2.gif (276 bytes) IRS Statistics The IRS Statistics of Income program tracks all sorts of data but always in summary form. You can find all sorts of information on non-profits (including the database of tax data for approved non-profits, downloadable in ASCII fixed-length form) and other stats on income earned, migration and foreign taxes paid. All are downloadable, often in spreadsheet form. Some databases are VERY big and the site is VERY slow.
foxpro.gif (1104 bytes) Federal Railroad Administration Accident Data Gone is the old gopher site but this web page still lets you download files in DBF format. Includes good record layouts. Very clean. Very nice. There are some nice summary tables that can be imported into spreadsheets and other goodies at their safety home page. 
ascii.gif (217 bytes) National Oceanic and Atmospheric Administration - Storm Prediction Center This NOAA site includes a nice archive with downloadable files on tornadoes and tornado deaths since 1950. There's also data on hail and wind damage. Nice.
excel_2.gif (276 bytes) USDA ERS State Fact Sheets Basic ag data on imports/exports, commodity prices, land use and production. Many sets are in Lotus spreadsheets, some maddeningly in PDF files.
ascii.gif (217 bytes) DOL - Mining accident data Excellent site with useful data summaries as well as raw data that can be downloaded as easy-to-use, self-extracting files. You need to look around for the raw data but it's there.
ascii.gif (217 bytes) BTS National Data Good transportation databases...some downloadable...some annoyingly limited with weak front-end search screens.
foxpro.gif (1104 bytes) ascii.gif (217 bytes) National Highway Traffic Safety Admin FTP site where you can download FARS (see below), vehicle recall, crash test and other safety databases.
ascii.gif (217 bytes) Fatal Accident Reporting System (FARS) A real improvement over previous FAR systems. If you can get through the complicated search screens and refine your search, you can get exactly what you're looking for and download the results...a must for large, complicated databases. Also allows you to do cross-tabs. Data also available via FTP.
excel_2.gif (276 bytes)ascii.gif (217 bytes) FDIC Bank Deposit Summaries Nicely engineered site that allows good flexibility to search and download some data but not all. Has call reports, summary of deposits and other banking statistics.
ascii.gif (217 bytes) HUD User A compilation of a number of HUD databases including the American Housing Survey, Assisted Housing Survey and the State of the Nation's Cities database. Data is in SAS or SPSS format and fixed-length ASCII. Lots of nice searchable stuff on employment as well.
ascii.gif (217 bytes) US Forest Service Inventory Not for the beginner. This is a complex database that tracks the "extent, condition, volume, growth, and depletions of timber on the Nation's forest land." Think of it as the Sears Catalog for lumber companies.
excel_2.gif (276 bytes) Federal Transit Administration Lots of data on the relative efficiency and costs between major public transportation projects in the US. Stored as lots of Lotus 1-2-3 files.
ascii.gif (217 bytes) Fish and Wildlife Service Endangered Species See the database download form to get a comma-separated file of all animals on the endangered species list. Also available are the plants list and delisted species.
excel_2.gif (276 bytes) Administration on Aging Lots of demographic data on older Americans in spreadsheet format.
excel_2.gif (276 bytes) Veteran's Administration Veteran Data Excel files with demographic data on veterans. There's also a page of dozens of Excel files dealing with veteran medical programs.
table.gif (161 bytes) Social Security Admin. - Actuarial Office Name distributions from social security applications. This will tell you what names have been most popular over the years. Okay...so it ain't investigative reporting.
ascii.gif (217 bytes) The Economic Statistics Briefing Room Economic data that can be downloaded into spreadsheets.
ascii.gif (217 bytes) The Social Statistics Briefing Room Social stats like crime and population.
ascii.gif (217 bytes) ATSDR HazDat Registry database Search by clickable map for Superfund and other serious pollution sites. You can download it into a spreadsheet very easily.
excel_2.gif (276 bytes)ascii.gif (217 bytes) Bureau of Labor Statistics Extract and download employment, wage, layoff and other labor stats to spreadsheet or ASCII files. Very good. 
ascii.gif (217 bytes) US Census Population Estimates You can download updated population estimates by state, city, MSA or county. You have to do a little cutting and pasting but it works okay.
ascii.gif (217 bytes) Health and Human Services list of excluded doctors On this site you can download a self-extracting file that has a list of all doctors and organizations excluded from doing business with the government.
excel_2.gif (276 bytes) ascii.gif (217 bytes) Bureau of Transportation Statistics On-time statistics for airlines by airport and other variables.
foxpro.gif (1104 bytes) ascii.gif (217 bytes) National Cancer Institute Atlas of Cancer. Great site where you can download .dbf or ASCII files of the occurrence and geographical distribution of various types of cancer. The site has a good mapping tool as well.
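
For the entries marked as HTML files, cutting and pasting works for one table but gets old fast. Here is a rough Python sketch that pulls the rows out of an HTML table using only the standard library. The TableExtractor class and the sample markup are made up for illustration, and real pages with nested tables or spanning cells will need more care.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every table cell, grouped into rows."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def _close_cell(self):
        if self._cell is not None:
            # Join the pieces and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_row()  # older pages often omit </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # older pages often omit </td>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()


# Made-up sample table; the last row leaves out its closing tags on purpose.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Rank</th></tr>
  <tr><td>Alpha</td><td>1</td></tr>
  <tr><td>Beta<td>2
</table>
"""

parser = TableExtractor()
parser.feed(SAMPLE_HTML)
parser.close()
table_rows = parser.rows
print(table_rows)
```

Each row comes back as a list of strings, so it can be handed to a csv writer and opened in a spreadsheet.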

Document Update: June 5, 2000