Package edu.umd.cloud9.collection.trecweb

Provides classes for working with the TREC web collections GOV2 and Wt10g.


Class Summary
Gov2DocnoMapping  
NumberTrecWebDocuments  
RepackGov2Documents Program to uncompress the GOV2 collection from the original distribution and repack as SequenceFiles.
RepackWt10gDocuments Program to uncompress the Wt10g collection from the original distribution and repack as SequenceFiles.
TrecWebDocument  
TrecWebDocumentInputFormat Hadoop InputFormat for processing TREC web collections.
TrecWebDocumentInputFormat.TrecWebRecordReader Hadoop RecordReader for reading TREC-formatted documents.
Wt10gDocnoMapping  
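
The record reader listed above reads TREC-formatted documents. As a rough illustration of what that involves, the following self-contained sketch splits raw text into <DOC>...</DOC> records and pulls out each <DOCNO>. It does not use or reproduce the Cloud9 classes: the class name TrecRecordSketch, its two helper methods, and the sample input are all hypothetical, and the real TrecWebRecordReader may work differently (for example, by streaming over file splits rather than holding text in memory).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of edu.umd.cloud9.collection.trecweb.
public class TrecRecordSketch {
    private static final String DOC_START = "<DOC>";
    private static final String DOC_END = "</DOC>";

    /** Splits raw text into one string per complete <DOC>...</DOC> record. */
    public static List<String> splitRecords(String raw) {
        List<String> records = new ArrayList<>();
        int pos = 0;
        while (true) {
            int start = raw.indexOf(DOC_START, pos);
            if (start < 0) {
                break; // no further records
            }
            int end = raw.indexOf(DOC_END, start);
            if (end < 0) {
                break; // truncated final record: skip it
            }
            end += DOC_END.length();
            records.add(raw.substring(start, end));
            pos = end;
        }
        return records;
    }

    /** Returns the trimmed text between <DOCNO> and </DOCNO>, or null if absent. */
    public static String extractDocno(String record) {
        int start = record.indexOf("<DOCNO>");
        int end = record.indexOf("</DOCNO>");
        if (start < 0 || end < start) {
            return null;
        }
        return record.substring(start + "<DOCNO>".length(), end).trim();
    }

    public static void main(String[] args) {
        // Made-up sample input in the TREC web layout.
        String raw = "<DOC>\n<DOCNO>GX000-00-0000000</DOCNO>\n<DOCHDR>\nheader\n</DOCHDR>\nbody one\n</DOC>\n"
                   + "<DOC>\n<DOCNO>GX000-00-0000001</DOCNO>\nbody two\n</DOC>\n";
        for (String record : splitRecords(raw)) {
            System.out.println(extractDocno(record));
        }
    }
}
```

A production reader would also need to handle records that straddle input split boundaries, which this sketch ignores.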
 
