java.lang.Object
  org.apache.hadoop.conf.Configured
    edu.umd.cloud9.collection.clue.DemoCountClueWarcRecords
public class DemoCountClueWarcRecords extends Configured implements Tool
Simple demo program to count the number of records in the ClueWeb09 collection, from either the original source WARC files or repacked SequenceFiles (controlled by the first command-line parameter). The program also verifies the docid to docno mappings.
The program takes four command-line arguments, as shown in this sample invocation:
hadoop jar cloud9.jar edu.umd.cloud9.collection.clue.DemoCountClueWarcRecords \
  original /umd/collections/ClueWeb09 1 /umd/collections/ClueWeb09/docno-mapping.dat
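The argument handling described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the tool's actual source: the class `ClueArgs` and its field names are hypothetical, the literal `repacked` for the first argument is assumed from the description (only `original` appears in the sample), and the third argument is treated as an integer only because the sample value `1` suggests it.

```java
/** Hypothetical holder for the four command-line arguments; not part of Cloud9. */
final class ClueArgs {
  final boolean repacked;    // false when reading the original source WARC files
  final String basePath;     // second argument: path to the collection
  final int number;          // third argument (assumed to be an integer, as in the sample)
  final String mappingFile;  // fourth argument: docno mapping data file

  private ClueArgs(boolean repacked, String basePath, int number, String mappingFile) {
    this.repacked = repacked;
    this.basePath = basePath;
    this.number = number;
    this.mappingFile = mappingFile;
  }

  /** Validates the argument count and the first argument, then packages the values. */
  static ClueArgs parse(String[] args) {
    if (args.length != 4) {
      throw new IllegalArgumentException("expected 4 arguments, got " + args.length);
    }

    boolean repacked;
    if (args[0].equals("original")) {
      repacked = false;
    } else if (args[0].equals("repacked")) {  // assumed literal, see lead-in
      repacked = true;
    } else {
      throw new IllegalArgumentException("first argument must be original or repacked");
    }

    int number;
    try {
      number = Integer.parseInt(args[2]);
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException("third argument must be an integer: " + args[2], e);
    }

    return new ClueArgs(repacked, args[1], number, args[3]);
  }

  public static void main(String[] args) {
    // Same values as the sample invocation above.
    ClueArgs parsed = parse(new String[] {
        "original",
        "/umd/collections/ClueWeb09",
        "1",
        "/umd/collections/ClueWeb09/docno-mapping.dat"});
    System.out.println("path=" + parsed.basePath + " repacked=" + parsed.repacked);
  }
}
```

Rejecting a wrong argument count or an unknown first argument before any job is configured keeps a mistyped invocation from launching work on the cluster.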
| Constructor Summary | |
|---|---|
| DemoCountClueWarcRecords() | Creates an instance of this tool. |
| Method Summary | |
|---|---|
| static void | main(String[] args): Dispatches command-line arguments to the tool via the ToolRunner. |
| int | run(String[] args): Runs this tool. |
| Methods inherited from class org.apache.hadoop.conf.Configured |
|---|
| getConf, setConf |

| Methods inherited from class java.lang.Object |
|---|
| equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |

| Methods inherited from interface org.apache.hadoop.conf.Configurable |
|---|
| getConf, setConf |
| Constructor Detail |
|---|
public DemoCountClueWarcRecords()
| Method Detail |
|---|
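public int run(String[] args) throws Exception
Runs this tool.
Specified by: run in interface Tool
Throws: Exception

public static void main(String[] args) throws Exception
Dispatches command-line arguments to the tool via the ToolRunner.
Throws: Exception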