java.lang.Object
  org.apache.hadoop.conf.Configured
    edu.umd.cloud9.collection.clue.RepackClueWarcRecords
public class RepackClueWarcRecords
extends Configured
implements Tool
Program to uncompress the ClueWeb09 collection from the original distribution
WARC files and repack as SequenceFiles.
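The program takes five command-line arguments, in this order: the base path of the ClueWeb09 distribution, the output path, the segment number, the docno mapping data file, and the compression type.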
Here's a sample invocation:
hadoop jar cloud9.jar edu.umd.cloud9.collection.clue.RepackClueWarcRecords \
  /umd/collections/ClueWeb09 /umd/collections/ClueWeb09.repacked.block/en.01 1 \
  /umd/collections/ClueWeb09.repacked.block/docno-mapping.dat block
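The positional arguments in the sample invocation can be read and checked before any job is configured. The following is a minimal sketch under stated assumptions, not the tool's actual parsing code: `RepackArgs` and its field names are hypothetical, and the accepted compression names (`block`, `record`, `none`) are assumed to mirror Hadoop's `SequenceFile.CompressionType`; only `block` appears in the sample.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Cloud9: holds the five positional
// arguments in the order they appear in the sample invocation.
final class RepackArgs {
    // Assumption: names mirror SequenceFile.CompressionType (BLOCK, RECORD, NONE).
    static final List<String> COMPRESSION_TYPES = Arrays.asList("block", "record", "none");

    final String basePath;        // base path of the ClueWeb09 distribution
    final String outputPath;      // where the repacked SequenceFiles go
    final int segment;            // segment number
    final String docnoMapping;    // docno mapping data file
    final String compressionType; // SequenceFile compression type

    private RepackArgs(String basePath, String outputPath, int segment,
                       String docnoMapping, String compressionType) {
        this.basePath = basePath;
        this.outputPath = outputPath;
        this.segment = segment;
        this.docnoMapping = docnoMapping;
        this.compressionType = compressionType;
    }

    // Validates the argument count, the numeric segment, and the compression name.
    static RepackArgs parse(String[] args) {
        if (args.length != 5) {
            throw new IllegalArgumentException("expected 5 arguments, got " + args.length);
        }
        int segment;
        try {
            segment = Integer.parseInt(args[2]);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("segment number must be an integer: " + args[2], e);
        }
        if (!COMPRESSION_TYPES.contains(args[4])) {
            throw new IllegalArgumentException("unknown compression type: " + args[4]);
        }
        return new RepackArgs(args[0], args[1], segment, args[3], args[4]);
    }

    public static void main(String[] argv) {
        RepackArgs a = parse(new String[] {
            "/umd/collections/ClueWeb09",
            "/umd/collections/ClueWeb09.repacked.block/en.01",
            "1",
            "/umd/collections/ClueWeb09.repacked.block/docno-mapping.dat",
            "block"
        });
        System.out.println("segment " + a.segment + " -> " + a.outputPath);
    }
}
```

Failing fast on a malformed argument list avoids submitting a Hadoop job that would only fail after it has been scheduled.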
| Constructor Summary | |
|---|---|
| RepackClueWarcRecords() | Creates an instance of this tool. |
| Method Summary | |
|---|---|
| static void | main(String[] args): Dispatches command-line arguments to the tool via the ToolRunner. |
| int | run(String[] args): Runs this tool. |
| Methods inherited from class org.apache.hadoop.conf.Configured |
|---|
| getConf, setConf |

| Methods inherited from class java.lang.Object |
|---|
| equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |

| Methods inherited from interface org.apache.hadoop.conf.Configurable |
|---|
| getConf, setConf |
| Constructor Detail |
|---|
public RepackClueWarcRecords()
| Method Detail |
|---|
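public int run(String[] args)
throws Exception
Runs this tool.
Specified by: run in interface Tool
Throws: Exception
public static void main(String[] args)
throws Exception
Dispatches command-line arguments to the tool via the ToolRunner.
Throws: Exception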
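The relationship between `main` and `run` described above, where `main` hands the arguments to the tool and the integer returned by `run` becomes the process status, can be sketched without Hadoop on the classpath. Everything here is an illustrative assumption: `MiniTool` stands in for `org.apache.hadoop.util.Tool`, `dispatch` stands in for `ToolRunner.run` (which additionally parses generic Hadoop options, omitted here), and the `0` / `-1` return values are a convention chosen for the sketch, not confirmed behavior of `RepackClueWarcRecords`.

```java
// Stand-in for org.apache.hadoop.util.Tool; only run(String[]) is modeled.
interface MiniTool {
    int run(String[] args) throws Exception;
}

final class DispatchSketch {
    // Stand-in for ToolRunner.run: forwards the arguments and returns the tool's status.
    static int dispatch(MiniTool tool, String[] args) throws Exception {
        return tool.run(args);
    }

    // Hypothetical tool body: -1 on a usage error, 0 otherwise.
    static final class RepackSketch implements MiniTool {
        @Override
        public int run(String[] args) {
            if (args.length != 5) {
                System.err.println("usage: <base-path> <output-path> <segment> "
                        + "<docno-mapping-file> <compression-type>");
                return -1;
            }
            // A real implementation would configure and submit the job here.
            return 0;
        }
    }

    public static void main(String[] args) throws Exception {
        // The status returned by run becomes the exit code of the JVM.
        System.exit(dispatch(new RepackSketch(), args));
    }
}
```

Keeping the job logic in `run` and leaving `main` as a thin dispatcher lets the same tool be invoked from the command line or from other code that wants the status code rather than a process exit.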