edu.umd.cloud9.collection.wikipedia
Class LookupWikipediaArticle
java.lang.Object
org.apache.hadoop.conf.Configured
edu.umd.cloud9.collection.wikipedia.LookupWikipediaArticle
- All Implemented Interfaces:
- Configurable, Tool
public class LookupWikipediaArticle
- extends Configured
- implements Tool
Tool for providing command-line access to page titles given either a docno or
a docid. This does not run as a MapReduce job.
Here's a sample invocation:
hadoop jar cloud9all.jar edu.umd.cloud9.collection.wikipedia.LookupWikipediaArticle \
/user/jimmy/Wikipedia/compressed.block/findex-en-20101011.dat \
/user/jimmy/Wikipedia/docno-en-20101011.dat
Note that you'll have to build a jar that also contains the contents of
bliki-core-3.0.15.jar and commons-lang-2.5.jar: -libjars won't work for
this program because it is not a MapReduce job.
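One way to assemble such a combined jar is to copy the entries of each input jar into a single output archive. The sketch below does this with only java.util.zip. The class name JarMerger and its merge method are hypothetical helpers for illustration and are not part of Cloud9; the jar tool or a build plugin can achieve the same result. It assumes that when two jars contain an entry with the same name (for example META-INF/MANIFEST.MF), keeping the first copy is acceptable.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of Cloud9: merges several jars into one.
public class JarMerger {

  /**
   * Copies every entry of each input jar into the output jar.
   * If an entry name occurs more than once, only the first copy is kept.
   * Returns the number of entries written.
   */
  public static int merge(Path output, List<Path> inputs) throws IOException {
    Set<String> seen = new HashSet<>();
    byte[] buffer = new byte[8192];
    try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(output))) {
      for (Path input : inputs) {
        try (InputStream raw = Files.newInputStream(input);
             ZipInputStream in = new ZipInputStream(raw)) {
          ZipEntry entry;
          while ((entry = in.getNextEntry()) != null) {
            if (!seen.add(entry.getName())) {
              continue; // duplicate name, e.g. a second META-INF/MANIFEST.MF
            }
            out.putNextEntry(new ZipEntry(entry.getName()));
            int n;
            // read() returns -1 at the end of the current entry, not the archive
            while ((n = in.read(buffer)) != -1) {
              out.write(buffer, 0, n);
            }
            out.closeEntry();
          }
        }
      }
    }
    return seen.size();
  }

  public static void main(String[] args) throws IOException {
    if (args.length < 2) {
      System.err.println("usage: JarMerger <output.jar> <input.jar>...");
      System.exit(1);
    }
    List<Path> inputs = new ArrayList<>();
    for (int i = 1; i < args.length; i++) {
      inputs.add(Paths.get(args[i]));
    }
    int written = merge(Paths.get(args[0]), inputs);
    System.out.println("wrote " + written + " entries to " + args[0]);
  }
}
```

A possible invocation, where combined.jar and your-classes.jar are placeholder names:

java JarMerger combined.jar your-classes.jar bliki-core-3.0.15.jar commons-lang-2.5.jar

If any input jar is signed, its signature files under META-INF would need to be left out, which this sketch does not do.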
- Author:
- Jimmy Lin
run
public int run(String[] args)
throws Exception
- Specified by:
run in interface Tool
- Throws:
Exception
main
public static void main(String[] args)
throws Exception
- Throws:
Exception