edu.umd.cloud9.collection.wikipedia
Class LookupWikipediaArticle

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by edu.umd.cloud9.collection.wikipedia.LookupWikipediaArticle
All Implemented Interfaces:
Configurable, Tool

public class LookupWikipediaArticle
extends Configured
implements Tool

Command-line tool for looking up a page title given either a docno or a docid. It does not run as a MapReduce job.

Here's a sample invocation:

 hadoop jar cloud9all.jar edu.umd.cloud9.collection.wikipedia.LookupWikipediaArticle \
   /user/jimmy/Wikipedia/compressed.block/findex-en-20101011.dat \
   /user/jimmy/Wikipedia/docno-en-20101011.dat
 

Note that you'll have to build a jar that contains the contents of bliki-core-3.0.15.jar and commons-lang-2.5.jar: because this program is not a MapReduce job, -libjars won't work for it.
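
One way to produce such a combined jar is sketched below, using only java.util.jar from the standard library. This is an illustration, not how Cloud9 itself builds cloud9all.jar: the class name MergeJars and its merge method are hypothetical, and the sketch assumes Java 9 or later (for InputStream.transferTo). It does not carry over manifests, which should not matter when the main class is named on the hadoop jar command line as in the invocation above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

// Hypothetical helper for illustration; it is not part of Cloud9.
public class MergeJars {

    // Copies the entries of each input jar into one output jar and returns the
    // number of entries written. When a name occurs more than once, the first wins.
    public static int merge(Path output, List<Path> inputs) throws IOException {
        Set<String> seen = new HashSet<>();
        int copied = 0;
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(output))) {
            for (Path input : inputs) {
                try (JarInputStream in = new JarInputStream(Files.newInputStream(input))) {
                    JarEntry entry;
                    // JarInputStream consumes a leading META-INF/MANIFEST.MF itself,
                    // so manifests are not carried over into the merged jar.
                    while ((entry = in.getNextJarEntry()) != null) {
                        String name = entry.getName();
                        // Skip signature files and names that were already written.
                        if (isSignatureFile(name) || !seen.add(name)) {
                            continue;
                        }
                        out.putNextEntry(new JarEntry(name));
                        in.transferTo(out); // reads to the end of the current entry only
                        out.closeEntry();
                        copied++;
                    }
                }
            }
        }
        return copied;
    }

    // Signatures of the original jars would no longer be valid in the merged jar.
    private static boolean isSignatureFile(String name) {
        String upper = name.toUpperCase();
        return upper.startsWith("META-INF/")
            && (upper.endsWith(".SF") || upper.endsWith(".RSA") || upper.endsWith(".DSA"));
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: MergeJars <output.jar> <input.jar>...");
            System.exit(1);
        }
        List<Path> inputs = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            inputs.add(Paths.get(args[i]));
        }
        System.out.println("entries written: " + merge(Paths.get(args[0]), inputs));
    }
}
```

Assuming the Cloud9 classes themselves are in a jar named cloud9.jar (an assumed name), the combined jar could then be built with something like: java MergeJars cloud9all.jar cloud9.jar bliki-core-3.0.15.jar commons-lang-2.5.jar. Listing the Cloud9 jar first means its entries win if a name collides.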

Author:
Jimmy Lin

Method Summary
static void main(String[] args)
int run(String[] args)
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
 

Method Detail

run

public int run(String[] args)
        throws Exception
Specified by:
run in interface Tool
Throws:
Exception

main

public static void main(String[] args)
                 throws Exception
Throws:
Exception