edu.umd.cloud9.example.cooccur
Class ComputeCooccurrenceMatrixStripes

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by edu.umd.cloud9.example.cooccur.ComputeCooccurrenceMatrixStripes
All Implemented Interfaces:
Configurable, Tool

public class ComputeCooccurrenceMatrixStripes
extends Configured
implements Tool

Implementation of the "stripes" algorithm for computing co-occurrence matrices from a large text collection. The algorithm is described in Chapter 3 of "Data-Intensive Text Processing with MapReduce" by Lin & Dyer, and in the following paper:

Jimmy Lin. Scalable Language Processing Algorithms for the Masses: A Case Study in Computing Word Co-occurrence Matrices with MapReduce. Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing (EMNLP 2008), pages 419-428.
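
The following is a minimal in-memory sketch of the stripes idea, not the code of this class. It uses plain Java collections instead of Hadoop mappers and reducers. The class name StripesSketch, its method names, the per-document aggregation, and the choice to skip the term's own position inside the window are all assumptions made for illustration. The "map" step builds, for each term, a stripe (an associative array from neighboring term to count); the "reduce" step sums stripes that share a key element-wise.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the stripes approach; not part of Cloud9.
public class StripesSketch {

    // "Map" step: for one tokenized document, build a stripe per term.
    // A stripe maps each neighbor within `window` positions to a count.
    static Map<String, Map<String, Integer>> mapDocument(List<String> terms, int window) {
        Map<String, Map<String, Integer>> stripes = new HashMap<>();
        for (int i = 0; i < terms.size(); i++) {
            Map<String, Integer> stripe =
                stripes.computeIfAbsent(terms.get(i), k -> new HashMap<>());
            int lo = Math.max(0, i - window);
            int hi = Math.min(terms.size() - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (j == i) {
                    continue; // assumption: a position does not co-occur with itself
                }
                stripe.merge(terms.get(j), 1, Integer::sum);
            }
        }
        return stripes;
    }

    // "Reduce" step: element-wise sum of a partial result into the running total.
    static void mergeInto(Map<String, Map<String, Integer>> total,
                          Map<String, Map<String, Integer>> partial) {
        for (Map.Entry<String, Map<String, Integer>> e : partial.entrySet()) {
            Map<String, Integer> acc =
                total.computeIfAbsent(e.getKey(), k -> new HashMap<>());
            e.getValue().forEach((neighbor, count) -> acc.merge(neighbor, count, Integer::sum));
        }
    }

    // Runs both steps over a small collection held in memory.
    static Map<String, Map<String, Integer>> cooccurrences(List<List<String>> docs, int window) {
        Map<String, Map<String, Integer>> total = new HashMap<>();
        for (List<String> doc : docs) {
            mergeInto(total, mapDocument(doc, window));
        }
        return total;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("a", "b", "c"),
            Arrays.asList("a", "b"));
        // Each row of the printed map is one stripe of the co-occurrence matrix.
        System.out.println(cooccurrences(docs, 1));
    }
}
```

In this sketch each key of the result is one row of the co-occurrence matrix. In a MapReduce setting the same element-wise sum can also serve as a combiner, since addition over stripes is associative and commutative.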

This program takes the following command-line arguments:

Author:
Jimmy Lin

Constructor Summary
ComputeCooccurrenceMatrixStripes()
          Creates an instance of this tool.
 
Method Summary
static void main(String[] args)
          Dispatches command-line arguments to the tool via the ToolRunner.
int run(String[] args)
          Runs this tool.
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
 

Constructor Detail

ComputeCooccurrenceMatrixStripes

public ComputeCooccurrenceMatrixStripes()
Creates an instance of this tool.

Method Detail

run

public int run(String[] args)
        throws Exception
Runs this tool.

Specified by:
run in interface Tool
Throws:
Exception

main

public static void main(String[] args)
                 throws Exception
Dispatches command-line arguments to the tool via the ToolRunner.

Throws:
Exception