edu.umd.cloud9.collection.wikipedia
Class WikipediaPageInputFormat.WikipediaPageRecordReader

java.lang.Object
  extended by edu.umd.cloud9.collection.wikipedia.WikipediaPageInputFormat.WikipediaPageRecordReader
All Implemented Interfaces:
RecordReader<LongWritable,WikipediaPage>
Enclosing class:
WikipediaPageInputFormat

public static class WikipediaPageInputFormat.WikipediaPageRecordReader
extends Object
implements RecordReader<LongWritable,WikipediaPage>

Hadoop RecordReader for reading Wikipedia pages from the XML dumps.
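
Wikipedia XML dumps wrap each article in a <page>...</page> element, so a reader of this kind has to locate those element boundaries in the input. The sketch below illustrates that idea on an in-memory string. It is an assumption-laden illustration, not this class's implementation: the class name PageSpanSketch and the method extractPages are hypothetical, and a real reader works on a byte stream bounded by a file split rather than on a String.

```java
import java.util.ArrayList;
import java.util.List;

public class PageSpanSketch {
    // Element delimiters used by Wikipedia XML dumps.
    static final String START = "<page>";
    static final String END = "</page>";

    /**
     * Hypothetical helper: returns every complete <page>...</page> span
     * found in the given XML text, in document order.
     */
    public static List<String> extractPages(String xml) {
        List<String> pages = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = xml.indexOf(START, from);
            if (start < 0) {
                break; // no further page element
            }
            int end = xml.indexOf(END, start + START.length());
            if (end < 0) {
                break; // trailing page is incomplete, so skip it
            }
            pages.add(xml.substring(start, end + END.length()));
            from = end + END.length();
        }
        return pages;
    }
}
```

Calling PageSpanSketch.extractPages on a dump fragment returns one string per complete page element, which is roughly the unit a WikipediaPage value would be built from.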


Constructor Summary
WikipediaPageInputFormat.WikipediaPageRecordReader(FileSplit split, JobConf conf)
          Creates a WikipediaPageRecordReader.
 
Method Summary
 void close()
          Closes this RecordReader.
 LongWritable createKey()
          Creates an object for the key.
 WikipediaPage createValue()
          Creates an object for the value.
 long getPos()
          Returns the current position in the input.
 float getProgress()
          Returns progress on how much input has been consumed.
 boolean next(LongWritable key, WikipediaPage value)
          Reads the next key-value pair.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

WikipediaPageInputFormat.WikipediaPageRecordReader

public WikipediaPageInputFormat.WikipediaPageRecordReader(FileSplit split,
                                                          JobConf conf)
                                                   throws IOException
Creates a WikipediaPageRecordReader.

Throws:
IOException
Method Detail

next

public boolean next(LongWritable key,
                    WikipediaPage value)
             throws IOException
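Reads the next key-value pair into the supplied key and value objects. Returns true if a pair was read, or false if the end of the input has been reached.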

Specified by:
next in interface RecordReader<LongWritable,WikipediaPage>
Throws:
IOException
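
The methods on this page follow the usual calling pattern for this style of reader: create one key and one value object, pass them to next repeatedly until it returns false, then close the reader. The sketch below shows that loop against a stand-in interface so that it runs without Hadoop on the classpath. SimpleRecordReader, ListReader, LongHolder, TextHolder and countRecords are all hypothetical names introduced for illustration; only the six method shapes are taken from this page.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

public class ReadLoopSketch {
    /** Hypothetical stand-in that mirrors the six methods documented on this page. */
    public interface SimpleRecordReader<K, V> {
        boolean next(K key, V value) throws IOException;
        K createKey();
        V createValue();
        long getPos() throws IOException;
        float getProgress() throws IOException;
        void close() throws IOException;
    }

    /** Mutable holders standing in for reusable objects such as LongWritable and WikipediaPage. */
    public static final class LongHolder { public long value; }
    public static final class TextHolder { public String text = ""; }

    /** Toy reader over an in-memory list; the key is the record index. */
    public static final class ListReader implements SimpleRecordReader<LongHolder, TextHolder> {
        private final List<String> records;
        private int index = 0;
        private boolean closed = false;

        public ListReader(List<String> records) { this.records = records; }

        public boolean next(LongHolder key, TextHolder value) {
            if (index >= records.size()) {
                return false; // end of input: key and value are left untouched
            }
            key.value = index;
            value.text = records.get(index);
            index++;
            return true;
        }

        public LongHolder createKey() { return new LongHolder(); }
        public TextHolder createValue() { return new TextHolder(); }
        public long getPos() { return index; }
        public float getProgress() {
            return records.isEmpty() ? 1.0f : (float) index / records.size();
        }
        public void close() { closed = true; }
        public boolean isClosed() { return closed; }
    }

    /** Drains the reader, reusing one key and one value object, and always closes it. */
    public static <K, V> int countRecords(SimpleRecordReader<K, V> reader) {
        try {
            K key = reader.createKey();
            V value = reader.createValue();
            int count = 0;
            try {
                while (reader.next(key, value)) {
                    count++; // key and value now hold the current record
                }
            } finally {
                reader.close();
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the same key and value objects are overwritten on every call to next, a caller that needs to keep a record beyond the current iteration should copy it first.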

createKey

public LongWritable createKey()
Creates an object for the key.

Specified by:
createKey in interface RecordReader<LongWritable,WikipediaPage>

createValue

public WikipediaPage createValue()
Creates an object for the value.

Specified by:
createValue in interface RecordReader<LongWritable,WikipediaPage>

getPos

public long getPos()
            throws IOException
Returns the current position in the input.

Specified by:
getPos in interface RecordReader<LongWritable,WikipediaPage>
Throws:
IOException

close

public void close()
           throws IOException
Closes this RecordReader.

Specified by:
close in interface RecordReader<LongWritable,WikipediaPage>
Throws:
IOException

getProgress

public float getProgress()
                  throws IOException
Returns the fraction of the input consumed so far, as a number between 0.0 and 1.0.

Specified by:
getProgress in interface RecordReader<LongWritable,WikipediaPage>
Throws:
IOException
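
This page does not say how the progress value is derived. A common approach for readers that work on a byte range of a file is to divide the bytes consumed by the length of the range. The sketch below shows that calculation under that assumption; ProgressSketch and its start, end and pos parameters are hypothetical and are not taken from this class.

```java
public class ProgressSketch {
    /**
     * Hypothetical helper: fraction of the byte range [start, end) consumed
     * when the reader is at position pos, clamped to the range 0.0 to 1.0.
     */
    public static float progress(long start, long end, long pos) {
        if (end <= start) {
            return 0.0f; // empty range: avoid dividing by zero
        }
        float fraction = (pos - start) / (float) (end - start);
        return Math.max(0.0f, Math.min(1.0f, fraction));
    }
}
```

Clamping matters because a reader may run past the nominal end of its range to finish the record it has already started.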