datanode.BlockAlreadyExistsException

by Jesse Hires

I tried asking this on the nutch-user alias but got very little traction, so I thought I'd ask the developers. I realize this is most likely a configuration problem on my end, but I am very new to Nutch and am having a difficult time understanding where I need to look.

Does anyone have any insight into the following error I am seeing in the Hadoop logs? Is this something I should be concerned about, or is it expected to show up in the logs from time to time? If it is not expected, where can I look for more information on what is going on?

2009-10-16 17:02:43,061 ERROR datanode.DataNode - DatanodeRegistration(192.168.1.7:50010, storageID=DS-1226842861-192.168.1.7-50010-1254609174303, infoPort=50075, ipcPort=50020):DataXceiver
org.apache.hadoop.hdfs.server.datanode.BlockAlreadyExistsException: Block blk_909837363833332565_3277 is valid, and cannot be written to.
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.writeToBlock(FSDataset.java:975)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:97)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:259)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
    at java.lang.Thread.run(Thread.java:636)
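
To see whether the error keeps hitting the same block or many different ones, the datanode log can be tallied by block id. The following is only a minimal sketch, assuming the exception line always has the form shown in the trace above; the function name count_block_collisions is hypothetical and not part of Hadoop or Nutch.

```python
import re
from collections import Counter

# Matches the exception message as it appears in the trace above.
# Block ids are treated as possibly negative, hence the optional minus sign.
BLOCK_RE = re.compile(r"BlockAlreadyExistsException: Block (blk_-?\d+_\d+)")


def count_block_collisions(lines):
    """Return a Counter mapping block id -> number of times the exception was logged."""
    counts = Counter()
    for line in lines:
        match = BLOCK_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

It could be called on an open log file, for example count_block_collisions(open(path)), where path points at whichever datanode log is being inspected.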


I am able to reproduce this just by injecting the URLs (two of them). It shows up on both datanodes and happens whenever I run an operation that uses DFS.

I am running the latest sources from the trunk.
I've verified that only one instance of each of the following is running on the datanodes:
org.apache.hadoop.hdfs.server.datanode.DataNode
org.apache.hadoop.mapred.TaskTracker

I've also verified that only one instance of each of the following is running on the name node:
org.apache.hadoop.hdfs.server.namenode.NameNode
org.apache.hadoop.mapred.JobTracker
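
A check like this can be scripted. The sketch below is not necessarily how the check above was done; it assumes the JDK's jps tool is on the PATH and that "jps -l" prints one "pid fully.qualified.ClassName" pair per line. The helper names count_daemons and running_daemons are hypothetical.

```python
import subprocess
from collections import Counter


def count_daemons(jps_output, class_names):
    """Count how many JVMs in `jps -l` style output run each of the given main classes."""
    counts = Counter()
    for line in jps_output.splitlines():
        parts = line.split(None, 1)  # "<pid> <main class>"
        if len(parts) == 2 and parts[1].strip() in class_names:
            counts[parts[1].strip()] += 1
    # Report every requested class, including those with zero instances.
    return {name: counts[name] for name in class_names}


def running_daemons(class_names):
    """Run `jps -l` on the local machine and count the requested daemons."""
    out = subprocess.run(
        ["jps", "-l"], capture_output=True, text=True, check=True
    ).stdout
    return count_daemons(out, class_names)
```

On a datanode, running_daemons(["org.apache.hadoop.hdfs.server.datanode.DataNode", "org.apache.hadoop.mapred.TaskTracker"]) should report a count of 1 for each class if only one instance of each is running.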


The hardware is as follows:
Two data nodes, configured identically: Atom 330 processor, 2 GB RAM, 320 GB SATA 3.0 hard drive, Fedora Core 10.
One name node: an AMD x86 processor, 2 GB RAM, 750 GB SATA drive, Fedora Core 10 (pieced together from spare parts).
All connected over a 100 Mb network.
Admittedly this is low-end hardware, but I am doing this specifically as an exercise in using low-power (as in electricity) hardware.

I can also provide config files if needed.

Jesse

int GetRandomNumber()
{
   return 4; // Chosen by fair roll of dice
                // Guaranteed to be random
} // xkcd.com
