Some Ramblings
Implementing our senior project on Pastry/Scribe.
If using pre-encoded data:
Use Scribe to get a list of nodes with free disk space, then send the encoded data to those nodes, addressed by their ids, through Pastry. If a node has failed, a nearby node will receive the data instead. That node may be able to save the data itself, or it can find a new location for it by anycasting. If it can save the data, we pay no extra bandwidth cost to find a new location; otherwise the number of packets goes up.
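To make the failure case concrete, here is a rough sketch of who ends up holding a block. It is not FreePastry code: the tiny 16-bit id space, the function names, and the `free_space` table are all made up for illustration, and the "hand off to the nearest node with room" step is only a stand-in for a real Scribe anycast.

```python
ID_BITS = 16              # toy id space for illustration; real Pastry ids are 128-bit
ID_SPACE = 1 << ID_BITS


def ring_distance(a, b):
    """Distance between two ids on the circular id space."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)


def closest_live_node(key, live_nodes):
    """The live node whose id is numerically closest to key (ties go to the lower id)."""
    return min(live_nodes, key=lambda n: (ring_distance(n, key), n))


def deliver(key, live_nodes, free_space, size):
    """Decide who stores a block addressed to key.

    Returns (holder, extra_hops). extra_hops is 0 when the first receiver
    can store the block and 1 when it has to pass the block on.
    """
    receiver = closest_live_node(key, live_nodes)
    if free_space.get(receiver, 0) >= size:
        return receiver, 0
    # Receiver has no room: pass the block to the nearest node that does.
    # (Hypothetical stand-in for anycasting on a "has free space" topic.)
    candidates = [n for n in live_nodes if free_space.get(n, 0) >= size]
    if not candidates:
        raise RuntimeError("no live node has room for this block")
    holder = min(candidates, key=lambda n: (ring_distance(n, receiver), n))
    return holder, 1
```

The `extra_hops` value is the bandwidth cost mentioned above: zero when the fallback node can keep the data, nonzero when it has to go looking for another home.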
If encoding data on the network: get the list of all available nodes on the multicast network, then create a table of who should compute which parity blocks and the data dependencies for those parity blocks. There will be global topics for all 0 blocks, all 1 blocks, etc., up to our max # of blocks in a file/subfile. The nodes should already be listening on these topics; they will receive the necessary blocks there and compute the requested parity data. When done, a node should send a NOK if there was some error; depending on the error and how severe it is, there may be some consequences.
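A small sketch of what that table could look like. Everything here is an assumption for illustration: the round-robin assignment, the function names, and especially the use of plain XOR as the parity function (the real code may well be something else, such as Reed-Solomon).

```python
from functools import reduce


def build_parity_table(nodes, parity_deps):
    """Assign each parity block to a node, round-robin.

    parity_deps maps a parity block name to the data block indices it needs.
    Returns {parity_name: (node, data_block_indices)}.
    """
    ordered = sorted(nodes)
    return {
        parity: (ordered[i % len(ordered)], deps)
        for i, (parity, deps) in enumerate(sorted(parity_deps.items()))
    }


def topics_for(node, table):
    """Block-index topics a node has to receive to compute its parity blocks."""
    needed = set()
    for owner, deps in table.values():
        if owner == node:
            needed.update(deps)
    return sorted(needed)


def xor_parity(blocks):
    """XOR equal-length data blocks together (assumed parity function)."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("need at least one block, all of the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))
```

For example, with nodes `n1` and `n2` and three parity blocks, `n1` gets the first and third and `n2` gets the second, and `topics_for` tells each node which "all N blocks" topics it actually needs data from.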
For sending data to the host we just use Pastry to route the data to the id of the host which requested the data on the data request channel.
I wonder how security will work. I imagine that when we get the save request we could save the public key for all the users, though that would not support sharing files.
Solution: make the metadata controller sign requests for data. Then the metadata controller can handle access control, and data requests are only valid with the metadata controller’s signature.
How would multiple metadata controllers work in this system without having all nodes know which controllers are valid? One way is to have a master MDC that is the CA for all nodes and metadata controllers (think Kerberos). If that node goes down we are OK; just no new nodes or metadata controllers can join the network.
One problem would be keeping all the certs in sync. Alternatively, make it so any MDC-issued cert is valid as long as the parent cert is valid. This would allow MDCs to issue certs and work independently of one another. MDCs may still require some shared storage or database to keep all this data in. This will probably work, though it may make verification slightly harder.
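The "valid as long as the parent cert is valid" rule is basically a walk up the issuer chain. Here is a toy version of that walk, with no real cryptography in it: the `Cert` record, the `revoked` flag, and the `master-mdc` name are invented for the example, and a real check would also verify each signature along the way.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: Optional[str]   # None marks the master MDC's self-issued root cert
    revoked: bool = False


def is_valid(subject, certs, root="master-mdc", max_depth=8):
    """Follow issuer links from subject up to the root.

    The cert is accepted only if every cert on the way up exists,
    is not revoked, and the chain ends at the expected root.
    max_depth guards against issuer loops.
    """
    current = subject
    for _ in range(max_depth):
        cert = certs.get(current)
        if cert is None or cert.revoked:
            return False
        if cert.issuer is None:
            return current == root
        current = cert.issuer
    return False
```

This is where the "slightly harder" verification comes from: checking a node now means checking its MDC's cert too, and revoking one MDC's cert invalidates everything that MDC issued.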
Also, it is important to note that computing the parity on the box saves bandwidth, but it may be harder to get the file off the machine as fast, so there may be issues when a computer leaves the network before a file has finished saving, though this is something that all file systems must deal with. One possible fix is to store a local copy, like journaling, so that if there is an error we can recover from it the next time we connect to the network.
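A minimal sketch of the journaling idea, assuming a small JSON file on the local disk is good enough as the journal. The class name, the file layout, and the `begin`/`commit`/`pending` methods are all hypothetical; the point is just that an entry is written before the save starts and removed only once the network has the whole file.

```python
import json
import os


class SaveJournal:
    """Tracks saves that have started but not yet finished on the network."""

    def __init__(self, path):
        self.path = path

    def _load(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path) as f:
            return json.load(f)

    def _store(self, entries):
        # Write to a temp file and rename, so a crash never leaves a half-written journal.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(entries, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)

    def begin(self, file_id, local_copy):
        """Record that file_id is being saved and where its local copy lives."""
        entries = self._load()
        entries[file_id] = local_copy
        self._store(entries)

    def commit(self, file_id):
        """The network has the whole file; the journal entry can go."""
        entries = self._load()
        entries.pop(file_id, None)
        self._store(entries)

    def pending(self):
        """Saves to retry the next time we connect to the network."""
        return self._load()
```

On reconnect, anything still in `pending()` is a save that was interrupted and can be restarted from the local copy.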

