LeraoSoft - Knowledge Distributed

Archive for the 'Network' Category

Future for Mimir: C

Background:

Mimir is a distributed file system that we created for our Senior Design project.  It used FreePastry, which is written in Java, and overall it was fairly successful.  Files could be saved, retrieved and deleted, there was user authentication, and it worked OK in a stable environment.  Its main weakness was speed, both in the encoding and in how it interacted with the network stack.  The encoding problem had a lot to do with Java performing extra casting for the XOR operations that were key to our erasure code.  The network problems were mostly due to the code not being fully asynchronous and to how we were using the network.  The network proved to be a large problem, particularly when saving a file, though also when retrieving one.  After first trying to use the overlay network directly for saving a file, we resorted to polling the network for the nodes and then sending the data through a direct TCP socket to each individual node.  This was an issue because we had to (1) open a lot of connections and (2) loop over them to save the data.
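For readers who have not met XOR-based erasure codes, here is a minimal sketch of the idea in Python.  It is not Mimir's actual code, which this post does not describe; it assumes the simplest possible scheme, a single parity block, and the names xor_blocks, make_parity and recover_missing are made up for the illustration.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(blocks):
    """Return one parity block: the XOR of every data block.

    Assumption for this sketch: a single-parity code, which can
    repair exactly one lost block.
    """
    parity = bytes(len(blocks[0]))  # all-zero block of the right size
    for block in blocks:
        parity = xor_blocks(parity, block)
    return parity


def recover_missing(surviving, parity):
    """Rebuild the one missing data block from the survivors and the parity."""
    missing = parity
    for block in surviving:
        missing = xor_blocks(missing, block)
    return missing


if __name__ == "__main__":
    data = [b"abcd", b"efgh", b"ijkl"]
    parity = make_parity(data)
    # Pretend the middle block was lost and rebuild it.
    print(recover_missing([data[0], data[2]], parity))
```

The inner loop is nothing but XOR over bytes, which is why the per-operation overhead of the language matters so much for this kind of code.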

Update:

Since then I have come up with a way to distribute the file evenly over all of the nodes without incurring a bottleneck at the top of a tree.  To do this, when a node wants to send a file out it picks a random ID in the ID space and also selects its complement (i.e. if the ID space is a circle, the ID 180 degrees from it).  The originating node then sends half of the blocks of data to node A, the node at the first ID, and half to node B, the node at the complement.  Nodes A and B each split their half of the circle into halves, and each half then repeats this until a node picks itself (i.e. it is the only node in its remaining ID range).

This has some benefits.  First, it means that the originating node has to do fewer lookups to save the file and does not need to know anything about the state of the network.  Second, as long as the node IDs are evenly distributed in the space, the file will be evenly distributed.

On the downside, this means that the speed of saving the file is determined by the speed of the links from the originating node to nodes A and B.  However, our design case is for LANs, so this should not be an issue, and we should be able to better saturate the network.

Note: There is no reason to limit the tree to be binary; the circle could be split into any number of pie pieces.
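The recursive split described above can be sketched in a few lines of Python.  This is only a single-process simulation under some assumptions of mine: node IDs are distinct integers on a ring of ring_size positions, each recursion step (which in the real system would run on the receiving node) is just a function call, and pie pieces that contain no nodes are skipped so that no blocks are lost.  The names nodes_in_arc, distribute and save_file are hypothetical.

```python
import random


def nodes_in_arc(node_ids, start, length, ring_size):
    """Return the node IDs inside the arc [start, start + length) on the ring."""
    return [n for n in node_ids if (n - start) % ring_size < length]


def distribute(blocks, node_ids, start, length, ring_size, fanout=2, placement=None):
    """Recursively spread `blocks` over the nodes in one arc of the ring.

    Returns a dict mapping node ID -> list of blocks stored there.
    """
    if placement is None:
        placement = {}
    members = nodes_in_arc(node_ids, start, length, ring_size)
    if not blocks or not members:
        return placement
    if len(members) == 1:
        # Only one node left in this ID range: it keeps the blocks.
        placement.setdefault(members[0], []).extend(blocks)
        return placement

    # Cut the arc into `fanout` pie pieces (the last one may be shorter)
    # and keep only the pieces that contain at least one node.
    step = -(-length // fanout)  # ceiling division
    pieces = []
    offset = 0
    while offset < length:
        sub_len = min(step, length - offset)
        sub_start = (start + offset) % ring_size
        if nodes_in_arc(members, sub_start, sub_len, ring_size):
            pieces.append((sub_start, sub_len))
        offset += sub_len

    # Deal the blocks out round-robin, one share per non-empty piece.
    for i, (sub_start, sub_len) in enumerate(pieces):
        share = blocks[i::len(pieces)]
        distribute(share, members, sub_start, sub_len, ring_size, fanout, placement)
    return placement


def save_file(blocks, node_ids, ring_size, fanout=2, rng=random):
    """Pick a random starting ID, then split the whole ring from there."""
    start = rng.randrange(ring_size)
    return distribute(blocks, node_ids, start, ring_size, ring_size, fanout)


if __name__ == "__main__":
    nodes = list(range(0, 64, 8))  # eight evenly spaced node IDs
    for node, stored in sorted(save_file(list(range(16)), nodes, 64).items()):
        print(node, stored)
```

With evenly spaced node IDs every node ends up with the same number of blocks, whatever the random starting ID is.  Passing fanout=3 or more gives the non-binary version from the note.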

To help implement this scheme and gain even greater speed, we are planning to use Chimera, which is the successor to Tapestry and is written in C.  This will allow us to perform the XOR operations using various processor optimizations, as well as assembly if needed, without the pain of the Java-to-C bridge.  In addition, this should make our code much smaller!

Winter Design Document

The latest design document can be found HERE, or a copy can be found here: Winter Design Document

Rev 100 Reached

Last night we began integrating the file encoding and decoding functions into the network and client code.  This pushed us past revision 100, and we have now written 2763 lines of code.  Within the next week we hope to test the code and perform the first file save on the network.  After we have done that, we will be working on ensuring configurability and security.

The State of the Code

So the programming has begun in full swing.  There is already a good start to the network clients for the Meta Data Controller, MimirClient (the “user” client) and the Storage Client.  A Subversion repository has also been set up and can be viewed on our Trac page at http://compsciguy.homeftp.org/trac/mimir.  I also want to take this time to thank Chris Deeterly for his suggestion to implement a task handler that works like a queue to process tasks.  The deadline is coming up soon (Feb 11th) and we are working at a fever pitch to get this first HUGE milestone done in time.
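To give an idea of what a queue-style task handler looks like, here is a minimal sketch in Python.  It is not the Mimir code (which is Java and is not shown in this post); the class name TaskHandler and its methods are invented for the illustration, and it assumes a single worker thread that runs tasks in the order they were submitted.

```python
import queue
import threading


class TaskHandler:
    """Run submitted tasks one at a time, in FIFO order, on a worker thread."""

    def __init__(self):
        self.errors = []  # exceptions raised by tasks, kept for inspection
        self._tasks = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, func, *args):
        """Queue a task; the caller returns immediately."""
        self._tasks.put((func, args))

    def _run(self):
        while True:
            item = self._tasks.get()
            if item is None:  # sentinel: stop the worker
                self._tasks.task_done()
                return
            func, args = item
            try:
                func(*args)
            except Exception as exc:  # one bad task must not kill the worker
                self.errors.append(exc)
            finally:
                self._tasks.task_done()

    def close(self):
        """Finish every queued task, then stop the worker thread."""
        self._tasks.put(None)
        self._worker.join()


if __name__ == "__main__":
    handler = TaskHandler()
    for name in ("save", "retrieve", "delete"):
        handler.submit(print, "handling", name)
    handler.close()
```

The point of the design is that network code can hand off work with submit and go straight back to listening, while the tasks are processed in order elsewhere.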

MPI POC for Mimir

For our final project in Parallel and Distributed Computing, the Mimir team is implementing a Proof of Concept of our parallel decoding algorithm, which we hope will show some speedup or bandwidth benefit over a sequential gather-and-decode algorithm.  To do this we will set up all nodes in a tree structure.  Each node then reads the encoded data from disk and sends it to its parent.  After this initial setup, each node starts a loop in which it takes any waiting data from the network and adds it to its stack of encoded data, then performs any possible partial decoding of the data, placing the results in the same stack, and finally sends useful data to its parent.  It repeats this process until it receives a done message from the root of the tree.  This message is sent when the root of the tree has the original file.
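The loop described above can be simulated in one process to see how the pieces fit together.  The sketch below is in Python rather than MPI, and it rests on several assumptions that are mine, not part of the project: source blocks are modelled as small integers, an encoded block is the XOR of a set of source blocks, partial decoding means XOR-ing already recovered source blocks out of the remaining encoded blocks, "useful data" means anything a node has not forwarded before, and the tree is binary with the parent of node i at (i - 1) // 2.  The names peel and gather are hypothetical.

```python
def peel(stack):
    """Partially decode `stack` in place.

    `stack` maps a frozenset of source indices to the XOR of those source
    blocks. Every recovered source block (a one-index entry) is XOR-ed out
    of the encoded blocks that still contain it.
    """
    changed = True
    while changed:
        changed = False
        singles = {next(iter(k)): v for k, v in stack.items() if len(k) == 1}
        for key in list(stack):
            known = key.intersection(singles)
            if len(key) > 1 and known:
                value = stack.pop(key)
                for i in known:
                    value ^= singles[i]
                rest = key - known
                if rest and rest not in stack:
                    stack[rest] = value
                changed = True


def gather(node_stacks, k):
    """Simulate the tree gather; node 0 is the root.

    Returns (source_blocks, rounds) once the root holds all k source blocks.
    """
    stacks = [dict(s) for s in node_stacks]
    sent = [set() for _ in stacks]  # what each node has already forwarded
    rounds = 0
    while True:
        rounds += 1
        inbox = [{} for _ in stacks]
        for i, stack in enumerate(stacks):
            peel(stack)
            if i == 0:
                continue  # the root has no parent
            for key, value in stack.items():
                if key not in sent[i]:
                    inbox[(i - 1) // 2][key] = value
                    sent[i].add(key)
        # Deliver this round's messages to the parents.
        for i, box in enumerate(inbox):
            for key, value in box.items():
                stacks[i].setdefault(key, value)
        peel(stacks[0])
        if all(frozenset([j]) in stacks[0] for j in range(k)):
            # This is the point where the root would send the done message.
            return [stacks[0][frozenset([j])] for j in range(k)], rounds
        if not any(inbox):
            raise RuntimeError("no new data in flight; file cannot be decoded")


if __name__ == "__main__":
    source = [5, 9, 12]  # three source blocks, modelled as integers
    stacks = [
        {},                                         # node 0 (root)
        {frozenset([0, 1]): source[0] ^ source[1]}, # node 1
        {frozenset([1, 2]): source[1] ^ source[2]}, # node 2
        {frozenset([0]): source[0]},                # node 3, child of node 1
    ]
    print(gather(stacks, k=3))
```

In the example run node 1 cannot decode anything on its own, but once its child's block arrives it recovers a source block and forwards that instead, which is the kind of in-tree work we hope will pay off.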
