LeraoSoft - Knowledge Distributed

Knowledge Distributed

Archive for the 'Research' Category

Future for Mimir: C

Background:

Mimir is a distributed file system that we created for our Senior Design.  It used FreePastry which is written in Java and overall was fairly successful.  Files could be saved, retrieved, deleted.  There was user authentication and overall it worked OK in a stable environment.  One of the issues that it had however was speed issues doing the encoding and interacting with the network stack.  The encoding issue had a lot to do with java performing extra casting for the XOR operations that were key to our erasure code.  The network issues were mostly due to not being fully asynchronous and how we were using the network.  The network proved to be a large problem particularly with saving a file though also with retrieving a file.  After first trying to use the overlay network directly for saving a file we resorted to polling the network for the nodes and then sending the data through a direct tcp socket to the individual connections.  This was an issue because we had to 1 open a lot of connections and 2 loop over them to save the data.

Update:

Since then i have come up with a way to distribute the file evenly over all of the nodes without incurring a bottleneck at the top of a tree.  To do this when a node wants to send a file out it picks a random ID in the ID space and also selects it’s compliment( ie if the id space is a circle the node 180 degrees from it.)  The node then sends half of the blocks of data to node A and half to node B.  node A and be split up their half of the circle into halves and each half then repeats this until they pick themselves for a given node(ie they are the only node in their remaining id range.

This has some benefits first it means that the originating node has to do fewer lookups to save the file and does not need to know anything about the sate of the network.  Secondly as long as the node id’s are evenly distributed in the space the file will be evenly distributed.

n the downside this means that the speed of saving the file is determened by the speed of the originating node to nodes A and B.  However our design case is for LANs so this should not be an issue and we should beable to better saturate the network.

Note: There is no reason to limit the tree to be binary it could be split into any number of pie pieces.

To help achieve this as well as even grater speed we are planing to use Chimera which is the successor to Tapestry and is written in C.  This will allow us to perform the XOR operations using various processor optimizations as well as assembly if needed without the pain of the Java C bridge.  In addition this should make our code much smaller!

No comments

Results and Graphs

Here are a few charts of our progress we have made.  In the first chart we show what we had at the beginning of our development in the second chart we show where we are now.  The y axis was how much data was used to rebuild and the x axis is the trial number.

Where we are now

As you can see in the first chart we have a lot of points above the 50% mark meaning that if we lost half the data we would not be able to rebuild.  In the second chart you can see there are none over that mark.  The points tend to be somewhat normally distributed meaning even in the second chart there is a chance that it would take more data to rebuild but we can say with certainty of 99.9998 (from experimental data) that we can rebuild with 50% data loss.

The changes made to the robust soliton distribution are extensions from ones done at Helsinki and at UC Santa Barbara in that we added at least one spike and changed the initial low degree from 1 to 3.  We hope to be publishing more on this later.

As for speed here are the results for 1mb with just local encoding/decoding times are in seconds and throughput is in MB/s.  Also Throughput is calculated by dividing the initial size of 1mb by the speed in both cases.  this means that for each case our network usage would be more since we are creating 4 times the original size in data.

 

  Encode Time Decode Time
Average 0.077298816 0.024007306
StDev 0.004464512 0.002501727
Total Time 77.298816 24.007306
MIN 0.068047 0.016083
MAX 0.124591 0.037237
AVE THROUGPUT 12.93680876 41.6539865

 

 

 

 

No comments

MPI POC for Mimir

The Mimir team for fulfillment of our final project for Parallel and Distributed Computing is implementing a Proof of Concept for our parallel decoding algorithm that we hope will show has some speedup or bandwidth benefits over a sequential gather and decode algorithm.  To do this we will setup all nodes in a tree structure.  Each node then reads the encoded data from disk and sends it to it’s parent.  After this initial setup each node then starts a loop where it takes any waiting data from the network and add’s it to it’s stack of encoded data, then performs any possible partial decoding of the data placing the results in the same stack, and finally sending useful data to it’s parent.   It repeats this process until it receives a done message from the root of the tree.  This message is sent when the root of the tree has the original file.

No comments

Design Meeting

Discussed detail of how our project will be implemented, specifically what java classes we will need and how we will use them.  We also discussed how we were going to disseminate in a reliable manner the file over the network.  We did come up with one solution to basically have a take one and pass it on solution but i am in charge of seeing how this would work and also looking for alternitive solutions. Chris will be working on creating a luby code distribution testing suite.  Our notes on the general overlay of our java classes are as follows.

  • Java Classpaths
    • lerao.mimir
      • client — In charge of interacting with user and initiating file transfers
        • ui — In charge of how users interact with the system
      • file — In charge of manipulating the file including encoding and decoding
        • header — holds the encoding header
        • data — holds encoded and decoded data
        • block — Contains file header,data and id

      • metadata — In charge of controlling all the data
        • ui — ability to configure and view the status of the network
      • crypto — handles all cryptographic functions
      • network — handles all shared network functions
      • storage — handles saving data from the network to disk and vice-versa
No comments

Read Research Papers

This is a list of research papers. 

  1. FatNemo: Building a Resilient Multi-source Multicast Fat-Tree
  2. Resilience in Overlay Multicast Protocols
  3. Resilient Peer-to-Perr Multicast withought the Cost
  4. Structured and unstructured overlays under the microscope 
  5. Parallel Computing On any Desktop ~ By Ami Marowka
  6. Depth-Latency Tradeoffs in Multicast Tree Algorithms
  7. Design of a DHT
  8. OpenDHT: A Public DHT Service and It’s Uses 
  9. Fixing the Embarrassing Slowness of OpenDHT on PlanetLab
  10. XScribe: a Stateless, Cross-Layer Approach to P2P Multicast in Multi-Hop Ad hoc Networks
  11. Rarest First and Choke Algorithms Are Enough
  12. Scalable Application Layer Multicast 
  13. Scribe: The design of a large-scale event notification infrastructure
  14. Robust and Efficient Data Management for a Distributed Hash Table
  15. A DHT-based Backup System
  16. Digital Fountain Codes
  17. Erasure Coding vs. Replication: A Quantitative Comparison
  18. The Google File System
  19. Hyper-codes: High-performance Low-complexity Error-correcting Codes
  20. Kademlia: A Peer-to-peer Information System Based on the XOR Metric
  21. Lecture Notes from COMS 4995: Introduction to Coding Theory
  22. LT Codes
  23. MapReduce: Simplified Data Processing on Large Clusters
  24. Pond: the OceanStore Prototype
  25. An Evaluation of Panasas at BNL
  26. Survivable Information Storage Systems (PASIS)
  27. Petal: Distributed Virtual Disks
  28. Summary of Raptor Codes
  29. Designing Tornado Codes as Hyper Codes for Improved Error Correcting Performance
  30. Typhoon: An Archival System for Tolerating High Degrees of File Server Failure
  31. LanStore: a highly distributed reliable file storage system
  32. The Zebra Striped Network File System

Zip Files of these documents are availbile in the above password protected post (due to copyright laws).   

No comments

Next Page »