Archive for the 'Research' Category
Future for Mimir: C
Background:
Mimir is a distributed file system that we created for our Senior Design. It used FreePastry which is written in Java and overall was fairly successful. Files could be saved, retrieved, deleted. There was user authentication and overall it worked OK in a stable environment. One of the issues that it had however was speed issues doing the encoding and interacting with the network stack. The encoding issue had a lot to do with java performing extra casting for the XOR operations that were key to our erasure code. The network issues were mostly due to not being fully asynchronous and how we were using the network. The network proved to be a large problem particularly with saving a file though also with retrieving a file. After first trying to use the overlay network directly for saving a file we resorted to polling the network for the nodes and then sending the data through a direct tcp socket to the individual connections. This was an issue because we had to 1 open a lot of connections and 2 loop over them to save the data.
Update:
Since then i have come up with a way to distribute the file evenly over all of the nodes without incurring a bottleneck at the top of a tree. To do this when a node wants to send a file out it picks a random ID in the ID space and also selects it’s compliment( ie if the id space is a circle the node 180 degrees from it.) The node then sends half of the blocks of data to node A and half to node B. node A and be split up their half of the circle into halves and each half then repeats this until they pick themselves for a given node(ie they are the only node in their remaining id range.
This has some benefits first it means that the originating node has to do fewer lookups to save the file and does not need to know anything about the sate of the network. Secondly as long as the node id’s are evenly distributed in the space the file will be evenly distributed.
n the downside this means that the speed of saving the file is determened by the speed of the originating node to nodes A and B. However our design case is for LANs so this should not be an issue and we should beable to better saturate the network.
Note: There is no reason to limit the tree to be binary it could be split into any number of pie pieces.
To help achieve this as well as even grater speed we are planing to use Chimera which is the successor to Tapestry and is written in C. This will allow us to perform the XOR operations using various processor optimizations as well as assembly if needed without the pain of the Java C bridge. In addition this should make our code much smaller!
No commentsResults and Graphs
Here are a few charts of our progress we have made. In the first chart we show what we had at the beginning of our development in the second chart we show where we are now. The y axis was how much data was used to rebuild and the x axis is the trial number.
As you can see in the first chart we have a lot of points above the 50% mark meaning that if we lost half the data we would not be able to rebuild. In the second chart you can see there are none over that mark. The points tend to be somewhat normally distributed meaning even in the second chart there is a chance that it would take more data to rebuild but we can say with certainty of 99.9998 (from experimental data) that we can rebuild with 50% data loss.
The changes made to the robust soliton distribution are extensions from ones done at Helsinki and at UC Santa Barbara in that we added at least one spike and changed the initial low degree from 1 to 3. We hope to be publishing more on this later.
As for speed here are the results for 1mb with just local encoding/decoding times are in seconds and throughput is in MB/s. Also Throughput is calculated by dividing the initial size of 1mb by the speed in both cases. this means that for each case our network usage would be more since we are creating 4 times the original size in data.
| Encode Time | Decode Time | |
| Average | 0.077298816 | 0.024007306 |
| StDev | 0.004464512 | 0.002501727 |
| Total Time | 77.298816 | 24.007306 |
| MIN | 0.068047 | 0.016083 |
| MAX | 0.124591 | 0.037237 |
| AVE THROUGPUT | 12.93680876 | 41.6539865 |
No comments
MPI POC for Mimir
The Mimir team for fulfillment of our final project for Parallel and Distributed Computing is implementing a Proof of Concept for our parallel decoding algorithm that we hope will show has some speedup or bandwidth benefits over a sequential gather and decode algorithm. To do this we will setup all nodes in a tree structure. Each node then reads the encoded data from disk and sends it to it’s parent. After this initial setup each node then starts a loop where it takes any waiting data from the network and add’s it to it’s stack of encoded data, then performs any possible partial decoding of the data placing the results in the same stack, and finally sending useful data to it’s parent. It repeats this process until it receives a done message from the root of the tree. This message is sent when the root of the tree has the original file.
No commentsDesign Meeting
Discussed detail of how our project will be implemented, specifically what java classes we will need and how we will use them. We also discussed how we were going to disseminate in a reliable manner the file over the network. We did come up with one solution to basically have a take one and pass it on solution but i am in charge of seeing how this would work and also looking for alternitive solutions. Chris will be working on creating a luby code distribution testing suite. Our notes on the general overlay of our java classes are as follows.
- Java Classpaths
- lerao.mimir
- client — In charge of interacting with user and initiating file transfers
- ui — In charge of how users interact with the system
- file — In charge of manipulating the file including encoding and decoding
- header — holds the encoding header
- data — holds encoded and decoded data
- block — Contains file header,data and id
- metadata — In charge of controlling all the data
- ui — ability to configure and view the status of the network
- crypto — handles all cryptographic functions
- network — handles all shared network functions
- storage — handles saving data from the network to disk and vice-versa
- client — In charge of interacting with user and initiating file transfers
- lerao.mimir
Read Research Papers
This is a list of research papers.
- FatNemo: Building a Resilient Multi-source Multicast Fat-Tree
- Resilience in Overlay Multicast Protocols
- Resilient Peer-to-Perr Multicast withought the Cost
- Structured and unstructured overlays under the microscope
- Parallel Computing On any Desktop ~ By Ami Marowka
- Depth-Latency Tradeoffs in Multicast Tree Algorithms
- Design of a DHT
- OpenDHT: A Public DHT Service and It’s Uses
- Fixing the Embarrassing Slowness of OpenDHT on PlanetLab
- XScribe: a Stateless, Cross-Layer Approach to P2P Multicast in Multi-Hop Ad hoc Networks
- Rarest First and Choke Algorithms Are Enough
- Scalable Application Layer Multicast
- Scribe: The design of a large-scale event notification infrastructure
- Robust and Efficient Data Management for a Distributed Hash Table
- A DHT-based Backup System
- Digital Fountain Codes
- Erasure Coding vs. Replication: A Quantitative Comparison
- The Google File System
- Hyper-codes: High-performance Low-complexity Error-correcting Codes
- Kademlia: A Peer-to-peer Information System Based on the XOR Metric
- Lecture Notes from COMS 4995: Introduction to Coding Theory
- LT Codes
- MapReduce: Simplified Data Processing on Large Clusters
- Pond: the OceanStore Prototype
- An Evaluation of Panasas at BNL
- Survivable Information Storage Systems (PASIS)
- Petal: Distributed Virtual Disks
- Summary of Raptor Codes
- Designing Tornado Codes as Hyper Codes for Improved Error Correcting Performance
- Typhoon: An Archival System for Tolerating High Degrees of File Server Failure
- LanStore: a highly distributed reliable file storage system
- The Zebra Striped Network File System
Zip Files of these documents are availbile in the above password protected post (due to copyright laws).
No comments




