Vuze Forums

Full Version: Sync Plugin
I'd like to ask if any developers think it is possible to develop a plugin to do syncing.

Basically, as described in the document linked below, it seems Vuze/Azureus could be adapted to do sync relatively easily, considering that it has all the necessary underlying mechanisms and a pretty powerful and extensible architecture.

If sync is torrent-based, then implementing it in Vuze would not seem to be much of a problem. Is that right in your opinion?

Or, if you think a plugin would not work, would not be efficient or reliable enough, or would have trouble assuring data integrity across the nodes, could you please outline the issues as you see them, or point us to where we can find some useful information?

Thanks in advance.

Here is the description, from that document, of how a torrent client can be extended to do sync:

Synthetic synchronized torrent approach
http://preciseinfo.org/BTSync/BTSync_Not...t_approach

(Note: any valuable feedback, and especially any detailed analysis of the issues, will be included in the Sync Detailed Manual, one of the most authoritative sync documents online.)

 

 
What is the difference between torrents and sync?

Basically, the main difference between the sync and torrent approaches is that torrents are static while sync is dynamic. That means the information in a sync collection/share can be updated without changing the collection key/folder.

With torrents, you cannot change, update, modify or extend a collection. The problem with this approach is that information is usually dynamic in nature. That is why torrents get outdated in most cases, and one then has to get the new version of the same basic collection by downloading a completely different torrent, even though, in most cases, most of the information in the torrent has not changed.

With the sync approach, once you get the key to the share/collection, you no longer have to worry about whether you have the latest information: you are always in sync. Secondly, updates usually come in small increments, changing just a few files, and that can be a significant improvement in terms of bandwidth, update time and so on. Typically, you can update an existing collection in seconds.
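To make the incremental update point concrete, here is a minimal Python sketch of how a node might work out which files it actually needs to fetch. It is not taken from BTSync or Vuze: the names build_manifest and plan_update, and the idea of a manifest that maps each path to a SHA-256 hash, are assumptions made purely for illustration.

```python
import hashlib


def build_manifest(files: dict) -> dict:
    """Map each path to the SHA-256 hex digest of its content (hypothetical manifest format)."""
    return {path: hashlib.sha256(content).hexdigest() for path, content in files.items()}


def plan_update(local: dict, remote: dict) -> tuple:
    """Compare two manifests and return (paths to fetch, paths to delete).

    A path is fetched when it is new or its hash differs from the local one.
    A path is deleted when it no longer exists in the remote manifest.
    """
    to_fetch = sorted(path for path, digest in remote.items() if local.get(path) != digest)
    to_delete = sorted(path for path in local if path not in remote)
    return to_fetch, to_delete


if __name__ == "__main__":
    old = build_manifest({"a.txt": b"1", "b.txt": b"2", "c.txt": b"3"})
    new = build_manifest({"a.txt": b"1", "b.txt": b"2 changed", "d.txt": b"4"})
    fetch, delete = plan_update(old, new)
    print("fetch:", fetch)
    print("delete:", delete)
```

In this sketch the unchanged file a.txt is never transferred again, which is the bandwidth saving described above. With a new torrent, the client would see a completely different collection unless it was pointed at the old data by hand.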

But what is most interesting is that you can use the underlying torrent mechanisms to do the sync. In fact, that is what BTSync has done. The transfer engine, node/peer discovery and the bulk of the code are pretty much the same. You just need to add the higher-level logic engine and a few other mechanisms that are not complex in any fundamental way.

So, it seems quite natural to do sync using the existing torrent code base and merely extend the torrent client functionality. It would be interesting to see any ideas, criticism or suggestions about such an approach.
Well, what we are dealing with here is the fact that, with the sync approach available, torrents as such are outdated.

About the only difference between the torrent approach and sync would be the reliability and data consistency issue. At first glance, it looks like torrents are inherently more reliable and guarantee data integrity. Once you have a torrent file and there are seeders, you are guaranteed to get the correct data. Yes, true, but only at first glance.

If you do not have any seeders with 100% of the data, your data is no longer guaranteed. Furthermore, since torrents are usually downloaded as random pieces of the files, you are likely to end up with lots of incomplete files containing holes.

So, torrents are reliable ONLY if there are 100% seeders online while you are trying to download them.

Question: what is less reliable in the sync approach compared to the case where you do not have 100% seeders online in the torrent approach?

Well, there is no inherent drop in reliability with sync. If you have 100% seeders with the sync approach, you are also guaranteed to get the data, as reliably as with torrents. And if you don't have 100% seeders online, then you will get a partial share/collection, just like with torrents.

But...

The main advantage of sync compared to torrents is that with a single share key you get access to "the latest and greatest" version of a collection, and you don't need to keep downloading the whole thing again for each newer version, duplicating some or most of the data in the collection.

Furthermore, sync usually downloads files sequentially, so there are no holes in a file other than the tail that has not been downloaded yet. That means you end up with only one incomplete file, instead of a bunch of damaged files as with torrents, where files stay incomplete until you have downloaded the entire collection.
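As a rough illustration of the "holes" argument, the sketch below counts how many files are left partially downloaded for a given set of received pieces. It is a simplified model, not how any real client tracks state: it assumes every piece belongs to exactly one file (real torrent pieces can span file boundaries), and the function name incomplete_files is made up for this example.

```python
def incomplete_files(pieces_per_file: list, downloaded: set) -> int:
    """Count files that have some, but not all, of their pieces.

    pieces_per_file: number of pieces in each file, in collection order.
    downloaded: set of global piece indices already received.
    Assumption: pieces never span two files.
    """
    partial = 0
    start = 0
    for count in pieces_per_file:
        have = sum(1 for index in range(start, start + count) if index in downloaded)
        if 0 < have < count:
            partial += 1
        start += count
    return partial


if __name__ == "__main__":
    layout = [4, 4, 4]                   # three files, four pieces each
    sequential = {0, 1, 2, 3, 4, 5}      # first six pieces, in order
    scattered = {0, 2, 4, 6, 8, 10}      # six pieces spread across the collection
    print("sequential:", incomplete_files(layout, sequential))
    print("scattered:", incomplete_files(layout, scattered))
```

With the same six pieces received, the sequential order leaves one complete file and one partial file, while the scattered order leaves all three files partial.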

Otherwise, there is basically no difference between torrents and sync. Speed is not affected with sync. Integrity is not affected as long as you have at least one master (r/w) node online, because with sync your integrity is verified by comparing file hashes against the master node, which plays the same role as a seeder node in the torrent approach.

Furthermore, no r/o node can introduce a damaged or incorrect version of a file, because in a properly designed sync architecture the only files that propagate are those guaranteed to be EXACT copies of the files in the master collection.
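The rule that only exact copies propagate could look something like the following Python sketch. This is an assumption about how such a check might be written, not BTSync's actual mechanism: the ReadOnlyNode class, its receive method and the SHA-256 manifest are all hypothetical, and the sketch simply assumes the manifest itself arrived intact from the master node.

```python
import hashlib


class ReadOnlyNode:
    """Hypothetical r/o node that re-shares only files matching the master's hashes."""

    def __init__(self, master_manifest: dict):
        # path -> SHA-256 hex digest, as published by the master (r/w) node.
        # Assumption: the manifest has already been received intact.
        self.master_manifest = dict(master_manifest)
        self.shareable = {}

    def receive(self, path: str, data: bytes) -> bool:
        """Accept a file from a peer only if it is an exact copy of the master's version."""
        expected = self.master_manifest.get(path)
        actual = hashlib.sha256(data).hexdigest()
        if expected is None or actual != expected:
            return False  # unknown or altered file: never stored, never re-shared
        self.shareable[path] = data
        return True


if __name__ == "__main__":
    master = {"doc.txt": hashlib.sha256(b"hello").hexdigest()}
    node = ReadOnlyNode(master)
    print(node.receive("doc.txt", b"hello"))   # exact copy of the master file
    print(node.receive("doc.txt", b"hellO"))   # altered copy
    print(sorted(node.shareable))
```

The altered copy is rejected before it can be offered to other peers, which is the same job piece hashes do in a torrent.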

So, the question arises: what are the advantages of the torrent approach compared to sync, where a single key gives you access to the guaranteed "latest version" of the files?

It seems that, for general-purpose information distribution at least, the sync approach eliminates the major shortcoming of torrents, namely getting outdated, as any information eventually does.

Makes sense?