[RFE] Detect (and sanitize) files that are too long
#1
Hi,

For not well understood reasons, some of my completed torrents now contain valid files to which (invalid) data has been appended. At some point Vuze has singled out some of the cases, showing an error about the file size being too large but doing little else; in the rest of the cases, forcing a re-check on torrents with corrupt files has not caused the problem to be detected by Vuze.

First of all, it would be nice if forcing a re-check on a torrent detected such cases. Perhaps there is a reason why it cannot or should not be done; if not so, it would be great if such corrupt files could be singled out.

Second, it would be nice to be able to deal with larger than expected files from within Vuze when they are encountered. I know of no other solution than to delete and then redownload the problematic file. However, in cases where simply truncating the file would fix the problem it seems a little wasteful. I can imagine that there might be UI or coding challenges, or that perhaps a more general and exhaustive integrity check and correction solution is more desirable (and perhaps even planned); but if it were possible to offer truncation to the expected file size as a dumb but simple fix it would be welcome.

Cheers.
#2
(05-14-2017, 11:14 AM)detelosk Wrote: Second, it would be nice to be able to deal with larger than expected files from within Vuze when they are encountered. I know of no other solution than to delete and then redownload the problematic file. However, in cases where simply truncating the file would fix the problem it seems a little wasteful. I can imagine that there might be UI or coding challenges, or that perhaps a more general and exhaustive integrity check and correction solution is more desirable (and perhaps even planned); but if it were possible to offer truncation to the expected file size as a dumb but simple fix it would be welcome.

 

I believe the feature you're asking for already exists, and can be turned on by checking the "Truncate existing files that are too large" checkbox in the Files pane of the Options. (It's visible when the options Mode is set to Intermediate level or higher.)

Vuze shouldn't often encounter this problem, though I've seen it occur myself. (In my case it's usually when I've done some sort of manual download-directory fiddling outside of Vuze, or attempted to be too clever for my own good when it comes to multiple torrents containing the same data, and I've ended up confusing Vuze — it's not typically a problem it creates on its own. Though I'm not in any way denying that it could, nor am I saying that the issue must be your fault.) Still, if you need any proof that you're not alone in this, the very existence of that checkbox should be ample evidence.

You may want to also look at the other options regarding file-creation in the Files pane, most especially the ones like "Enable incremental file creation" and "Append data to files as downloaded and reorder pieces as the download progresses". Those should be disabled unless you absolutely need them, as they complicate Vuze's management of downloaded files.

ETA: If I were to hazard a guess as to why this option isn't turned on by default, it's probably to protect valid files from being destroyed by Vuze if they happen to be put in the wrong place. For instance, if I have a 4.3GB DVD image Ubuntu_Zippy_Zoroastrian.iso and accidentally rename it to Ubuntu_Zippy_Zoroastrian.nfo (which is a 3KB text file), having Vuze blindly truncate the file instead of complaining about the size mismatch is probably something that's best reserved for users who opt in.
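
To make the opt-in idea concrete, here is a minimal sketch of "detect oversized files, truncate only when asked". It is not Vuze's actual code: the function names, the `expected_sizes` mapping (file path to the length recorded in the torrent), and the `allow_truncate` flag are all assumptions made for illustration.

```python
import os


def find_oversized(expected_sizes):
    """Return {path: (actual, expected)} for files larger than expected.

    expected_sizes is a hypothetical dict mapping each file path to the
    length in bytes that the torrent metadata says it should have.
    """
    oversized = {}
    for path, expected in expected_sizes.items():
        if os.path.isfile(path):
            actual = os.path.getsize(path)
            if actual > expected:
                oversized[path] = (actual, expected)
    return oversized


def truncate_oversized(expected_sizes, allow_truncate=False):
    """Truncate oversized files, but only when the caller opts in.

    Returns the list of paths that were actually truncated.
    """
    fixed = []
    for path, (_actual, expected) in find_oversized(expected_sizes).items():
        if allow_truncate:
            os.truncate(path, expected)  # drops everything past `expected`
            fixed.append(path)
    return fixed
```

With `allow_truncate` left at its default, the sketch only reports the mismatch, which mirrors the reasoning above: a misplaced 4.3GB image sitting where a 3KB file is expected would be flagged rather than cut down.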
#3
(06-21-2017, 02:58 PM)FeRDNYC Wrote:
(05-14-2017, 11:14 AM)detelosk Wrote: Second, it would be nice to be able to deal with larger than expected files from within Vuze when they are encountered. I know of no other solution than to delete and then redownload the problematic file. However, in cases where simply truncating the file would fix the problem it seems a little wasteful. I can imagine that there might be UI or coding challenges, or that perhaps a more general and exhaustive integrity check and correction solution is more desirable (and perhaps even planned); but if it were possible to offer truncation to the expected file size as a dumb but simple fix it would be welcome.

 

I believe the feature you're asking for already exists, and can be turned on by checking the "Truncate existing files that are too large" checkbox in the Files pane of the Options. (It's visible when the options Mode is set to Intermediate level or higher.)

You are right! Thanks for pointing this out (and I concur with your reasoning as to why it is not checked by default).

(06-21-2017, 02:58 PM)FeRDNYC Wrote: Vuze shouldn't often encounter this problem, though I've seen it occur myself. (In my case it's usually when I've done some sort of manual download-directory fiddling outside of Vuze, or attempted to be too clever for my own good when it comes to multiple torrents containing the same data, and I've ended up confusing Vuze — it's not typically a problem it creates on its own. Though I'm not in any way denying that it could, nor am I saying that the issue must be your fault.) Still, if you need any proof that you're not alone in this, the very existence of that checkbox should be ample evidence.

The problem happened while driving the system to its knees, so it is quite possible that it was Vuze which exposed a bug and not the other way around. However, I felt that a file integrity check should cover this case, and that's why I shared my issue here.

(06-21-2017, 02:58 PM)FeRDNYC Wrote: You may want to also look at the other options regarding file-creation in the Files pane, most especially the ones like "Enable incremental file creation" and "Append data to files as downloaded and reorder pieces as the download progresses". Those should be disabled unless you absolutely need them, as they complicate Vuze's management of downloaded files.

I believe that I recently reset Vuze's preferences, yet the "Append data to files as downloaded and reorder pieces as the download progresses" option is set. I will have to try and understand why.

Thanks for the help! Cheers!
#4
(05-14-2017, 11:14 AM)detelosk Wrote:
(06-21-2017, 02:58 PM)FeRDNYC Wrote: You may want to also look at the other options regarding file-creation in the Files pane, most especially the ones like "Enable incremental file creation" and "Append data to files as downloaded and reorder pieces as the download progresses". Those should be disabled unless you absolutely need them, as they complicate Vuze's management of downloaded files.

I believe that I recently reset Vuze's preferences, yet the "Append data to files as downloaded and reorder pieces as the download progresses" option is set. I will have to try and understand why.
 

Hmm. It sounds to me like that's probably the reason for the first issue you raised in your original post...
(07-23-2017, 11:09 AM)detelosk Wrote: For not well understood reasons, some of my completed torrents now contain valid files to which (invalid) data has been appended. At some point Vuze has singled out some of the cases, showing an error about the file size being too large but doing little else; in the rest of the cases, forcing a re-check on torrents with corrupt files has not caused the problem to be detected by Vuze.
 

But if they were completed torrents, then it still isn't a very satisfactory explanation, in my mind. Especially if Vuze wasn't able to detect the corrupted files. It should've been able to do that.

My only guess is that there may be some sort of bug in the file-checking logic that doesn't notice when the reordering part of "Append data... and reorder pieces" leaves pieces hanging off the end of files.

As you probably know, the base unit of measurement in which peers organize and exchange torrent data is the "piece". While the size of each piece varies from torrent to torrent, for a given torrent each piece (except the last) is the same size. When those pieces are mapped onto files of arbitrary size the boundaries don't match up, except occasionally by pure coincidence. So a single piece will start with the end of one file, and end with the beginning of another.
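
As a rough illustration of how pieces land on files, the sketch below treats the torrent as one long run of bytes (the files laid end to end) and reports which files a given piece overlaps. The function names and the example sizes are made up for illustration; this is not taken from Vuze.

```python
def file_ranges(file_lengths):
    """Return [(start, end)] byte offsets (end exclusive) of each file
    within the torrent's concatenated data."""
    ranges, offset = [], 0
    for length in file_lengths:
        ranges.append((offset, offset + length))
        offset += length
    return ranges


def files_in_piece(piece_index, piece_length, file_lengths):
    """Return the indices of the files that overlap the given piece."""
    total = sum(file_lengths)
    p_start = piece_index * piece_length
    p_end = min(p_start + piece_length, total)  # the last piece may be short
    return [
        i
        for i, (f_start, f_end) in enumerate(file_ranges(file_lengths))
        if f_start < p_end and f_end > p_start
    ]


# Three files of 100, 50 and 300 bytes with a (tiny) 64-byte piece size.
lengths = [100, 50, 300]
for piece in range(4):
    print(piece, files_in_piece(piece, 64, lengths))
```

In this example piece 1 covers bytes 64 to 127, so it holds the end of the first file and the start of the second, and piece 2 straddles the second and third files in the same way.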

That's why, when you download only a portion of the files in a large torrent, Vuze will create partial files for some of the ones you didn't select: Those files share a piece with one of the files that's marked for download, so Vuze has to receive the entire piece in order to get the part containing the file data you need. It has to save the rest of the piece for re-checking purposes, and to upload the data to peers that request it. Sometimes, if the torrent contains very small files, Vuze will download an entire file even though you set it "Do not download" — that file is smaller than the piece size, and it was contained in a piece needed for one of the enabled files.
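
The same arithmetic shows which unselected files end up on disk anyway. The sketch below is only an illustration under the same "files laid end to end" assumption; `touched_unselected_files` and its arguments are hypothetical names, not part of Vuze.

```python
def touched_unselected_files(selected, piece_length, file_lengths):
    """Return {file_index: "partial" or "complete"} for files that were not
    selected but still receive data because they share a piece with a
    selected file."""
    # Byte range [start, end) of each file in the concatenated torrent data.
    ranges, offset = [], 0
    for length in file_lengths:
        ranges.append((offset, offset + length))
        offset += length
    total = offset

    # Every piece overlapping a selected file has to be fetched in full.
    needed = set()
    for i in selected:
        start, end = ranges[i]
        if end > start:
            first = start // piece_length
            last = (end - 1) // piece_length
            needed.update(range(first, last + 1))

    result = {}
    for i, (start, end) in enumerate(ranges):
        if i in selected or end == start:
            continue
        # Count how many of this file's bytes fall inside the needed pieces.
        covered = 0
        for p in needed:
            p_start = p * piece_length
            p_end = min(p_start + piece_length, total)
            covered += max(0, min(end, p_end) - max(start, p_start))
        if covered == end - start:
            result[i] = "complete"
        elif covered > 0:
            result[i] = "partial"
    return result


# Files of 100, 20 and 300 bytes, 64-byte pieces, only the first file selected.
print(touched_unselected_files({0}, 64, [100, 20, 300]))
```

Here the 20-byte file sits entirely inside the second piece, so it is fetched whole despite not being selected, while the third file only receives the 8 bytes that share that piece.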

This is also why, when you add existing files to an incomplete torrent, Vuze can actually damage those files when it performs a re-check — if there are adjacent files missing, then Vuze has to consider any of the pieces shared between the existing files and the missing files to be invalid. To download those pieces it needs to clear space for them in the files on disk, and the partial-piece section of the existing file is wiped out in the process.

The point of this is, I suspect what you're seeing is sort of the opposite problem. The data appended to the end of your valid files is most likely the rest of the data for the last piece in the file, the one that's shared with the next adjacent file. That could explain why Vuze doesn't detect the extra data when re-checking — technically it's not extra data, it's the remainder of the data that's supposed to be in that piece of the torrent. The fact that it's accidentally saved in the wrong place / in two places... is something it probably should notice, and be able to either flag as a problem or even just automatically correct on its own. But it sounds like it's not currently doing that, at least in some situations.
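
One way to test this theory by hand is sketched below, purely as an illustration: the helper names, paths and sizes are assumptions, and nothing here reflects how Vuze itself checks files. The first function computes how many bytes of a file's last piece belong to the files after it; the second checks whether the surplus bytes at the end of an oversized file are identical to the start of the next file in the torrent.

```python
import os


def expected_overhang(file_index, piece_length, file_lengths):
    """Number of bytes in the file's last piece that belong to later files."""
    file_end = sum(file_lengths[: file_index + 1])
    remaining = sum(file_lengths) - file_end
    # Bytes from the end of this file up to the next piece boundary,
    # capped by how much torrent data actually follows the file.
    return min((-file_end) % piece_length, remaining)


def tail_matches_next_file(path, expected_size, next_path):
    """Return True if the bytes beyond expected_size in `path` are the same
    as the first bytes of `next_path`."""
    extra = os.path.getsize(path) - expected_size
    if extra <= 0:
        return False  # nothing was appended
    with open(path, "rb") as f:
        f.seek(expected_size)
        tail = f.read(extra)
    with open(next_path, "rb") as f:
        head = f.read(extra)
    return tail == head
```

If the surplus is no larger than `expected_overhang(...)` and `tail_matches_next_file(...)` returns True, that would be consistent with the border-piece explanation; a mismatch would point to some other cause.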

Assuming my theory is correct, of course, because that's all it is. Parg would be the one who'd know for sure whether I'm even on the right track; unfortunately, nobody seems to know when or if we'll be able to get his input on this. But it sounds possible to me that something, perhaps the "Append data... and reorder pieces" option, is causing a strange side-effect. That could particularly be the case (I'm guessing) if the option is turned on or off while active or incomplete torrents are still in the queue, and that ends up confusing Vuze's idea of where it's supposed to place the piece data for "border" pieces, and/or where it should expect to find that data when checking pieces.

I know for a fact something like that can happen with the "Add suffix to incomplete files" option, where Vuze can occasionally forget to rename files when they're completed, or can get confused and double-suffix incomplete files. (They're rare and weird corner cases, which I've mostly seen when adding external files to an incomplete torrent. Because when you have a complete file being added to an incomplete torrent, should you name it TheFile.mp4.part and let Vuze rename it? Or should it be TheFile.mp4 because it's already complete? I can never remember, and that's how I end up confusing Vuze. But mis-named files are easily corrected and don't affect the integrity of the file data itself, unlike the issue you're seeing.)

End of the day, if disabling "Append data... and reorder pieces" makes the problem go away and you don't see any more files with extraneous data tacked on the end, then problem solved, and now we have another reason to avoid that option.
#5
If I happen to reproduce the problem I will take into account your insights to try and get closer to the root of the issue. Thanks!
#6
Since Spigot sacked their main developer, these forums seem to rely on the goodwill of ONLY other users. Want the answer to your query? -> -> follow me -> ->

I suggest you UPGRADE to BiglyBT and enjoy what is essentially the same client with more features, without the bullshit and is in constant development by one of the original Azureus developers.  Plus, there are dedicated support chat channels built into the client and during the install process, you can easily migrate all of your Vuze/Azureus settings/statistics and torrents to the BiglyBT client.

I run over 3,000 torrents, so don't worry, the client will handle your torrents with ease and I had no problems with the migration of settings and torrents. It was all seamless. BBT looks the same as Vuze and it is legal because they (BiglyBT, Vuze, Bit-Tyrant, OneSwarm) are all based on Azureus under a GPL License.

Look for me in the General help chat for your answer.