Opened 5 years ago

Closed 5 years ago

#2204 closed bug (Fixed)

deluged hangs on shutdown with large number of torrents started

Reported by: miskoala
Owned by: bro
Priority: minor
Milestone: 1.3.6
Component: Core
Version: 1.3-stable (git)
Keywords:
Cc:

Description

I have ~500 completed torrents in deluge (for seeding).

When I start 5 torrents and the rest are paused, deluged shuts down cleanly.

When I start ~500 torrents and then try to shut down deluged, it hangs (tried with "Quit And Shutdown Daemon" in deluge-gtk and "kill PID" in a terminal). The process is still running but seems to be unresponsive:

[DEBUG   ] 02:26:40 alertmanager:123 save_resume_data_alert: <torrent name> resume data generated
[DEBUG   ] 02:26:40 torrentmanager:1042 on_alert_save_resume_data
[DEBUG   ] 02:26:40 alertmanager:123 save_resume_data_alert: <torrent name> resume data generated
[DEBUG   ] 02:26:40 torrentmanager:1042 on_alert_save_resume_data
(...)

[DEBUG   ] 02:26:45 alertmanager:123 tracker_announce_alert: <torrent name> (<tracker>) sending announce (stopped)
[DEBUG   ] 02:26:45 torrentmanager:970 on_alert_tracker_announce
(...)
[DEBUG   ] 02:27:33 alertmanager:123 tracker_error_alert: <torrent name> (<tracker>) (-1)  (2)
[DEBUG   ] 02:27:33 torrentmanager:990 on_alert_tracker_error
(...)
[DEBUG   ] 02:27:35 alertmanager:123 tracker_error_alert: <torrent name> (<tracker>) (-1)  (2)
[DEBUG   ] 02:27:35 torrentmanager:990 on_alert_tracker_error

The log doesn't say anything more, just similar messages multiplied by ~500 or by the number of trackers per torrent. After that, the deluged process keeps running in the background, but I can't connect to the daemon from deluge-gtk. Even after waiting 15 minutes nothing more appears in the logs, so I have to kill deluged with "kill -9 PID".

Change History (11)

comment:1 Changed 5 years ago by bro

I have the same issue on FreeBSD 9 (libtorrent 0.15.9_1) with 2000 torrents. Deluge doesn't really hang, but it seems to wait indefinitely for libtorrent to finish saving resume data, so it's probably a libtorrent issue.

The while loop in the stop function in torrentmanager.py never finishes.

Adding a debug print in the while loop

print "shutdown_torrent_pause_list:", len(self.shutdown_torrent_pause_list)

shows that the list stops shrinking at some point. The length at which it stops varies, but once it stops shrinking it stays there until the process is killed.

comment:2 Changed 5 years ago by Cas

  • Milestone changed from 1.3.x to performance

comment:3 Changed 5 years ago by bluebomber

  • Milestone changed from performance to 1.3.x

NOTE: Please excuse me if this isn't the correct way of saying that this affects me, too.

Deluge hangs for me indefinitely upon shutdown, whether that's invoked via CTRL+Q, file menu, or system tray menu. Moreover, after I forcefully kill the process, the next time deluge launches it has put a large portion (25-50% on average) of my torrents into "Checking resume data" mode.

When I quit, the window stays open, the system tray icon disappears, and sometimes the window goes dim. Running

$ deluge -L debug

reveals that when quitting, the process ends up in a loop, periodically printing lines like this:

[DEBUG ] 14:57:16 alertmanager:123 dht_reply_alert: TORRENTNAME () received DHT peers: 1

Ubuntu 12.04 64-bit, Deluge 1.3.5, libtorrent 0.15.10.0

comment:4 Changed 5 years ago by andar

My guess is that we may be hitting a limit with libtorrent's alert queue:

alert_queue_size is the maximum number of alerts queued up internally. If alerts are not popped, the queue will eventually fill up to this level. This defaults to 1000.

With sessions that have more than 1000 torrents, it's likely a lot of resume_data alerts are being discarded and then Deluge sits in an infinite loop waiting for those lost alerts.
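The failure mode described here can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the names `simulate_shutdown`, `pending`, and `queue` are hypothetical, every alert is assumed to be posted before any is popped, and only resume-data alerts are modelled. It is not Deluge or libtorrent code.

```python
from collections import deque


def simulate_shutdown(num_torrents, alert_queue_size):
    """Toy model: return (torrents still pending, alerts dropped).

    Assumption: the session posts one resume-data alert per torrent
    into a bounded queue, and alerts that do not fit are discarded.
    """
    pending = set(range(num_torrents))  # torrents we are still waiting on
    queue = deque()
    dropped = 0

    # Post one alert per torrent; anything beyond the limit is lost.
    for torrent_id in range(num_torrents):
        if len(queue) < alert_queue_size:
            queue.append(torrent_id)
        else:
            dropped += 1

    # Drain the queue and tick off each torrent whose alert arrived.
    while queue:
        pending.discard(queue.popleft())

    return len(pending), dropped


print(simulate_shutdown(2000, 1000))
print(simulate_shutdown(500, 1000))
```

In this model, any torrent whose alert was dropped stays in `pending` forever, which would match a wait loop whose list stops shrinking at some non-zero length.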

It seems the best short-term fix would be to inflate the alert_queue_size to something that would accommodate the session size. Maybe we could have the torrent_manager.add() method modify the alert_queue_size if the number of torrents in the session breaches the limit. I think we should probably strive to keep the alert_queue_size at around (number_torrents)*1.2 to ensure we have ample room for alerts other than the resume data ones.
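As a rough sketch of the sizing rule suggested above, the helper below computes a queue size of about 1.2 times the torrent count, never going below the default of 1000. The function name and the `floor` parameter are assumptions for illustration; integer arithmetic is used to avoid floating-point rounding.

```python
def suggested_alert_queue_size(num_torrents, headroom_percent=120, floor=1000):
    """Return roughly num_torrents * 1.2, rounded up, but at least `floor`.

    `headroom_percent=120` encodes the 1.2 factor; `floor=1000` is the
    default queue size quoted above. Both names are hypothetical.
    """
    scaled = (num_torrents * headroom_percent + 99) // 100  # ceiling division
    return max(floor, scaled)


for count in (500, 1000, 2000):
    print(count, suggested_alert_queue_size(count))
```

With these assumptions, a 2000-torrent session would get a limit of 2400, while sessions under about 830 torrents keep the default.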

Going forward, we really should clean up the shutdown code, probably using some sort of Deferred list to handle things, but that wouldn't solve the immediate issue.

Anyone with this issue want to give my theory a go? Just boost the session_settings.alert_queue_size manually for now and see if it solves the shutdown hang.

comment:5 Changed 5 years ago by andar

One more thought: we could probably just have the torrent_manager.shutdown() method modify the alert_queue_size before it initiates all the resume data saving. This would save the add() method from having to constantly modify the value.

comment:6 Changed 5 years ago by bro

  • Owner set to bro
  • Status changed from new to accepted

andar, apparently that was exactly the problem. For the first time in more than a year I could shut down deluged properly. I can finally sleep at night!

I'll put together a patch for this.

comment:7 Changed 5 years ago by andar

If you're going to go the route of setting the alert_queue_size in shutdown(), then I would suggest setting it really high, like MAXINT high, just to absolutely make sure we don't lose any alerts for people with very large sessions.
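A minimal sketch of that idea follows. It uses stand-in classes rather than libtorrent itself: `StubSession` and `StubSettings` are hypothetical and only mimic a get-settings, modify, set-settings pattern. The `INT_MAX` value assumes the setting is a 32-bit signed integer, which the ticket does not confirm, and the actual patch may differ.

```python
INT_MAX = 2**31 - 1  # assumption: the setting is a 32-bit signed int


class StubSettings(object):
    """Hypothetical stand-in for a session settings object."""

    def __init__(self):
        self.alert_queue_size = 1000  # default quoted in this ticket


class StubSession(object):
    """Hypothetical stand-in exposing settings() and set_settings()."""

    def __init__(self):
        self._settings = StubSettings()

    def settings(self):
        return self._settings

    def set_settings(self, settings):
        self._settings = settings


def raise_alert_queue_limit(session, limit=INT_MAX):
    """Raise the alert queue limit before resume-data saving starts."""
    settings = session.settings()
    settings.alert_queue_size = limit
    session.set_settings(settings)
    return session.settings().alert_queue_size


session = StubSession()
print(raise_alert_queue_limit(session))
```

Doing this once at the start of shutdown avoids recomputing the limit every time a torrent is added.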

comment:8 Changed 5 years ago by bluebomber

I "only" have 340 torrents, and I'm experiencing the problem.

comment:9 Changed 5 years ago by andar

Yeah, that could make sense if you consider that for every torrent in the session we will get a paused torrent alert, a save resume data alert, and likely at least one tracker alert during the shutdown phase. This means that we could be generating 3 alerts per torrent on top of all the other alerts still being generated by libtorrent.
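The arithmetic can be checked quickly. The sketch below assumes exactly 3 alerts per torrent and ignores all other alerts, so it gives a lower bound; the function and constant names are illustrative only.

```python
ALERTS_PER_TORRENT = 3           # paused + save resume data + one tracker alert
DEFAULT_ALERT_QUEUE_SIZE = 1000  # default quoted earlier in this ticket


def shutdown_alert_estimate(num_torrents, alerts_per_torrent=ALERTS_PER_TORRENT):
    """Lower-bound estimate of alerts generated during shutdown."""
    return num_torrents * alerts_per_torrent


for count in (340, 500, 2000):
    estimate = shutdown_alert_estimate(count)
    print(count, estimate, estimate > DEFAULT_ALERT_QUEUE_SIZE)
```

Under these assumptions, 340 torrents already produce at least 1020 alerts, which is just over the default limit of 1000.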

I'm starting to think we should also be boosting the default for this setting to something quite a bit higher than 1000 because there really isn't any circumstance where we want to miss any alerts.

comment:10 Changed 5 years ago by Cas

  • Milestone changed from 1.3.x to 1.3.6

comment:11 Changed 5 years ago by gazpachoking

  • Resolution set to fixed
  • Status changed from accepted to closed

Fixed in 9286d43ba8e
