Opened 12 years ago
Last modified 8 years ago
#2243 new bug
parse hostname
Reported by: | Alien_Huker | Owned by: | |
---|---|---|---|
Priority: | minor | Milestone: | 2.x |
Component: | Unknown | Version: | 1.3.5 |
Keywords: | Cc: |
Description
In file deluge/core/torrent.py, on line 620:
if parts[-2] in ("co", "com", "net", "org") or parts[-1] in ("uk")
uk is the country-code top-level domain for the United Kingdom. All country-code top-level domains are two letters long.
Change the code to
if parts[-2] in ("co", "com", "net", "org") or len(parts[-1]) == 2
This checks for all country-code top-level domains.
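To make the difference concrete, here is a minimal sketch comparing the two checks. The helper name `short_host` and the surrounding logic (keep the last three labels when the check is true, the last two otherwise) are assumptions for illustration, not a copy of torrent.py.

```python
def short_host(host, use_proposed_check):
    """Hypothetical helper: trim a hostname to its last two or three labels."""
    parts = host.split(".")
    if len(parts) <= 2:
        return host
    common_second_level = parts[-2] in ("co", "com", "net", "org")
    if use_proposed_check:
        # Proposed: treat any two-letter final label as a country code.
        keep_three = common_second_level or len(parts[-1]) == 2
    else:
        # Current: ("uk") is a plain string, not a tuple, so this is a
        # substring test against "uk".
        keep_three = common_second_level or parts[-1] in ("uk")
    return ".".join(parts[-3:] if keep_three else parts[-2:])


print(short_host("bt.0day.kiev.ua", use_proposed_check=False))
print(short_host("bt.0day.kiev.ua", use_proposed_check=True))
```

Under these assumptions the first call gives kiev.ua and the second gives 0day.kiev.ua. Note that the proposed check also keeps three labels for a host such as tracker.example.de, where two would be enough.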
Change History (7)
comment:1 by , 12 years ago
comment:2 by , 12 years ago
Anyway, the check is needed not only for 'uk'; other countries must be added to this list. Do you agree? For example, in my list the host "bt.0day.kiev.ua" is shown as "kiev.ua", but it should be "0day.kiev.ua".
comment:3 by , 12 years ago
Milestone: | Future → 1.3.6 |
---|
comment:4 by , 12 years ago
There is no good way of getting a domain name from the hostname. Period.
Browsers do this by checking against "effective tld names" (aka "public suffixes"). There's a project by Mozilla for that (direct link to the list).
There are numerous implementations using the publicsuffix.org's list (Python module: http://pypi.python.org/pypi/publicsuffix), but IMHO that would be overkill.
I suggest removing this feature altogether and always displaying the full hostname. There are no good alternatives short of checking the DNS (again, overkill).
Current code is buggy as hell (it would fail at the most basic DOMAIN.edu.au, not even going into DOMAIN.pvt.k12.ma.us territory).
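For reference, the public-suffix approach boils down to a longest-suffix match. The sketch below uses a tiny hand-picked sample set rather than the real list, and ignores the wildcard and exception rules the real list has; `SAMPLE_SUFFIXES` and `registrable_domain` are hypothetical names, not part of any library.

```python
# Illustrative sample only; the real list at publicsuffix.org is far larger
# and also has wildcard and exception rules that this sketch ignores.
SAMPLE_SUFFIXES = {
    "com", "uk", "co.uk", "au", "edu.au",
    "ua", "kiev.ua", "us", "pvt.k12.ma.us",
}


def registrable_domain(host, suffixes=SAMPLE_SUFFIXES):
    """Return the longest matching suffix plus one more label.

    Falls back to the unchanged host when no suffix matches.
    """
    labels = host.lower().rstrip(".").split(".")
    # Try candidate suffixes from longest to shortest.
    for i in range(1, len(labels)):
        if ".".join(labels[i:]) in suffixes:
            return ".".join(labels[i - 1:])
    return host


print(registrable_domain("www.DOMAIN.edu.au"))
print(registrable_domain("www.DOMAIN.pvt.k12.ma.us"))
```

With this sample set, the two hosts above come out as domain.edu.au and domain.pvt.k12.ma.us, which is what the two-or-three-label heuristic cannot do.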
comment:5 by , 12 years ago
After a short discussion on IRC, I suggest having a cache for parsed hostnames.
This is to avoid tracker1.domain.tld and tracker2.domain.tld being considered two separate trackers.
A one-time DNS query could determine the real domain of a hostname and then be cached for future use. This single expensive operation would not be overkill :)
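A rough sketch of what such a cache could look like. The class name `HostCache` and the `resolve` callback are hypothetical; how the DNS query would actually determine the domain was not settled here, so the resolver below is only a stand-in that keeps the last two labels.

```python
class HostCache:
    """Hypothetical cache mapping a full hostname to its resolved domain."""

    def __init__(self, resolve):
        self._resolve = resolve  # the expensive lookup, e.g. DNS-based
        self._cache = {}

    def domain_for(self, host):
        # Run the expensive lookup only the first time a host is seen.
        if host not in self._cache:
            self._cache[host] = self._resolve(host)
        return self._cache[host]


lookups = []


def stand_in_resolve(host):
    """Placeholder for the real lookup; records each call for the demo."""
    lookups.append(host)
    return ".".join(host.split(".")[-2:])


cache = HostCache(stand_in_resolve)
first = cache.domain_for("tracker1.domain.tld")
second = cache.domain_for("tracker2.domain.tld")
repeat = cache.domain_for("tracker1.domain.tld")  # served from the cache
```

Both trackers map to the same domain, and the repeated request for tracker1.domain.tld does not trigger a second lookup.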
comment:6 by , 12 years ago
Milestone: | 1.3.6 → 1.4.0 |
---|
I don't think this is the correct fix because it would then allow any two digits for the third part and also doesn't account for random digits for the second part.