Opened 11 years ago

Last modified 7 years ago

#2243 new bug

parse hostname

Reported by: Alien_Huker Owned by:
Priority: minor Milestone: 2.x
Component: Unknown Version: 1.3.5
Keywords: Cc:

Description

In file deluge/core/torrent.py, on line 620:

if parts[-2] in ("co", "com", "net", "org") or parts[-1] in ("uk")

"uk" is the country-code top-level domain of the United Kingdom. All country-code top-level domains are two letters long.

Change the code to:

if parts[-2] in ("co", "com", "net", "org") or len(parts[-1]) == 2

This checks for all country-code top-level domains.

Change History (7)

comment:1 Changed 11 years ago by Cas

I don't think this is the correct fix because it would then allow any two digits for the third part and also doesn't account for random digits for the second part.

comment:2 Changed 11 years ago by Alien_Huker

Either way, the check is needed for more than just 'uk'; other countries must be added to this list. Do you agree? For example, in my list the host "bt.0day.kiev.ua" is displayed as "kiev.ua", but it should be "0day.kiev.ua".
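
To make the disagreement in comments 1 and 2 concrete, here is a rough sketch comparing the quoted check with the proposed one. The function name shorten_host and the surrounding logic (keep the last three labels when the check matches, otherwise the last two) are assumptions made for illustration; this is not the actual code in torrent.py.

```python
def shorten_host(host, proposed=False):
    """Hypothetical helper: reduce a tracker hostname to a display name."""
    parts = host.split(".")
    if len(parts) <= 2:
        return host
    second_level = parts[-2] in ("co", "com", "net", "org")
    if proposed:
        # Proposed check: treat any two-letter last label as a country code.
        keep_three = second_level or len(parts[-1]) == 2
    else:
        # Quoted check. Note that ("uk") is a plain string, not a tuple,
        # so this is a substring test against "uk".
        keep_three = second_level or parts[-1] in ("uk")
    return ".".join(parts[-3:] if keep_three else parts[-2:])


print(shorten_host("bt.0day.kiev.ua"))                    # kiev.ua
print(shorten_host("bt.0day.kiev.ua", proposed=True))     # 0day.kiev.ua
print(shorten_host("tracker.example.de", proposed=True))  # tracker.example.de
```

The last call shows the downside of the two-letter rule: for a host whose domain is registered directly under a country code, the subdomain is kept as well.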

comment:3 Changed 11 years ago by Cas

  • Milestone changed from Future to 1.3.6

comment:4 Changed 11 years ago by Chionsas

There is no good way of getting a domain name from the hostname. Period.

Browsers do this by checking against "effective tld names" (aka "public suffixes"). There's a project by Mozilla for that (direct link to the list).

There are numerous implementations using the publicsuffix.org's list (Python module: http://pypi.python.org/pypi/publicsuffix), but IMHO that would be overkill.

I suggest removing this feature altogether and always displaying the full hostname. There are no good alternatives short of checking the DNS (again, overkill).

Current code is buggy as hell (it would fail at the most basic DOMAIN.edu.au, not even going into DOMAIN.pvt.k12.ma.us territory).
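
For reference, the public-suffix approach boils down to finding the longest known suffix and keeping one more label to its left. The sketch below assumes a tiny hand-picked suffix set (SUFFIXES) and a hypothetical function name (registered_domain); a real implementation would load the full list from publicsuffix.org and also handle its wildcard and exception rules, which are ignored here.

```python
# Illustrative subset only; NOT the real public suffix list.
SUFFIXES = {
    "com", "net", "org",
    "uk", "co.uk",
    "ua", "kiev.ua",
    "au", "edu.au",
    "us", "ma.us", "k12.ma.us", "pvt.k12.ma.us",
}


def registered_domain(host, suffixes=SUFFIXES):
    """Return the longest matching suffix plus one label to its left."""
    labels = host.lower().rstrip(".").split(".")
    # Candidates go from longest to shortest, so the first hit is the
    # longest matching suffix.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in suffixes:
            if i == 0:
                break  # the host itself is a suffix; nothing to prepend
            return ".".join(labels[i - 1:])
    return ".".join(labels)  # no usable suffix: fall back to the full host


print(registered_domain("bt.0day.kiev.ua"))               # 0day.kiev.ua
print(registered_domain("www.domain.edu.au"))             # domain.edu.au
print(registered_domain("tracker.domain.pvt.k12.ma.us"))  # domain.pvt.k12.ma.us
```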

comment:5 Changed 11 years ago by Chionsas

After a short discussion on IRC, I suggest having a cache for parsed hostnames. This is to avoid tracker1.domain.tld and tracker2.domain.tld being considered two separate trackers.

A one-time DNS query could determine the real domain of a hostname and then be cached for future use. This single expensive operation would not be overkill :)
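
A minimal sketch of such a cache follows. The class name HostDomainCache and the injected resolve callable are hypothetical; how the DNS query would actually determine the domain is left open, so the example uses a stand-in function in its place.

```python
class HostDomainCache:
    """Hypothetical cache: hostname -> domain, resolved at most once."""

    def __init__(self, resolve):
        self._resolve = resolve  # the expensive lookup (e.g. a DNS query)
        self._cache = {}

    def get(self, host):
        key = host.lower()
        if key not in self._cache:
            self._cache[key] = self._resolve(key)
        return self._cache[key]


calls = []


def fake_resolve(host):
    # Stand-in for the real lookup: record the call, keep the last two labels.
    calls.append(host)
    return ".".join(host.split(".")[-2:])


cache = HostDomainCache(fake_resolve)
print(cache.get("tracker1.domain.tld"))  # domain.tld
print(cache.get("tracker2.domain.tld"))  # domain.tld
print(cache.get("tracker1.domain.tld"))  # domain.tld (served from the cache)
print(len(calls))                        # 2
```

With this, tracker1.domain.tld and tracker2.domain.tld map to the same domain, and each hostname pays for the lookup only once.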

comment:6 Changed 11 years ago by Cas

  • Milestone changed from 1.3.6 to 1.4.0

comment:7 Changed 7 years ago by Cas

  • Milestone changed from 2.0.x to 2.x

Milestone renamed
