Is BitTorrent Evil?
Posted 10 Aug 2007 at 04:21 UTC by ncm 
Never mind copyright abuse, porn, and corporate co-optation. Just looking
at usage of the global set of point-to-point links ("tubes" per Sen.
Stevens), BitTorrent and similar protocols draw down the available
network capacity many times more than necessary to move files or
frames from the machines that have them to the machines that want them.
What can we do instead?
Maybe you're not convinced, because BT has always worked fine for you.
So, think about what happens when Haughty Hydra or Debbie Does Jeddah
is released. You grab a .torrent and start downloading. Ten
thousand other people do the same thing, all over the world. Most are
in the U.S. and Europe. BT starts collecting chunks of the file from
whoever has them. Some people who have them are
right nearby, but most are far, far away. BT doesn't care where they
are, it just takes them. However, the load it puts on the internet when
it gets a chunk from across the Atlantic is appallingly greater than
when it gets one from nearby.
Each time a packet goes onto a fiber it delays some other packet,
and a transatlantic packet may do that dozens of times, going through
very heavily-contested fibers. The same payload from the guy next door
would have exactly equal value to you, but it would delay many, many
fewer packets, over typically less heavily loaded, higher-capacity
links. How did that chunk get across the ocean in the first place?
Maybe it came from your neighbor, and went all the way across before
you copied it back again.
Why should you care? You're paying a flat rate. However, your ISP
isn't. They pay per packet, more or less, and extra load drives up
prices that come back to you. Worse, you pay in reduced competition,
because the companies that own the big tubes, er, pipes have an
overwhelming advantage over the little ISPs who have to use them.
Reports that 30% of net traffic is BT packets don't mean success, they
mean failure: that same amount of data sent optimally might be hard to
notice.
What is to be done? We need a protocol that chooses its chunks from
the nearest possible source first. The longer you wait to ask for a
chunk that's far away, the more likely it is that somebody nearby will
have it by the time you do ask. Measuring distance is hard, but
measuring latency is a pretty good substitute. Nobody can afford to
measure distance to everybody else, but you can trade measurements with
nearby nodes you discover.
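
A minimal sketch of the "nearest source first" idea, assuming each peer already has a measured round-trip latency. The `Peer` type, the `pick_source` helper, and the latency thresholds are all hypothetical, made up for illustration; none of this is part of any existing BitTorrent client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    name: str
    latency_ms: float   # measured round-trip time, a stand-in for distance
    chunks: frozenset   # indices of the chunks this peer currently holds


def pick_source(chunk, peers, max_latency_ms):
    """Return the lowest-latency peer holding `chunk`, or None if every
    holder is farther away than `max_latency_ms` (meaning: wait, ask later)."""
    holders = [p for p in peers
               if chunk in p.chunks and p.latency_ms <= max_latency_ms]
    return min(holders, key=lambda p: p.latency_ms, default=None)


# Hypothetical swarm: two nearby peers and one across the ocean.
peers = [
    Peer("next-door", 5.0, frozenset({0, 1})),
    Peer("same-city", 12.0, frozenset({1, 2})),
    Peer("overseas", 140.0, frozenset({0, 1, 2, 3})),
]

# Early on, accept only nearby sources; chunk 3 is deferred (None).
near = {c: pick_source(c, peers, max_latency_ms=50.0) for c in range(4)}

# Later, widen the radius so the one remaining chunk comes from far away.
far = pick_source(3, peers, max_latency_ms=500.0)
```

The threshold that grows over time stands in for "the longer you wait to ask for a chunk that's far away": only chunk 3 ever crosses the long link, even though the overseas peer holds everything.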
We need to replace BitTorrent. Has anybody already invented this?
Will you?
Yes and no., posted 10 Aug 2007 at 09:58 UTC by quad »
(Journeyer)
Bittorrent could be seen in the opposite light as well. Put a file on a
server in North America, and now 100% of the file is passing across
those poor small fibres, for foreign downloads.
At least with Bittorrent, there's a chance you can grab a piece from
your next door neighbour.
I don't think that 30% means failure either. It just means people want
to share files. Bittorrent is just the best way to do it currently, no
matter who you are, individual or megacorporation.
I'm glad to see optimization is being looked into. Seems to be
following usual software practice as well: first implement the feature,
then worry about optimization.
The opening blurb ("many times more than necessary") is patently false.
A user of a torrent does not download the same piece twice or more,
so the bandwidth use is the same as with FTP. So, opening with blatant
lies ruins the article.
The locality argument would have carried some weight if the author
provided measurements in support of the thesis. Otherwise one may argue
that most torrent connections are local in Internet terms, because local
connections offer better bandwidth. Whenever I sample my connections,
the majority of users are in North America and very few are in Europe.
That's because the transatlantic links are congested and so their
throughput is lower. To be sure, this beneficial effect is mitigated by
users throttling at the endpoint and thus distorting the picture that a
node observes. So we don't know what is right. But this is just why
actual research is important instead of handwaving.
When citing "reports" it helps, you know, to actually cite them. For one
thing, I "heard" "reports" that 75% of traffic on unnamed "backbone
networks" is BT. And which number is right? Maybe both?
"Many times more than necessary" is not really patently false, unless you
somehow manage to interpret that to mean "many times more than FTP, which as
cdfrey points out probably makes packets go longer distances than necessary
anyway".
(But that's just my charitable interpretation of your interpretation. A less
charitable interpretation would be something like "my own argument about terrorism and left-wingers
was totally busted by ncm, so I'll look for a flimsy excuse to interpret his
words in the worst light possible so that I can say something bad about
him.")
Anyway, the Cache Discovery Protocol does sound like what ncm wants. But I'm
interested to know what further work there is in improving audio and video
compression, so that people don't even have to download that much data in
the first place.
Compared to what?, posted 14 Aug 2007 at 21:46 UTC by ncm »
(Master)
I don't want to make assumptions about why Pete missed the point so
badly. We all have lapses. Suffice to say that I never mentioned FTP.
I would not have guessed anybody would see FTP as a standard of
comparison for optimal use of network resources. Worst case, maybe.
To be precise, an optimal distribution mechanism would send each
chunk across the transatlantic link exactly once. If a chunk is sent
across five times, that's five times more than necessary. Never mind
that FTP might run it across five thousand times.
(Pete, accusing people of lying just because you don't understand
their argument interferes with rational discussion. "bi", we can each
make up our own uncharitable interpretations; posting yours doesn't help.)
I'm very grateful to quad for pointing out CDP, Ono, and Vivaldi.
Actually, Chris, BT (at least, classical BT) isn't the current best
way, from a network utilization standpoint. The current best is Akamai.
There are lots of ways in which it's not good, but it's silly to argue
about which legacy method is least bad. The point is to invent
something better. That BT is a good starting point seems already to
have been recognized.
http://www.isa.its.tudelft.nl/~pouwelse/Bittorrent_Measurements_6pages.pdf
It would be interesting to come up with an equation showing the relation between proximity and bandwidth utilization.
But then, wasn't that concept already established decades ago? As old as the sliding window protocol?
Utilization, posted 15 Aug 2007 at 00:54 UTC by ncm »
(Master)
I didn't think the problem would be so hard for people to understand.
Imagine a very simple network: five nodes, connected in a line:
A <--> B <--> C <--> D <--> E
A has a file, and the rest each want a copy of it. If each takes
a copy via FTP, it traverses AB four times, BC three times, CD twice,
and DE once, 10 total. What's optimal? It could traverse each link
once, 4 total. What does classical BitTorrent do? Each gets about 1/4
of its pieces from each of the other nodes, for a total cost of 7.5.
Now, imagine BC is not a single hop, but actually runs through two dozen
routers.
A <--> B <-- ... --> C <--> D <--> E
Every packet sent through BC costs 24 times as much as one carried on AB
or CD. FTP costs 1+25+26+27 = 79. Optimal costs 1+24+1+1 = 27. BT costs
((1+24+25+26) + (25+24+1+2) + (26+25+1+1) + (27+26+2+1))/4 = about 59,
more than twice optimal.
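
The arithmetic for both line networks can be checked with a short script. This is only a sketch of the toy model described above: the function name `line_costs` is made up here, node 0 plays the role of A, and "classical BT" is modeled as every downloader taking an equal share of the file from each of the other nodes.

```python
def line_costs(link_costs):
    """Cost of giving every node on a line a copy of a file held by node 0.

    link_costs[i] is the cost of sending the whole file once over the
    link between node i and node i+1. Returns (ftp, optimal, bt).
    """
    n = len(link_costs) + 1

    def dist(i, j):
        # Cost of moving the whole file between nodes i and j.
        lo, hi = sorted((i, j))
        return sum(link_costs[lo:hi])

    # FTP: every downloader fetches the whole file from node 0.
    ftp = sum(dist(0, j) for j in range(1, n))
    # Optimal: the file crosses each link exactly once.
    optimal = sum(link_costs)
    # Toy BT: each downloader takes 1/(n-1) of the file from each other node.
    bt = sum(dist(i, j)
             for j in range(1, n)
             for i in range(n) if i != j) / (n - 1)
    return ftp, optimal, bt


print(line_costs([1, 1, 1, 1]))    # (10, 4, 7.5)
print(line_costs([1, 24, 1, 1]))   # (79, 27, 59.25)
```

Changing the 24 to a larger number shows the BT figure growing far faster than the optimal one, since the long link is crossed many times instead of once.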
Now, imagine a hundred nodes in place of A and B, and another hundred
in place of C, D, and E (not all in a line). Under classical BT, each
node will bring about half the file across BC or CB, so it will cross a
hundred times, at a
total cost of 2.4k, and get the other half locally, at a cost of a few
hundred, say 3k total. Optimal is 224. FTP is close to BT.
Improved BT+CDP or BT+CDP+Ono might be much closer to 224 than 3000.
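
A back-of-envelope version of the two-cluster estimate, as a sketch only. The figures of 200 nodes, a long-link cost of 24, and "about half the file" come from the text above; `local_cost` and `avg_local_hops` are assumptions added here to put a number on "a few hundred".

```python
nodes = 200            # a hundred on each side of the long link
bc_cost = 24           # cost of one full-file crossing of BC
local_cost = 1         # assumed cost of one full-file hop inside a cluster
avg_local_hops = 3     # assumed typical local path length (illustrative)

# Classical BT: each node pulls about half the file across the long link.
bt_long_haul = nodes * 0.5 * bc_cost
# The other half comes from peers on the same side.
bt_local = nodes * 0.5 * local_cost * avg_local_hops
bt_total = bt_long_haul + bt_local

# Optimal: one crossing of BC, then roughly one local hop per node.
optimal = bc_cost + nodes * local_cost

print(bt_long_haul, bt_total, optimal)
```

Under these assumptions the long-haul share alone is 2400 against an optimal total of 224, so the exact local cost hardly matters to the conclusion.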
I admit I didn't define my "best" very clearly, and in the context of
ncm's replies, I guess it isn't quite accurate. :-)
With a title including the word "evil", part of my gut reaction is to
come to the defence of Bittorrent, since one of its good points is that
it allows anyone to serve up large files if needed. That gives it a
huge advantage in the "best" column in my books. Video podcasts need
this sort of thing, and I'm often amazed at how infrequently bittorrent
is used.
From this point of view, Akamai is not nearly as accessible.
Anyway, focusing on the technical is good. Bittorrent does expand to
fill all available bandwidth, and once that occurs for a transatlantic
pipe, bittorrent is still happy. (Everything else suffers.) The math
would change, with B to C traffic slowing down and taking a smaller
fraction of the traffic, and the clouds on each side filling in the gaps.
The old fashioned way to handle this was mirror servers. This suggests
that different trackers for various geographical areas could be useful,
but messy.
This also suggests that I don't have any brainy ideas that other people
haven't thought of before. :-)
It would be useful if an ISP could run an 'auto-tracker' or something
that noticed when several of its customers were downloading the same
thing and tried to get them all to talk to each other, while it opened
2-3 links of its own to the outside, each traversing one of its major
backbone links.
I remember the digital fountain people showing up at CodeCon and
telling everybody how great their patented multicast based technology
was and wondering why anybody was using BitTorrent or anything similar.
Their patented multicast technology wouldn't have had the problem
ncm is complaining about and it was quite spiffy. But,
of course, their technology was patented, which relegates it to the
'useless for 15 years' heap.
Evil vs. Rude, posted 15 Aug 2007 at 19:11 UTC by ncm »
(Master)
cdfrey: By the formal definitions, I suppose the title should have been "Is BitTorrent Rude?", but that would have been much less catchy. Of course, pervasively deployed mirror servers are the cooperative equivalent of Akamai et al.
The Wikipedia article claims that BitTorrent, Inc. has failed to document CDP, but who knows?
Torrent for ISPs, posted 16 Aug 2007 at 23:45 UTC by ncm »
(Master)
There might be a market for "torrent-spoofers" for ISPs. Imagine if
your ISP noticed torrents and inserted itself into the conversation. It
could save off copies of chunks requested by subscribers and offer them
to all the rest, while making those chunks apparently unavailable from
upstream. Less helpfully, it could throttle outward traffic, and favor
delivering to other copies of itself. I wonder if BitTorrent, Inc. is
doing this.
I'm late to the party and I have no actual experience with the
bittorrent protocol. However, I'm not sure it actually works as Nathan
describes it.
My understanding is that bittorrent clients try to select peers that
they get good throughput with. If that's correct, this should favor
local peers somewhat.
In the example of the transatlantic link, sure, if the link is
uncongested and you get good throughput on it, the torrent client might
use it, but then does it really matter much in that case? If the link
is congested, though, clients will start looking for peers they have
better throughput with, i.e. most likely on their side of the pond.
Now this is only my intuition, I have absolutely no idea if this works
as nicely as that in practice.
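
A simplified sketch of that intuition: rank peers by observed download rate and keep only the fastest few. The function name, the slot count of 4, and the rates below are illustrative assumptions, not taken from any client's actual code.

```python
def prefer_fast_peers(rates_kbps, slots=4):
    """Return the names of the `slots` peers with the highest observed rate."""
    ranked = sorted(rates_kbps, key=rates_kbps.get, reverse=True)
    return ranked[:slots]


# Hypothetical measurements; names only hint at where each peer sits.
observed = {
    "local-1": 900,
    "local-2": 750,
    "overseas-1": 120,
    "local-3": 600,
    "overseas-2": 80,
    "overseas-3": 640,
}

chosen = prefer_fast_peers(observed)
print(chosen)
```

With these made-up rates the congested overseas peers drop out, but a fast overseas peer is still picked over a slower local one, which is why throughput favors locality only "somewhat".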
Recently you requested personal assistance from our on-line support
center. Below is a summary of your request and our response.
If this issue is not resolved to your satisfaction, you may reopen it
within the next 7 days.
Thank you for allowing us to be of service to you.
Your Service Coordination Group is here to help with any questions or
concerns you have about your RealtyTrac account, tools, services or
information.
We are available by telephone at 1-877-888-8722 Monday - Friday 8am -
5pm (PST).
Sincerely,
The Service Coordination Group
RealtyTrac, Inc.
Phone: 1-877-888-8722
Fax: 1-949-861-9413
----------------------------------------------------
My initial problem is that I entered my credit card for a 7-day trial
of their service and didn't realize that if I didn't cancel it on my end,
they'd automatically charge the membership dues... so I sent them this
email:
I do NOT honestly recall ordering services that warrant these charges:
08/28/07  POS  REALTYTRAC INC 949-502-8300, CA REFID:087239820577382  -49.95
08/21/07  POS  RTI PUBLISHING 949-502-8300, CA REFID:087231273057506  -29.95
07/30/07  POS  REALTYTRAC INC 949-502-8300, CA REFID:167208820225576  -49.95
regards
your former trial-only member.
--------------------------------
After receiving their customer response,
the resolution I'm contemplating is to fax them my incident report every
7 days with my computer's intelligent program. Does that seem like fair
enough self-service?
moderation, posted 5 Sep 2007 at 02:50 UTC by ncm »
(Master)
I'm reading badvogato's posting above (like those elsewhere) as a call for some sort of moderation capability in Advogato articles. Maybe replies should be visible or not according to whether diary postings are?