Official LinuxDC++ release

DC++’s source is split into two entities: the interface (which is currently undergoing massive change) and the core (which handles all of the things you don’t see). While DC++’s interface isn’t platform-independent, the core (mostly) is. As such, ports to other platforms have been conducted.

One of these ports is LinuxDC++. LinuxDC++ uses DC++’s core and has an interface written in GTK+. While the name may suggest that the client is Linux-only, it has been reported to run on other platforms as well (such as *BSD). LinuxDC++ can do everything that DC++ can, and it doesn’t look or behave differently from other clients’ point of view. It’s DC++ – with a different graphical representation.

Yesterday, LinuxDC++ came out of its beta CVS state and was officially released.

The LinuxDC++ team is pleased to announce the release of LinuxDC++ 1.0.0. This release is the culmination of almost three years worth of work. Hopefully you’ll find it was worth the wait. Thanks to all of those who have contributed in making this release possible. You can look forward to many more things to come in the near future including i18n support, user commands and magnet support.

You can download LinuxDC++ here, though you need to compile it yourself (unless your distribution has a binary for you).

Defeating traffic shaping with encryption

Many ISPs now throttle P2P connections. The precise mechanisms vary, but rather than concentrate on an individual step in this iterative measure/countermeasure process, this post will attempt to skip several intermediate steps and assume ISPs interfere more than, in general, they currently do. Thus, the threat model of interest will be that of ISPs which already implement deep packet inspection to examine not just the structure but the protocol-specific content of users’ traffic. The implementation of l7-filter demonstrates such products’ minimal capabilities; some traffic shapers can do much more.

Therefore, to evade even the level of traffic shaping l7-filter can provide, a P2P protocol must at least avoid regex-detectable patterns. Both NMDC and ADC, as widely implemented, currently fail this metric rather dramatically, leaving them trivially throttled. Importantly, even the initial connection setup must avoid leaking any fixed patterns unique to P2P whilst still allowing the two DC clients to communicate with each other. Satisfying this constraint requires that any remaining plaintext be generic enough that traffic shapers cannot attribute it to DC with much confidence. Sufficient opportunity exists within an operational DC network to render this much practical.
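As an illustration of how little sophistication such detection needs, a few lines of Python can flag an NMDC handshake by regex alone. The pattern below is modeled on the style of l7-filter’s Direct Connect signature, not copied from it, and is only a sketch:

```python
import re

# Rough approximation of an l7-filter-style signature: NMDC connections
# open with fixed commands such as $MyNick, $Lock or $Key.
DC_PATTERN = re.compile(rb"^\$(MyNick|Lock|Key) ", re.IGNORECASE)

def looks_like_nmdc(first_bytes: bytes) -> bool:
    """Return True if the start of a TCP stream matches the NMDC signature."""
    return DC_PATTERN.match(first_bytes) is not None

# A plaintext NMDC client-client handshake is trivially flagged...
print(looks_like_nmdc(b"$MyNick someuser|$Lock EXTENDEDPROTOCOL ..."))  # True
# ...while an encrypted stream exposes no such fixed pattern.
print(looks_like_nmdc(b"\x16\x03\x01\x00\xa5\x01\x00\x00"))             # False
```

Any shaper that can run such a regex against the first packets of a connection throttles unmodified NMDC immediately; this is the bar the rest of the post assumes.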

More sophisticated traffic detection can perform traffic analysis on the timing, number, frequency, and bandwidth usage patterns of connections; such analysis remains effective even against countermeasures sufficient to defeat traffic shaping on the order of l7-filter. Numerous countermeasures, in turn, exist to respond to such analysis, though many of them are of limited utility in a P2P network that aims for bandwidth-efficiency. Indeed, to the extent that a primary goal of DC is to transfer data over a network from one computer to another, any countermeasure which renders that end more difficult becomes profitable only when the network environment becomes so hostile that protocols with less overhead can’t operate. Since that’s not yet largely the case, prematurely modifying DC to defeat traffic analysis risks backfiring and rendering it merely maladaptive.

Alternatively, an ISP might not even bother with such measures but instead effectively whitelist rather than blacklist traffic. Such a provider wouldn’t need to take any elaborate measures to preclude P2P traffic; instead, it would explicitly allow a limited number of approved and identifiable protocols and simply not include, say, DC in that set. Rogers of Canada, for example, has constructed such a system by throttling all encrypted transfers. Surviving under such a system would be difficult at best, but perhaps possible; it would essentially require network-borne steganography to simultaneously encrypt data yet hide the very presence of that encryption. Steganographic systems tend to require enough overhead to fall prey to the probable maladaptivity already discussed, and as such don’t fit well in the current, merely moderately hostile network environment. Further, since, as Michael Geist observes, such systems run false-positive risks high enough that they seem unlikely to long survive on any open or neutral network, they seem to serve more as reductios than as sustainable systems.

The threat model this post addresses, then, is that of a network which performs deep packet inspection but allows unrecognized traffic through basically unmolested. This post concentrates on protocol obfuscation and encryption as counters to threats present under this model.

The DC++ mod BCDC++ briefly implemented minimal obfuscation in the form of an option to send ‘garbage commands’. However, its flexibility whilst retaining compatibility with unmodified remote NMDC clients was limited. By simply prepending some structurally plausible yet syntactically incorrect NMDC commands to a client-client connection and leaving the rest of the connection unmodified, it could potentially fool simple-minded filtering machines, but even l7-filter could trivially detect it. This outcome was inevitable; NMDC as commonly implemented is structurally incapable of avoiding traffic shaping, though a nominally NMDC-based client willing to break compatibility could solve this. Not only does the protocol lack support for encryption, but its server-party-speaks-first design strongly hinders attempts to negotiate such a system afterwards. NMDC, then, provides a poor base for avoiding traffic shaping.

ADC, by contrast, not only contains a nascent, standardized secure extension in the form of ADCS (section 6.5 of the draft specification), but the connecting client speaks first in a connection. Effort to render DC resistant to traffic shaping is thus better directed towards ADC-based clients than NMDC-based clients. Again, the options of obfuscation and more substantial encryption emerge. Whilst these categories blend into each other, mere obfuscation has the weakness of providing barely more protection against traffic shaping than plaintext. In effect, it encourages man-in-the-middle attacks by network operators since, lacking any notion of actual privacy against outside interference, an eavesdropper need merely keep enough state to decode an ongoing DC connection. Although this does appear beyond the ken of l7-filter, stateful inspection, which can perform such attacks, is technically feasible and likely in use.

Given this potential, defeating traffic shaping should rely not just on obfuscation but on full-fledged encryption. Man-in-the-middle attacks can affect encrypted connections as well, but so long as both sides can communicate out of channel, they can be avoided. For example, the setup necessary for a client-client connection’s encryption to avoid MITM interception could be performed in the client-hub connection from which it spawned.
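A sketch of that idea, under the assumption of a hypothetical scheme (not any existing DC extension) in which each client publishes the SHA-256 fingerprint of its TLS certificate over the separate, trusted client-hub connection; the receiving client then rejects any client-client connection presenting a certificate with a different fingerprint:

```python
import hashlib

def cert_fingerprint(cert_der: bytes) -> str:
    """SHA-256 fingerprint of a certificate in DER form."""
    return hashlib.sha256(cert_der).hexdigest()

def verify_peer(expected_fp: str, presented_cert_der: bytes) -> bool:
    # expected_fp arrived out of channel (relayed by the hub); the
    # certificate itself arrives in channel during the TLS handshake.
    # A man in the middle can substitute the in-channel certificate but
    # not the hub-relayed fingerprint, so a mismatch exposes interception.
    return cert_fingerprint(presented_cert_der) == expected_fp

real_cert = b"...peer certificate bytes..."
assert verify_peer(cert_fingerprint(real_cert), real_cert)         # genuine peer
assert not verify_peer(cert_fingerprint(real_cert), b"mitm cert")  # intercepted
```

The names and message flow here are made up for illustration; the point is only that a second channel lets each side pin the other’s key material before the encrypted client-client connection starts.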

Having shown that the otherwise most viable alternatives to implementing true encryption in ADC would fail, a few disclaimers remain necessary. First, because this cryptosystem can be defeated by simply logging on to a hub as an ordinary user, one should be clear that it protects not against RIAA/MPAA-style attacks (searching for files and then reporting the users who return search results) but merely against meddlesome network operators. Second, encryption carries some computational overhead, but for all but 100Mbit+ users or those with ancient computers, it remains relatively unobtrusive CPU-wise. Third, one should not under any circumstances use homegrown cryptosystems, because, lacking extensive peer review, they’re almost certainly insecure. Using an openly documented, widely employed, and extensively analyzed protocol such as TLS or SSH2 not only presents plaintext similar to that of non-P2P protocols towards which ISPs might not be so hostile, but also avoids most of the aforementioned pitfalls of cryptosystem implementations, and therefore should be strongly preferred. To further avoid side-channel attacks, one should prefer not just such a protocol but a well-developed implementation of that protocol.

Though neither NMDC-based clients nor those merely relying on obfuscation would likely work for long, then, an ADC-based client employing encryption should allow a DC user to evade many mechanisms of traffic shaping.

Don’t forget that you can make topic suggestions for blog posts in our “Blog Topic Suggestion Box!”

Your direct connect client has supplied an invalid IP address in a connection request

If you received “Your direct connect client has supplied an invalid IP address in a connection request (client sent xxx.xxx.xxx.xxx, you have yyy.yyy.yyy.yyy)” in a hub, I hope you didn’t scratch your head about it for too long.

Anyway, what this error message means is that when you try to connect to another user (to download or upload), your client is sending an IP address that’s invalid. Got it? “client sent xxx.xxx.xxx.xxx, you have yyy.yyy.yyy.yyy”.

Well, just go into Settings > Connection settings, and input “yyy.yyy.yyy.yyy” in the “External / WAN IP” box.

(I think the hub may automatically send the correct one, if it has $UserIP enabled.)

Don’t forget that you can make topic suggestions for blog posts in our “Blog Topic Suggestion Box!”

Sockets and buffers

I’m going to make a stab at explaining the difference between the old “Use small send buffer (enable if upload slow downloads a lot)” option and the new “Socket write buffer” and “Socket read buffer” options.

The old “small send buffer” option was added to help those whose network connections are such that their upload affects their download. When DC++ sends data through a socket (“a connection to another computer”), it every so often stops and waits until the other end says “yes, I got the information, give me more”. What this option does is set the interval at which DC++ should stop sending and start paying attention for that verification. More specifically, having this option on sets the interval (packet size) to 1 KiB, versus 16 KiB when it’s off. This means that your drive will work more (DC++ will read from it more often) and the speed of your downloads and uploads will be lower. [I have no idea why this option was removed. It won’t come back in the near future, as far as I know.]
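As a rough sketch of the trade-off (this is an illustration of the arithmetic, not DC++’s actual code): sending the same data in 1 KiB rather than 16 KiB packets means sixteen times as many stop-and-wait rounds, and correspondingly more reads from disk.

```python
def chunks(data: bytes, packet_size: int):
    """Split an upload into packets of at most packet_size bytes."""
    return [data[i:i + packet_size] for i in range(0, len(data), packet_size)]

payload = b"\x00" * (256 * 1024)      # a 256 KiB piece of a file
small = chunks(payload, 1 * 1024)     # "small send buffer" on
large = chunks(payload, 16 * 1024)    # "small send buffer" off
print(len(small), len(large))         # 256 16
```

Each extra round is an extra pause for acknowledgement, which is why the option trades raw speed for gentler behaviour on asymmetric connections.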

The socket write and read buffer options are different from the old buffer option. What these options do is set something called a “TCP window”.

TCP uses what is called the “congestion window”, or CWND, to determine how many packets can be sent at one time. The larger the congestion window size, the higher the throughput. The TCP “slow start” and “congestion avoidance” algorithms determine the size of the congestion window. The maximum congestion window is related to the amount of buffer space that the kernel allocates for each socket. For each socket, there is a default value for the buffer size, which can be changed by the program using a system library call just before opening the socket. There is also a kernel enforced maximum buffer size. The buffer size can be adjusted for both the send and receive ends of the socket.

To get maximal throughput it is critical to use optimal TCP send and receive socket buffer sizes for the link you are using. If the buffers are too small, the TCP congestion window will never fully open up. If the receiver buffers are too large, TCP flow control breaks and the sender can overrun the receiver, which will cause the TCP window to shut down. This is likely to happen if the sending host is faster than the receiving host. Overly large windows on the sending side are not a big problem as long as you have excess memory.
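The “system library call” the excerpt mentions is setsockopt with SO_SNDBUF and SO_RCVBUF; a minimal illustration (note that some kernels adjust the requested value – Linux, for one, reports back a doubled figure to account for bookkeeping overhead):

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Request 64 KiB buffers before connecting, as a client would.
s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 65535)
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 65535)

# Read back what the kernel actually granted.
print("send buffer:", s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
print("recv buffer:", s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
s.close()
```

DC++’s “Socket write buffer” and “Socket read buffer” settings presumably feed values like these into the equivalent calls before opening its transfer sockets.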

The annoying thing here is that the buffer size isn’t something we can say is the “correct” or “incorrect” value. This is something you need to try for yourself. Having said that, you can approximate them.

Take your maximum throughput speed (eg, 10 Mbit/s) and multiply it by the “latency” between you and the other user. Basically, you can find out the latency by typing “cmd /k ping other_users_ip” in Run in Windows and looking at the round-trip time. (Note that the other party may be blocking pings.) (The latency is what you normally see as “lag” in games.) What all this means is that there’s no general formula covering all of the users that may affect you. In any case, if you have a ping time of 50 ms to most users, you should input (10,000,000 bits/s ÷ 8) × 0.05 s = 62500 in DC++. The default value DC++ uses is 65535, so that’s quite close.
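This computation is the classic bandwidth-delay product; as a sketch:

```python
def buffer_size(bandwidth_mbit: float, rtt_seconds: float) -> int:
    """Bandwidth-delay product: link speed in bytes/s times round-trip time."""
    bytes_per_second = bandwidth_mbit * 1_000_000 / 8
    return int(bytes_per_second * rtt_seconds)

# 10 Mbit/s link, 50 ms ping: matches the 62500 worked out above,
# close to DC++'s default of 65535.
print(buffer_size(10, 0.05))  # 62500
```

Plugging in your own link speed and typical ping gives a starting point to tune from, though as noted, there’s no single value that’s correct for every peer.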

Disclaimer: I may be wrong about “a little” or “a lot” in this post, but I think I got the bigger picture fairly correct.

Where is my downloaded file?

When you start downloading a file with DC++, a temporary file is created, where the data is stored until the file is complete. The incomplete file will be named “filename.TTH.dctmp” (or “filename.TTH.dctmp.antifrag” if you have “Use AntiFragmentation Method for Downloads” on) and stored in the directory you’ve entered in Settings -> Downloads. Of course, if you’re using the %[targetdrive] option, the incomplete file will be stored on the respective drive you’re downloading to, depending on where you point your download.

“Sometimes, I see that I have downloaded a file (either through the transfer view, or the finished downloads window), but I can’t find the file in the correct download directory. Why?”, a bunch of people have asked.

To understand this, you have to know what DC++ does when a file is complete. When a file is completed, the file name is changed from “filename.TTH.dctmp” (with the possible “.antifrag”) to just the file name. DC++ then locates where you want the file to be moved to. If DC++ can find the location and has write access, it will move the file there. If DC++ can’t, it will (1) put the file in the default download directory, if the original directory doesn’t exist (that is, the drive doesn’t exist), or (2) just leave the file in the unfinished directory as it is. (2) usually happens when there’s no write access (but it’s not limited to that).
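The decision process just described can be sketched like this – an illustration of the logic, not DC++’s actual code, with made-up function and path names:

```python
import os
import shutil

def finalize_download(tmp_path: str, target_path: str, default_dir: str) -> str:
    """Move a completed .dctmp file to its final location, mirroring the
    fallback rules described above. Returns where the file ended up."""
    try:
        shutil.move(tmp_path, target_path)       # normal case: dir exists, writable
        return target_path
    except OSError:
        pass
    name = os.path.basename(target_path)
    try:
        fallback = os.path.join(default_dir, name)
        shutil.move(tmp_path, fallback)          # (1) original dir/drive missing
        return fallback
    except OSError:
        return tmp_path                          # (2) left in the unfinished dir
```

So if a “finished” file seems missing, check the default download directory first, then the unfinished directory, in that order.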

Don’t forget that you can make topic suggestions for blog posts in our “Blog Topic Suggestion Box!”

Mailing lists for DC++

The DC++ forum and Bugzilla have been gone a while now, and unfortunately there’s no progress on when they’re coming back.

However, Jacek has established some mailing lists that you can use to stay current.

The SVN commit list is quite useful to be on if you want to follow SVN updates. Beware, not all SVN commits appear in the list; they need to be approved first.

The newly opened development list is a useful resource if you’re interested in the development of DC++. (If we’re not posting about DC++’s progress here, of course.) Also, instead of mailing Jacek with your patches, he insists that you send patches to the mailing list¹.

The other newly opened mailing list is the bug list, where you can post about your bugs². (I’ve already noted it in the bug reporting page here. Feel free to post any of the mentioned items there as well in the bug list.)

¹ If you have a help file patch, you may send it directly to me, instead of cluttering the mailing list.
² I think you need to be a member on SourceForge to access the list.

Update: Well, apparently the bug list is read-only, so I’m re-opening the reporting page we have here.

Compiling DC++ in new format

It’s been a while since the last release of DC++. What has been going on since? Well, a lot of changes have been made, and this means that the compile process has changed too.

First of all, Microsoft’s Visual C++ is no longer required for the compile process. Instead, MinGW is used. MinGW stands for Minimalist GNU for Windows, and it’s the gcc (and g++) compiler ported to Windows.

To install MinGW, download the following packages from their site in tar.gz format (not the sources): w32api, binutils, mingw-runtime, gcc-core and gcc-g++. You need the 4.2+ version of gcc (it still appears as a technology preview).

There are two variants of gcc, dw2 and sjlj. It is possible to work with the sjlj variant, but I used dw2 and it worked fine. Untar all of those into a folder named, for example, MinGW, go to MinGW/bin and copy gcc-dw2.exe to gcc.exe and g++-dw2.exe to g++.exe. Now, add MinGW/bin to your path.

Install SCons and add it to your path (SCons requires Python, so you will need to install that first).

Download HTML Help Workshop. Copy the include and library files to the respective directories in the htmlhelp folder. Make sure hhc.exe is in your PATH.

The installation of STLport was the main problem I faced while trying to compile. First, install Cygwin. After that, get STLport (version 5.1.3, the latest at this time). Unzip STLport to the stlport folder in the DC++ source. Now, run Cygwin and browse to stlport/build/lib.

Now, type:

make -f gcc.mak

After it’s finished, also type:

make -f gcc.mak install
make -f gcc.mak install-release-static
make -f gcc.mak install-dbg-static
make -f gcc.mak install-stldbg-static

The last three are required so that DC++ will run stand-alone (with no required DLLs).

Now, open a command prompt and run scons. The following options are available:

“tools=mingw” – Use mingw for building (default)
“tools=default” – Use msvc for building (yes, the option value is strange)
“mode=debug” – Compile a debug build (default)
“mode=release” – Compile an optimized release build

I used scons tools=mingw mode=release

If you get an error about uPnP or something like that, you need to get natupnp.h and put it in the MinGW/include folder.

If other reference errors show up, try running scons again.

Don’t worry if lots of warnings appear (they don’t stop the compile process, and they will be fixed in the near future).

Don’t get scared if the .exe is a bit large (it contains redundant symbols). The exe will be optimized to a smaller size (I got an 88 MB exe, and after optimization it should get below 8 MB).

ADC, not NMDC, can support multiple share profiles

NMDC cannot support multiple shares within a single client for the same reasons that it’s vulnerable to nick-faking; client-client connections cannot be unambiguously correlated. Identifying arbitrary client-client connections with the hubs from which they emanate is actually simpler than precluding nick-faking; any protocol which can prevent such an exploit, which requires characterizing not only the hub but the specific user, has the required machinery to allow unambiguous multiple share support. Because ADC can prevent nick-faking, it can also provide multiple share support in a manner unavailable to NMDC.

Whilst a protocol capable of identifying both user and hub can trivially identify a subset of that information, namely the hub alone, which is sufficient for multiple share support, that doesn’t imply the converse: that a protocol unable to identify both user and hub is also unable to identify just the hub, which would alternatively suffice. However, when applied to NMDC and ADC, the converse does hold. A gap between this claim and its converse would require NMDC to have some mechanism capable of identifying the hub but not the user. Because in C-C connections the sole identification, a nick, is user-centric rather than hub-centric, no such mechanism exists. Therefore, one can reasonably conclude that no method beyond those previously discussed and dismissed exists within NMDC that would, after all, allow it ADC-level multiple share functionality.

Finally, though the CTM token of ADC has weaknesses, these are, both by contrast with NMDC’s client-client user identification weaknesses and in absolute terms, negligible. In particular, multiple users can collude across hubs to produce identical CTM tokens, and even sans such active malice, such a situation can by chance occur. However, multiple remote clients capable of cooperating to that degree might as well be treated as one client; indeed, a single remote client can behave arbitrarily whilst claiming a constant CTM token as well, so such collusion doesn’t actually increase vulnerability. The situation occurring by chance also fails to warrant substantial concern, provided a high-quality randomness source. This is an example of a birthday attack; the chances of such a collision remain negligibly small with at least 128 bits of randomness even with a large number of CTM tokens drawn.
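The birthday-attack arithmetic can be checked directly; with 128 random bits, even a billion outstanding tokens give a collision probability on the order of 10⁻²¹:

```python
import math

def collision_probability(tokens: int, bits: int) -> float:
    """Approximate birthday-collision probability for `tokens` values drawn
    uniformly from a 2**bits space: p ~= 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = -tokens * (tokens - 1) / 2 / (2 ** bits)
    return -math.expm1(exponent)  # numerically stable 1 - exp(exponent)

p = collision_probability(10**9, 128)
print(p)  # ≈ 1.5e-21: negligible even with a billion CTM tokens in flight
```

The standard approximation used here is accurate precisely in this regime of tiny probabilities, which is why 128 bits of honest randomness settles the chance-collision concern.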

As none of these weaknesses poses as substantial a threat to the reliable identification of a C-C user with an identity derived from an existing hub as those found in NMDC (indeed, they pose a negligible threat at all), ADC-based systems can well provide functionality that NMDC-based systems not only historically have not but theoretically cannot robustly and reliably provide, including multiple share support, resistance to nick-faking, and any other feature dependent on associating client-client connections with hub users.

Don’t forget that you can make topic suggestions for blog posts in our “Blog Topic Suggestion Box!”

Nick faking in NMDC

The NMDC protocol contains several usability and security flaws which ADC fixes, including that of failing to unambiguously identify users to each other over client-client connections. As an example of its trivial exploitability under NMDC, this inability to cross-reference client-client and client-hub identifying tokens results in vulnerability to nick-faking.

This dearth of corroboration beyond a potentially ambiguous global nick ensures that no reliable way exists for a standard client to authoritatively confirm or refute a remote client’s ostensible self-identification, and thus to avoid nick-faking. At least four detection methods ameliorate but fail to solve this problem in NMDC: UserIP-based correlations, search result-based correlations, temporal correlations, and depending on the implementation, the simple expedient of asking someone whose nick is being faked if they’re responsible for a given C-C connection. None, however, suffice to create a reliable and secure P2P protocol.

Relying on $UserIP fails both in that no one-to-one mapping between users and IPs necessarily exists and in that many hub-owners evince reticence towards distributing such information to users. Even were hub-owners more open to distributing users’ IPs, or were one to trust nick/IP correlations derived from search results, multiple NAT’d users will appear from the same IP, meaning that though different IPs imply different users, the converse remains untrue: two connections having the same IP does not imply that they’re from the same user. These inherent issues with IP-based correlation imply the continued vulnerability of NMDC to nick-faking, even allowing these methods.

Not only do IP-based methods leave ambiguity, but intentional cheating, protocol innovations such as upload queuing, and even heavily lagged hubs similarly undermine useful dependence on temporal correlations. Traditionally, DC clients can know which other users might connect to them because such connections generally occur in response to a $ConnectToMe message. A client seeks either to upload or to download. If seeking to download, standard procedure is to send a $ConnectToMe, which allows for the creation of a list of expected C-C connections and thereby dramatically narrows the possibilities upon reception: if one has sent a $CTM to a user with nick X on hub Y rather than hub Z, then receiving an incoming C-C connection claiming nick X strongly suggests that one’s viewing the user from hub Y and not hub Z.
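Such a list of expected connections might be kept as a simple mapping; the sketch below is illustrative only, with names and structure not taken from any real client:

```python
class ExpectedConnections:
    """Track outgoing $ConnectToMe requests so that incoming client-client
    connections can be narrowed down to the hub they likely came from."""

    def __init__(self):
        self._expected = {}          # nick -> set of hubs we sent a $CTM on

    def sent_ctm(self, nick, hub):
        self._expected.setdefault(nick, set()).add(hub)

    def probable_hubs(self, nick):
        return self._expected.get(nick, set())

exp = ExpectedConnections()
exp.sent_ctm("X", "hub Y")
# An incoming connection claiming nick X is now probably the user on hub Y...
print(exp.probable_hubs("X"))        # {'hub Y'}
# ...but the guess turns ambiguous as soon as X is also expected from hub Z.
exp.sent_ctm("X", "hub Z")
print(len(exp.probable_hubs("X")))   # 2
```

The ambiguous second case is exactly where this heuristic, like the others, falls short of real identification.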

However, this can still fail – perhaps because hub Y hasn’t relayed the $CTM to the receiving user, because that user does not deign to respond to that message with a C-C connection, or because not all C-C connections occur in response to a proximate message at all. Upload queues, for example, can indefinitely delay a C-C response until a slot actually opens up, and thus break mechanisms depending on temporal proximity. Any of these alternative paths to a C-C connection breaks temporal correlation-based methods.

Finally, even combining multiple techniques, including just asking the ostensible other user, does not eliminate this vulnerability: the automated techniques, taken together, don’t cover even the set of likely possible events, and such manual intervention indicates a basic brokenness of the protocol. To find examples, just take the intersections of the sets of counterexamples of each technique, since they operate basically orthogonally – an upload-queuing user behind a NAT breaks both. One can appeal to manual fixes in response to arbitrary protocol flaws – taken to its limit, one can run a DC-like system entirely manually. As such, suggesting that tactic is a non-response.

Whilst none of these fixes reliably fixes NMDC, ADC robustly avoids nick-faking in client-client connections by use of the CTM token.

Don’t forget that you can make topic suggestions for blog posts in our “Blog Topic Suggestion Box!”

Cross-compiling DC++ on Linux

Just some notes:

  • If you use a binary-based distribution, don’t use the default STLport – that’ll have been compiled for the native platform, not Win32. Compile your own copy from source, however you do it.
  • Your distribution might have a MinGW cross-compiler already packaged and available. Debian and Ubuntu do, and possibly RPM-based distributions as well (I don’t use any and thus can’t verify that). The Gentoo site is down, so I didn’t check.
  • The default MinGW filename prefix in build_util.py is i386-mingw32. You might need to adjust that, either by setting the MINGW_PREFIX variable or just editing build_util.py. For example, the correct prefix on my development system is i586-mingw32msvc.
  • On at least the versions of the MinGW runtime packaged with the Linux distribution I use, commdlg.h lacks the OPENFILENAME_SIZE_VERSION_400 constant (it’s 76 decimal) and winuser.h lacks XBUTTON1/XBUTTON2 (VK_* and MK_* aren’t the same). Get the corrected header files from the official MinGW download site if necessary; they’re as portable as necessary.
  • Certain older versions of SCons on MinGW are vulnerable to a bug which DC++’s build system triggers. Either update SCons or apply the patch available if you experience that bug.
  • Finally, regardless of platform, using SCons’s --implicit-deps-unchanged option can dramatically speed up compiles if not too much has changed.

Don’t forget that you can make topic suggestions for blog posts in our “Blog Topic Suggestion Box!”
