Gnutella2 Specification Document
What
is the Scope of Gnutella2?
What
is the Gnutella2 Standard?
Common
Gnutella2 Standard (All Applications)
Gnutella2
Standard for File Sharing
Gnutella2
the Network Architecture
Node
Types and Responsibilities
TCP
Stream Connection and Handshaking
Compression
Negotiation and Encoding
Gnutella2
UDP Semi-Reliable Transceiver
/PI/RELAY
– Relayed Ping Marker
/PI/UDP
– Ping Response Address
/PO/RELAY
– Relayed Pong Marker
/KHL/NH/LS
– Library Statistics
/KHL/CH/LS
– Library Statistics
Known
Hub Cache and Hub Cluster Cache
Node
Route Cache and Addressed Packet Forwarding
Reverse
Connection (Push) Requests
/PUSH
– Push / Connect Back Request
Query
Hash Tables, Superset Table and Exchange
The
Aggregate or Superset Table
Gnutella2
Object Search Mechanism
/QKR/RNA
– Requesting Node Address
/QKA/SNA
– Sending Node Address
/QKA/QNA
– Queried Node Address
/Q2/UDP
– Return Address and Authentication
/Q2/URN
– Universal Resource Name
/Q2/DN
– Descriptive Name (Generic) Criteria
/Q2/SZR
– Size Restriction Criteria
/QH2/BUP
– Browse User Profile Tag
/QH2/HG
– Hit Group Descriptor
/QH2/H/URN
– Universal Resource Name
/QH2/H/URL
– Universal Resource Location
/QH2/H/DN
– Descriptive Name (Generic) Criteria
/QH2/H/CSC
– Cached Source Count
/QH2/H/PART
– Partial Content Tag
/QH2/MD
– Unified Metadata Block
/QH2/UPRO/NICK
– Nick/Screen Name
Simple
Query Language and Metadata
HTTP/1.1
Client for Downloading
User
Profile Challenge and Delivery
Profile
Challenge and Response/Delivery
/UPROC
– User Profile Challenge
/UPROD
– User Profile Delivery
This is a specification and standard
document for “Gnutella2”. It is currently a "first public draft". Comments and
revisions are welcome. This document covers the Gnutella2 architecture and
standard only: it does not deal with Gnutella1 or other related standards,
technologies and protocols that are well documented
elsewhere.
Gnutella2 is a modern and efficient
peer-to-peer network standard and architecture designed to provide a solid
foundation for distributed global services such as person to person
communication, data location and transfer and other future
services.
Peer to peer technologies have become
mainstream over recent years, and there are already a significant number of P2P
networks in various stages of development and operation. How does yet another network
help?
Gnutella2 is unique amongst the currently
operating peer to peer networks in several important ways:
The original “Gnutella” was created several
years ago as a very simple, single vendor file-sharing specific network. Its simplicity made it a popular
platform for file sharing application developers; however this simplicity also
critically limited its effectiveness.
As a result, competing file-sharing specific networks slowly but surely
took over as the tools of choice as Gnutella users became frustrated with poor
performance and turned elsewhere.
The original Gnutella[1] network was
designed for a very limited purpose and, despite many changes over the years,
remains limited today. Efforts to
make it a better file sharing network continue with mixed
success.
Gnutella2 shares the “Gnutella” name,
striving to create the network that Gnutella should have been from the
beginning. It shares the adopted
ideals of openness and cooperation, but offers a fresh start that was sorely
needed. The crippling limitations
of the old network have been left behind and replaced with an
entirely new network architecture ready to grow and develop through the
creative efforts of many.
The single name “Gnutella2” really refers to
two separate components: Gnutella2 the Standard and Gnutella2 the
Network.
The Gnutella2 Network is perhaps the most
easily recognised component. It is
a new high-performance peer to peer network architecture upon which a variety of
distributed applications can be built, such as file sharing applications,
communication tools, etc.
The Gnutella2 Standard is a set of
requirements for building applications which operate on the Gnutella2 network in
different capacities. It specifies
the minimum compliance level required to be recognised as a Gnutella2-compatible
application. Compliance with a
Gnutella2 Standard ensures participating applications provide a minimum
acceptable level of service to other network participants.
The Gnutella2 Standard is a set of
requirements for building applications which operate with the Gnutella2 network
in different capacities. For
example, the Gnutella2 Standard for File Sharing specifies a set of features and
behaviours which must be available in any Gnutella2-connected file-sharing
product offered to the public.
As an open, general purpose platform,
Gnutella2 networks must be able to operate with a diverse family of different
implementing applications. Every
effort has been made to limit the ill-effects a non-compliant application can
cause (deliberately or accidentally), however when it comes to critical features
such as common URN schemes and character encodings, minimum standards help to
ensure a favourable baseline user experience.
The open and transparent nature of the
Gnutella2 architecture makes technical enforcement difficult, so a more viable
(and hopefully, more productive) social scheme has instead been adopted. Only applications meeting the
appropriate Gnutella2 Standard may be marked as “Gnutella2-compliant”. Websites containing information about
Gnutella2 (such as gnutella2.com) are encouraged to list only compliant
applications, and application developers are encouraged to deny communications
with known non-compliant applications.
Applications which do not comply with the standard or are still in the
development process should never be made available to the public, however
private testing is always encouraged.
Ultimately it is the responsibility of the
developer to ensure their own application complies with the relevant standards,
both with respect to Gnutella2 and any other functionality they may be
including. However as an
inter-dependent community, developers of Gnutella2-compliant applications are
encouraged to take an interest in other Gnutella2 applications, and where
possible, examine them for compliance.
Similarly, new developers are strongly encouraged to seek assistance from
other developers in verifying their work.
This need not compromise competitive advantage – if the application is
sensitive, the important compliance testing phase can be performed in the days
prior to release.
At the current time, only one Gnutella2
standard has been published: the Gnutella2 standard for File Sharing
Applications, detailed below.
Additional standards for other application classes will be published in
the future as required.
Developers of new application classes are
operating in somewhat untried territory, and should review the existing
published standards for best practices which can be borrowed. In particular, the basic components of
the Gnutella2 network architecture should always be implemented in
full.
All applications making use of Gnutella2
technology for any application class MUST IMPLEMENT the following core
features:
Applications making use of Gnutella2
technology for file sharing MUST IMPLEMENT the following
features:
The remainder of this document is concerned
with the Gnutella2 network architecture, which is the most substantial part of
the Gnutella2 initiative and forms the core of the Gnutella2
standard.
The Gnutella2 network architecture consists
of several key components:
Note that a typical peer to peer application
utilising Gnutella2 will have many additional components related to its core
function. This list covers only
components with an obvious network interaction, and should not be used as an
exhaustive design guide. A typical
file sharing application may have in excess of 120 components at the core
level.
The Gnutella2 network is an ad-hoc, self-organising collection of interconnected nodes cooperating to enable productive distributed activities.
Not all of the nodes participating in the system are equal: there are two primary node types, “hubs” and “leaves”. The goal is to maximise the number of leaves and minimise the number of hubs, however due to the limited nature of resources the maximum viable ratio of leaves to hubs is limited. This quantity is known as the “leaf density”.
Leaf nodes are the most common node type on the network – they have no special responsibilities and do not form a working part of the network infrastructure. Nodes with limited resources must operate as leaf nodes: this includes limited bandwidth, CPU or RAM, low or unpredictable expected uptime, and inability to accept inbound TCP or UDP.
Hub nodes on the other hand form an important and active part of the network infrastructure, organising surrounding nodes, filtering and directing traffic over several media types. Hub nodes devote substantial resources to the network, and as a result their capacity to participate in higher level network functions is limited. Only the most capable nodes are selected to act as hubs, based upon the criteria in the following section.
Hubs are selected based on the following
internal criteria:
In addition to these internal factors, hubs must also consider the network’s need for additional hubs. Without a central point of authority the need for additional hubs cannot be determined with absolute certainty; however it can be approximated by examining the state of nearby nodes and specifically the state of hubs in the local hub cluster.
Nodes operating as Gnutella2 hubs have a set of responsibilities to meet. Hubs are highly interconnected, forming a “hub network” or “hub web”, with each hub maintaining a connection to 5-30 other “neighbouring” hubs. The number of hub interconnections must scale up with the overall size of the network.
Each hub also accepts connections from a large collection of leaf nodes, typically 300-500 depending on available resources. Leaf nodes are considered to be the “edge” of the network. In practice leaves simultaneously connect to three hubs, however from the point of view of the hubs each leaf is considered a dead end.
The group of hubs within the hub network spanning the local hub and its neighbours is termed the “hub cluster”, and is an important grouping. Hub clusters maintain constant communication with each other, sharing information about network load and statistics, exchanging cache entries and filtering tables. The hub cluster is also the smallest searchable unit of the network as far as a search client is concerned.
Hub responsibilities include:
TCP stream connections are established
between Gnutella2 nodes when they elect to form a permeant link, creating the
fundamental network topology of a highly interconnected hub network serving
dense clusters of leaf nodes.
TCP connections are initiated by leaf or hub
nodes in an attempt to gain a connection to a hub node. Leaf nodes are never the target of an
outbound connection. The TCP port
number is not standardised, and must be stored with the IP
address.
Upon the establishment of a TCP stream
connection between two Gnutella2 nodes, a handshaking phase must be completed to
negotiate the nature of the link and exchange other necessary
information.
This handshaking phase is the only part of
the communication which remains compatible with the old Gnutella network,
allowing new connections to be negotiated without fore-knowledge the
capabilities of the other node. The
handshaking process has been well documented elsewhere, however a short summary
is provided here.
The Gnutella handshake process consists of
three header blocks. The node which
initiated the connection sends an initial header block, of the
form:
GNUTELLA CONNECT/0.6
Listen-IP: 1.2.3.4:6346
Remote-IP: 6.7.8.9
User-Agent: Shareaza
1.8.2.0
Accept:
application/x-gnutella2
X-Ultrapeer: False
The receiver then responds with its own
header block:
GNUTELLA/0.6 200 OK
Listen-IP: 6.7.8.9:6346
Remote-IP: 1.2.3.4
User-Agent: Shareaza
1.8.2.0
Content-Type:
application/x-gnutella2
Accept:
application/x-gnutella2
X-Ultrapeer: True
X-Ultrapeer-Needed: False
Finally, the initiator accepts the
receiver’s header block, and provides any final
information:
GNUTELLA/0.6 200 OK
Content-Type:
application/x-gnutella2
X-Ultrapeer: False
The latter two stages may be replaced with
an error condition if the connection is being rejected. Appropriate error status codes are
returned in this case, for example:
GNUTELLA/0.6 503 Too many
connections
(more headers)
Note that only the HTTP-style error code
should be interpreted by the software: any descriptive text provided is for
display purposes only and is not standardised.
Important headers which are required or
strongly recommended are detailed in the following
sections.
Two important headers to send on all
connections are “Remote-IP” and “Listen-IP”. Both of these headers should be sent on
the first transmission, meaning in the first and second header blocks in the
three block exchange.
The Remote-IP header contains the IP
address from which the remote host is
connecting. This allows a remote
host operating through some kind of network address translation system to learn
its effective external address.
The Local-IP header contains the IP address
and port number that the local host
is listening for inbound TCP connections on. It should be listening for UDP datagrams
on the same port. The format of
this header is “IP:PORT”, eg “1.2.3.4:6346”.
The User-Agent header is used to identify
the client software operating on the sending node. It should be sent on the first
transmission, meaning in the first and second header blocks in the three block
exchange. Note that this is a
descriptive string that often includes a version number, and is not a “vendor
code” as described elsewhere.
The Accept and Content-Type header exchange is used to
negotiate the data protocol that will be used in the connection, in this case
Gnutella2. The Gnutella2 content
type is “application/x-gnutella2”, and this exchange follows standard HTTP rules
for negotiating content type.
The first step is to advertise support for
the content type (Gnutella2) in the first header block with “Accept:
application/x-gnutella2”. The
responding node will then indicate that it will send Gnutella2 content with
“Content-Type: application/x-gnutella2”, and that it also supports Gnutella2
with “Accept: application/x-gnutella2”.
The initiating host then confirms that it will be sending Gnutella2 with
“Content-Type: application/x-gnutella2” in the third header block. For more information on the
Accept/Content-Type exchange, consult a HTTP reference.
Note that the content type negotiation
process is designed to be a “one way” process, i.e. a different content type can
be negotiated for sending and receiving.
However when the gnutella2 protocol is negotiated, both channels must use
the same content type. This means
that a receiving node must not accept Gnutella2 if the initiator did not
advertise support for it, and if at the end of the handshake bidirectional
Gnutella2 was not negotiated, the connection should be
terminated.
There are two node types in a Gnutella2 peer
to peer network, a hub and a leaf as described in the “Node Types and
Responsibilities” topic. During the
initial handshake the two parties must exchange their current node type and
advise of their capabilities, negotiating the node
types they will adopt when the connection completes, and indeed whether it
should complete at all.
As the handshake sequence is compatible with
Gnutella1, the headers involved in negotiating node types are identical to those
used to negotiate Gnutella1 “Ultrapeer” states:
X-Ultrapeer: [True|False]
X-Ultrapeer-Needed:
[True|False]
Both headers contain a Boolean value, “true” or “false”, case insensitive.
The X-Ultrapeer header indicates whether the transmitting node is currently operating as a hub. Hub nodes will send “X-Ultrapeer: True” while leaf nodes will send “X-Ultrapeer: False”.
The X-Ultrapeer-Needed header indicates whether the transmitting node would like (and allow) the receiver to be a hub. A hub which sees no need for additional hubs in its area of the network will send “X-Ultrapeer-Needed: False”, indicating to the connecting node that it must operate in leaf mode if it wishes to connect. A hub which sees a need for additional hubs will send “X-Ultrapeer-Needed: True”, indicating that the receiving node should become a hub if it is capable of doing so. A leaf may send “X-Ultrapeer-Needed: True” to indicate that it is seeking a connection to a hub.
The X-Ultrapeer header should be sent on all three of the header blocks to indicate the current intended state of the node, while the X-Ultrapeer-Needed header should be sent in the first two header blocks only to indicate the desired status of the receiver. If the nodes cannot agree on a satisfactory arrangement, the connection will be terminated at or prior to the third header block with an appropriate message, for example:
GNUTELLA/0.6 503 Too many hub connections
GNUTELLA/0.6 503 Too many leaves
GNUTELLA/0.6 503 I have leaves, can’t downgrade to leaf mode
GNUTELLA/0.6 503 Lead mode disabled
It is desirable for connecting nodes to exchange the node addresses of other hubs on the network to facilitate rapid connection. The Gnutella2 protocol includes highly efficient methods to share hub addresses with peers once connected, but for a node trying to connect and encountering only “full” hubs, learning new hubs to try is helpful.
The X-Try-Ultrapeers header was developed for this purpose, and like the rest of the handshake is also semi-compatible with Gnutella1. It contains a comma separated list of hub node addresses and ports, along with a timestamp recording the time the hub was last seen. For example:
X-Try-Ultrapeers: 1.2.3.4:6346 2003-03-25T23:59Z, [..more..]
Hub addresses should not be sent unless the transmitter has reasonable knowledge of the hub’s existence and the timestamp is not too old. The content type negotiation headers should be used to verify that the listed hub addresses are indeed Gnutella2 hubs.
The Gnutella2 architecture makes widespread use of “deflate” compression, due to its high availability and ease of integration. Support of compressed TCP links is not a requirement in the Gnutella2 standard, however it is strongly recommended.
Deflate compression of a TCP link is negotiated with the standard HTTP headers pair “Accept-Encoding” and “Content-Encoding”. For example:
Header block one (initiator):
Accept-Encoding: deflate
Header block two (receiver):
Accept-Encoding: deflate
Content-Encoding: deflate
Header block three (initiator):
Content-Encoding: deflate
In this example the initiator advertises support for a receiving a deflated connection. The receiver then indicates that it too supports receiving a deflated connection, and that it intends to transmit deflated data. Finally, the initiator upon noting that the receiver supports deflate indicates that it too will transmit deflated data.
Note that unlike the content-type / protocol negotiation, deflated encoding can be applied on either incoming, outgoing or both channels of a connection.
For performance reasons, nodes should consider whether they can afford to supports additional deflated connections before advertising support for them, or agreeing to provide a deflated data stream. In the Gnutella2 network topology, all links benefit from compression except the leaf to hub channel of the leaf/hub link. Exempting this channel from compression saves the leaf and more importantly the hub a considerable CPU and RAM investment.
After the third and final header block has
been received by the initiator, subsequent communication over the TCP stream
occurs in the negotiated protocol, with the negotiated encoding. This means that while the handshake
sequence was backwards compatible with Gnutella1, after Gnutella2 support has
been negotiated all subsequent communication occurs in the Gnutella2 common
protocol – an entirely new system not backwards compatible any other
protocol.
The User Datagram Protocol (UDP) is
invaluable in peer to peer systems because it provides a relatively low-cost
(low-overhead) method of sending short, irregular messages to a very large
number of peers on demand.
Establishing a TCP stream connection to a peer simply to deliver a single
packet of information is wasteful in data volume and time for the peers involved
and state-aware network devices along the route, for example network address
translation facilities. When
dealing with a large number of peers quickly, these costs become
unbearable. UDP provides a solution
and makes this kind of interaction possible.
However the delivery of UDP packets is not
reliable: packets may be lost en-route for a number of reasons. Often this behaviour is desirable, for
example when the destination node’s connection is highly congested, UDP packets
are likely to be discarded. If the
content was not critical, this loss is appropriate as the host’s resources are
effectively unavailable. In other
scenarios involving critical payloads, UDP’s lack of
reliability is a problem: either the sender needs to make sure the receiver gets
the payload, or it needs to know definitively that the receiver was
unavailable.
The Gnutella2 network solves this problem by
implementing a selectively engaged reliability layer on top of the basic UDP
protocol. This reliability layer
shares some common functionality with TCP, but importantly does not provide any
connection state and thus retains the efficiency originally sought in
UDP.
This allows Gnutella2 to select the most
optimal communication medium for each and every transmission it needs to
perform:
Gnutella2 semi-reliable communication is
transported using the UDP protocol.
The port number for receiving UDP is the same as the port number
listening for TCP connections.
Implementing an additional reliable protocol
within UDP requires a small control header before the payload itself. This header is put to good
use:
The header has a fixed size of 8 bytes, and
is represented by the following C structure:
#pragma pack(1)
typedef struct
{
CHAR
szTag[3];
BYTE
nFlags;
WORD
nSequence;
BYTE
nPart;
BYTE
nCount;
} GND_HEADER;
The members of the structure are detailed
below:
Large packets must be fragmented before they
can be sent through most network interfaces. Different network media have different
MTUs, and it is difficult to predict what the lowest
common size will be. Fragmentation
and reassembly is performed by the existing Internet protocols, however there
are two important reasons why the reliability layer performs its own
fragmentation:
Each node determines its own MTU, often
based on a best guess combined with information from the host’s sockets
implementation. Packets exceeding
this size are fragmented into multiple datagrams of the appropriate size. Each datagram has the same sequence
number and the same fragment count (nCount), but a
different fragment number (nPart).
When a packet is to be transmitted the
network layer must:
·
Cache the
payload
·
Allocate a new locally and
temporally unique sequence number
·
Derive the appropriate number of
fragments
·
Queue the fragments for
dispatch
·
If the fragments do not need to
be acknowledged, the packet can be flushed now
The payload will generally be cached for an
appropriate timeout period, or until the data cache becomes full at which time
older payloads can be discarded.
Fragments are dispatched according to the dispatch algorithm of choice,
and the sender listens for acknowledgements. When an acknowledgement is
received:
If a fragment has been transmitted but has
not been acknowledged within the timeout, it should be retransmitted. A finite number of retransmissions are
allowed before the packet as a whole expires, at which time it is assumed that
the packet was not received.
When a new datagram is received, the network
layer must:
Fragment datagrams need to be dispatched
intelligently to spread the load on network resources and maximise the chance
that the receiver will get the message.
To do this, the dispatch algorithm should take into account several
points:
The recommended parameters for the
reliability layer are as follows:
MTU = 500 bytes
Transmit Retransmit Interval = 10
seconds
Transmit Packet Timeout / Expire = 26
seconds
(allows for two
retransmissions before expiry)
Receive Packet Expiry = 30
seconds
(allows 10 seconds
beyond final retransmission)
Relatively low-level network implementations
such as this are reasonably complicated, but must operate fast. It is desirable to avoid runtime memory
allocations in network code as much as possible, and particularly at this
level.
It should be noted that in almost all cases,
transmissions to “untested” nodes are single fragment. Replies on the other hand are often
larger, and may be deflated in many fragments. This is optimal because attempting to
contact a node which may be unavailable involves a retransmission of only a
single fragment.
Flow control is an important topic, however
it is handled at a higher layer.
The UDP reliability layer is only responsible for guaranteeing delivery
of selected datagrams.
Only critical transmissions whose reception
cannot otherwise be inferred should have the acknowledge request bit
set.
All Gnutella2 communications are represented
with Gnutella2 lightweight tree packets.
This applies everywhere from TCP stream communications to reliable UDP
transmissions to HTTP packet exchanges (where protocol data has been
negotiated). Each tree packet may
contain meaningful payload data and/or one or more child packets, allowing
complex document structures to be created and extended in a backward compatible
manner.
The concept can be compared to an XML
document tree. The “packets” are
elements, which can in turn contain zero or more child elements (packets). The payload of a packet is like the
attributes of an XML element.
However serializing XML has a lot of overhead due to all the naming, even
in a compact binary form. The
Gnutella2 packet structure makes a compromise: it names elements (packets),
allowing them to be globally recognized and understood without knowledge of
their format – and stores attributes as binary payloads, requiring knowledge of
their content to parse them.
Thus the element (packet or child packet) is
the finite unit of comprehension.
This system provides an excellent trade-off between format transparency
and compactness.
+ Query Hit Packet
|
|-+ Node ID (standard)
|
|-+ Server Status (standard)
| \-+ Shareaza
Server Status (private extension)
|
|-+ Hit Object
| |-+ URN (standard)
| |-+ Descriptive name
(standard)
| | \-+ Alternate name list
(extension)
| |-+ URL (standard)
| |-+ Priority indicator (private
extension)
| | \-+ Digital signature
(private)
| |-+ Alternate source summary
(standard)
| \-+ Available ranges
(standard)
| . \-+ Estimated completion time (private
extension)
|
|-+ Selective digital signature
(private)
|
\-+ Routing tags
Each Gnutella2 packet
contains:
Each packet contains a relative type name of
up to 8 bytes in length, which are case sensitive. The packet type name is meaningful only
in the namespace of the packet’s parent, or in the absence of a parent, the
context of the packet (e.g. root level TCP stream).
This means that, for example a packet “A”
inside packet “X” is different to a packet “A” inside packet “Y”. Packets are of the same type only if
their fully qualified absolute type names are equal.
As a convention, when discussing packet type
names, they will be noted in their absolute form with a URL style slash (/)
separating each level. In the above
example, the first packet is “/X/A” while the second is “/Y/A”. It is clear now that the packets are of
different types.
Packet type names can contain from 1 to 8
bytes inclusive, and none of these bytes may be a null (0). Community approved packets are by
convention named with uppercase characters and digits, for example “PUSH”. Private packet types are by convention
named with lowercase characters and digits, prefixed with the vendor code of the
owner, for example “RAZAclr2”.
Packets are encoded with a single leading
control byte, followed by one or more bytes of packet length, followed by one or
more bytes of packet name/ID, followed by zero or more child packets (framed the
same way), followed by zero or more bytes of payload:
+---------+----/----+----/----+-------------/-----------+
| Control | Length_ | Name___ | children and/or
payload |
+---------+----/----+----/----+-------------/-----------+
All packets can contain a payload only,
children and a payload, children only, or nothing at all. The total length of the packet header
(control, length and type name) cannot exceed 12 bytes and cannot be less than 2
bytes.
The control byte is always non-zero. A zero control byte identifies the end
of a stream of packets, and thus has special meaning. It should not be used in root packet
streams (which do not end).
Control bytes have the following
format:
Bit 7 < - > Bit 0
+----+----+----+----+----+----+----+----+
| Len_Len | Name_Len – 1 | CF | BE | //
|
+----+----+----+----+----+----+----+----+
The length field immediately follows the
control byte, and can be 0 to 3 bytes long. Length bytes are stored in the byte
order of the packet.
The length value includes the payload of this packet AND any child packets in their entirety. This is obviously needed so that the entire packet can be detected and acquired from a stream. The length does not include the header (control byte, length, and name). The length field precedes the name field to allow it to be read faster from a stream when acquiring packets.
The length field is in the byte order of the
root packet.
The type name field immediately follows the
length bytes, and can be from 1 to 8 bytes long. Its format is detailed in the previous
section entitled “Namespace Considerations”.
Child packets are only present if the
“compound packet bit” is set in the control byte. If set, there is one or more child
packet immediately following the end of the header. These child packets are included in the
total length of their parent (along with the payload, which follows the child
packets after a packet stream terminator).
Child packets are framed exactly the same
way, with a control byte, length, name, children and/or payload. When the compound bit is set and the
packet is not of zero length, the first child packet must exist. Subsequent child packets may also exist,
and are read in sequentially in the same way that they are read from a root
packet stream. The end of the child
packet stream is signalled by the presence of a zero control byte, OR the end of
the parent packet’s length (in which case there is no payload). Including a terminating zero control
byte when there is no payload is still valid, but
unnecessary.
Payload may exist whenever the length field
is non-zero. However, if the
compound bit is set, one or more child packets must be read before the payload
is reached. If there is no packet
left after the end of the last child, there is no payload.
Note that there are a number of “marker
packet types”, which have no children or payload. It is desirable to encode these in as
small a space as possible, which means omitting the length field and setting the
len_len bits to zero in the control byte.
This creates a potential conflict, as the control byte itself may be zero
if the type name is one byte long – and as noted above, a zero control byte has
special meaning (end of packet stream).
This must be avoided; luckily it is perfectly legal to set the compound
packet flag (CF) on zero length packets, thus producing a non-zero control byte
and the most compact packet possible.
The compound packet bit MUST be checked when
decoding every packet. It should be
done in low-level decoding code to avoid accidental omission. Do not assume that a packet will not
have children – it might not now, but no packets are sterile. Anything could be augmented or extended
in some unknown way in the future.
If you are not interested in children, skip them (which is easy, you
don’t even need to recurs through their children).
BYTE nInput
= ReadNextByte();
if ( nInput == 0 ) return
S_NO_MORE_CHILDREN;
BYTE nLenLen = ( nInput & 0xC0 )
>> 6;
BYTE nTypeLen = ( nInput & 0x38 ) >>
3;
BYTE nFlags
= ( nInput & 0x07 );
BOOL bBigEndian = ( nFlags
& 0x02 ) ? TRUE : FALSE;
BOOL bIsCompound = ( nFlags &
0x04 ) ? TRUE : FALSE;
ASSERT( ! bBigEndian );
DWORD nPacketLength = 0;
ReadBytes( (BYTE*)&nPacketLength, nLenLen
);
CHAR szType[9];
ReadBytes( (BYTE*)szType, nTypeLen + 1
);
szType[ nTypeLen + 1 ] = 0;
The format of a packet payload is defined by the packet type and can consist of any binary data; however there are a number of conventions in place for serializing common datatypes.
Multi-byte integers are serialized in the byte-order of the topmost packet. Little-endian is the default byte-order; however big-endian byte order can be selected for those who want it.
A network or node address consists of a physical address and a port number, and are of variable length depending on the address family.
In IPv4, a network/node address is six bytes long: 4 bytes for an IP address and 2 bytes for a port number as follows:
typedef struct
{
BYTE ip[4];
SHORT port;
} IPV4_ENDPOINT;
Note that this is considered an array of 4 8-bit integers (bytes), followed by a 16-bit integer (short). Byte order does not affect bytes, but it will affect the 16-bit port number.
IPv6 addresses are longer and are not yet defined within the scope of Gnutella2, however applications should be aware that if the node address is not 6 bytes it is of a different address family.
Globally unique identifiers (GUIDs) are used to identify nodes on the network. GUIDs are serialized as an array of 16 bytes.
Strings are serialized as a zero-terminated array of 8 or 16 bit integers, with byte order considerations applying if 16 bit integers are used. 16 bit storage is selected by storing a single 0xFF byte before the string – a traditional byte-order mark is not required because the byte order has already defined for the packet as a whole.
A zero character (0x00 or 0x0000) marks the end of the string, however if the string data meets the end of the packet (or child packet) payload, the terminator is not required. This means that packets whose payload consists of a string do not need to include a zero string terminator and their payload length will be the byte length of the encoded string exactly.
When 16-bit mode is selected, the string is serialized in 16-bit Unicode. Most runtime environments provide tools for working with 16-bit Unicode text.
When 8-bit mode is selected, the string is serialized in UTF-8. 7-bit characters are passed as-is, while extended characters are encoded with multi-byte sequences.
All applications are required to support
decoding both 8 and 16 bit strings, however it is up to each application which
string format to encode with. For
example it is acceptable to send all strings as 16-bit Unicode if
desired.
Once TCP stream connections have been established forming the overall network topology, a set of basic packet types and behaviours are used to keep the network together and perform maintenance functions.
These maintenance packets are documented in this section, each with name, payload, children and behaviours.
The ping packet is used to verify the presence of the addressed node, testing that:
Aside from performing a “keep-alive” function on TCP links, pings can be used to solicit inbound UDP datagrams to test the local node’s ability to receive UDP. Many NAT systems will route UDP traffic from nodes with whom a persistent TCP stream has been established, so it is necessary to test inbound traffic from nodes with which there is no TCP connection. This is achieved by sending a ping with a /PI/UDP packet to a connected hub via TCP, which will forward the request to its neighbours who then reply to the originator via UDP.
On a TCP stream connection, if one end has not received a valid packet from the other after an internally determined period of time, it has the option of sending a keep-alive ping to verify the other end’s presence. If a pong is not received in a timeout period it may close the connection.
If a node needs to verify that it is able to
receive UDP datagrams from nodes with which it does not have a TCP stream
connection, it may send a ping packet with a /PI/UDP child packet. If it then receives a
/
Upon receiving a keep-alive ping, a node should respond with a keep-alive pong immediately.
If the ping was received from a TCP neighbour and contains a /PI/UDP child packet, but no /PI/RELAY tag, a /PI/RELAY tag should be added and the packet forwarded to all neighbouring hubs. If a /PI/RELAY tag was present, the node should send a /PO packet with a /PO/RELAY tag to the UDP address identified in the /PI/UDP child packet.
This packet has no payload at the current time.
The ping packet has two child packets defined at the current time:
If a /PI packet contains a /PI/RELAY child marker, the ping has been relayed for a third party. The receiving node should reply to the original party rather than the sender. The original party should be contacted via UDP with the address contained in the /PI/UDP child. If the /PI/UDP child is absent, the packet is invalid.
The /PI/RELAY child should be added to a /PI packet when relaying a ping to hub neighbours at the request of a leaf.
Upon receiving a /PI packet with a /PI/RELAY
tag, a /
This packet has no payload at the current time.
This packet has no known children at the
current time.
The /PI/UDP child packet specifies the return address for a relay able or relayed ping.
Add a /PI/UDP child packet to a /PI packet to request a relayed “two hop” ping.
The /PI/UDP child packet indicates that a
/PI packet needs to be relayed.
The /PI/UDP packet contains an endpoint address.
This packet has no known children at the
current time.
The pong packet serves as a reply to a one or two hop ping (/PI) packet. It indicates that the pinged node received and processed the ping, and in the case of UDP, that the local node was able to received UDP from the sender.
Send a pong packet in response to a direct
or relayed ping request.
No action is required in response to a pong
packet. If the packet contains a
/PO/RELAY marker child, test the sender’s network address against the list of
active TCP connections and if there is no match, consider the local node able to
receive UDP.
This packet has no payload at the current time.
The ping packet has one child packet type defined at the current time:
If a /
The /PO/RELAY child marker should be added
to a /
No specific action is required for a relayed pong.
This packet has no payload at the current time.
This packet has no known children at the
current time.
The local node information packet is used to convey essential information about the node on either end of a TCP stream connection.
The /LNI packet should be sent to a TCP neighbour upon connection, and at regular intervals following that if the information within it has changed. A minimum update time of one minute is recommended.
Upon receiving a /LNI packet, the information within should be stored for subsequent use.
This packet has no payload at the current time.
The /LNI packet has many child packet types defined at the current time:
The /LNI/NA child packet specifies the node or network address of the sending node.
This child is required.
The physical network address of the sending node. See datatypes for more information.
This packet has no known children at the current time.
The /LNI/GU child packet specifies the globally unique identifier of the sending node.
This child is required.
The 16 byte globally unique node identifier of the sending node.
This packet has no known children at the
current time.
The /LNI/V child packet specifies the vendor code of the software operating the sending node.
This child is optional.
A four byte vendor code.
This packet has no known children at the
current time.
The /LNI/LS packet provides information about the content library of the sending node.
This child is optional.
Two 32-bit integers representing the number of files in the local library, and the total KB in the local library respectively. If the sending node is a hub these statistics are combined with the known statistics from all connected leaves. The payload may grow beyond 8 bytes in the future.
This packet has no known children at the
current time.
The /LNI/HS packet is included only if the sending node is a hub, and contains the status of the hub.
This child packet should only be sent by hub nodes. Leaf nodes should not transmit it, and if it is received from a leaf node, it should be ignored.
Two 16-bit integers representing the current leaf count and the maximum leaf count respectively. This packet may grow beyond 4 bytes in the future.
This packet has no known children at the
current time.
The known hub list packet is used to exchange a list of known Gnutella2 hubs with neighbouring nodes. It is exchanged regularly over all TCP connections, between hubs and between hubs and leaves (and vice versa). It enables a completely up to date picture of the local hub cluster, and the gradual propagation of cached hub records.
This packet should be sent through TCP links at regular intervals. One minute is suggested for busy links, leaves may send to hubs less often. Note that unlike /LNI, /KHL must be sent regularly even if the information does not change in order to refresh timestamps and keep time-sorted caches in the correct order.
Upon receiving a known hub list, the detailed hubs should be stored in appropriate data structures. Neighbouring hubs must be retained in the hub cluster records, while cached hubs should be merged with the local accessible hub cache.
This packet has no payload at the current time.
This packet has several child packet types defined at the current time:
The /KHL/TS child provides a timestamp representing the current universal time at the sending node. This can be used as a reference when considering other timestamps in the packet, allowing them to be adjusted to eliminate differences between the time setting on the local and remote node.
This child is optional but recommended.
A 32-bit integer representing the current UNIX time, or time (NULL).
This packet has no known children at the current time.
This child packet represents a neighbouring hub.
One should be included for each Gnutella2 neighbour hub, with the possible exception of the one to which this packet is being sent.
The node address of the hub. See datatypes for more information.
This packet has several child packet types defined at the current time:
The /KHL/NH/GU child packet specifies the globally unique identifier of the hub.
This child is optional.
The 16 byte globally unique node identifier of the hub.
This packet has no known children at the
current time.
The /KHL/NH/V child packet specifies the vendor code of the software operating the hub.
This child is optional.
A four byte vendor code.
This packet has no known children at the
current time.
The /KHL/NH/LS packet provides information about the content library of the hub and its connected leaves.
This child is optional.
Two 32-bit integers representing the number of files, and the total KB of available content in the hub’s own library and the libraries of its connected leaves. The payload may grow beyond 8 bytes in the future.
This packet has no known children at the
current time.
The /KHL/NH/HS packet contains the status of the hub.
This child is optional.
Two 16-bit integers representing the current leaf count and the maximum leaf count respectively. This packet may grow beyond 4 bytes in the future.
This packet has no known children at the
current time.
This child packet represents a hub from the hub cache.
An arbitrary number of these children should be included, constrained by space and the number of available cached hubs. A selection of freshly time stamped but previously unknown hubs should be included, and the list should not include hubs in the local cluster.
The node address of the hub, followed by the time this hub was last seen as a 32-bit timestamp. The last seen time can be adjusted based on the difference between the /KHL/TS timestamp and the local time. Note that the node address is of variable length, so it is important to consider the last 4 bytes as the timestamp rather than assuming a fixed length node address.
This packet has several child packet types defined at the current time:
This extra information is considered less relevant for a cached hub because it is unlikely to be up to date enough to be useful.
The /KHL/CH/GU child packet specifies the globally unique identifier of the hub.
This child is optional.
The 16 byte globally unique node identifier of the hub.
This packet has no known children at the
current time.
The /KHL/CH/V child packet specifies the vendor code of the software operating the hub.
This child is optional.
A four byte vendor code.
This packet has no known children at the
current time.
The /KHL/NH/LS packet provides information about the content library of the hub and its connected leaves.
This child is optional.
Two 32-bit integers representing the number of files, and the total KB of available content in the hub’s own library and the libraries of its connected leaves. The payload may grow beyond 8 bytes in the future.
This packet has no known children at the
current time.
The /KHL/CH/HS packet contains the status of the hub.
This child is optional.
Two 16-bit integers representing the current leaf count and the maximum leaf count respectively. This packet may grow beyond 4 bytes in the future.
This packet has no known children at the
current time.
Each Gnutella2 node must maintain a non-exhaustive cache of known hubs at the global level, and an exhaustive cache of hubs within the neighbouring hub cluster(s).
The known hub list is used to provide connection hints to the local connection manager, and other nodes which connect permanently or transiently. The most recent portion of it is exchanged with neighbours regularly.
It is also used when executing a query on
the network, which involves iteratively contacting hubs representing each hub
cluster and simultaneously recording the hubs which have been accounted for and
adding those newly discovered.
The known hub cache should be highly efficient, addressable by node address and timestamp, and sorted by the last seen timestamp of each hub record. Adding fresh hubs should push the oldest hubs from the bottom of the cache.
It is suggested that each cached hub entry store:
Hubs whose last seen timestamp is too old should be removed from the cache, and should certainly never be sent to other nodes.
The hub cluster cache is used to maintain an up to date list of neighbouring hubs, and the neighbouring hubs of neighbouring hubs. Thinking in terms of hops, it is an exhaustive list of every hub which is 1 or 2 hops away from the local node (be the local node a hub or a leaf). The cluster cache is really only important when operating in hub mode, however it can be maintained in leaf mode for informational purposes.
The hub cluster cache is used by a hub when responding to a keyed remote query request, in the generation of a query acknowledgement packet (/QA). The /QA packet contains a list of neighbouring hubs and a selection of second degree neighbours.
The cluster cache is updated using information from /LNI and /KHL packets.
Each Gnutella2 node must maintain a dynamic node route cache to map node GUIDs to appropriate destinations. The route cache is consulted when a packet needs to be dispatched to or toward a GUID-addressed node. GUID-addressing is used over network-addressing in a number of situations.
Each entry in a node route cache will have:
A route cache needs to be addressable by GUID, and must implement a refresh mechanism to store routes for an appropriate amount of time based upon route hits. Many schemes exist to engineer efficient lookup systems, such as hash tables, two-table exchanges and balanced trees.
The route cache is similar to the several GUID mapping caches involved in old Gnutella applications, with two key differences:
GUIDs are used to identify virtual entities within the network, such as nodes (node GUIDs), searches (search GUIDs) and other transactional requests. The GUIDs associated with these entities can then be committed to the route cache and mapped to reflect the easiest route back to the owner of the object.
All nodes should be aware of their directly connected neighbours, and treat these node GUIDs as a special case that need never expire.
Any Gnutella2 packet may be addressed to a particular destination node by GUID. Upon receiving an addressed packet, it should be immediately forwarded either to the destination, or toward it. Addressed packets should not be interpreted locally unless the destination address matches the local GUID.
Loops are avoided by placing restrictions upon forwarding:
Note that these restrictions apply only to generically addressed packets. Some packet types have specific forwarding rules which override these. These rules allow any node to be reached in two hops.
Packets are addressed by including a special child packet as the first child of the root packet. The child packet is named “TO”, so its full name would be “/?/TO” where ? is the root packet name. The address packet’s payload consists of a 16 byte GUID, and it has no children defined at the current time.
Addressed packet forwarding can be used to deliver any valid Gnutella2 communications to leaf nodes that are unable to accept TCP or UDP traffic directly. The packet is simply sent to one of the hubs to which the target leaf is attached, or a hub in the same hub cluster, and the forwarding rules will allow the packet to reach its destination.
This mechanism is used to request that a “firewalled” leaf initiate a “reverse connection” or “call-back” to an elected address. The root packet type “/PUSH” is used, and is documented in the following sections.
The push packet is used to request that a node initiate a TCP connection to a specified node address. It is useful where the receiving node is unable to accept TCP connections naturally.
An addressed /PUSH packet should be sent to a hub with direct or proxy access to the destination leaf node. It can be sent directly by a hub to one of its leaves without the need for addressing.
Upon receiving a /PUSH packet, the local node should initiate a TCP connection to the elected node address and issue a “push handshake”, documented below.
The network address of the node to contact, including of course the port number.
This packet has no known children at the
current time, however it is often used with the generic addressing child packet
“/?/TO” to indicate that it should be forwarded to a
destination.
The Gnutella2 push handshake is very simple – it simply identifies the connection as originating from a push, and provides the GUID of the initiating node:
PUSH guid:GUIDinHEXguidINhexGUIDinHEXguidI (\r\n)
(\r\n)
Note that unlike the Gnutella1 case, no purpose-specific information is provided. The pushed connection can now be used for any purpose a normally established TCP link could adopt, including but not limited to:
Gnutella2 implementations are however strongly advised to provide backward support for the Gnutella1 push handshake, even if not supporting Gnutella1 directly. This is because applications supporting both protocols may be unable to determine in advance whether the host they are pushing to is Gnutella1 or Gnutella2.
The legacy-style push handshake looks like:
GIV 0:GUIDinHEXguidINhexGUIDinHEXguidI/ (\n\n)
The leading zero and trailing slash have task-specific meanings in Gnutella1; however Gnutella2 applications can safely ignore them and consider only the GUID. Be sure to allow for variable length handshakes, however.
Building an efficient network topology is not viable without means to restrict the flow of information appropriately. In the Gnutella2 network architecture, neighbouring nodes exchange compressed hash filter tables describing the content which can be reached by communicating with them. These tables are updated dynamically as necessary.
The concept of filtering hash tables in file sharing was pioneered by Limegroup for the LimeWire Gnutella1 application.
Query hash tables or QHTs provide enough information to know with certainty that a particular node (and possibly its descendants) will not be able to provide any matching objects for a given query. Conversely the QHT may reveal that a node or its descendants may be able to provide matching objects.
This property means that queries can be discarded confidently when a transmission is known to be unnecessary, while not providing the filtering or forwarding node any actual information about the searchable content. Neighbours know what their neighbours do not have, but cannot say for sure what they do have. QHTs are also very efficient both in terms of exchange and maintenance and lookup cost.
A QHT is a table of 2^N bits, where each bit represents a unique word-hash value. For example a table of 20 bits has 2^20 = 1048576 possible word-hash values. If stored uncompressed this table would be 128 KB in size.
In an empty table, every word-hash value bit will be “1”, which represents “empty”. To populate the table with searchable content, an application must:
· Locate every plain-text word and URN which could be searched for and produce a match/hit
· Hash the word with a simple hash function to produce a word-hash value which is 0 <= value < (2^N).
· Set the appropriate bit in the table to zero, representing “full”.
Words are strings of one or more alphanumeric characters which are not all numeric.
To convert a word into a hash value, the following case-insensitive algorithm is used:
// HashWord( string_ptr, char_count, table_bit_count );
DWORD CQueryHashTable::HashWord(LPCTSTR psz, int nLength, int nBits)
{
DWORD nNumber = 0;
int nByte = 0;
for ( ; nLength > 0 ; nLength--, psz++ )
{
int nValue = tolower( *psz ) & 0xFF;
nValue = nValue << ( nByte * 8 );
nByte = ( nByte + 1 ) & 3;
nNumber = nNumber ^ nValue;
}
return HashNumber( nNumber, nBits );
}
DWORD CQueryHashTable::HashNumber(DWORD nNumber, int nBits)
{
WORD64 nProduct = (WORD64)nNumber * (WORD64)0x4F1BBCDC;
WORD64 nHash = ( nProduct << 32 ) >> ( 32 + ( 32 - nBits ) );
return (DWORD)nHash;
}
URNs are treated as a special case: rather than dividing them up into word tokens, they are hashed as a complete fixed length string. For example:
urn:sha1:WIXYJFVJMIWNMUWPRPBGUTODIV52RMJA
Bitprint URNs are
actually composite values which include both a SHA1 and TigerTree root
value. Rather than adding the whole
bitprint to the table, each of the constituent URNs are added separately. This allows SHA1-only querying and
tiger-only querying. A root
TigerTree URN takes the form:
urn:tree:tiger/:CN25MLNU3XNN7IHKZMNOA63XG6SKDJ2W7Z3HONA
Other URNs should
be expressed in their most natural form before being fed to the word hash
function.
For words consisting of at least five characters, it is often useful to be able to match substrings within the word. Unfortunately adding every possible substring of each word would increase the density of the QHT, however a simple and effective compromise is available:
For words with 5 or more characters:
This allows searching on prefixes of the
word, for example “match” will now match “matches”.
Nodes must keep their neighbouring hubs up to date with their latest local query hash table at all times. Rather than sending the whole table whenever it changes, nodes may opt to send a “table patch”, which includes only the difference between the old and new table.
The /QHT packet is used to supply a query hash table update to a neighbouring hub. Its format is compatible with the Gnutella1 “query routing protocol”, except that Gnutella2 requires a 1-bit per value table while Gnutella1 requires a 4 or 8 bit per value table. Gnutella2 supports patch compression using the deflate algorithm, however this should not be applied if the TCP link itself is compressed.
The query hash table update packet supplies an update to the query hash table for the sending node. It supports two operations: clear existing table, and patch existing table with new data.
The /QHT packet should be sent when the local or aggregate query hash table has changed.
Upon receiving a /QHT packet, the local hub should patch the query hash table for the connection on which the packet was received. If the connection is to a leaf node, the hub should update its aggregate table and queue a propagation of that table to neighbouring hubs in the cluster.
The payload is of the same format as Gnutella1’s “query routing protocol” message:
The first byte is a command byte, 0x00 indicates a reset command, while 0x01 indicates a patch command.
typedef struct
{
BYTE command;
// = 0 (reset)
LONG
table_size_in_entries; // =
2^(size_in_bits)
BYTE infinity; // = 1
} QHT_RESET;
A reset packet establishes a new table and clears it (replacing any existing table). In Gnutella2 the value of infinity must always be 1. Note that the table size is measured in entries rather than bits, so it is equal to 2^N where N is the number of bits.
typedef struct
{
BYTE command; // = 1 (patch)
BYTE fragment_no; // = number of this patch fragment
BYTE fragment_count; // = number of patch fragments
BYTE compression; // = 0 for none, = 1 for deflate
BYTE bits; // = 1 for Gnutella2
(data)
}
QHT_PATCH;
Patches may be fragmented into more than one packet if necessary, in which case the fragment number and count fields are used to reassemble the patch. Note that fragments must be sent in correct order.
Patches may be compressed – in this case the whole patch is compressed, rather than compressing fragments individually. One compression type is defined, deflate compression.
Gnutella2 always uses tables with 1 bit per entry, so the bit count must be 1.
The patch data in uncompressed form is of (2^N)/8 bytes length, where N is the table bit size. This is exactly the same size as the table which is being patched. To apply the patch, simply XOR the existing table with the patch data. In other words, whenever there is a 1 bit, the bit in the existing table is toggled.
A table of 2^N bits can be stored in an array of bytes (2^N)/8 long. To resolve a hash-value into a byte and bit number, use the following equations:
int nByte = ( nHashValue >> 3 );
int nBit = ( nHashValue & 7 );
The least significant bit is numbered 0; the most significant bit is numbered 7. To set a bit as empty (setting it to 1):
table_ptr[ nByte ] |= ( 1 << nBit );
To set a bit as full (setting it to zero):
table_ptr[ nByte ] &= ~( 1 << nBit );
Nodes operating in hub mode must maintain an aggregate or superset query hash table, consisting of their own searchable content combined with the QHTs supplied by all connected leaves. This aggregate table is supplied to neighbouring hubs, allowing them to completely filter traffic for the local hub and its leaves.
This has two important implications:
To create an aggregate table, start with an empty table of fixed size containing all 1’s. For each contributing table, copy any zero (full) bits to the aggregate table. This is effectively an AND operation. If a source table is smaller than the aggregate table, a single 0 bit in the source will equate to several zero bits in the aggregate. If the source table is larger than the aggregate table, a single zero bit will map to a single bit with some loss of accuracy.
It is of great importance that all QHTs in the system be sufficiently large to allow an aggregate table to remain suitable sparse. Ideally each leaf node should provide a table less than 1% dense.
Before transmitting a query packet to a connection that has provided a query hash table, match the words and URNs in the query against the QHT.
· If any of the lookups based on URNs found a hit, send the query packet
· If at least two thirds of lookups based on words found a hit, send
· Otherwise, drop the packet
It is important to apply the “two thirds” rule only for words. URNs must provide an automatic match.
Consider all text content in the query, including generic search text and metadata search text if it is present. When dealing with simple query language that involves quoted phrases and exclusions, apply the following rules:
See the section on the simple query language
for more information.
The distributed object search mechanism is a very important component of the Gnutella2 architecture. It allows objects distributed throughout the network to be located by a search client in an optimal fashion, requesting and receiving a subset of the total information known about that object.
The Gnutella2 object search mechanism is best described as an “iterative crawl” of the network, with a series of important optimisations derived from the network topology and components:
This model has a number of important advantages:
When a search client wishes to execute a new search, the following events happen:
In an environment where a single query injection may result in a number of responses, acting as a virtual “traffic amplifier”, it is desirable to verify that the return address is indeed that of the original sender. This prevents the network from being used as a method of generating a large-scale traffic generator.
The Gnutella2 network uses a system of “query keys” to solve this problem.
Before a search client can transmit a query to a particular hub, it must obtain a “query key” for that hub and include it in the transmission. Query keys are unique to the hub generating them and the return address with which they are associated and generally expire after a relatively long period of time.
Search clients request a query key by sending a query key request to a hub, which includes the intended return address for the key. The hub generates a key unique to that return address and dispatches it there. This makes it impossible to get a query key for an IP which is not controlled, and prevents keys from being shared between two nodes (key stealing).
When receiving a query from a foreign node, Gntuella2 hubs check the query key against the one they have issued for the query’s requested return address and proceed only if there is a match. If and only if the query key matches will the query acknowledgement be sent, or the search processed locally and forwarded to local leaves and neighbouring hubs.
This has several positive effects. If a query does not have a valid key for
the receiving hub, it will not be processed and will thus not generate a traffic
amplification effect which may be used in a denial of service attack on a third
party host. Secondly, the query key
mechanism ensures that queries are only processed if they were transmitted from
a host which has control over the host in the return address (in the normal
case, this is the same host). This
means that flood control mechanisms can remain just as effective as in the TCP
case. Similarly, host blocking is
possible as the original query source can be verified.
The query key request asks a destination hub
to provide a query key that the requesting node can use to securely perform
queries. This mechanism ensures
that the network cannot be used to direct unsolicited packets to external
hosts.
A query key request should be sent to a hub via the connectionless transport if a key is not already available for that hub for the local node address, or if a previously acquired key no longer works or has expired. There is no reason to request a query key via a TCP link.
Upon receiving a query key request, a hub should respond with a query key answer (/QKA) packet if it is willing to supply a query key. The answer packet should be dispatched to the return address indicated in the request packet only, and the allocated key should be locked to that address.
A query key may be requested for a node address other than that of the sender, for example when a UDP-firewalled leaf node wishes to acquire a query key for a remote hub that it will route through a locally connected hub. In this case it must request a query key for the node address of a connected hub.
This child is required to ensure that the sender knows its address.
The network address of the requesting node. See datatypes for more information.
This packet has no known children at the
current time.
The query key answer packet is sent in
response to a query key request or in the event that a query with an incorrect
or omitted query key is received.
It advises the receiver of the correct query key to be used when
contacting the sending hub to execute a remote cluster
query.
The query key answer packet should be sent by a hub when a query key has been requested. It should contain a key unique to the local hub and the requesting node address supplied in the request, and should be sent to that address only.
The query key answer packet may also be sent upon receiving a query with a missing or invalid query key. In this case no query should be performed.
Upon receiving a query key answer packet, the supplied query key should be stored for later use. If the answer packet contains a sending node address (/QKA/SNA) child which does not match the local node address, it may be forwarded to a connected leaf node via TCP with a matching address. In the forwarding case, a query node address (/QKA/QNA) child is added prior to forwarding to retain the remote (queried) hub’s address.
The /QKA/QK child contains the query key that was requested.
This child is required if a key is being issued. Sending a /QKA without a /QKA/QK indicates that the hub does not wish to authorise the requesting node to query it or its cluster.
A 32-bit query key which is unique to the hub and the sending node address.
This packet has no known children at the
current time.
This child specifies the node address for which the key has been issued. This is not necessarily the node that physically transmitted the key request or the query.
This child is required.
The node address (with port number optional) of the node which requested the query key.
This packet has no known children at the
current time.
If this query key answer packet has been forwarded to a firewalled leaf from a connected hub, a /QKA/QNA child will have been added to indicate the address of the remote hub.
This child should be added by a hub when relaying a received query key answer on to a leaf via TCP.
If this child is present, the query key answer should be treated as if it originated from the stored address rather than the TCP neighbour.
The node address (with port number optional) of the remote hub.
This packet has no known children at the
current time.
The Gnutella2 object search mechanism defines a powerful query descriptor capable of specifying a number of different extensible search criteria, and requesting a subset of information about matching objects.
The /Q2 packet is the generic Gnutella2 object query descriptor. It contains search criteria, identification, authentication, return addressing and the type of information the search client is interested in.
A node should send a /Q2 packet to another node when it wishes to execute a query on that node (and its searchable relations).
/Q2 packets should always be filtered against a query hash table if one is available for the outbound link.
Upon receiving a query, a node should:
· Verify its authentication
· Send an acknowledgment if it was not received from a hub
· Forward it to connected nodes if necessary (detailed below)
· Process it locally and dispatch results
Forwarding of /Q2 packets is achieved based on fixed rules:
The payload of a /Q2 packet consists of a GUID which uniquely identifies the search operation.
This packet has many defined child packet types at the current time:
The /Q2/UDP child packet specifies the return address to be used for all direct replies, and the query key authentication which authorises transmissions to this address. If the query key is invalid, the query should be discarded and a valid query key issued to the nominated address (if desired).
This child is required if the query is to be delivered via UDP. It is optional if delivered via TCP: if not included, replies will be reverse forwarded via TCP.
If this child is present, all replies should be directed to the nominated address. The query key should be verified before replying.
A network address (variable length) followed by a query key (fixed length of 32 bits).
This packet has no known children at the
current time.
The /Q2/URN child specifies an exact universal resource name which will match the query. Rules:
· If one or more URNs are present in a query, whether or not they are understood, matching must occur on URN only – other criteria should be ignored.
· If multiple URNs are present, only one need match an object for it to be considered a match for the query.
· Following from the above, multiple URNs of the same family may be specified and match multiple objects (this allows a compound query for several known objects to be executed in one packet)
This child should be added if a specific URN is being sought.
A string identifying the URN families followed by the URN in either text or binary representation. Text representations should always be 8-bit so that they can be considered binary.
The following compact URN families are recognised:
This packet has no known children at the
current time.
The descriptive name child provides generic query text to be used in matching objects.
A string in the simple query language.
This packet has no known children at the
current time.
The metadata child provides an XML document containing rich query information. The schema identified in the XML must match a metadata schema of prospective matching objects.
A string containing XML. The <?xml?> header is optional. Standard plural/singular Gnutella metadata schemas are used, for example:
<?xml version=”1.0”?>
<applications xsi:noNamespaceSchemaLocation=”http://www.shareaza.com/schemas/application.xsd”>
<application title=”Shareaza”/>
</applications>
All string values are in simple query language, including XML attribute and element values.
This packet has no known children at the
current time.
The size restriction criteria specify a minimum and maximum size for matching objects.
Two 32 bit integers, a minimum size followed by a maximum size in bytes.
This packet has no known children at the
current time.
The Interest packet advertises an interest in certain information classes, effectively filtering the information that will be returned about matching objects.
Send an interest packet to restrict the type of information that will be returned about matching objects. For example if querying only to find instances of a known object, advertise interest for “URL” only. Omitting the interest packet produces a default set of response information.
If an interest packet is received, only return the requested information types. If no interest packet is present, return all available information that is feasible. Extensions may be withheld.
An array of strings identifying information classes. The following are defined:
Note that default URNs are always supplied and need not be requested.
This packet has no known children at the
current time.
When a Gnutella2 hub receives a /Q2 query packet from a search client, it needs to provide the client with acknowledgement for several reasons:
The compact query acknowledgement lists the hubs which have been queried, and provides some hubs which could be queried later.
The query acknowledgement packet is used to inform a search client that a target hub has received its query and is processing it. It also provides information for the search client’s hub cache, expanding knowledge of the network and ensuring hubs are not searched more than once in a given query.
Send a query acknowledgement packet only when operating in hub mode, and only when the query was not received from a hub via a TCP link. Acknowledgements should only be sent after verifying the query key authentication.
When a query acknowledgement is received, update the local hub cache as necessary:
A GUID identifying the query being acknowledged.
This packet has several child packet types defined at the current time:
The /QA/TS child provides a timestamp representing the current universal time at the sending node. This can be used as a reference when considering other timestamps in the packet, allowing them to be adjusted to eliminate differences between the time setting on the local and remote node.
This child is optional but recommended.
A 32-bit integer representing the current UNIX time, or time(NULL).
This packet has no known children at the current time.
References a hub which has now been queried, and should not be queried again.
Send a /QA/D child for the local hub and each neighbouring hub, whether or not the hub was actually forwarded the query. If a hub was not forwarded the query, it was still effectively searched as the QHT indicated it could not field a matching object.
Upon receiving a /QA/D child, add the hub to the cache, freshen its timestamp and mark it as queried in this search. Do not query this node again for several minutes.
A network/node address followed by a 16-bit leaf count.
This packet has no known children at the current time.
References a hub which was not searched in this operation, but could be searched later if it has not already been touched.
Send a /QA/S child for every unique hub in the local hub cluster which is not a direct neighbour. Optionally select some recent cached hubs and send /QA/S children for them also, to make up at least 10 in total.
Upon receiving a /QA/S child, add the hub to the cache and/or freshen its timestamp.
A network/node address, optionally followed by a 32-bit last-seen timestamp. This makes detecting the address family slightly harder, but the compactness is worthwhile. Timestamps are omitted for hubs in the local hub cluster, which should make up the majority.
This packet has no known children at the current time.
Advises the search client that the hub does not wish to be contacted again for a specified period of time. Queries earlier than this time may result in a longer-term ban.
Send this child, with or without executing the search if the hub’s resources are becoming saturated. It provides an indirect means of flow control.
Upon receiving a /QA/RA child, record the retry after interval and do not send any requests to the hub until the period has expired.
A 16 bit or 32 bit integer specifying the delay in seconds until the hub may be contacted again.
This packet has no known children at the current time.
This child is used when a query acknowledgement is forwarded from a proxy hub to a firewalled leaf node. It advises the leaf of the original sender.
When a hub receives a /QA packet for a search originated by one of its leaves, it should forward the /QA to the appropriate leaf after adding a /QA/FR child with the remote hub’s network address.
If this child is present and the /QA packet was received via TCP from a local hub, the node address within should be used as the remote hub’s node address. This information is often reflected in the first /QA/D child, however this is not always present.
A network address which may omit the port number.
This packet has no known children at the current time.
Search results from Gnutella2 nodes are delivered in /QH2 query hit packets. Each query hit packet may contain a number of matching object descriptors, and a single query may result in many /QH2 packets from many nodes. Search results may contain varying amounts of information depending on the capabilities of different nodes, and the interest advertised in the original query.
The /QH2 Gnutella2 query hit packet contains one or more search result descriptors, along with information about the node generating the result set. A single node may generate several /QH2 packets in response to a /Q2 query. Like all Gnutella2 packets, the /QH2 packet is highly extensible.
Send a /QH2 packet when processing a /Q2 query packet reveals one or more matching objects. Query hits can be returned directly to the /Q2/UDP return address in the query, or routed back along the delivery chain. Firewalled leaf nodes are advised to return /QH2 packets to their connected hub for dispatch, as the hub will be able to make a reliable delivery.
Upon receiving a /QH2 query hit packet, check the query GUID against locally initiated queries. If it originated elsewhere, increment the route-back hop counter and forward it to the most appropriate link or UDP endpoint. If it matches a locally generated query, process the search results.
A single byte representing the hop count, followed by a 16 byte GUID identifying the search that produced these results.
This packet has many child packet types defined at the current time:
Specifies the GUID of the sending node.
This child is required.
The GUID of the sending now.
This packet has no known children at the current time.
Specifies the node / network address of the sending node.
This child is optional but recommended.
The node / network address of the sending
node. See datatypes for more
information.
This packet has no known children at the current time.
Lists a hub to which the sending node (a leaf) is connected. This provides a contact point for routing communications to a possibly firewalled leaf node.
This child is optional, and may appear more than once.
The node / network address of a hub to which the sending node is connected.
This packet has no known children at the current time.
Identifies the software operating the sending node.
This child is optional.
A four character vendor code.
This packet has no known children at the current time.
Indicates that the sending node supports browsing the local user profile.
This child is optional.
No payload is defined at the current time.
This packet has no known children at the current time.
Indicates that the sending node supports person to person private message chat.
This child is optional.
No payload is defined at the current time.
This packet has no known children at the current time.
Describes a group of search results.
This child is optional and may appear more than once.
A single byte, the group ID number.
This packet has one child packet type currently defined:
Provides a status snapshot of the upload server that applies to search results in the current hit group.
Add this child to the hit group descriptor to provide server state information.
Use this information when interpreting hits linked with this hit group.
This packet has no known children at the current time.
Describes a single object which matched the original query.
Send a /QH2/H child for each matching object to be included in this /QH2 packet.
No payload is defined at the current time.
This packet has many child packet types defined at the current time, and is often extended:
Specifies a universal resource name which can be used to identify the object.
At least one instance of this child is required. Multiple instances are optional.
A string identifying the URN families followed by the URN in either text or binary representation. Text representations should always be 8-bit so that they can be considered binary.
The following compact URN families are recognised:
This packet has no known children at the current time.
This child specifies a location where the object being described can be acquired.
This child should be sent if it was requested and a URL is available.
If a /QH2/H/URL child was requested but is
not present, the sending node may not have the object. It is legal to advertise information
about objects that are not available.
Empty, or a string which represents an absolute URL.
If a URL is provided, that URL should be considered to point to an instance of the object being described.
If the payload is empty, the sender is indicating that it has the object available locally. It can be retrieved with an HTTP request using the “uri-res” resolver.
This packet has no known children at the current time.
The descriptive name for the object, and the object’s size.
This child should be sent if it was
requested and is available.
A 32-bit integer indicating the object’s size, followed by a string describing it. If a /QH2/H/SZ child is present, the 32-bit size prefix is omitted here.
This packet has no known children at the
current time.
Provides metadata describing the object.
This chid should be sent if it was requested and is available.
A string containing XML. The <?xml?> header is optional. Standard plural/singular Gnutella metadata schemas are used, for example:
<?xml version=”1.0”?>
<applications xsi:noNamespaceSchemaLocation=”http://www.shareaza.com/schemas/application.xsd”>
<application title=”Shareaza”/>
</applications>
This packet has no known children at the
current time.
Specifies the size of the object.
This child should be sent if DN was requested, and if the size is not to be packed into the /QH2/H/DN child.
A 32-bit or 64-bit integer representing the object size.
This packet has no known children at the
current time.
Indicates which hit group (/QH2/HG) this hit belongs to. If this child is not present, the hit belongs to group zero.
Send this child if this hit belongs to a hit group other than group zero.
If this child is present, the hit belongs to a hit group other than group zero.
A single byte, the hit group ID.
This packet has no known children at the
current time.
Specifies a 32-bit integer which uniquely identifies the object on the sender’s system.
This child is not normally sent for search results. It is used in browse-user responses to allow objects to be referenced by virtual folder structures.
Use this ID to link objects to virtual folder structures.
A 32-bit integer ID.
This packet has no known children at the
current time.
Indicates the number of external cached sources known to the sending node
This child is sent if cached sources are available and interest in URLs was expressed in the query.
A 16-bit integer.
This packet has no known children at the
current time.
Indicates that the sending node has only part of the object, and how much it has.
This child is sent if a /QH2/H/URL child is present and the URL contains only part of the object.
A 32-bit integer, the number of bytes available.
This packet has no known children at the
current time.
Provides a user review and/or rating for the object.
This child is sent if a user comment is available, and comments were requested.
An XML string, of the form:
<comment rating=”n”>
Text
</comment>
The rating is an integer 0 <= rating <= 5, indicating the number of “stars”. The text may not include HTML, but may include forum-style block codes and emoticon notation.
This packet has no known children at the
current time.
Indicates that a preview URL is available for the object. A preview URL will serve a miniature version of the content.
This child is sent if a preview is available.
Empty, or an absolute
URL.
If a URL is provided, that URL should be considered to point to a preview of the object being described.
If the payload is empty, the sender is indicating that it has a preview of the object available locally. It can be retrieved with an HTTP request to the path:
/gnutella/preview/v1?urn:xyz
This packet has no known children at the
current time.
Provides metadata describing one or more objects in the packet. This is an alternative method of delivering metadata, which allows metadata for similar objects to be grouped together. It is compatible with the unified metadata block format in Gnutella1.
This chid should be sent if it was requested and is available.
A string containing one or more XML documents, each beginning with a <?xml?> header. Standard plural/singular Gnutella metadata schemas are used, for example:
<?xml version=”1.0”?>
<applications xsi:noNamespaceSchemaLocation=”http://www.shareaza.com/schemas/application.xsd”>
<application index=”0” title=”Shareaza”/>
</applications>
<?xml version=”1.0”?>
<images xsi:noNamespaceSchemaLocation=”http://www.shareaza.com/schemas/image.xsd”>
<image index=”1” description=”Shareaza Logo” width=”128” height=”128”/>
<image index=”2” description=”Shareaza Screen Shot” width=”1024” height=”768”/>
</images>
An additional attribute, “index” is added to each singular entry. The index is a zero-based integer identifying the Nth /QH2/H child within this /QH2 packet.
This packet has no known children at the
current time.
Delivers a user profile or user profile summary, describing the user operating the sending node.
Send a version of this child if the operating user wishes to list their screen name in search results.
The current version of this packet contains a summary only, consisting of the screen/nick name. Future versions may include a more comprehensive XML text block in the payload.
No payload is defined at the current time.
One child packet type is defined at the current time:
Specifies the nick/screen name of the user.
A string.
This packet has no known children at the
current time.
An effective search system must provide a query language which is:
At the same time, it is desirable for the language to be reasonably easy to parse in software.
Gnutella2 employs a simple query language that is familiar to users of web search engines and allows most common criteria to be entered intuitively.
Cat Dog
(matches strings with “cat” and “dog”, in any order)
-Cat Dog
(matches strings with “dog” but not “cat”, in any order)
-Cat –Dog
(matches strings with neither “cat” nor “dog”, illegal in an external search as there are no positive words)
“cat dog”
(matches strings with “cat” followed by “dog”, “cat dog” matches, “dog cat” does not)
-“cat dog”
(matches strings without “cat dog”, “cat dog” does not match, “dog cat” does)
“cat dog” –fish
(matches strings with “cat dog” and without “fish”)
Searching metadata involves a set of specific rules:
When comparing numeric values, the match function is specified by the search criteria:
Generic search criteria (/Q2/DN) can be
matched against metadata fields, however care must be taken not to match against
data members which are not directly descriptive to the object. Client-side schema descriptors are a
good solution, listing the scope of a general search on each recognised
schema.
This is not a core part of the Gnutella2
architecture, and is discussed in depth elsewhere.
This is not a core part of the Gnutella2
architecture, and is discussed in depth elsewhere.
Community is an important part of a peer to peer experience, and a unified user profile format enables meaningful exchange of user metadata. Gnutella user profiles are XML documents adhering to a user profile schema referenced below.
User profile information is exchanged at strategic points in network communications, using packet types defined in subsequent sections.
To be copied in.
Requesting a user profile from another user
involves a challenge and response transaction. The challenge step allows the initiator
to provide a random token for the respondent to sign with a digital signature
and return, verifying its identity.
This allows a distributed identity safeguard, which is under
development.
Requests the user profile of a node.
Send this packet to request the user profile of another node.
Upon receiving this packet, respond with a user profile if one is available.
No payload is defined at the current time.
This packet has no known children at the
current time.
Delivers a user profile.
Send this packet in response to a user profile challenge (/UPROC)
An XML string, adhering to the Gnutella user profile schema.
This packet has no known children at the
current time.
© Shareaza Pty. Ltd.
2003-03-26