从规格来看:
”卡德姆利亚参与者必须执行的最重要的程序是定位与某个给定节点 ID 最接近的 k 个节点。我们称这个过程为节点查找。Kademlia 使用递归算法进行节点查找。查找发起者首先从其最近的非空 k 桶中挑选个节点(或者,如果该桶的条目少于 a 个,则只取其所知的 a 个最近的节点)。发起方随后向其选择的 a 节点发送并行异步 FIND_NODE RPCS, a 是一个系统范围的并发参数,例如 3。”
以及:
"在递归步骤中,启动器将 FIND_NODE 重新发送到它从以前的 RPC 中了解到的节点。(此递归可以在所有以前的 RPC 都返回之前开始)。在发起方听说的离目标最近的 k 个节点中,它挑选一个尚未查询的节点,并将 FIND_NODE RPC 重新发送给它们。无法快速响应的节点将被排除在考虑范围之外,直到它们做出响应。如果一轮 FIND_NODES 未能返回比已经看到的最近的节点更近的节点,则发起者将 FIND _ NODES 重新发送给它尚未查询的所有 k 个最近的节点。当发起者已经查询并从它所看到的 k 个最近的节点获得响应时,查找终止。当 a = 1 时,查找算法在消息开销和检测故障节点的延迟方面与 Chord 相似。然而,Kademlia 可以路由更低的延迟,因为它可以灵活地选择 k 个节点中的任何一个来转发请求。”
这里的想法是:
- 拼命返回 k 闭合节点。
- 并行联系其他对等方,以最大限度地减少延迟。
图 8:节点查找算法
这看起来很复杂,但实现(稍后显示)相当简单。
术语:
查找发起者:这是您自己的对等方想要对其他对等方进行store
或retrieve
调用。节点查找是在存储或检索值之前执行的,以便您的对等方有一个合理的。
"为了存储(密钥、值)对,参与者定位与密钥最近的 k 个节点,并将它们发送到 STORE RPCS。“这里术语“定位”实际上是指执行查找。与(来自规范):“要查找(键、值)对,节点首先执行查找,以查找 id 最接近键的 k 个节点,”,其中专门使用了术语“查找”。
我们可以在布莱恩·穆勒的 Python 代码中看到这一点:
代码清单 29:Python 中的查找邻居算法
async def get(self, key):
“““
Get a key if the network has it.
Returns:
:class:`None` if not found, the value otherwise.
“““
dkey = digest(key)
# if this node has it, return it
if self.storage.get(dkey) is not None:
return self.storage.get(dkey)
node = Node(dkey)
nearest = self.protocol.router.findNeighbors(node)
...
async def set(self, key, value):
“““
Set the given string key to the given value in the network.
“““
self.log.debug("setting '%s' = '%s' on network" % (key,
value))
dkey = digest(key)
return await self.set_digest(dkey, value)
async def set_digest(self, dkey, value):
“““
Set the given SHA1 digest key (bytes) to the given value in the
network.
“““
node = Node(dkey)
nearest = self.protocol.router.findNeighbors(node)
...
这里,findNeighbors
是规范中描述的查找算法。
从规范来看,“无法快速响应的节点”是什么意思?尤其是“迅速”这个词?这是一个特定于实现的决定。
如果你作为同行,自己的 k 桶里没有同行呢?这不应该发生(你至少应该有你正在联系的对等点),但是如果那个对等点只有你在它的 k 桶里,那么就没有什么可以返回的。
从规格来看,在“离其最近的非空 k-bucket,”中“最近”是什么意思?我在这里假设它是异或距离度量,但问题是,我们用什么作为具有一系列触点的桶的“密钥”?由于没有定义,该实现将在所有存储桶中的所有联系人中搜索更接近的初始联系人集。此外,异或距离计算意味着我们不能只在包含密钥所在范围的桶中进行外部搜索。这更好地匹配了另一个条件“或者,如果该桶的条目少于 a,它只取它所知道的 a 个最近的节点”,这意味着在所有桶中搜索所有a
个最近的节点。
同样来自规范:“查找发起者从挑选一个节点开始。“拣货”是什么意思?这是否意味着还需要根据亲密度对“最接近的桶”中的联系人进行额外排序?完全没有定义。
如果您想尝试“最近的桶”版本,请启用#define TRY_CLOSEST_BUCKET
,它像代码清单 30 一样实现。
代码清单 30:尝试最近的桶
KBucket
bucket = FindClosestNonEmptyKBucket(key);
// Not in spec -- sort by the closest nodes in the closest
bucket.
List<Contact> nodesToQuery = allNodes.Take(Constants.ALPHA).ToList();
否则,该实现只是在所有桶中获得最近的a
触点。
代码清单 31:所有桶中最接近的联系人
List<Contact> allNodes = node.BucketList.GetCloseContacts(key,
node.OurContact.ID).Take(Constants.K).ToList();
然而,这种实现,当用虚拟节点测试时(其中系统基本上知道每一个其他节点),有效地获得了k
最近的联系,因为它搜索了虚拟节点空间中的所有桶。因此,如果我们想练习算法,代码清单 32 更好。
代码清单 32:单元测试的最密切联系
List<Contact> allNodes =
node.BucketList.GetKBucket(key).Contacts.Take(Constants.K).ToList();
这其实就引出了下一个问题。根据前面的代码,在最初获取联系人时,此时更接近的联系人(我使用的是“联系人”和“节点”可互换)是否应该添加到更接近的联系人列表中?规范没有说不要,但也没有明确说应该这样做。假设我们一开始只选择a
联系人,那么我们肯定没有查找预期返回的k
联系人,所以我如上所述实现了这一点——我们拥有的a
最近的联系人被添加到“更近”的列表中,而我们拥有的a
更远的联系人被添加到“更远”的列表中。
代码清单 33:向探测联系人列表添加更近/更远的联系人
// Also not explicitly in spec:
// Any closer node in the alpha list is immediately added to our
closer contact list,
// and any farther node in the alpha list is immediately added to our
// farther contact list.
closerContacts.AddRange(nodesToQuery.Where(
n => (n.ID ^ key) < (node.OurContact.ID ^ key)));
fartherContacts.AddRange(nodesToQuery.Where(
n => (n.ID ^ key) >= (node.OurContact.ID ^ key)));
我们如何处理a
之外的联系人?考虑到这一点(来自规范):如果一轮 FIND_NODES 未能返回比已经看到的最近的节点更近的节点,则发起者将 FIND_NODE 重新发送到它尚未查询的所有 k 个最近的节点,“它适用于节点的第一次查询,还是仅适用于查询后返回的节点集?我将假设它适用于第一次查询中未查询的其余 a 节点,最多为k-a
个联系人。
规范是这样说的:“大多数操作都是按照上面的查找过程实现的。“哪些操作,什么时候?我们以后会解决这个问题。
从规格:“卡德姆利亚的许多好处来自于它对关键空间中点之间的距离使用了一种新的异或度量。XOR 是对称的,允许 Kademlia 参与者从其路由表中包含的完全相同的节点分布接收查找查询。”
因此,节点和密钥之间的距离是与密钥关联的节点 ID。不幸的是,异或距离度量不适用于预先排序的标识列表。当与联系人列表进行异或运算时,两个键的结果“距离”计算可能会有很大不同。如堆栈溢出中所述:【12】
“问题是,桶不必满,如果你想发送,比如说,一个响应中有 20 个节点,单个桶是不够的。因此,您必须以相对于目标键的升序(异或)遍历路由表(根据您自己的节点标识或自然距离进行排序),以访问多个桶。因为异或距离度量在每次位进位时都会折叠(异或==无进位加法),所以它不能很好地映射到任何路由表布局。换句话说,参观最近的存储桶是不行的...我认为很多人只是简单地遍历整个路由表,因为对于常规节点来说,它最多只包含几十个桶,而一个分布式哈希表节点看不到太多流量,所以它每秒只需要执行几次这个操作。如果您在密集的、缓存友好的数据结构中实现这一点,那么最大的份额实际上可能是内存流量,而不是进行一些异或和比较的中央处理器指令。
也就是说,全表扫描很容易实现。"
那篇文章的作者提供了一个实现【13】,它不需要全表扫描,但是在我的实现中,我们将只对桶列表联系人进行全扫描。
让我们从基线实现开始,目前:
- 不涉及并行的问题。
- 不处理仅查询 a 节点——我们暂时将
a
设置为常数 k。 - 目前,我们还忽略了一个事实,即这个算法对于节点查找和对于值查找是相同的。
这简化了实现,这样我们就可以为基本算法提供一些单元测试,之后再加入并行性和 a 概念,我们的单元测试应该还是通过的。此外,这里的方法都标记为virtual
,以防您想要覆盖实现。首先,一些助手方法:
代码清单 34: FindClosestNonEmptyKBucket
public virtual KBucket FindClosestNonEmptyKBucket(ID key)
{
KBucket closest =
node.BucketList.Buckets.Where(
b => b.Contacts.Count > 0).OrderBy(b => b.Key ^
key).FirstOrDefault();
Validate.IsTrue<NoNonEmptyBucketsException>(closest != null,
"No
non-empty buckets exist. You must first register a peer and add
that peer to your bucketlist.");
return
closest;
}
代码清单 35: GetClosestNodes
public List<Contact> GetClosestNodes(ID key, KBucket bucket)
{
return bucket.Contacts.OrderBy(c =>
c.ID ^ key).ToList();
}
代码清单 36: RpcFindNodes
public
(List<Contact> contacts, Contact
foundBy, string val)
RpcFindNodes(ID key, Contact contact)
{
var (newContacts, timeoutError) =
contact.Protocol.FindNode(node.OurContact, key);
// Null continuation
here to support unit tests where a DHT hasn't been set up.
dht?.HandleError(timeoutError, contact);
return (newContacts, null, null);
}
请注意,在这段代码中,我们有一个处理超时和其他错误的方法,我们将在后面描述。
代码清单 37: GetCloserNodes
public bool GetCloserNodes(
ID
key,
Contact
nodeToQuery,
Func<
ID,
Contact,
(List<Contact> contacts,
Contact foundBy,
string val)> rpcCall,
List<Contact> closerContacts,
List<Contact> fartherContacts,
out string val,
out Contact foundBy)
{
// As in, peer's nodes:
// Exclude ourselves and the peers we're contacting
(closerContacts and
// fartherContacts) to a get unique list of new peers.
var
(contacts, cFoundBy, foundVal) = rpcCall(key, nodeToQuery);
val =
foundVal;
foundBy =
cFoundBy;
List<Contact> peersNodes = contacts.
ExceptBy(node.OurContact,
c => c.ID).
ExceptBy(nodeToQuery,
c => c.ID).
Except(closerContacts).
Except(fartherContacts).ToList();
// Null
continuation is a special case primarily for unit testing when we have
// no nodes in any buckets.
var
nearestNodeDistance = nodeToQuery.ID ^ key;
lock
(locker)
{
closerContacts.
AddRangeDistinctBy(peersNodes.
Where(p
=> (p.ID ^ key) < nearestNodeDistance),
(a,
b) => a.ID == b.ID);
}
lock
(locker)
{
fartherContacts.
AddRangeDistinctBy(peersNodes.
Where(p
=> (p.ID ^ key) >= nearestNodeDistance),
(a, b) => a.ID == b.ID);
}
return
val != null;
}
请注意,我们总是排除我们自己的节点和我们正在联系的节点。还要注意lock
语句来同步操作联系人列表。
另外:
Func<ID, Contact, (List<Contact> contacts, Contact foundBy, string val)> rpcCall
该参数处理调用FindNode
或FindValue
,我们将在后面讨论。
这个流程图都是在Router
类的Lookup
方法中汇总的。这就是实现流程图复杂性的地方,所以这里有很多。请记住rpcCall
调用的不是FindNode
就是FindValue
,所以这里的一些逻辑必须弄清楚,如果找到一个值,要退出查找。这里也广泛使用了元组,这有望使代码更加清晰!最后,这个方法实际上是一个基类中的抽象方法,因为稍后,我们将看到实现这个算法是一个并行的“查找”(规范中谈到的东西),但是现在,代码清单 38 显示了非并行的实现。
代码清单 38:查找算法
public override (bool
found, List<Contact> contacts, Contact
foundBy, string val)
Lookup(
ID
key,
Func<ID, Contact, (List<Contact> contacts, Contact foundBy, string val)> rpcCall,
bool
giveMeAll = false)
{
bool
haveWork = true;
List<Contact> ret = new List<Contact>();
List<Contact> contactedNodes = new List<Contact>();
List<Contact> closerContacts = new List<Contact>();
List<Contact> fartherContacts = new List<Contact>();
List<Contact> closerUncontactedNodes = new
List<Contact>();
List<Contact> fartherUncontactedNodes = new
List<Contact>();
#if DEBUG
List<Contact> allNodes =
node.BucketList.GetKBucket(key).Contacts.Take(Constants.K).ToList();
#else
// This is a bad way to get a list of close contacts with
virtual nodes because
// we're always going to get the closest nodes right at the get go.
List<Contact> allNodes =
node.BucketList.GetCloseContacts(key,
node.OurContact.ID).Take(Constants.K).ToList();
#endif
List<Contact> nodesToQuery = allNodes.Take(Constants.ALPHA).ToList();
closerContacts.AddRange(nodesToQuery.Where(
n => (n.ID ^ key) < (node.OurContact.ID ^ key)));
fartherContacts.AddRange(nodesToQuery.Where(
n => (n.ID ^ key) >= (node.OurContact.ID ^ key)));
// The remaining contacts not tested yet can be put here.
fartherContacts.AddRange(allNodes.Skip(Constants.ALPHA).
Take(Constants.K - Constants.ALPHA));
// We're about to contact these nodes.
contactedNodes.AddRangeDistinctBy(nodesToQuery, (a, b) =>
a.ID == b.ID);
// Spec: The initiator then sends parallel, asynchronous
FIND_NODE RPCS to the a
// nodes it has chosen, a is a system-wide concurrency parameter, such as
3.
var
queryResult =
Query(key, nodesToQuery, rpcCall, closerContacts, fartherContacts);
if
(queryResult.found)
{
#if
DEBUG // For unit
testing.
CloserContacts = closerContacts;
FartherContacts = fartherContacts;
#endif
return
queryResult;
}
// Add any new closer contacts to the list we're going to
return.
ret.AddRangeDistinctBy(closerContacts, (a, b) => a.ID ==
b.ID);
// Spec: The lookup terminates when the initiator has queried
and received
// responses from the k closest nodes it has seen.
while
(ret.Count < Constants.K && haveWork)
{
closerUncontactedNodes =
closerContacts.Except(contactedNodes).ToList();
fartherUncontactedNodes =
fartherContacts.Except(contactedNodes).ToList();
bool
haveCloser = closerUncontactedNodes.Count > 0;
bool
haveFarther = fartherUncontactedNodes.Count > 0;
haveWork = haveCloser || haveFarther;
// Spec: Of the k nodes the initiator has heard of closest
to the target...
if
(haveCloser)
{
// Spec: ...it picks a that it has not yet queried and
resends the
// FIND_NODE RPC to them.
var newNodesToQuery =
closerUncontactedNodes.Take(Constants.ALPHA).ToList();
// We're about to contact these nodes.
contactedNodes.AddRangeDistinctBy(newNodesToQuery,
(a, b) => a.ID == b.ID);
queryResult
=
Query(key, newNodesToQuery, rpcCall, closerContacts,
fartherContacts);
if (queryResult.found)
{
#if
DEBUG // For unit
testing.
CloserContacts
= closerContacts;
FartherContacts
= fartherContacts;
#endif
return queryResult;
}
}
else if (haveFarther)
{
var newNodesToQuery =
fartherUncontactedNodes.Take(Constants.ALPHA).ToList();
// We're about to contact these nodes.
contactedNodes.AddRangeDistinctBy(fartherUncontactedNodes,
(a, b) => a.ID == b.ID);
queryResult
=
Query(key, newNodesToQuery, rpcCall, closerContacts,
fartherContacts);
if (queryResult.found)
{
#if
DEBUG // For unit
testing.
CloserContacts
= closerContacts;
FartherContacts
= fartherContacts;
#endif
return queryResult;
}
}
}
#if
DEBUG // For unit
testing.
CloserContacts = closerContacts;
FartherContacts = fartherContacts;
#endif
// Spec (sort of): Return max(k) closer nodes, sorted by
distance.
// For unit testing, giveMeAll can be true so that we can
match against our
// alternate way of getting closer contacts.
return (
false,
(giveMeAll ? ret : ret.Take(Constants.K).OrderBy(c
=> c.ID ^ key).ToList()),
null,
null);
}
正如规范所说,这个算法是迭代,而不是递归。我见过的每个实现都使用迭代。
这里有一个使用的扩展方法。
代码清单 39:添加范围标识符
public static void AddRangeDistinctBy<T>(
this List<T> target,
IEnumerable<T>
src, Func<T,
T, bool>
equalityComparer)
{
src.ForEach(item =>
{
// no items in the list must match the item.
if
(target.None(q => equalityComparer(q, item)))
{
target.Add(item);
}
});
}
在进入单元测试之前,我们需要能够创建虚拟节点,这意味着实现一个最小的VirtualProtocol
。
代码清单 40:虚拟节点
public class VirtualProtocol : IProtocol
{
public Node Node { get; set;
}
/// <summary>
/// For
unit testing with deferred node setup.
/// </summary>
public
VirtualProtocol(bool responds = true)
{
Responds = responds;
}
/// <summary>
///
Register the in-memory node with our virtual protocol.
/// </summary>
public
VirtualProtocol(Node node, bool responds = true)
{
Node = node;
Responds = responds;
}
public RpcError Ping(Contact sender)
{
return new RpcError() { TimeoutError = !Responds };
}
/// <summary>
/// Get
the list of contacts for this node closest to the key.
/// </summary>
public (List<Contact>
contacts, RpcError error) FindNode(Contact sender, ID key)
{
return
(Node.FindNode(sender, key).contacts, NoError());
}
/// <summary>
///
Returns either contacts or null if the value is found.
/// </summary>
public (List<Contact>
contacts, string val, RpcError error)
FindValue(Contact sender, ID key)
{
var
(contacts, val) = Node.FindValue(sender, key);
return
(contacts, val, NoError());
}
/// <summary>
/// Stores the key-value on the remote
peer.
/// </summary>
public RpcError Store(
Contact sender,
ID key,
string val,
bool isCached = false,
int expTimeSec = 0)
{
Node.Store(sender,
key, val, isCached, expTimeSec);
return
NoError();
}
protected
RpcError NoError()
{
return new RpcError();
}
}
请注意,对于单元测试,我们可以选择模拟一个无响应的节点。
我们还必须在Node
类中实现FindNode
(其他方法我们将在后面实现)。
代码清单 41:查找节点
public (List<Contact>
contacts, string val) FindNode(Contact sender, ID key)
{
Validate.IsFalse<SendingQueryToSelfException>(sender.ID == ourContact.ID,
"Sender
should not be ourself!");
SendKeyValuesIfNewContact(sender);
bucketList.AddContact(sender);
// Exclude sender.
var
contacts = bucketList.GetCloseContacts(key, sender.ID);
return
(contacts, null);
}
请注意,此呼叫试图将发件人添加为联系人,这可能会将联系人添加到收件人的遗愿列表中,也可能会更新联系人信息。如果联系人是新的,则发送与新对等方“更接近”的任何键值。这是卡德姆利亚算法的一个重要方面:离存储的键值“更近”的对等点获得这些键值,从而提高查找键值的效率。
我们还需要一种算法来寻找接收者范围内的密切接触者。
代码清单 42: GetCloseContacts
/// <summary>
/// Brute
force distance lookup of all known contacts, sorted by distance, then we
/// take at most k (20) of the closest.
/// </summary>
/// <param name="toFind">The ID for which we want to find close contacts.</param>
/// <param name="exclude">The ID to exclude (the requestor's
ID)</param>
public List<Contact>
GetCloseContacts(ID key, ID exclude)
{
lock (this)
{
var
contacts = buckets.
SelectMany(b
=> b.Contacts).
Where(c
=> c.ID != exclude).
Select(c
=> new { contact = c, distance = c.ID ^
key }).
OrderBy(d
=> d.distance).
Take(Constants.K);
return
contacts.Select(c => c.contact).ToList();
}
}
这将测试排序和返回的最大联系人数限制。请注意,我使用所有随机 id 可能不是最理想的情况;但是,为了确保一致的单元测试,我们在DEBUG
模式下为随机数生成器播种相同的值。
代码清单 43:随机值的调试模式播种
#if DEBUG
public static Random rnd = new Random(1);
#else
private static Random rnd = new Random();
#endif
最后,我们可以实施“获得密切接触者”单元测试。
代码清单 44: GetCloseContactsOrderedTest()
[TestMethod]
public void GetCloseContactsOrderedTest()
{
Contact
sender = new Contact(null, ID.RandomID);
Node
node = new Node(new Contact(null, ID.RandomID), new VirtualStorage());
List<Contact> contacts = new List<Contact>();
// Force multiple buckets.
100.ForEach(() => contacts.Add(new Contact(null,
ID.RandomID)));
contacts.ForEach(c => node.BucketList.AddContact(c));
ID key
= ID.RandomID; // Pick an ID
List<Contact> closest = node.FindNode(sender,
key).contacts;
Assert.IsTrue(closest.Count
== Constants.K, "Expected K contacts to be returned.");
// The contacts should be in ascending order with respect to
the key.
var
distances = closest.Select(c => c.ID ^ key).ToList();
var
distance = distances[0];
// Verify distances are in ascending order:
distances.Skip(1).ForEach(d =>
{
Assert.IsTrue(distance
< d, "Expected
contacts ordered by distance.");
distance = d;
});
// Verify the contacts with the smallest distances were
returned from all
// possible distances.
var
lastDistance = distances[distances.Count - 1];
var
others = node.BucketList.Buckets.SelectMany(
b => b.Contacts.Except(closest)).Where(c => (c.ID ^ key) <
lastDistance);
Assert.IsTrue(others.Count()
== 0,
"Expected
no other contacts with a smaller distance than the greatest
distance to exist.");
}
这个测试只是验证我们没有新的节点要查询,因为我们正在联系的所有节点都已经被联系了。
代码清单 45:查询单元测试的非代码
/// <summary>
/// Given
that all the nodes we’re contacting are nodes *being* contacted,
/// the
result should be no new nodes to contact.
/// </summary>
[TestMethod]
public void NoNodesToQueryTest()
{
// Setup
router = new Router(new Node(new Contact(null, ID.Mid), new VirtualStorage()));
nodes = new List<Node>();
20.ForEach((n) => nodes.Add(new
Node(new
Contact(null,
new ID(BigInteger.Pow(new
BigInteger(2), n))), new VirtualStorage())));
// Fixup protocols:
nodes.ForEach(n => n.OurContact.Protocol = new VirtualProtocol(n));
// Our contacts:
nodes.ForEach(n =>
router.Node.BucketList.AddContact(n.OurContact));
// Each peer needs to know about the other peers except of
course itself.
nodes.ForEach(n => nodes.Where(nOther => nOther != n).
ForEach(nOther => n.BucketList.AddContact(nOther.OurContact)));
// Select the key such that n ^ 0 == n
// This ensures that the distance metric uses only the node
ID, which makes for
// an integer difference for distance, not an XOR distance.
key = ID.Zero;
// all contacts are in one bucket.
contactsToQuery =
router.Node.BucketList.Buckets[0].Contacts;
closerContacts = new List<Contact>();
fartherContacts = new List<Contact>();
contactsToQuery.ForEach(c =>
{
router.GetCloserNodes(key, c, router.RpcFindNodes,
closerContacts,
fartherContacts,
out var
_, out
var _);
});
Assert.IsTrue(closerContacts.ExceptBy(contactsToQuery,
c=>c.ID).Count() == 0,
"No new
nodes expected.");
Assert.IsTrue(fartherContacts.ExceptBy(contactsToQuery,
c=>c.ID).Count() == 0,
"No new
nodes expected.");
}
具有讽刺意味的是,真的没有其他方法可以获得这些节点,特别是对于Lookup
方法中的while
循环。因此,我没有在这个问题上绞尽脑汁,而是在代码清单 46 中实现了这个测试。
代码清单 46:查找单元测试
public void LookupTest()
{
// Seed with different random values
100.ForEach(seed =>
{
ID.rnd
= new Random(seed);
Setup();
List<Contact> closeContacts =
router.Lookup(key, router.RpcFindNodes, true).contacts;
List<Contact> contactedNodes = new List<Contact>(closeContacts);
//
Is the above call returning the correct number of close contacts?
//
The unit test for this is sort of lame. We should get at least
// as many contacts as when calling GetCloserNodes.
GetAltCloseAndFar(
contactsToQuery,
closerContactsAltComputation,
fartherContactsAltComputation);
Assert.IsTrue(closeContacts.Count
>= closerContactsAltComputation.Count,
"Expected
at least as many contacts.");
// Technically,
we can’t even test whether the contacts returned
// in GetCloserNodes exists in router.Lookup because it may have found
nodes
// even closer, and it only returns K nodes!
//
We can overcome this by eliminating the Take in the return of
router.Lookup().
closerContactsAltComputation.ForEach(c
=>
Assert.IsTrue(closeContacts.Contains(c)));
});
}
这个单元测试的设置很复杂。
代码清单 47:查找设置
protected void Setup()
{
// Setup
router = new Router(
new Node(new Contact(null, ID.RandomID), new VirtualStorage()));
nodes = new List<Node>();
100.ForEach(() =>
{
Contact contact = new Contact(new VirtualProtocol(), ID.RandomID);
Node
node = new Node(contact, new VirtualStorage());
((VirtualProtocol)contact.Protocol).Node = node;
nodes.Add(node);
});
// Fixup protocols:
nodes.ForEach(n => n.OurContact.Protocol = new VirtualProtocol(n));
// Our contacts:
nodes.ForEach(n =>
router.Node.BucketList.AddContact(n.OurContact));
// Each peer needs to know about the other peers except of
course itself.
nodes.ForEach(n => nodes.Where(nOther => nOther != n).
ForEach(nOther =>n.BucketList.AddContact(nOther.OurContact)));
// Pick a random bucket, or bucket where the key is in range,
otherwise
// we’re defeating the purpose of the algorithm.
key = ID.RandomID; // Pick an ID
// DO NOT DO THIS:
// List<Contact> nodesToQuery = router.Node.BucketList.GetCloseContacts(
// key, router.Node.OurContact.ID).Take(Constants.ALPHA).ToList();
contactsToQuery =
router.Node.BucketList.GetKBucket(key).Contacts.Take(Constants.ALPHA).ToList();
// or:
// contactsToQuery =
// router.FindClosestNonEmptyKBucket(key).Contacts.Take(Constants.ALPHA).ToList();
closerContacts = new List<Contact>();
fartherContacts = new List<Contact>();
closerContactsAltComputation = new
List<Contact>();
fartherContactsAltComputation = new
List<Contact>();
theNearestContactedNode = contactsToQuery.OrderBy(n => n.ID
^ key).First();
distance = theNearestContactedNode.ID.Value ^ key.Value;
}
此外,我们使用不同的实现来获取更近和更远的接触,以便我们可以验证测试中的算法。代码清单 48 显示了替代实现。
代码清单 48:靠近和远离节点的替代实现
protected void GetAltCloseAndFar(
List<Contact> contactsToQuery,
List<Contact> closer, List<Contact> farther)
{
// For each node (ALPHA == K for testing) in our bucket
(nodesToQuery) we're
// going to get k nodes closest to the key:
foreach (Contact contact in contactsToQuery)
{
// Find the node we're contacting:
Node contactNode = nodes.Single(n =>
n.OurContact == contact);
// Close contacts except ourself and the nodes we're
contacting.
// Note that of all the contacts in the bucket list, many of
the k returned
// by the GetCloseContacts call are contacts we're querying,
so they are
// being excluded.
var closeContactsOfContactedNode =
contactNode.
BucketList.
GetCloseContacts(key,
router.Node.OurContact.ID).
ExceptBy(contactsToQuery,
c => c.ID.Value);
foreach (Contact closeContactOfContactedNode in
closeContactsOfContactedNode)
{
// Which of these contacts are closer?
if ((closeContactOfContactedNode.ID ^
key) < distance)
{
closer.AddDistinctBy(closeContactOfContactedNode,
c => c.ID.Value);
}
// Which of these contacts are farther?
if ((closeContactOfContactedNode.ID ^
key) >= distance)
{
farther.AddDistinctBy(closeContactOfContactedNode,
c => c.ID.Value);
}
}
}
}
这个测试和下一个测试在 while 循环之前练习Lookup
算法的部分。注意桶和标识是如何设置的。
代码清单 49: SimpleAllCloserContactsTest
[TestMethod]
public void SimpleAllCloserContactsTest()
{
// Setup
// By selecting our node ID to zero, we ensure that all
distances of
// other nodes are > the distance to our node.
router = new Router(new Node(new Contact(null, ID.Max), new VirtualStorage()));
nodes = new List<Node>();
Constants.K.ForEach((n) => nodes.Add(new
Node(new
Contact(null,
new ID(BigInteger.Pow(new
BigInteger(2), n))), new VirtualStorage())));
// Fixup protocols:
nodes.ForEach(n => n.OurContact.Protocol = new VirtualProtocol(n));
// Our contacts:
nodes.ForEach(n =>
router.Node.BucketList.AddContact(n.OurContact));
// Each peer needs to know about the other peers except of
course itself.
nodes.ForEach(n => nodes.Where(nOther => nOther != n).
ForEach(nOther => n.BucketList.AddContact(nOther.OurContact)));
// Select the key such that n ^ 0 == n
// This ensures that the distance metric uses only the node
ID, which makes
// for an integer difference for distance, not an XOR distance.
key = ID.Zero;
// all contacts are in one bucket.
contactsToQuery =
router.Node.BucketList.Buckets[0].Contacts;
var
contacts = router.Lookup(key, router.RpcFindNodes, true).contacts;
Assert.IsTrue(contacts.Count
== Constants.K, "Expected k closer contacts.");
Assert.IsTrue(router.CloserContacts.Count
== Constants.K,
"All
contacts should be closer.");
Assert.IsTrue(router.FartherContacts.Count
== 0,
"Expected no
farther contacts.");
}
同样,请注意代码清单 50 中桶和标识是如何设置的。
代码清单 50:SimpleAllFartherContactsTest
/// <summary>
/// Creates
a single bucket with node IDs 2^i for i in [0, K) and
/// 1\. use
a key with ID.Value == 0 to that distance computation is an
/// integer difference.
/// 2\. use
an ID.Value == 0 for our node ID so all other nodes are farther.
/// </summary>
[TestMethod]
public void SimpleAllFartherContactsTest()
{
// Setup
// By selecting our node ID to zero, we ensure that all
distances of other
// nodes are > the distance to our node.
router = new Router(new Node(new Contact(null, ID.Zero), new VirtualStorage()));
nodes = new List<Node>();
Constants.K.ForEach((n) => nodes.Add(new
Node(new
Contact(null,
new ID(BigInteger.Pow(new
BigInteger(2), n))), new VirtualStorage())));
// Fixup protocols:
nodes.ForEach(n => n.OurContact.Protocol = new VirtualProtocol(n));
// Our contacts:
nodes.ForEach(n =>
router.Node.BucketList.AddContact(n.OurContact));
// Each peer needs to know about the other peers except of
course itself.
nodes.ForEach(n => nodes.Where(nOther => nOther != n).
ForEach(nOther => n.BucketList.AddContact(nOther.OurContact)));
// Select the key such that n ^ 0 == n
// This ensures that the distance metric uses only the node
ID, which makes
// for an integer difference for distance, not an XOR distance.
key = ID.Zero;
var
contacts = router.Lookup(key, router.RpcFindNodes, true).contacts;
Assert.IsTrue(contacts.Count
== 0, "Expected
no closer contacts.");
Assert.IsTrue(router.CloserContacts.Count
== 0,
"Did not
expected closer contacts.");
Assert.IsTrue(router.FartherContacts.Count
== Constants.K,
"All
contacts should be farther.");
}