

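  When I first saw people online saying that ZooKeeper satisfies the CP side of CAP, I assumed ZooKeeper was at least sequentially consistent. So what does ZooKeeper itself say? Its documentation does claim "Sequential Consistency", but compared with Leslie Lamport's definition, its version seems to have shrunk. How? It goes on to explain, rather sheepishly, "Updates from a client will be applied in the order that they were sent". Notice: updates are applied in order, but nothing is said about reads.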


  Sometimes developers mistakenly assume one other guarantee that ZooKeeper does not in fact make. This is: Simultaneously Consistent Cross-Client Views : ZooKeeper does not guarantee that at every instance in time, two different clients will have identical views of ZooKeeper data. Due to factors like network delays, one client may perform an update before another client gets notified of the change. Consider the scenario of two clients, A and B. If client A sets the value of a znode /a from 0 to 1, then tells client B to read /a, client B may read the old value of 0, depending on which server it is connected to. If it is important that Client A and Client B read the same value, Client B should call the sync() method from the ZooKeeper API method before it performs its read. So, ZooKeeper by itself doesn’t guarantee that changes occur synchronously across all servers, but ZooKeeper primitives can be used to construct higher level functions that provide useful client synchronization.
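  The A/B scenario in the quote can be sketched with a toy model. This is only an illustration, not ZooKeeper's implementation: the `Leader` and `Follower` classes, `catch_up`, and the in-memory log are all hypothetical names invented here, and `Follower.sync` merely stands in for the effect of the real sync() call (making the server catch up before the next read).

```python
class Leader:
    """Toy leader: holds the authoritative, ordered log of updates (hypothetical)."""

    def __init__(self):
        self.log = []  # list of (path, value); position + 1 plays the role of a zxid

    def write(self, path, value):
        self.log.append((path, value))
        return len(self.log)  # toy zxid of this update


class Follower:
    """Toy follower: applies the leader's log lazily, so it may lag behind."""

    def __init__(self, leader):
        self.leader = leader
        self.applied = 0  # number of log entries applied locally
        self.data = {}

    def catch_up(self):
        # Apply every log entry this server has not applied yet.
        while self.applied < len(self.leader.log):
            path, value = self.leader.log[self.applied]
            self.data[path] = value
            self.applied += 1

    def read(self, path):
        # Reads are answered from local state, without contacting the leader.
        return self.data.get(path)

    def sync(self):
        # Stand-in for sync(): catch up with the leader before the next read.
        self.catch_up()


leader = Leader()
server_a, server_b = Follower(leader), Follower(leader)

# Initial state: /a = 0 is visible on both servers.
leader.write("/a", 0)
server_a.catch_up()
server_b.catch_up()

# Client A (on server_a) sets /a to 1; server_b has not applied it yet.
leader.write("/a", 1)
server_a.catch_up()

stale = server_b.read("/a")   # client B reads without sync
server_b.sync()
fresh = server_b.read("/a")   # client B reads after sync
print(stale, fresh)
```

  In this toy run client B first sees the old value and only sees the new one after the sync step, which is the behaviour the quoted paragraph warns about.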



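  From the paper "ZooKeeper: Wait-free coordination for Internet-scale systems" we can see that every ZooKeeper node can serve clients. My guess is that this is because a single leader could hardly cope alone with things like session expiry and watches. Once a client connects to a follower, it sends all of its read and write requests to that follower.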







  As an aside, a ZooKeeper client that wants to learn about other clients' changes should do so through watches.

  Read-your-own-writes consistency



  All requests that update ZooKeeper state are forwarded to the leader……The server that receives the client request responds to the client when it delivers the corresponding state change.

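  So by the time A's write request completes, the node it is connected to has already delivered the state change, and A's subsequent reads naturally return the updated data.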
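  A minimal sketch of why this gives read-your-own-writes, assuming a toy model in which a server forwards the write to the leader and replies only after applying the change locally. The classes and method names (`Leader.propose`, `Server.write`, `_deliver_up_to`) are hypothetical and exist only for this illustration.

```python
class Leader:
    """Toy leader that orders all updates (hypothetical model)."""

    def __init__(self):
        self.log = []  # ordered list of (path, value)

    def propose(self, path, value):
        self.log.append((path, value))
        return len(self.log)  # toy zxid assigned to this update


class Server:
    """Toy server that a client is connected to."""

    def __init__(self, leader):
        self.leader = leader
        self.applied = 0
        self.data = {}

    def _deliver_up_to(self, zxid):
        # Apply log entries in order until the given zxid has been delivered.
        while self.applied < zxid:
            path, value = self.leader.log[self.applied]
            self.data[path] = value
            self.applied += 1

    def write(self, path, value):
        # Forward the update to the leader, then respond to the client
        # only after the state change has been delivered locally.
        zxid = self.leader.propose(path, value)
        self._deliver_up_to(zxid)
        return zxid

    def read(self, path):
        return self.data.get(path)  # served from local state


leader = Leader()
server_a, server_b = Server(leader), Server(leader)

server_b.write("/a", 0)          # some earlier write through server_b
server_a.write("/a", 1)          # client A writes through server_a
own_read = server_a.read("/a")   # A reads back through the same server
other_read = server_b.read("/a") # another server may still be behind
print(own_read, other_read)
```

  Because the write does not return until the local server has applied it, A's own read sees the new value, while a read through a different server in this toy model can still return the old one.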

  True Sequential Consistency

  The official site only says "Updates from a client will be applied in the order that they were sent". Could reads be observed out of order? Could a later read return older data than an earlier one?



  If the client connects to a new server, that new server ensures that its view of the ZooKeeper data is at least as recent as the view of the client by checking the last zxid of the client against its last zxid. If the client has a more recent view than the server, the server does not reestablish the session with the client until the server has caught up.
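  The zxid check described in the quote can be sketched as follows. This is a toy model under stated assumptions: `Server`, `Client`, `accept_session`, and `last_zxid` are hypothetical names, and where the real server delays reestablishing the session until it has caught up, the toy server simply refuses and the client retries later.

```python
class Server:
    """Toy server with the zxid of the last update it has applied (hypothetical)."""

    def __init__(self, name, last_zxid, data):
        self.name = name
        self.last_zxid = last_zxid
        self.data = dict(data)

    def accept_session(self, client_last_zxid):
        # Only accept a client whose view is not more recent than ours.
        return self.last_zxid >= client_last_zxid


class Client:
    """Toy client that remembers the most recent zxid it has observed."""

    def __init__(self):
        self.last_zxid = 0
        self.server = None

    def connect(self, server):
        if not server.accept_session(self.last_zxid):
            return False  # server is behind this client's view
        self.server = server
        return True

    def read(self, path):
        value = self.server.data.get(path)
        # Remember how recent the view we just read from was.
        self.last_zxid = max(self.last_zxid, self.server.last_zxid)
        return value


up_to_date = Server("s1", last_zxid=5, data={"/a": 1})
lagging = Server("s2", last_zxid=3, data={"/a": 0})

client = Client()
client.connect(up_to_date)
first_read = client.read("/a")            # client has now seen zxid 5

refused = not client.connect(lagging)     # s2 is behind the client's view

# Later, s2 catches up and the client can move over without going back in time.
lagging.last_zxid, lagging.data = 5, {"/a": 1}
accepted = client.connect(lagging)
second_read = client.read("/a")
print(first_read, refused, accepted, second_read)
```

  In this sketch the client can never attach to a server whose state is older than what it has already seen, so a later read cannot return older data than an earlier one.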

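  If the client handles the zxid correctly, my feeling is that ZooKeeper does provide Lamport's Sequential Consistency as well. But storing the zxid is probably a fairly hard problem to solve, so ZooKeeper plays it safe and waters down its own "Sequential Consistency" claim.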

  How to understand Single System Image

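  The ZooKeeper site also says it guarantees "Single System Image", which it explains as "A client will see the same view of the service regardless of the server that it connects to." In practice this explanation is a little misleading. From the zxid mechanism above, what it actually means is: once a client has connected to ZooKeeper, it will never see history go backwards.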



