1 How Distributed Scrapy Works

Several crawler hosts share a single request queue kept in Redis on a master host: every slave pulls requests from, and pushes newly generated requests into, that shared queue, so scheduling and deduplication happen in one place.

2 What Should Maintain the Queue?

The first things that come to mind are probably particular data structures, a database, or plain files.

Redis is the recommended choice here: the queue lives in one place and every crawler host can reach it over the network.
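
A Redis list already behaves like a shared queue: producers LPUSH on one end, consumers RPOP (or BRPOP) on the other, from any number of machines. A minimal sketch of the idea, assuming a locally running Redis and the redis-py client; the key name is illustrative only:

import redis

server = redis.StrictRedis(host='localhost', port=6379)

# Producer side: push a couple of URLs onto the shared list.
server.lpush('demo:requests', 'http://example.com/page1')
server.lpush('demo:requests', 'http://example.com/page2')

# Consumer side (may be another process or machine): pop in FIFO order.
print(server.rpop('demo:requests'))  # b'http://example.com/page1'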

3 How to Deduplicate

Make sure that every request in the request queue is unique.

Every element of a Redis set is unique, so a shared set of request fingerprints deduplicates them automatically.
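
A Redis set makes the check a single command: SADD only stores a member once, and its return value says whether the member was new. Below is a minimal sketch of that idea, again assuming a local Redis and redis-py; the key name and the SHA1-of-URL "fingerprint" are illustrative stand-ins for Scrapy's real request fingerprint (which also covers method, body, etc.):

import hashlib

import redis

server = redis.StrictRedis(host='localhost', port=6379)

def seen(url):
    # SADD returns 1 when the member is new, 0 when it already exists.
    fp = hashlib.sha1(url.encode('utf-8')).hexdigest()
    return server.sadd('demo:dupefilter', fp) == 0

print(seen('http://example.com'))  # False on first sight
print(seen('http://example.com'))  # True afterwards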

4 How to Guard Against Interruptions

Keep the request queue and fingerprint set in Redis and do not clear them when a crawler stops (the SCHEDULER_PERSIST option shown later); a restarted crawler simply picks up the remaining requests and resumes the crawl.

5 How to Implement This Architecture

The Scrapy-Redis library already implements this architecture in full.
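
Before walking through the sources, here is roughly how a project wires these components together in settings.py. This is a sketch built from the setting names that appear in the code below; the Redis host and port are placeholders for your own master host.

# settings.py (sketch) -- replace the Redis address with your master host.

# Use the scrapy-redis scheduler and dupefilter instead of Scrapy's defaults.
SCHEDULER = 'scrapy_redis.scheduler.Scheduler'
DUPEFILTER_CLASS = 'scrapy_redis.dupefilter.RFPDupeFilter'

# Keep the request queue and fingerprint set in Redis between runs,
# so an interrupted crawl can be resumed.
SCHEDULER_PERSIST = True

# Connection to the shared Redis database on the master.
REDIS_HOST = '127.0.0.1'
REDIS_PORT = 6379
# Or equivalently: REDIS_URL = 'redis://127.0.0.1:6379'

# Optional: collect every scraped item into a Redis list on the master.
ITEM_PIPELINES = {
    'scrapy_redis.pipelines.RedisPipeline': 300,
}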

Source code: Scrapy-Redis

  • /scrapy_redis/connection.py
# Basic helpers for connecting to Redis.
import six

from scrapy.utils.misc import load_object

from . import defaults


# Shortcut maps 'setting name' -> 'parameter name'.
# Connection parameters for Redis, i.e. the Redis database on the master host;
# all slaves are configured with the same values.
SETTINGS_PARAMS_MAP = {
    'REDIS_URL': 'url',
    'REDIS_HOST': 'host',
    'REDIS_PORT': 'port',
    'REDIS_ENCODING': 'encoding',
}


# Read the Redis parameters from the Scrapy settings.
def get_redis_from_settings(settings):
    """Returns a redis client instance from given Scrapy settings object.

    This function uses ``get_client`` to instantiate the client and uses
    ``defaults.REDIS_PARAMS`` global as defaults values for the parameters. You
    can override them using the ``REDIS_PARAMS`` setting.

    Parameters
    ----------
    settings : Settings
        A scrapy settings object. See the supported settings below.

    Returns
    -------
    server
        Redis client instance.

    Other Parameters
    ----------------
    REDIS_URL : str, optional
        Server connection URL.
    REDIS_HOST : str, optional
        Server host.
    REDIS_PORT : str, optional
        Server port.
    REDIS_ENCODING : str, optional
        Data encoding.
    REDIS_PARAMS : dict, optional
        Additional client parameters.

    """
    params = defaults.REDIS_PARAMS.copy()
    params.update(settings.getdict('REDIS_PARAMS'))
    # XXX: Deprecate REDIS_* settings.
    for source, dest in SETTINGS_PARAMS_MAP.items():
        val = settings.get(source)
        if val:
            params[dest] = val

    # Allow ``redis_cls`` to be a path to a class.
    if isinstance(params.get('redis_cls'), six.string_types):
        params['redis_cls'] = load_object(params['redis_cls'])

    return get_redis(**params)


# Backwards compatible alias.
from_settings = get_redis_from_settings


# Create the Redis client.
def get_redis(**kwargs):
    """Returns a redis client instance.

    Parameters
    ----------
    redis_cls : class, optional
        Defaults to ``redis.StrictRedis``.
    url : str, optional
        If given, ``redis_cls.from_url`` is used to instantiate the class.
    **kwargs
        Extra parameters to be passed to the ``redis_cls`` class.

    Returns
    -------
    server
        Redis client instance.

    """
    redis_cls = kwargs.pop('redis_cls', defaults.REDIS_CLS)
    url = kwargs.pop('url', None)
    if url:
        return redis_cls.from_url(url, **kwargs)
    else:
        return redis_cls(**kwargs)
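
As a quick illustration, these helpers can also be used on their own outside a crawler; the host and port below are placeholders:

from scrapy_redis.connection import get_redis

# Hypothetical standalone use; inside Scrapy, from_settings() is called for you.
server = get_redis(host='127.0.0.1', port=6379)
server.ping()  # raises an exception if the master's Redis is unreachable
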
  • /scrapy_redis/defaults.py
# Default values used across the library.
import redis

# For standalone use.
DUPEFILTER_KEY = 'dupefilter:%(timestamp)s'

PIPELINE_KEY = '%(spider)s:items'

REDIS_CLS = redis.StrictRedis
REDIS_ENCODING = 'utf-8'
# Sane connection defaults.
REDIS_PARAMS = {
    'socket_timeout': 30,
    'socket_connect_timeout': 30,
    'retry_on_timeout': True,
    'encoding': REDIS_ENCODING,
}

SCHEDULER_QUEUE_KEY = '%(spider)s:requests'
SCHEDULER_QUEUE_CLASS = 'scrapy_redis.queue.PriorityQueue'
SCHEDULER_DUPEFILTER_KEY = '%(spider)s:dupefilter'
SCHEDULER_DUPEFILTER_CLASS = 'scrapy_redis.dupefilter.RFPDupeFilter'

START_URLS_KEY = '%(name)s:start_urls'
START_URLS_AS_SET = False
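
To make the key templates concrete: for a spider named myspider (a made-up name), the keys that end up in Redis look like this:

from scrapy_redis import defaults

defaults.SCHEDULER_QUEUE_KEY % {'spider': 'myspider'}       # 'myspider:requests'
defaults.SCHEDULER_DUPEFILTER_KEY % {'spider': 'myspider'}  # 'myspider:dupefilter'
defaults.PIPELINE_KEY % {'spider': 'myspider'}              # 'myspider:items'
defaults.START_URLS_KEY % {'name': 'myspider'}              # 'myspider:start_urls'
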
  • /scrapy_redis/dupefilter.py
# Deduplication based on a Redis set.
import logging
import time

from scrapy.dupefilters import BaseDupeFilter
from scrapy.utils.request import request_fingerprint

from . import defaults
from .connection import get_redis_from_settings


logger = logging.getLogger(__name__)


# TODO: Rename class to RedisDupeFilter.
class RFPDupeFilter(BaseDupeFilter):
    """Redis-based request duplicates filter.

    This class can also be used with default Scrapy's scheduler.

    """

    logger = logger

    def __init__(self, server, key, debug=False):
        """Initialize the duplicates filter.

        Parameters
        ----------
        server : redis.StrictRedis
            The redis server instance.
        key : str
            Redis key Where to store fingerprints.
        debug : bool, optional
            Whether to log filtered requests.

        """
        # Store the basic attributes.
        self.server = server
        self.key = key
        self.debug = debug
        self.logdupes = True

    # Build an instance from the settings.
    @classmethod
    def from_settings(cls, settings):
        """Returns an instance from given settings.

        This uses by default the key ``dupefilter:<timestamp>``. When using the
        ``scrapy_redis.scheduler.Scheduler`` class, this method is not used as
        it needs to pass the spider name in the key.

        Parameters
        ----------
        settings : scrapy.settings.Settings

        Returns
        -------
        RFPDupeFilter
            A RFPDupeFilter instance.

        """
        server = get_redis_from_settings(settings)
        # XXX: This creates one-time key. needed to support to use this
        # class as standalone dupefilter with scrapy's default scheduler
        # if scrapy passes spider on open() method this wouldn't be needed
        # TODO: Use SCRAPY_JOB env as default and fallback to timestamp.
        key = defaults.DUPEFILTER_KEY % {'timestamp': int(time.time())}
        debug = settings.getbool('DUPEFILTER_DEBUG')
        return cls(server, key=key, debug=debug)

    @classmethod
    def from_crawler(cls, crawler):
        """Returns instance from crawler.

        Parameters
        ----------
        crawler : scrapy.crawler.Crawler

        Returns
        -------
        RFPDupeFilter
            Instance of RFPDupeFilter.

        """
        return cls.from_settings(crawler.settings)

    # Check whether an identical request has already been scheduled.
    def request_seen(self, request):
        """Returns True if request was already seen.

        Parameters
        ----------
        request : scrapy.http.Request

        Returns
        -------
        bool

        """
        # Compute the request fingerprint.
        fp = self.request_fingerprint(request)
        # This returns the number of values added, zero if already exists.
        # SADD inserts the fingerprint into the Redis set; a successful insert
        # (return value 1) means the request has not been seen before.
        added = self.server.sadd(self.key, fp)
        return added == 0

    # Compute the fingerprint for a request.
    def request_fingerprint(self, request):
        """Returns a fingerprint for a given request.

        Parameters
        ----------
        request : scrapy.http.Request

        Returns
        -------
        str

        """
        # Delegates to Scrapy's request_fingerprint utility.
        return request_fingerprint(request)

    @classmethod
    def from_spider(cls, spider):
        settings = spider.settings
        server = get_redis_from_settings(settings)
        dupefilter_key = settings.get("SCHEDULER_DUPEFILTER_KEY", defaults.SCHEDULER_DUPEFILTER_KEY)
        key = dupefilter_key % {'spider': spider.name}
        debug = settings.getbool('DUPEFILTER_DEBUG')
        return cls(server, key=key, debug=debug)

    def close(self, reason=''):
        """Delete data on close. Called by Scrapy's scheduler.

        Parameters
        ----------
        reason : str, optional

        """
        self.clear()

    def clear(self):
        """Clears fingerprints data."""
        self.server.delete(self.key)

    def log(self, request, spider):
        """Logs given request.

        Parameters
        ----------
        request : scrapy.http.Request
        spider : scrapy.spiders.Spider

        """
        if self.debug:
            msg = "Filtered duplicate request: %(request)s"
            self.logger.debug(msg, {'request': request}, extra={'spider': spider})
        elif self.logdupes:
            msg = ("Filtered duplicate request %(request)s"
                   " - no more duplicates will be shown"
                   " (see DUPEFILTER_DEBUG to show all duplicates)")
            self.logger.debug(msg, {'request': request}, extra={'spider': spider})
            self.logdupes = False
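
As the from_settings docstring notes, this filter also works with Scrapy's default (non-distributed) scheduler; a sketch of that standalone setup, which only swaps the dupefilter:

# settings.py (sketch) -- Redis-backed dedup with Scrapy's own scheduler.
DUPEFILTER_CLASS = 'scrapy_redis.dupefilter.RFPDupeFilter'
DUPEFILTER_DEBUG = True  # optional: log every filtered duplicate request
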
  • /scrapy_redis/picklecompat.py
"""A pickle wrapper module with protocol=-1 by default."""
# pickle提供了一个简单的持久化功能。可以将对象以文件的形式存放在磁盘上。try:import cPickle as pickle  # PY2
except ImportError:import pickledef loads(s):return pickle.loads(s)def dumps(obj):return pickle.dumps(obj, protocol=-1)
  • /scrapy_redis/pipelines.py
from scrapy.utils.misc import load_object
from scrapy.utils.serialize import ScrapyJSONEncoder
from twisted.internet.threads import deferToThread

from . import connection, defaults


default_serialize = ScrapyJSONEncoder().encode


# Collects every scraped item centrally on the master host; enabling this
# pipeline may cost some bandwidth/throughput.
class RedisPipeline(object):
    """Pushes serialized item into a redis list/queue

    Settings
    --------
    REDIS_ITEMS_KEY : str
        Redis key where to store items.
    REDIS_ITEMS_SERIALIZER : str
        Object path to serializer function.

    """

    def __init__(self, server,
                 key=defaults.PIPELINE_KEY,
                 serialize_func=default_serialize):
        """Initialize pipeline.

        Parameters
        ----------
        server : StrictRedis
            Redis client instance.
        key : str
            Redis key where to store items.
        serialize_func : callable
            Items serializer function.

        """
        self.server = server
        self.key = key
        self.serialize = serialize_func

    @classmethod
    def from_settings(cls, settings):
        params = {
            'server': connection.from_settings(settings),
        }
        if settings.get('REDIS_ITEMS_KEY'):
            params['key'] = settings['REDIS_ITEMS_KEY']
        if settings.get('REDIS_ITEMS_SERIALIZER'):
            params['serialize_func'] = load_object(
                settings['REDIS_ITEMS_SERIALIZER']
            )
        return cls(**params)

    @classmethod
    def from_crawler(cls, crawler):
        return cls.from_settings(crawler.settings)

    # The main entry point.
    def process_item(self, item, spider):
        # Defer the actual work to a thread so the reactor is not blocked.
        return deferToThread(self._process_item, item, spider)

    def _process_item(self, item, spider):
        key = self.item_key(item, spider)
        # Serialize the item.
        data = self.serialize(item)
        # Push the item onto the Redis list so it is stored on the master host.
        self.server.rpush(key, data)
        return item

    def item_key(self, item, spider):
        """Returns redis key based on given spider.

        Override this function to use a different key depending on the item
        and/or spider.

        """
        return self.key % {'spider': spider.name}
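
The pipeline RPUSHes each JSON-serialized item onto the '<spider>:items' list on the master, so any other process can drain and store them. A minimal consumer sketch, assuming redis-py and a spider named myspider (illustrative name):

import json

import redis

server = redis.StrictRedis(host='127.0.0.1', port=6379)

while True:
    # BLPOP blocks until an item arrives and returns a (key, value) tuple.
    _key, data = server.blpop('myspider:items')
    item = json.loads(data)
    print(item)
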
  • /scrapy_redis/queue.py
from scrapy.utils.reqser import request_to_dict, request_from_dict

from . import picklecompat


# Three queue implementations are maintained here.
# Base class.
class Base(object):
    """Per-spider base queue class"""

    def __init__(self, server, spider, key, serializer=None):
        """Initialize per-spider redis queue.

        Parameters
        ----------
        server : StrictRedis
            Redis client instance.
        spider : Spider
            Scrapy spider instance.
        key: str
            Redis key where to put and get messages.
        serializer : object
            Serializer object with ``loads`` and ``dumps`` methods.

        """
        if serializer is None:
            # Backward compatibility.
            # TODO: deprecate pickle.
            serializer = picklecompat
        if not hasattr(serializer, 'loads'):
            raise TypeError("serializer does not implement 'loads' function: %r"
                            % serializer)
        if not hasattr(serializer, 'dumps'):
            raise TypeError("serializer '%s' does not implement 'dumps' function: %r"
                            % serializer)

        self.server = server
        self.spider = spider
        self.key = key % {'spider': spider.name}
        self.serializer = serializer

    def _encode_request(self, request):
        """Encode a request object"""
        obj = request_to_dict(request, self.spider)
        return self.serializer.dumps(obj)

    def _decode_request(self, encoded_request):
        """Decode an request previously encoded"""
        obj = self.serializer.loads(encoded_request)
        return request_from_dict(obj, self.spider)

    def __len__(self):
        """Return the length of the queue"""
        raise NotImplementedError

    def push(self, request):
        """Push a request"""
        raise NotImplementedError

    def pop(self, timeout=0):
        """Pop a request"""
        raise NotImplementedError

    def clear(self):
        """Clear queue/stack"""
        self.server.delete(self.key)


class FifoQueue(Base):
    """Per-spider FIFO queue"""

    def __len__(self):
        """Return the length of the queue"""
        return self.server.llen(self.key)

    def push(self, request):
        """Push a request"""
        self.server.lpush(self.key, self._encode_request(request))

    def pop(self, timeout=0):
        """Pop a request"""
        if timeout > 0:
            data = self.server.brpop(self.key, timeout)
            if isinstance(data, tuple):
                data = data[1]
        else:
            data = self.server.rpop(self.key)
        if data:
            return self._decode_request(data)


# Priority queue: requests are stored along with their priority
# (used as the sorted-set score).
class PriorityQueue(Base):
    """Per-spider priority queue abstraction using redis' sorted set"""

    def __len__(self):
        """Return the length of the queue"""
        return self.server.zcard(self.key)

    def push(self, request):
        """Push a request"""
        data = self._encode_request(request)
        score = -request.priority
        # We don't use zadd method as the order of arguments change depending on
        # whether the class is Redis or StrictRedis, and the option of using
        # kwargs only accepts strings, not bytes.
        self.server.execute_command('ZADD', self.key, score, data)

    def pop(self, timeout=0):
        """Pop a request

        timeout not support in this queue class

        """
        # use atomic range/remove using multi/exec
        pipe = self.server.pipeline()
        pipe.multi()
        pipe.zrange(self.key, 0, 0).zremrangebyrank(self.key, 0, 0)
        results, count = pipe.execute()
        if results:
            return self._decode_request(results[0])


class LifoQueue(Base):
    """Per-spider LIFO queue."""

    def __len__(self):
        """Return the length of the stack"""
        return self.server.llen(self.key)

    def push(self, request):
        """Push a request"""
        self.server.lpush(self.key, self._encode_request(request))

    def pop(self, timeout=0):
        """Pop a request"""
        if timeout > 0:
            data = self.server.blpop(self.key, timeout)
            if isinstance(data, tuple):
                data = data[1]
        else:
            data = self.server.lpop(self.key)
        if data:
            return self._decode_request(data)


# TODO: Deprecate the use of these names.
SpiderQueue = FifoQueue
SpiderStack = LifoQueue
SpiderPriorityQueue = PriorityQueue
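
Which of the three queues the scheduler actually uses is chosen by SCHEDULER_QUEUE_CLASS (the priority queue is the default, as defaults.py shows):

# settings.py (sketch) -- pick one of the three implementations above.
SCHEDULER_QUEUE_CLASS = 'scrapy_redis.queue.PriorityQueue'  # default, sorted set
# SCHEDULER_QUEUE_CLASS = 'scrapy_redis.queue.FifoQueue'    # plain list, FIFO order
# SCHEDULER_QUEUE_CLASS = 'scrapy_redis.queue.LifoQueue'    # plain list, LIFO order
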
  • /scrapy_redis/scheduler.py
import importlib
import six

from scrapy.utils.misc import load_object

from . import connection, defaults


# TODO: add SCRAPY_JOB support.
class Scheduler(object):
    """Redis-based scheduler

    Settings
    --------
    # When enabled, the fingerprint set and request queue are not cleared
    # after the crawl finishes.
    SCHEDULER_PERSIST : bool (default: False)
        Whether to persist or clear redis queue.
    # True = the Redis queue is flushed every time the crawler starts.
    SCHEDULER_FLUSH_ON_START : bool (default: False)
        Whether to flush redis queue on start.
    SCHEDULER_IDLE_BEFORE_CLOSE : int (default: 0)
        How many seconds to wait before closing if no message is received.
    SCHEDULER_QUEUE_KEY : str
        Scheduler redis key.
    SCHEDULER_QUEUE_CLASS : str
        Scheduler queue class.
    SCHEDULER_DUPEFILTER_KEY : str
        Scheduler dupefilter redis key.
    SCHEDULER_DUPEFILTER_CLASS : str
        Scheduler dupefilter class.
    SCHEDULER_SERIALIZER : str
        Scheduler serializer.

    """

    def __init__(self, server,
                 persist=False,
                 flush_on_start=False,
                 queue_key=defaults.SCHEDULER_QUEUE_KEY,
                 queue_cls=defaults.SCHEDULER_QUEUE_CLASS,
                 dupefilter_key=defaults.SCHEDULER_DUPEFILTER_KEY,
                 dupefilter_cls=defaults.SCHEDULER_DUPEFILTER_CLASS,
                 idle_before_close=0,
                 serializer=None):
        """Initialize scheduler.

        Parameters
        ----------
        server : Redis
            The redis server instance.
        persist : bool
            Whether to flush requests when closing. Default is False.
        flush_on_start : bool
            Whether to flush requests on start. Default is False.
        queue_key : str
            Requests queue key.
        queue_cls : str
            Importable path to the queue class.
        dupefilter_key : str
            Duplicates filter key.
        dupefilter_cls : str
            Importable path to the dupefilter class.
        idle_before_close : int
            Timeout before giving up.

        """
        if idle_before_close < 0:
            raise TypeError("idle_before_close cannot be negative")

        self.server = server
        self.persist = persist
        self.flush_on_start = flush_on_start
        self.queue_key = queue_key
        self.queue_cls = queue_cls
        self.dupefilter_cls = dupefilter_cls
        self.dupefilter_key = dupefilter_key
        self.idle_before_close = idle_before_close
        self.serializer = serializer
        self.stats = None

    def __len__(self):
        return len(self.queue)

    @classmethod
    def from_settings(cls, settings):
        kwargs = {
            'persist': settings.getbool('SCHEDULER_PERSIST'),
            'flush_on_start': settings.getbool('SCHEDULER_FLUSH_ON_START'),
            'idle_before_close': settings.getint('SCHEDULER_IDLE_BEFORE_CLOSE'),
        }

        # If these values are missing, it means we want to use the defaults.
        optional = {
            # TODO: Use custom prefixes for this settings to note that are
            # specific to scrapy-redis.
            'queue_key': 'SCHEDULER_QUEUE_KEY',
            'queue_cls': 'SCHEDULER_QUEUE_CLASS',
            'dupefilter_key': 'SCHEDULER_DUPEFILTER_KEY',
            # We use the default setting name to keep compatibility.
            'dupefilter_cls': 'DUPEFILTER_CLASS',
            'serializer': 'SCHEDULER_SERIALIZER',
        }
        for name, setting_name in optional.items():
            val = settings.get(setting_name)
            if val:
                kwargs[name] = val

        # Support serializer as a path to a module.
        if isinstance(kwargs.get('serializer'), six.string_types):
            kwargs['serializer'] = importlib.import_module(kwargs['serializer'])

        server = connection.from_settings(settings)
        # Ensure the connection is working.
        server.ping()

        return cls(server=server, **kwargs)

    @classmethod
    def from_crawler(cls, crawler):
        instance = cls.from_settings(crawler.settings)
        # FIXME: for now, stats are only supported from this constructor
        instance.stats = crawler.stats
        return instance

    def open(self, spider):
        self.spider = spider

        try:
            self.queue = load_object(self.queue_cls)(
                server=self.server,
                spider=spider,
                key=self.queue_key % {'spider': spider.name},
                serializer=self.serializer,
            )
        except TypeError as e:
            raise ValueError("Failed to instantiate queue class '%s': %s",
                             self.queue_cls, e)

        self.df = load_object(self.dupefilter_cls).from_spider(spider)

        if self.flush_on_start:
            self.flush()
        # notice if there are requests already in the queue to resume the crawl
        if len(self.queue):
            spider.log("Resuming crawl (%d requests scheduled)" % len(self.queue))

    def close(self, reason):
        if not self.persist:
            self.flush()

    def flush(self):
        self.df.clear()
        self.queue.clear()

    def enqueue_request(self, request):
        if not request.dont_filter and self.df.request_seen(request):
            self.df.log(request, self.spider)
            return False
        if self.stats:
            self.stats.inc_value('scheduler/enqueued/redis', spider=self.spider)
        self.queue.push(request)
        return True

    def next_request(self):
        block_pop_timeout = self.idle_before_close
        request = self.queue.pop(block_pop_timeout)
        if request and self.stats:
            self.stats.inc_value('scheduler/dequeued/redis', spider=self.spider)
        return request

    def has_pending_requests(self):
        return len(self) > 0
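
Two of these settings answer the "how to guard against interruptions" question from earlier: SCHEDULER_PERSIST keeps the queue and fingerprint set in Redis when the crawler closes, while SCHEDULER_FLUSH_ON_START controls whether they are wiped on startup. A sketch of the resume-friendly combination (values are examples, not requirements):

# settings.py (sketch) -- keep state in Redis so an interrupted crawl resumes.
SCHEDULER_PERSIST = True          # close() will not flush the queue/dupefilter
SCHEDULER_FLUSH_ON_START = False  # open() keeps whatever is already queued
SCHEDULER_IDLE_BEFORE_CLOSE = 10  # blocking-pop timeout used in next_request()
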
  • /scrapy_redis/spiders.py
# Spiders that read their start URLs from Redis.
from scrapy import signals
from scrapy.exceptions import DontCloseSpider
from scrapy.spiders import Spider, CrawlSpider

from . import connection, defaults
from .utils import bytes_to_str


class RedisMixin(object):
    """Mixin class to implement reading urls from a redis queue."""
    redis_key = None
    redis_batch_size = None
    redis_encoding = None

    # Redis client placeholder.
    server = None

    def start_requests(self):
        """Returns a batch of start requests from redis."""
        return self.next_requests()

    def setup_redis(self, crawler=None):
        """Setup redis connection and idle signal.

        This should be called after the spider has set its crawler object.
        """
        if self.server is not None:
            return

        if crawler is None:
            # We allow optional crawler argument to keep backwards
            # compatibility.
            # XXX: Raise a deprecation warning.
            crawler = getattr(self, 'crawler', None)

        if crawler is None:
            raise ValueError("crawler is required")

        settings = crawler.settings

        if self.redis_key is None:
            self.redis_key = settings.get(
                'REDIS_START_URLS_KEY', defaults.START_URLS_KEY,
            )

        self.redis_key = self.redis_key % {'name': self.name}

        if not self.redis_key.strip():
            raise ValueError("redis_key must not be empty")

        if self.redis_batch_size is None:
            # TODO: Deprecate this setting (REDIS_START_URLS_BATCH_SIZE).
            self.redis_batch_size = settings.getint(
                'REDIS_START_URLS_BATCH_SIZE',
                settings.getint('CONCURRENT_REQUESTS'),
            )

        try:
            self.redis_batch_size = int(self.redis_batch_size)
        except (TypeError, ValueError):
            raise ValueError("redis_batch_size must be an integer")

        if self.redis_encoding is None:
            self.redis_encoding = settings.get('REDIS_ENCODING', defaults.REDIS_ENCODING)

        self.logger.info("Reading start URLs from redis key '%(redis_key)s' "
                         "(batch size: %(redis_batch_size)s, encoding: %(redis_encoding)s",
                         self.__dict__)

        self.server = connection.from_settings(crawler.settings)
        # The idle signal is called when the spider has no requests left,
        # that's when we will schedule new requests from redis queue
        crawler.signals.connect(self.spider_idle, signal=signals.spider_idle)

    def next_requests(self):
        """Returns a request to be scheduled or none."""
        use_set = self.settings.getbool('REDIS_START_URLS_AS_SET', defaults.START_URLS_AS_SET)
        fetch_one = self.server.spop if use_set else self.server.lpop
        # XXX: Do we need to use a timeout here?
        found = 0
        # TODO: Use redis pipeline execution.
        while found < self.redis_batch_size:
            data = fetch_one(self.redis_key)
            if not data:
                # Queue empty.
                break
            req = self.make_request_from_data(data)
            if req:
                yield req
                found += 1
            else:
                self.logger.debug("Request not made from data: %r", data)

        if found:
            self.logger.debug("Read %s requests from '%s'", found, self.redis_key)

    def make_request_from_data(self, data):
        """Returns a Request instance from data coming from Redis.

        By default, ``data`` is an encoded URL. You can override this method to
        provide your own message decoding.

        Parameters
        ----------
        data : bytes
            Message from redis.

        """
        url = bytes_to_str(data, self.redis_encoding)
        return self.make_requests_from_url(url)

    def schedule_next_requests(self):
        """Schedules a request if available"""
        # TODO: While there is capacity, schedule a batch of redis requests.
        for req in self.next_requests():
            self.crawler.engine.crawl(req, spider=self)

    def spider_idle(self):
        """Schedules a request if available, otherwise waits."""
        # XXX: Handle a sentinel to close the spider.
        self.schedule_next_requests()
        raise DontCloseSpider


class RedisSpider(RedisMixin, Spider):
    """Spider that reads urls from redis queue when idle.

    Attributes
    ----------
    redis_key : str (default: REDIS_START_URLS_KEY)
        Redis key where to fetch start URLs from.
    redis_batch_size : int (default: CONCURRENT_REQUESTS)
        Number of messages to fetch from redis on each attempt.
    redis_encoding : str (default: REDIS_ENCODING)
        Encoding to use when decoding messages from redis queue.

    Settings
    --------
    REDIS_START_URLS_KEY : str (default: "<spider.name>:start_urls")
        Default Redis key where to fetch start URLs from.
    REDIS_START_URLS_BATCH_SIZE : int (deprecated by CONCURRENT_REQUESTS)
        Default number of messages to fetch from redis on each attempt.
    REDIS_START_URLS_AS_SET : bool (default: False)
        Use SET operations to retrieve messages from the redis queue. If False,
        the messages are retrieve using the LPOP command.
    REDIS_ENCODING : str (default: "utf-8")
        Default encoding to use when decoding messages from redis queue.

    """

    @classmethod
    def from_crawler(self, crawler, *args, **kwargs):
        obj = super(RedisSpider, self).from_crawler(crawler, *args, **kwargs)
        obj.setup_redis(crawler)
        return obj


class RedisCrawlSpider(RedisMixin, CrawlSpider):
    """Spider that reads urls from redis queue when idle.

    Attributes
    ----------
    redis_key : str (default: REDIS_START_URLS_KEY)
        Redis key where to fetch start URLs from.
    redis_batch_size : int (default: CONCURRENT_REQUESTS)
        Number of messages to fetch from redis on each attempt.
    redis_encoding : str (default: REDIS_ENCODING)
        Encoding to use when decoding messages from redis queue.

    Settings
    --------
    REDIS_START_URLS_KEY : str (default: "<spider.name>:start_urls")
        Default Redis key where to fetch start URLs from.
    REDIS_START_URLS_BATCH_SIZE : int (deprecated by CONCURRENT_REQUESTS)
        Default number of messages to fetch from redis on each attempt.
    REDIS_START_URLS_AS_SET : bool (default: True)
        Use SET operations to retrieve messages from the redis queue.
    REDIS_ENCODING : str (default: "utf-8")
        Default encoding to use when decoding messages from redis queue.

    """

    @classmethod
    def from_crawler(self, crawler, *args, **kwargs):
        obj = super(RedisCrawlSpider, self).from_crawler(crawler, *args, **kwargs)
        obj.setup_redis(crawler)
        return obj
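
Putting the mixin to use: a minimal spider sketch that waits for start URLs in Redis. The spider name, key and parse logic below are made up for illustration.

from scrapy_redis.spiders import RedisSpider


class DemoSpider(RedisSpider):
    """Reads start URLs from the 'demo:start_urls' Redis list."""
    name = 'demo'
    redis_key = 'demo:start_urls'  # this is also the default, '<name>:start_urls'

    def parse(self, response):
        yield {'url': response.url}

Nothing happens until a URL is pushed onto that key, e.g. server.lpush('demo:start_urls', 'http://example.com') from any process that can reach the master (or the equivalent redis-cli lpush command).
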
  • /scrapy_redis/utils.py
# Utility helpers.
import six


def bytes_to_str(s, encoding='utf-8'):
    """Returns a str if a bytes object is given."""
    if six.PY3 and isinstance(s, bytes):
        return s.decode(encoding)
    return s
