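This article is a translation of http://www.redis.io/commands/scan, with annotations, expanded examples, and other modifications. Thanks to Antirez, the author of Redis, for his contributions to the open source community; as a mark of respect, this article is kept as current, accurate, and complete as possible. Comments with corrections, update notices, or support are welcome.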
SCAN cursor [MATCH pattern] [COUNT count]

The SCAN command and the closely related commands SSCAN, HSCAN, and ZSCAN are used to incrementally iterate over a collection of elements.
  • SCAN iterates the set of keys in the currently selected Redis database.
  • SSCAN iterates the elements of a Set.
  • HSCAN iterates the fields of a Hash and their associated values.
  • ZSCAN iterates the elements of a Sorted Set and their associated scores.

Because these commands allow incremental iteration, returning only a small number of elements per call, they can be used in production without the downside of commands like KEYS or SMEMBERS, which may block the server for a long time (even several seconds) when called against large collections of keys or elements.

However, while blocking commands like SMEMBERS can return all the elements that are part of a Set at a given moment, the SCAN family of commands offers only limited guarantees about the returned elements, because the collection being iterated can change during the iteration.

Differences between SCAN, SSCAN, HSCAN, and ZSCAN:

SCAN, SSCAN, HSCAN, and ZSCAN all work very similarly, so this documentation covers all four commands. The obvious difference is that for SSCAN, HSCAN, and ZSCAN the first argument is the name of the key holding the Set, Hash, or Sorted Set value. The SCAN command does not need a key name argument because it iterates the keys in the current database, so the iterated object is the database itself.
  • SCAN, SSCAN, HSCAN, and ZSCAN return a two-element multi-bulk reply: the first element is a string representing an unsigned 64-bit number (the cursor), and the second element is a multi-bulk with an array of elements.

    • For SCAN, the array of elements is a list of keys.
    • For SSCAN, the array of elements is a list of Set members.
    • For HSCAN, the array of elements contains two entries, a field and a value, for every returned element of the Hash.
    • For ZSCAN, the array of elements contains two entries, a member and its associated score, for every returned element of the Sorted Set.
  • Example 1

    redis 127.0.0.1:6379> scan 0
    1) "17"
    2)  1) "key:12"
        2) "key:8"
        3) "key:4"
        4) "key:14"
        5) "key:16"
        6) "key:17"
        7) "key:15"
        8) "key:10"
        9) "key:3"
       10) "key:7"
       11) "key:1"
    redis 127.0.0.1:6379> scan 17
    1) "0"
    2) 1) "key:5"
       2) "key:18"
       3) "key:0"
       4) "key:2"
       5) "key:19"
       6) "key:13"
       7) "key:6"
       8) "key:9"
       9) "key:11"

    Example 2

    redis 127.0.0.1:6379> sadd myset 1 2 3 foo foobar feelsgood
    (integer) 6
    redis 127.0.0.1:6379> sscan myset 0 match f*
    1) "0"
    2) 1) "foo"
       2) "feelsgood"
       3) "foobar"

    Example 3

    redis 127.0.0.1:6379> scan 0 MATCH *11*
    1) "288"
    2) 1) "key:911"
    redis 127.0.0.1:6379> scan 288 MATCH *11*
    1) "224"
    2) (empty list or set)
    redis 127.0.0.1:6379> scan 224 MATCH *11*
    1) "80"
    2) (empty list or set)
    redis 127.0.0.1:6379> scan 80 MATCH *11*
    1) "176"
    2) (empty list or set)
    redis 127.0.0.1:6379> scan 176 MATCH *11* COUNT 1000
    1) "0"
    2)  1) "key:611"
        2) "key:711"
        3) "key:118"
        4) "key:117"
        5) "key:311"
        6) "key:112"
        7) "key:111"
        8) "key:110"
        9) "key:113"
       10) "key:211"
       11) "key:411"
       12) "key:115"
       13) "key:116"
       14) "key:114"
       15) "key:119"
       16) "key:811"
       17) "key:511"
       18) "key:11"

    Example 4

    redis 127.0.0.1:6379> hmset hash name Jack age 33
    OK
    redis 127.0.0.1:6379> hscan hash 0
    1) "0"
    2) 1) "name"
       2) "Jack"
       3) "age"
       4) "33"
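    The HSCAN reply in Example 4 is a flat array in which fields and values alternate, so a client usually has to pair them up. The following is a minimal Python sketch of that pairing, assuming the reply has already been parsed into a Python list; `pair_up` is a hypothetical helper name and the `reply` literal simply copies the output shown above.

```python
from typing import List, Tuple


def pair_up(flat: List[str]) -> List[Tuple[str, str]]:
    """Group a flat [a1, b1, a2, b2, ...] list into (a, b) pairs.

    Hypothetical helper: works for HSCAN (field, value) and
    ZSCAN (member, score) style replies alike.
    """
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of entries")
    return list(zip(flat[0::2], flat[1::2]))


# Reply copied from Example 4: the cursor first, then the flat array.
reply = ["0", ["name", "Jack", "age", "33"]]
cursor, flat = reply

fields = dict(pair_up(flat))
print(cursor, fields)
```

    The same helper would apply to a ZSCAN reply, where each pair is a member followed by its score (still a string at this level).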
  • SCAN basic usage

    SCAN is a cursor based iterator. This means that at every call of the command, the server returns an updated cursor that the user needs to use as the cursor argument in the next call.

    An iteration starts when the cursor is set to 0, and terminates when the cursor returned by the server is 0. The following is an example of SCAN iteration:

    redis 127.0.0.1:6379> scan 0
    1) "17"
    2)  1) "key:12"
        2) "key:8"
        3) "key:4"
        4) "key:14"
        5) "key:16"
        6) "key:17"
        7) "key:15"
        8) "key:10"
        9) "key:3"
       10) "key:7"
       11) "key:1"
    redis 127.0.0.1:6379> scan 17
    1) "0"
    2) 1) "key:5"
       2) "key:18"
       3) "key:0"
       4) "key:2"
       5) "key:19"
       6) "key:13"
       7) "key:6"
       8) "key:9"
       9) "key:11"

    In the example above, the first call uses zero as a cursor, to start the iteration. The second call uses the cursor returned by the previous call as the first element of the reply, that is, 17.

    As you can see the SCAN return value is an array of two values: the first value is the new cursor to use in the next call, the second value is an array of elements.

    Since in the second call the returned cursor is 0, the server signaled to the caller that the iteration finished, and the collection was completely explored. Starting an iteration with a cursor value of 0, and calling SCAN until the returned cursor is 0 again is called a full iteration.
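    The full-iteration loop described above can be sketched in Python. This is only an illustration under stated assumptions: `fake_scan` is a hypothetical in-memory stand-in that uses a list index as its cursor, which is not how Redis computes cursors, and `full_iteration` is an illustrative name. With a real client you would pass a callable that issues SCAN and returns the new cursor together with the page of elements (redis-py's `scan` method, for instance, returns such a pair).

```python
from typing import Callable, List, Tuple


def fake_scan(keys: List[str], cursor: int, count: int = 10) -> Tuple[int, List[str]]:
    """Hypothetical stand-in for SCAN (NOT the real Redis algorithm).

    The cursor is simply an index into `keys`; 0 is returned once the
    end of the list has been reached.
    """
    page = keys[cursor:cursor + count]
    next_cursor = cursor + count
    return (next_cursor if next_cursor < len(keys) else 0), page


def full_iteration(scan_call: Callable[[int], Tuple[int, List[str]]]) -> List[str]:
    """Start at cursor 0 and keep calling until the returned cursor is 0."""
    cursor = 0
    collected: List[str] = []
    while True:
        cursor, page = scan_call(cursor)
        collected.extend(page)  # a page may legitimately be empty
        if cursor == 0:         # only the cursor signals completion
            break
    return collected


keys = [f"key:{i}" for i in range(20)]
result = full_iteration(lambda c: fake_scan(keys, c))
print(len(result))
```

    Note that the loop checks the cursor only after issuing the call, so the initial cursor of 0 starts the iteration rather than ending it.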

  • Scan guarantees

    The SCAN command, and the other commands in the SCAN family, provide the user with a set of guarantees associated with full iterations.
    • A full iteration always retrieves all the elements that were present in the collection from the start to the end of a full iteration. This means that if a given element is inside the collection when an iteration is started, and is still there when an iteration terminates, then at some point SCAN returned it to the user.
    • A full iteration never returns any element that was NOT present in the collection from the start to the end of a full iteration. So if an element was removed before the start of an iteration, and is never added back to the collection for all the time an iteration lasts, SCAN ensures that this element will never be returned.
    However because SCAN has very little state associated (just the cursor) it has the following drawbacks:
    • A given element may be returned multiple times. It is up to the application to handle the case of duplicated elements, for example only using the returned elements in order to perform operations that are safe when re-applied multiple times.
    • Elements that were not constantly present in the collection during a full iteration, may be returned or not: it is undefined.
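    One way for an application to cope with repeated elements is to remember what it has already processed. The sketch below assumes scripted replies: `SCRIPTED` and `scripted_scan` are hypothetical and only imitate a SCAN-like call that happens to return some elements twice.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# Hypothetical scripted replies: cursor -> (next cursor, page).
# "a" and "b" are deliberately returned twice.
SCRIPTED: Dict[int, Tuple[int, List[str]]] = {
    0: (5, ["a", "b"]),
    5: (3, ["b", "c"]),
    3: (0, ["a", "d"]),
}


def scripted_scan(cursor: int) -> Tuple[int, List[str]]:
    """Stand-in for a SCAN call; not a real Redis client."""
    return SCRIPTED[cursor]


def unique_elements(scan_call: Callable[[int], Tuple[int, List[str]]]) -> Iterator[str]:
    """Yield each element once, even if the scan returns it several times."""
    seen = set()
    cursor = 0
    while True:
        cursor, page = scan_call(cursor)
        for item in page:
            if item not in seen:
                seen.add(item)
                yield item
        if cursor == 0:
            return


deduplicated = list(unique_elements(scripted_scan))
print(deduplicated)
```

    The `seen` set grows with the collection, so for very large collections the alternative mentioned above, applying only operations that are safe to repeat, may be the better fit.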

  • Number of elements returned at every SCAN call

    The SCAN family of commands does not guarantee that the number of elements returned per call is in a given range. The commands are also allowed to return zero elements, and the client should not consider the iteration complete as long as the returned cursor is not zero.

    However the number of returned elements is reasonable, that is, in practical terms SCAN may return a maximum number of elements in the order of a few tens of elements when iterating a large collection, or may return all the elements of the collection in a single call when the iterated collection is small enough to be internally represented as an encoded data structure (this happens for small sets, hashes and sorted sets).

    However there is a way for the user to tune the order of magnitude of the number of returned elements per call using the COUNT option.

  • The COUNT option

    While SCAN does not provide guarantees about the number of elements returned at every iteration, it is possible to empirically adjust the behavior of SCAN using the COUNT option. Basically, with COUNT the user specifies the amount of work that should be done at every call in order to retrieve elements from the collection. This is just a hint for the implementation; generally speaking, however, this is what you can expect from the implementation most of the time:
    • The default COUNT value is 10.
    • When iterating the key space, or a Set, Hash or Sorted Set that is big enough to be represented by a hash table, assuming no MATCH option is used, the server will usually return count or a bit more than count elements per call.
    • When iterating Sets encoded as intsets (small sets composed of just integers), or Hashes and Sorted Sets encoded as ziplists (small hashes and sets composed of small individual values), usually all the elements are returned in the first SCAN call regardless of the COUNT value.
    Important: there is no need to use the same COUNT value for every iteration. The caller is free to change the count from one iteration to the other as required, as long as the cursor passed in the next call is the one obtained in the previous call to the command.
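    The point about varying COUNT between calls can be illustrated with a small Python sketch. It assumes a hypothetical `fake_scan` stand-in whose cursor is a list index and which returns exactly `count` elements; real Redis treats COUNT only as a hint and computes cursors differently.

```python
from typing import List, Tuple


def fake_scan(keys: List[str], cursor: int, count: int = 10) -> Tuple[int, List[str]]:
    """Hypothetical stand-in for SCAN (NOT the real Redis algorithm)."""
    page = keys[cursor:cursor + count]
    next_cursor = cursor + count
    return (next_cursor if next_cursor < len(keys) else 0), page


keys = [f"key:{i}" for i in range(12)]

cursor = 0
page_sizes = []
collected = []
# A different COUNT on every call; the only requirement is that each
# call receives the cursor returned by the previous one.
for count in (2, 5, 100):
    cursor, page = fake_scan(keys, cursor, count)
    page_sizes.append(len(page))
    collected.extend(page)
    if cursor == 0:
        break

print(page_sizes, cursor)
```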
  • The MATCH option

    It is possible to iterate only the elements matching a given glob-style pattern, similarly to the behavior of the KEYS command, which takes a pattern as its only argument.
    To do so, just append the MATCH <pattern> arguments at the end of the SCAN command (it works with all the SCAN family commands).
    This is an example of iteration using MATCH:
    redis 127.0.0.1:6379> sadd myset 1 2 3 foo foobar feelsgood
    (integer) 6
    redis 127.0.0.1:6379> sscan myset 0 match f*
    1) "0"
    2) 1) "foo"
       2) "feelsgood"
       3) "foobar"
    redis 127.0.0.1:6379>
    It is important to note that the MATCH filter is applied after elements are retrieved from the collection, just before returning data to the client. This means that if the pattern matches very few elements inside the collection, SCAN will likely return no elements in most iterations. An example is shown below:
    redis 127.0.0.1:6379> scan 0 MATCH *11*
    1) "288"
    2) 1) "key:911"
    redis 127.0.0.1:6379> scan 288 MATCH *11*
    1) "224"
    2) (empty list or set)
    redis 127.0.0.1:6379> scan 224 MATCH *11*
    1) "80"
    2) (empty list or set)
    redis 127.0.0.1:6379> scan 80 MATCH *11*
    1) "176"
    2) (empty list or set)
    redis 127.0.0.1:6379> scan 176 MATCH *11* COUNT 1000
    1) "0"
    2)  1) "key:611"
        2) "key:711"
        3) "key:118"
        4) "key:117"
        5) "key:311"
        6) "key:112"
        7) "key:111"
        8) "key:110"
        9) "key:113"
       10) "key:211"
       11) "key:411"
       12) "key:115"
       13) "key:116"
       14) "key:114"
       15) "key:119"
       16) "key:811"
       17) "key:511"
       18) "key:11"
    redis 127.0.0.1:6379>
    As you can see, most of the calls returned zero elements, except the last call, where a COUNT of 1000 was used to force the command to do more scanning for that iteration.
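    The effect of filtering after retrieval can be reproduced with a small Python sketch. It rests on assumptions: `fake_scan` is a hypothetical stand-in whose cursor is a list index, and Python's `fnmatchcase` is used as an approximation of Redis glob-style matching, which is similar but not identical.

```python
from fnmatch import fnmatchcase
from typing import List, Optional, Tuple


def fake_scan(keys: List[str], cursor: int, match: Optional[str] = None,
              count: int = 10) -> Tuple[int, List[str]]:
    """Hypothetical stand-in for SCAN (NOT the real Redis algorithm).

    A page of `count` keys is retrieved first; the pattern is applied
    afterwards, so a page can end up empty while the cursor is non-zero.
    """
    page = keys[cursor:cursor + count]
    next_cursor = cursor + count
    if match is not None:
        page = [k for k in page if fnmatchcase(k, match)]
    return (next_cursor if next_cursor < len(keys) else 0), page


keys = [f"key:{i}" for i in range(1000)]

cursor = 0
calls = 0
empty_pages = 0
matched = []
while True:
    cursor, page = fake_scan(keys, cursor, match="*11*")
    calls += 1
    if not page:
        empty_pages += 1  # an empty page does not mean the scan is over
    matched.extend(page)
    if cursor == 0:
        break

print(calls, empty_pages, len(matched))
```

    Raising `count` in the sketch reduces the number of calls and empty pages, mirroring the COUNT 1000 call in the transcript above.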
  • Multiple parallel iterations

    It is possible for an unlimited number of clients to iterate the same collection at the same time, because the full state of the iterator is in the cursor, which is obtained and returned to the client at every call. No state is kept on the server side at all.
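    Because all iteration state lives in the cursor, independent iterations can be interleaved freely. The sketch below illustrates this with two hypothetical clients sharing one stateless `fake_scan` stand-in (a list-index cursor, not the real Redis algorithm); the client names and COUNT values are arbitrary.

```python
from typing import Dict, List, Tuple


def fake_scan(keys: List[str], cursor: int, count: int = 10) -> Tuple[int, List[str]]:
    """Hypothetical stateless stand-in for SCAN (NOT the real Redis algorithm)."""
    page = keys[cursor:cursor + count]
    next_cursor = cursor + count
    return (next_cursor if next_cursor < len(keys) else 0), page


keys = [f"key:{i}" for i in range(25)]

# Each client keeps its own cursor; the "server" keeps nothing.
cursors: Dict[str, int] = {"client_a": 0, "client_b": 0}
counts: Dict[str, int] = {"client_a": 4, "client_b": 7}
seen: Dict[str, List[str]] = {"client_a": [], "client_b": []}
done = set()

while len(done) < len(cursors):
    for name in cursors:
        if name in done:
            continue
        cursors[name], page = fake_scan(keys, cursors[name], counts[name])
        seen[name].extend(page)
        if cursors[name] == 0:
            done.add(name)

print(len(seen["client_a"]), len(seen["client_b"]))
```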

  • Terminating iterations in the middle

    Since there is no state server side, but the full state is captured by the cursor, the caller is free to terminate an iteration half-way without signaling this to the server in any way. An infinite number of iterations can be started and never terminated without any issue.

  • Calling SCAN with a corrupted cursor

    Calling SCAN with a broken, negative, out-of-range, or otherwise invalid cursor results in undefined behavior, but never in a crash. What is undefined is that the SCAN implementation can no longer ensure the guarantees about the returned elements.
    The only valid cursors to use are:
    • The cursor value of 0 when starting an iteration.
    • The cursor returned by the previous call to SCAN in order to continue the iteration.
  • Guarantee of termination

    The SCAN algorithm is guaranteed to terminate only if the size of the iterated collection remains bounded by a given maximum size; otherwise, iterating a collection that always grows may cause SCAN to never complete a full iteration.
    This is easy to see intuitively: if the collection grows there is more and more work to do in order to visit all the possible elements, and the ability to terminate the iteration depends on the number of calls to SCAN and its COUNT option value compared with the rate at which the collection grows.
  • Version support

    2.8.0+

    Time complexity

    O(1) for every call. O(N) for a complete iteration, including enough command calls for the cursor to return to 0, where N is the number of elements inside the collection.