Redis NoSQL after researching
Presented by,
Searle Oliveira, Database Administrator, Globo.com
What is Redis?
02 Feb 2010
• DBA at Globo.com
• Certified MySQL Associate ID 222572032
• Long time Linux/MySQL user
• PHP OOP Developer
• Beginner C programming
• Redis Admin Project Owner
About me
• What is Redis?
• Redis x Memcached
• Redis Persistence
• Redis Replication
• Sharding
• Benchmark
• Supported Languages
• Quick Start
• Commands
• Study Case: twitter
• Redis Admin
Schedule
Redis is an advanced persistent key-value database by Salvatore "antirez" Sanfilippo, where every key is associated with a value.
For example, set the key "surname_1992" to the string "Smith":
> SET surname_1992 "Smith"
In other words, you can look at Redis as a data structures server.
The essence of a key-value store is the ability to store some data, called a value, inside a key.
What is Redis?
It is similar to memcached but the dataset is not volatile.
Values can be strings, exactly like in memcached, but also lists, sets, and ordered sets.
All data types can be manipulated with atomic operations.
Redis is free software released under the very liberal BSD license.
Redis x Memcached
1. Redis encourages a column-oriented style of programming data storage.
No transactions, no ACID.
2. Most DataGrids encourage an entity-oriented style: transactional, ACID. A Map is usually a business object or database table. Constrained tree schemas are typical.
Redis x Conventional KV
What makes Redis different from many other key-value stores is that every single value has a type. The following types are supported:
• Strings
• Lists
• Sets
• Sorted Sets
Data Types
The type of a value determines what operations (commands) are available for the value itself.
For example, you can add elements to a list stored at the key "mypostlist" using the LPUSH or RPUSH command:
> LPUSH mypostlist "Hello World!"
Each command is performed through server-side atomic operations.
Data Types
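As a rough illustration of how a value's type restricts the commands that apply to it, here is a minimal in-memory sketch in Python. It is a toy model, not a Redis client: the class name TypedStore and its methods are hypothetical, and the error raised on a type mismatch is this sketch's own choice.

```python
from collections import deque


class TypedStore:
    """Toy model of a key space in which every value has a type."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        # SET always stores a string value.
        self._data[key] = str(value)

    def _list_at(self, key):
        # List commands only work on keys that hold a list (or are empty).
        current = self._data.setdefault(key, deque())
        if not isinstance(current, deque):
            raise TypeError("key holds a value of the wrong type")
        return current

    def lpush(self, key, value):
        self._list_at(key).appendleft(value)  # add at the head

    def rpush(self, key, value):
        self._list_at(key).append(value)      # add at the tail

    def type(self, key):
        value = self._data.get(key)
        if value is None:
            return "none"
        return "list" if isinstance(value, deque) else "string"
```

Pushing onto "mypostlist" creates a list, while pushing onto a key previously written with set is rejected.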
Redis loads and maintains the whole dataset in memory, but the dataset is persistent: it is also saved on disk, so that when the server is restarted the data can be loaded back into memory.
There are two kinds of persistence supported: semi persistent mode (Snapshotting) and fully persistent mode (Append Only File).
Data in memory, but saved on disk
Redis is very fast but at the same time persistent: the whole dataset is kept in memory.
• Semi persistent mode: from time to time, data is saved to disk asynchronously.
• Fully persistent mode: alternatively, every change is written to an append only file.
Redis Persistence
In this mode Redis, from time to time, writes a dump on disk asynchronously. The dataset is loaded from the dump every time the server is (re)started.
Redis can be configured to save the dataset when a certain number of changes is reached and after a given number of seconds elapses.
Because data is written asynchronously, when a system crash occurs, the last few queries can get lost (that is acceptable in many applications but not in all).
Semi Persistent Mode
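The save rule described on this slide (a number of changes reached and a number of seconds elapsed) can be sketched as a small Python predicate. The function name and the example save points are assumptions made for illustration; they are not taken from the slides.

```python
def should_snapshot(changes_since_save, seconds_since_save, save_points):
    """Return True when at least one (seconds, changes) save point is met.

    A save point is met only when BOTH thresholds are reached: enough time
    has elapsed AND enough changes have accumulated since the last dump.
    """
    return any(
        seconds_since_save >= seconds and changes_since_save >= changes
        for seconds, changes in save_points
    )


# Hypothetical save points: (seconds elapsed, minimum number of changes).
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]
```

With these example values, a single change is dumped after 900 seconds, while a burst of 10000 changes is dumped after only 60 seconds. Anything written after the last dump is what can be lost in a crash.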
This mode is called Append Only File: every command received that alters the dataset (a write command, not a read-only command) is written to an append only file as soon as possible.
These commands are replayed when the server is restarted in order to rebuild the dataset in memory.
Append Only File supports a very handy feature: when the file gets too long, the server can safely rebuild it in the background in a non-blocking fashion.
Fully Persistent Mode
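The log-and-replay idea can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the class AppendOnlyStore is hypothetical, a Python list stands in for the file on disk, only SET and DEL are modelled, and the rewrite runs synchronously rather than in the background.

```python
class AppendOnlyStore:
    """Toy store that logs every write and rebuilds itself by replay."""

    def __init__(self, log=None):
        self.data = {}
        self.log = []  # stands in for the append only file on disk
        for command in (log or []):
            self._apply(command)       # replay on startup
            self.log.append(command)

    def _apply(self, command):
        name, key, *args = command
        if name == "SET":
            self.data[key] = args[0]
        elif name == "DEL":
            self.data.pop(key, None)
        else:
            raise ValueError(f"unknown write command: {name}")

    def execute(self, *command):
        self._apply(command)
        self.log.append(command)       # only write commands reach the log

    def get(self, key):
        return self.data.get(key)      # read-only, never logged

    def rewrite_log(self):
        """Compact the log to the minimal commands that rebuild the data."""
        self.log = [("SET", key, value) for key, value in self.data.items()]
```

Replaying the compacted log yields the same dataset as replaying the full history, which is why the rewrite is safe.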
Redis can be used as a memcached on steroids because it is as fast as memcached but with a number of features more.
Like memcached, Redis also supports setting timeouts on keys, so that a key is automatically removed when a given amount of time passes.
“a number of features more”
It's persistent but supports expires
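Key timeouts can be modelled with a deadline per key. The Python sketch below is illustrative only: ExpiringStore is a hypothetical class, and it removes expired keys lazily when they are read, which is a simplification chosen for this example and not a statement about how the server does it.

```python
import time


class ExpiringStore:
    """Toy key-value store with per-key time to live."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the example can be tested
        self._data = {}
        self._deadline = {}

    def set(self, key, value):
        self._data[key] = value
        self._deadline.pop(key, None)  # a fresh SET clears any timeout

    def expire(self, key, seconds):
        if key not in self._data:
            return False
        self._deadline[key] = self._clock() + seconds
        return True

    def get(self, key):
        deadline = self._deadline.get(key)
        if deadline is not None and self._clock() >= deadline:
            # The timeout has passed: drop the key before answering.
            self._data.pop(key, None)
            self._deadline.pop(key, None)
        return self._data.get(key)
```

A key set with expire(key, 10) is readable until ten seconds have passed and reads as missing afterwards.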
Whatever persistence mode you use, Redis supports master-slave replication if you want to stay really safe or if you need to scale to huge amounts of reads.
Redis replication is trivial to set up. So trivial that all you need to do to configure a Redis server as a slave of another one, with automatic synchronization, is add the following configuration line:
slaveof 192.168.1.100 6379
Redis Replication
Distributing the dataset across multiple Redis instances is easy, as in any other key-value store. It depends basically on the client libraries of each language being able to do so.
Sharding is a method of horizontal partitioning in a database or search engine.
“Horizontal partitioning is a design principle whereby rows of a database table are held separately, rather than splitting by columns (as for normalization). Each partition forms part of a shard, which may in turn be located on a separate database server or physical location.”
Sharding
Redis supports multiple databases with commands to atomically move keys from one database to the other.
By default DB 0 is selected for every new connection, but using the SELECT command it is possible to select/create a different database.
The MOVE operation can move an item from one DB to another atomically.
Multiple databases support
Redis includes the redis-benchmark utility, which simulates SETs/GETs done by N clients at the same time sending M total queries (it is similar to Apache's ab utility).
Below you'll find the full output of the benchmark executed against a Linux box.
• 50 simultaneous clients performing 100,000 requests.
• The value SET and GET is a 256-byte string.
• The box runs Linux 2.6 on a Xeon X3320 2.5 GHz.
• Test executed using the loopback interface (127.0.0.1).
How Fast is Redis?
• About 110,000 SETs per second.
• About 81,000 GETs per second.
Benchmark Results
====== SET ======
100007 requests completed in 0.88 seconds
50 parallel clients
3 bytes payload
keep alive: 1

58.50% <= 0 milliseconds
99.17% <= 1 milliseconds
99.58% <= 2 milliseconds
99.85% <= 3 milliseconds
99.90% <= 6 milliseconds
100.00% <= 9 milliseconds
114293.71 requests per second

====== GET ======
100000 requests completed in 1.23 seconds
50 parallel clients
3 bytes payload
keep alive: 1

43.12% <= 0 milliseconds
96.82% <= 1 milliseconds
98.62% <= 2 milliseconds
100.00% <= 3 milliseconds
81234.77 requests per second
• Ruby • Python • PHP • Erlang • Tcl • Perl • Lua • Java • Scala • Clojure • C# • C • Javascript
Supported languages
• Engine Yard
• GitHub
• Vidiowiki
• Wish Internet Consulting
• Ruby Minds
• Boxcar (iPhone application for push notifications)
• LLOOGG
• Virgilio Film (Italian movies community)
Globo.com is not using Redis, yet!
Who is using Redis?
This quickstart is a five minutes howto on how to get started with Redis.
The latest stable source distribution of Redis:
$ wget http://redis.googlecode.com/files/redis-1.02.tar.gz
Redis can be compiled on most POSIX systems. To compile Redis, just untar the tar.gz, enter the directory and type 'make'.
$ tar xvzf redis-1.02.tar.gz
$ cd redis-1.02
$ make
Quick Start
Redis can run just fine without a configuration file (when executed without a config file a standard configuration is used). To run Redis just type the following command:
$ ./redis-server
With the default configuration Redis will log to the standard output so you can check what happens.
Run the server
Redis ships with a command line client called redis-cli, which is compiled automatically when you run make.
For instance, to set a key and read back the value, use the following:
$ ./redis-cli set mykey somevalue
OK
$ ./redis-cli get mykey
somevalue
Play with the built in client
1. Connection handling
2. Commands operating on all kinds of values
3. Commands operating on string values
4. Commands operating on lists
5. Commands operating on sets
6. Commands operating on sorted sets
7. Sorting
8. Persistence control commands
9. Remote server control commands
Redis Command Reference
QUIT Close the connection. Asks the server to silently close the connection.
AUTH Simple password authentication, if enabled. Requests authentication in a password-protected Redis server. A Redis server can be instructed to require a password before allowing clients to issue commands. This is done using the requirepass directive in the Redis configuration file.
1. Connection handling
EXISTS key Test if a key exists
DEL key Delete a key
TYPE key Return the type of the value stored at key
KEYS pattern Return all the keys matching a given pattern
RANDOMKEY Return a random key from the key space
EXPIRE key seconds Set a time to live in seconds on a key
TTL key Get the time to live in seconds of a key
2. Commands operating on all kinds of values
RENAME oldname newname Rename the old key to the new one, destroying the newname key if it already exists
RENAMENX oldname newname Rename the old key to the new one, if the newname key does not already exist
2.1 Commands operating on all kinds of values
DBSIZE Return the number of keys in the current db
SELECT index Select the DB having the specified index
MOVE key dbindex Move the key from the currently selected DB to the DB having as index dbindex
FLUSHDB Remove all the keys of the currently selected DB
FLUSHALL Remove all the keys from all the databases
2.2 Commands operating on all kinds of values
SET key value Set a key to a string value
GET key Return the string value of the key
GETSET key value Set a key to a string returning the old value of the key
MGET key1 key2 ... keyN Multiget, return the strings values of the keys
SETNX key value Set a key to a string value if the key does not exist
3. Commands operating on String
MSET key1 value1 key2 value2 ... keyN valueN Set multiple keys to multiple values in a single atomic operation
MSETNX key1 value1 key2 value2 ... keyN valueN Set multiple keys to multiple values in a single atomic operation if none of the keys already exist
3.1 Commands operating on String
INCR key Increment the integer value of key
INCRBY key integer Increment the integer value of key by integer
DECR key Decrement the integer value of key
DECRBY key integer Decrement the integer value of key by integer
3.2 Commands operating on String
RPUSH key value Append an element to the tail of the List value at key
LPUSH key value Add an element to the head of the List value at key
LLEN key Return the length of the List value at key
LRANGE key start end Return a range of elements from the List at key
LTRIM key start end Trim the list at key to the specified range of elements
4. Commands operating on Lists
LINDEX key index Return the element at index position from the List at key
LSET key index value Set a new value as the element at index position of the List at key
LREM key count value Remove the firstN, lastN, or all the elements matching value from the List at key
LPOP key Return and remove (atomically) the first element of the List at key
4.1 Commands operating on Lists
RPOP key Return and remove (atomically) the last element of the List at key
BLPOP key1 key2 ... keyN timeout Blocking LPOP
BRPOP key1 key2 ... keyN timeout Blocking RPOP
RPOPLPUSH srckey dstkey Return and remove (atomically) the last element of the source List stored at _srckey_ and push the same element to the destination List stored at _dstkey_
4.2 Commands operating on Lists
SADD key member Add the specified member to the Set value at key
SREM key member Remove the specified member from the Set value at key
SPOP key Remove and return (pop) a random element from the Set value at key
SMOVE srckey dstkey member Move the specified member from one Set to another atomically
5. Commands operating on Sets
SCARD key Return the number of elements (the cardinality) of the Set at key
SISMEMBER key member Test if the specified value is a member of the Set at key
SINTER key1 key2 ... keyN Return the intersection between the Sets stored at key1, key2, ..., keyN
SINTERSTORE dstkey key1 key2 ... keyN Compute the intersection between the Sets stored at key1, key2, ..., keyN, and store the resulting Set at dstkey
5.1 Commands operating on Sets
SUNION key1 key2 ... keyN Return the union between the Sets stored at key1, key2, ..., keyN
SUNIONSTORE dstkey key1 key2 ... keyN Compute the union between the Sets stored at key1, key2, ..., keyN, and store the resulting Set at dstkey
SMEMBERS key Return all the members of the Set value at key
SRANDMEMBER key Return a random member of the Set value at key
5.2 Commands operating on Sets
SDIFF key1 key2 ... keyN Return the difference between the Set stored at key1 and all the Sets key2, ..., keyN
SDIFFSTORE dstkey key1 key2 ... keyN Compute the difference between the Set key1 and all the Sets key2, ..., keyN, and store the resulting Set at dstkey
5.3 Commands operating on Sets
ZADD key score member Add the specified member to the Sorted Set value at key or update the score if it already exists
ZREM key member Remove the specified member from the Sorted Set value at key
ZINCRBY key increment member If the member already exists increment its score by _increment_, otherwise add the member setting _increment_ as score
6. Commands operating on Sorted Sets
ZRANGE key start end Return a range of elements from the sorted set at key
ZREVRANGE key start end Return a range of elements from the sorted set at key, exactly like ZRANGE, but the sorted set is traversed in reverse order, from the greatest to the smallest score
ZRANGEBYSCORE key min max Return all the elements with score >= min and score <= max (a range query) from the sorted set
6.1 Commands operating on Sorted Sets
ZCARD key Return the cardinality (number of elements) of the sorted set at key
ZSCORE key element Return the score associated with the specified element of the sorted set at key
ZREMRANGEBYSCORE key min max Remove all the elements with score >= min and score <= max from the sorted set
6.2 Commands operating on Sorted Sets
SORT key [BY pattern] [LIMIT start count] [GET pattern] [ASC|DESC] [ALPHA] [STORE dstkey]
Sort the elements contained in the List, Set, or Sorted Set value at key. By default sorting is numeric with elements being compared as double precision floating point numbers. These are the simplest form of SORT:
> SORT mylist DESC > SORT mylist LIMIT 0 10 > SORT mylist LIMIT 0 10 ALPHA DESC
7. Sorting
SAVE Synchronously save the DB on disk
BGSAVE Asynchronously save the DB on disk
LASTSAVE Return the UNIX time stamp of the last successful save of the dataset on disk
SHUTDOWN Synchronously save the DB on disk, then shutdown the server
BGREWRITEAOF Rewrite the append only file in background when it gets too big
8. Persistence control commands
INFO Provide information and statistics about the server
MONITOR Dump all the received requests in real time
SLAVEOF Change the replication settings
9. Remote server control commands
Design and implementation of a simple Twitter clone using only the Redis key-value store.
We don't have tables, so what should be designed? We need to identify what keys are needed to represent our objects and what kind of values these keys need to hold.
Let's start with users. We need to represent these users, of course, with the username, user id, password, followers, following users, and so on.
The first question is: what should identify a user inside our system?
Study Case: twitter
> SET global:nextUserId 1000
> INCR global:nextUserId
=> 1001
> SET uid:1001:username "searleoliveira"
> SET uid:1001:password "e10adc3949ba59abbe56e057f20f883e"
* atomic INCR operation
We use the "global:nextUserId" key in order to always get a unique ID for every new user. Then we use this unique ID to populate all the other keys holding our user data.
This is a design pattern with key-value stores!
Keep it in mind.
Study Case: twitter
Sometimes it can be useful to be able to get the user ID from the username, so we set this key too:
> SET username:searleoliveira:uid 1001
This may appear strange at first, but remember that we are only able to access data by key!
It's not possible to tell Redis to return the key that holds a specific value.
This new paradigm forces us to organize the data so that everything is accessible by primary key, in relational database terms.
Study Case: twitter
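The two patterns from the last slides (allocate an ID with INCR, then write a reverse username-to-uid key) can be put together in a short Python sketch. The KV class is a hypothetical in-memory stand-in for the SET, GET and INCR commands, and create_user is an illustrative helper. Hashing the password with MD5 is an assumption based on the shape of the value shown on the slide; it mirrors the example and is not a recommendation for storing passwords.

```python
import hashlib


class KV:
    """Toy stand-in for the SET / GET / INCR subset of a key-value store."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = str(value)

    def get(self, key):
        return self.data.get(key)

    def incr(self, key):
        value = int(self.data.get(key, "0")) + 1
        self.data[key] = str(value)
        return value


def create_user(kv, username, password):
    # 1. Obtain a unique ID from the shared counter.
    user_id = kv.incr("global:nextUserId")
    # 2. Populate the per-user keys built from that ID.
    kv.set(f"uid:{user_id}:username", username)
    digest = hashlib.md5(password.encode("utf-8")).hexdigest()
    kv.set(f"uid:{user_id}:password", digest)
    # 3. Reverse key, because data can only be reached by key.
    kv.set(f"username:{username}:uid", user_id)
    return user_id
```

Every lookup the application needs later has to be prepared as a key at write time, which is why step 3 exists.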
Following, followers and updates
Every user has follower users and following users. We have a perfect data structure for this job!
We'll need to access the updates in chronological order later, from the most recent update to the older ones, so the perfect kind of value for them is a List. For followers and following, where each uid appears only once and order does not matter, a Set fits:
> SADD uid:1001:followers 1002
> SADD uid:1001:followers 1003
> SADD uid:1001:followers ... (Set of uids of all the follower users)
> SADD uid:1001:following ... (Set of uids of all the following users)
Study Case: twitter
Posts
Another important thing we need is a place where we can add the posts to display on the user's home page.
> LPUSH uid:1001:posts "Hey, I am Superman!"
* a List of post ids; every new post is LPUSHed here.
Study Case: twitter
Authentication
We'll handle authentication in a simple but robust way: we don't want to use PHP sessions or similar mechanisms. Our system must be ready to be distributed among different servers, so we'll keep the whole state in our Redis database.
> SET uid:1001:auth fea5e81ac8ca77622bed1c2132a021f9
> SET auth:fea5e81ac8ca77622bed1c2132a021f9 1001
Study Case: twitter
Authentication
To authenticate a user we'll do this simple work:
• Get the username and password via the login form;
• Check if the username:<username>:uid key actually exists;
• If it exists we have the user id (e.g. 1001);
• Check if uid:1001:password matches; if not, show an error message;
• OK, authenticated! Set "fea5e81ac8ca77622bed1c2132a021f9" (the value of uid:1001:auth) as the "auth" cookie.
This happens every time the users log in.
Study Case: twitter
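The login steps above can be sketched as a single Python function. A plain dict stands in for the key-value store here, the function name authenticate is hypothetical, and comparing an MD5 hex digest is an assumption based on the password value shown earlier in the deck.

```python
import hashlib


def authenticate(kv, username, password):
    """Return the value for the "auth" cookie, or None if login fails."""
    # Does the username:<username>:uid key exist?
    user_id = kv.get(f"username:{username}:uid")
    if user_id is None:
        return None
    # Does the stored password match the supplied one?
    supplied = hashlib.md5(password.encode("utf-8")).hexdigest()
    if kv.get(f"uid:{user_id}:password") != supplied:
        return None
    # Authenticated: hand back the random string stored for this user.
    return kv.get(f"uid:{user_id}:auth")
```

On later requests the application would read the cookie and look up auth:<cookie value> to recover the user id, which keeps all session state in the store.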
Logout
The only thing missing from the authentication is the logout.
What do we do on logout? That's simple: we just change the random string in uid:1001:auth, remove the old auth:<oldauthstring> key, and add a new auth:<newauthstring> key.
Study Case: twitter
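A minimal Python sketch of the logout step, again with a plain dict standing in for the store. The function name is hypothetical, and generating the new random string with secrets.token_hex is an assumption; the slides do not say how the string is produced.

```python
import secrets


def logout(kv, user_id):
    """Rotate the user's auth string so the old cookie stops working."""
    old_token = kv.get(f"uid:{user_id}:auth")
    new_token = secrets.token_hex(16)  # 32 hex characters
    kv[f"uid:{user_id}:auth"] = new_token
    # Remove the old reverse key and add the new one.
    kv.pop(f"auth:{old_token}", None)
    kv[f"auth:{new_token}"] = user_id
    return new_token
```

After this runs, a browser still holding the old cookie no longer maps to any user.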
Updates
Updates, also known as posts, are even simpler. In order to create a new post in the database we do something like this:
> INCR global:nextUpdatesId
=> 10343
> SET updates:10343 "$owner_id|$time|I am Superman!"
As you can see, the user id and time of the post are stored directly inside the string. We don't need to look up by time or user id in the example application, so it is better to compact everything inside the post string.
After we create a post we obtain the post id. We need to LPUSH this post id onto the posts list of every user who follows the author, and of course onto the list of posts of the author.
Study Case: twitter
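The create-then-fan-out flow can be sketched in Python as follows. The Store class and post_update function are hypothetical, the store is a set of in-memory dicts, and holding the followers in a Python set is an assumption of this sketch.

```python
import time
from collections import defaultdict


class Store:
    """Toy in-memory stand-in for string, list and follower keys."""

    def __init__(self):
        self.strings = {}
        self.lists = defaultdict(list)   # index 0 is the head, as with LPUSH
        self.followers = defaultdict(set)


def post_update(store, owner_id, text, now=None):
    timestamp = int(time.time()) if now is None else now
    # INCR global:nextUpdatesId
    post_id = int(store.strings.get("global:nextUpdatesId", 0)) + 1
    store.strings["global:nextUpdatesId"] = post_id
    # SET updates:<id> "owner|time|text"
    store.strings[f"updates:{post_id}"] = f"{owner_id}|{timestamp}|{text}"
    # LPUSH the post id for the author and for every follower.
    recipients = {owner_id} | store.followers[f"uid:{owner_id}:followers"]
    for uid in recipients:
        store.lists[f"uid:{uid}:posts"].insert(0, post_id)
    return post_id
```

The cost of a post grows with the number of followers, but reading a home page becomes a single list read.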
Paginating updates
Now it should be pretty clear how we can use LRANGE to get ranges of posts and render these posts on the screen. Both indexes are inclusive, so pages of ten posts look like this, and 0 -1 returns the whole list:
> LRANGE uid:1001:posts 0 9
> LRANGE uid:1001:posts 10 19
> LRANGE uid:1001:posts 0 -1
Study Case: twitter
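Because LRANGE treats both the start and end index as inclusive, page boundaries are easy to get off by one. The Python sketch below emulates that behaviour on an ordinary list; lrange and page are hypothetical helper names and the page size of ten is an assumption.

```python
def lrange(items, start, end):
    """Emulate LRANGE: both indexes inclusive, negatives count from the tail."""
    n = len(items)
    if end < 0:
        end += n
    if start < 0:
        start = max(start + n, 0)
    if end < start:
        return []
    return items[start:end + 1]


def page(items, page_number, page_size=10):
    """Return one page of results; page_number starts at 0."""
    start = page_number * page_size
    return lrange(items, start, start + page_size - 1)
```

Note that lrange(posts, 0, 10) returns eleven elements, so consecutive pages of ten must use 0 to 9, then 10 to 19, and so on.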
• One global map with a common key space
• Redis applications store an entity using attributes:
R.set("U:123:firstname", "Searle")
R.set("U:123:lastname", "Oliveira")
R.set("U:123:password", "AdE4Erf")
R.set("U:123:username", "searleoliveira")
R.set("U:uid:searleoliveira", "123")
• Each named attribute of the entity combined with the entity key becomes a key for the entries for the corresponding values.
• Space consumed is a concern though!
Column Oriented Style
Probably you will not need more than one server for a lot of applications, even when you have a lot of users.
But let's assume we are Twitter and need to handle a huge amount of traffic.
What to do? The first thing to do is to hash the key and issue the request to different servers based on the key hash. The general idea is that you can turn your key into a number, and then take the remainder of the division of this number by the number of servers you have:
server_id = crc32(key) % number_of_servers
Making it horizontally scalable
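The formula above translates almost directly into Python using zlib.crc32 from the standard library. The function name server_for and the example server count are illustrative assumptions.

```python
import zlib


def server_for(key, number_of_servers):
    """Map a key to a server index in the range 0 .. number_of_servers - 1."""
    # crc32 works on bytes, so the key is encoded first.
    return zlib.crc32(key.encode("utf-8")) % number_of_servers


# Example: decide which of 4 hypothetical servers holds each key.
placement = {
    key: server_for(key, 4)
    for key in ("uid:1001:username", "uid:1001:posts", "updates:10343")
}
```

The same key always lands on the same server, so any client can locate data without a lookup table. One trade-off of plain modulo hashing is that changing the number of servers changes where most keys map.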
Some keys are accessed far more often than others. For example, every time we post a new message we need to increment the global:nextUpdatesId key, so a single server will get a lot of increments. How do we fix this problem? The simplest way is to have a dedicated server just for increments, although this is probably overkill unless you really have a lot of traffic. There is another trick: the ID does not really need to be an incremental number, it just needs to be unique. So you can generate a random string long enough to be unlikely (almost impossible, if it is MD5-sized) to collide, and you are done. We have eliminated our main obstacle to making it really horizontally scalable!
Special keys
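One way to produce such a random, MD5-sized identifier in Python is shown below. Using the secrets module is an assumption of this sketch; the slides do not name a specific generator.

```python
import secrets


def new_post_id():
    """Return a random 128-bit ID as 32 hex characters (the size of an MD5 digest)."""
    return secrets.token_hex(16)
```

Each server can call this independently, so no shared counter is needed. The trade-off is that the IDs no longer carry any ordering.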
• Speed
• Persistence
• Support for Data Structures
• Atomic Operations
• Variety of Supported Languages
• Master/Slave Replication
• Sharding
• Simple to Install, Setup and Manage
• Portable
• Liberal Licensing
Features
What is Redis Admin?
Redis Admin, or ReAdmin, is an open source web interface for the administration of Redis.
ReAdmin is fully written in PHP using Redis, of course.
The current version supports creating schemas and adding keys of type string, list and set only. Sorted sets are not supported yet.
It also supports persistence control, a command query editor and a server status dashboard.
ReAdmin was created by Searle Oliveira.
Web Server: a web server with a rewrite module is needed to install ReAdmin (e.g., Apache, IIS).
.htaccess Rule: the rewrite module uses a rule-based rewriting engine (based on a regular-expression parser) to rewrite requested URLs on the fly. Available in Apache 1.2 and later.
PHP: 5.2.0 or newer, with the Standard PHP Library (SPL) extension enabled.
Redis: 1.2.0 or newer.
Web browser: any web browser with cookies enabled.
Requirements
Download: choose the latest stable version from the downloads page.
Extract files: untar the stable package in your web server's document root (e.g., /var/www/).
Configuration: use a plain text editor to edit the file named config.php in the main (top-level) ReAdmin directory (the one that contains index.php).
Finish: for Apache you can use the supplied .htaccess file in that folder; for other web servers, you need to configure the rewrite rules yourself.
How to Install?
Redis does not use the concept of schemas.
Redis uses DB instances (0, 1, 2, 3 ... 15).
So, to organize these instances and make them easier for humans to work with, we created aliases for them. These aliases are Schemas.
Db0 => facebook
Db1 => youtube
Db2 => twitter
Db3 => google
What are Schemas?
Sorted Sets commands: Sorted Sets are not supported yet.
DB instances: the DB instances in Redis are limited to numbers because they are just an accessory for working with a single dataset. Currently we can create only 15 DB instances (from 0 to 14). The DB15 instance is reserved for the ReAdmin info_schema.
Known Limitations
Screenshot: Login
Screenshot: Dashboard
Screenshot: Show Keys
Screenshot: Add Key String
Screenshot: Persistence Control
Screenshot: Command
@phpae #ec2: redisadmin http://phpappengine.com/2010/opensource/redisadmin
@nginx_tips RedisAdmin Download: readmin0.0.6test.tar.gz (103 KB) http://goo.gl/fb/KsMB
@abrdev open source web interface Excellent Open source web admin panel to redis! http://code.google.com/p/redisadmin
@antirez This could be an interesting project, Redisadmin: http://code.google.com/p/redisadmin/
Twitter Updates
Redis Project: http://code.google.com/p/redis
ReAdmin Project: http://code.google.com/p/redisadmin
Shard (database architecture): http://en.wikipedia.org/wiki/Shard_(database_architecture)
QCon San Francisco 2009, Billy Newport: http://www.infoq.com/presentations/newportevolvingkeyvalueprogrammingmodel
Read More