Original article: http://wiki.apache.org/cassandra/GettingStarted

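Introduction:

This article gives users who are new to Cassandra an overview of single-node and multi-node configuration. Cassandra is primarily meant to run in a multi-node environment; a single-node setup is mainly useful for getting familiar with it.

Step 0: Prerequisites and Connecting to the Community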

  • Some people running OS X have trouble getting Java 6 to work. If you've kept up with Apple's updates, Java 6 should already be installed (it comes in Mac OS X 10.5 Update 1). Unfortunately, Apple does not default to using it. What you have to do is change your JAVA_HOME environment setting to /System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home and add /System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home/bin to the beginning of your PATH.

The best way to ensure you always have up to date information on the project, releases, stability, bugs, and features is to subscribe to the users mailing list (subscription required) and participate in the #cassandra channel on IRC.

Step 1: Download Cassandra

  • Download links for the latest stable release can always be found on the website.

  • Users of Debian or Debian-based derivatives can install the latest stable release in package form, see DebianPackaging for details.

  • Users of RPM-based distributions can get packages from Datastax.

  • If you are interested in building Cassandra from source, please refer to the How to Build page.

For more details about miscellaneous builds, please refer to the Cassandra versions and builds page.

Step 2: Basic Configuration

The Cassandra configuration files can be found in the conf directory of binary and source distributions. If you have installed Cassandra from a deb or rpm package, the configuration files will be located in /etc/cassandra.

Step 2.1: Directories Used by Cassandra

If you've installed Cassandra with a deb or rpm package, the directories that Cassandra will use should already be created and have the correct permissions. Otherwise, you will want to check the following config settings.

In conf/cassandra.yaml you will find the following configuration options: data_file_directories (/var/lib/cassandra/data), commitlog_directory (/var/lib/cassandra/commitlog), and saved_caches_directory (/var/lib/cassandra/saved_caches). Make sure these directories exist and can be written to.
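The check described above can be sketched as a few lines of Python. This is only an illustration: the helper name unusable_dirs is made up for this example, and the paths are the default values quoted above, so adjust them if your cassandra.yaml differs.

```python
import os

# Default locations quoted above; adjust to match your cassandra.yaml.
DEFAULT_DIRS = [
    "/var/lib/cassandra/data",
    "/var/lib/cassandra/commitlog",
    "/var/lib/cassandra/saved_caches",
]

def unusable_dirs(paths):
    """Return the paths that are missing or not writable by the current user."""
    return [p for p in paths
            if not (os.path.isdir(p) and os.access(p, os.W_OK | os.X_OK))]

# Report every directory that would need to be created or have its permissions fixed.
for path in unusable_dirs(DEFAULT_DIRS):
    print("missing or not writable:", path)
```

Run it as the same user that will run Cassandra, since the result depends on that user's permissions.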

By default, Cassandra will write its logs in /var/log/cassandra/. Make sure this directory exists and is writeable, or change this line in conf/log4j-server.properties:

log4j.appender.R.File=/var/log/cassandra/system.log

Step 2.2: Configure Memory Usage (Optional)

By default, Cassandra will allocate memory based on the physical memory your system has, using somewhere between 1/4 and 1/2 of the available RAM.

If you want to specify how much memory Cassandra should use explicitly, edit conf/cassandra-env.sh, find the following lines, uncomment them, and change their values:

#MAX_HEAP_SIZE="4G"
#HEAP_NEWSIZE="800M"

As a rule of thumb, you should set HEAP_NEWSIZE to be 1/4 of MAX_HEAP_SIZE. If you face OutOfMemory exceptions or massive GCs with this configuration, increase both of these values.
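As a small illustration of the 1/4 rule of thumb, the following Python sketch derives both settings from a chosen maximum heap size. The function name heap_settings is hypothetical, and it assumes sizes expressed in megabytes with the "M" suffix; it only prints the two lines you would then put in conf/cassandra-env.sh yourself.

```python
def heap_settings(max_heap_mb):
    """Return the two cassandra-env.sh lines for a given max heap (in MB),
    sizing the new generation at 1/4 of the max heap."""
    new_size_mb = max_heap_mb // 4
    return ['MAX_HEAP_SIZE="%dM"' % max_heap_mb,
            'HEAP_NEWSIZE="%dM"' % new_size_mb]

# Example: a 4 GB heap gives a 1 GB new generation.
for line in heap_settings(4096):
    print(line)
```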

Step 3: Start Cassandra

And now for the moment of truth, start up Cassandra by invoking 'bin/cassandra -f' from the command line. The service should start in the foreground and log gratuitously to the console. Assuming you don't see messages with scary words like "error", or "fatal", or anything that looks like a Java stack trace, then everything should be working.

Press "Control-C" to stop Cassandra.

If you start up Cassandra without the "-f" option, it will run in the background. You can stop the process by killing it, using 'pkill -f CassandraDaemon', for example.

Step 4: Using cassandra-cli

bin/cassandra-cli is an interactive command line interface for Cassandra. You can alter the schema and interact with data using the cli. Run the following command to connect to your local Cassandra instance:

bin/cassandra-cli

You should see the following prompt, if successful:

Connected to: "Test Cluster" on 127.0.0.1/9160
Welcome to Cassandra CLI version 1.0.7

Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.

[default@unknown] 

You can access the online help with the 'help;' command. Commands are terminated with a semicolon (';') in the cli.

[default@unknown] help;

First, create a keyspace for your test.

[default@unknown] create keyspace DEMO;
f53dff10-5bd8-11e1-0000-915a024292eb
Waiting for schema agreement...
... schemas agree across the cluster
[default@unknown] 

Don't forget to add a semicolon (';') at end of the command.

Second, authenticate to use the DEMO keyspace.

[default@unknown] use DEMO;
Authenticated to keyspace: DEMO
[default@DEMO]

Third, create a Users column family:

[default@DEMO] create column family Users
...     with key_validation_class = 'UTF8Type'
...     and comparator = 'UTF8Type'
...     and default_validation_class = 'UTF8Type';
[default@DEMO]

Now you can store data in the Users column family:

[default@DEMO] set Users[1234][name] = scott;
Value inserted.
Elapsed time: 10 msec(s).
[default@DEMO] set Users[1234][password] = tiger;
Value inserted.
Elapsed time: 10 msec(s).
[default@DEMO]

You have inserted a row into the Users column family. The row key is '1234', and we set values for two columns in the row: 'name', and 'password'.

Now let's fetch the data you inserted:

[default@DEMO] get Users[1234];
=> (column=name, value=scott, timestamp=1350769161684000)
=> (column=password, value=tiger, timestamp=1350769245191000)
Returned 2 results.
Elapsed time: 67 msec(s).
[default@DEMO]
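As a mental model only, the set and get commands above behave much like updates to a nested dictionary: keyspace, then column family, then row key, then column name. The Python sketch below is not a Cassandra client and the function names are invented for illustration; it ignores timestamps, column ordering by comparator, and distribution across nodes.

```python
# keyspace -> column family -> row key -> {column name: value}
keyspaces = {"DEMO": {"Users": {}}}

def set_column(keyspace, column_family, row_key, column, value):
    """Mimic `set Users[1234][name] = scott;` on the in-memory model."""
    rows = keyspaces[keyspace][column_family]
    rows.setdefault(row_key, {})[column] = value

def get_row(keyspace, column_family, row_key):
    """Mimic `get Users[1234];` by returning all columns of the row."""
    return keyspaces[keyspace][column_family].get(row_key, {})

set_column("DEMO", "Users", "1234", "name", "scott")
set_column("DEMO", "Users", "1234", "password", "tiger")
print(get_row("DEMO", "Users", "1234"))
```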

You can easily specify types other than UTF-8 when creating or updating a column family. See 'help update column family;' and 'help create column family;' for more details.

To be certain though, take some time to try out the examples in CassandraCli before moving on. Also, if you run into problems, Don't Panic; calmly proceed to If Something Goes Wrong.

  • Users of recent Linux distributions and Mac OS X Snow Leopard should be able to start up Cassandra simply by untarring and invoking bin/cassandra -f with root privileges. Snow Leopard ships with Java 1.6.0 and does not require changing the JAVA_HOME environment variable or adding any directory to your PATH. On Linux just make sure you have a working Java JDK package installed such as the openjdk-6-jdk on Ubuntu Lucid Lynx.

Configuring Multinode Cluster

Now you have a single working Cassandra node. It is a Cassandra cluster with only one node. By adding more nodes, you can make it a multi-node cluster.

Setting up a Cassandra cluster is almost as simple as repeating the above procedures for each node in your cluster. There are a few minor exceptions though.

Cassandra nodes exchange information about one another using a mechanism called Gossip, but to get the ball rolling a newly started node needs to know of at least one other node; this is called a Seed. It's customary to pick a small number of relatively stable nodes to serve as your seeds, but there is no hard-and-fast rule here. Do make sure that each seed also knows of at least one other. Remember, the goal is to avoid a chicken-and-egg scenario and provide an avenue for all nodes in the cluster to discover one another.

In addition to seeds, you'll also need to configure the IP interface to listen on for Gossip and Thrift (listen_address and rpc_address respectively). Use a listen_address that will be reachable from the listen_address used on all other nodes, and an rpc_address that will be accessible to clients.

One other thing to take care of in a multi-node cluster is the token. Each node in the cluster owns a part of the token range from 0 to 2^127-1. If the Nth node in the cluster has token value T(N), that node owns the range from T(N-1)+1 to T(N). Cassandra decides which nodes should store a piece of data based on a consistent mapping between the row key and the token range (refer to RandomPartitioner, ByteOrderedPartitioner).
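The ownership rule can be illustrated with a short Python sketch. It assumes the node tokens are sorted in ascending order and that tokens above the highest node token wrap around to the first node; the function name owner_index is made up for this example, and how a row key is turned into a token (the partitioner's job) is deliberately left out.

```python
from bisect import bisect_left

def owner_index(node_tokens, token):
    """Return the index of the node owning `token`.

    node_tokens must be sorted ascending. Node N owns the range
    T(N-1)+1 .. T(N); anything above the last token wraps to node 0.
    """
    return bisect_left(node_tokens, token) % len(node_tokens)

# Four evenly spaced tokens over the 0 .. 2**127-1 range.
tokens = [2**127 // 4 * n for n in range(4)]
print(owner_index(tokens, 1))              # just above node 0's token
print(owner_index(tokens, tokens[3] + 1))  # past the last token, wraps around
```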

The token can be assigned to a node with the initial_token parameter in cassandra.yaml. The parameter takes effect only at the first boot of the node. Once a node has been booted, use the 'nodetool move' command to change its assigned token. You need to specify an appropriate initial_token for each node to balance the data load across the nodes. Here is a Python script to calculate balanced tokens.

# Number of nodes in the cluster
num_node = 4
for n in range(num_node):
    print(2**127 // num_node * n)

Once everything is configured and the nodes are running, use the bin/nodetool ring utility to verify a properly connected cluster. For example:

eevans@achilles:~$ bin/nodetool -host 192.168.0.10 -p 7199 ring
Address         DC      Rack    Status State   Load        Owns    Token
                                                                   127605887595351923798765477786913079296
192.168.0.10    DC1     r1      Up     Normal  17.3 MB     25.00%  0
192.168.0.11    DC1     r1      Up     Normal  17.4 MB     25.00%  42535295865117307932921825928971026432
192.168.0.12    DC1     r1      Up     Normal  37.2 MB     25.00%  85070591730234615865843651857942052864
192.168.0.13    DC1     r1      Up     Normal  24.55 MB    25.00%  127605887595351923798765477786913079296     
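If you want to check this output from a script, a rough Python sketch follows. It assumes the column layout matches the sample above exactly (nine whitespace-separated fields per node row, with the load split into a number and a unit); other Cassandra versions or nodes in other states may print rows differently. The function names parse_ring and unhealthy are invented for this example.

```python
SAMPLE = """\
Address         DC      Rack    Status State   Load        Owns    Token
                                                                   127605887595351923798765477786913079296
192.168.0.10    DC1     r1      Up     Normal  17.3 MB     25.00%  0
192.168.0.11    DC1     r1      Up     Normal  17.4 MB     25.00%  42535295865117307932921825928971026432
192.168.0.12    DC1     r1      Up     Normal  37.2 MB     25.00%  85070591730234615865843651857942052864
192.168.0.13    DC1     r1      Up     Normal  24.55 MB    25.00%  127605887595351923798765477786913079296
"""

def parse_ring(output):
    """Parse ring text in the sample's layout into a list of dicts."""
    nodes = []
    for line in output.splitlines():
        fields = line.split()
        # Node rows have 9 fields; skip the header and the lone token line.
        if len(fields) != 9 or fields[0] == "Address":
            continue
        address, dc, rack, status, state, load, unit, owns, token = fields
        nodes.append({"address": address, "status": status,
                      "state": state, "owns": owns, "token": int(token)})
    return nodes

def unhealthy(nodes):
    """Return addresses of nodes that are not both Up and Normal."""
    return [n["address"] for n in nodes
            if (n["status"], n["state"]) != ("Up", "Normal")]

nodes = parse_ring(SAMPLE)
print(len(nodes), "nodes, unhealthy:", unhealthy(nodes))
```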

Advanced cluster management is described in Operations.

If you don't yet have access to hardware for a Cassandra cluster you can try it out on EC2 with CloudConfig.

For more details about configuring a multi-node cluster, please refer to MultinodeCluster.

Write your application

The recommended way to communicate with Cassandra in your application is to use a higher-level client. These provide programming-language-specific APIs for talking to Cassandra in a variety of languages. The details will vary depending on programming language and client, but in general using a higher-level client will mean that you have to write less code and get several features for free that you would otherwise have to write yourself.

That said, it is useful to know that Cassandra uses Thrift for its external client-facing API. Cassandra's main API/RPC/Thrift port is 9160. Thrift supports a wide variety of languages so you can code your application to use Thrift directly if you so choose (but again we recommend a high-level client where available).

Important note: If you intend to use Thrift directly, you need to install a version of Thrift that matches the revision that your version of Cassandra uses. See InstallThrift.

Cassandra's main API/RPC/Thrift port is 9160 by default, which is defined as rpc_port in cassandra.yaml. It is a common mistake for API clients to connect to the JMX port instead.
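Before debugging a client, it can help to confirm that something is actually listening on the port you configured. The Python sketch below does a plain TCP connection test; the helper name port_is_open is hypothetical, the host 127.0.0.1 is an assumption for a local single-node setup, and a successful connection only shows that the port accepts connections, not that it speaks Thrift.

```python
import socket

def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# 9160 is the default rpc_port mentioned above.
print("port 9160 reachable:", port_is_open("127.0.0.1", 9160))
```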

Checking out a demo application like Twissandra (Python + Django) will also be useful.

If Something Goes Wrong

If you followed the steps in this guide and failed to get up and running, we'd love to help. Here's what we need.

  1. If you are running anything other than a stable release, please upgrade first and see if you can still reproduce the problem.
  2. Make sure debug logging is enabled (hint: conf/log4j.properties) and save a copy of the output.

  3. Search the mailing list archive and see if anyone has reported a similar problem and what, if any, resolution they received.

  4. Ditto for the bug tracking system.

  5. See if you can put together a unit test, script, or application that reproduces the problem.

Finally, post a message with all relevant details to the list (subscription required), or hop onto IRC (network irc.freenode.net, channel #cassandra) and let us know.
