Installing a Storm Cluster
Published: 2019-06-15


Figure: Storm diagram

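Basic Storm concepts
  Topologies: a topology, commonly thought of as one job
  Spouts: the message sources of a topology
  Bolts: the processing-logic units of a topology
  Tuple: a message tuple, the format in which data is passed between Spouts and Bolts
  Streams: streams, that is, unbounded sequences of tuples
  Stream groupings: the strategy for partitioning a stream among the tasks that consume it
  Tasks: task processing units
  Executor: a worker thread
  Workers: worker processes
  Configuration: the configuration of a topology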
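To make the spout, bolt, and tuple vocabulary concrete, here is a loose analogy written as a Unix pipeline. This is not Storm code: the function names and the two sample sentences are made up for illustration. One function plays the role of a spout, and the next two play the role of bolts.

```shell
# Analogy only: a Unix pipeline standing in for a word-count topology.

# "Spout": emits a stream of sentences, one per line (sample data, hypothetical).
emit_sentences() {
    printf '%s\n' "the cow jumped" "the moon rose"
}

# "Bolt 1": splits each sentence into one word per line (one "tuple" per word).
split_words() {
    tr -s ' ' '\n'
}

# "Bolt 2": counts occurrences per word and prints the totals in sorted order.
count_words() {
    awk '{ count[$1]++ } END { for (w in count) print w, count[w] }' | sort
}

word_counts=$(emit_sentences | split_words | count_words)
printf '%s\n' "$word_counts"
```

The analogy breaks down in one important way: this pipeline ends when its input ends, whereas a Storm topology keeps running and keeps consuming new tuples.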

Official website:

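storm:
  Storm performs real-time online computation and is used for stream processing: data keeps flowing in like water, and Storm has to process it as it arrives.
  Storm is rarely used on its own because it does not store data. Typically data comes in from a message queue and, once processed, is stored in MySQL or another database.
  Apache Storm is a free and open source distributed realtime computation system. Apache Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. Apache Storm is simple, can be used with any programming language, and is a lot of fun to use!
  Apache Storm has many use cases: realtime analytics, online machine learning, continuous computation, distributed RPC, ETL, and more. Apache Storm is fast: a benchmark clocked it at over a million tuples processed per second per node. It is scalable, fault-tolerant, guarantees your data will be processed, and is easy to set up and operate.
  Apache Storm integrates with the queueing and database technologies you already use. An Apache Storm topology consumes streams of data and processes those streams in arbitrarily complex ways, repartitioning the streams between each stage of the computation.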

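Storm compared with Hadoop
Topology vs. MapReduce
  One key difference: a MapReduce job eventually finishes, whereas a topology runs forever (unless it is killed manually).
Nimbus vs. JobTracker
  A Storm cluster has two kinds of nodes: the master node and worker nodes (by default each machine provides at most 4 worker slots). The master node runs a daemon called nimbus, which plays a role similar to Hadoop's JobTracker. Nimbus is responsible for distributing code around the cluster, assigning computation tasks to machines, and monitoring their status.
Supervisor vs. TaskTracker
  Each worker node runs a daemon called the Supervisor. The Supervisor listens for work assigned to its machine and starts or stops worker processes as needed. Each worker process executes a subset of a topology; a running topology consists of many worker processes spread across many machines.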
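Since every supervisor contributes a fixed number of worker slots, the cluster's total worker capacity is simple arithmetic. The sketch below assumes two supervisor hosts (hadoop02 and hadoop03, as used later in this walkthrough) and the default port list 6700 to 6703; substitute your own values.

```shell
# Assumed cluster layout: adjust both lists to match your own cluster.
supervisor_hosts="hadoop02 hadoop03"
slot_ports="6700 6701 6702 6703"   # one worker slot per port

# Count the words in each list using the positional parameters.
set -- $supervisor_hosts
host_count=$#
set -- $slot_ports
ports_per_host=$#

total_slots=$((host_count * ports_per_host))
echo "supervisors: $host_count, slots per supervisor: $ports_per_host, total slots: $total_slots"
```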

 

Installation steps:

1. Install a ZooKeeper cluster

2. Download the Storm package and extract it
3. Edit the configuration file storm.yaml

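# Hosts of the ZooKeeper cluster to use
storm.zookeeper.servers:
    - "hadoop01"
    - "hadoop02"
    - "hadoop03"

# Host where nimbus runs (nimbus.seeds supersedes the older nimbus.host setting)
nimbus.seeds: ["hadoop01"]
# 4 slots by default; more can be configured depending on machine capacity
supervisor.slots.ports:
    - 6701
    - 6702
    - 6703
    - 6704
    - 6705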
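When the same settings have to be written on several machines, generating the file from variables avoids indentation mistakes in the YAML lists. The following is a minimal sketch, not an official tool: it writes to a scratch file created with mktemp, and the host names and ports are simply the ones used in this article. To use it for real, copy the result to conf/storm.yaml in your Storm install directory.

```shell
# Values taken from this walkthrough; replace them with your own hosts and ports.
zk_hosts="hadoop01 hadoop02 hadoop03"
nimbus_host="hadoop01"
slot_ports="6701 6702 6703 6704 6705"

# Write to a scratch file so nothing in a real install is overwritten.
conf_file=$(mktemp)
{
    printf 'storm.zookeeper.servers:\n'
    for host in $zk_hosts; do
        printf '    - "%s"\n' "$host"
    done
    printf 'nimbus.seeds: ["%s"]\n' "$nimbus_host"
    printf 'supervisor.slots.ports:\n'
    for port in $slot_ports; do
        printf '    - %s\n' "$port"
    done
} > "$conf_file"

cat "$conf_file"
```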

# Start Storm

# On the nimbus host
nohup ./storm nimbus > /dev/null 2>&1 &
nohup ./storm ui > /dev/null 2>&1 &

# On the supervisor hosts

nohup ./storm supervisor > /dev/null 2>&1 &

 

1. The ZooKeeper cluster was already installed earlier

2. Download the Storm package and extract it

[linyouyi@hadoop01 software]$ wget https://mirrors.aliyun.com/apache/storm/apache-storm-2.0.0/apache-storm-2.0.0.tar.gz
[linyouyi@hadoop01 software]$ ll
total 739172
-rw-rw-r-- 1 linyouyi linyouyi 312465430 Apr 30 06:17 apache-storm-2.0.0.tar.gz
-rw-r--r-- 1 linyouyi linyouyi 218720521 Aug  3 17:56 hadoop-2.7.7.tar.gz
-rw-rw-r-- 1 linyouyi linyouyi 132569269 Mar 18 14:28 hbase-2.0.5-bin.tar.gz
-rw-r--r-- 1 linyouyi linyouyi  54701720 Aug  3 17:47 server-jre-8u144-linux-x64.tar.gz
-rw-r--r-- 1 linyouyi linyouyi  37676320 Aug  8 09:36 zookeeper-3.4.14.tar.gz
[linyouyi@hadoop01 software]$ tar -zxvf apache-storm-2.0.0.tar.gz -C /hadoop/module/
[linyouyi@hadoop01 software]$ cd /hadoop/module/apache-storm-2.0.0
[linyouyi@hadoop01 apache-storm-2.0.0]$ ll
total 308
drwxrwxr-x  2 linyouyi linyouyi  4096 Aug 12 21:11 bin
drwxrwxr-x  2 linyouyi linyouyi  4096 Aug 12 21:11 conf
-rw-r--r--  1 linyouyi linyouyi 91939 Apr 30 05:13 DEPENDENCY-LICENSES
drwxr-xr-x 19 linyouyi linyouyi  4096 Apr 30 05:13 examples
drwxrwxr-x 19 linyouyi linyouyi  4096 Aug 12 21:11 external
drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:59 extlib
drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:59 extlib-daemon
drwxrwxr-x  2 linyouyi linyouyi  4096 Aug 12 21:11 lib
drwxrwxr-x  5 linyouyi linyouyi  4096 Aug 12 21:11 lib-tools
drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:59 lib-webapp
drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:58 lib-worker
-rw-r--r--  1 linyouyi linyouyi 82390 Apr 30 05:13 LICENSE
drwxr-xr-x  2 linyouyi linyouyi  4096 Apr 30 05:13 licenses
drwxrwxr-x  2 linyouyi linyouyi  4096 Aug 12 21:11 log4j2
-rw-r--r--  1 linyouyi linyouyi 34065 Apr 30 05:13 NOTICE
drwxrwxr-x  6 linyouyi linyouyi  4096 Aug 12 21:11 public
-rw-r--r--  1 linyouyi linyouyi  7914 Apr 30 05:13 README.markdown
-rw-r--r--  1 linyouyi linyouyi     6 Apr 30 05:13 RELEASE
-rw-r--r--  1 linyouyi linyouyi 23865 Apr 30 05:13 SECURITY.md

3. Edit the configuration file storm.yaml

[linyouyi@hadoop01 apache-storm-2.0.0]$ vim conf/storm.yaml
# ZooKeeper addresses
storm.zookeeper.servers:
    - "hadoop01"
    - "hadoop02"
    - "hadoop03"
nimbus.seeds: ["hadoop01"]
#nimbus.seeds: ["host1", "host2", "host3"]
[linyouyi@hadoop01 apache-storm-2.0.0]$ cd ../
[linyouyi@hadoop01 module]$ scp -r apache-storm-2.0.0 linyouyi@hadoop02:/hadoop/module/
[linyouyi@hadoop01 module]$ scp -r apache-storm-2.0.0 linyouyi@hadoop03:/hadoop/module/
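With more than a couple of worker hosts, the scp step is easier to manage as a loop. The sketch below uses the paths, user, and host names from this walkthrough as assumptions. It defaults to a dry run that only prints the commands; set dry_run=0 to actually copy.

```shell
# Assumed values from this walkthrough; adjust for your environment.
storm_dir="/hadoop/module/apache-storm-2.0.0"
target_dir="/hadoop/module"
remote_user="linyouyi"
worker_hosts="hadoop02 hadoop03"
dry_run=1   # 1 = only print the commands, 0 = really run scp

# Copy (or print the command to copy) the Storm directory to every worker host.
distribute() {
    for host in $worker_hosts; do
        if [ "$dry_run" -eq 1 ]; then
            echo "scp -r $storm_dir $remote_user@$host:$target_dir/"
        else
            scp -r "$storm_dir" "$remote_user@$host:$target_dir/"
        fi
    done
}

distribute
```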

4. Start the services

[linyouyi@hadoop01 module]$ cd apache-storm-2.0.0
// If it reports that JAVA_HOME cannot be found, set it in conf/storm-env.sh
[linyouyi@hadoop01 apache-storm-2.0.0]$ bin/storm nimbus &
[linyouyi@hadoop01 apache-storm-2.0.0]$ jps
30051 Nimbus
44057 QuorumPeerMain
30381 Jps
[linyouyi@hadoop01 apache-storm-2.0.0]$ netstat -tnpl | grep 30684
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp6       0      0 :::6627                 :::*                    LISTEN      30684/java
[linyouyi@hadoop01 apache-storm-2.0.0]$ bin/storm ui &
[linyouyi@hadoop01 apache-storm-2.0.0]$ jps
32674 UIServer
44057 QuorumPeerMain
30684 Nimbus
32989 Jps
[linyouyi@hadoop01 apache-storm-2.0.0]$ netstat -tnpl | grep 32674
tcp6       0      0 :::8080                 :::*                    LISTEN      32674/java
// Open http://hadoop01:8080 in a browser: the worker slot counts all show 0. Once the supervisor is started on hadoop02 and hadoop03, the slot counts are no longer 0
[linyouyi@hadoop02 apache-storm-2.0.0]$ bin/storm supervisor
[linyouyi@hadoop02 apache-storm-2.0.0]$ jps
70952 Jps
70794 Supervisor
34879 QuorumPeerMain
[linyouyi@hadoop03 apache-storm-2.0.0]$ bin/storm supervisor
[linyouyi@hadoop03 apache-storm-2.0.0]$ jps
119587 QuorumPeerMain
116291 Jps
116143 Supervisor
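Checking jps by eye on every host gets tedious, so the check can be scripted. The helper below is a hypothetical sketch: has_daemon reads jps-style output on standard input and succeeds if the named process appears in the second column. The sample text is the jps output captured on hadoop01 above, so the block runs without a live cluster.

```shell
# has_daemon NAME: read `jps`-style lines ("PID NAME") on stdin,
# exit 0 if NAME is listed, exit 1 otherwise.
has_daemon() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Sample input copied from the jps output on hadoop01 in this walkthrough.
sample_jps='32674 UIServer
44057 QuorumPeerMain
30684 Nimbus
32989 Jps'

for daemon in Nimbus UIServer QuorumPeerMain; do
    if printf '%s\n' "$sample_jps" | has_daemon "$daemon"; then
        echo "$daemon: running"
    else
        echo "$daemon: missing"
    fi
done
```

On a real host, pipe the live output instead, for example `jps | has_daemon Supervisor` on hadoop02 or hadoop03.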

 

 

Common commands for submitting topologies to Storm

// Command format: storm jar [jar path] [topology package.topology class] [Storm IP address] [Storm port] [topology name] [arguments]
[linyouyi@hadoop01 apache-storm-2.0.0]$ bin/storm jar --help
usage: storm jar [-h] [--jars JARS] [--artifacts ARTIFACTS]
                 [--artifactRepositories ARTIFACTREPOSITORIES]
                 [--mavenLocalRepositoryDirectory MAVENLOCALREPOSITORYDIRECTORY]
                 [--proxyUrl PROXYURL] [--proxyUsername PROXYUSERNAME]
                 [--proxyPassword PROXYPASSWORD] [--storm-server-classpath]
                 [--config CONFIG] [-storm_config_opts STORM_CONFIG_OPTS]
                 topology-jar-path topology-main-class
                 [topology_main_args [topology_main_args ...]]

positional arguments:
  topology-jar-path     will upload the jar at topology-jar-path when the
                        topology is submitted.
  topology-main-class   main class of the topology jar being submitted
  topology_main_args    Runs the main method with the specified arguments.

optional arguments:
  --artifactRepositories ARTIFACTREPOSITORIES
                        When you need to pull the artifacts from other than
                        Maven Central, you can pass remote repositories to
                        --artifactRepositories option with a comma-separated
                        string. Repository format is "<name>^<url>". '^' is
                        taken as separator because URL allows various
                        characters. For example, --artifactRepositories
                        "jboss-repository^http://repository.jboss.com/maven2,HDPRepo^http://repo.hortonworks.com/content/groups/public/"
                        will add JBoss and HDP repositories for dependency
                        resolver.
  --artifacts ARTIFACTS
                        When you want to ship maven artifacts and its
                        transitive dependencies, you can pass them to
                        --artifacts with comma-separated string. You can also
                        exclude some dependencies like what you're doing in
                        maven pom. Please add exclusion artifacts with '^'
                        separated string after the artifact. For example,
                        -artifacts "redis.clients:jedis:2.9.0,org.apache.kafka:kafka-clients:1.0.0^org.slf4j:slf4j-api"
                        will load jedis and kafka-clients artifact and all of
                        transitive dependencies but exclude slf4j-api from
                        kafka.
  --config CONFIG       Override default storm conf file
  --jars JARS           When you want to ship other jars which are not
                        included to application jar, you can pass them to
                        --jars option with comma-separated string. For
                        example, --jars "your-local-jar.jar,your-local-jar2.jar"
                        will load your-local-jar.jar and your-local-jar2.jar.
  --mavenLocalRepositoryDirectory MAVENLOCALREPOSITORYDIRECTORY
                        You can provide local maven repository directory via
                        --mavenLocalRepositoryDirectory if you would like to
                        use specific directory. It might help when you don't
                        have '.m2/repository' directory in home directory,
                        because CWD is sometimes non-deterministic (fragile).
  --proxyPassword PROXYPASSWORD
                        password of proxy if it requires basic auth
  --proxyUrl PROXYURL   You can also provide proxy information to let
                        dependency resolver utilizing proxy if needed. URL
                        representation of proxy ('http://host:port')
  --proxyUsername PROXYUSERNAME
                        username of proxy if it requires basic auth
  --storm-server-classpath
                        If for some reason you need to have the full storm
                        classpath, not just the one for the worker you may
                        include the command line option
                        `--storm-server-classpath`. Please be careful because
                        this will add things to the classpath that will not
                        be on the worker classpath and could result in the
                        worker not running.
  -h, --help            show this help message and exit
  -storm_config_opts STORM_CONFIG_OPTS, -c STORM_CONFIG_OPTS
                        Override storm conf properties , e.g.
                        nimbus.ui.port=4443
[linyouyi@hadoop01 apache-storm-2.0.0]$ storm jar /home/storm/storm-starter.jar storm.start.WordCountTopology wordcountTop

This submits storm-starter.jar to the remote cluster and starts the wordcountTop topology.
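Because a topology keeps running until it is killed, submission is usually paired with `storm list` and `storm kill`. The sketch below only assembles and prints the three commands, so it can be run without a cluster. The jar path, class name, and topology name are the ones from the example above; the 10-second wait passed to `-w` is an arbitrary illustrative value.

```shell
# Assumed values from the example above; run the printed commands from the Storm install dir.
storm_bin="bin/storm"
jar_path="/home/storm/storm-starter.jar"
main_class="storm.start.WordCountTopology"
topology_name="wordcountTop"

# Submit: arguments after the main class are passed to the topology's main method.
submit_cmd="$storm_bin jar $jar_path $main_class $topology_name"
# List the topologies currently running on the cluster.
list_cmd="$storm_bin list"
# Kill the topology, waiting 10 seconds (illustrative value) before shutdown.
kill_cmd="$storm_bin kill $topology_name -w 10"

printf '%s\n' "$submit_cmd" "$list_cmd" "$kill_cmd"
```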

Reposted from: https://www.cnblogs.com/linyouyi/p/11342906.html
