Installing H2O on Windows

H2O - Installation

H2O can be installed and used in five different ways, as listed below:

  • Install in Python

  • Install in R

  • Web-based Flow GUI

  • Hadoop

  • Anaconda Cloud

The following sections give installation instructions for each of these options. You will most likely need only one of them.

Install in Python

To run H2O with Python, you first need to install a few dependencies. Let us start with the minimum set of dependencies required to run H2O.

Installing Dependencies

To install a dependency, execute the following pip command:


$ pip install requests

Open your console window and type the above command to install the requests package. The original tutorial shows a screenshot of this command running on a Mac.

After installing requests, you need to install three more packages, as shown below:


$ pip install tabulate
$ pip install "colorama >= 0.3.8"
$ pip install future

The most up-to-date list of dependencies is available on the H2O GitHub page. At the time of writing, the page lists the following dependencies:


python
pip >= 9.0.1
setuptools
colorama >= 0.3.7
future >= 0.15.2
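
As a quick cross-check before installing H2O, the sketch below reports which of the packages named above are already installed, and in which version. It uses only the standard library (importlib.metadata, available from Python 3.8). The helper name installed_versions is made up for this illustration and is not part of H2O or pip.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names taken from the pip commands and the dependency list above.
    wanted = ["requests", "tabulate", "colorama", "future", "h2o"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version or 'not installed'}")
```

Any package reported as "not installed" can then be added with the pip commands shown earlier.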

Removing Older Versions

After installing the above dependencies, remove any existing H2O installation. To do so, run the following command:


$ pip uninstall h2o

Installing the Latest Version

Now install the latest version of H2O using the following command:


$ pip install -f http://h2o-release.s3.amazonaws.com/h2o/latest_stable_Py.html h2o

After a successful installation, you should see a message like the following on the screen:


Installing collected packages: h2o
Successfully installed h2o-3.26.0.1

Testing the Installation

To test the installation, we will run one of the sample applications provided with the H2O installation. First, start the Python interpreter by typing the following command:


$ python3

Once the Python interpreter starts, type the following statement at the Python prompt:


>>> import h2o

The above statement imports the H2O package into your session. Next, initialize the H2O system using the following command:


>>> h2o.init()

Your screen now shows the cluster information; the original tutorial includes a screenshot of this output.

You are now ready to run the sample code. Type the following command at the Python prompt and execute it:


>>> h2o.demo("glm")

The demo consists of a Python notebook with a series of commands. After each command executes, its output is shown immediately on the screen and you are asked to press a key to continue with the next step. The original tutorial shows a partial screenshot of the output of the last statement in the notebook.

At this stage, your Python installation is complete and you are ready for your own experimentation.

Install in R

Installing H2O for R development is very similar to installing it for Python, except that you use the R prompt for the installation.

Starting R Console

Start the R console by clicking the R application icon on your machine. The original tutorial shows a screenshot of the console window that appears.

You will install H2O from this R prompt. If you prefer RStudio, type the commands in the R console subwindow.

Removing Older Versions

To begin with, remove any older versions using the following commands at the R prompt:


> if ("package:h2o" %in% search()) { detach("package:h2o", unload=TRUE) }
> if ("h2o" %in% rownames(installed.packages())) { remove.packages("h2o") }

Downloading Dependencies

Download the dependencies for H2O using the following code:


> pkgs <- c("RCurl", "jsonlite")
> for (pkg in pkgs) {
    if (!(pkg %in% rownames(installed.packages()))) { install.packages(pkg) }
  }

Installing H2O

Install H2O by typing the following command at the R prompt:


> install.packages("h2o", type = "source", repos = (c("http://h2o-release.s3.amazonaws.com/h2o/latest_stable_R")))

The original tutorial shows a screenshot of the expected output here.

There is another way of installing H2O in R.

Install in R from CRAN

To install H2O from CRAN, use the following command at the R prompt:


> install.packages("h2o")

You will be asked to select a mirror:


--- Please select a CRAN mirror for use in this session ---

A dialog box listing the mirror sites appears on your screen. Select the nearest location or the mirror of your choice.

Testing Installation

At the R prompt, type and run the following code:


> library(h2o)
> localH2O = h2o.init()
> demo(h2o.kmeans)

The original tutorial shows a screenshot of the generated output here.

Your H2O installation in R is now complete.

Installing Web GUI Flow

To install the Flow GUI, download the installation file from the H2O site and unzip it into a folder of your choice. The unzipped installation contains the file h2o.jar. Run this file in a command window using the following command:


$ java -jar h2o.jar

After a while, output like the following appears in your console window:


07-24 16:06:37.304 192.168.1.18:54321 3294 main INFO: H2O started in 7725ms
07-24 16:06:37.304 192.168.1.18:54321 3294 main INFO:
07-24 16:06:37.305 192.168.1.18:54321 3294 main INFO: Open H2O Flow in your web browser: http://192.168.1.18:54321
07-24 16:06:37.305 192.168.1.18:54321 3294 main INFO:

To start Flow, open the URL shown in the log in your browser (http://localhost:54321 when the browser runs on the same machine as H2O). The Flow start page then appears; the original tutorial shows a screenshot of it.
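
To check from a script that Flow is reachable, one option is to pull the URL out of the startup log and try a TCP connection to the port. The sketch below assumes the log line format shown above. The helper names flow_url_from_log and is_port_open are illustrative and not part of H2O.

```python
import re
import socket


def flow_url_from_log(log_text):
    """Return the Flow URL announced in the startup log, or None if absent."""
    match = re.search(r"Open H2O Flow in your web browser:\s*(\S+)", log_text)
    return match.group(1) if match else None


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Log line copied from the console output above.
    line = ("07-24 16:06:37.305 192.168.1.18:54321 3294 main INFO: "
            "Open H2O Flow in your web browser: http://192.168.1.18:54321")
    print(flow_url_from_log(line))
    # True only while an H2O instance is listening on this machine.
    print(is_port_open("localhost", 54321))
```

A successful connection only shows that something is listening on the port; opening the URL in a browser remains the real test.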

At this stage, your Flow installation is complete.

Install on Hadoop / Anaconda Cloud

Unless you are a seasoned developer, you are unlikely to use H2O on Big Data. It suffices to say here that H2O models run efficiently on huge databases of several terabytes. If your data is in a Hadoop installation or in the cloud, follow the steps given on the H2O site to install H2O for your environment.

Now that you have successfully installed and tested H2O on your machine, you are ready for real development. First, we will look at development from a command prompt. In subsequent lessons, we will learn how to do model testing in H2O Flow.

Developing in Command Prompt

Let us now use H2O to classify plants in the well-known iris dataset, which is freely available for developing Machine Learning applications.

Start the Python interpreter by typing the following command in your shell window:


$ python3

This starts the Python interpreter. Import the h2o package using the following command:


>>> import h2o

We will use the Random Forest algorithm for classification. It is provided by the H2ORandomForestEstimator class in the h2o.estimators module, which we import as follows:


>>> from h2o.estimators import H2ORandomForestEstimator

We initialize the H2O environment by calling its init method:


>>> h2o.init()

On successful initialization, you should see the following message on the console along with the cluster information:


Checking whether there is an H2O instance running at http://localhost:54321 . connected.

Now we import the iris data using H2O's import_file method:


>>> data = h2o.import_file('iris.csv')

The original tutorial shows a screenshot of the import progress here.

Once the file is loaded into memory, you can verify it by displaying the first 10 rows of the loaded table with the head method:


>>> data.head()

The output is displayed in tabular format and includes the column names. We will use the first four columns as the features for our ML algorithm and the last column, class, as the output to predict. We specify this in the call to our ML algorithm by first creating the following two variables:


>>> features = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
>>> output = 'class'
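
Copies of the iris data do not all use the same header names, so it can help to confirm that the file contains the columns named above before training. The sketch below is a plain-Python check that does not involve H2O. The helper name missing_columns is made up for this illustration, and the sample text merely stands in for the first lines of iris.csv.

```python
import csv
import io


def missing_columns(csv_file, required):
    """Return the required column names that are absent from the CSV header row."""
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


if __name__ == "__main__":
    features = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
    output = "class"
    # Stand-in for the file; real code would pass open("iris.csv", newline="").
    sample = io.StringIO(
        "sepal_length,sepal_width,petal_length,petal_width,class\n"
        "5.1,3.5,1.4,0.2,Iris-setosa\n"
    )
    # An empty list means every expected column is present.
    print(missing_columns(sample, features + [output]))
```

If the list is not empty, adjust the features and output variables to match the names actually used in your file.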

Next, we split the data into training and testing sets by calling the split_frame method:


>>> train, test = data.split_frame(ratios = [0.8])

The data is split in an 80:20 ratio: 80% of the data is used for training and 20% for testing.
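
To our understanding of the H2O documentation, split_frame assigns rows at random, so the two frames are close to, but not necessarily exactly, 80% and 20% of the rows. The following plain-Python sketch illustrates that idea only; it is not H2O's implementation, and random_split is a hypothetical helper.

```python
import random


def random_split(rows, train_ratio=0.8, seed=None):
    """Send each row to train with probability train_ratio, otherwise to test."""
    rng = random.Random(seed)
    train, test = [], []
    for row in rows:
        (train if rng.random() < train_ratio else test).append(row)
    return train, test


if __name__ == "__main__":
    train, test = random_split(range(150), train_ratio=0.8, seed=1)
    # The sizes always add up to 150, but are only approximately 120 and 30.
    print(len(train), len(test))
```

Fixing the seed makes the split repeatable; split_frame accepts a seed argument for the same purpose.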

Now we create an instance of the built-in Random Forest estimator:


>>> model = H2ORandomForestEstimator(ntrees = 50, max_depth = 20, nfolds = 10)

In the above call, we set the number of trees to 50, the maximum depth of each tree to 20, and the number of folds for cross-validation to 10. We now need to train the model, which we do by calling the train method as follows:


>>> model.train(x = features, y = output, training_frame = train)

The train method receives the features and the output that we created earlier as its first two parameters. The training dataset is set to train, which is 80% of our full dataset. The original tutorial shows a screenshot of the training progress here.
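
Because the model was created with nfolds = 10, training also runs 10-fold cross-validation: the training rows are divided into 10 parts, and each part is held out once while a model is fitted on the other nine. The sketch below shows this partitioning in plain Python. It assigns rows to folds by index modulo for simplicity, and H2O may assign rows to folds differently; kfold_indices is a hypothetical helper.

```python
def kfold_indices(n_rows, nfolds):
    """Yield (train_idx, holdout_idx) pairs, one per fold, using modulo assignment."""
    for fold in range(nfolds):
        holdout = [i for i in range(n_rows) if i % nfolds == fold]
        train = [i for i in range(n_rows) if i % nfolds != fold]
        yield train, holdout


if __name__ == "__main__":
    # 120 rows is an assumed size for the 80% training frame of a 150-row dataset.
    for train_idx, holdout_idx in kfold_indices(n_rows=120, nfolds=10):
        print(len(train_idx), len(holdout_idx))
```

Every row lands in exactly one holdout set, so each row is scored once by a model that never saw it during fitting.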

Now that the model is built, it is time to test it. We do this by calling the model_performance method on the trained model object:


>>> performance = model.model_performance(test_data=test)

In the above method call, we pass the test data as the parameter.

It is now time to see the output, which is the performance of our model. You do this by simply printing the performance object:


>>> print (performance)

The output, shown as a screenshot in the original tutorial, reports the Mean Squared Error (MSE), Root Mean Squared Error (RMSE), LogLoss and the Confusion Matrix.
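
To make these quantities concrete, the sketch below computes them by hand for a few made-up class-probability predictions. It assumes one common convention for classification, in which the squared error of a row is (1 - p)^2, where p is the probability the model gave to the true class. H2O's exact definitions may differ in detail. The function name and the toy data are illustrative only, and every true-class probability is assumed to be greater than zero.

```python
import math
from collections import Counter


def classification_metrics(actual, probabilities):
    """Compute MSE, RMSE, LogLoss and a confusion matrix from class probabilities.

    actual: list of true class labels.
    probabilities: list of dicts mapping each class label to its predicted probability.
    """
    n = len(actual)
    # Probability the model assigned to the true class of each row.
    p_true = [probs[label] for label, probs in zip(actual, probabilities)]
    mse = sum((1.0 - p) ** 2 for p in p_true) / n
    logloss = -sum(math.log(p) for p in p_true) / n
    # The predicted class is the one with the highest probability.
    predicted = [max(probs, key=probs.get) for probs in probabilities]
    confusion = Counter(zip(actual, predicted))  # (actual, predicted) -> count
    return {"mse": mse, "rmse": math.sqrt(mse), "logloss": logloss, "confusion": confusion}


if __name__ == "__main__":
    # Made-up predictions for four rows, not output from a trained model.
    actual = ["setosa", "versicolor", "virginica", "virginica"]
    probabilities = [
        {"setosa": 0.5, "versicolor": 0.25, "virginica": 0.25},
        {"setosa": 0.25, "versicolor": 0.5, "virginica": 0.25},
        {"setosa": 0.25, "versicolor": 0.5, "virginica": 0.25},
        {"setosa": 0.25, "versicolor": 0.25, "virginica": 0.5},
    ]
    print(classification_metrics(actual, probabilities))
```

In the confusion matrix, the entry for (virginica, versicolor) counts the one virginica row that was misclassified as versicolor.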

Running in Jupyter

We have seen the execution from the command prompt and understood the purpose of each line of code. You may run the entire code in a Jupyter environment, either line by line or all at once. The complete listing is given here:


import h2o
from h2o.estimators import H2ORandomForestEstimator
h2o.init()
data = h2o.import_file('iris.csv')
features = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
output = 'class'
train, test = data.split_frame(ratios=[0.8])
model = H2ORandomForestEstimator(ntrees = 50, max_depth = 20, nfolds = 10)
model.train(x = features, y = output, training_frame = train)
performance = model.model_performance(test_data=test)
print (performance)

Run the code and observe the output. You can now appreciate how easy it is to apply and test a Random Forest algorithm on your dataset. The power of H2O goes far beyond this capability. What if you want to try another model on the same dataset to see whether you can get better performance? This is explained in the next section.

Applying a Different Algorithm

Now we will apply a Gradient Boosting algorithm to the same dataset to see how it performs. Compared with the full listing above, only two minor changes are needed: the import statement and the line that creates the model.


import h2o
from h2o.estimators import H2OGradientBoostingEstimator
h2o.init()
data = h2o.import_file('iris.csv')
features = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
output = 'class'
train, test = data.split_frame(ratios = [0.8])
model = H2OGradientBoostingEstimator(ntrees = 50, max_depth = 20, nfolds = 10)
model.train(x = features, y = output, training_frame = train)
performance = model.model_performance(test_data = test)
print (performance)

Run the code and compare results such as MSE, RMSE and the Confusion Matrix with the previous output to decide which model to use for production deployment. The original tutorial shows the new output as a screenshot. In fact, you can apply several different algorithms and pick the one that best meets your purpose.
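
When more than two models are in play, the comparison can be scripted. The sketch below picks the model with the lowest value of a chosen error metric from a dictionary of results. The helper best_model is hypothetical and the numbers are placeholders for illustration, not measured results. With H2O, the values would come from the performance objects, for example via performance.rmse(), assuming your H2O version provides that accessor.

```python
def best_model(metrics_by_model, metric="rmse"):
    """Return the name of the model with the lowest value of the given error metric."""
    return min(metrics_by_model, key=lambda name: metrics_by_model[name][metric])


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not measured results.
    results = {
        "random_forest": {"mse": 0.0400, "rmse": 0.20, "logloss": 0.15},
        "gradient_boosting": {"mse": 0.0484, "rmse": 0.22, "logloss": 0.12},
    }
    print(best_model(results, "rmse"))
    print(best_model(results, "logloss"))
```

As the placeholder values suggest, different metrics can favor different models, so it is worth deciding which metric matters for your application before comparing.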

Translated from: https://www.tutorialspoint.com/h2o/h2o_installation.htm
