
Libraries (SQL and DataFrames, Spark Streaming, MLlib, Third-Party Projects)  
Documentation (frequently asked questions)  
Apache Software Foundation

Apache Spark is a fast and general engine for large-scale data processing.

Run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.
==> These speed figures come from a comparison that uses logistic regression as the workload.
Apache Spark has an advanced DAG execution engine that supports acyclic data flow and in-memory computing.
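
To make the idea of a DAG execution engine more concrete, here is a minimal sketch in plain Python. It is not Spark's API: the `LazyDataset` class and its methods are hypothetical names invented for illustration. It only shows the general pattern of recording transformations as a plan and running them when an action is called, and a linear chain like this one is the simplest possible acyclic graph.

```python
class LazyDataset:
    """Toy stand-in for a lazily evaluated dataset (hypothetical, not Spark)."""

    def __init__(self, source, steps=()):
        self._source = list(source)
        self._steps = tuple(steps)  # recorded transformations, oldest first

    def map(self, fn):
        # Transformation: only extends the plan, nothing is computed yet.
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, pred):
        # Transformation: only extends the plan, nothing is computed yet.
        return LazyDataset(self._source, self._steps + (("filter", pred),))

    def lineage(self):
        # Names of the recorded steps, in execution order.
        return [name for name, _ in self._steps]

    def collect(self):
        # Action: runs every recorded step over the source data in memory.
        out = []
        for item in self._source:
            keep = True
            for name, fn in self._steps:
                if name == "map":
                    item = fn(item)
                elif not fn(item):
                    keep = False
                    break
            if keep:
                out.append(item)
        return out


calls = []  # records every input that square() actually receives


def square(x):
    calls.append(x)
    return x * x


plan = LazyDataset(range(1, 6)).map(square).filter(lambda x: x % 2 == 1)
planned_calls = len(calls)  # still 0: building the plan ran no user code
result = plan.collect()     # the action triggers the computation
```

Because the whole plan is known before anything runs, an engine built this way is free to reorder or fuse steps and to keep intermediate data in memory instead of writing it out between stages.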

Ease of Use:
Spark offers over 80 high-level operators that make it easy to build parallel apps. You can also use it interactively from the Scala, Python and R shells.

Combine SQL, streaming, and complex analytics.
Spark powers a stack of libraries including SQL and DataFrames, MLlib for machine learning, GraphX, and Spark Streaming. You can combine these libraries seamlessly in the same application.
==> Within a single app, Spark can use SQL, DataFrames, MLlib, GraphX, and Spark Streaming together seamlessly.

Runs Everywhere:
Spark runs on Hadoop, Mesos, standalone, or in the cloud. It can access diverse data sources including HDFS, Cassandra, HBase, and S3.
You can run Spark using its standalone cluster mode, on EC2, on Hadoop YARN, or on Apache Mesos. Access data in HDFS, Cassandra, HBase, Hive, Tachyon, and any Hadoop data source.
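
In practice, the choice of cluster manager is usually expressed as a master URL passed to Spark at launch. The sketch below builds such a URL in plain Python. The helper name `master_url` and its parameters are hypothetical. The URL shapes (`local[*]`, `spark://host:port`, `mesos://host:port`, `yarn`) and the default ports 7077 and 5050 are the conventional ones as far as I know, but older releases used `yarn-client` and `yarn-cluster` instead of `yarn`, so check the documentation for your version.

```python
def master_url(mode, host=None, port=None):
    """Build a Spark master URL for a deployment mode (illustrative helper)."""
    if mode == "local":
        return "local[*]"  # run on this machine, using all available cores
    if mode == "yarn":
        return "yarn"      # cluster location comes from the Hadoop configuration
    # Modes that need an explicit host: URL scheme and conventional default port.
    defaults = {"standalone": ("spark", 7077), "mesos": ("mesos", 5050)}
    if mode not in defaults:
        raise ValueError(f"unknown deployment mode: {mode!r}")
    if not host:
        raise ValueError(f"mode {mode!r} requires a host")
    scheme, default_port = defaults[mode]
    return f"{scheme}://{host}:{port or default_port}"


local = master_url("local")
standalone = master_url("standalone", host="master.example.com")
mesos = master_url("mesos", host="10.0.0.5", port=6000)
```

The resulting string would typically be supplied through the `--master` option of `spark-submit` or through `SparkConf.setMaster`.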

Spark is used at a wide range of organizations to process large datasets. You can find example use cases at the Spark Summit conference, or on the Powered By page.
There are many ways to reach the community:
·Use the mailing lists to ask questions.
·In-person events include numerous meetup groups and Spark Summit.
·We use JIRA for issue tracking.

Apache Spark is built by a wide set of developers from over 200 companies. Since 2009, more than 1000 developers have contributed to Spark!
The project's committers come from 19 organizations.
If you'd like to participate in Spark, or contribute to the libraries on top of it, learn how to contribute.

Getting Started:
Learning Spark is easy whether you come from a Java or Python background:
·Download the latest release; you can run Spark locally on your laptop.
·Read the quick start guide.
·Spark Summit 2014 contained free training videos and exercises.
·Learn how to deploy Spark on a cluster.

