神刀安全网

Scala API for Google Cloud Dataflow (from Spotify)

Scio


Ecclesiastical Latin IPA: /ˈʃi.o/, [ˈʃiː.o], [ˈʃi.i̯o]

Verb: I can, know, understand, have knowledge.

Scio is a Scala API for Google Cloud Dataflow inspired by Spark and Scalding. See the current API documentation for more information.

Features

  • Scala API close to the core APIs of Spark and Scalding
  • Fully managed service *
  • Unified batch and streaming programming model *
  • Integration with Google Cloud products: Cloud Storage, BigQuery, Pub/Sub, Datastore, Bigtable *
  • HDFS source/sink
  • Interactive mode with Scio REPL
  • Type safe BigQuery
  • Integration with Algebird and Breeze
  • Pipeline orchestration with Scala Futures
  • Distributed cache

* provided by Google Cloud Dataflow

Quick Start

The ubiquitous word count example can be run directly with SBT in local mode, using README.md as input.

sbt "project scio-examples" "run-main com.spotify.scio.examples.WordCount --input=README.md --output=wc"
cat wc/part-00000-of-00001.txt
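For orientation, a word count pipeline in Scio might look roughly like the sketch below. It is not necessarily identical to the example shipped in scio-examples: the object name WordCountSketch, the tokenize helper, the splitting regex and the output format are illustrative assumptions. It also assumes scio-core is on the classpath and that the pipeline is executed with sc.close(), as in Scio releases from around the time of writing; other releases may differ.

```scala
import com.spotify.scio._

// Hypothetical sketch of a Scio word count; names are illustrative only.
object WordCountSketch {

  // Split a line into words, dropping the empty tokens that split() can
  // produce when a line starts with a non-letter character.
  def tokenize(line: String): Seq[String] =
    line.split("[^a-zA-Z']+").filter(_.nonEmpty).toSeq

  def main(cmdlineArgs: Array[String]): Unit = {
    // Parse --input=... and --output=... and create the pipeline context.
    val (sc, args) = ContextAndArgs(cmdlineArgs)

    sc.textFile(args("input"))                  // one element per input line
      .flatMap(tokenize)                        // one element per word
      .countByValue                             // (word, count) pairs
      .map { case (word, n) => s"$word: $n" }   // format each pair as text
      .saveAsTextFile(args("output"))           // write sharded text files

    // Assumption: this Scio version runs the pipeline on close().
    sc.close()
  }
}
```

The chain of flatMap, countByValue and map mirrors how the same job would be written with Spark or Scalding collections, which is the similarity the feature list refers to.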

Documentation

  • Scio Wiki – wiki page
  • ScalaDocs – current API documentation
  • Scio REPL – tutorial for the interactive Scio REPL
  • Scio, Spark and Scalding – comparison of these frameworks
  • Type safe BigQuery – tutorial for the type safe BigQuery API
  • HDFS – using Scio with HDFS files

Artifacts

Scio includes the following artifacts:

  • scio-core : core library
  • scio-test : test utilities, add to your project as a "test" dependency
  • scio-bigquery : add-on for BigQuery, included in scio-core but can also be used standalone
  • scio-bigtable : add-on for Bigtable
  • scio-extra : extra utilities for working with collections, Breeze, etc.
  • scio-hdfs : add-on for HDFS
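
As a rough illustration, the artifacts above could be declared in an SBT build as follows. This is a sketch under assumptions: the version string is a placeholder to be replaced with an actual released version, and only scio-core and scio-test are shown; the add-ons would follow the same pattern.

```scala
// build.sbt (fragment)

// Placeholder: substitute a real Scio release version here.
val scioVersion = "x.y.z"

libraryDependencies ++= Seq(
  // Core library, needed by every Scio pipeline.
  "com.spotify" %% "scio-core" % scioVersion,
  // Test utilities, scoped to "test" as recommended in the list above.
  "com.spotify" %% "scio-test" % scioVersion % "test"
)
```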

License

Copyright 2016 Spotify AB.

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0
