Ecclesiastical Latin IPA: /ˈʃi.o/, [ˈʃiː.o], [ˈʃi.i̯o]
Verb: I can, know, understand, have knowledge.
- Scala API close to the core APIs of Spark and Scalding
- Fully managed service *
- Unified batch and streaming programming model *
- Integration with Google Cloud products: Cloud Storage, BigQuery, Pub/Sub, Datastore, Bigtable *
- HDFS source/sink
- Interactive mode with Scio REPL
- Type safe BigQuery
- Integration with Algebird and Breeze
- Pipeline orchestration with Scala Futures
- Distributed cache
* provided by Google Cloud Dataflow
The ubiquitous word count example can be run directly with SBT in local mode, using
README.md as input.
    sbt "project scio-examples" "run-main com.spotify.scio.examples.WordCount --input=README.md --output=wc"
    cat wc/part-00000-of-00001.txt
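For orientation, a word count pipeline in Scio might look roughly like the sketch below. It is not the source of `com.spotify.scio.examples.WordCount`; the object name `WordCountSketch` and the helper `tokenize` are hypothetical, and the calls shown (`ContextAndArgs`, `textFile`, `countByValue`, `saveAsTextFile`, `sc.close()`) assume a Scio release from around the time of this README, so check them against the ScalaDocs for the version you use.

```scala
import com.spotify.scio._

// Hypothetical object name; a sketch, not the bundled WordCount example.
object WordCountSketch {
  // Split a line into words on runs of anything other than letters and apostrophes.
  def tokenize(line: String): Seq[String] =
    line.split("[^a-zA-Z']+").filter(_.nonEmpty).toSeq

  def main(cmdlineArgs: Array[String]): Unit = {
    // Parses --input=... and --output=... and creates the pipeline context.
    val (sc, args) = ContextAndArgs(cmdlineArgs)

    sc.textFile(args("input"))                         // one element per input line
      .flatMap(line => tokenize(line))                 // lines -> words
      .countByValue                                    // (word, count) pairs
      .map { case (word, count) => s"$word: $count" }  // format for output
      .saveAsTextFile(args("output"))

    // Assumption: this release runs the pipeline on close().
    sc.close()
  }
}
```

The shape is deliberately close to a Spark or Scalding job: read, transform with ordinary Scala functions, then write.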
- Scio Wiki – wiki page
- ScalaDocs – current API documentation
- Scio REPL – tutorial for the interactive Scio REPL
- Scio, Spark and Scalding – comparison of these frameworks
- Type safe BigQuery – tutorial for the type safe BigQuery API
- HDFS – using Scio with HDFS files
Scio includes the following artifacts:
- scio-core: core library
- scio-test: test utilities, add to your project as a "test" dependency
- scio-bigquery: Add-on for BigQuery, included in scio-core but can also be used standalone
- scio-bigtable: Add-on for Bigtable
- scio-extra: Extra utilities for working with collections, Breeze, etc.
- scio-hdfs: Add-on for HDFS
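As a sketch of how these artifacts could be pulled into an SBT build, assuming they are published under the `com.spotify` group and cross-built for your Scala version: `scioVersion` is a placeholder name and `"x.y.z"` is not a real release number, so substitute the version you intend to use.

```scala
// build.sbt fragment (illustrative only)
// Placeholder: replace "x.y.z" with an actual Scio release; none is implied here.
val scioVersion = "x.y.z"

libraryDependencies ++= Seq(
  "com.spotify" %% "scio-core" % scioVersion,
  // Test utilities are only needed on the test classpath.
  "com.spotify" %% "scio-test" % scioVersion % "test"
)
```

Add-ons such as scio-bigtable or scio-hdfs would be added the same way, one line per artifact.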
Copyright 2016 Spotify AB.
Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0