Accessing External Storage from Spark
Spark can access all storage sources supported by Hadoop, including a local file system, HDFS, HBase, and Amazon S3.
For developer information about working with external storage, see External Storage in the Spark Programming Guide.
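As a rough illustration of the point above, file-based sources are usually distinguished only by the URI scheme passed to the same read call. The sketch below assumes PySpark and an existing SparkContext; the paths, host name, and bucket name are hypothetical placeholders, and `count_lines` is an illustrative helper rather than part of Spark. HBase is not covered here because it is accessed through its own input formats, not a file URI.

```python
# Hypothetical locations; substitute paths that exist in your environment.
EXAMPLE_URIS = {
    "local": "file:///tmp/data/events.txt",
    "hdfs": "hdfs://namenode.example.com:8020/data/events.txt",
    "s3": "s3a://example-bucket/data/events.txt",
}


def count_lines(sc, uri):
    """Count the lines of the text file at `uri` using SparkContext `sc`."""
    # textFile accepts any Hadoop-supported URI; the scheme selects the file system.
    return sc.textFile(uri).count()
```

With a live SparkContext named `sc`, a call such as `count_lines(sc, EXAMPLE_URIS["hdfs"])` would read from HDFS, provided the cluster is configured for that file system.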
Accessing Compressed Files
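You can save compressed files using one of the following methods, where codec_class is the fully qualified class name of a Hadoop compression codec:
- saveAsTextFile(path, compressionCodecClass="codec_class")
- saveAsHadoopFile(path, outputFormatClass, compressionCodecClass="codec_class")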
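The following is a minimal PySpark sketch of the first method, not a definitive recipe. It assumes the gzip codec that ships with Hadoop; `save_compressed_text`, `run_example`, and the output path are hypothetical names chosen for illustration.

```python
# Fully qualified name of Hadoop's gzip codec; the choice of gzip is illustrative.
GZIP_CODEC = "org.apache.hadoop.io.compress.GzipCodec"


def save_compressed_text(rdd, path, codec_class=GZIP_CODEC):
    """Write `rdd` as compressed text files under `path` (hypothetical helper)."""
    # Spark writes one part file per partition; the codec compresses each file.
    rdd.saveAsTextFile(path, compressionCodecClass=codec_class)
    return path


def run_example(output_path="/tmp/spark-compressed-example"):
    """Hypothetical driver; requires a working PySpark installation."""
    # Imported lazily so the helper above can be imported without Spark.
    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()
    try:
        lines = sc.parallelize(["alpha", "beta", "gamma"], 2)
        save_compressed_text(lines, output_path)
        # On read, the codec is inferred from the file extension (.gz).
        return sorted(sc.textFile(output_path).collect())
    finally:
        sc.stop()
```

Note that the save fails if the output directory already exists, so `run_example` should be pointed at a fresh path on each run.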
For examples of accessing Avro and Parquet files, see Spark with Avro and Parquet.
For details on how to access specific types of external storage and files, see: